Everyone! We've moved to WordPress.

Is Excel to Blame? (and some updates)


First, the updates.

If you’re an avid follower or even a casual reader, you’ve probably noticed the formatting of this blog seems inconsistent, even amateur at times. At least, that’s how I see it. Maybe it’s me, but Blogger just doesn’t seem to correctly handle anything I want it to do. I’ve recently heard, too, that some folks have had trouble publishing comments. I don’t really know what’s going on; Blogger just sucks, I guess. I think I’m going to switch to WordPress in the future. When that day comes, I’ll let everyone know.

Right now, I’m writing this from Windows Live Writer. Hopefully, it does a better job than Blogger’s editor. We’ll see when this is posted.

Ok, now a book update.

My publisher told me I’ve sold 32 copies of my book on preorder. Holy Crap! Thanks guys. Some readers have told me Chapters 1 and 2 are already available if you preorder through the alpha program. I’ve received some really good feedback about them. Thanks to the folks who’ve read them and reached out to me. I’ll be adding a link to the book at the top of the blog. Unfortunately, because Blogger sucks, the mere addition of another link will mess up my layout. Then I’ll have to spend a few hours redoing everything. Therefore, it may be a while after I post this before the link appears. Just hold tight.

Is Excel to Blame?

If you’ve been following the news recently, you know that two major events have brought focus on the use of Excel for modeling and research. The first is the London Whale incident, in which JPMorgan employed a spreadsheet-based model for its credit default swaps. According to JPMorgan’s internal report, “Spreadsheet-based calculations were conducted with insufficient controls and frequent formula and code changes were made.” Specifically,

…further errors were discovered in the Basel II.5 model, including, most significantly, an operational error in the calculation of the relative changes in hazard rates and correlation estimates. Specifically, after subtracting the old rate from the new rate, the spreadsheet divided by their sum instead of their average, as the modeler had intended. This error likely had the effect of muting volatility by a factor of two and of lowering the VaR, although it is unclear by exactly what amount, particularly given that it is unclear whether this error was present in the VaR calculation for every instrument, and that it would have been offset to some extent by correlation changes. It also remains unclear when this error was introduced in the calculation.
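
To see why dividing by the sum instead of the average mutes the result by a factor of two, here is a minimal Python sketch of the arithmetic. The hazard rates are made up and the function names are hypothetical; this only illustrates the division described in the report, not JPMorgan's actual model.

```python
def relative_change_intended(old, new):
    """Difference divided by the AVERAGE of the two rates (what the modeler intended)."""
    return (new - old) / ((old + new) / 2)


def relative_change_as_built(old, new):
    """Difference divided by the SUM of the two rates (the spreadsheet error)."""
    return (new - old) / (old + new)


# Made-up hazard rates, purely for illustration.
old_rate, new_rate = 0.02, 0.03

intended = relative_change_intended(old_rate, new_rate)
as_built = relative_change_as_built(old_rate, new_rate)

# The sum is always twice the average, so the erroneous figure is
# always half the intended one, whatever rates you plug in.
ratio = intended / as_built
print(f"intended: {intended:.4f}, as built: {as_built:.4f}, ratio: {ratio:.2f}")
```

Because the sum of two numbers is always twice their average, every relative change comes out at half its intended size, which is the "factor of two" the report mentions.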

The other event is this critique of a paper called Growth in a Time of Debt by Harvard economists Reinhart and Rogoff. (I’m not an economist by the way, so if I make a mistake in what follows, feel free to let me know.) From what I gather, Reinhart and Rogoff show that countries with incredibly high public debt (in excess of 90% of gross domestic product, I think) suffer from hampered economic growth. In the critique, authors Herndon et al. accuse Reinhart and Rogoff of selective exclusion and coding errors. But what’s really made headlines is the fact that Reinhart and Rogoff used Excel for their analysis.

The argument is that Excel just isn’t a good program to use for data analysis. My response to this has been, “It’s a poor craftsman who blames his [or her] tools.” To be fair, I actually stole this response from Jon Peltier. I really think it concisely describes how I feel. So kudos to Jon. I used the quote in a response on Andrew Gelman’s blog. I excerpt part of my response below because I think it’s a good lead-in to my overall point.

I remember in my elementary matrix algebra class some years ago, we had to perform Gaussian elimination by hand. After the first test, it became clear to me I wasn’t paying close enough attention to my operations. Sometimes, if I wrote down a number too quickly, my hand would “skip” on the sheet of paper and make it appear as if I wrote a negative number. Imagine if I told my professor, “it’s not my fault, my paper isn’t rough enough to provide the adequate friction to perform these operations.” Or: “My pencil breaks off too much graphite, which makes my mistakes too hard to detect.” Or even: “My calculator’s buttons are too close. I can’t help fat fingering operations.” Even if I could show that the pen, paper, and calculator actually contributed to my making mistakes, do you think my professor would (or should) care? Let me put it to you this way: What IF Reinhart and Rogoff had used only pen and paper? Should we blame BIC for making a pen unsuitable for analysis? Should we blame Mead for making unsuitable paper? I don’t think so. Reinhart and Rogoff’s analysis was faulty because they were inattentive. That’s not Excel’s fault.

JPMorgan, according to their internal review, knew the potential for error in the Excel model that ultimately became the London Whale scandal. Managers had even recommended that the model be automated and audited, but never followed through. Why not? Why wasn’t the model thoroughly tested when it provided results that appeared counterintuitive? Because they liked the model’s results and its potential for big returns.

Reinhart and Rogoff—what makes them so different? Are we expected to believe that Excel is responsible for their not rigorously unpacking and verifying their data and model? To me, a more reasonable answer is they really liked their results; and their reviewers really liked their results; and economists looking for more ammunition against government spending loved their results. Their results meant big impact.

That’s my thesis: there’s too much pressure to make an impression, to be different, to be sensational. We’re less likely to be skeptical of work that plays on our own confirmation biases, especially if it’s going to be huge. That same sensationalism, by the way, is behind the blame-Excel bandwagon. Here are some headlines: “Quote of the day: Excel error destroys the world” from Mother Jones; “Microsoft Excel: the ruiner of Global Economies?” from Ars Technica. To me, blaming Excel feels like more causation creep. Sure, Excel was used both by JPMorgan and by Reinhart and Rogoff, but so what?

The real problem

The real problem concerns how we, as a society, treat data and research. For some reason, data is given a privileged status in our society. We aren’t as skeptical of it as we should be. We believe first and question later; that is, only if things go wrong (and, at some point, they will). When our news presents a study to us, we tend to interpret the results as being hard facts and not the output of a model that is potentially as flawed as the humans who devised it. We look to technology to solve our problems, and when technology “fails” to meet our expectations, we toss it out as if it were a consumable good, like an old cell phone. But this view of technology, as the great solver, or destroyer, of the world’s complex problems is foolhardy at best and catastrophic at worst. It removes responsibility from our institutions and companies to make something better with time, instead of simply making it newer. It removes the responsibility from us to apply rigor to our own work.

Sensationalism drives the backlash against Excel. And don’t get me wrong, there are many things to criticize Excel for. However, that someone did not provide the necessary rigor to their own work is not Excel’s fault. In fact, while everyone is arguing over whether Excel provides the necessary information assurance, they’ve missed a recent addition to Excel’s capabilities that should have everyone concerned.

Specifically, Microsoft’s recently unveiled GeoFlow add-in.

From a technology standpoint, the fluid-like movement of anything in Excel is likely to dazzle; it certainly feels impressive. But before you make a flyover tour of your data, consider that this type of visualization has lots of problems.

According to the write-up in the Office Blog, GeoFlow should allow you to:

  • Map Data: Plot more than one million rows of data from an Excel workbook, including the Excel Data Model or PowerPivot, in 3D on Bing maps. Choose from columns, heat maps, and bubble visualizations.
  • Discover Insights: Discover new insights by seeing your data in geographic space and seeing time-stamped data change over time. Annotate or compare data in a few clicks.

Discover new insights? No, I don’t think so. We’ve been able to place data on maps for as long as I can remember. And how data changes over time can probably be better presented using a time series plot. What Microsoft isn’t telling you is that plotting a column chart in a 3D environment makes values hard to discern and occludes data from view. Virtual environments are useful when the physical environment is critical to understanding the data within (like, say, in a flight simulation training program). But the data presented above is only hurt by the use of three dimensions. Truly, what do we gain by adding a horizon which only limits data in the distance as it would in real life? Why should we choose to see less and not more?

Neowin’s blog on GeoFlow best sums up Microsoft’s motivation:

Let's face it; Microsoft's Excel spreadsheet program is certainly useful but it's also kind of dull to look at a flat 2D row of numbers. Isn't there some way that someone could turn these dull strings of numbers into something more exciting? Microsoft thinks the answer to that question can be found in a newly launched Excel add-on, GeoFlow. [emphasis added]

Let’s be concerned about this. Let’s not argue over which technology best allows us to forgo a certain level of rigor in case we make a mistake in our analysis. Instead, let’s focus our concern on technologies that research shows cannot deliver on what they promise. In my book, I argue that vendors are trying to remove the decision making process from us. They want us to trust their gimmicky programs and tell us that we humans are too error prone to make decisions on our own. We need data visualization and dashboards to make the decisions for us, they argue. Things have just gotten too big, and too complex.

It’s true we humans are error prone. But we can analyze our actions and intentions, and start anew. If our Excel analysis is wrong, we need the ability to unpack and analyze what’s going on. When something pretends to do the analysis for us, to show us new insights like never before, let’s be skeptical. What happens when we rely on a technology that does not allow us to apply rigor? What will happen when it doesn’t deliver? Should we stop using it and throw it away? I can see why people view that as the only option.

We should be happy that Excel doesn’t do the work for us. Because when we make mistakes, and when we learn from our mistakes, that’s progress. That’s making the world better, not just newer.

Book Updates

I figured I'd break hibernation to give everyone a small, quick update on the book. According to the original dates agreed upon in my contract with Apress, I should be close to finishing my book by now. Well, internet, I'm not close at all. I still have a lot more work ahead of me. The new release date is September 29th, and I do believe the book will be complete by then. As well, the name might change at least nine more times before it's finally printed. How about Fifty Shades of Excel?

What to expect

Above all, this is a book about Excel development of dashboards and interactive reports. If you're looking for a short book that gives you just enough to make something and move on, this isn't the book for you, although there are many good books that do this (see Bill Jelen's Mr. Excel Library, John Walkenbach's Mr. Spreadsheet series, and Michael Alexander's Excel 2007 Dashboards & Reports, for example). My book is aimed at filling what I believe is a dearth of quality printed material on the subject of Excel development for building modern tools. Professional Excel Development was the last great comprehensive book on really developing tools with Excel. It is my favorite book on Excel development, and a lot of my work owes to what I've read in it. (There are, however, many terrific blogs on Excel development, which I believe exist in response to the lack of print material.)

I hope you enjoy the book, I really do. So, in the interest of full disclosure, I do want to highlight what won't be in the book. As of right now, there won't be any Pivot Tables. Sorry to all you Pivot Table lovers out there. Here's the skinny: when you use Pivot Tables for dashboards, you give up a lot of control. I think I make a pretty good case in my book, but I welcome feedback if you disagree. By extension, that also means no slicers. Look, I think slicers are very, very promising concerning where Microsoft wants to take Excel, but I kind of hate them. (Oh no, I've said too much.) They take up a lot of screen real estate and encourage people to make dashboards that are 80% slicers and 20% actual information. That's not good. Also, there will be no pie charts.

Most importantly, this book will be about empowering you as an Excel developer. A lot of people think that you can only do cool stuff in Excel by using VBA. I say, VBA is only part of the whole package. Formulas are a part of it, too. But your mind, your ability to think critically and creatively, will ultimately be the most important tool in your arsenal. Right now, there are a lot of companies who want us to pay big money for infrastructure and data visualization tools we don't need. I argue that you and I can do a better job than many of these vendors by doing the work ourselves in Excel. That is my motivation to write this book.

And now I bet you just can't wait to get your hands on a copy!


So if you're really, really interested in getting a copy as soon as possible, there are a few options available.

Through Apress

First, if you're into the e-book thing, you can go to my publisher's website and order a copy through their Alpha Book program. The Alpha Book program allows you to see each chapter as it is approved by my editors. The chapters will still be in development, so if you're the type who likes giving feedback, you could email me directly with your thoughts. If you do choose to do the Alpha Book program, and you do frequently email me with solid feedback about each chapter you've read, I can maybe refund your purchase or even send you a signed copy of a print book.

Take the refund. I have a terrible signature. 

Through Amazon

You can also order the book through Amazon below its bookstore price. The discount seems to change every few weeks, but this is the lowest I've seen it so far. If you follow the link below, I will get an infinitesimal referral fee.  


More updates coming soon. Thanks again to Winston Snyder for writing his awesome articles while I've been away. If you want to contribute an article, drop me a line.