Michael Milton's Blog


When to use Excel, when to use R?

January 26th, 2010 · Uncategorized

Because I wrote a book on data analysis and am currently finishing a book on Excel and an introductory video series on R for O’Reilly (I’ll tweet the link when I have one), I get asked this question pretty frequently. Generally the person asking this question knows enough Excel to get by and has never really looked at R.

To this person I’d strongly recommend that you extend your Excel repertoire beyond list-making, highlighting, and the occasional SUM formula. My experience showing people how to use Excel suggests that for every Excel power user there are a hundred people using Excel in a minimal way. Excel has plenty of firepower, and you paid for it, so why not learn it?

You should also learn R (more specifically, the S language). It’s not the easiest language to pick up, but once you get the hang of the basic data structures, the data analytic floodgates open. Since I started using R over a year ago, data analysis has become a much more exciting, “bwa-ha-ha!”-inducing experience than it ever was before. The power of R is astonishing. I haven’t been this excited about doing data work since I printed out SuperCalc bar charts on my Epson 9-pin dot matrix printer. If you haven’t already, you should learn R.

(I know what you may be thinking. “Guess what? Michael’s answer to life’s problems is… to buy what Michael’s selling! Oh boy.” Well, touché.)

Ok, so let’s look at the specific question. Here are a few suggestions.

When to use Excel

When you have something that needs a nice presentation. Most people use Excel as a page layout program for quantitative or list-based data. Seriously, as a page layout program, like InDesign or something.

This is probably not the primary usage that the authors of VisiCalc had in mind, but it’s a big need that people have, and the accretion of new formatting features in Excel 2007 and 2010 shows that the good folks at Microsoft recognize that their job is to give people what they want. Excel is a fast and straightforward tool for the presentation of tabular information. It would be the first tool I’d use if I wanted to present a summary of data, and I’ve even dropped graphics created in R into an Excel spreadsheet (which I then ripped to a PDF) when I wanted to create a nice presentation.

Now, people can use Excel for data presentation either well or poorly. I know that the Head First-approved Non-Designer’s Design Book was a big help for me when it comes to recognizing when I’m creating something hideous in Excel. I recommend it.

When you have quick and dirty number crunching to do. With Excel, loading data and writing formulas is quick and easy. With R, there’s generally some configuration overhead you have to endure in order to start crunching numbers. If you need to do a small handful of descriptive stats on your data, or you need to look something up, run a quick sort/filter, or even a pivot table, Excel is the tool.

Some people never need to go beyond this sort of data work. They probably don’t need to learn R, even though I’m inclined to say that everyone needs to learn R.

When I want to eyeball data quickly and maybe run a few basic formulas, I’m happy to fire up Excel instead of R.

When to use R

When you have to explore data. At the start of an analytic project, it’s a good idea to create a bunch of graphical visualizations of your data to get a sense of what’s inside it. In terms of its graphical capabilities, R exists in a whole separate dimension from Excel. This was perhaps the most shocking part to me about using R for the first time: I really thought I had a handle on data analysis even though I’d restricted my software to Excel, but boy was I wrong. The visualizations you can create in R are much more sophisticated and much more nuanced. And, philosophically, you can tell that the visualization tools in R were created by people more interested in good thinking about data than about beautiful presentation. (The result, ironically, is a much more beautiful presentation, IMHO.)

Here’s how I’d put the difference to someone who’s familiar with Excel but not yet with R. The graphics creation options that Excel gives you are all based in the graphical user interface. This is what makes Excel relatively easy to use—all your options are laid out before you with nice buttons and fill-in-the-blank boxes. But in order to create a graphical interface that’s easy to use, the creators of Excel had to make a bunch of decisions about what sorts of graphics you are and are not likely to want. With too many choices, the graphical interface becomes cumbersome and frustrating, so to achieve simplicity they had to eliminate options.

And this isn’t a gripe or anything. I can’t say I’d have done a better job designing Excel’s charting graphical interface. I cut my teeth on it.

These limitations become a problem when you want to inspect data visually in a bunch of different ways in order to explore it. R, through a combination of its well-designed base graphics package, the exceptionally well-designed lattice graphics package, and the jaw-droppingly well-designed ggplot2 graphics package, offers a breathtaking array of visualization options that you access through the command line or scripts. It has power that you just can’t get using a graphical interface to generate your charts.

When you need to be really clear about how you change your data. Setting aside the cool bells and whistles of R, this particular angle has had the most practical significance for me. A lot of the work that I do as a consultant involves direct marketing data. I set up the data either for analysis or for print/web production, and this means that I have to mutate it quite a bit from its original form. It’s rare that I can take client data in the state they store it and use it directly without manipulation. I say “rare,” but it’s really “never.”

The way I’ve always handled the cleaning of data in Excel is to create a bunch of intermediary formulaic columns alongside my raw data columns. I work and rework these intermediaries until I finally have columns containing data that’s been “cleaned” for whatever purpose I have in mind. Then I copy and Paste Special > Values the clean data to a new sheet. That way, I have an audit of what I did to the data, in case I screwed something up, which never happens [COUGH!].

With R all the data mutation I do is now saved in little text files called scripts. I’ll have my raw data in a CSV or something and create a script that loads the data, mutates it in any way I want, then spits it out into another CSV and/or R object. The advantage of using R in this way rather than Excel is that I can be a lot more concise in terms of code and descriptive in terms of commenting about what I do. Another advantage is that I can use regular expressions in R, which one cannot use in Excel. R is worth learning so you can clean data more elegantly.
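To make the raw data > script > clean data idea concrete, here is a minimal sketch. It is written in Python rather than R, purely as an illustration of the workflow; the column names (name, phone), the sample rows, and the cleaning rules are all made up for the example and are not taken from any real client file.

```python
import csv
import io
import re


def clean_phone(raw: str) -> str:
    """Strip everything except digits, then format 10-digit numbers."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 10:
        return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"
    return ""  # leave blank rather than guess at a malformed number


def clean_name(raw: str) -> str:
    """Collapse runs of whitespace and normalize capitalization."""
    return re.sub(r"\s+", " ", raw).strip().title()


def clean_csv(raw_text: str) -> str:
    """Read raw CSV text, clean each row, and return new CSV text."""
    reader = csv.DictReader(io.StringIO(raw_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["name", "phone"], lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(
            {"name": clean_name(row["name"]), "phone": clean_phone(row["phone"])}
        )
    return out.getvalue()


# Hypothetical raw data; in practice this would be read from a file on disk.
RAW = "name,phone\n  jane   DOE ,(512) 555-0100\nbob smith,555-0199\n"
print(clean_csv(RAW))
```

The raw input is never modified. Every change lives in a commented function, so the script itself serves as the audit trail that the intermediary columns provide in a spreadsheet.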

I think of the raw data > script > clean data workflow in R as similar to the RAW image > metadata manipulation > JPG image workflow in Adobe Lightroom, for those of you familiar with Lightroom. It’s nice.

When you need serious statistical capabilities. Excel has a bad reputation for statistics. Historically there have been a variety of situations where Excel has demonstrated numerical and programmatic errors that produce flat wrong answers. I haven’t seen a statistician’s review of Excel for version 2007 or later, though, so I can’t really pass judgment about Excel’s current incarnations. Let me know if you have a link to something along those lines.

Excel 2007 was, generally speaking, a big improvement over previous versions. I’m a big fan of the SUMIFS and IFERROR functions, and I really appreciate the “Big Grid” and multiprocessor support. And all indications are that Excel 2010 will be another big improvement—check out the Excel team’s very interesting blog. My impression is that the Microsoft folks are sensitive to historical problems and have taken steps to fix them. If so, this is worth applauding.

But I’ll make a last point that is really a restatement of the point I made about the overdetermination of options you get with a graphical interface. This phenomenon applies not just to graphics functions, but to statistical functions generally. The statistical functions you get in R are much more flexible, numerous, and reliable. By a very long shot. This is in large part because R uses a full-blown scripting language rather than a GUI.

Conclusion

I’m quite sure I haven’t hit everything, but these distinctions between the two programs are where I’d start for someone with familiarity with Excel and no knowledge of R. Become a formula whiz in Excel, and learn R!


Structured references and tables in Excel

January 15th, 2010 · Uncategorized


How to use Excel to compare two versions of a legal contract

December 3rd, 2009 · Uncategorized

I had a chance to use Excel for this purpose and thought you might want to see the technique.

Be sure to watch it on full screen HD.


How to write a fundraising letter and evaluate its performance

December 1st, 2009 · Uncategorized

Now is the season when nonprofits are scrambling to get out their end of year fundraising letters. Nonprofits who sent out letters over a week ago have been forgotten, since everyone reboots their brains over Thanksgiving, and nonprofits who send out letters after next week are running the risk of being outflanked by competitors.

I’ve written a lot of fundraising letters and have been involved in a few that may hit your mailbox this season, so I thought I’d pass along some general advice on how to write one and how to evaluate its success after you’ve sent it out.

Writing a fundraising letter

There are many creative angles nonprofits can take on fundraising letters, but here are three rules that will always serve you well.

Decide whether you’re going to be personal or pitchy. Because you can’t really be both. Is there a kitten with a thought bubble on the front of your letter’s envelope? Do you have an itemized list of “$100 buys x, $1,000 buys y, $10,000 buys z”? Then your letter is a pitch. It’s not personal because you’d never use those sorts of tactics in an actual personal letter.

The decision of whether to be personal or pitchy is important because it sets the tone for your organization’s relationships with constituents. I myself never create pitchy material, because my style is to use data to make communications as authentically personal as possible. I don’t like the relationship that pitchyness establishes, but I concede that it can make money in the right context.

If you’re personal, make sure your letter has not one ounce of pitchyness in it. If you’re pitchy, don’t waste your efforts hyper-personalizing it.

Be as straight as you can about how the money will be used. Donors want the charity itself to be the thinnest reasonable administrative layer between themselves and the cause they want to support. And if they’re reading your letter at all, they probably agree with your premise about the worthiness of your cause.

So go easy on attempts to persuade them that your cause is important and instead describe explicitly how their money will be used. And I’m not talking about the “$100 buys x, $1,000 buys y, $10,000 buys z” stuff, unless you’re really proposing to buy x, y, and z. If donations simply fund operations, say so, and say why unrestricted giving is important. People will appreciate your candor.

Be as specific as you can about everything you discuss. Talk about specific people you have helped, problems you have solved, supporters you have thanked, difficulties you’ve overcome.

Platitudes are boring and superfluous. When I read a letter with more than a few platitudes I question whether the nonprofit is doing anything at all besides selling platitudes. Specificity increases credibility, and credibility translates into fundraising success.

Evaluating it

Make intelligent comparisons. When I started out in fundraising I was surprised to discover how split people are on how to interpret the results of fundraising letters. Sober interpretation of results is all about making the right comparisons.

What comparisons do you make to determine your letter’s success? Between this letter and last year’s letter? Between this letter’s revenue and its budgeted revenue? Between the board chair’s reaction to this letter and her reaction to the previous one? Between this letter and the one that went out two months ago?

None of these comparisons tell you anything about why one letter performed better than another. To answer that sort of question, the type of comparison you need is what can be found in a randomized controlled experiment. And the vast majority of nonprofits do not do them with their fundraising mail.

If you’re not doing them already, take a look at this free chapter from my book and give it a thought. If you don’t have the time/infrastructure/interest, it’s not the end of the world, but make sure you keep your comparative judgments in the right frame of reference.
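For a rough sense of what the random assignment step involves, here is a minimal Python sketch. The function names, the two-group design, and the donor IDs are assumptions made up for the example. It only covers splitting the list and comparing response rates, and says nothing about whether a difference is large enough to trust.

```python
import random


def assign_groups(donor_ids, seed=0):
    """Randomly split donors into two groups of (nearly) equal size."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    shuffled = list(donor_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "test": shuffled[half:]}


def response_rate(group, responders):
    """Share of a group that made a gift."""
    if not group:
        return 0.0
    return sum(1 for donor in group if donor in responders) / len(group)


# Hypothetical donor IDs 1..1000; each group gets a different letter.
groups = assign_groups(range(1, 1001))
print(len(groups["control"]), len(groups["test"]))  # 500 500
```

Once the gifts come in, calling response_rate on each group with the set of donors who gave yields two numbers that can be compared fairly, because chance alone decided who received which letter.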

Watch for donor signals. People will receive your letter, and they’ll react. A donor you lost five years ago will suddenly make a gift. A donor who reliably gives $20 a year will give $1,000. A donor who responds to every letter will be silent. A nongiving volunteer will become a donor.

What do these signals mean? Talk to your donors and find out. Perhaps one of them is saying that she wants to be cultivated for a major gift. Perhaps another is saying that he’s angry over something you did and wants to discuss it. And perhaps another reacted for no particular reason, because of random variations in outcome.

After the gifts come in, you still have work to do to wring the maximum use out of that letter. Look carefully at your database, try to interpret it to see what your donors are “telling” you through their giving patterns, and then pick up the phone and see what they have to say for real.

Follow up aggressively. If you’ve interpreted those signals correctly, you should be able to put your donors on individual plans. Maybe their giving patterns suggest that you send them mail solicitations for a restricted gift. Maybe their patterns suggest that you upgrade them into a major prospect category. Maybe their patterns suggest they need a much gentler touch the next time you reach out to them. You need to make sure that your reaction is rooted in the intelligence your letter has generated.

Conclusion

These tips for evaluating your letter’s success are all about making sure that you use the letters as learning experiences. And just in case the search for “learning experiences” sounds trite, let me put the point another way.

You know nothing about your donors. Ok, maybe you know something. But your understanding of your donors is spotty, incomplete, inaccurate, and rooted in the past. The only way you can improve your understanding is by eliciting responses through gestures like fundraising letters and interpreting those responses correctly.

It’s easy to construct an idea of “what my donors think and want.” It’s easy to imagine one archetypal person, whose tastes overlap remarkably with my own, and tailor all my communications to that archetype. But the real task for the direct marketer is to tease out the diversity and richness in the beliefs and desires of her constituent population.


A Head First chapter goes from nothing to something

November 12th, 2009 · Uncategorized

I just bought some new screencasting software and decided to let it run while I did a first-pass layout on Head First Excel chapter 4.

A first-pass layout, at least in my workflow, is when I take a chapter that has been planned out in storyboards and put the elements I’ve written into InDesign. In this video you can see a PDF of storyboards on the right hand side of the screen while I create an InDesign version of the storyboards on the left.

On my second pass, I’ll add screenshots and any code that needs to be written for the chapter. On the third pass, I’ll focus on the writing. On the fourth pass, I’ll make sure everything’s tight before I send it to my editor Brian. Before the first pass in InDesign, the chapter lives as storyboards, and prior to that it lives as a Learner’s Journey, and prior to that it is just a mess of ideas in my head.

Right now, the chapter is crap. It only becomes non-crap on the fourth layout pass.

Every day I sit down and say, “I’m going to make some crap.” That may sound negativistic and depressing, but actually it’s uplifting and liberating. Since the transition from ideas in my head to a finished product takes at least seven steps, and since only on the last step does the material go from crap to good work, most of the times I sit down to work I’m making crap. It’s easier to get motivated if I recognize this from the beginning. I’m usually setting myself up for disappointment if I sit down and say, “I’m going to do good work,” since good work only takes place at step #7.

Flour (even the best flour) tastes like crap. It takes months to grow the wheat, and processing the wheat into flour is expensive and arduous. Actually making bread only takes a few hours. Eating the bread takes minutes. Most of the work of breadmaking is in the creation of individually foul-tasting raw ingredients. An observation for authors.

(Oh, and it appears from my webcam video that eyebrow-raising, lip-pursing, and other assorted grimacing is part of the writing process as well, but that is all news to me.)


What kind of modeler are you?

September 17th, 2009 · Uncategorized

Here’s a cool interview with Paul Krugman on the occasion of his receiving the Nobel. The interviewer asks,

Interviewer: What makes a good modeler in economics?

Paul Krugman: There are many different ways. One of the things you learn — and I think this is true for the physical sciences as well — is that there are many different personality types who work in distinct ways. My style: I am a ruthless simplifier, I pare away everything, I try to make the math disappear — it never quite does — but I’m a little model guy. I say “Here’s this huge complex subject, there’s got to be some little model that will get to the essence of it.” Sometimes there isn’t.

There are also people who are generalizers, people who will look for some general theorems, general ways that you can think about a large subject. There are people who are magnificently good at sifting through large amounts of data, finding ways to process that data and extract different conclusions. There are just very many different personalities.

There’s a certain style kind of identified with MIT, which is where I did my graduate work, that is the little model that cuts through to the essence of a complex problem, but… there are many different ways you can do that.

What it does take though… there is some requirement that you be able to step back and see things differently, to say that the way that everyone is talking about something is not actually the way you should be thinking about it.

Are you a simplifier, a generalizer, or a sifter? You can find the interview here, with the discussion about modelers at 11:14.


How to learn a new language… the weird way

September 15th, 2009 · Uncategorized

The highly literate mute

When I was in graduate school, a friend of mine earned some extra cash as an English tutor for students from Southeast Asia. As he was sizing up his students’ knowledge of English, he noticed something funny.

Their reading and listening comprehension was exquisite: on these fronts they were at a near-native level of English and had no problems with their lectures or readings from class. But when it came to forming sentences of their own, either in writing or aloud, they made bizarre mistakes. If you listened carefully to the words they used, you could see what they were trying to say, but they made word choice selections that no fluent speaker of English would ever make.

He couldn’t imagine what chain of events would lead these students to have this specific, weird sort of linguistic competence. On a lark one day he decided to quiz one of his students using the dictionary: my friend would select a random word and ask his student what it meant. He was unable to find a word that the student didn’t recognize. But while the student could grasp the meaning of every word, he generally had no sense for its usage.

It was as if his students had just sat down with an English dictionary and memorized it.

My friend and I agreed that, while we were happy to be in grad school and everything, if you had told us several years before that we’d have to memorize a dictionary to get there, we probably would have done something else. And I continued to feel that way until about a year ago, when I began learning the Italian language by memorizing a dictionary.

SuperMemo

Last year I read a neat article in Wired called “Want to Remember Everything You’ll Ever Learn? Surrender to This Algorithm,” which describes the obsessive, ingenious work of a Polish scholar in the development of a piece of software called SuperMemo.

The Wired article is worth reading and goes into a lot of detail about how SuperMemo works, but here’s the basic gist. SuperMemo operates like an electronic flash card deck — for example, it’ll give me an Italian word and my task is to remember the English equivalent. Once it shows me the answer I’ll rate my ability to remember the card on a scale of 0-5, 5 meaning that I recognize the word immediately and 0 meaning that I don’t recognize it at all. SuperMemo uses that number to schedule when it will show me the flash card again in the future. If I knew that card well, it may schedule the card weeks or even months into the future, and if I didn’t know it, SuperMemo might schedule the card again for review tomorrow.

The upshot is that for any given day SuperMemo is trying to show you only material that it thinks you’re about to forget, so you don’t spend a lot of time reviewing material you already know well, and the time you actually spend learning is allotted very efficiently.
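The scheduling idea can be sketched in a few lines of Python. This is a simplified sketch modeled on SM-2, an early published variant of the SuperMemo algorithm. It is an assumption that this resembles what the current software does internally, and the Card class and review function are names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Card:
    repetitions: int = 0    # consecutive successful reviews so far
    interval_days: int = 0  # days to wait before the next review
    ease: float = 2.5       # SM-2 "easiness factor"; higher means easier


def review(card: Card, quality: int) -> Card:
    """Return the card's new schedule after a review rated 0-5."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Forgotten: start over tomorrow, leaving the ease unchanged.
        return Card(repetitions=0, interval_days=1, ease=card.ease)
    if card.repetitions == 0:
        interval = 1
    elif card.repetitions == 1:
        interval = 6
    else:
        # Later intervals grow by the card's ease as it stood before this review.
        interval = round(card.interval_days * card.ease)
    gap = 5 - quality
    ease = max(1.3, card.ease + (0.1 - gap * (0.08 + gap * 0.02)))
    return Card(card.repetitions + 1, interval, ease)


card = Card()
for rating in (5, 5, 5):
    card = review(card, rating)
    print(card.interval_days)  # prints 1, then 6, then 16
```

Three perfect answers in a row push the card out about two weeks, while a single rating below 3 brings it back the next day, which is the behavior described above.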

After I read the article, I bought the software and have been loading my brain with thousands and thousands of Italian words ever since. This style of learning is passive and rote, and so it’s about as un-Head Firsty as you can imagine. But it’s also really easy once you get the hang of it, and it’s a good complement to more active approaches to learning. On average I spend about ten minutes a day going over new words and expressions.

[Oh, and a quick caveat about SuperMemo. In addition to being (arguably) the most incredible software learning platform of all time, it's also a strong contender for having the most poorly designed software interface of all time. So I'm not recommending SuperMemo. But I can say that, for people who can fight through the frustrations of dealing with the software, it works as advertised. And BTW a free, open source implementation of the SuperMemo algorithm is this program, which looks solid but which I haven't tested myself.]

That’s not how you’re supposed to learn a language!

I think there’s a good chance that my friend’s students learned English the way I learned Italian, using SuperMemo or some similar software. I usually have no problem understanding spoken Italian, and I am pretty solid reading Italian newspapers. But if you’d like me to speak to you in Italian, make yourself comfortable, because it takes me four or five seconds between each word as I try to sort out the genders and conjugations and all that assorted stuff. I’ve simply never practiced speaking Italian.

Now one might say that this is a train wreck approach to learning a language. The only way to really become fluent in another language is to immerse oneself in an environment of native speakers for a long period of time.

I’d totally agree. It’s just that, in my case, long-term linguistic immersion in Italy is an opportunity that hasn’t presented itself. If I were kicking it in Rome right now, you wouldn’t find me tap-tap-tapping away with computer flash cards. But as an alternative to linguistic ignorance, or as a way of preparing for future opportunities, starting off a language with SuperMemo and the newspaper is a great idea.

My friend’s students improved their speaking and writing abilities dramatically once they were immersed in an English-speaking university. They may never have made it there had they not memorized the dictionary. And, once there, they certainly benefited from having preloaded the language into their minds.

Last night I had my first actual Italian class, where I hope to be able to learn to use the language and not simply to understand it. I enrolled in an intermediate class, since I’m already familiar with the first few thousand Italian words one learns. My first impression is that, while my vocabulary is larger than my classmates’ vocabularies, their ability to vocalize sentences is way ahead of mine. I’m sure we’ll all be on the same page in a few months, but my experience up until now has made me a believer in technologically enhanced rote memorization.


How to sell capacity investments to nonprofits

August 27th, 2009 · Uncategorized

As a wonky new Austinite I was delighted to discover the blog Keep Austin Wonky, and even more pleased to read this post about nonprofit capacity building. The author Julio Gonzalez Altamirano cites this entry from another cool Austin blog, Social Velocity, where the author Nell Edgington writes

“I met with a nonprofit Development Director earlier this month who has had a really hard time convincing their CEO and board to let them spend money on a donor database and some fundraising materials. Yet, at the same time the Development Director is expected to raise millions of dollars in revenue. That sounds completely crazy, doesn’t it? But in the world in which I work that is often the rule rather than the exception. Infrastructure, capacity, fundraising, marketing, and operations dollars are somehow bad, dirty, not necessary, dismissed.”

As a former development director and fundraising consultant, I’m familiar with nonprofit investments in marketing materials and donor databases.

In fact, I’ve persuaded many nonprofits to increase their investments in these areas, so I’d like to offer a few tips on how to do it. These principles apply whether you’re a trustee, a donor, a vendor, a staffer, or a volunteer. Here goes:

Realize that you’re negotiating with the trustees

Whether you’re trying to persuade the CEO, a development director, or anyone else on staff to increase their investment, you need to know that you’re really negotiating with the nonprofit’s trustees. If the trustees want it, you’ll get it, and if the trustees don’t, you won’t, period.

And if the CEO is empowered to make the investment decision without explicit trustee input, she is still going to spend a large amount of time thinking about the trustees’ wishes. CEOs that spend money in ways that displease their trustees are not long for this world.

So you need to see the nonprofit from the perspective of the trustees, and you need to speak to the motives of the trustees, even if you’re dealing with someone below them in the organizational hierarchy.

Empathize with the trustees

Nonprofit professionals find it tempting to slap their foreheads in exasperation and ask, “Why don’t those trustees get it?” This is a natural but not terribly productive response, and chances are the trustees are asking themselves the same question about the staffers, vendors, and volunteers.

Good salespeople are incredibly empathetic, and their first reaction to hearing “no” is not to complain about how stupid their prospects are. Good salespeople see their prospect’s reluctance to buy as a rational response given their prospect’s worldview and desires. The problem is that the worldview needs to be changed.

To begin to change someone else’s worldview, start by getting a profound understanding of where they’re coming from. Only from there can you pave the path to the worldview you take to be better.

Understanding your trustees’ worldview and coming up with good reasons to invest in light of that worldview is a subtle task. If you’re good at it, your reasons will be overwhelming.

Take responsibility for the results

Trustees don’t want to hear that you believe that the nonprofit falls woefully short of fulfilling industry best practices. What do they care what you think other people consider to be standard practice?

Instead, they want you to stick your neck out and put real numbers to your beliefs about how you think the investment will affect the nonprofit. Do you think your investment will bring the nonprofit more money? Then state how and why, exactly. Make a thorough business case and stake your personal credibility on it. You do want them to make the investment, don’t you?

As a consultant, I know all-too-well that a sales strategy that relies on awareness-raising alone won’t work. If I want the trustees to make the investment, I have to own the results.

So before I ever even give a specific fundraising marketing proposal to nonprofits, I look at their data. The data will tell me how effective they are at raising funds and where they’re ignoring opportunities.

And I’ve certainly found nonprofits that were performing at such a high level that I couldn’t see where my help would increase their revenue. Nonprofit professionals are surprised to hear a consultant say, “You’re doing great, keep it up, I can’t help you.” But it’s better to avoid the business than to take on a project with a high probability of failure.

Once you have a full picture of how the nonprofit functions, you need to generate estimates for the real payoffs of the investments you propose. How much money do you think they’ll add to the bottom line, and how quickly?

Nothing shows trustees that you are serious like putting forward-looking income projections on paper and taking public responsibility for the actual outcomes. Sure, it’s easier just to cite what you take to be widely accepted standards (“Everyone does fundraising letters, how can you not?!?”), but you’re a lot more likely to get their attention if you have your own (metaphorical) skin in the game.
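A projection of this kind does not need to be elaborate. The Python sketch below shows one way to lay out "how much, and how quickly" as a cumulative net figure per year. Every number in it is hypothetical, and the function names and the flat year-over-year assumption are mine, chosen only to keep the example short.

```python
def cumulative_net(upfront_cost, annual_cost, annual_added_revenue, years):
    """Cumulative net gain at the end of each year, starting with year 1."""
    totals = []
    running = -upfront_cost
    for _ in range(years):
        running += annual_added_revenue - annual_cost
        totals.append(running)
    return totals


def payback_year(totals):
    """First year (1-based) in which the investment has paid for itself."""
    for year, total in enumerate(totals, start=1):
        if total >= 0:
            return year
    return None  # does not pay back within the projected period


# Hypothetical figures, for illustration only.
totals = cumulative_net(
    upfront_cost=20_000, annual_cost=5_000, annual_added_revenue=15_000, years=4
)
print(totals)                # [-10000, 0, 10000, 20000]
print(payback_year(totals))  # 2
```

The point of writing it down this way is that each input is a claim you can be held to: if the added revenue estimate is wrong, everyone can see which number was off.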

Calibrate expectations carefully

Now that you have their attention, just remember that in every step in the project you need to be aware of and manage their expectations.

If you don’t manage their expectations, here’s what will happen: Your investment will be a great success. It’ll pay for itself immediately and then go on to triple your nonprofit’s income. You call the trustees (or CEO), ready to celebrate, and they tell you that they’re actually quite disappointed because they were expecting their income to quadruple. And you’re fired.

There is no consolation for the salesperson under these circumstances. If the nonprofit’s expectation was indeed to have its income quadrupled, then the salesperson should have found out early in the process and recalibrated.

It’s not important to do well. It’s important to do better than you were expected to do. So you always need to be on top of your nonprofit’s thinking to ensure that expectations are always lower than the actual results.

And I don’t think it’s at all cynical to put the point that way: the trustees want you to do better than they expect. But that can’t happen unless you are in explicit, constant agreement with them as to what to expect.

One final thing. Remember that…

Overinvestment is a real possibility

There’s a high-end database you can buy that’s really common among medium-sized nonprofits. I won’t name the company that makes it, but I imagine that members of the community will know which one I’m talking about. Anyhow, I’ve seen many situations where, for a 5-figure annual licensing sum, nonprofits have purchased this database without any real institutional support.

That is, they buy the fancy database, but they have it administered by a part-time data entry clerk who has no training because the nonprofit blew its budget paying for the software itself.

This situation is pretty common, and it’s really embarrassing, so it’s not often discussed. (Of course, it’s great for consultants who want to come in and help, since chances are the nonprofit has been recording data meticulously without any idea of how to use it, so it’s easy to come up with a few quick but terribly illuminating database queries.)

So remember that overinvestment is a possibility when you consider persuading a nonprofit to invest more heavily in something, especially if it has to do with fundraising. If the organization doesn’t have the ancillary funding to support it, or if there is no stakeholder behavior to support it, the investment will be a waste of money.


Four ways that writing Head First has blown my mind

August 19th, 2009 · Uncategorized

If you’re at all familiar with Head First books, you probably know that they’re written differently from most books that aspire to help you learn a new skill. Brett McLaughlin, Brian Sawyer, and Andrew Stellman have all recently weighed in on what exactly makes Head First writing different.

As I wrote Head First Data Analysis over the past year, I began to see the craft in a totally different light, and here I’d like to share with you four ways in which Head First has changed my writing forever.

1. Motivation is on every single page

Andrew’s blog entry says a lot of useful stuff about reader motivation, and I’d add this: consciously maintaining reader motivation throughout the book you’re writing is surprisingly difficult.

Head First authors and editors assume that every single page needs to provide an explicit motivation that will keep the reader continuing. And “This material is interesting for its own sake” is not an adequate motivation. Nor is “Trust me, you need to know this.” The motivation has to relate to a clear, specific functional goal (in Brett’s words, something that might actually get you off work early on a Friday). And I’m not kidding when I say that every single page has to have one of these.

You might think, for example, that after you’ve written 300 pages or so, you could “coast” for a few pages. You could take for granted that you have the reader’s attention for the next ten minutes of reading and just give them material, taking up motivation again later on. But that’s not how we do it. We are so meticulous about motivation because we really want you to read the Head First books from cover to cover.

Writing with page-level explicit motivations is, speaking for myself, a brutal discipline. But it’s worthwhile if you’re truly committed to creating a great reader experience.

2. The content is broken into spreads

In none of my writing before Head First Data Analysis did I care about how the text was laid out on a page. The basic writing units I used were words, sentences, and paragraphs. How words were distributed across pages was something handled by my word processor, and it was a matter of indifference to me as to how the final text was laid out into columns or pages.

Because we want you to read our books all the way through, and because there are so many visual elements on the pages, we write Head First books to be mindful of what ends up on the specific spreads. Spreads are the left and right pages the reader sees when she has a book open flat, and what happens as her eye travels from the top left of the spread to bottom right is really important. When you write in a one-dimensional, word-after-word manner, you don’t think about this visual component of your reader’s experience.

Any book that uses graphic elements heavily is likely to have been written on a two-dimensional, spread-by-spread basis. I recently read a novel that relies on spreads as a functional writing unit. And one of the most important parts of writing a Head First chapter correctly is having the pacing in your spreads make sense.

Writing on a spread-by-spread basis blew my mind because it forces you to be more mindful about how the reader experiences your work. And even if one is writing something that doesn’t require spread-by-spread thinking, like a novel or long article, one should think about the esthetic effects and cognitive stimulation that well-formed spreads can offer.

3. The visualizations force you to think harder about what you’re trying to communicate

There are lots of diagrams, flowcharts, arrows, text boxes, and pictures in Head First books. As someone who’d previously only used words to communicate, reframing my knowledge into these formats forced me to think hard about what it was I thought I knew. For example, early drafts of Head First Data Analysis’s first chapter had a big flowchart that ambitiously attempted to be a comprehensive definition of everything that is “data analysis.”

After I wrote it, I didn’t feel great about it. Nor, it turns out, did my tech reviewers. Ultimately I decided to drop the visualization, since it wasn’t quite right and wasn’t even that important for someone learning data analysis (you don’t need a philosophically bulletproof definition of “cooking” in order to cook, either).

When I tried to communicate a grand unified definition of data analysis visually, I discovered that I didn’t actually have one. Whether a necessary, sufficient, grandiose theory of the nature of data analysis is important or useful is an important philosophical question, and my short answer is “No.”

Anyhow, I’d say that visual thinking imposes a rigid formality on your writing that the hedging and caveats of straight text enable you to avoid. I felt like communicating visually sharpened my mind. And that blew my mind.  :-)

If you’d like to learn more about communicating visually, the places to start are Dan Roam’s Back of the Napkin and the books of the world’s curator of cognitive art, Edward Tufte. My copy of the Roam book became particularly dog-eared during the writing of Head First Data Analysis.

4. We can simulate failure

In one of the chapters of Head First Data Analysis, you experience a big failure. I won’t reveal which chapter it is, but let’s just say that you learn the correct way to use an analytic tool, overreach with it ever so slightly, and experience disastrous results. Results that involve a lot of angry clients and might make you want to skip town for a little while to let people cool off.

Why take a big risk with the reader’s emotions by creating this sort of experience? Because failures, when we use them correctly, are incredibly effective learning experiences. How many of the biggest lessons you’ve learned are a direct result of mistakes you’ve made? If you’re like me, quite a few. Failure hurts, and it brings your attention into sharp focus.

And in the world of data analysis failure is a big deal. Analysis based on faulty assumptions and tool misuse are two of the biggest causes of analytic failure. Not only should you be aware of this, you should actually experience it in the sandbox of a book before trying out your new tools in the real world.

In the world of Head First, we call these failures “Oh Crap” experiences. They’re loads of fun to write, and they can make even the most boring-looking technical skill exciting.


Here comes Head First Data Analysis

July 14th, 2009 · Head First, data analysis

Data analysis is hot: every day we read about people doing complex and illuminating things with their data. It’s an exciting, dynamic time to be a professional analyst, and even people with backgrounds in statistics, engineering, or computer science have to work hard to keep up with the trends in data.

But how does someone without a background in data learn to do good analysis? Everyone needs strong data skills nowadays because business is data analysis.

In my consulting work as a data analyst, I’m accustomed to being viewed by clients with a mixture of appreciation and apprehension: appreciation because they love learning about what’s inside their data, and apprehension because, well, shouldn’t they know how to do this work themselves? The answer is “yes.” As useful as hired guns like me can be, there’s no substitute for your own thinking about your data. Over the long haul, it’s not a good idea to outsource your brain. But how do you learn to analyze data?

Most books about learning data analysis aren’t much help. If a book has “data analysis” in the title, it’s usually for one of these two audiences: people who need a reference for the data analysis functions of Microsoft Excel, and people with a strong mathematical background. It’s hard to learn data analysis from these books. The Excel-oriented books are about using the software, not about understanding deep analytic principles. And the highly mathematical books presume too much prior knowledge among people who want an introduction to analysis.

The world needs an interactive, learner-oriented book that will allow intelligent data novices to grasp the tools for using data to make better decisions. And not just software tools, but the big conceptual tools that underlie the best analysis and make for sharp thinkers. Head First Data Analysis, which ships in just a few weeks, was written in response to this need.

I hope that reading Head First Data Analysis is as exciting for you as writing it was for me. I can’t wait to hear what you think about it.

