How do you compare a pils to an imperial stout?

Which 90-plus beer should I drink tonight?

I have not so much made peace with “best” lists as run out of new ways to say why I don’t care for those that don’t provide sensible context. Thus when the latest lists from Rate Beer and Beer Advocate (in its print edition) arrived, I sat silent.

Sure, I was amused reading the conversations that followed Martyn Cornell’s “Why extremophiles are a danger to us all” — both the comments on his blog and posts (such as this one) it inspired — but I didn’t have anything to add.

However, by taking a sledgehammer to college rankings in the current New Yorker magazine, Malcolm Gladwell provoked a thought.

Gladwell begins his assault by examining the way Car & Driver ranks automobiles, writing the magazine’s “ambition to grade every car in the world according to the same methodology would be fine if it limited itself to a single dimension.” And, “A heterogeneous ranking system works fine if it focuses just on, say, how much fun a car is to drive.”

Which leads to what the essay’s really about, rating colleges.

A ranking can be heterogeneous, in other words, as long as it doesn’t try to be too comprehensive. And it can be comprehensive as long as it doesn’t try to measure things that are heterogeneous. But it’s an act of real audacity when a ranking system tries to be comprehensive and heterogeneous — which is the first thing to keep in mind in any consideration of U.S. News & World Report’s annual “Best Colleges” guide.

This is not to say that Rate Beer uses the same methodology to compile its lists as U.S. News does for colleges. But it does endeavor to be comprehensive and heterogeneous (even though the top of the list is dominated by homogeneous, i.e. imperial, beers).

And therefore we are left with rankings that imply we might compare an imperial pumpkin beer to an elegant, well-balanced, low-alcohol cucumber beer. Could we then use this as a guide when choosing a beer? Doesn’t work, does it?

(In all fairness to the beer rating sites they also group beers “by style,” making some homogeneous comparisons possible.)

Anyway, while I was reading Gladwell’s article — which delves into the subjectivity involved in setting “objective” standards — Pandora managed to feed me song after song that I didn’t feel the need to skip. It’s been a while since The New York Times explained how “The Music Genome Project” works, but it’s still a fascinating story. And one you may hear repeated in the coming months, because Pandora has filed for a $100 million IPO.

Some elements that these musicologists (who, really, are musicians with day jobs) codify are technical, like beats per minute, or the presence of parallel octaves or block chords. Someone taking apart Gnarls Barkley’s “Crazy” documents the prevalence of harmony, chordal patterning, swung 16ths and the like. But their analysis goes beyond such objectively observable metrics. To what extent, on a scale of 1 to 5, does melody dominate the composition of “Hey Jude”? How “joyful” are the lyrics? How much does the music reflect a gospel influence? And how “busy” is Stan Getz’s solo in his recording of “These Foolish Things”? How emotional? How “motion-inducing”? On the continuum of accessible to avant-garde, where does this particular Getz recording fall?

There are more questions for every voice, every instrument, every intrinsic element of the music. And there are always answers, specific numerical ones. It can take 20 minutes to amass the data for a single tune. This has been done for more than 700,000 songs, by 80,000 artists. “The Music Genome Project,” as this undertaking is called, is the back end of Pandora. [Note: The article is from 2009 and those numbers have grown.]

Would it be possible to do something similar for beer? I’m guessing homogeneous would work better than heterogeneous — there’s a reason that Frank Sinatra songs never show up on my Chris Knight station — and finding volunteers for research would be easy.
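
If there were a “Beer Genome,” the mechanics might look a lot like Pandora’s: hand-scored attribute vectors and a nearest-neighbor lookup. Here is a minimal sketch; the attribute names and every score below are invented for illustration, not measured.

```python
import math

# Hypothetical hand-scored attributes on a 1-5 scale, Pandora-style.
# The beer names are real; every score below is invented for illustration.
GENOME = {
    "Pilsner Urquell": {"bitter": 3, "malt": 2, "roast": 1, "body": 2, "esters": 1},
    "Weihenstephaner Hefeweissbier": {"bitter": 1, "malt": 2, "roast": 1, "body": 3, "esters": 5},
    "Old Rasputin": {"bitter": 4, "malt": 5, "roast": 5, "body": 5, "esters": 2},
}

def distance(a, b):
    """Euclidean distance between two attribute vectors."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))

def most_similar(name):
    """Return the beer whose 'genome' sits closest to the named one."""
    target = GENOME[name]
    return min((b for b in GENOME if b != name), key=lambda b: distance(target, GENOME[b]))

print(most_similar("Pilsner Urquell"))  # -> Weihenstephaner Hefeweissbier, by these made-up scores
```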

37 thoughts on “How do you compare a pils to an imperial stout?”

  1. One of the themes running through the chatter Cornell provoked is that consumers use the rankings in choosing new beers (hence the photo at the top).

    Were there a “Beer Genome Project” I would expect one goal would be to compare elements in a pilsner to elements in other beers to suggest what, if any, imperial stouts a drinker might like.

    “How do you compare elements in a pils to elements in an imperial stout?” wouldn’t fit in the headline ;>)

  2. On BA’s ratings, Weihenstephaner Hefeweissbier does manage to come in 17th overall, ahead of things like Old Rasputin, DFH90, and Bourbon Co. Stout.

    Here’s an idea–take the difference between an individual beer’s rating and the average rating for that style. Compare the differences–is Abt 12 “better” than the average Quad to a greater extent than, say, Budvar is to the average Pils?
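
    A minimal sketch of that calculation, with made-up scores and style averages standing in for the sites’ real data:

    ```python
    # Hypothetical scores and style averages, purely for illustration.
    score = {"St. Bernardus Abt 12": 4.35, "Budvar": 3.75}
    style = {"St. Bernardus Abt 12": "Quad", "Budvar": "Pils"}
    style_average = {"Quad": 4.05, "Pils": 3.40}

    # "Better than its peers" = the beer's score minus its style's average.
    for beer in score:
        delta = score[beer] - style_average[style[beer]]
        print(f"{beer}: {delta:+.2f} vs. the average {style[beer]}")
    ```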

  3. Jay – Weihenstephaner Hefeweissbier is a good example. On a banana-clove scale it is balanced toward banana. I prefer a beer balanced toward clove, say Gutmann. Were I standing in a store confronted with a dozen US-brewed hefes then a Pandora-like app sure would be nice.

    Of course, we’re talking homogeneous. One step away would likely be Belgian-brewed or Belgian-inspired beers. In that case, though, I want spice rather than clove.

  4. re: the genome project, there are at least five easily quantifiable stats for beers that can be used to supplement the rankings–ABV, SRM, IBU, FG, OG. Most people know if they subjectively enjoy beers that are strong, dark, bitter, malty, etc. (A sketch using these five stats follows this comment.)

    Red Ale is like that AC/DC song that keeps sneaking onto my Pandora playlist.

    I’m not really into the whole online beer-rating community, but it would take one hell of an algorithm to replace the guy at my local liquor store. It takes three minutes to listen to a song, longer to enjoy a beer, with the latter usually being done away from computers.

    Although once they put QR codes on all the bottles…
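
    As flagged above, a minimal sketch of jay’s five stats as a feature vector; all values are invented examples, and the wildly different scales need normalizing before any comparison:

    ```python
    # jay's five stats as a feature vector; the values are invented examples.
    beers = {
        "pils":           {"abv": 4.8, "srm": 3,  "ibu": 35, "fg": 1.009, "og": 1.046},
        "red ale":        {"abv": 5.5, "srm": 15, "ibu": 25, "fg": 1.012, "og": 1.054},
        "imperial stout": {"abv": 9.5, "srm": 60, "ibu": 70, "fg": 1.024, "og": 1.095},
    }
    features = ["abv", "srm", "ibu", "fg", "og"]

    # Min-max normalize each feature to 0-1 so ABV (~5) and OG (~1.05)
    # carry comparable weight in any distance calculation.
    lo = {f: min(b[f] for b in beers.values()) for f in features}
    hi = {f: max(b[f] for b in beers.values()) for f in features}
    vectors = {
        name: [(b[f] - lo[f]) / (hi[f] - lo[f]) for f in features]
        for name, b in beers.items()
    }
    print(vectors["red ale"])
    ```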

  5. Interestingly enough, banana, clove, and vanilla flavors are all produced by specific, single chemical compounds that can be measured by gas chromatography. I’m guessing most brewers smaller than InBev don’t put that kind of money into analysis, though.

  6. Stan, fully aware of the ongoing discussions and motivation for asking “how?” I was thinking more philosophically in asking “why?”

    I’ll add that once you identify and quantify the qualities for comparison you’ll have to do some math to make recommendations (which I’m sure Pandora does). If you choose your comparison measures with any meaning, the result of the algorithm would be that imperial stout and pils are orthogonal.
    i.e., there is no overlap of my Miles Davis and Metallica Pandora stations.
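
    TimC’s orthogonality point can be made concrete with cosine similarity; a toy example with invented 0-5 intensity scores:

    ```python
    import math

    # Invented intensity scores for [crispness, roast, body, sweetness].
    pils           = [5, 0, 1, 1]
    imperial_stout = [0, 5, 5, 4]

    def cosine(a, b):
        """Cosine similarity: 1 = same direction, 0 = orthogonal."""
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

    print(round(cosine(pils, imperial_stout), 2))  # ~0.21 -- nearly orthogonal
    ```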

  7. jay – Speaking from experience I can tell you it’s hard enough to get OG and FG from many brewers. Gas chromatography is also a useful tool for analyzing hop aroma – reminding me I should get back to work.

    TimC – Somewhere along the line my Los Lobos station became the one giving me zydeco, rather than my Neville Brothers station. That’s an aside. Given that imperial stouts and pilsners are both beers I’d argue they share pretty important characteristics. But that’s a discussion better continued over beer.

  8. “A heterogeneous ranking system works fine if it focuses just on, say, how much fun a car is to drive,” Gladwell wrote.

    It seems to me that this is exactly what RB and BA rankings do. These are hedonistic rankings in which thousands of individuals are rating how much they enjoy different beers.

    I’m also not sure the lists attempt to be comprehensive. For that, the scores would have to take into account things like price, availability, label art… And they do not.

    The fact that they’re heavy on imperial stouts is more telling than it is problematic. The sort of person who enjoys scoring beers is more likely to enjoy that sort of beer. Fine. We don’t have to agree with it, any more than we have to agree with any of the ridiculous beer lists that come out.

    The only list that really matters is the one I take with me to the beer shop.

  9. Let me begin by saying that I agreed completely with what Martyn wrote. What I see as the problem in understanding his point is not the methodology of these sites, but their user base.

    The two fan sites attract a certain beer fan. This appeal is honed in the forum, where people with non-fan viewpoints are abused. The result is a user base which counts rarity as very important in its evaluation of beer (perhaps there is no place in the form to check that, but it certainly plays a large part in the discussion and, perhaps unconsciously, in the ratings). They also seem never to find a beer that they like — why are they always looking for the next beer?

    It has long been my belief that the beer fans (or geeks, if you prefer) don’t actually like beer. When is the last time that a good pilsener was in the top 10 (or even top 100) of either of these sites? Why don’t the fans/geeks like “regular” beer (not industrial, just average, tasty beer)? Why do the highest ranked beers have so little in common with, say, the beers I drink daily?

    Since beer is relatively cheap (I don’t live in the US), I don’t see the need for lists. When I go to the beer shop, I know what I like and I buy those beers. If I want to try something new, it’ll usually cost little enough that I don’t mind if, in the end, I don’t like the beer and so lose the money.

    I hope there will never be a Pandora for beer. It’s just too much fun going out and trying it on your own.

  10. Joe – This is a complete aside, and college tuition is much further out there for you than me (we’re talking kids, folks), but one of Gladwell’s major complaints about college ratings is that price is not factored in.

  11. Stan, I love it when minds think alike (I’ll leave the “great” out of it). A few months ago, I mused on much the same thing. If you wanted to do it properly, you’d have to assemble a group of fairly talented tasters to go through a bunch of beers and assemble a similar genome.

    Creating the criteria would be the most difficult thing. Using a standard like the flavor wheel would get you part of the way there, but I think some evocative criteria would also be necessary. “Sunny,” for example, for beers with spritzy top notes.

    You could assemble the list slowly over time, ensuring that you started broad and shallow and then deepened the pool with more and more similar examples. (You can see I’ve thought about this a bit.) I’d be willing to participate in creating the genome if you’re prepared for launch.

  12. Mike / Dave – Pintley has been called the Pandora for beer, but I’ll make a little distinction between what Pandora and Pintley actually do.

    Pandora employs many musical experts, who listen to every single song that goes into their database and mark all of its attributes. They called it something along the lines of Music DNA. Then, Pandora (the app) recommends songs to you by matching that Music DNA profile to what you tell them you like (by liking songs).

    We don’t create profiles of every beer. We stay objective, and any beer in our database has the same chance of being recommended as any other. Our recommendations are based on your ratings of beers you’ve tried, and the ratings of your friends and other Pintley users.

    Mike, I’d like to think what we do is still in the spirit of what you’re saying… which is just trying new beer. We don’t recommend beer you’ve already rated; we only recommend stuff you haven’t tried before. We’re not going to keep putting your favorite beer up as what you should drink, like Pandora keeps putting your favorite songs in your stations.

  13. Dave – I sent Pintley your question (and got an answer even before I could finish the note to Jeff).

    Jeff – I apologize for not remembering that (gee, I even commented). I seem to have hops already written on my calendar every day for the next year. So it hardly seems fair of me to tell others what to do, but I’d like to see it start with a pretty technical background.

    As jay pointed out, only the bigger brewers (although they don’t have to be InBev size) can afford gas chromatography. However, there’s plenty of general research using that and other tools.

    The front end doesn’t need to use terms like isoamyl acetate, 4vg and C1CCCCCCCCCCCC (SMILES for cedarwood), but the back end should.

    Of course, this is all a pipe dream.

  14. “— why are they always looking for the next beer?”

    Funny coincidence that I was just thinking similarly along those lines this morning.

    Why does everything have to be the “next big thing” to the majority of “beer geeks”?

    Stan — I’m going to find a good comparison beer to drink with my Getz discs. Reporting back sometime in the next century.

  15. Stan, I would think a Pandora-esque beer recommendation program (at least a rudimentary version) would actually be fairly easy to develop. Rather than using gas chromatography or chemical analysis, couldn’t you simply do a text analysis of the reviews on popular sites like RateBeer and/or BeerAdvocate? Picture a “tag cloud” for each beer, where the most frequent words (like the aforementioned “banana” for Weihenstephaner Hefeweissbier) would be largest. A user could simply type in the attributes that are most important to them and find beer recommendations that best match their keywords. This would allow for easy filtering. Let’s say you love Scotch Ales, but recognize that all Scotch Ale reviews will include words like “malty”, “roasted”, and “caramel”. Just filter out those words and search for the facets of the beer that you are looking for but perhaps aren’t quite as ubiquitous (say “peaty” or “stone fruits”). Just a thought… (a rough sketch follows this comment).

    On another note, I’m the author of the rebuttal to Martyn’s post that Stan linked to above (thanks for that, Stan). I wanted to reiterate that I have NO problem with Mr. Cornell taking issue with the “Best Beer” lists that are so popular on sites like RateBeer and BeerAdvocate. While I love big Imperial brews, I fully recognize that they are just a small part of the entire beer spectrum and that championing them at the expense of other styles is both foolish and short-sighted. My biggest concern with his post (and with some of the follow-up comments) is this “us vs. them” mentality that seems to be sprouting up in the beer world. There’s a sense that craft beer drinkers are divided into two camps: the “Extremophiles” who only drink high-gravity, over-hopped monsters, and the “Sessionistas” who drink what your commenter Mike calls “regular”, “average” beer. This worldview seems to mimic the current state of US politics, with the Extremophiles akin to the “crazy, hippie liberals” and the “Sessionistas” serving as analogs to the “reactionary, close-minded conservatives”.

    The craft beer industry is still nascent and, despite its gains in recent years, still somewhat fragile. I would hate to see it cannibalize itself in a cavalcade of in-fighting and verbal warfare. We may disagree (vehemently at times), but overall, we’re all just fans of that magical melding of malt, hops, water and yeast. Whether you worship at the altar of hops or drink the same, smooth lager day after day, we’re all in this together. We’re all Aleheads (if you’ll pardon my shameless self-promotion).
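
    As promised above, a rough sketch of the tag-cloud-with-filtering idea; the “reviews” here are invented one-liners standing in for thousands of scraped ones:

    ```python
    from collections import Counter
    import re

    # Invented snippets standing in for scraped reviews of one Scotch ale.
    reviews = [
        "Malty and rich, big caramel, a hint of peaty smoke.",
        "Caramel, roasted malt, stone fruits underneath.",
        "Sweet caramel body, peaty finish, very malty.",
    ]

    # Words every Scotch ale review uses; filter them so rarer facets surface.
    style_stopwords = {"malty", "caramel", "roasted", "malt"}

    cloud = Counter(
        word
        for review in reviews
        for word in re.findall(r"[a-z]+", review.lower())
        if len(word) > 4 and word not in style_stopwords
    )
    print(cloud.most_common(5))  # "peaty" tops this toy cloud
    ```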

  16. Interesting thoughts.

    However, I think the largest problem with beer rating sites isn’t how beers are ranked necessarily, but who is ranking them. What you end up having is largely a bunch of very enthusiastic (which is great), but generally inexperienced (or at least un-objective) beer drinkers who’re not a great source of data. Pandora doesn’t let users vote on which attributes a song has; they have musicians and qualified people do that. Car & Driver doesn’t rate cars based on an online vote; they have experienced writers and drivers do it, etc. Instead, what we have on beer websites is what I’ll call the “American Idol effect” where you have only a very broad consensus that doesn’t really help anyone.

    I would like to see a beer genome type project with beer writers, brewers, etc. contributing.

  17. What would be great is something akin to the Netflix/Amazon movie/book recommendation algorithm. No exhaustive lists of characteristics from experts, no comparing pilsners to imperial stouts, just “people who liked this beer also liked _____” (a bare-bones sketch follows this comment). If that’s what Pintley does, I’ll have to try it.

    I’m not like most beer drinkers–the number one thing I look for in a beer is “which of these haven’t I had before.”

    Given limitless time and resources, I’d love to sit around with a GC/MS, LC/MS and a variety of beers. But I’m not like most beer drinkers.
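
    The bare-bones sketch promised above: a co-occurrence count over liked beers (users and picks invented; real recommenders like Netflix’s use far heavier machinery):

    ```python
    from collections import Counter

    # Invented data: each user mapped to the set of beers they liked.
    liked = {
        "alice": {"Old Rasputin", "Ten FIDY", "Budvar"},
        "bob":   {"Old Rasputin", "Ten FIDY"},
        "carol": {"Budvar", "Pilsner Urquell"},
    }

    def also_liked(beer):
        """Among users who liked `beer`, count what else they liked."""
        fans = [beers for beers in liked.values() if beer in beers]
        return Counter(b for beers in fans for b in beers if b != beer)

    print(also_liked("Old Rasputin").most_common(3))
    # [('Ten FIDY', 2), ('Budvar', 1)]
    ```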

  18. FlagonofAle,

    I think that’s what awards (like those given at GABF) are for. Beer experts objectively judge beers based on a standard set of criteria for that style. The problem with Car & Driver is that I’m sure all the supercars get excellent reviews and all the subcompacts are horrible, but they’re happy to tell you which is the least horrible one.

    What you say is a problem I say is an advantage. If you’re a beer expert or judge yourself, then I’ll be the first to admit, Pintley might not be for you. You’re probably more interested in reading about brewing techniques, ingredients, yeast strains and such to decide if you “want” a particular beer or not. What Pintley aims to do is to help people who don’t know what they want, or maybe even people who don’t know the difference between an ale and a lager, figure out what they like.

    At Pintley, we’ve said it before and we’ll say it again… The best beer to drink is the one *you* like. We’re only here to help you explore 🙂

  19. Stephen – I respectfully disagree. In a compare and contrast there might be more contrast than compare, but they are both beer, and well-brewed examples will share many characteristics.

  20. “A heterogeneous ranking system works fine if it focuses just on, say, how much fun a car is to drive.”

    Which is essentially how the rating systems at Beer Advocate and Ratebeer work. “How much do you enjoy drinking this beer?”

    Which also explains why, for example, I can rank an average American pale ale higher than a technically perfect pilsener simply because I like it better.

  21. Jeremy,

    Of course, the thing is RB/BA only tell you what other people thought of a beer. On a 5-point system, the vast majority of beers are going to have an average rating of ~3.3.

    What this post was talking about, and what Pintley does, is recommendations for a user, as opposed to just making users blindly search for a beer. You can’t go and attempt to find a beer if you haven’t heard of it, right? 🙂

  22. Stan, a well-grilled piece of sirloin and a rack of great barbecue pork ribs are both pieces of cooked animal flesh. The latest by Cee Lo Green and a symphony by Beethoven are both music. The Little Engine That Could and A Brief History of Time are both books.

    A set of shared characteristics does not necessarily translate into comparability. Just saying.

  23. Stephen –

    True, but you’re not taking into account that such a system wouldn’t be trying to compare, say, pork ribs to a board game. It would be comparing, say, recipes, and attributes would include main ingredient, preparation style, etc. In that sense, yes, one could compile attributes for both a sirloin and a bbq ribs recipe, and then try to match the attributes of said recipes against what a user says they like.

  24. Perhaps it is because I’m the guy who compared Peter, Paul and Mary to Blue Moon White . . . those things look do-able. Or would if I knew who Cee Lo Green is ;>)

    “a well-grilled piece of sirloin and a rack of great barbecue pork ribs are both pieces of cooked animal flesh.”

    True, but just as I often ponder a Pils or Stout as I’m standing in front of the beer shelf, my wife & I will ponder what to enjoy off the grill on a Saturday afternoon — comparing, contrasting, debating what we’re most in the mood for.

  26. Exactly, Steve! Each has its own mood, its own set of circumstances — even if, in different months, or even different weeks, those circumstances might change or overlap — its own time. So what is the point of trying to compare them? When I’m in the mood for ribs, I don’t want a sirloin instead, and so at that moment, the sirloin will rate well below the ribs, say, scoring an 85 instead of the 97 it would get when I was really feeling like a good steak.

  27. ” So what is the point of trying to compare them?”

    Just as I said, to decide just what I’m in the mood for! And that is not to say that I won’t be in the mood for the other 24 hours later, 😉

    Then there’s that dog-gone tuna steak in the fridge… ahh, choices!

  28. Mr. Steve, you and I are saying the same thing different ways. What matters is mood and desire, not point-to-point evaluation of what is quantitatively “better.” That steak or strong IPA is better for one moment, the ribs or pilsner better for another, and your tuna steak or porter better for a third. Comparing them would produce a different victor each time.

  29. It appears to be beyond my comprehension. Should I be able to find first round results online? (Second round is tomorrow.) Should I care?

  30. “…not point-to-point evaluation of what is quantitatively ‘better.'”

    Oh gosh, no — with so many great choices in the beer world I could never start to pontificate on “what is better” — as so often seems to be the case in beer forum discussions.

    I always point out that today’s choices, as compared with those before the rise of better beer, ought to make more people happy, not set them to proving what’s good and bad.

    But this is not to say that different beers can’t be compared without trying to pick a “winner.” The drinker is the winner.

Comments are closed.