Nit-picking Kelley Blue Book
Posted May 15th 2012 9:00AM

A natural extension of being a tech journalist is being a stats geek. As it turns out, nothing makes a stats geek happier than finding something wrong with someone else's numbers. Especially when that someone is Kelley Blue Book (KBB), a cornucopia of car statistics ranging from range to resale value.
Now, we love KBB - lots of great information, insight and reviews. It's the original go-to source for "what should this lemon/diamond-in-the-rough actually cost?"-type queries. In fact, we recently posted a story about KBB's "Best Resale Value Awards," because KBB really does lead the way in generating that kind of informative repository of best and worst and in-betweens.
But sifting through KBB's archive for background info on its award-winners revealed an odd and often-troubling lack of agreement... the numbers that KBB uses for its various ratings didn't always add up. We don't want to nitpick, but... actually, no, that's exactly what we do want to do. We should make clear that we haven't done any comprehensive statistical analysis, and we make no claims about KBB's methodology or overall system. We've just noticed some things that don't jibe, that should jibe, and that have us a bit worried.
Nit #1: Numbers say no, Words say yes.
It started with KBB's review of the Chevy Volt. On reading, it comes off as not quite a rave, but certainly enthusiastic. "With its cool and elegant styling, and even more cool and elegant interior, the 2012 Chevy Volt plug-in hybrid not only makes a statement about your environmental consciousness, it also says something about your good taste and strong sense of style."
Well, the teacher certainly likes you, Volt, can't wait to see your report card!
KBB's Expert Ratings of the 2012 Volt:
Driving Dynamics: 7.0
Comfort/Convenience: 6.5
Design: 7.3
Value: 5.8
Safety: 9.3
Overall: 6.5
Oh... a 6.5 out of 10 overall rating? What happened to all that style? What does that rating mean anyway?
Must have read that wrong, let's check again. Oh look, KBB just updated its opinion (you can do that?).
KBB's Revised Expert Ratings of the 2012 Volt:
Driving Dynamics: 6.4 (down from 7.0 - what happened there, experts?)
Comfort & Convenience: 7.0 (up from 6.5 - broke in the seat, no doubt)
Design: 6.4 (down from 7.3 - fashion is fickle)
Value: 5.2 (down from 5.8 - what a difference a week makes)
Safety: 10.0 (up from 9.3 - not just mostly-safe anymore)
Overall Rating: 7.0 (up from 6.5 - way to go Volt!)
KBB seems to be of a new opinion when it comes to Volt value. Is this because of the recent news about lowered expectations and layoffs at the Volt factory? Hard to say, because though the numbers have changed, the text of the review has not.
Breaking it down, KBB assigns "expert ratings" to several qualities of each car it reviews - driving dynamics, comfort and convenience, design, value, and safety. How it generates these scores isn't clear - it seems to be a quantitative expression of the expert's qualitative opinion of the car. Fair enough, but it's hard to reconcile things like the 6.5 or 7.0 the Volt receives for comfort and convenience with the written review, where the harshest point made is that rear-seat passengers must duck their heads before climbing in. Really? 3 points off for not being a Popemobile?
Also notable is the fact that the overall rating is not simply an average of the other scores. The mean of the Volt's original category ratings is about 7.2, a good deal more respectable than the 6.5 overall it was given. (The revised category ratings average 7.0, which happens to match the revised overall.) So either some categories are weighted more heavily than others, or something else is going into the total. Which one and by how much and why is anyone's guess.
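For anyone who wants to check the arithmetic, here is a minimal sketch in Python that takes a plain, unweighted mean of the category scores quoted above. The equal weighting is our assumption for the sake of comparison; it is not KBB's formula, which isn't published.

```python
from statistics import mean

# Category ratings exactly as quoted in the two lists above.
original = {
    "Driving Dynamics": 7.0,
    "Comfort/Convenience": 6.5,
    "Design": 7.3,
    "Value": 5.8,
    "Safety": 9.3,
}
revised = {
    "Driving Dynamics": 6.4,
    "Comfort & Convenience": 7.0,
    "Design": 6.4,
    "Value": 5.2,
    "Safety": 10.0,
}

# Overall ratings as published by KBB, for comparison.
published_overall = {"original": 6.5, "revised": 7.0}


def unweighted_mean(ratings):
    """Plain average of the category scores (equal weights are an assumption)."""
    return round(mean(ratings.values()), 2)


original_mean = unweighted_mean(original)
revised_mean = unweighted_mean(revised)

print("original:", original_mean, "vs published", published_overall["original"])
print("revised: ", revised_mean, "vs published", published_overall["revised"])
```

Run as written, this gives roughly 7.18 for the original scores against a published 6.5, and 7.0 for the revised scores.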
Nit #2: Comparing something to nothing.
Another number featured prominently on the Volt's review page (and all KBB review pages) is "How It Ranks." The Volt ranks 1 out of 11 for fuel economy and 10 out of 11 for horsepower. "Whoa!" you're probably saying. "Where did Kelley find ten other electric cars to compare with the Volt?" They didn't, of course. It turns out that each car is compared on several quantitative factors (fuel economy, horsepower, U.S. MSRP, consumer ratings, and five-year cost) to its own specially generated list of cars that are within 3 per cent of its U.S. MSRP and in the same category. Which is kind of a neat idea in theory, but defining a car based on its size and shape really glosses over the reasons people are looking to buy cars in the first place. It isn't likely that anyone is thinking, "Hmmm, the Volt or the Buick LaCrosse ... so close in price and they both have four doors ... so hard to choose. Hey! Which has more horsepower?" That's just not how most people shop.
Still, fruit-to-fruit comparisons are a good place to start, even if there aren't that many vehicles like the Volt. The Volt is a pear being compared to apples, but at least it's a fruit with four doors.
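To make the "How It Ranks" idea concrete, here is a rough sketch in Python of how such a peer list could be built and ranked. Everything in it is an assumption for illustration: the `Car` fields, the `peer_set` and `rank_by` helpers, and the four made-up sedans and one made-up SUV are ours, not KBB's data or code. The only thing taken from the description above is the rule itself: same category, MSRP within 3 per cent, and the car counted in its own list.

```python
from dataclasses import dataclass


@dataclass
class Car:
    name: str
    category: str
    msrp: float
    horsepower: int


def peer_set(target, cars, tolerance=0.03):
    """Cars in the target's category whose MSRP is within `tolerance` of the target's."""
    return [
        c for c in cars
        if c.category == target.category
        and abs(c.msrp - target.msrp) <= tolerance * target.msrp
    ]


def rank_by(target, peers, key, higher_is_better=True):
    """Return (rank, out_of) for the target among its peers; rank 1 is best."""
    ordered = sorted(peers, key=key, reverse=higher_is_better)
    return ordered.index(target) + 1, len(ordered)


# Hypothetical cars, invented purely for this example.
target = Car("Sedan A", "sedan", 40000, 150)
cars = [
    target,
    Car("Sedan B", "sedan", 40900, 300),  # within 3 per cent
    Car("Sedan C", "sedan", 38900, 250),  # within 3 per cent
    Car("Sedan D", "sedan", 42000, 280),  # more than 3 per cent away
    Car("SUV E", "suv", 40000, 270),      # different category
]

peers = peer_set(target, cars)
hp_rank = rank_by(target, peers, key=lambda c: c.horsepower)
print(f"{target.name} ranks {hp_rank[0]} out of {hp_rank[1]} for horsepower")
```

Notice that nothing in the filter asks what kind of powertrain a car has. Price and body style decide who the neighbors are, which is how a plug-in hybrid can end up ranked on horsepower against ordinary sedans.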
Nit #3: Which Civic Am I Talking About? Which One Do You Want Me to Be Talking About?
Once you get started poking around in KBB's rankings, it's hard to stop. But we were brought up short by the realization that where there are different versions of a given model, KBB will use the specs from one version for one ranking, and from a different version for another. You think you're seeing an overall ranking of that model, but the specs have been cherry-picked.
For example: if you're reading about the Honda Civic Hybrid and you click on "See all specifications," or "Read the expert review," you're taken to a page that is not about the hybrid itself but about the 2012 Civic in general. That's kind of annoying but understandable - we can't really expect KBB to write a separate review of each version of each model.
Where it becomes a problem is when you click on "View all rankings" so you can see how the car compares to its peers (again, four-door sedans within 3 per cent of the Civic's U.S. MSRP). The Civic stands up quite well on every measure. Amazingly well, in fact.
It should. Because depending on the factor being measured, KBB uses specs from different Civic models to describe an imaginary Civic for comparison with comparably-priced four-doors. If you check fuel economy, you'll see the Civic Hybrid's impressive 5.3 L/100 km (44 mpg), putting the Civic in top place. But if you click on horsepower, you'll see instead the power rating of the Civic Si, which has by far the greatest horsepower of the Civic line - it ranks third with a formidable 201 hp. Fact is, you can't get a Civic Hybrid with 201 horses under the hood any more than you can find a pear on an apple tree. If KBB had used the hybrid's power rating, its horsepower ranking would have been dead last. If they'd used the Si for fuel economy comparisons, the Civic would again rank dead last.
The Civic presented by KBB for comparison doesn't actually exist. It's an amalgamation that combines the best qualities of one Civic with the best qualities of another - qualities that are generally tradeoffs against one another - producing a sort of hypothetical super-Civic.
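The super-Civic effect is easy to reproduce. In this small Python sketch, the 44 mpg and 201 hp figures are the ones quoted above; the hybrid's horsepower and the Si's mpg are placeholder values we made up so the example runs, and the best-of-each-trim rule is our reading of what the rankings page appears to do, not a documented KBB method.

```python
# Specs per trim. Values marked "placeholder" are NOT from KBB or this article.
trims = {
    "Hybrid": {"mpg": 44, "hp": 110},  # 44 mpg quoted above; 110 hp is a placeholder
    "Si": {"mpg": 25, "hp": 201},      # 201 hp quoted above; 25 mpg is a placeholder
}

# Take the best value of each spec across all trims, whichever trim it comes from.
composite = {
    metric: max(specs[metric] for specs in trims.values())
    for metric in ("mpg", "hp")
}

# Which real trims actually deliver every number in the composite?
matching = [name for name, specs in trims.items() if specs == composite]

print("composite Civic:", composite)
print("trims matching the composite:", matching)
```

The composite comes out as 44 mpg and 201 hp, and the list of trims that match it is empty. Labeling each number with the trim it came from would be enough to make the tradeoff visible again.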
Not that KBB is necessarily showing a bias towards Honda. If you look at other models that have a hybrid version, you'll see similar shenanigans with variations. The Hyundai Sonata's rankings use its hybrid version's fuel economy rating, but it's not labeled as belonging to the hybrid. The Ford Escape shows the mpg of the hybrid powertrain, and labels it as such, but uses specs from other Escapes for everything else. Meanwhile, the Ford Focus review page makes no mention of a hybrid model, though one has been available since early 2011. The Ford Fusion review page says that the hybrid is reviewed separately but doesn't say where, and links just bring you back to the original page.
Also, the number juggling coughs up different stats depending on where in the site you start from. If you connect to the Civic's review page from the page describing its award for "Best Resale Value: Hybrid," you're taken to the page showing the rankings described above. If you connect to it from the front page, the Civic's review page will then omit any reference to the rankings, and there's no visible link to them.
Nit-picking, yes, but nit-picking that matters. It matters to consumers, and it should matter to KBB. For consumers, buying a car is all about tradeoffs. Power over fuel economy, size over handling, seating over cargo space, and so on. You may be able to have some of both, but you can't have all of everything. KBB's curious arrangement of rankings obscures your view of the tradeoff you're making.
For KBB, the stakes are higher still; after all, we consumers can just click over to another site. But KBB's business model is based on two main assets: first, a proprietary system of monetizing its wealth of automotive information, and second, our trust in what they do with that information. If consumers start questioning the system, it will erode their trust; and without consumers' trust, the system's not worth much.
At this point, we decided we'd better talk to someone at Kelley Blue Book. Sure, it's fun to point out everyone else's flaws and then sign off. But KBB is a really good site, and we don't want to cast aspersions without asking for an explanation.
Which, it turned out, they were happy to give.
Jason Allan, a KBB editor and the author of both the Civic and the Volt reviews, was forthcoming about the challenges of maintaining the auto industry datasets, which are large and constantly changing. "It's a particular challenge to write and present information about a car with a lot of different versions, like the Civic," he said. "We're trying to show that it does cover the bases. Are you talking about a fast little coupe? Civic has one of those. Are you talking about a really fuel-efficient model? Civic has one of those too."
So how are you supposed to know which is which? Allan pointed out that when you start looking at specific models to compare them - choosing trim, options, etc. - then you only see numbers and rankings that are exactly specific to that model. He acknowledged that it could be misleading to have the "#1 out of 15 on fuel economy" and "#3 out of 15 on horsepower" right next to each other on the same page, and agreed that labeling these stats "Hybrid" and "Si" would be clearer.
"The website is always evolving," he said. "We write the review as soon as we drive the car, but we periodically go through and revisit the ratings, and probably wouldn't amend the review itself unless there was some really substantial change. With the Volt, the safety rating, for example, is tied directly to NHTSA's rating." Which partially explains the change in the Volt's ratings, but we had to ask: why isn't the overall rating the mean of the individual ratings? What else is in there?
"There's another rating," Allan said, "called the 'recommended' rating." He explained that it captures the reviewers' sort of general sense of whether they do or don't recommend the car. It doesn't show up with the others.
Why not?
"It came along after the other ratings were set," Allan said.
But why aren't they shown now?
The design of that part of the display was done a while ago, Allan said. "If we went back and talked about it, we'd remember why we did it that way."
Why do you need a separate, hidden rating to capture "recommended"? Wouldn't this be fully captured by the other ratings?
"It's more to capture overall nuance," Allan said. "Things that aren't fully quantifiable."
Well, we might argue that nuance is no less quantifiable than a reviewer's assessment of a car's design. "7.3 for design" is taking an opinion and turning it into a number; seems you could do the same thing with nuance. Still, this explanation does make some intuitive sense. If you had a trusted friend who knew everything about cars, you might call and ask for his opinion of a certain car's safety, drivability, design, etc. - and still want to know, "But overall, do you, you know, like it?"
But - the key difference is that you would know that you were asking for your friend's general feeling about the car. KBB's hidden "nuance" rating doesn't trust us with that information.
It all comes back around to trust. KBB wants to keep its formulae and all its systems private, and that's fine. They've put plenty of time and money into developing those systems, and should protect them. But just like when you're buying a car, there are tradeoffs: in this case, between privacy and trust.
To maintain our trust, KBB has to think about a bit more transparency. Not every number has to be a black box. Some of them can have a few windows. If a reviewer's gut feeling is part of a car's ratings, tell us! If the rankings you're presenting apply to different versions, tell us! We can handle it - certainly better than we can handle gazing at numbers that don't seem to add up, and noticing that they're surrounded by ads for the car we're reading about... which a minute ago just seemed like good business, but now starts to seem insidious.
Don't worry about your formula, KBB; we don't want to know. You know on your home page where you laughingly say "Don't try this at home"? We don't want to try it at home or anywhere else. We want to trust you to do it for us. And if you can just open a window or two - we will.
News Source: Evergeek