Should a nonprofit auto history group create a credible data bank?

(EXPANDED FROM 4/16/2021)

One of the challenges of working with automotive sales and production data is that the most easily available sources don’t always appear to be accurate. Thus, it felt validating to come across a comment by MCT over at Curbside Classic.

“I am amazed at how much disagreement and discrepancy there is between different sources,” MCT (2015) wrote. “Sometimes the differences are fairly minor and inconsequential, but sometimes less so. You’d think ‘How many did they make’ would be a pretty easy question to answer, but I’m finding that it’s often surprisingly difficult, at least in exact terms.”

Two major sources of historical data can conflict

MCT went on to give examples of how production data didn’t always align in the two most popular sources of U.S. automotive history:

“1968 fullsize Ford production: From the Standard Catalog, I get 867,247; from the Encyclopedia of American Cars, I get 867,292. The discrepancy is in the number of Custom 500 two-door sedans built. The Standard Catalog says 8,938; the Encyclopedia says 8,983.

“1969 fullsize Ford production: From the Standard Catalog, I get 998,796; from the Encyclopedia of American Cars, I get 1,014,850. The discrepancy is in the number of Country Sedan station wagons built. The Standard Catalog shows 36,287 six-passenger models and 11,563 ten-passenger models; the Encyclopedia shows 36,387 and 27,517” (MCT, 2015).
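
MCT's figures can be checked with simple arithmetic: if the gap between the two totals equals the gap in the model-level figures, then those models account for the whole discrepancy. Here is a minimal Python sketch that uses only the numbers quoted above. The variable and function names are my own illustration and do not come from either reference book.

```python
# Figures as quoted by MCT (2015); all names here are illustrative only.
totals = {
    1968: {"standard_catalog": 867_247, "encyclopedia": 867_292},
    1969: {"standard_catalog": 998_796, "encyclopedia": 1_014_850},
}

# Per-model figures as (Standard Catalog, Encyclopedia) pairs.
models = {
    1968: {"Custom 500 two-door sedan": (8_938, 8_983)},
    1969: {
        "Country Sedan six-passenger": (36_287, 36_387),
        "Country Sedan ten-passenger": (11_563, 27_517),
    },
}


def unexplained_gap(year):
    """Return the part of the total gap not accounted for by the listed models."""
    total_gap = totals[year]["encyclopedia"] - totals[year]["standard_catalog"]
    model_gap = sum(enc - std for std, enc in models[year].values())
    return total_gap - model_gap


for year in sorted(totals):
    print(year, unexplained_gap(year))
```

Both years come out to zero, which means the model-level differences MCT identified explain the entire gap between the two books. It does not tell us which book is right.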

This raises the question: If you are an auto history writer, what do you do when you find numbers that don’t look right?

Standard Catalog less reliable than Consumer Guide

I fairly often find similar discrepancies when I dig into the data published in the Standard Catalogs and various editions of the Encyclopedia. I don’t say that to wag a finger — mistakes are inevitable when you are working with the enormous amount of information in their books.

John Gunnell's Standard Catalog of American Cars 1946-75

That said, my sense is that the Standard Catalogs tend to have somewhat less rigorous proofing than the Encyclopedias and other publications involving the auto editors of Consumer Guide. Thus, I usually go with the latter numbers when there is a conflict.

But sometimes neither set of numbers looks correct and complete. That’s when I search for data from brand-specific websites or books. For example, Allpar.com offers fairly detailed production data (e.g., Wilson, 2020).

One challenge is that authors can use different cuts of data, such as for the calendar year rather than model year. One must not mix apples and oranges!

Sometimes I will stumble upon a better set of data after an article is posted. When that occurs, I will update graphs and text. That happened when I found more accurate production figures for American Motors.

Also see ‘Getting primary information can take a lot of work’

Of course, I can also add my own errors. Most of the data you see at Indie Auto has been manually inputted into fairly large and complex Excel spreadsheets, so there is the potential for typos, miscategorizations or incorrect calculation formulas. That’s why I will spot check my data when I start working on a new story or am updating an existing one.

For example, I recently found some categorization errors in my Rambler production data while working on an article critiquing Aaron Severson’s take on AMC.

1967 Rambler Rebel SST convertible (Old Car Brochures)

Inaccurate data can distort a major historical debate

Data problems in the major auto reference books have been fairly minor most of the time. However, substantially varying production numbers on the 1954-55 Plymouth have made it harder to assess the relative popularity of the 1953 and 1955 redesigns.

Encyclopedia of American Cars

For example, a Standard Catalog (Gunnell, 2002) listed total model-year production in 1954 / 1955 as 433,000 / 672,100 whereas Wikipedia (2023) stated 463,148 / 705,455.

Consumer Guide publications have ranged from 520,385 / 705,455 in Over 100 Years: The American Auto (2010) to 463,148 / 401,075 in the model-year production totals included in the 1993 edition of the Encyclopedia of American Cars.

When adding up production broken out for individual models, the 1993 edition of the Encyclopedia had a slightly higher total for 1954 than the 2006 edition: 463,148 versus 462,698. For 1955, both editions tallied 704,445 units.

What made the most sense to me was to add up the Standard Catalog’s production data on individual models; these totaled 463,148 / 704,464 for 1954 / 1955. So is this right? I hope so, but I also wish that I had access to primary data sources to verify these numbers.
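
To see where the disagreement is concentrated, it can help to lay the competing figures side by side. The sketch below is only an illustration under my own assumptions: the dictionary labels are informal shorthand for the sources cited above, and the function name is hypothetical. It reports the lowest and highest figure for each year and how many sources agree on a single number.

```python
from collections import Counter

# (1954, 1955) model-year production figures as cited in this article.
# The labels are informal shorthand of my own, not official titles.
figures = {
    "Standard Catalog, listed total": (433_000, 672_100),
    "Wikipedia (2023)": (463_148, 705_455),
    "Over 100 Years (2010)": (520_385, 705_455),
    "Encyclopedia 1993, listed total": (463_148, 401_075),
    "Encyclopedia 1993, sum of models": (463_148, 704_445),
    "Encyclopedia 2006, sum of models": (462_698, 704_445),
    "Standard Catalog, sum of models": (463_148, 704_464),
}


def summarize(column):
    """Return (lowest, highest, agreement) for column 0 (1954) or 1 (1955).

    'agreement' is the largest number of sources sharing one exact figure.
    """
    values = [pair[column] for pair in figures.values()]
    agreement = max(Counter(values).values())
    return min(values), max(values), agreement


for label, column in (("1954", 0), ("1955", 1)):
    low, high, agreement = summarize(column)
    print(f"{label}: low={low:,} high={high:,} agreeing sources={agreement}")
```

On these figures, four of the seven 1954 numbers agree on 463,148, while no 1955 number appears more than twice. Agreement among secondary sources is not proof of accuracy, since they may share a common origin, but it does show which year most needs checking against primary data.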

1954 Plymouth (Old Car Advertisements)

What if someone created an electronic data bank?

One of my big research goals is to get closer to working with primary data. As a case in point, I have thought about holing up in a library and making copies of Automotive News production and sales data going back to 1946. I have thus far not gotten around to it because of the enormous time commitment.

Sometimes I wish that the folks who produce the Standard Catalogs and Encyclopedias would offer electronic editions. In theory, this would allow data errors to be fixed as they are recognized. We data nerds could be quite helpful in that process. However, I’ve never gotten the sense that these publications, which are produced by for-profit entities, see this as a priority. That makes some sense; most of their readers may have only passing interest in the data.

Also see ‘Wheel spinning happens when car buffs and scholarly historians don’t collaborate’

I suspect that the best scenario would be for a nonprofit automotive history group to create a data bank. Indie Auto is hardly rich, but I consider this information important enough to my work that I would pay a reasonable fee for access to it. The big caveats are that it would need to be in an electronic format, reasonably accurate and comprehensive (e.g., including production/sales figures for both domestic automakers and imports).

Who might do this? The Antique Automobile Club of America recently built a library and research center. Other, volunteer-run groups could also take on this project, but they might struggle to build and maintain a reasonably accurate and complete data bank. This job strikes me as more the province of paid staff with at least some quantitative research skills.

Man fixing antique car at LeMay annual car show

Meanwhile, back in reality

That’s the dream, but in the meantime I will continue to incrementally improve the quality and comprehensiveness of my own databases. For example, a while back I wrote a “Data Dive” story on compact cars in order to build out my spreadsheet on that type of car.

My primary goal today is to give you, dear reader, a better understanding of where my numbers come from. Know that I appreciate it when you point out data in Indie Auto that doesn’t look right. One of my biggest goals is to provide a source of credible information, even if one disagrees with my analysis.

NOTES:

This story was originally posted on April 16, 2021 and expanded on April 12, 2023.



RE:SOURCES

Over 100 years: The American Auto

ADVERTISEMENTS & BROCHURES:

6 Comments

  1. The first example looks like one or the other is a misprint. When I saw the headline, I was thinking the discrepancy was due to model year/calendar year/corporate fiscal year differences, but that doesn’t seem to be the case. Since the discrepancies seem to occur with a particular model, my guess is human error somewhere in tabulating.

  2. A worthy goal. I fear, though, that the deeper you dive into the figures the crazier they’ll make you.

    I used to keep track of the features and specs for all American-market cars. In recent years this became increasingly difficult as manufacturers put out increasingly conflicting and incomplete data. If concrete things like standard features and dimensions cannot be accurately reported, good luck with sales and production totals, which exist only in various records and cannot be separately verified. Data gets fudged. If two records disagree, how can it be determined well after the fact which is correct?

    On the demand side, at most a few percent of the population really cares about such things. Even most of the people who think they care just want to FEEL like they care and like they have a source they can rely on. They don’t like it when doubt is cast upon their sources. They’d rather just continue thinking there’s no need to do any digging.

    • Michael, you make good points. I spent much of my career in research-oriented gigs so get that there can be diminishing returns in trying to secure anywhere close to perfect data. I do think it would be helpful if there were an electronic repository of reasonably trustworthy production data so each auto history researcher didn’t have to start from scratch, particularly given the potential for adding error when inputting vast amounts of data from what are already secondary sources (such as the Standard Catalogs).

      So this conversation is really about how we historians, professional and armchair, can better organize ourselves to advance the field. Yes, it’s a limited number of people. And historically we have tended to be pretty individualistic, so I don’t have high hopes that something will come of this discussion. However, I think it is important to at least bring it up from time to time. We can accomplish more when we collaborate rather than compete.

      Thank you for the invite. I’ll email you when I get a moment, which may take a few days.

  3. I like the idea of a collaboration among historians.

    One thing that does puzzle me about many of the automotive history books I’ve read is the clear lack of thorough review, for both typos and errors. Did anyone other than the author actually read the work before publication?

    • I gather that they have quite small shops. What’s unfortunate is that there are at least a few of us out there who could help out. My swim lane is mostly production data; I wouldn’t proof a manuscript for free if it is a for-profit venture, but surely the big books have a freelance budget.

      Or maybe they just don’t care? Thus, the need for a nonprofit auto history group to step up to the plate.
