For those operating on a fiscal year that ends in June, the finish line is in sight. We are cleaning up our records, gathering our data, and readying our reports. It is Statistics Season. Every year I hear the same thing from someone: ‘statistics lie.’
For years I taught statistics. And, yes, it is true, you can lie with statistics. However, you can only lie with statistics to those who don’t know anything about statistics. For that matter, one can lie about anything to someone who knows nothing about the subject. I could tell you the world is flat. If you knew nothing, you might believe me, and I would have told a successful lie. For years, those businesses that saw libraries as competition have been saying libraries are dead. It’s a lie, but those who know nothing about libraries believe it. Lies with statistics are often intentional and come about when the presenter fails to include all of the information. This is why one can only lie with statistics to those who don’t know statistics. When looking at numerical information, keep two basic rules in mind:
1. Always know the total real number. Most stats use percentages. This is a convenient tool that makes comparisons easy. It allows whatever one is looking at to be viewed in terms of 100. 50% is half. 33% is a third. But a half or a third of what amount? Think about this in terms of cash. I offer you a half of a dollar, you might yawn. I offer you a half of 10 million dollars – that would get your attention. Without some indication of how many items/cases were included, a percent is vague at best.
It is also an easy way to lie. For example, say I ask 4 people whether they like the library’s new pet snake. If three of them say yes and one says no, I can honestly report that 75% like our new pet snake. But if I do not tell you I’ve only asked 4 people, is my assertion that most people like our pet a lie? Many would say yes. This lie is easily uncovered by asking how many people were asked. When the actual total numbers are not offered, I’m skeptical.
2. Always know who or what the numbers are coming from. In libraries, hard numbers can be difficult to come by, and there will always be some question of how far to trust the way data is gathered. Most library surveys are taken by people in the library, a slightly biased group, often with no guarantee that one devoted person has not stacked the deck. Circulation numbers are generally gleaned from our ILS, and we are at the mercy of our software and of items being scanned properly. There will always be concerns, but generally they are ones we accept, along with a margin of error.
Still, going back to our pet snake, suppose that instead of asking 4 people, I asked 12. This time, 3 of them said they did not like our pet, but 9 said they did. Again, my report says 75% like our pet. But what if those 9 people were all from the local herpetology club? What if they were all snake owners, surveyed at the local pet store? How one gets the data is just as important as the numbers themselves. Who answered the survey and where they were asked should always be known.
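For anyone who keeps survey tallies in a spreadsheet or a script, both rules can be sketched in a few lines of Python. This is only an illustration: the function names `describe_share` and `counts_by_source` and the responses themselves are invented for the example, not taken from any real survey.

```python
from collections import defaultdict

def describe_share(yes_count, total):
    """Rule 1: report a percentage together with the counts behind it."""
    if total <= 0:
        raise ValueError("total must be a positive number of respondents")
    return f"{100 * yes_count / total:.0f}% ({yes_count} of {total})"

def counts_by_source(responses):
    """Rule 2: tally (yes, total) separately for each place people were asked."""
    tally = defaultdict(lambda: [0, 0])
    for source, likes_snake in responses:
        tally[source][0] += int(likes_snake)
        tally[source][1] += 1
    return {source: tuple(pair) for source, pair in tally.items()}

# Invented responses: (where the person was asked, do they like the snake?)
responses = (
    [("herpetology club", True)] * 6
    + [("library front desk", True)] * 3
    + [("library front desk", False)] * 3
)

yes_total = sum(likes for _, likes in responses)
print("Everyone:", describe_share(yes_total, len(responses)))
for source, (yes, total) in counts_by_source(responses).items():
    print(f"{source}:", describe_share(yes, total))
```

With these made-up numbers the headline is 75% (9 of 12), but half of the respondents came from the herpetology club, where 6 of 6 said yes. At the front desk it was only 3 of 6.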
There are certainly other elements to be aware of, but these two can take one a long way. Armed with this information, one is very difficult to lie to; a presenter who offers it up front is far less likely to be accused of lying. When this information is not shared, it should raise a red flag. When presenting statistics, I always make certain to have the total ‘real’ numbers on hand.