This may not be the column you’re looking for. One of the great advantages this platform has given me is the opportunity to sound like I know what I’m talking about. Working with new discoveries gives you the opportunity to proclaim yourself the first “expert” in any discipline. While it’s always been my goal to bring out the practical applications in emerging technology, I will admit to getting a thrill from venturing into uncharted territory.
But chasing the bleeding edge has its own pitfalls. Eventually something’s going to come along that leaves you lost in the weeds, which is exactly where I am in the search for new library data metrics.
No matter the industry, there’s a lot of opportunity to be gained if you can sort through large volumes of information and create meaningful conclusions. For me, this fixation was brought on by reading Moneyball: The Art of Winning an Unfair Game (W. W. Norton, 2003) almost a decade ago. Michael Lewis’ book exposed the way data analysis became an essential tool for quantifying the unquantifiable. By mixing sabermetric (taken from SABR, the Society for American Baseball Research) principles with traditional qualitative I-know-a-good-player-when-I-sees-it scouting methods, traditionally under-budgeted baseball teams could identify talented players that other teams had overlooked. The book completely changed the way I watch the sport.
Now, sabermetric principles are everywhere. Every team in the major leagues has a department devoted to crunching numbers. Governments at all levels are opening their datasets to the public, encouraging the creation of new data tools and creative approaches to municipal problems.1 And a certain Baseball Prospectus analyst named Nate Silver applied sabermetric methods to the political sphere, making himself a household name (and helping us circulate a few books in the process).
Bearing all this in mind, I have to ask: are biblio-sabermetrics possible? Based on my conversations with other librarians, I’m not alone in asking this question. (And, yes, I recognize the irony of using anecdotal evidence to support this claim.) Conversations at conferences, the library coffeemaker, and on the web have all come to a similar conclusion: There’s something out there. We’ve just got to figure out how to harness it.
What’s Missing Now
One of the more spirited conversations I had occurred on Facebook, where I stumbled upon a thread started by Emily Clasper, system operations and training manager for the fifty-four libraries in the Suffolk Cooperative Library System on Long Island, New York. In a recent conversation over email, I asked her to elaborate on the topic.2
Public Libraries: Why do you see a need for new metrics?
Emily Clasper: Old methods for measuring library success just aren’t cutting it. Many library administrators are used to regularly generating a set of circulation stats and calling it a day. All of a sudden, the old circ transaction numbers are not enough to tell the library’s story—not by a long shot. Services are evolving, and they’re having a hard time showing their success to those who control the purse strings, which includes the general public. So I’m getting panicked phone calls from library staff members who intuitively know that their library is successful in serving their community, who see firsthand the growth they’re experiencing, and are frustrated because the numbers they’ve relied on for so long just don’t show that.
PL: What’s missing from our current methods of data-gathering?
EC: Lots. At least with the libraries I work with, many have no methods in place for gathering quantitative data about the non-circulation–related services they provide, even though these are the services seeing the most growth and investment. We need ways to gather meaningful statistics regarding library programming, online services, user engagement, and facilities use, just to name a few.
PL: Do you think extra training or expertise is necessary?
EC: Yes, a resounding yes. It’s really a field of expertise all to itself. But once you have the numbers in your hand, you have to know how to interpret them and what to do with them. This is, in my experience, something that is sorely lacking. I feel a lot of frustration with this, as it’s a complex topic and I’m not an analyst myself. I think we need to work on drawing from outside expertise a bit more, and it’s probably an area where library administrators need to get a bit of education so that even if they can’t do the in-depth analysis themselves, they can at least have an idea of the right questions to ask.
PL: What questions would you like to see answered?
EC: Right now, it’s mostly a matter of return on investment (ROI) for our libraries. They have some serious budgetary concerns to address, so it’s very important to be sure that the things they spend money on are effectively adding value and serving the needs of the community. Also, I think we need to ask many, many more questions about how well we really know our communities and their needs. I think that we make an awful lot of assumptions about what will be valuable to our communities from the comfort of our offices and based primarily on the feedback of the users who come to us, which is a self-selecting group.
PL: Do you see our commitment to patron privacy as a barrier to better data-gathering?
EC: Me, no. We have a duty to protect the interests of our patrons and honor their privacy, which means we have an obligation to find ways to do this without compromising that position. But that doesn’t mean it can’t be done—we just have to be conscientious about it.
How Do We Do This?
Clasper’s responses above represent many of the common themes I’ve encountered talking with other librarians. Yes, we need better ways of gathering and analyzing our data. How we do this is another matter entirely. In some cases, this means diving into our integrated library systems, creating new reports, and building application programming interfaces (APIs) to identify patterns in patron data. It might mean developing new customer management tools, provided we can get our patrons to opt in to sharing their information with us. And it might mean recruiting outside of our industry: bringing in people from the statistics or computer science world who can give us a better sense of what we’re dealing with.
As Clasper says, there’s also the matter of what questions we should be asking in the first place. As we’ve learned from many a reference interview, knowing what to ask is more than half the battle. I’ve started a list of some of the things I’d like to examine. Here are just a few:
- Device use: In the age of multiple screens3 and BYOD (bring your own device), the raw counts of public PC use are only telling us part of the story. We could use patterns in our Wi-Fi traffic to identify much more about everything from which devices people prefer to when we can anticipate a spike in broadband usage.
- Third-spacers: The BYOD crowd also introduces the “Starbucks effect” in our buildings, where patrons camp out at open tables or study rooms for long periods of time. We’ve always talked about the library as a third space—why aren’t we supporting this argument with better data?
- Deep circulation: To borrow another concept from baseball, many of the new data measures are created by either combining two metrics or taking an existing statistic and filtering out the less useful data.4 One possibility for looking at circs could be to measure the time between checkouts. If a book starts to exhibit smaller gaps between one checkout and the next, it might be a sign that it’s gaining in popularity.
- User mapping: Geographic information systems (GIS) software makes it possible to plot anonymized patron data over a map of a library’s service area. My library has used this in its strategic planning process, and it has provided a great deal of useful information about our geographic barriers to library service. As the tools become more robust, the opportunities become even greater, allowing us to identify patterns in everything from checkouts to program attendance to computer use.
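To make the "deep circulation" idea a little more concrete, here is a minimal Python sketch of how the gap between checkouts might be measured. Everything in it is an assumption on my part: the function names `checkout_gaps` and `gap_trend` are hypothetical, the checkout dates are made up, and I'm measuring from one checkout date to the next rather than from return to next checkout. It uses only the item's checkout dates, so no patron information is involved.

```python
from datetime import date
from statistics import mean


def checkout_gaps(checkout_dates):
    """Return the number of days between each consecutive pair of checkouts."""
    ordered = sorted(checkout_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def gap_trend(checkout_dates, recent=3):
    """Compare the average of the most recent gaps to the average of earlier gaps.

    Returns recent_average / earlier_average. A value below 1.0 suggests the
    gaps are shrinking (the title may be gaining popularity). Returns None
    when there is too little history to compare.
    """
    gaps = checkout_gaps(checkout_dates)
    if len(gaps) <= recent:
        return None  # need at least one gap outside the "recent" window
    earlier, latest = gaps[:-recent], gaps[-recent:]
    if mean(earlier) == 0:
        return None  # avoid dividing by zero for same-day checkouts
    return mean(latest) / mean(earlier)


# Hypothetical checkout history for a single title (illustrative data only).
sample = [
    date(2013, 1, 1), date(2013, 2, 10), date(2013, 3, 17), date(2013, 4, 16),
    date(2013, 4, 30), date(2013, 5, 10), date(2013, 5, 16),
]
print(checkout_gaps(sample))
print(gap_trend(sample))
```

A ratio well under 1.0 would flag a title worth a second look, though where to draw that line (and how many recent gaps to consider) is exactly the kind of question that needs trial and error with real circulation data.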
It’s going to take a healthy mix of analytic thinking, mathematical ability, and creativity to crack this particular nut. In an essay from the 2013 Baseball Prospectus annual, Russell Carleton calls for more idiosyncratic approaches to data analysis.5 We’re essentially looking for the inverse (more analytic rigor to go with the creativity we already have), but I think the same principle can easily be applied to our own profession. We’ve got plenty of right-brainers. We just need to develop some experimental tools to test the ideas.
So this is my proposal to you, public librarians. While data analysis has never been our strong suit, we’ve never been ones to shy away from a challenge. It’s going to take a lot of trial and error, but eventually something’s going to stick. It’s time we become our own experts.
REFERENCES AND NOTES
1. I touched on this in a 2012 Wired Library column. If you’d like more detail, see Toby Greenwalt, “2012: The Year Code Broke,” Public Libraries 51, no. 4 (July/Aug. 2012), accessed Mar. 1, 2013.
2. Emily Clasper, email interview with the author, Feb. 27, 2013.
3. Mary Madden, “Four or More: The New Demographic,” program presented at ALA Annual Conference, Washington, D.C., June 27, 2010, accessed Mar. 9, 2013.
4. One good example of this is BABIP, or Batting Average on Balls in Play. This measure filters strikeouts and home runs out of the batting average calculation: BABIP = (H - HR) / (AB - K - HR + SF), where H is hits, HR is home runs, AB is at-bats, K is strikeouts, and SF is sacrifice flies. It assesses how frequently a hitter gets a hit once they’ve actually put the ball in play. The same stat can also be applied to pitchers, and reflects on their ability to get a batter to ground or fly out. But I digress.
5. Russell Carleton, “Sabermetrician Wanted, Must Have MFA,” Baseball Prospectus 2013 edition, King Kaufman and Cecilia Tan, eds. (Hoboken, N.J.: John Wiley & Sons, 2013), 533-36.