Libraries & Leaky Data: Part 1

By Aaron Skog

The ILA Best Practices Committee has recently been tasked with studying the issues of patron privacy around the use of printed hold wrappers in public areas. It is good to see a focus on the most obvious aspects of protecting patrons’ privacy, since having a patron’s full name stuck on a book in a public area is an outright problem when you think about it. If we attempt to square this practice with the widespread acceptance that a patron’s reading habits and checkout history must be protected from prying eyes (such as government agencies or various Freedom of Information requests), we see the difficult balance between providing convenience and adhering to privacy. There are, however, other areas within library services where the patron data being “leaked” is not as easy to see as a hold wrapper printed with a patron name. These sources of data leaks can be found within the software ecosystem commonly used throughout libraries.

What are these potential sources of leaked library data? Below are some of the more widely used pieces of library technology that potentially hold your library patron data or require access to it at some point in their operation.

  • Integrated Library System
  • Discovery OPAC
  • Self-checks
  • Computer Reservation
  • Print Stations
  • Automated Material Handlers (AMH)

All of these software systems, either by design or through their back-end structure, may collect patron information within their databases or software logging processes. These systems can run for years, quietly collecting data, as they sit somewhat inconspicuously on the library network.

The worst culprit within the library software ecosystem for leaking patron information into your library network is the Standard Interchange Protocol, otherwise known as SIP2. The widespread use of SIP2 was due to our need for standardization of data exchanges between library software systems. This led somewhat innocently to the SIP2 protocol being used far and wide in library technology. Nearly every software vendor that wants to sell a software service to a library will use or work with SIP2. Any library software service that queries the ILS can do so through SIP2, so the adoption of SIP2 on library networks is near universal.

How bad is SIP2 in terms of data security? Pretty bad in terms of how it is typically deployed “out of the box” within library networks. Here are the 10 patron fields of information shared within a single SIP2 patron authentication query.

  1. User’s barcode
  2. User’s PIN/password
  3. User’s full name
  4. Address
  5. Email address
  6. Phone number
  7. Birthdate
  8. Gender
  9. Age category
  10. Fines owed

A single query to see if a patron can gain access to a library computer or to a service will send all 10 fields of patron data across the network, even when the service only needs to verify that the patron is in “good standing.” It doesn’t matter if the service only needs to see one of the fields: SIP2 sends all 10 fields of data in response to a query.
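To make the contrast concrete, here is a minimal sketch of data minimization: a filter that hands a requesting service only the fields it actually needs. The record layout, the function names, and the fine threshold used to decide “good standing” are all assumptions for illustration; this is not how SIP2 or any particular ILS behaves.

```python
# Hypothetical names for the 10 patron fields listed above.
PATRON_FIELDS = (
    "barcode", "pin", "full_name", "address", "email",
    "phone", "birthdate", "gender", "age_category", "fines_owed",
)


def minimize(record, needed):
    """Return only the requested fields from a patron record."""
    unknown = set(needed) - set(PATRON_FIELDS)
    if unknown:
        raise ValueError(f"unknown patron fields: {sorted(unknown)}")
    return {name: record[name] for name in needed if name in record}


def in_good_standing(record, fine_limit=10.00):
    """Assumed rule: a patron is in good standing if fines are under a limit."""
    return record["fines_owed"] < fine_limit


if __name__ == "__main__":
    # Entirely made-up patron record.
    patron = {
        "barcode": "21234000012345", "pin": "0000",
        "full_name": "Pat Example", "address": "1 Main St",
        "email": "pat@example.org", "phone": "555-0100",
        "birthdate": "1980-01-01", "gender": "X",
        "age_category": "adult", "fines_owed": 2.50,
    }
    # A computer reservation station only needs a yes/no answer.
    print({"barcode": patron["barcode"], "ok": in_good_standing(patron)})
    # A print station might need the barcode and balance, nothing more.
    print(minimize(patron, ["barcode", "fines_owed"]))
```

The point of the sketch is the shape of the response: one or two fields leave the ILS instead of ten.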

With the widespread use of the SIP2 protocol within library networks and the preponderance of systems within the library, such as multiple self-check stations or print stations, all of which likely use SIP2 to talk to the ILS, you have a lot of patron data being sent around the library network. Making this problem worse, all of these data fields are sent in plain text, including the patron’s PIN/password. Many systems’ software logging processes will save every SIP2 transaction into a file that can easily hold hundreds of patrons’ passwords and potentially thousands of transactions showing a patron checking out an item. These computer stations typically use local logging or small-scale databases to provide libraries with usage statistics for the individual stations. Unless active measures are taken to purge logs and remove collected data, libraries have patron data stored throughout library desktops and servers, beyond the typically more secure ILS.
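One of the active measures mentioned above can be sketched as a small log scrubber. This is only an illustration: it assumes log lines contain pipe-delimited SIP2 fields, each starting with a two-character code, and that the code AD carries the patron password. Check both assumptions against your vendor’s SIP2 documentation and your actual log format. The sample log line is invented.

```python
# Field codes whose values should never be written to a log.
# Assumption: "AD" is the patron password field in your SIP2 traffic.
SENSITIVE_CODES = frozenset({"AD"})
PLACEHOLDER = "[REDACTED]"


def redact_line(line, codes=SENSITIVE_CODES):
    """Replace the value of each sensitive pipe-delimited field."""
    cleaned = []
    for segment in line.split("|"):
        code, value = segment[:2], segment[2:]
        if code in codes and value:
            cleaned.append(code + PLACEHOLDER)
        else:
            cleaned.append(segment)
    return "|".join(cleaned)


if __name__ == "__main__":
    # Invented log line for illustration only.
    sample = "2019-06-01 12:00:00 request|AA21234000012345|ADhunter2|"
    print(redact_line(sample))
```

Note the limitation: the sketch only catches a field that follows a pipe character, so a sensitive field placed directly after a fixed-length header would be missed. Scrubbing is also a fallback; not logging the field in the first place is better.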

It is usually at this stage of describing the problem that some questions arise about the severity of the issue. Some folks will minimize the likelihood of this data getting hacked or stolen from the library network. Or they will take solace in the library being a small, unworthy target for any malicious intent. While it is true we have been largely helped by the fact that we are a small, perhaps less juicy, target for a data hack, network data attacks have now reached a more ruthless level. Ransomware attacks simply do not care who their target is; they run through an automatically scripted series of software exploits to hijack any computer or server and steal or encrypt its local data for ransom. This has occurred at the National Health Service in the United Kingdom, on government networks in dozens of countries, and more recently on Baltimore City’s servers. It has even happened to public libraries.

We can no longer sit idly by and wait for the data to be stolen under this scenario and for the ensuing PR and financial liability nightmare to befall us. If this were to happen to a library, wouldn’t it be better to know that the only source of patron data available was at a single point on your network rather than dozens? Or to know precisely where this patron data resides, so that better efforts could be made to protect the data on that device? Over a series of blog posts, I will outline the steps to take to help libraries understand the network protocols, ILS configuration strategies, network design, storage, and logging that should be considered when undertaking an overall audit of your library’s leaky data problem.

The Great Recession Impact on Illinois Public Libraries: Part 2

By Aaron Skog

US public library usage during the 12-year window around the Great Recession shows a dramatic increase in the 2009-2011 years. Illinois public library usage in the areas of visits, circulation, and program attendance also shows evidence of the Great Recession, with the impact seen primarily in 2009-2011. The totals for the Illinois public library metrics for library visits, circulation, and program attendance are below. However, a closer look at the data shows that not all Illinois public libraries appear to have been impacted equally by the Great Recession.

Illinois public library visits peaked in 2010 and fell over the following seven years; by 2017 they were below the 2006 library visits total.

Illinois public library circulation totals also show 2010 as the highest year for this metric, matching the peak year for library visits.

Library program attendance has largely defied the Great Recession impact, showing a modest increase over the 2008-2017 period. While this metric is a sign of success, it is worth pointing out that program attendance represents 0.4% of the total public library visits.

Chicago Public Library visits during the 12 years of 2006-2017 reflect the national trend of library visits peaking in 2009 and then falling each year after the Great Recession.

For the libraries serving large populations over 100,000 other than Chicago Public Library, the visits for the 2006-2017 period reflect the Great Recession trends, but not as clearly. The use of a graph also highlights some data anomalies, such as the Naperville Public Library visits in 2016 (see the top blue line which falls in 2016). Schaumburg Public Library also has a steep decline for the 2017 year, but this is the first year since 2008 in which visits data is not rounded to a broad number, which may reflect some change in how visits are recorded at the library.

For the libraries serving a population between 60,000 and 100,000, the visits do not reflect the Great Recession trends overall. There again are some indications of data anomalies, such as the Arlington Heights 2016 library visits reported to IPLAR, but overall the trend for this group of libraries shows visits are holding steady compared to the other public libraries within this population group.

For the libraries serving a population between 50,000 and 60,000, there is a strong indication of the Great Recession in the 2011 year for the Oak Park Public Library. The visits for Oak Park Public Library are strong for this population group, showing this library is consistently “punching above its weight.”

There are 583 Illinois public libraries under the 50,000-population threshold. Their combined library visits reflect a 2011 peak in library visits, two years after the US public library national trend.

In conclusion, the IMLS data for Illinois during the 12-year range of 2006-2017 shows the Great Recession impact for the Chicago Public Library, Oak Park Public Library, and those 583 combined libraries under the 50,000-service population. Those libraries above 50,000 population served (excluding the Chicago Public Library) do not show a Great Recession impact as dramatically.

The Great Recession Impact on US Public Libraries

By Aaron Skog

With talk of a potential new economic recession, it seems appropriate to assess how the last recession, the Great Recession, impacted public libraries in the US. The Institute of Museum and Library Services (IMLS) makes its public library survey data available online, and if we examine this data on a state-by-state basis, or as a whole for the United States, we can see the dramatic impact the 2007-2009 Great Recession had on public libraries.

It is important to understand the scope and size of the impact of the Great Recession. If we only look at data for the past 5 years, we would see an alarming downward trend. (Please note: IMLS public library survey data is only available through 2017.)

If your library leadership saw this sort of trendline, alarm bells would be sounding about the precipitous decline in library visits. However, a much wider view of the data is needed in order to see what is truly happening.

What we are seeing is that the impact of the Great Recession still reverberates throughout US public libraries nine years after the fact. US public library usage peaked in 2009, the year the Great Recession technically ended. Overall, US public library visits are now below the 2006 usage levels.

Using the 8 states in the US Great Lakes region as a comparison, we can see that Illinois public libraries experienced a steep rise in library visits, peaking in 2011 at 83,234,090. Since that year, Illinois has seen public library visits falling steadily.

Illinois public library circulation metrics also reflect this rise and fall, peaking in 2010 at 121,828,806 and falling 12% over the next 7 years. Illinois circulation numbers are holding steady for 2016-2017.
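As a rough check on the scale of that decline, the arithmetic can be sketched as follows. The peak figure and the 12% drop come from the paragraph above; the implied 2017 total is only an approximation derived from the rounded percentage, not a number taken from the survey data, and the function names are just for illustration.

```python
def pct_change(old, new):
    """Percentage change from old to new (negative means a decline)."""
    return (new - old) * 100.0 / old


def value_after_decline(peak, decline_pct):
    """Value remaining after falling decline_pct percent from a peak."""
    return peak * (1 - decline_pct / 100.0)


PEAK_2010_CIRCULATION = 121_828_806  # from the text above
REPORTED_DECLINE_PCT = 12            # rounded figure from the text above

# Approximate only, because the 12% figure is rounded.
implied_2017 = value_after_decline(PEAK_2010_CIRCULATION, REPORTED_DECLINE_PCT)

if __name__ == "__main__":
    print(f"Implied 2017 circulation: about {implied_2017:,.0f}")
```

This works out to roughly 107.2 million items, a drop of about 14.6 million from the peak. The same `pct_change` helper can be pointed at any two years of a library’s own IPLAR figures to compare against the peak rather than only against the prior year.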

There are some bright spots in the data. Public library program attendance has been steadily rising, and the public library survey data shows none of the decline seen in circulation and library visits. It is worth noting, however, that Illinois public library program attendance represents only 0.4% of the total Illinois public library visits.

What can library leadership and staff do with this data?

  • If library trustees are overly focused on the year-to-year circulation and library visit counts, they should be made aware of the overall impact of the 2007-2009 Great Recession. Context is key!
  • Regions matter—if your public library is seeing these downward trends, chances are that your neighboring libraries in the region are seeing the same trends.

The Illinois Public Library Annual Survey (IPLAR) data is provided to the IMLS, so your hard work each year helps with research and analysis. You can look forward to my next post on how Illinois public libraries individually are trending in these same metrics.