Libraries & Leaky Data: Part 3

By Aaron Skog

Part 1 of this series provided an overview of how library user data ends up in a variety of places within your library other than just the ILS. Part 2 explained how your library services communicate over the network or across the internet in a variety of insecure ways. Part 3 covers the steps you can take to secure your library data.

Here are recommendations on the best approaches to protecting library data. These are standard practices within the IT industry and there are many technical resources available on how to accomplish these steps.

Use Firewalls, AKA The Internet is Made of Ports

A library firewall works by using the well-known network communication standards that software and services rely on. Each service communicates on a numbered port. Knowing which ports your services use lets you configure the firewall to block a computer outside your network from reaching something inside it. Many firewall appliances also include security features to isolate threats, such as blocking a user who connects to your library WiFi and initiates a port scan from their laptop. That user may not even know their laptop is infected with malware that is constantly looking for ways to spread to other devices.
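To make the idea of port-based rules concrete, here is a minimal Python sketch of how a firewall might evaluate them. It is not the configuration of any real appliance: the zone names, the rule list, and the use of port 6001 for a SIP2 service are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One hypothetical firewall rule for traffic from a zone to a port."""
    source: str  # made-up zone name, e.g. "outside", "staff", "public_wifi"
    port: int    # destination TCP port
    allow: bool  # True = let the traffic through, False = block it


# Illustrative rule set only; every entry here is an assumption.
RULES = [
    Rule("outside", 443, True),    # HTTPS to the public catalog
    Rule("outside", 6001, False),  # assumed SIP2 port, never open to the internet
    Rule("staff", 6001, True),     # staff network may reach the assumed SIP2 port
]


def is_allowed(source: str, port: int, rules=RULES) -> bool:
    """Return the decision of the first matching rule; deny if none matches."""
    for rule in rules:
        if rule.source == source and rule.port == port:
            return rule.allow
    return False  # default deny: anything not explicitly allowed is blocked
```

The design choice worth noticing is the last line: traffic that matches no rule is blocked, so a forgotten service stays closed rather than open.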

Segment Your Network

The easiest way to envision network segmentation is to imagine every staff computer’s network cable in your library going to a dedicated network switch that does not connect to the public computers. Each group of computers is segmented from the other, and the groups will not “see” each other on the network. It is possible to do this at the physical level, and it is fairly easy to pull off without technical know-how. This basic approach can get expensive, however, as you are duplicating switches and other equipment throughout your building. This is where virtual LANs (VLANs) can help, because they create the same separation through configuration while sharing the same physical switches.

All libraries should utilize virtual LAN segmentation within their local area network (LAN) as a basic rule for network security. This does take time and careful planning as every network layout is different. Below is a chart showing how one might group devices on your library network.

Virtual LAN Segment | Examples of Groups of Computers, Devices | Reason to Group Together
VLAN1 | Staff computers, staff WiFi | User data present and communicating across the network
VLAN2 | Self-check stations, print release stations, computer reservation stations | User data authenticating, possibly logging on machines
VLAN3 | Public computers | 100% restricted from accessing VLAN1, limited VLAN2 access
VLAN4 | Public WiFi | 100% restricted from VLAN1, VLAN2 (depending), VLAN3
VLAN5 | Servers on the network are segmented on their own and can only communicate to VLANs 1-4 on the specific ports | Most restricted access to these computer servers
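The chart above can also be written down as a small access matrix, which is a handy way to review a segmentation plan before touching any switch. The Python sketch below is only a model of the chart, not switch configuration. The three access levels, the choice to block public WiFi from VLAN2 (the chart says "depending"), the two-way "limited" access for servers, and the default of blocking any pair the chart does not mention are all assumptions.

```python
# Access levels used in this sketch (assumed vocabulary):
#   "none"    = no traffic permitted
#   "limited" = only specific ports permitted
#   "full"    = unrestricted (only within the same VLAN here)
POLICY = {
    ("VLAN3", "VLAN1"): "none",     # public computers never reach staff
    ("VLAN3", "VLAN2"): "limited",  # public computers get limited self-service access
    ("VLAN4", "VLAN1"): "none",     # public WiFi never reaches staff
    ("VLAN4", "VLAN2"): "none",     # "depending" in the chart; blocked here by assumption
    ("VLAN4", "VLAN3"): "none",     # public WiFi never reaches public computers
}

# Servers (VLAN5) talk to VLANs 1-4 on specific ports only.
# Treating this as two-way is an assumption of the sketch.
for vlan in ("VLAN1", "VLAN2", "VLAN3", "VLAN4"):
    POLICY[("VLAN5", vlan)] = "limited"
    POLICY[(vlan, "VLAN5")] = "limited"


def access_level(src: str, dst: str) -> str:
    """Return the access level from src to dst; unlisted pairs are blocked."""
    if src == dst:
        return "full"
    return POLICY.get((src, dst), "none")
```

For example, `access_level("VLAN4", "VLAN1")` returns `"none"`, matching the chart's rule that public WiFi is fully restricted from the staff segment.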

Establish a Virtual Private Network (VPN)

For secure data transport, library consortia members should use a VPN if their staff ILS client does not natively communicate securely with the ILS server. This is especially important for Symphony WorkFlows users and Millennium/Sierra users.

Standalone libraries should also consider a VPN if their ILS is hosted. Older ILS platforms do not use secure transport (notably SirsiDynix Symphony and Innovative’s Millennium/Sierra).

Move Away from Standard Interchange Protocol (SIP2)

As noted in Part 1 and Part 2 of this blog post series, SIP2 is natively insecure and a poor way to connect your library to other 3rd party services. Unless your library utilizes a VPN between you and your hosted service, SIP2 is simply communicating through the internet in plain text, leaking patron data such as addresses, birth dates, and passwords.

ILS vendors that many libraries use, such as SirsiDynix or Innovative Interfaces, have ways of transporting your library user data other than SIP2. You would need to ask whether application programming interfaces (APIs) are available for this purpose.

However, the majority of 3rd party library vendors do not offer alternate ways of connecting to your library ILS other than using SIP2. Make sure to inquire with your vendor representative during your annual renewal if alternate methods have been developed or are under consideration.

Understand Your Self-Service Systems

In Part 1 of this blog post series, I noted that many self-check systems and self-service print release stations will retain user data for the purpose of generating statistical reports for the library. It is important to establish a set procedure for how long this data is retained on these stations. Once your statistical reports are generated, purging the logs or clearing the local database should be treated as routine work by library staff.
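A routine purge like this can be scripted. The Python sketch below finds log files older than a chosen number of days and, only when asked, deletes them. The directory, the `*.log` file pattern, and the retention period are assumptions; real self-check and print release products store their data in different places and formats, so check with the vendor before pointing anything like this at a live station.

```python
import time
from pathlib import Path


def purge_old_logs(log_dir, max_age_days, pattern="*.log", dry_run=True):
    """Return log files older than max_age_days; delete them unless dry_run.

    log_dir and pattern are hypothetical; adjust them to wherever a given
    station actually writes its logs.
    """
    cutoff = time.time() - max_age_days * 86400  # 86400 seconds per day
    stale = []
    for path in sorted(Path(log_dir).glob(pattern)):
        # Compare each file's last-modified time against the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            stale.append(path)
            if not dry_run:
                path.unlink()
    return stale
```

The function defaults to `dry_run=True` so that a first run only reports what would be removed. Generate your statistical reports first, review the list, and then run it again with `dry_run=False`.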

Understand Your Integrated Library System

Some ILS platforms also log user transactions on the server as a separate process from circulation transactions. These logs should also be rotated and purged periodically per your library data retention policy. Symphony is one example: its logs can go back to the first day the system was put into production. Your library ILS administrator can provide additional details on ILS logging, or you can open an inquiry with your ILS vendor.

Take the Library Security Quiz

To assist libraries in assessing their data security, I have created an assessment tool to determine a security score. It will take a library director or management team some time to answer the questions and arrive at the final score.

Question | Answer | Your Library Score
Which is your library ILS staff client? (Keep in mind the staff client is different from the ILS server software) | |
WorkFlows | Score 10 for this insecure staff client |
Sierra | Score 10 for this insecure staff client |
Polaris | Score 0 for this remote desktop client |
Polaris LEAP | Score 0 for this web-based client |
BLUEcloud Staff | Score 0 for this web-based client |
Evergreen | Score 0 for this web-based client |
Koha | Score 0 for this web-based client |
OCLC WorldShare Management System | Score 0 for this web-based client |
Horizon | Score 10 for this insecure staff client |
Voyager | Score 10 for this insecure staff client |
Does your library connect to the following services? | |
OverDrive via SIP2 | Score 10 for this insecure authentication |
OverDrive via SirsiDynix Web Services | Score 0 for this more secure authentication |
OverDrive via III Patron API | Score 0 for this more secure authentication |
OverDrive is authenticating, but our library does not know how | Score 30 for not knowing |
Evanced Solutions via SIP2 | Score 10 for this insecure authentication |
Bibliotheca Cloudlibrary via SIP2 | Score 10 for this insecure authentication |
Bibliotheca Cloudlibrary via SirsiDynix Web Services | Score 0 for this more secure authentication |
User data sent to Unique Management via email for collection purposes | Score 10 for this insecure authentication |
User data sent to Unique Management via SFTP for collection purposes | Score 0 for this more secure authentication |
Hoopla via SIP2 | Score 10 for this insecure authentication |
Hoopla via SirsiDynix Web Services | Score 0 for this more secure authentication |
Hoopla via III Patron API | Score 0 for this more secure authentication |
MyPC via SIP2 | Score 10 for this insecure authentication |
MyPC via III Patron API | Score 0 for this more secure authentication |
MyPC via SirsiDynix Web Services | Score 0 for this more secure authentication |
PCReservation (Envisionware) via SIP2 | Score 10 for this insecure authentication |
PCReservation (Envisionware) via III Patron API | Score 0 for this more secure authentication |
PCReservation (Envisionware) via SirsiDynix Web Services API | Score 0 for this more secure authentication |
Does your library use any of the following self-check systems? | |
Bibliotheca/3M self-checks using SIP2 | Score 10 for this insecure authentication |
D-Tech self-checks using SIP2 | Score 10 for this insecure authentication |
Envisionware self-checks using SIP2 | Score 10 for this insecure authentication |
Does your library use any of the following solutions or techniques? | |
Does your library OPAC utilize HTTPS 100% of the time? | Score 0 if yes, score 10 for no |
Does your library use an Automated Material Handler using SIP2? | Score 10 for this insecure authentication |
Does your library review and purge computer reservation server data? | Score 0 if yes, score 10 for no |
Does your ILS require a SIP2 connection to have a login and password? | Score 0 if yes, score 10 for no |
Does your library actively rotate and purge ILS server logs? | Score 0 if yes, score 10 for no |
Separate VLANs for staff vs public vs public WiFi | Score 0 for yes, score 20 for no |
VPN to hosted ILS (consortium or with vendor) | Score 0 for yes, score 20 for no |
VPN client on staff laptop to connect to library network | Score 0 for yes, score 20 for no |
Your Library Security Score Total | | 0
   
   
Scores 90 or Higher
Your library is extremely insecure with its user data and steps should be taken immediately to start lowering your score. Begin by talking to your IT staff to find out whether your vendors have solutions other than SIP2 to connect to your library ILS, and create a plan to lower your score by 40 points over the next year. If you do not have a VPN or VLANs, establish a VPN to the ILS or hosting library consortium and implement VLANs within your network.

Scores 50 – 70
Your library has some insecure areas it needs to focus on, but you are not terrible. The little things matter, such as moving away from SIP2 when you have the option to do so.

Scores 30 – 60
Your library is pretty secure with its data! Take a look at the few nonzero scores and see if you can turn those into zeros over the next year.

Scores 0 – 20
Congratulations on putting your library data on the most secure footing possible! Make sure to reward your library IT staff and thank your vendors for providing secure options to help protect your user data.
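If you would rather tally the quiz in a script than by hand, the arithmetic is a simple sum. The Python sketch below copies the point values for a handful of quiz rows. The short answer labels are hypothetical names invented for this example, and the table would need to be extended with the remaining rows before real use.

```python
# Points for a few quiz answers, taken from the quiz rows above.
# The keys are made-up labels; add the remaining rows the same way.
QUIZ_POINTS = {
    "staff_client_workflows": 10,
    "overdrive_via_sip2": 10,
    "overdrive_unknown_method": 30,
    "selfcheck_via_sip2": 10,
    "opac_not_always_https": 10,
    "no_separate_vlans": 20,
    "no_vpn_to_hosted_ils": 20,
}


def security_score(answers):
    """Sum the points for every answer that applies to the library."""
    unknown = set(answers) - set(QUIZ_POINTS)
    if unknown:
        # Fail loudly rather than silently ignoring a mistyped label.
        raise KeyError(f"unrecognized answers: {sorted(unknown)}")
    return sum(QUIZ_POINTS[answer] for answer in answers)
```

A library running WorkFlows, authenticating OverDrive over SIP2, and lacking separate VLANs would call `security_score(["staff_client_workflows", "overdrive_via_sip2", "no_separate_vlans"])` and get 40.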

Libraries & Leaky Data: Part 2

By Aaron Skog

In my first post of the series “Libraries & Leaky Data,” I provided an overview of how libraries are accumulating patron information in a variety of “hidden” areas of the library. I noted that if a library were to be subject to a ransomware attack, it is possible that patron information could be stolen from machines dedicated to print release, computer reservation, or self-checkout. For the second part of this series, I will explain how libraries are passing data through their networks and through the internet insecurely.

First, it is important to understand that data traversing the internet from one server location to another are by default insecure unless measures are taken to secure those transactions. So a library patron logging onto their OverDrive account to search and get ebooks is in good shape because the OverDrive website utilizes HTTPS, correct?

Not necessarily. What many libraries are doing is providing this authentication on the back-end of this transaction without any security whatsoever. So the patron actually might submit their barcode and PIN via HTTPS, but the communication back to the library’s integrated library system (ILS) from the vendor, e.g. OverDrive, is likely using SIP2 to verify this barcode and PIN. The back-end communication does this without HTTPS or a VPN to protect that transmission. This creates the illusion of data security to the public, but the reality is that the library’s go-to protocol for 3rd party connections (most likely SIP2) is usually deployed without any secure communication in place.

[Figure: Library data communication diagram]

What this means is that for every patron login with OverDrive, the OverDrive servers verify back to the library’s ILS using insecure methods (no HTTPS at all) and the ILS sends a trove of patron data back to OverDrive in plain text. Here are the 10 patron fields of information shared within a single SIP2 patron authentication query.

  1. User’s barcode
  2. User’s PIN/password
  3. User’s full name
  4. Address
  5. Email address
  6. Phone number
  7. Birthdate
  8. Gender
  9. Age category
  10. Fines owed
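To see what "plain text" means in practice, the Python sketch below assembles a simplified SIP2 Patron Information request and then reads it back exactly as anyone watching the network could. The two-letter field codes (AO institution, AA patron barcode, AC terminal password, AD patron password) follow the SIP2 specification as I understand it. The institution name, barcode, PIN, and timestamp are made-up sample values, and the optional sequence number and checksum fields are left out for brevity.

```python
# Fixed-length header of a Patron Information (63) request:
# 2 (message id) + 3 (language) + 18 (timestamp) + 10 (summary) = 33 characters.
HEADER_LEN = 33


def build_patron_info_request(barcode: str, pin: str) -> str:
    """Assemble a simplified SIP2 message 63 using made-up sample values."""
    timestamp = "20190601" + " " * 4 + "120000"  # YYYYMMDD, 4 blanks, HHMMSS
    header = "63" + "001" + timestamp + "Y" + " " * 9
    fields = f"AOmain|AA{barcode}|AC|AD{pin}|"   # "main" is a hypothetical institution id
    return header + fields


def readable_fields(message: str) -> dict:
    """Split the variable-length part of the message into {field code: value}."""
    body = message[HEADER_LEN:]
    return {chunk[:2]: chunk[2:] for chunk in body.split("|") if chunk}
```

Calling `readable_fields(build_patron_info_request("21234000012345", "1234"))` gives back the barcode under `AA` and the PIN under `AD` with no decryption step at all, which is the whole problem: nothing in the message is scrambled to begin with.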

This problem isn’t limited to OverDrive; it affects nearly every 3rd party hosted service a library is using. If your library is authenticating with SIP2, chances are that your other hosted services, such as room reservation, are doing the same thing: showing HTTPS on the patron/staff interface, but communicating without any security on the back-end.

Making matters worse, this insecure communication problem is also inherent in our ILS platforms. The ILS staff client communicating back to the ILS server is another source of data being sent back and forth by potentially insecure means. Some ILS platforms handle this well from a security standpoint, e.g. Polaris utilizes encryption within a remote desktop client. Other staff ILS clients require additional layers of security to prevent the client from sending or requesting data from the server in an insecure transaction. ILS platforms such as Symphony or Sierra utilize a staff client that will pass data back to the ILS server in plain text. Some of the newer web-based staff clients such as SirsiDynix BLUEcloud, Polaris LEAP, Evergreen, or Ex Libris Alma use HTTPS in the staff client, which is the ideal form of secure communication because it is end-to-end and requires no intermediate network security such as a VPN or VLAN.

How can we improve our library data security? I will outline the various approaches to improve and protect library data transmission in part 3 of this series.

Libraries & Leaky Data: Part 1

By Aaron Skog

The ILA Best Practices Committee has recently been tasked with studying the issues of patron privacy around the use of printed hold wrappers in public areas. It is good to see a focus on the most obvious aspects of protecting patrons’ privacy, since having a patron’s full name stuck on a book in a public area is just an outright problem when you think about it. If we attempt to square this practice with the widespread acceptance that a patron’s reading habits and their history of checkouts must be protected from prying eyes (such as government agencies or various Freedom of Information requests), we see the difficult balance between providing convenience and adhering to privacy. There are, however, other areas within library services where the patron data being “leaked” is not as easy to see as a hold wrapper printed with a patron name. These sources of data leaks can be found within the software ecosystem used commonly throughout libraries.

What are these potential sources of library data being leaked? Below are some of the more widely used pieces of library technology which potentially have your library patron data or require accessing your patron data at some point within their functions.

  • Integrated Library System
  • Discovery OPAC
  • Self-checks
  • Computer Reservation
  • Print Stations
  • Automated Material Handlers (AMH)

All of these software systems, either by design or through their back-end structure, may collect patron information within their databases or software logging processes. These systems can run for years, quietly collecting data, as they sit somewhat inconspicuously on the library network.

The worst culprit within the library software ecosystem for leaking patron information into your library network is the Standard Interchange Protocol, otherwise known as SIP2. The widespread use of SIP2 was due to our need for standardization of data exchanges between library software systems. This led somewhat innocently to the SIP2 protocol being used far and wide in library technology. Nearly every software vendor that wants to sell software services to a library will use or work with SIP2. Any library software service that queries the ILS can do this through the use of SIP2, so the adoption by libraries of SIP2 on their networks is near universal.

How bad is SIP2 in terms of data security? Pretty bad in terms of how it is typically deployed “out of the box” within library networks. Here are the 10 patron fields of information shared within a single SIP2 patron authentication query.

  1. User’s barcode
  2. User’s PIN/password
  3. User’s full name
  4. Address
  5. Email address
  6. Phone number
  7. Birthdate
  8. Gender
  9. Age category
  10. Fines owed

A single query to see if a patron can gain access to a library computer or to a service will send all 10 fields of patron data across the network, even though the service only needs to verify that the patron is in “good standing.” It doesn’t matter if the service only needs to see one of the fields: SIP2 sends all 10 fields of data in response to a query.

With the widespread use of the SIP2 protocol within library networks and the preponderance of various systems within the library such as multiple self-check stations or print stations, all of which likely use SIP2 to talk to the ILS, you have a lot of patron data being sent around the library network. Making this problem worse, all of these data fields are sent in plain text, which includes the patron’s PIN/password. Many systems’ software logging processes will save every SIP2 transaction into a file that can easily hold hundreds of patrons’ passwords and potentially thousands of transactions showing a patron checking out an item. These computer stations typically utilize local logging or small-scale databases for the purposes of providing libraries statistics on usage at the individual stations. Unless active measures are undertaken to purge logs and remove data collected, libraries have patron data stored throughout library desktops and servers beyond the typically more secure ILS.
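Where a log has to be kept for a while, one stopgap is to scrub the password field from logged SIP2 messages. The Python sketch below is a minimal illustration of that idea, not a feature of any particular product. It assumes each log line contains the raw SIP2 message and relies on `AD` being the SIP2 field code for the patron password; the timestamp prefix in the example line is a made-up log format.

```python
import re

# Match a complete "AD" (patron password) field: it starts right after a
# "|" delimiter and runs up to the next "|". Requiring the leading "|"
# avoids touching the letters "AD" inside some other field's value.
_PASSWORD_FIELD = re.compile(r"\|AD[^|]*\|")


def redact_passwords(log_line: str) -> str:
    """Replace the value of every SIP2 AD field in a log line."""
    return _PASSWORD_FIELD.sub("|AD[REDACTED]|", log_line)
```

For a hypothetical line ending in `AOmain|AA21234000012345|AC|AD1234|`, the function leaves the barcode in place and rewrites the tail as `|AC|AD[REDACTED]|`. Redaction only reduces what is stored; the password still crossed the network in plain text, so this does not replace purging logs or securing the connection.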

It is usually at this stage of describing the problem that some people question the severity of the issue. Some will minimize the likelihood of this data getting hacked or stolen from the library network, or take solace in the library being a small, unworthy target for anyone with malicious intent. While it is true that we have largely been helped by being a small, perhaps less juicy, target for a data hack, network attacks have now reached a more ruthless level. Ransomware attacks simply do not care who their target is: they run through an automated series of software exploits to hijack any computer or server, then steal its local data or encrypt it for ransom. This has occurred at the National Health Service in the United Kingdom, on government networks in dozens of countries, and more recently on the City of Baltimore’s servers. It has even happened to public libraries.

We can no longer sit idly by and wait for the data to be stolen under this scenario and the ensuing PR and financial liability nightmare to befall us. If this were to happen to a library, wouldn’t it be better to know that the only source of patron data available was at a single point on your network rather than dozens? Or that it was understood precisely where this patron data resides and to take better efforts to protect the data on that device? Over a series of blog posts, I will outline the steps to take to help libraries understand the network protocols, ILS configuration strategies, network design, storage and logging that should be considered when undertaking an overall audit of your library’s leaky data problem.

The Great Recession Impact on Illinois Public Libraries: Part 2

By Aaron Skog

US public library usage during the 12-year window around the Great Recession shows a dramatic increase in the 2009-2011 years. Illinois public library usage in the areas of visits, circulation, and program attendance also shows evidence of the Great Recession, primarily in the 2009-2011 years. The totals for the Illinois public library metrics for library visits, circulation, and program attendance are below. However, a closer look at the data shows that not all Illinois public libraries appear to have been impacted equally by the Great Recession.

Illinois public library visits peaked in 2010. Visits fell over the following seven years, and by 2017 they were below the 2006 total.

Illinois public library circulation totals also peaked in 2010, the same year as library visits.

Library program attendance has largely defied the Great Recession impact, showing a modest increase over the 2008-2017 period. While this metric is a sign of success, it is worth pointing out that program attendance represents 0.4% of the total public library visits.

Chicago Public Library visits during the 12 years of 2006-2017 reflect the national trend of library visits peaking in 2009 and then falling each year after the Great Recession.

For the libraries serving large populations over 100,000 other than Chicago Public Library, the visits for the 2006-2017 period reflect the Great Recession trends, but not as clearly. The use of a graph also highlights some data anomalies, such as the Naperville Public Library visits in 2016 (see the top blue line which falls in 2016). Schaumburg Public Library also has a steep decline for the 2017 year, but this is the first year since 2008 in which visits data is not rounded to a broad number, which may reflect some change in how visits are recorded at the library.

For the libraries serving a population between 60,000 and 100,000, the visits do not reflect the Great Recession trends overall. There again are some indications of data anomalies, such as the Arlington Heights 2016 library visits reported to IPLAR, but overall the trend for this group of libraries shows visits are holding steady compared to the other public libraries within this population group.

For the libraries serving a population between 50,000 and 60,000 there is a strong indication of the Great Recession in the 2011 year for the Oak Park Public Library. The visits for Oak Park Public Library are strong for this population group, showing that this library is consistently “punching above its weight.”

There are 583 Illinois public libraries under the 50,000-population threshold. Their combined library visits reflect a 2011 peak in library visits, two years after the US public library national trend.

In conclusion, the IMLS data for Illinois during the 12-year range of 2006-2017 shows the Great Recession impact for the Chicago Public Library, Oak Park Public Library, and those 583 combined libraries under the 50,000-service population. Those libraries above 50,000 population served (excluding the Chicago Public Library) do not show a Great Recession impact as dramatically.

The Great Recession Impact on US Public Libraries

By Aaron Skog

With talk of a potential new economic recession, it seems appropriate to assess how the last recession, the Great Recession, impacted public libraries in the US. The Institute of Museum and Library Services (IMLS) makes its public library survey data available online, and if we examine this data on a state-by-state basis, or as a whole for the United States, we can see the dramatic impact the 2007-2009 Great Recession had on public libraries.

It is important to understand the scope and size of the impact of the Great Recession. If we only look at data for the past 5 years, we would see an alarming downward trend. (Please note: IMLS public library survey data is only available through 2017.)

If your library leadership saw this sort of trendline, alarm bells would be sounding about the precipitous decline in library visits. However, a much wider view of the data is needed in order to see what is truly happening.

What we are seeing is that the impact of the Great Recession still reverberates throughout US public libraries nine years after the fact. US Public library usage peaked in 2009, the year the Great Recession technically ended. Overall, US public library visits are now below the 2006 usage levels.

Using the 8 states in the US Great Lakes region as a comparison, we can see that Illinois public libraries experienced a steep rise in library visits, peaking in 2011 at 83,234,090. Since that year, Illinois has seen public library visits falling steadily.

Illinois public library circulation metrics also reflect this rise and fall, peaking in 2010 at 121,828,806 and falling 12% over the next 7 years. Illinois circulation numbers are holding steady for 2016-2017.

There are some bright spots in the data. Public library program attendance has been steadily rising, and the public library survey data shows none of the decline seen in circulation and library visits. It is worth noting, however, that Illinois public library program attendance represents only 0.4% of total Illinois public library visits.

What can library leadership and staff do with this data?

  • If library trustees are overly focused on the year to year circulation and library visit counts, they should be made aware of the overall impact of the 2009 Great Recession. Context is key!
  • Regions matter—if your public library is seeing these downward trends, chances are that your neighboring libraries in the region are seeing the same trends.

The Illinois Public Library Annual Survey (IPLAR) data is provided to the IMLS, so your hard work each year helps with research and analysis. You can look forward to my next post on how Illinois public libraries individually are trending in these same metrics.