Libraries & Leaky Data: Part 3

By Aaron Skog

Part 1 of this series provided an overview of how library user data ends up in a variety of places within your library other than just the ILS. Part 2 explained how your library services communicate over the network or across the internet in a variety of insecure ways. Part 3 covers the steps you can take to secure your library data.

Here are recommendations on the best approaches to protecting library data. These are standard practices within the IT industry and there are many technical resources available on how to accomplish these steps.

Use Firewalls, AKA The Internet is Made of Ports

At its most basic, your library firewall works by relying on the known network communication standards for various software and services. Each service communicates on a numbered channel called a port. Knowing which ports your services use allows a firewall to be configured to block a computer outside your network from reaching something inside it. Many firewall appliances include security features to isolate threats, such as blocking a user who connects to your library WiFi and initiates a port scan from their laptop. It is possible the user unknowingly has a laptop infected with malware that is constantly looking for ways to spread to other devices.
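To make the idea of ports concrete, here is a minimal Python sketch that tests whether a given port on a host accepts connections. The port list is an assumption for illustration: 6001 is often seen for SIP2, but your ILS may use a different number, so check with your ILS administrator.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered, timed out, or unresolvable: treat as not reachable.
        return False


# Hypothetical list of ports worth checking; adjust to your own services.
PORTS_TO_CHECK = {
    22: "SSH",
    443: "HTTPS",
    3389: "Remote Desktop",
    6001: "SIP2 (assumed port)",
}


def report(host):
    """Map each port in PORTS_TO_CHECK to whether it is reachable on host."""
    return {port: is_port_open(host, port) for port in PORTS_TO_CHECK}
```

Running report() against your library's public address from outside the building (and only against systems you are authorized to test) gives a rough picture of what the firewall lets through. Anything that comes back True and is not meant to be public deserves a closer look.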

Segment Your Network

The easiest way to envision network segmentation is to imagine every staff computer’s network cable in your library going to a dedicated network switch that does not connect to the public computers. Each group of computers is segmented from the others, and the groups will not “see” each other on the network. It is possible to do this on the physical level, and it is fairly easy to pull off without technical know-how. This basic approach can get expensive, however, as you are duplicating switches and other equipment throughout your building. This is where virtual LANs can help.

All libraries should utilize virtual LAN segmentation within their local area network (LAN) as a basic rule for network security. This does take time and careful planning as every network layout is different. Below is a chart showing how one might group devices on your library network.

Virtual LAN Segments	Examples of Groups of Computers, Devices	Reason to Group Together
VLAN1	Staff computers, staff wifi	User Data Present & Communicating Across Network
VLAN2	Self-check stations, print release stations, computer reservation stations	User Data Authenticating, Possibly Logging on Machines
VLAN3	Public computers	100% Restricted from Accessing VLAN1, Limited VLAN2 Access
VLAN4	Public Wifi	100% Restricted from VLAN1, VLAN2 (depending), VLAN3
VLAN5	Servers on the network are segmented on their own and can only communicate to VLANs 1-4 on the specific ports.	Most restricted access to these computer servers
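The grouping in the chart above can be written down as a small default-deny policy before anyone touches a switch. The Python sketch below is only a planning aid, not a switch or firewall configuration: the subnets, the port numbers, and the specific allowances are hypothetical and would need to match your own network.

```python
import ipaddress

# Hypothetical subnets, one per VLAN in the chart.
VLANS = {
    "VLAN1": ipaddress.ip_network("10.0.1.0/24"),  # staff computers, staff WiFi
    "VLAN2": ipaddress.ip_network("10.0.2.0/24"),  # self-check, print release, reservation
    "VLAN3": ipaddress.ip_network("10.0.3.0/24"),  # public computers
    "VLAN4": ipaddress.ip_network("10.0.4.0/24"),  # public WiFi
    "VLAN5": ipaddress.ip_network("10.0.5.0/24"),  # servers
}

# (source VLAN, destination VLAN) -> allowed TCP ports (assumed values).
# Any pair or port not listed here is denied.
ALLOWED = {
    ("VLAN1", "VLAN5"): {443},
    ("VLAN2", "VLAN5"): {6001},
    ("VLAN3", "VLAN2"): {443},  # the "limited VLAN2 access" for public computers
}


def vlan_of(address):
    """Return the name of the VLAN containing address, or None."""
    ip = ipaddress.ip_address(address)
    for name, network in VLANS.items():
        if ip in network:
            return name
    return None


def is_allowed(src, dst, port):
    """Default-deny check for traffic from src to dst on a TCP port."""
    src_vlan, dst_vlan = vlan_of(src), vlan_of(dst)
    if src_vlan is None or dst_vlan is None:
        return False
    if src_vlan == dst_vlan:
        return True  # traffic inside one VLAN is not filtered by these rules
    return port in ALLOWED.get((src_vlan, dst_vlan), set())
```

For example, is_allowed("10.0.3.15", "10.0.1.20", 443) returns False because public computers are never listed as a source for the staff VLAN. Writing the policy out this way makes it easier to review with whoever configures the network equipment.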

Establish a Virtual Private Network (VPN)

For secure data transport, it is vital that library consortia members utilize a VPN if their staff ILS client does not natively communicate securely back to the ILS server. This is especially important for Symphony WorkFlows users and Millennium/Sierra users.

Standalone libraries should also consider a VPN if their ILS is hosted. The staff clients of some older ILS do not use secure transport (notably SirsiDynix Symphony and Innovative’s Millennium/Sierra).

Move Away from Standard Interchange Protocol (SIP2)

As noted in Part 1 and Part 2 of this blog post series, SIP2 is natively insecure and a poor way to connect your library to other 3rd party services. Unless your library utilizes a VPN between you and your hosted service, SIP2 is simply communicating through the internet in plain text, leaking patron data such as addresses, birth dates, and passwords.

Vendors that many libraries use, such as SirsiDynix or Innovative Interfaces, have ways of transporting your library user data other than SIP2. You would need to inquire whether application programming interfaces (APIs) are available for this purpose.

However, the majority of 3rd party library vendors do not offer alternate ways of connecting to your library ILS other than using SIP2. Make sure to inquire with your vendor representative during your annual renewal if alternate methods have been developed or are under consideration.

Understand Your Self-Service Systems

In Part 1 of this blog post series, I noted that many self-check systems and self-service print release stations will retain user data for the purpose of generating statistical reports for the library. It is important to establish a set procedure for how long this data is retained on these stations. Once your statistical reports are generated, purging the logs or clearing the local database should be treated as routine work by library staff.
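A routine purge like this can be scripted. The Python sketch below removes log files older than a chosen number of days; the directory, the "*.log" pattern, and the retention period are assumptions, since every self-check or print release product stores its logs differently. It defaults to a dry run so you can see what would be removed first.

```python
import time
from pathlib import Path


def purge_old_logs(log_dir, max_age_days, pattern="*.log", dry_run=True):
    """Return log files older than max_age_days; delete them unless dry_run."""
    cutoff = time.time() - max_age_days * 86400  # 86400 seconds per day
    matched = []
    for path in sorted(Path(log_dir).glob(pattern)):
        # Compare the file's last-modified time against the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            if not dry_run:
                path.unlink()
            matched.append(path)
    return matched
```

A call such as purge_old_logs("C:/SelfCheck/logs", 30) (a made-up path) only lists the files past the retention period. Passing dry_run=False actually deletes them, so generate your statistical reports before running it that way.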

Understand Your Integrated Library System

There are some ILS that also log user transactions on the server as a separate process from circulation transactions. These logs should also be considered for periodic rotation and retention per your library data policy. Symphony, for example, keeps logs that can go back to the first day the system was put into production. Your library ILS administrator can provide you with additional details on ILS logging, or you can open an inquiry with your ILS vendor.

Take the Library Security Quiz

To assist libraries in assessing their data security, I have created an assessment tool to determine a security score. It will take a library director or management team some time to answer the questions and arrive at the final score.

Question	Answer	Your Library Score
Which is your library ILS staff client? (Keep in mind the staff client is different from the ILS server software)
WorkFlows	Score 10 for this insecure staff client
Sierra	Score 10 for this insecure staff client
Polaris	Score 0 for this remote desktop client
Polaris LEAP	Score 0 for this web-based client
BLUEcloud Staff	Score 0 for this web-based client
Evergreen	Score 0 for this web-based client
Koha	Score 0 for this web-based client
OCLC WorldShare Management System	Score 0 for this web-based client
Horizon	Score 10 for this insecure staff client
Voyager	Score 10 for this insecure staff client

Does your library connect to the following services?
OverDrive via SIP2	Score 10 for this insecure authentication
OverDrive via SirsiDynix Web Services	Score 0 for this more secure authentication
OverDrive via III Patron API	Score 0 for this more secure authentication
OverDrive is authenticating, but our library does not know how	Score 30 for not knowing
Evanced Solutions via SIP2	Score 10 for this insecure authentication
Bibliotheca Cloudlibrary via SIP2	Score 10 for this insecure authentication
Bibliotheca Cloudlibrary via SirsiDynix Web Services	Score 0 for this more secure authentication
User data sent to Unique Management via email for collection purposes	Score 10 for this insecure transfer
User data sent to Unique Management via SFTP for collection purposes	Score 0 for this more secure transfer
Hoopla via SIP2	Score 10 for this insecure authentication
Hoopla via SirsiDynix Web Services	Score 0 for this more secure authentication
Hoopla via III Patron API	Score 0 for this more secure authentication
MyPC via SIP2	Score 10 for this insecure authentication
MyPC via III Patron API	Score 0 for this more secure authentication
MyPC via SirsiDynix Web Services	Score 0 for this more secure authentication
PCReservation (Envisionware) via SIP2	Score 10 for this insecure authentication
PCReservation (Envisionware) via III Patron API	Score 0 for this more secure authentication
PCReservation (Envisionware) via SirsiDynix Web Services API	Score 0 for this more secure authentication

Does your library use any of the following self-check systems?
Bibliotheca/3M self-checks using SIP2	Score 10 for this insecure authentication
D-Tech self-checks using SIP2	Score 10 for this insecure authentication
Envisionware self-checks using SIP2	Score 10 for this insecure authentication

Does your library use any of the following solutions or techniques?
Does your library OPAC utilize HTTPS 100% of the time?	Score 0 if yes, score 10 for no
Does your library use an Automated Material Handler using SIP2?	Score 10 for this insecure authentication
Does your library review and purge computer reservation server data?	Score 0 if yes, score 10 for no
Does your ILS require a SIP2 connection to have a login and password?	Score 0 if yes, score 10 for no
Does your library actively rotate and purge ILS server logs?	Score 0 if yes, score 10 for no
Separate VLANs for staff vs public vs public WiFi	Score 0 for yes, score 20 for no
VPN to hosted ILS (consortium or with vendor)	Score 0 for yes, score 20 for no
VPN client on staff laptop to connect to library network	Score 0 for yes, score 20 for no



Your Library Security Score Total		0


Scores 90 or Higher
Your library is extremely insecure with its user data and steps should be taken immediately to start lowering your score. Begin by talking to your IT staff to find out whether your vendors have solutions other than SIP2 to connect to your library ILS, and create a plan to lower your score by 40 points over the next year. If you have not already done so, establish a VPN to the ILS or hosting library consortium and implement VLANs within your network.


Scores 70 – 80
Your library has some insecure areas it needs to focus on, but you are not terrible. The little things matter such as moving away from SIP2 usage when you have the option to do so.

Scores 30-60
Your library is pretty secure with its data! Take a look at the few non-zero scores and see if you can turn those into zeros over the next year.

Scores 0 – 20
Congratulations on putting your library data on the most secure footing possible! Make sure to reward your library IT staff and thank your vendors for providing secure options to help protect your user data.
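If you would rather tally the quiz in a script than by hand, a minimal Python sketch follows. Every score in the quiz is a multiple of 10, and the sketch assumes cut-offs at 30, 70, and 90 so that each total lands in exactly one band. Those cut-offs and the band labels are my reading of the score descriptions, so adjust them if you interpret the ranges differently.

```python
def total_score(answers):
    """Add up the points for each quiz answer (label -> points)."""
    return sum(answers.values())


def rating(score):
    """Map a total to a band, assuming cut-offs at 30, 70, and 90."""
    if score >= 90:
        return "extremely insecure"
    if score >= 70:
        return "some insecure areas"
    if score >= 30:
        return "pretty secure"
    return "most secure footing"


# Example answers for a hypothetical library, using points from the quiz.
example = {
    "Staff client: WorkFlows": 10,
    "OverDrive via SIP2": 10,
    "OPAC not on HTTPS 100% of the time": 10,
    "No separate VLANs": 20,
    "No VPN to hosted ILS": 20,
}
```

For the example answers, total_score(example) is 70, which this sketch rates as "some insecure areas".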

Libraries & Leaky Data: Part 2

By Aaron Skog

In my first post of the series “Libraries & Leaky Data,” I provided an overview of how libraries are accumulating patron information in a variety of “hidden” areas of the library. I noted that if a library were to be subject to a ransomware attack, it is possible that patron information could be stolen from machines dedicated to print release, computer reservation, or self-checkout. For the second part of this series, I will explain how libraries are passing data through their networks and through the internet insecurely.

First, it is important to understand that data traversing the internet from one server location to another are by default insecure unless measures are taken to secure those transactions. So a library patron logging onto their OverDrive account to search and get ebooks is in good shape because the OverDrive website utilizes HTTPS, correct?

Not necessarily. What many libraries are doing is providing this authentication on the back-end of this transaction without any security whatsoever. So the patron actually might submit their barcode and PIN via HTTPS, but the communication back to the library’s integrated library system (ILS) from the vendor, e.g. OverDrive, is likely using SIP2 to verify this barcode and PIN. The back-end communication does this without HTTPS or a VPN to protect that transmission. This creates the illusion of data security to the public, but the reality is that the library’s go-to protocol for 3rd party connections (most likely SIP2) is usually deployed without any secure communication in place.

Description: Library Data Communication Diagram
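For the technically inclined, the difference between a plain and a protected back-end connection is small in code terms. The Python sketch below opens a TLS-encrypted channel using only the standard library. It assumes there is a TLS endpoint in front of the SIP2 service (for example a TLS tunnel run by your consortium or vendor), and the host and port are hypothetical. Whether your ILS or vendor supports this is something to confirm with them.

```python
import socket
import ssl


def make_tls_context():
    """Client-side TLS context that verifies the server certificate and hostname."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    return context


def open_secure_channel(host, port, timeout=10.0):
    """Connect to host:port and wrap the socket in TLS before any data is sent."""
    raw = socket.create_connection((host, port), timeout=timeout)
    try:
        return make_tls_context().wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

With something like open_secure_channel("sip.example.org", 6443) in place of a bare TCP connection, the barcode and PIN travel encrypted instead of in plain text. A VPN achieves a similar result one layer down, without changing the application.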

What this means is that for every patron login with OverDrive, the OverDrive servers verify back to the library’s ILS using insecure methods (no HTTPS at all) and the ILS sends a trove of patron data back to OverDrive in plain text. Here are the 10 patron fields of information shared within a single SIP2 patron authentication query.

User’s barcode
User’s PIN/password
User’s full name
Address
Email address
Phone number
Birthdate
Gender
Age category
Fines owed

This problem isn’t limited to OverDrive; it affects nearly every 3rd party hosted service a library is using. If your library is authenticating with SIP2, the chances are that your other hosted services such as room reservation are doing the same thing: showing HTTPS on the patron/staff interface, but communicating without any security on the back-end.

Making matters worse, this insecure communication problem is also inherent in our ILS platforms. The ILS staff client communicating back to the ILS server is another source of data being sent back and forth by potentially insecure means. Some ILS platforms handle this well from a security standpoint, e.g. Polaris utilizes encryption within a remote desktop client. Other staff ILS clients require additional layers of security to prevent the client from sending or requesting data from the server in an insecure transaction. ILS platforms such as Symphony or Sierra utilize a staff client that will pass data back to the ILS server in plain text. Some of the newer web-based staff clients such as SirsiDynix BLUEcloud, Polaris LEAP, Evergreen, or Ex Libris Alma utilize HTTPS security on the staff client, which is the ideal secure communication as it is end-to-end and requires no intermediate network security such as a VPN or VLAN.

How can we improve our library data security? I will outline the various approaches to improve and protect library data transmission in part 3 of this series.

Libraries & Leaky Data: Part 1

By Aaron Skog

The ILA Best Practices Committee has recently been tasked with studying the issues of patron privacy around the use of printed hold wrappers in public areas. It is good to see a focus on the most obvious aspects of protecting patrons’ privacy, since having a patron’s full name stuck on a book in a public area is just an outright problem when you think about it. If we attempt to square this practice with the widespread acceptance that a patron’s reading habits and their history of checkouts must be protected from prying eyes (such as government agencies or various Freedom of Information requests), we see the difficult balance between providing convenience and adhering to privacy. There are, however, other areas within library services where the patron data being “leaked” is not as easy to see as a hold wrapper printed with a patron name. These sources of data leaks can be found within the software ecosystem used commonly throughout libraries.

What are these potential sources of library data being leaked? Below are some of the more widely used pieces of library technology which potentially have your library patron data or require accessing your patron data at some point within their functions.

Integrated Library System
Discovery OPAC
Self-checks
Computer Reservation
Print Stations
Automated Material Handlers (AMH)

All of these software systems, either by design or through their back-end structure, may collect patron information within their databases or software logging process. These systems can run for years, quietly collecting data, as they sit somewhat inconspicuously on the library network.

The worst culprit within the library software ecosystem for leaking patron information into your library network is the Standard Interchange Protocol, otherwise known as SIP2. The widespread use of SIP2 was due to our need for standardization of data exchanges between library software systems. This led somewhat innocently to the SIP2 protocol being used far and wide in library technology. Nearly every software vendor that wants to sell a software service to any library will use or work with SIP2. Any library software service that queries the ILS can do this through the use of SIP2, so the adoption by libraries of SIP2 on their networks is near universal.

How bad is SIP2 in terms of data security? Pretty bad in terms of how it is typically deployed “out of the box” within library networks. Here are the 10 patron fields of information shared within a single SIP2 patron authentication query.

User’s barcode
User’s PIN/password
User’s full name
Address
Email address
Phone number
Birthdate
Gender
Age category
Fines owed

A single query to see if a patron can gain access to a library computer or to a service will send all 10 fields of patron data across the network, even when the service only needs to verify that the patron is in “good standing.” It doesn’t matter if the service only needs to see one of the fields: SIP2 sends all 10 fields of data in response to a query.
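To see why this matters, consider how little effort it takes to read a SIP2 message. SIP2 messages are plain text, with variable-length fields separated by the "|" character and identified by two-letter codes. The Python sketch below pulls the personal fields out of two fabricated samples. The field codes are the ones commonly documented for SIP2, the fixed-length header that starts a real message is omitted for brevity, and vendors may add or omit fields, so treat the details as an illustration rather than a reference.

```python
# Two-letter codes as commonly documented for SIP2 (an assumption; vendors vary).
FIELD_NAMES = {
    "AA": "patron barcode",
    "AD": "patron PIN/password",
    "AE": "full name",
    "BD": "home address",
    "BE": "email address",
    "BF": "phone number",
    "BV": "fines owed",
}

# Fabricated samples: only the variable-length fields, no fixed header.
SAMPLE_REQUEST = "AOMAIN|AA21234000000001|AD1234|"
SAMPLE_RESPONSE = (
    "AOMAIN|AA21234000000001|AEJane Example|BD123 Example St|"
    "BEjane@example.org|BF555-0100|BV4.50|"
)


def parse_fields(message):
    """Split a message on '|' into {two-letter code: value}."""
    fields = {}
    for chunk in message.split("|"):
        if len(chunk) >= 2:
            fields[chunk[:2]] = chunk[2:]
    return fields


def exposed(message):
    """Return the personal fields anyone watching the network could read."""
    parsed = parse_fields(message)
    return {FIELD_NAMES[code]: value for code, value in parsed.items() if code in FIELD_NAMES}
```

Calling exposed(SAMPLE_REQUEST) returns the barcode and PIN, and exposed(SAMPLE_RESPONSE) returns the name, address, email, phone number, and fines. No decryption step is involved, because nothing was encrypted in the first place.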

With the widespread use of the SIP2 protocol within library networks and the preponderance of various systems within the library such as multiple self-check stations or print stations, all of which likely use SIP2 to talk to the ILS, you have a lot of patron data being sent around the library network. Making this problem worse, all of these data fields are sent in plain text, which includes the patron’s PIN/password. Many systems’ software logging processes will save every SIP2 transaction into a file that can easily have hundreds of patrons’ passwords and potentially thousands of transactions showing a patron checking out an item. These computer stations typically utilize local logging or small-scale databases for the purposes of providing libraries statistics on usage at the individual stations. Unless active measures are undertaken to purge logs and remove data collected, libraries have patron data stored throughout library desktops and servers beyond the typically more secure ILS.
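Where a log has to be kept for statistics, one option is to scrub the sensitive fields from it first. The Python sketch below replaces the values of selected SIP2 fields with a placeholder. It assumes the logs contain raw SIP2 messages with "|"-delimited fields, and that AD, BD, BE, and BF are the codes for password, address, email, and phone, as commonly documented. Check a sample of your own logs before relying on it.

```python
import re

# Match a sensitive field code right after a '|' delimiter, plus its value.
# The list of codes is an assumption; extend it to suit your data policy.
SENSITIVE = re.compile(r"(?<=\|)(AD|BD|BE|BF)[^|]*")


def redact_line(line):
    """Replace the value of each sensitive field with a placeholder."""
    return SENSITIVE.sub(lambda match: match.group(1) + "[REDACTED]", line)


def redact_lines(lines):
    """Apply redact_line to every line of a log."""
    return [redact_line(line) for line in lines]
```

For example, redact_line("AOMAIN|AA21234000000001|AD1234|") keeps the barcode for counting transactions but turns the PIN into AD[REDACTED]. Redaction is not a substitute for purging: the barcode and transaction history remain in the file until it is deleted.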

It is usually at this stage of describing the problem where there is some questioning of the severity of the issue. Some folks will minimize the likelihood of this data getting hacked or stolen from the library network. Or they will take solace in the library being a small, unworthy target for any malicious intent. While it is true we have been largely helped by the fact we are a small, perhaps less juicy, target of a data hack, network attacks have now reached a more ruthless level. Ransomware attacks simply do not care who their target is. They run through an automated series of software exploits to hijack any computer or server, then steal its local data or encrypt it for ransom. This has occurred at the National Health Service in the United Kingdom, dozens of countries’ government networks, and more recently the City of Baltimore’s servers. It has even happened to public libraries.

We can no longer sit idly by and wait for the data to be stolen under this scenario and the ensuing PR and financial liability nightmare to befall us. If this were to happen to a library, wouldn’t it be better to know that the only source of patron data available was at a single point on your network rather than dozens? Or that it was understood precisely where this patron data resides and to take better efforts to protect the data on that device? Over a series of blog posts, I will outline the steps to take to help libraries understand the network protocols, ILS configuration strategies, network design, storage and logging that should be considered when undertaking an overall audit of your library’s leaky data problem.