The e-reader privacy paradox

In his book “Intellectual privacy: rethinking civil liberties in the digital age” (Oxford University Press), Neil Richards talks about the e-reader privacy paradox.

He uses Fifty Shades of Grey to illustrate his point. Though print editions of the book were hard to find in Britain and the United States, the book sold millions of copies as an e-book. Its largely female readership repeatedly praised the privacy that the e-book version allowed.

An article in the New York Times makes the same point (Julie Bosman, “Discreetly digital, erotic novel sets American women abuzz”). It quotes Valerie Hoskins, agent to the author of Fifty Shades of Grey, as saying “…women have the ability to read this kind of material without anybody knowing what they’re reading, because they can read them on their iPads and Kindles.”

But of course, Amazon knows what you read on your Kindle, what pages you have looked at, what annotations you have made, how long you have spent reading the novel, where you are up to and so on. The same would be true of Adobe if you had used the Adobe Digital Editions software.

Richards is quite right to speak of there being an e-reader privacy paradox. Isn’t that true of the internet too – that people are lured into a false sense of security? They can search the web from the privacy of their own homes, which creates the illusion of privacy, and yet those searches are far from private.

If you went into a library and were followed around by someone – whether a member of library staff or another library user – wouldn’t you start to feel uncomfortable, almost as though someone were stalking you, watching your every move: what books you browsed on the shelves, which subject areas you headed towards, which titles you picked off the shelves to scan through, which books you took to the self-checkout machine and so on? Wouldn’t you be outraged? So why are people not outraged by the tracking that takes place on the web, which is far more pernicious – for example the tracking that occurs if they use Google Books, or Amazon’s “search inside the book” feature? Surely that sort of tracking is much worse, because huge quantities of data are gathered with ruthless efficiency, building up a profile about you that can be kept indefinitely.

And what about the tracking that takes place when you use some library websites (ones that track in the form of analytics software, advertisers, social networking plugins, and the like)? Marshall Breeding (in Privacy and security for library systems – Chapter 3: Data from library implementations. Library Technology Reports, 52(4) 2016, pp. 29-35) notes the tracking that was found on library websites, discovery services and online catalogs. It included (among others) Google Analytics, Google Ajax search API, Google AdSense, Google Translate, Google Tag Manager, DoubleClick, Yahoo Analytics, Adobe Omniture Analytics, Adobe Tag Manager, Adobe TypeKit, Facebook Connect, Facebook Social Plugin, Twitter Button and WebTrends.
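To make this kind of tracking more concrete, here is a minimal Python sketch of how the third-party resources embedded in a catalogue page might be spotted. It is only an illustration, not the method Breeding used: the host-to-name mapping `KNOWN_TRACKER_HOSTS`, the helper `find_trackers` and the sample page are all assumptions made for the example, and a real audit would need a much fuller list and would also have to account for scripts that are loaded dynamically.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed, illustrative mapping of third-party hosts to tracker names.
# It is NOT taken from Breeding's report and is far from complete.
KNOWN_TRACKER_HOSTS = {
    "google-analytics.com": "Google Analytics",
    "googletagmanager.com": "Google Tag Manager",
    "doubleclick.net": "DoubleClick",
    "connect.facebook.net": "Facebook Connect",
    "platform.twitter.com": "Twitter Button",
}


class ResourceCollector(HTMLParser):
    """Collects the src/href URLs of scripts, images, iframes and links."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "img", "iframe", "link"):
            for name, value in attrs:
                if name in ("src", "href") and value:
                    self.urls.append(value)


def find_trackers(html):
    """Return the sorted names of known trackers referenced in the HTML."""
    collector = ResourceCollector()
    collector.feed(html)
    found = set()
    for url in collector.urls:
        # Relative URLs have no hostname and are treated as first-party.
        host = urlparse(url).hostname or ""
        for tracker_host, label in KNOWN_TRACKER_HOSTS.items():
            if host == tracker_host or host.endswith("." + tracker_host):
                found.add(label)
    return sorted(found)


# Hypothetical catalogue page used purely for illustration.
SAMPLE_PAGE = """
<html><head>
<script src="https://www.google-analytics.com/analytics.js"></script>
<script src="//connect.facebook.net/en_US/sdk.js"></script>
<script src="/static/catalogue.js"></script>
</head><body>
<img src="https://stats.g.doubleclick.net/r/collect">
</body></html>
"""

print(find_trackers(SAMPLE_PAGE))
```

The sketch only looks at static markup, so it will under-count: anything injected by a tag manager after the page loads would be missed.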

In theory librarians are committed to protecting user privacy. What happens in practice?

The brief title of my PhD research is “Protecting the privacy of library users”. From the desk research that I have done so far, and the work to date on my literature review, I have found evidence that what happens in reality doesn’t always live up to the theoretical commitment to protect user privacy: examples of data breaches, of library websites that leak personal data, of critical vulnerabilities in digital libraries (see for example KUZMA, J., 2010. European digital libraries: web security vulnerabilities. Library Hi Tech, 28(3), pp. 402-413) and so on.

I am interested to know what the root cause(s) of this failure to deliver on protecting user privacy are.

Huang, Han and Yang (HUANG, S., HAN, Z. and YANG, B., 2016. Factor identification and computation in the assessment of information security risks for digital libraries. Journal of Librarianship and Information Science, pp. 1-17) say “Vulnerabilities may arise out of deficiencies in organizational structure, personnel, management, procedures and assets”.

I have put together a set of rough notes about a number of areas where I suspect that some clues as to the cause(s) might be found – or at least, where there might be potential for things to go wrong. But have I listed the right areas? Am I missing any key ones? Should I be zooming in on any in particular?

  • Education/training
  • Contracts/licences
  • Law/regulation
  • Ethics/values
  • Technology
  • Information security
  • Who takes overall responsibility
  • Compliance issues
  • Physical (rather than digital) world
  • Vendors
  • Standards/guidelines
  • Third parties
  • Visibility/transparency
  • Privacy by design/by default

I can flesh these out a bit further if anyone is interested.

My research is in its early stages, so I haven’t yet reached the point of finalising the research questions I want to examine.

Of course it is also worth asking the questions:

          Do library users actually worry about privacy in libraries?

          If so, why?

          If so, what in particular concerns them?

          And how would a failure to protect their privacy impact upon them?

Zines, libraries & privacy issues

Zines are usually devoted to specialized and often unconventional subject matter. They are often a vehicle for radical voices. They could be a political zine, a feminist zine, an LGBT zine and so on. They are ephemeral in nature, and often have very small print-runs.

The idea of privacy and trackless searching/use is often a very important principle for infoshops.

Not all zine makers want their names listed on the internet.

There’s a risk that easy availability of information about zine makers, and about those who are interested in their zines, could be used to flag people up to the authorities.

There’s a need for searching and using the library with a degree of privacy and untraceability, “rather than give the government fodder to harass them” (Hedtke, 2007, p. 41).

There are a number of examples of people talking of setting up separate public and private catalogues in order to keep certain information, such as zine makers’ names, more private.

Vermillion (2009) writes that “we have been contacted to remove a last name from our database that was associated with a zine title that the author felt damaged her reputation in her current career—at age 16, she had no idea that the flippant title would ever be available online”.

Digitization of fanzines from many decades ago can throw up privacy issues. Fans may have used their formal legal name (rather than a pseudonym) fully in the expectation of privacy: the material was produced long before the world wide web was invented, and the circulation of the fanzines was quite limited. In a chapter entitled “Identity, ethics, and fan privacy” (in “Fan culture: theory/practice”, edited by Katherine Larsen and Lynn Zubernis), Kristina Busse and Karen Hellekson say “…many fans published under their legal names, before the adoption of pseudonyms became commonplace. The full names of many fans thus appear in print on the cover of fanzines, in their tables of contents, and in ads circulated to market the zines….These fans…deserve privacy”.

Zine librarians code of ethics

Siobhan Britton’s dissertation: “What we do is secret? A study of issues relating to the collection, care, and accessibility of zines in institutional and alternative collections in the UK”

Legal cases relevant to library privacy

I have been slowly putting together a listing of legal cases relevant to library privacy. If there are any that I have missed, do let me know.

John Doe v Gonzales 2005 (the case of the “Connecticut Four”)

Quad Graphics v Southern Adirondack Library System 1997 174 Misc.2d 291 (1997) 664 N.Y.S.2d 225 (on obtaining electronically stored information if warranted)

Tattered Cover, Inc. v. City of Thornton, 44 P. 3d 1044 – Colo: Supreme Court 2002 which looked at how the court must balance the law enforcement officials’ need for bookstore records against the harm caused to constitutional interests by execution of a search warrant.

Brown v Johnston 328 N.W. 2d 510 (Iowa) At issue was whether a county attorney subpoena for certain library circulation records is limited or restricted by section 68A.7(13) of the Iowa Code.

In re Grand Jury Subpoena to Kramerbooks & Afterwords Inc 26 Med. L. Rptr. 1599 (D.D.C. 1998) (Kenneth Starr’s demand for the book buying habits of Monica Lewinsky)

United States v Rumely 1953. In the early 1950s the Supreme Court found it unconstitutional to convict a bookseller for refusing “…to provide the government with a list of individuals who had purchased political books.” Justice Douglas observed, “Once the government can demand of a publisher the names of the purchasers of his publications . . . [f]ear of criticism goes with every person into the bookstall . . . [and] inquiry will be discouraged.”

In re Grand Jury Subpoena to Amazon.com 2007 (demand for identities of 24,000 Amazon.com book buyers)

United States v Curtin, on whether reading habits can be used to prove criminal intent in trials


Working on Literature Review

I’m currently working on a review of the literature around protecting the privacy of library users. The whole exercise is a learning experience in so many different ways. It’s fascinating how the process of working through the literature makes you take a “helicopter view”, stepping back to identify key themes and issues, as well as going to the other extreme: reading individual items and going into a lot of depth about one very specific aspect of the topic.

The process of importing references from a number of different sources into a citation software package comes up with a number of formatting inconsistencies, and I know that I need to spend time checking and rechecking the details below.

There’s loads more work to do. The list below is by no means complete, and indeed I need to work my way through the list to see whether to keep all of the items listed. But I thought people might be interested to see the literature I have selected.


How many vulnerabilities do library websites have?

In a study by Joanne Kuzma (European digital libraries: web security vulnerabilities. Library Hi Tech, 28(3), 2010, pp. 402-413) a web vulnerability testing tool was used to analyse 80 European library sites in four countries to determine how many security vulnerabilities each had and what were the most common types of problems.

Her analysis showed that the majority of the libraries surveyed had serious security flaws in their web applications. Indeed, the UK accounted for the highest proportion of both high level security flaws (critical vulnerabilities) and medium level ones (moderately ranked problems that could pose some risk to web applications).

A report by Cenzic (Web application security trends report Q3-Q4, 2008) found that nearly 80% of web-related flaws were caused by web application vulnerabilities, such as:

  • Cross site scripting (XSS)
  • Denial of service
  • SQL (structured query language) injection


The WhiteHat Security “Web applications security statistics report 2016” lists vulnerability likelihood by class (in descending order of likelihood). The top ones listed for 2016 were:

  1. Insufficient transport layer protection (Not all traffic flowing between two endpoints is properly secured, which makes it possible for attackers to perform man-in-the-middle attacks)
  2. Information leakage
  3. Cross site scripting
  4. Content spoofing
  5. Brute force
  6. Cross site request forgery
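The first item on that list is one that can be checked for in a fairly simple way. As a rough sketch only (this is not how WhiteHat or Kuzma carried out their testing), the Python function below looks at the final URL of a page and its response headers and reports two basic warning signs: the page being served over plain HTTP, and HTTPS being used without the standard `Strict-Transport-Security` header. The function name `transport_findings`, the wording of the findings and the `catalogue.example.org` address are all hypothetical.

```python
from urllib.parse import urlparse


def transport_findings(url, response_headers):
    """Return plain-language findings about a page's transport security.

    url              -- the final URL of the page (after any redirects)
    response_headers -- a dict of HTTP response header names to values
    """
    findings = []
    # Header names are case-insensitive, so normalise them first.
    headers = {name.lower(): value for name, value in response_headers.items()}

    if urlparse(url).scheme != "https":
        findings.append("page served over plain HTTP")
    elif "strict-transport-security" not in headers:
        findings.append("HTTPS in use but no Strict-Transport-Security header")
    return findings


# Hypothetical catalogue search page served without encryption.
print(transport_findings("http://catalogue.example.org/search?q=zines", {}))
```

A check like this says nothing about certificate validity or cipher strength; it simply flags the most visible gaps, such as a catalogue search being sent in the clear.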

Kuzma holds that systems librarians should monitor security alerts from CERT and immediately install software patches and update their software to defend against attacks.

But should responsibility be placed solely on the systems librarian? It is all very well for librarians to hold privacy as one of their core values, but that counts for little if they fail to take account of web security risks, whether through lack of awareness or some other reason.

Library procedures & privacy

I have put together a set of PowerPoint slides setting out examples of how privacy impacts upon the work of libraries. The slides cover things like: physical layout of the library; co-location with other services; the procedures relating to self-service holds; the length of time users’ reading histories are retained; and more. If you have other examples that I haven’t covered, do by all means get in touch (libraryprivacy @   practical-examples