Public/private & the concept of “practical obscurity”

Practical obscurity is a concept which pre-dates the online world (Hartzog 2013). It refers to the impediments to data retrieval. The concept was articulated by the Supreme Court in U.S. Department of Justice v. Reporters Committee for Freedom of the Press. In evaluating the privacy of a “rap sheet” containing aggregated public records, the Supreme Court found a privacy interest in information that was technically available to the public, but could only be found by spending a burdensome and unrealistic amount of time and effort. The information was considered practically obscure because of the extremely high cost and low likelihood of the information being compiled by the public.

I have an example from the UK, which I first learnt of some years ago in connection with the law of defamation and contempt of court:

In the Scottish case of Her Majesty’s Advocate v. William Frederick Ian Beggs (No 2) (High Court of Justiciary 2001; 2002 S.L.T. 139), the judge ruled that information held on the internet archives of newspapers was published anew each time someone accessed it. (The Defamation Act 2013 has changed the law on this issue.) However, he didn’t take the same view of the paper archives held by public libraries. This distinction takes into account the ease with which material on the internet can be accessed.

The matters were considered again in an appeal case involving the same parties ([2010] HCJAC 27), where the judgment makes a different point about online obscurity: “It appears also to have been accepted by both sides that those materials were archived material which had originally been published before the criminal proceedings became active on 21 December 1999 and that the action of entering the appellant’s name into a standard search engine would not lead the searcher to these materials. Instead the searcher would have to go to the website of a particular newspaper or broadcaster and then search its archived material”.

Scope for an international forum covering library privacy issues?

Is there scope for an international forum on library privacy issues? There is clearly an interest in the subject around the world, if the visitors to my blog are anything to go by (see the list of countries below, in order of frequency of occurrence). It just makes me wonder whether there is a need for a discussion forum, or some other vehicle for knowledge sharing on library privacy (across the different library sectors, incorporating technical expertise, knowledge of the regulatory framework, expertise in negotiating privacy clauses in contracts with vendors, etc.).

What role on privacy should librarians have?

What is, or what should be, the role of libraries and librarians with regard to privacy? Apart from ensuring that they protect any personally identifiable information relating to their users, should they do more than that? For example:

  • Should they offer training on how users can protect their privacy (such as using browser add-ons and other tools, making full use of privacy settings within browsers, etc.)?
  • Should they go a step further and offer their users the facility to search the web anonymously (a number of American libraries have set up Tor relays)?
  • Should they organise cryptoparties, encouraging people to use encrypted services?
  • Should they lobby government for laws that are more respectful of user privacy?
  • Should they work together to encourage vendors to incorporate measures that respect user privacy?

In recent weeks I have been thinking about people who would identify themselves as being “radical librarians”, because they seem to place a particularly high priority on ethical issues. Ian Clark, in the Journal of Radical Librarianship (2016), says that “If we cannot (or do not) protect the intellectual privacy of our users, then we are failing as professionals”. It’s interesting, too, to see the response from American librarians in recent months to the ways in which the current administration appears to be weakening or removing privacy protections.

Magi (2013) says “More than ever, libraries hold a unique and critically important place in the information landscape. I can think of few other information providers that do what libraries do: provide a broad range of information, make it accessible to everyone regardless of means, while embracing the ethical principle that our users’ personal information is not a commodity to be traded or sold. Our commitment to user confidentiality is rare and special, and it’s a characteristic that research tells us is important to people…. I believe it’s essential that we work to preserve that competitive advantage, both because it’s the ethical thing to do, and because it’s a practical way to stay relevant.”

Mattlage (2015, p. 76), meanwhile, says that “It is the unique role of information professionals to be last to abandon the defense of these rights, even if this leads others – who do not have these special obligations – to perceive information professionals as unreasonable”.

Is it only the wealthy who can afford to manage their digital footprints?

“Digital footprint” refers to the body of data that exists as a result of an individual’s actions and communications online.

The contents of one’s digital footprint are so important that they can affect an individual’s chances of getting a job, or their personal relationships. That is why, for many years now, there have been companies that are willing – for a price – to tidy up your online footprint.

Online reputation management is the practice of trying to shape the public perception of an individual (or indeed an organization or institution) by influencing the way that online information about them appears.

Fertik and Thompson (2015) have written a book about the topic: The reputation economy: how to optimize your digital footprint in a world where your reputation is your most valuable asset (Penguin Random House).

What I wonder about, though, is the fairness of this. Reputation management can be expensive. It isn’t simply a question of trying to shape the way information appears as a one-off exercise. A reputation management company might have successfully got some embarrassing content to appear far lower down the search engine rankings than it did before, to the point where it has to all intents and purposes disappeared. But what happens when those search engines tweak their algorithms, and the story or stories appear more prominently once again?

It raises the question as to whether it is only people who are well-off who are able to utilize reputation management companies.

Don’t libraries have an important role in digital literacy training for their users, and more particularly in digital privacy literacy training? Such training could give users a better understanding of how their digital footprint is created, what the implications are, and what can be done to minimize the data that is gathered, or to manage it more effectively if it has already been gathered.

86% of internet users have taken steps online to remove or mask their digital footprints, ranging from clearing cookies to encrypting their email.
55% of internet users have taken steps to avoid observation by specific people, organizations or the government.
Source: Pew Internet & American Life Project.

A video entitled “Inside the mind of Google”, dated 2009, says that Google has become a vacuum cleaner hoovering up digital data. Even in 2009 they spoke of there being over a billion searches a day. Just think of all of those searches, and how they are leaving a digital footprint, to the point where Google could be described as a database of intentions, because those searches will indicate what you were thinking at any given moment.

As the 2013 report on IFLA trends says “In situations where posting information online effectively surrenders future control over that information, people have to balance their desire to engage, create and communicate against any risks connected with leaving a permanent digital footprint”.

In “Group privacy: new challenges of data technologies”, Luciano Floridi says “We are constantly leaving behind a trail of data, pretty much in the same sense in which we are shedding a huge trail of dead cells”.

The privacy of the library (as a public space)

Sturges, Iliffe et al. (2001) recognise that “The library, whether public, academic or institutional, is both a communal and a private space: a paradox that has always contained a certain potential for tensions.” They acknowledge that privacy is even less possible in the digital library than it is in the print library.

Campbell and Cowan (2016) also acknowledge that privacy can have a paradoxical relation to the public sphere. They cite Keizer (2012), who suggests that individuals frequently move into the public sphere, not to sacrifice their privacy, but to retain it. Indeed, in an analysis of a court decision that grappled with the question of privacy in public places, Keizer writes of “the number of people whose very act of stepping out the front door represents a ‘subjective expectation of privacy’ – because the public sphere is the only place where they can have a reasonable hope of finding it”.

As Campbell says, the “library occupies a position of significant though paradoxical importance: its status as a public place makes it an ideal place in which to experience genuine privacy”. Referring to the concept of “open inquiry”, Campbell says that it “consists of the freedom to inquire, unrestricted by familial, communal, or tribal obligations”. Indeed Keizer suggests that “The public sphere may well be the most important factor in an individual’s quest to use information sources to explore and articulate a sexual identity with a reasonable expectation of privacy”.

The idea of open inquiry only being achieved in the privacy of the public sphere may seem like a contradiction in terms. But a teenager exploring their sexuality might well turn to the library on the basis that they crave the privacy offered by (the library as) a public space. They may well use it to look up references that would give some validity to the feelings inside them which mark them out as being somehow different. They might begin by looking up words in dictionaries, before moving on to finding both descriptions and images that they can identify with as being of other people just like them. Curry (2005) cites Steven Joyce, whose dissertation notes that many youths still living at home may be reluctant to undertake web research on their home computer, preferring instead the anonymity and safety of the public library.

Floridi (2014a) cites a Pew Internet & American Life Project report on “Teens, privacy and online social networks”: “For youth, ‘privacy’ is not a singular variable. Different types of information are seen as more or less private; choosing what to conceal or reveal is an intense and ongoing process. Rather than viewing a distinct division between ‘private’ and ‘public’, young people view social contexts as multiple and overlapping”. Indeed, the very distinction between “public” and “private” is problematic for many young people, who tend to view privacy in more nuanced ways, conceptualizing internet spaces as “semi-public” or making distinctions between different groups of “friends”.

Reasonable expectation of privacy (public v private place)

The question inevitably arises as to whether one can have a reasonable expectation of privacy in a public place. For Campbell and Cowan (2016), “the public sphere may well be the most important factor in an individual’s quest to use information sources to explore and articulate a sexual identity with a reasonable expectation of privacy”.

Gorman (2015) cites Gabriel Garcia Marquez, who told us that we all have three lives: a public life, a private life, and a secret life; and that the public life is open to the world, an environment in which there is no reasonable expectation of privacy.

Brandeis believed that people are entitled to reasonable expectations of privacy. Mirmina (2016) says that this view leads to some interesting questions, and she asks what Brandeis would have made of modern-day issues in light of the technology that is available to us so many years after the famous Warren and Brandeis (1890) article.

In a landmark 1967 case, Katz v United States (389 US 347), the US Supreme Court found that a warrantless police recording device attached to the outside of a telephone booth violated the Fourth Amendment’s protection against unreasonable searches and seizures. This protection had formerly been construed primarily in cases involving intrusion into a physical place, but in Katz (at 351) the justices famously held that the Fourth Amendment “protects people, not places.”

The e-reader privacy paradox

In his book “Intellectual privacy: rethinking civil liberties in the digital age” (Oxford University Press), Neil Richards talks about the e-reader privacy paradox.

He uses Fifty shades of grey to illustrate the point he wants to make. Though print editions of the book were hard to find in Britain and the United States, the book sold millions of copies as an e-book. Its largely female readership repeatedly praised the privacy that the e-book version allowed.

An article in the New York Times makes the same point (Julie Bosman, “Discreetly digital, erotic novel sets American women abuzz”). It quotes Valerie Hoskins, agent to the author of Fifty shades of grey, as saying “…women have the ability to read this kind of material without anybody knowing what they’re reading, because they can read them on their iPads and Kindles.”

But of course, Amazon knows what you read on your Kindle, what pages you have looked at, what annotations you have made, how long you have spent reading the novel, where you are up to and so on. The same would be true of Adobe if you had used the Adobe Digital Editions software.

Richards is quite right to speak of there being an e-reader privacy paradox. Isn’t that true of the internet too, in that people are lured into a false sense of security? They can search the web from the privacy of their own homes, which creates the illusion of privacy, and yet those searches are far from being private.

If you went into a library and were followed around by someone, whether a member of library staff or another library user, wouldn’t you start to feel uncomfortable, almost as though someone was stalking you and watching your every move: what books you browsed on the shelves, which subject areas you headed towards, which titles you picked off the shelves to scan through, which books you took to the self-checkout machine, and so on? Wouldn’t you be outraged? So why are people not outraged by the tracking that takes place on the web, which is far more pernicious: for example, the tracking that occurs if they use Google Books, or Amazon’s “search inside the book” feature? Surely that sort of tracking is far worse, because huge quantities of data are gathered with ruthless efficiency, building up a profile of you in which the data is kept permanently.

And what about the tracking that takes place when you use some library websites (in the form of analytics software, advertisers, social networking plugins, and the like)? Marshall Breeding (in Privacy and security for library systems, Chapter 3: Data from library implementations. Library Technology Reports, 52(4), 2016, pp. 29-35) notes the tracking that was found on library websites, discovery services and online catalogs. It included (among others) Google Analytics, Google Ajax search API, Google AdSense, Google Translate, Google Tag Manager, DoubleClick, Yahoo Analytics, Adobe Omniture Analytics, Adobe Tag Manager, Adobe TypeKit, Facebook Connect, Facebook Social Plugin, Twitter Button, WebTrends.
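
For anyone curious how a check of this kind might be carried out, here is a rough sketch in Python. It is not Breeding's method, just an illustration under my own assumptions: it looks at the static HTML of a page, collects the addresses of scripts, images, iframes and stylesheets, and flags any that are served from a short, hand-picked list of tracker hosts. The host list, the function name find_trackers and the sample page are all made up for the example, and trackers that are loaded dynamically by other scripts would not be caught.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Illustrative, hand-picked mapping of host names to tracker names.
# This is an assumption for the example, not a complete or authoritative list.
KNOWN_TRACKER_HOSTS = {
    "google-analytics.com": "Google Analytics",
    "googletagmanager.com": "Google Tag Manager",
    "doubleclick.net": "DoubleClick",
    "connect.facebook.net": "Facebook Connect",
    "platform.twitter.com": "Twitter Button",
}


class ResourceCollector(HTMLParser):
    """Collects the URLs of scripts, images, iframes and linked files."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("script", "img", "iframe") and attributes.get("src"):
            self.urls.append(attributes["src"])
        elif tag == "link" and attributes.get("href"):
            self.urls.append(attributes["href"])


def find_trackers(html, page_host):
    """Return the names of listed trackers referenced in the page's HTML."""
    collector = ResourceCollector()
    collector.feed(html)
    found = set()
    for url in collector.urls:
        host = urlparse(url).hostname  # None for relative (same-site) URLs
        if host is None or host == page_host:
            continue
        for tracker_host, name in KNOWN_TRACKER_HOSTS.items():
            if host == tracker_host or host.endswith("." + tracker_host):
                found.add(name)
    return sorted(found)


# A made-up catalogue page for a hypothetical library at library.example.org.
sample_page = """
<html><head>
<script src="https://www.google-analytics.com/analytics.js"></script>
<script src="/static/catalogue.js"></script>
<link rel="stylesheet" href="https://library.example.org/style.css">
</head><body>
<iframe src="https://platform.twitter.com/widgets/tweet_button.html"></iframe>
</body></html>
"""

print(find_trackers(sample_page, "library.example.org"))
```

On the made-up page above, the sketch reports the Google Analytics script and the Twitter button, and ignores the files served by the library's own site.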

In theory librarians are committed to protecting user privacy. What happens in practice?

The brief title of my PhD research is “Protecting the privacy of library users”. From the desk research that I have done so far, and the work to date on my literature review, I have found evidence that what happens in reality doesn’t always live up to the theoretical commitment to protect user privacy: examples of data breaches, of library websites that leak private information, of critical vulnerabilities in digital libraries (see, for example, KUZMA, J., 2010. European digital libraries: web security vulnerabilities. Library Hi Tech, 28(3), pp. 402-413), and so on.

I am interested to know what the root cause(s) of this failure to deliver on protecting user privacy are.

Huang et al. (HUANG, S., HAN, Z. and YANG, B., 2016. Factor identification and computation in the assessment of information security risks for digital libraries. Journal of Librarianship and Information Science, pp. 1-17) say “Vulnerabilities may arise out of deficiencies in organizational structure, personnel, management, procedures and assets”.

I have put together a set of rough notes about a number of areas where I suspect that some clues as to the cause(s) might be found, or at least where there might be potential for things to go wrong. But have I listed the right areas, am I missing any key ones, or should I be zooming in on any in particular?

  • Education/training
  • Contracts/licences
  • Law/regulation
  • Ethics/values
  • Technology
  • Information security
  • Who takes overall responsibility
  • Compliance issues
  • Physical (rather than digital) world
  • Vendors
  • Standards/guidelines
  • Third parties
  • Visibility/transparency
  • Privacy by design/by default

I can flesh these out a bit further if anyone is interested.

My research is in its early stages, so I haven’t yet reached the point of finalising the research questions I want to examine.

Of course it is also worth asking the questions:

          Do library users actually worry about privacy in libraries?

          If so, why?

          If so, what in particular concerns them?

          And how would a failure to protect their privacy impact upon them?