Steven Heywood's Blog o'Library Stuff: information sharing


Wednesday, 2 November 2016

Library data part three-and-a-bit: sharing customer data

Having had a quick scamper through the worry list, what customer data could be shared openly?

Let's start with what can't be shared:

Name
Full address
Unique identifier for the data record
Nearly all combinations of data elements within the record

The first two are obvious Data Protection precautions; the last two are less obvious precautions for the same reason: they make it possible to identify the individual data subject.

Any data extraction for release as open data must specify the required data elements. Required fields need to be selected for extraction rather than having fields not required filtered out post-extraction. This prevents any accidents. Once data's openly out in the wild it's out in the wild.
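As a rough illustration of "select what's required" rather than "filter out what isn't", here is a minimal Python sketch. The field names and the sample records are made up for the purpose; a real library management system export will have its own labels.

```python
# Hypothetical field names; a real LMS export will use its own labels.
ALLOWED_FIELDS = ("registration_location", "category")


def extract_for_release(records, allowed_fields=ALLOWED_FIELDS):
    """Build new rows containing only the explicitly requested fields."""
    released = []
    for record in records:
        # Copy across the named fields only; anything not listed is never read.
        released.append({field: record[field] for field in allowed_fields})
    return released


# Made-up sample records, for illustration only.
borrowers = [
    {"name": "A. Borrower", "address": "1 Example Street", "borrower_id": "X001",
     "registration_location": "Bedlam Library", "category": "Child"},
    {"name": "B. Borrower", "address": "2 Example Street", "borrower_id": "X002",
     "registration_location": "Bedlam Library", "category": "Adult"},
]

open_rows = extract_for_release(borrowers)
```

The point of doing it this way round is that if somebody later adds a new field to the source records it can't leak into the release by accident: it isn't on the list, so it never gets copied.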

"Registration location" and "Library/libraries used" (if available) are both safe in themselves as they aren't personal data and will have data sets broad enough not to be able to identify individual data subjects. They could be combined with each other and any one of the following:

Category (e.g. type of borrower)
Ethnicity
Disability
Gender
Year of birth/age in years (if only date of birth can be extracted then this data shouldn't be used)

The data extract could be:

       Bedlam Library     Child
       Bedlam Library     Child
       Bedlam Library     Adult
       Bedlam Library     Adult

But not:

       Bedlam Library     Child     Male
       Bedlam Library     Child     Female
       Bedlam Library     Adult     Female
       Bedlam Library     Adult     Male

Any two of these could be combined:

Category (e.g. type of borrower)
Ethnicity
Disability
Gender
Year of birth/age in years (if only date of birth can be extracted then this data shouldn't be used)

A postcode dump for the whole library authority could be made available but not combined with any other data because of its very specific nature for identification purposes.
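The rules above could be written down as a simple check to run before any extract is specified. This is only a sketch of my reading of them: the field names are hypothetical, and treating a single field on its own as acceptable is my assumption.

```python
# Hypothetical field names standing in for the data elements discussed above.
LOCATION_FIELDS = {"registration_location", "libraries_used"}
DEMOGRAPHIC_FIELDS = {"category", "ethnicity", "disability", "gender", "year_of_birth"}
POSTCODE_FIELDS = {"postcode"}


def is_permitted_combination(fields):
    """Return True if the requested fields fit the combination rules above."""
    requested = set(fields)
    if not requested:
        return False
    # Anything not on one of the lists (name, address, record ID...) is refused.
    if not requested <= LOCATION_FIELDS | DEMOGRAPHIC_FIELDS | POSTCODE_FIELDS:
        return False
    # Postcode may only be released on its own.
    if requested & POSTCODE_FIELDS:
        return requested == POSTCODE_FIELDS
    demographics = requested & DEMOGRAPHIC_FIELDS
    if requested & LOCATION_FIELDS:
        # Location fields may travel with at most one demographic field.
        return len(demographics) <= 1
    # Without location fields, at most two demographic fields.
    return len(demographics) <= 2
```

So `["registration_location", "category"]` passes and `["registration_location", "category", "gender"]` doesn't, which matches the Bedlam Library examples. Passing this check isn't the same as being safe: the extract still needs testing against the actual data.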

I think that's pretty much it. And I'd still want to run it by an Information Governance expert before going ahead (and for them to check my Privacy Impact Assessment).

Tuesday, 1 November 2016

Library data part three: dangerous demographics

The data about the people registered with a library is at one and the same time the most potentially useful and the most potentially dangerous. So dangerous, in fact, that when it comes to making this data openly-available the default position must be: Don't. Do. It.

That position will be strongly challenged by many so I'll devote the rest of this post to explaining the dangers and the next one will have a look at the data that might be openly-shareable so long as all the necessary precautions are taken.

Demographic information is immensely useful to a library service. Operationally it's important to see that the service is meeting the needs of all its communities and not just providing a service "for people like us by people like us." It's important to be able to make sure that particular services are reaching their target audiences and that you're not doing anything to put sections of the community off using your services. And it's essential that you have this data for Equality Impact Assessment of policy decisions. So why would you not want to share the data to get a bigger picture?

Generally speaking there are three main concerns:

Privacy. The library is one of the few safe public places left for the individual. Removing the right to privacy is an information governance issue just as much as an ethical one and both need to be taken very seriously (both are generally given too much lip service and too little analysis and action).

It also compromises the quality of the service being provided: if library customers know that how they as individuals use the library will be made public a good many of them will modify their behaviour and not use the library the way they want or need to. If they don't know this the library will have committed a significant breach of trust.

Legality. Does the library have the legal right to share the data? If it is possible to identify individual data subjects then the answer is categorically: No, unless the data subject has explicitly said that their data may be shared.

Anonymising the data so that it is no longer personal data is easier said than done. It isn't a matter of just removing all the names. We'll have a look at this later on.

The agreement has to be an opt-in and the purpose of this data sharing has to be clearly stated. "We want to make your data open so that other, as yet unknown, people can manipulate it to get as yet unknown information and outcomes" would be an open invitation to the Information Commissioner's Office to come and investigate your organisation.

Safeguarding. This is the most problematic and under-appreciated concern. Anybody knowing whether or not a person even visits a library, let alone uses it, may put that person in actual physical danger. In some controlling relationships a partner may only be allowed out to go to the shops and heaven help them if they do anything else. They may be allowed to take a child to library activities such as story times but not for themselves. An abusive partner discovering that somebody was somewhere they shouldn't be — the wrong end of town or even the wrong town — could be a trigger for violence. The test here isn't: "What is reasonable?" because this isn't about safeguarding people against reasonable action. It's about safeguarding them from action that may be anything but reasonable.

In my head I can hear somebody saying: "If they let us know that they're in an abusive relationship we could put a flag in their record to say their data's not to be shared."

This requires the data subject to actively opt out of data sharing.

Identifying yourself as a person in an abusive relationship is a brave thing to do and not something that should be required to be done at a public service point in a library.

The library suddenly becomes a less safe place.

Someone's got to remember to filter out the flagged records before sharing the data.

Anonymising the data requires more than stripping out all the names. The Information Commissioner's Office has a useful checklist (pdf).

In public libraries the combination of nearly any two data elements may be enough to make that data subject identifiable, or at least narrow the number of possibilities down enough to make it statistically probable they could be identified. The combination of "library where registered" and "library used" plus one other datum is usually OK but this needs to be tested with the particular data set, in case of nasty surprises. Other combinations very quickly narrow down to the individual.

A lot depends on the data itself: if the categories used are very general it might be safe to combine it, though it may be so general as to be pretty useless. I really did once work with a library service that thought it was OK to have two ethnic identifiers in the system: blank for "people like us" and "ethnic" for anyone who looked or sounded a bit foreign. I put a block on that the first chance I got; even so it wasn't until we got all the libraries onto the library management system that we finally got rid of the last of the old Browne Issue tickets with a red E on them (disturbing symbologies like that make me wonder what librarians were thinking about in the eighties).

I'd imagine my local library authority will have thousands of white adult males in their database. How many — or few — teenage Bangladeshi females would there be?

Postcode data very quickly narrows down. There are perhaps ninety people in my postcode area. Twenty-odd adult white males. About four males in their fifties. One white male in his fifties.

Age data gets very specific very quickly. "Adult" and "Child" is pretty safe but as soon as you start refining that down it becomes problematic. Full date of birth is so specific it's a red flag.
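One way of testing a combination against a particular data set is to count how many records share each combination of values and look for the small groups. The sketch below does that in Python. The threshold of five is purely my assumption for illustration, not a figure from the Information Commissioner's Office or anywhere else, and the records are made up.

```python
from collections import Counter


def small_groups(records, fields, threshold):
    """Return the value combinations shared by fewer than `threshold` records."""
    counts = Counter(tuple(record[field] for field in fields) for record in records)
    return {combo: n for combo, n in counts.items() if n < threshold}


# Made-up records: plenty of adults, a few children, one lone combination.
records = (
    [{"category": "Adult", "gender": "Male"}] * 20
    + [{"category": "Adult", "gender": "Female"}] * 25
    + [{"category": "Child", "gender": "Male"}] * 12
    + [{"category": "Child", "gender": "Female"}] * 1
)

risky_single = small_groups(records, ["category"], threshold=5)
risky_pair = small_groups(records, ["category", "gender"], threshold=5)
```

With this made-up data the category on its own produces no small groups, but category plus gender produces a group of one, which is exactly the sort of nasty surprise that needs finding before release rather than after.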

So we would need to be very careful about what data — and what combination of data — is made available. In a library consortium setting this should be governed by formal data sharing protocols that had been passed by each authority's information governance experts and given the OK by whoever is responsible for the authority's information risk so all the data of all the people who have actively agreed to their data's being shared can be made available to the appropriate staff for the appropriate purpose within the consortium. That's a very specific remit for a very specific purpose for the use of a very specific group of people, with checks and balances and sanctions for abuse.

Which is exactly not the case with the open release of data, so different rules need to apply and need to be applied proactively (the genie doesn't go back into the bottle if you find you've made a mistake). Hence the greater need for precaution.

Tuesday, 18 October 2016

Library data part two: what do we know about the stock?

In principle stock data is much the least problematic data set held by libraries when it comes to trying to map it and potentially share it across local authority boundaries or make the data openly-available. There are good reasons for this:

Every English public library service has a catalogue of resources
There has been decades' worth of data-sharing for the purposes of interlibrary loans including, but not limited to, the UnityUK database
There are long-established standards for title-level bibliographic data
The outsourcing of most bibliographic metadata limits the number of original sources of data and so imposes some consistency

Added to this can be the data mapping work involved in setting up an interface with the evidence-based stock management system CollectionHQ and the increased use of library management systems in consortium settings. Both of these get library systems people thinking about the way their data maps against external frameworks.

Technically, data about virtual stock holdings can be treated the same way as physical stock holdings. Culturally, there is some variation in approach between library services.

For the purposes of this post we'll assume that all stock has been catalogued and the records held in the library management system. In reality this will be true of most, if not all, lending library stock and a high proportion of whatever reference library stock there is these days. Many local studies collections and special collections are still playing catch-up.

Title-level bibliographic data

All the bibliographic records come from the same place so this is standard data and would be easy to share and compare, right? Well… up to a point, Lord Copper.

Not all library authorities are buying in MARC records.
Of those that do, not all of them are retrospectively updating their old records so they'll have a mix of bought-in MARC records and locally-sourced records which may or may not be good MARC records in the first place and which certainly have variations in the mapping details.
Those that did do a retrospective update may have hit a few glitches. Like the library authority that had an LMS that had ISBN as a required field and so had to put dummy data in this field which turned out to be the valid ISBNs of extremely different titles to the ones they actually had. (This wasn't Rochdale, though it did cause us some collateral damage.)
There may be local additions to commercial MARC records, for instance local context-specific subject headings and notes.
Commercial MARC records may not be available for some very local or special collection materials so these will need to be locally-sourced.

Taking these factors into consideration this would be much the most reliably uniform component of a national core data set for libraries if any such were ever developed. The data available would be either:

A full MARC record + the unique identifier for this bib record in this LMS (this is required to act as a link between the title-level data and the item-level data); or
A non-MARC record including:
- Title
- Author
- Publisher
- Publication date
- ISBN/ISSN or other appropriate control number, if available
- Class number
- Unique identifier for this bib record
(I think there's a limit to the amount of non-MARC data that should be admissible.)

For the purposes of this game RDA-compliant records can be assumed to be ordinary MARC21 records (there's a heap of potential MARC mapping issues involved in any national sharing exercise which we won't go into here). I can see the need for the use of FRBR by public libraries but I don't see it happening any time soon so it's not considered here.

Item-level holdings data

The library catalogue includes holdings data as well as bibliographic data so that, too, could be part of a national data set. The detail and format of this data can vary between LMSs and from one library authority to another:

Some, but not all, item records may have at least some of their data held in MARC 876 — 878 tag format
The traditional concept of a "collection" may be described in different fields according to the LMS or the local policy. Usually it would be labelled as one or other of item type, item category or collection.

Which data to include? Or rather, which would be most likely to be consistently-recorded? My guess:

Unique identifier (usually a barcode)
Location
Key linking to the appropriate bibliographic record
Item type/item category/collection label best approximating to the traditional concept of "collection"
Cost/value
Use, which would generally mean the number of issues
Current status of the item

After that the variations start to kick in big time.
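To make the shape of this concrete, here is a minimal Python sketch of the non-MARC title-level record and the item-level record listed above, joined by the bib record's unique identifier. The class names, field names and sample values are all mine, for illustration only; they aren't any LMS's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BibRecord:
    bib_id: str                             # unique identifier for this bib record
    title: str
    author: Optional[str] = None
    publisher: Optional[str] = None
    publication_date: Optional[str] = None
    control_number: Optional[str] = None    # ISBN/ISSN or similar, if available
    class_number: Optional[str] = None


@dataclass
class ItemRecord:
    barcode: str           # unique identifier for the item
    location: str
    bib_id: str            # key linking back to the bibliographic record
    item_type: str         # item type/item category/collection label
    cost: Optional[float]
    issues: int            # use count
    status: str


def items_for_title(bib, items):
    """Return the item records that link to the given bib record."""
    return [item for item in items if item.bib_id == bib.bib_id]


# Made-up sample data.
bib = BibRecord(bib_id="B100", title="Example Title", author="Example Author")
items = [
    ItemRecord("30001", "Bedlam Library", "B100", "Adult Fiction", 8.99, 12, "On shelf"),
    ItemRecord("30002", "Bedlam Library", "B100", "Adult Fiction", 8.99, 7, "On loan"),
    ItemRecord("30003", "Bedlam Library", "B200", "Reference", None, 0, "On shelf"),
]
copies = items_for_title(bib, items)
```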

There are a few devils in the detail, for instance:

There is no standard set of "collections," though there is a de facto standard set of higher-level item types:
- Adult Fiction
- Adult Non-Fiction
- Children's Fiction
- Children's Non-Fiction
- Reference
- Audiovisual
- Everything else
The item type/item category/collection for each library authority would need to be mapped against a standard schedule of “Item types.” For instance, when I used to pull out stock data for CIPFA returns I didn't have the appropriate categories available in fields in the item records, so in Dynix I had a dictionary item set up to do the necessary in Recall and with Spydus I set up a formula field in a Crystal Report. In both cases it involved a formula including sixty-odd "If… Then… Else…" statements.

Are those already used for CIPFA adequate or would a new suite need to be developed and agreed?
Would this translation be done at the library output stage or the data aggregation stage?
For CIPFA our translation was done at output, for CollectionHQ it was done at data aggregation stage according to previously-defined mapping.

Cost could be the actual acquired cost including discount; the supplier's list price at time of purchase, without discount; or the default replacement cost for that type of item applied by the LMS.
Use count data may be tricky:

It could be for the lifetime of the item or just from the time that data was added to this particular LMS if the legacy data was lost during the migration from one system to another.
Some LMSs record both "current use" (e.g. reset at the beginning of the financial year) and total use. You need to be able to identify one from the other.
The use of loanable e-books/e-audiobooks may not be available as this depends on the integration of the LMS with the supplier’s management system.
Curated web pages would be treated as reference stock and not have a use count.
Some LMSs allow the recording of reference use as in-house use.

Item status is always interesting:

Does this status mean the item is actually in stock?
Is the item available?
Has the item gone walkies/been withdrawn?
Again, this would have to be a mapping exercise, similar to the one we did for CollectionHQ.
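Both mapping exercises, item types and item statuses, boil down to a lookup from local labels to an agreed schedule. A lookup table is one alternative to a long chain of "If… Then… Else…" statements; it isn't how the Dynix or Spydus reports described above were built. Every local code and status label in this Python sketch is hypothetical.

```python
# Hypothetical local item-type codes mapped to the higher-level item types.
LOCAL_TO_STANDARD_TYPE = {
    "AF": "Adult Fiction",
    "ANF": "Adult Non-Fiction",
    "JF": "Children's Fiction",
    "JNF": "Children's Non-Fiction",
    "REF": "Reference",
    "DVD": "Audiovisual",
}

# Hypothetical local statuses mapped to (in stock?, available?).
LOCAL_STATUS = {
    "On shelf": (True, True),
    "On loan": (True, False),
    "Missing": (False, False),
    "Withdrawn": (False, False),
}


def map_item_type(local_code):
    """Translate a local code; unmapped codes fall into 'Everything else'."""
    return LOCAL_TO_STANDARD_TYPE.get(local_code, "Everything else")


def map_status(local_status):
    """Return (in_stock, available), or None if the status needs mapping."""
    return LOCAL_STATUS.get(local_status)
```

Returning `None` for an unknown status is deliberate: a status nobody has mapped yet should be flagged for a human to look at rather than quietly counted as in stock.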

So what have we got?

Overall, then, we could say that every public library could put their hand to a fair bit of title-level data that's reasonably consistent in both structure and content; and some item-level data that wouldn't be difficult to be structurally consistent but would need a bit of work to map the content to a consistent level.

Monday, 17 October 2016

Library data part one: variations on a theme

Over the Summer I've been doing a bit of work for the Public Libraries Taskforce and that set me thinking about the data that public library services hold. Each one holds a shedload of data about its resources, its customers and its performance, but each one holds a slightly different shedload to its neighbours. Why would that be?

Technical reasons

There are surprisingly few standard data structures in play in public libraries
Different management systems hold data in different ways
Even if the data has the same structure a different suite of descriptive labels may be in use

Human reasons

An organisation might not feel the need to record the data at all
The quality — or not — of the data may not be a priority so elements may be missing
Naming conventions, etc. may change over time without retroactive conversion, leading to internal inconsistency
The data may still be on bits of paper

Having said that there are some key data that are generally common to all, though variable in detail. I'll have a look at those over the next few posts.

Wednesday, 17 August 2016

How many?

I've started doing some work with the Libraries Taskforce. I'd been to one of their workshops and it was pretty apparent that there was potentially a lot of work to be done by the less than a handful of people involved, and it wasn't easy to see how they'd be able to do it on their own. I've got some time now that I've retired from Rochdale Council so I asked them if they needed a hand with anything and they said yes please. So I'm lending a hand with the work strand that's hoping to develop a core data set for English public libraries that can be openly-available for both public use and operational analysis. It's a voluntary effort on my part; it's something I'm interested in and have been impatient about and it's a piece of work that should have some very useful outcomes.

Whenever you start talking about English public libraries data the elephant in the room very quickly makes its presence known. Before we can talk credibly about anything very much there is one inescapable question desperately needing an answer:

Just how many English public libraries are there anyway?

There is no definitive answer. There is no definitive list. There are at least half a dozen well-founded, properly researched lists. They each give a different answer and when you start comparing them you find differences in the detail. There are perfectly valid reasons for this:

Each had been devised and researched for its own purposes without reference to what had gone before. Each started from scratch and each had a differently-patchy response from library authorities when questionnaires were posted.

This data's not easy to keep up to date at a national level — especially these days! So some libraries will have closed, a few will have opened, some will have moved and some will have been renamed.

It wasn't always clear just how old the lists were. Some had been compiled as part of some wider project and there wouldn't necessarily have been the resource available to do any updating anyway.

So the decision was made to tackle this head on and settle it once and for all so that the world could move on. They were a few weeks into this work when I signed on. Very broadly, here's the process:

Julia from the Taskforce, who has infinitely more patience than me, trawled every English local authority's web site for the details of their public libraries.

Between us we scoured the other lists and added any libraries we found in there that we couldn't find in Julia's list.

We then went through this amended list to see if we could identify any points of confusion, for instance where "Trumpton Central Library" has moved from one place to another or where "Greendale Library" has become "The Mrs Goggins Memorial Information and Learning Hub."

The Taskforce has sent each library authority a list of what we think are their libraries asking them to check to see whether or not these details are correct.

The results will be collated and the data published by the Taskforce.
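The "add anything from the other lists that isn't in Julia's list" step can be sketched in a few lines of Python. This is not the process the Taskforce actually used, which was done by hand and eye; the matching here is a crude lower-case comparison and "Chigley Library" is a made-up entry.

```python
def normalise(name):
    """Lower-case and squeeze whitespace so trivial differences don't count."""
    return " ".join(name.lower().split())


def reconcile(web_trawl, other_lists):
    """Return (merged list, names added from the other lists)."""
    known = {normalise(name) for name in web_trawl}
    merged = list(web_trawl)
    added = []
    for source in other_lists:
        for name in source:
            key = normalise(name)
            if key not in known:
                known.add(key)
                merged.append(name)
                added.append(name)
    return merged, added


web_trawl = ["Trumpton Central Library", "Greendale Library"]
other_lists = [
    ["trumpton central  library", "Chigley Library"],
    ["The Mrs Goggins Memorial Information and Learning Hub"],
]
merged, added = reconcile(web_trawl, other_lists)
```

Note that the renamed Greendale Library turns up in `added` as if it were a new library. Name matching can't spot that; it's one of the points of confusion that still needs a person, and ultimately the library authority, to sort out.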

Ten years ago this would have been pretty straightforward. These days the picture is complicated by the various forms of "community library" that have sprung up over the past few years. These run the gamut from "this library is part of the statutory provision though it is staffed by volunteers some of the time" all the way to "we wish them well on their venture but they're nothing to do with us." So where a public library has become a "community library" of one sort or another that needs to be indicated in the data.

Will this list be 100% correct? Probably not at first, this is a human venture after all. But even if it's only 98% correct in the first instance it should be treated as the definite article. It will then need to be corrected and updated as a matter of course; if that's devolved to the individual library authorities the work becomes manageable and the data becomes authoritative.

Why should anyone bother?

What's in it for anyone to keep their bit of this list up to date and details correct? In my opinion:

It's basic information that should as a matter of principle be available to the public.

In the past year alone, this question has tied up time and effort that could have been more usefully-occupied. All those enquiries, and FoI requests, and debates about data that could just be openly-available and signposted whenever the question arose.

It is essential to the credibility of any English public library statistics. If the number of libraries is suspect then how trustworthy are any of the statistics being bandied around? If the simplest quantitative evidence — the number of libraries — is iffy then how much faith can be placed in quantitative or qualitative evidence that's more exacting to collect?

For instance, counting the number of libraries within a local authority boundary if you're responsible for supporting or managing them is a piece of piss. Reliably counting the number of visitors to any one of those libraries most definitely isn't — I have 80% confidence in the numbers coming out of any automated system (not necessarily due to technical issues) and to my mind if you're relying on manual counts you may as well be burning chicken feathers. So when I hear that visits to English public libraries have dropped by a significant percentage over a given number of years I may be prepared to accept this in the light of a wider narrative, personal observation and anecdotal evidence but I have no empirical reason to know that this is the case.

That's why.

Wednesday, 27 January 2016

Library task force: "community libraries" toolkit

I can't say that I'm impressed with the notion of replacing public libraries with "community libraries," especially not when the engagement with the community is at the end of an Austerity shotgun.

That being said, one of the jobs to be done by the Library Task Force is a review of the process and the building of guidance — for and against the idea — for those thinking of embarking on the adventure. And they're inviting contributions to this toolkit.

This is the contribution I've added to the discussion:

I think we need to address the brief you've been given, not least because it gives the opportunity to explore some of the practical issues involved in taking public libraries out of the public sector and why there are real fears about it.

Firstly, a strategic issue: review after review (and Sieghart was no exception) has noted that part of the problem with the public library service is its fragmented nature. That, together with the fact that nigh on everything in English public library land is optional, means that there's little strategic development; limited opportunity for significant economies of scale outside book-buying consortia; and nationwide initiatives depend for their critical mass on a postcode lottery of acceptance. Other important national failures are an absence of KPIs and no definitive asset register: the debates on the future of public libraries have no benchmarking to work from; no consistent trends data; no nation-wide evidence-based analysis of outcomes; and not only do we not have an empirical national picture of what the public library service is and how it's doing, we don't even know how many public libraries there are in England! (By way of contrast, I chose Moldova at random and found the answer in three clicks.) Further fragmenting the service to a hyperlocal extent pretty much puts paid to any hopes that any of this could be corrected.

Secondly, *whose* community? The idea of a single, close-knit, easily-identifiable community sits well with Camberwick Green but is meaningless in dormitory suburbs and mosaic inner cities. Back in Browne Issue days when demographic data was hard to come by it was horribly easy for some public libraries to end up being run by ladies of a certain age for ladies of a certain age. Decades of work dedicated to building the culture that "public libraries are for everybody, not just people like us" risk being a waste of time and effort. How can equality impact assessments be made? How can they be made consistently? If made, what would be done with them?

How accountable can the organisations running the community libraries be, and to whom? Whatever the shortcomings of elected members at least they can be voted out and are accountable to standards authorities. The model of imposition of community management doesn't allow for the organic growth of management and accountability structures. Grassroots voluntary activity works well when it grows from the ground up, it seldom prospers by parachute implementation and recruitment at bayonet point.

Who owns the library data? There are intellectual property rights issues regarding the catalogue data. There are information governance issues, particularly data protection issues, regarding the customer data, loans data, the use of online resources and browsing histories within the library. Who are the Information Asset Owners? What are the information risk plans? Where are the data sharing plans? Who's going to be there to stop that person who thinks it would be a jolly good idea to collect all the names and addresses of library users and sell them to junk mail foundries to earn a few bob?

The culture industry is one of the UK's big earners. A lot of small-scale, small-budget operations won't each have the critical mass needed to be able to afford both enough popular topics and best-sellers required for the bread-and-butter market and also a representative range of niche topics, new authors, locally-relevant stock and experimental guesses at The Next Big Thing. This will be a huge loss of seed-funding to the industry and a huge diminution of opportunity to the communities involved. One of the key drivers of human development is serendipitous discovery; if all that remains to be discovered is what is already known then there'll be a withering effect in both use and effectiveness of these services.

That's my starter before bedtime. I hope more people add to the discussion. There's plenty more left for somebody to go at.

Sunday, 2 August 2015

Figure skating

One of the things that has become horribly apparent over the past couple of years is the abject lack of any evidence-based government data that would lend themselves to a statistical analysis of the decline of the national public library service.

There are no official figures in the public domain for anything that's happening out there: not for visits, or use of libraries or even — God help us! — for the number of publicly-funded public libraries run by local authorities in this country.

This leads to nonsense like the recent claim that there's been an increase in the number of libraries despite all the cuts over the past few years. Anyone wanting to know the number of libraries is better off going to Ian Anstice's Public Libraries News blog than any official government site or press release. All kudos and good karma to Ian for doing the work but this isn't a good state of affairs for a democracy or open government.

One reason often cited for this lack is that the figures are available but only from CIPFA, which charges a hefty fee for their use. And that fee pays for just one library authority's figures for that year, so pulling together a national picture becomes an expensive business.

Which it would.

If that was the way you were doing it.

But it shouldn't be:

The presentation and analysis of those statistics are CIPFA's property to do as they will with. Which is fair enough as they've done that work.

The data that informs CIPFA's statistics is available within each and every library authority in the land and is collected each year, at no small expense to you the taxpayer, by local council staff then copied into a spreadsheet that's parcelled up and sent to CIPFA.

There is absolutely no good reason why that data — not CIPFA's subsequent work with that data — can't be put into the public domain to be worked on by decision-makers, lobbyists, "Armchair Auditors" or just people who like playing with numbers.

The easiest way to do this would be for each local authority to submit a copy of each year's data, perhaps as a CSV file, to a dataset in Data.Gov.uk or similar. This would then be in the public domain and available for proper analysis of services and trends. It wouldn't cost anything very much to actually do: the data's available, it just needs somewhere to go. And it would be a damned sight cheaper than having each local authority have to go through the administrative processes required to deal with a Freedom of Information Request asking the same questions as those on the CIPFA spreadsheet. Or even multiple requests for that data. Once it's in the public domain FoI doesn't apply.
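As a rough sketch of what anybody could then do with such files, here's a few lines of Python comparing two years' worth of a library list. The column names and the rows are entirely made up; this isn't the layout of the CIPFA spreadsheet or of any real dataset.

```python
import csv
import io


def library_ids(csv_text):
    """Read the set of library identifiers from CSV text with a library_id column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["library_id"] for row in reader}


def net_change(start_csv, following_csv):
    """Compare two years' lists and summarise openings, closures and net change."""
    start = library_ids(start_csv)
    following = library_ids(following_csv)
    return {
        "start": len(start),
        "following": len(following),
        "net": len(following) - len(start),
        "closed": sorted(start - following),
        "opened": sorted(following - start),
    }


# Made-up data for two successive financial years.
year_one = "library_id,name\nLIB001,Library One\nLIB002,Library Two\nLIB003,Library Three\n"
year_two = "library_id,name\nLIB001,Library One\nLIB004,Library Four\n"

summary = net_change(year_one, year_two)
```

Here the net figure is a loss of one, but the detail shows two closures and one opening, which is why the underlying lists matter as much as the headline number.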

So it would be possible to have an official, verifiable benchmark figure for the number of public libraries in this country at the beginning of the financial year and the net loss/gain at the beginning of the following year.

Which could be why it isn't happening.

Sunday, 3 August 2014

Net neutrality: why worry?

Net neutrality is a topic creating quite a lot of heat at the moment, due to the U.S. Federal Communications Commission's taking a look at the topic and scaring people silly in the process with the implication that there'd be the development of a two-tiered internet with them as can pay going down the line at a premium rate and the rest crawling along as best can. (CNET provides as good a summary as any.)

So what? It's a fuss in a foreign land, isn't it?

Sadly not. It'll affect us however much we may imagine or hope that it wouldn't.

So what would be the effect and why should we care?

The way I see it, the nearest practical model for how the post-net neutrality world would look is cable television. Back in the day when cable TV first came out it was full of all sorts of community engagement. There were local and hyperlocal channels; there was space for the esoteric, the informative and the downright baffling. Much of it was done on the cheap and looked it. Then there were years of consolidation and corporate buyings-up and now I could watch NCIS and CSI: Miami simultaneously on six different channels; or endless hours watching folks in nowhere towns somewhere in America shouting at each other for no apparent reason; an interminable churn of mid-Atlantic reality wannabes being vile to one another; and a carousel of Westminster Village news feeds. None of it is local. All of it is peddling the same corporate narrative. News or features about anything within a hundred miles of where I live is limited to the local half-hour news programmes on weekdays and the ten minutes where the skateboarding ducks used to be after the weekend news.

I quite liked the Internet when it was like the Wild West. We can't go back to those days but that doesn't mean it has to become just another adjunct to the Wall Street Journal.

Thursday, 31 July 2014

Data sharing between libraries

We're at the stage in the evolution of the AGMA library consortium where we're starting to work through the practical — and legal — implications of shared services.

Sharing our catalogue data is relatively easy: the data standards are well-established and most of the data itself is published in the public domain on library OPACs, etc. Which doesn't mean that it was all plain sailing, and we've still got some more work to do.

Sharing borrower data is obviously fraught with all sorts of information governance and data protection issues on top of the problem that there isn't any data standard save that imposed by the structure of our shared LMS and the commonalities we've discussed and agreed on a case-by-case basis.

Virtually every circulation dataset is a back door into the borrower data.

I've been thinking through some of the questions we need to be asking ourselves on this journey. It's still early days so the list isn't exhaustive; at this stage I'm trying to work out what we need to worry about at a general level prior to starting work on a risk analysis.

*Purpose*	*Type of Information*	*Recipients*	*Data Controller*	*Notes/queries*
Membership information including contact details –voluntary service, customers will be asked if they want to opt in	Customer name, address and contact information, DOB. Disability, ethnicity and other demographic details Family relationship details Lending history	Library staff (including all other authorised Spydus users) of approved Authorities within the scheme	Local Authority (Data Subject’s Local Authority will be the data controller)	Which data is to be shared? Is it all or nothing? If partial, which parts and how managed? Same question applies to who the data is being shared with What would be the position of volunteer-managed community libraries? How do we switch sharing on/off? What happens if a customer changes their mind? How are they “quarantined?” What happens to the data held in loans, charges and reservations? What happens to any outstanding loans, fines and charges? Who owns (and is responsible for) the data?
Loans information	Details of the loan including borrower, item, location and status of loan. Loans history	Library staff Specific customers can see all details of their loan(s) All customers can see some details of the loan(s)	Local Authority (which?)	This is the crucial element to be managed: It is the purpose of the data-sharing agreement It is the bridging element between the personal customer data and nearly all the other data sets There is a hierarchy of viewing permissions If a customer has said “no” to data-sharing, how is the borrower data in the loan, charges and reservation records expressed? If the customer changes their mind about sharing their data, is it automatically redacted from these records? Who owns (and is responsible for) this data? Whose loan policies? Applied from the lending library? Including fines and charges? How do exceptions apply? “Non-default” borrower types and collections
Overdue/pre-overdue notices	Contact details including borrower name, address, telephone and email; loan due dates and items involved	Library staff Specific customer	Local Authority (which?)	Derived from loans data and subject to same questions It would make sense to aggregate these to improve efficiency and save costs (see notes on charges, etc.)
Reservations	Contact details including borrower name, address, telephone and email and items requested	Library staff Specific customer	Local Authority (which?)	All the questions for loans apply for reservations (which are effectively loans-in-waiting) Whose charge régime applies? Would the Data Controller be the “owner” of the customer record, the library that placed the reservation or the library it will be picked up from (if a different library authority)?
Requests	Contact details including borrower name, address, telephone and email and items/articles requested	Library staff Specific customer ILL system (bibliographic and/or article data only)	Local Authority (which?)	In nearly all respects as reservations, just more complicated charges [The operating procedures would probably need modifying in the light of the shared lending environment.] This will need to be revised in the event of a fuller integration with UnityWeb or equivalent third-party systems
Notifications for any reserved items	Contact details including borrower name, address, telephone and email and items requested	Library staff Specific customer	Local Authority (which?)	Derived from reservations/requests data and subject to the same questions It would make sense to aggregate these to improve efficiency and save costs (see notes on charges, etc.)
Charges/fines/fees	Contact details including borrower name, address, telephone and email; details of the transaction that generated the charge	Library staff Specific customer	Local Authority (which?)	Derived from loans and reservations/requests data and subject to the same questions How will these be managed: Payable only where incurred? Payable globally? Impact on traps/alerts (whose parameters apply?) In the event of recovery, who legally owns the charge? In the light of the above, what would be the effect (if any) of aggregated notices?
Catalogue/ discovery records — bibliographic data	Title-level catalogue data	Library staff Library customers and general public	Local Authority (which?)	Bibliographic data – already shared data Don’t forget that there is a link to the borrower record from the review/rating in the bib data in Staff Enquiry Potentially links to more than one Data Subject, so which would be the Data Controller for this catalogue data? Shared responsibility? How? Similar questions are required of other customer-created content such as tags (these are lost in the current versions of Spydus 9) (Not all data are published for the public)
Catalogue/discovery records — holdings/item-level data	Catalogue data, including electronic holdings	Library staff Library customers and general public	Local Authority (which?)	Holdings data Links to personal data via loans/loan history and status/status history Potentially these link to more than one Data Subject, so which would be the Data Controller for this catalogue data? Logically should be the owner of the holding item (Not all data are published for the public)
Management Information/ Business Intelligence	Reports detailing usage of service, per location	Library Managers	Local Authority (Data Subject’s Local Authority will be the data controller)	Essentially should be summary data, though we’d need to have safeguards against breaches caused by very small sample data Proper safeguards and risk analyses are required before making this data available to third parties
	Demographic breakdowns	Library Managers Designated authorised analysts	Local Authority (Data Subject’s Local Authority will be the data controller)	Most would be summary data, though we’d need to have safeguards against breaches caused by very small sample data Some data (e.g. lists of postcodes) are granular enough to easily identify Data Subjects so safeguards need to be in place on the use and presentation of this data Proper safeguards and risk analyses are required before making this data available to third parties
	Marketing databases	Library Managers Designated authorised marketing staff	Local Authority (Data Subject’s Local Authority will be the data controller)	Is the “I agree to receive marketing” (or equivalent) field global or local? The selection of data explicitly must be limited to those customers who have agreed to contact so as to comply with Privacy and Electronic Communications Regulations. Proper safeguards and risk analyses are required before making this data available to third parties
	Stock management data	Library staff Designated authorised third-party service providers	Local Authority (which?)	Nothing pertaining to Data Subjects should be included in this data. Stock ownership should be straightforward. Stock usage more problematic: Global usage figures recorded against bibliographic/holdings data? Local usage only? How would (if at all?) third-party stock analysis systems like CollectionHQ differentiate between local and extralimital use? In the early days at least there will be pressure to be able to provide evidence that stock is being used “fairly” with local library customers having first dibs for local stock
	Ad hoc data requests	Library Managers Designated authorised third parties	Local Authority (Data Subject’s Local Authority will be the data controller)	Most would be summary data, though we’d need to have safeguards against breaches caused by very small sample data Some data (e.g. lists of postcodes) are granular enough to easily identify Data Subjects so safeguards need to be in place on the use and presentation of this data Proper safeguards and risk analyses are required before making this data available to third parties FoI requests would be subject to the proper exclusions
SIP2 data	Data used for interfacing between Spydus and third-party systems	Library staff Specific customer	Local Authority (Data Subject’s Local Authority will be the data controller)	The particular case at the moment would be where data held in the customer record determines the access or not to third-party systems and services. Would the data be determined globally or locally? Standard use of data fields? Standard coding sets?
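The "very small sample data" worry that keeps cropping up in the Management Information rows can be handled by suppressing any count below a minimum cell size before a report leaves the building. Here's a rough sketch in Python. The threshold of 5, the `summarise` function and the sample records are assumptions of mine for illustration; nothing in our agreement or in Spydus sets these values.

```python
from collections import Counter

# Hypothetical threshold: each authority would need to agree its own.
MIN_CELL_SIZE = 5


def summarise(records, min_cell_size=MIN_CELL_SIZE):
    """Count records per (location, category), suppressing small cells.

    A suppressed cell is returned as None so that the exact figure
    never reaches the published report.
    """
    counts = Counter(records)
    return {
        key: (count if count >= min_cell_size else None)
        for key, count in counts.items()
    }


# Made-up sample: one busy cell and one small enough to point at individuals.
sample = [("Bedlam Library", "Adult")] * 12 + [("Bedlam Library", "Child")] * 2

report = summarise(sample)
for (location, category), count in sorted(report.items()):
    shown = count if count is not None else f"fewer than {MIN_CELL_SIZE}"
    print(location, category, shown)
```

Suppression on its own isn't a complete safeguard: if totals are published alongside the breakdown, a single suppressed cell can be worked out by subtraction, so the risk analysis would still need to look at what else is released with it.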

I'd be interested to know if/how this analysis sits with the experience of established consortium libraries, especially if I've missed something that could cause us problems.
