October 03, 2007


I spent last Thursday in London at a meeting of the (rather grandly named!) Expert Panel of the JISC-funded Names Project, the principal partners of which are MIMAS at the University of Manchester and The British Library. The project's aims are:

to scope the requirements of UK institutional and subject repositories for a service that will reliably and uniquely identify names of individuals and institutions.


[...] to develop a prototype service which will test the various processes involved. This will include determining the data format, setting up an appropriate database, mapping data from different sources, populating the database with records and testing the use of the data.

The project is managed on behalf of MIMAS by Amanda Hill (from her new homestead in rural Ontario!), and Amanda led the meeting on Thursday. She concentrated on presenting three documents, which I think should all be available from the project Web site shortly: a project plan, a review of the "name authority files" landscape, and a small set of "usage scenarios" that the project might seek to support. Between them, they certainly raise some issues for the project to consider.

The "landscape" document, by Amanda and Alan Danskin & Richard Moore of the BL, summarises some of the standards and specifications used for the representation of descriptions of persons and organisations, and some of the existing systems and services that hold and make available such data. The document concentrates exclusively on (what I think of as) fairly "formal" sources of data (like the Library of Congress/NACO Names Authority File and OCLC's WorldCat Identities), and excludes sources such as Wikipedia - though it may well be the case that Wikipedia's coverage of many of the persons and institutions of interest in this context is limited.

One of the issues that came up quite early in the meeting was that of the constraints imposed by the legal context within which the project is operating. Given the project's focus on supporting - not exclusively, but primarily - systems that deal largely with works created by living individuals, the storage and use of information about these persons is typically covered by legislation - in the UK, by the Data Protection Act and related legislation. Two of the core principles of the DPA are that:

  • Data may only be used for the specific purposes for which it was collected.
  • (Subject to some qualifications) data must not be disclosed to other parties without the consent of the individual whom it is about.

Further, there are limitations on the jurisdictions within which the information can be transferred.

There are probably implications here for the Names project, both in terms of obtaining permission to use existing data sources, and in terms of addressing the DPA requirements for the data Names itself holds. Names is funded under the JISC Shared Infrastructure Services programme. Typically these services aren't primarily in the business of providing "user-facing" functions; rather they aggregate and make available data which other applications, developed by other agencies, then access and use to deliver such functions. Given this sort of context, I imagine it may be quite difficult for the Names project itself to specify fully the purposes for which data is being collected: in theory, those third-party services might perform functions on the data that the Names project itself cannot predict.

As part of my pre-meeting truffling, I had a look at the (relatively) recent draft of the Functional Requirements for Authority Data (FRAD) specification. FRAD is another product of IFLA, and it is a sibling document to, or extension of, the (probably better known) Functional Requirements for Bibliographic Records (FRBR) specification. More specifically it's the product of an IFLA group called the "Working Group on Functional Requirements and Numbering of Authority Records (FRANAR)", with the rather confusing (to the outsider) consequence that the acronym FRANAR is sometimes used to refer to this area of work too, but I think the intent is that the model is referred to as FRAD.

Like FRBR, FRAD describes an entity-relationship model, with the focus of FRAD on the entities related to "authority data" rather than to the "bibliographic record" itself. IIRC, I had looked at an earlier draft of FRAD quite some time ago, but the current version seems to have come on a long way from that version, and - from a fairly cursory reading on my part - it looks as if it may be a very useful document, both for those (like the Names project) seeking to develop applications in this area, and for the non-librarians (like me) who want to have a better understanding of librarians' conceptualisations of the world, e.g. the relationships between persons (or personas), names, and access points.
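
To make the persons/names/access points distinction a little more concrete, here is a minimal sketch in Python. It is only my own rough simplification, not the FRAD model itself and not anything the Names project has specified: the class names, the `known_by` and `make_access_point` helpers, the identifiers and the "Smith" data are all hypothetical. The point is simply that one person may be known by several names, one name string may belong to several persons, and an access point is a controlled form built from a name.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Name:
    """A string by which a person is known (hypothetical simplification)."""
    value: str


@dataclass(frozen=True)
class AccessPoint:
    """A controlled form built from a name under some set of rules."""
    form: str
    based_on: Name
    rules: str  # illustrative label only, not a real rule set


@dataclass
class Person:
    """An individual (or persona) who may be known by several names."""
    identifier: str  # made-up local identifier
    names: List[Name] = field(default_factory=list)

    def known_by(self, value: str) -> Name:
        """Record that this person is known by the given name string."""
        name = Name(value)
        self.names.append(name)
        return name


def make_access_point(name: Name, qualifier: str, rules: str) -> AccessPoint:
    """Combine a name with a qualifier (e.g. dates) into a controlled form."""
    return AccessPoint(form=f"{name.value}, {qualifier}", based_on=name, rules=rules)


def find_persons(people: List[Person], value: str) -> List[Person]:
    """Return every person known by exactly this name string."""
    return [p for p in people if any(n.value == value for n in p.names)]


# Two invented people who share one form of name.
p1 = Person("person-1")
p1.known_by("Smith, J.")
full_name = p1.known_by("Smith, Jane")
p2 = Person("person-2")
p2.known_by("Smith, J.")

ambiguous = find_persons([p1, p2], "Smith, J.")  # matches both people
access_point = make_access_point(full_name, "1970-", "example-rules")
```

The name string alone does not pick out one individual here, which is exactly the kind of ambiguity a service that "reliably and uniquely" identifies names would have to resolve.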


