Just back from a conference in Bethesda at the National Library of Medicine.
The central problems being addressed were: What makes a good identifier, how are identifiers embedded in working systems, and what technical/service infrastructure is necessary to build effective systems around good identifiers? (All of these questions were asked to determine what role NISO should have in developing identifier standards.)
The crucial issue was one of trust. A community has to have confidence in its identifiers, and organizations have to know which other organizations are using them. Pat Stevens referred to this as the "fabric of trust", which I thought was a great way of describing it. We discussed, in breakout sessions, the example of the "ESBN" issue that is now confronting the book industry - we don't know who the ESBN people are, what they intend the identifier to be used for, or how it's different in nature from the ISBN - and without that trust, people are not going to adopt it. As Stuart Weibel said, "The only guarantee of the usefulness and persistence of identifier systems is the commitment of the organizations which assign, manage, and resolve them".
And as we developed ideas in further breakout sessions, the issue of trust continued to come up. If a community is not fully engaged in and supportive of an identifier, nothing about that identifier is going to work. However, identifiers can be pushed too far. I brought up the example of an overly effective identifier - the ISBN - in the case of Barnes & Noble's database. The top-selling ISBN at Barnes & Noble when I was there was... biscotti. This example continued to crop up throughout the meeting - it's now apparently taken on mythological proportions.
We discussed different types of identifiers, which Stuart labeled as "opaque", "sequentially semantic", and "encoded semantics" - and how effective each one is. An opaque ID is one that has no intrinsic meaning; a sequentially semantic ID is one which has meaning only in relation to others like it; an encoded semantic ID is one where you can look at the ID and determine attributes from its structure. An ISBN is an encoded semantic ID - country code, publisher prefix, ID of the actual product, check digit. The shorthand for an encoded semantic ID became a "hackable" identifier - once you decode or reverse-engineer it, you can find other products of the same sort. We discussed the positive and negative qualities of each of these types of IDs, and naturally concluded that you'd need different types for different functions and that even a "hackable" identifier was not necessarily a bad thing. (Which is largely the type of conclusion we came to about everything, it being a NISO conference.)
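For anyone who wants to see what "hackable" looks like in practice, here is a rough Python sketch using the 10-digit ISBN. It is my own illustration, not something presented at the meeting, and the function names are made up for the example. It assumes the ISBN is written with its hyphens, since the parts vary in length and can't be split apart reliably without the agencies' range tables. The check digit uses the standard ISBN-10 rule: weight the digits 10 down to 1, and the total must be divisible by 11 (with "X" standing for 10).

```python
def isbn10_check_digit(first_nine: str) -> str:
    """Return the check character for the first nine digits of an ISBN-10."""
    digits = [int(c) for c in first_nine if c.isdigit()]
    if len(digits) != 9:
        raise ValueError("expected exactly nine digits")
    # Weights run 10 down to 2; the check digit (weight 1) must bring
    # the weighted total to a multiple of 11.
    total = sum(w * d for w, d in zip(range(10, 1, -1), digits))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def is_valid_isbn10(isbn: str) -> bool:
    """Check an ISBN-10, ignoring hyphens and spaces."""
    chars = [c for c in isbn.upper() if c.isdigit() or c == "X"]
    if len(chars) != 10 or "X" in chars[:9]:
        return False
    return isbn10_check_digit("".join(chars[:9])) == chars[9]


def split_hyphenated_isbn10(isbn: str) -> dict:
    """Read the four encoded parts out of a hyphenated ISBN-10.

    Hypothetical helper: it trusts the hyphens rather than consulting
    any registry of group or publisher ranges.
    """
    parts = isbn.strip().split("-")
    if len(parts) != 4:
        raise ValueError("expected four hyphen-separated parts")
    group, publisher, item, check = parts
    return {"group": group, "publisher": publisher, "item": item, "check": check}


if __name__ == "__main__":
    example = "0-306-40615-2"
    print(split_hyphenated_isbn10(example))
    print(is_valid_isbn10(example))
```

The "hackable" part is the publisher prefix: hold the group and publisher fixed, vary the item number, recompute the check digit, and you have candidate ISBNs for that publisher's other products.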
Another interesting notion we discussed a little - and which I'd like to see more discussion on - is the idea of identifiers as world views. What one leaves out, in defining what one is identifying, is as important as what one puts in. When you say an ISBN is an identifier for a book, what specifically about that book are you identifying? The hegemony that identifiers necessarily impose is an interesting one (a little more philosophical and political than practical, but still fun to think about).
Posted by Laura Dawson, March 16, 06, 3:00 PM