Metadata! More Important Than Ever!

Category: E-commerce, Metadata

My passion for metadata isn’t a big secret – since my days at Muze and B&N.com, I’ve witnessed firsthand how good metadata helps people find the books they are looking for, and how bad metadata prevents people from finding what they want.

Why is this relevant now?

Well, CES showed us that there is a great interest in ebook readers – 23 of them debuted there, and an entire “Ebook Zone” was created. Apple is negotiating with publishers to sell content (books, magazines, newspapers) on its soon-to-appear tablet. With all these digitized books, search becomes more crucial than ever – web search is the ONLY way people are going to purchase these digital products.

Discovery/review services like NetGalley – as well as all the ecommerce sites – are heavily reliant on metadata not just for listing titles, but also for search algorithms themselves. (You’d think that would go without saying, but it doesn’t.)

Whether it’s “semantic” search or a more traditional browsing hierarchy, search technologies rest on metadata. Tags, definitions, clarifications (“when we say ‘porcelain’ we mean fine china, not toilets”) are all necessary to guide users to the information they want.
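To make the porcelain point concrete, here is a minimal sketch of how a clarifying tag narrows a keyword search. The catalog records, the `category` field, and the `search` function are all invented for illustration; no real retailer's search works exactly like this.

```python
# Hypothetical catalog records; "category" is the clarifying metadata tag.
CATALOG = [
    {"title": "Porcelain: A Collector's Guide", "category": "fine china"},
    {"title": "Porcelain Fixtures for the Modern Bathroom", "category": "plumbing"},
]


def search(catalog, keyword, category=None):
    """Return titles containing the keyword, optionally narrowed by category."""
    keyword = keyword.lower()
    hits = [r for r in catalog if keyword in r["title"].lower()]
    if category is not None:
        # Without this tag, a keyword match alone cannot tell china from toilets.
        hits = [r for r in hits if r["category"] == category]
    return [r["title"] for r in hits]


all_hits = search(CATALOG, "porcelain")
china_only = search(CATALOG, "porcelain", category="fine china")
```

A bare keyword search returns both titles; only the tag lets the search return the one the user actually meant.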

This metadata may not come in the form of the traditional ONIX feed. If a book file is marked up in XML (whether via InDesign or anything else), the title, author, BISAC and LC subject codes, price, publisher, and copyright date can all be easily derived from that book file – because those data points are defined in the file (usually in the front matter) with tags.

But just as with ONIX, what’s inside those tags has to be correct. This has a better shot at happening if the search engine is pulling from the book itself (the author name, for example, is not likely to be misspelled in the actual book).
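As a rough sketch of what deriving metadata from the book file could look like, the snippet below reads a handful of fields out of tagged front matter. The tag names (`frontmatter`, `bisac`, `copyright-date`, and so on) are assumptions made up for this example and do not follow any particular publishing schema; a real file would use whatever markup the publisher's workflow produces.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these tag names are invented for illustration only.
SAMPLE_BOOK = """
<book>
  <frontmatter>
    <title>An Example Title</title>
    <author>Jane Example</author>
    <bisac>FIC000000</bisac>
    <publisher>Example House</publisher>
    <price currency="USD">9.99</price>
    <copyright-date>2010</copyright-date>
  </frontmatter>
  <body><chapter>Chapter text goes here.</chapter></body>
</book>
"""

FIELDS = ("title", "author", "bisac", "publisher", "price", "copyright-date")


def extract_metadata(xml_text):
    """Collect the tagged front-matter fields into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    front = root.find("frontmatter")
    if front is None:
        return {}
    record = {}
    for name in FIELDS:
        value = front.findtext(name)
        if value is not None:
            record[name] = value.strip()
    return record


metadata = extract_metadata(SAMPLE_BOOK)
```

The extraction is only as good as the tagging: a field that is missing or mistyped in the file is simply missing or mistyped in the resulting record.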

In recently released recommendations to the publishing industry, BIC has stated: "Publishers must retain responsibility, wherever possible and appropriate, for the metadata of the products they publish, in all formats, print and digital." Another company, Giant Chair, has built its entire business around hosting a metadata platform for publishers: “When equipped with the appropriate tools, publishers are naturally the most qualified and motivated source for metadata creation and enrichment.”

Which makes sense!

Except in the real world it doesn’t quite play out that way. In my career, I’ve seen lots of publisher-generated metadata. There’s a reason why NetRead, Eloquence, and other data-scrubbing services exist. There’s a reason why Ingram, Bowker, and Baker & Taylor have departments of data editors who normalize and standardize that data. There’s a reason why librarians spend countless hours re-cataloguing titles for WorldCat. There’s a reason why BISG launched its Product Data Certification Program.

And that reason is: while publishers make the books, they continue not to pay sufficient attention to the accuracy of their data. While publishers are the definitive source of who the author is, what the list price is, what the book is about…they are not recording a lot of that information accurately. Because if they were, Fran Toolan and Greg Aden would have to find new things to do. Richard Stark would suddenly find himself with weeks and weeks of free time. Thousands of library cataloguers would be out of work. Ingram, Bowker, and B&T databases would be redundant. PDCP would not be necessary.

But good metadata IS publishers’ responsibility, fundamentally. They can outsource that responsibility, but ultimately it does all come back to the publishers. As our digital landscape explodes – as web search becomes not just one way but THE way readers find what’s next on their reading lists – metadata only becomes more important. If your sales are dipping, it’s entirely possible that readers can’t find your books. Take a look at your data. The solution is probably there.

Posted by Laura Dawson, 8:46 pm, Comments (1), permalink

ISBNs and ebooks: Part 7624

Category: Bookselling, Digital Publishing, E-commerce, Metadata

Yesterday the AAP’s Digital Working Group hosted a meeting where Phil Madans of Hachette, Angela Bole of BISG, and I talked about ISBNs and identifying digital content. This came on the heels of Mark Bide’s webinar for BISG yesterday on the same subject.

We broke the topic down into three discrete parts: ISBNs and ebooks, ISBNs and chapters, and ISBNs and "chunks". I stopped the presentation after each slide so we could discuss each part before moving on to the next one. And some interesting findings emerged.

Metadata

The primary objection (even more than cost – but of course these were larger publishers who can buy identifiers in bulk at a discount) to assigning an ISBN to each format of ebook is having to track the metadata on each record. Databases begin to bloat with products that are identical except for format, and managing the metadata becomes both repetitive and confusing.

Furthermore, it became apparent that publishers are not relying primarily on ISBNs to track royalties and sales – they are using SEVERAL fields, and the ISBN is not even necessarily the most important among them. So the International ISBN Agency’s argument that the ISBN is an essential tool for tracking these things falls by the wayside.

We talked a bit about the prospect of third parties assigning ISBNs to different ebook formats – most publishers seem to just want to produce an EPUB file, assign an ISBN to that one, and then send it "into the wild" (as Bide says) for conversion and distribution. The distributors and retailers are primarily book-related and their databases are generally keyed off an ISBN, so those third parties would have to assign ISBNs to whatever formats they are distributing and selling. But the publishers at this meeting did not seem particularly worried about that prospect.

One publisher also stressed that by supporting more than one format, they’re contributing to format proliferation and they would prefer very much not to do that.

However, allowing third parties to assign ISBNs to digital products on an as-needed basis becomes problematic when there are changes to the metadata. If a pub date shifts, if a price goes up, if there are corrections to author names, additions to synopses and reviews – any time you have to edit the metadata on a title, if you’ve got third parties with their OWN editions of that title, you can’t be sure the edited/corrected metadata will reach those editions.
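The propagation problem can be sketched in a few lines. This is an illustration only: the `Work` class, its field names, and the placeholder identifiers are all hypothetical (they are not real ISBNs, and nobody at the meeting proposed this design). It contrasts a record built on demand from the publisher's shared metadata with a copy a third party took at conversion time.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Work:
    """Metadata held once by the publisher (field names are illustrative)."""
    title: str
    author: str
    pub_date: str
    formats: dict = field(default_factory=dict)  # identifier -> format name

    def record_for(self, identifier):
        """Build a per-format record from the shared metadata on demand."""
        return {
            "identifier": identifier,
            "format": self.formats[identifier],
            "title": self.title,
            "author": self.author,
            "pub_date": self.pub_date,
        }


# The author name starts out misspelled on purpose.
work = Work("An Example Title", "Jane Exampel", "2010-03-01")
work.formats["PUBLISHER-ID-1"] = "EPUB"

# A third party converts the file and keeps its own copy of the record
# under its own identifier (a placeholder, not a real ISBN).
third_party_record = copy.deepcopy(work.record_for("PUBLISHER-ID-1"))
third_party_record.update(identifier="THIRD-PARTY-ID-1", format="converted")

# The publisher corrects the misspelled author name.
work.author = "Jane Example"

publisher_record = work.record_for("PUBLISHER-ID-1")
```

After the correction, the publisher's own record shows the fixed name, while the third party's copy still carries the typo until someone pushes the change to it.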

ISBNs and Chapters

Even less popular than the one-ISBN-per-format model is the one-ISBN-per-chapter idea. This expands the metadata bloat exponentially. At present, most publishers who are offering chapters for sale are doing so from their own websites, so ISBNs are not such an issue. However, once retailers begin offering individual chapters of books, the industry will face the same problems it does with different ebook formats. Multiplied by however many chapters are in a given book.

In addition to identification of chapters for the purposes of trading with third parties, there is the issue of tracking royalties. With textbook authors, this is problematic – many authors contribute to textbooks, and determining who wrote which chapters can be daunting. It was generally agreed that without significant market demand, identifying chapters for the purposes of trade is not a high priority.

ISBNs and "Chunks"

First there was the objection to the term "chunks". Which I agree with! It’s nasty. But Anna Wintour said the same thing about the word "blog"…and look where that got her! It seems "chunk" is the term we’re stuck with, and I am heartily sorry about that.

Second, everyone at the meeting pretty much agreed that this is a vastly esoteric subject and not likely to become a pressing issue anytime soon. Even Amazon does not sell sub-chapter-level content. Licensing content to third parties (such as websites) will likely mean putting together discrete digital assets into various packages, but there seems to be no trade reason right now for ISBNs to be attached to those packages. This may change as the market changes.

We ended with a "watch this space" message, and are now putting together a survey which looks at some of the assumptions behind past ISBN-use recommendations.

Posted by Laura Dawson, 1:26 pm, Comments (0), permalink

eBay not liking digital sales so much

Category: Bookselling, Digital Publishing, E-commerce

According to WebProNews, eBay is no longer allowing sales of digital products via its normal channels – purveyors of ebooks and the like have to go through its Classified Ads system. Apparently there’s been some manipulation of feedback on digital products. According to the letter sent out to digital sellers,

Using the Classified Ads format, sellers receive a 30-day ad at a fixed price. This solution enables sellers to continue to market their digital goods on eBay; however, because Classified Ad listings are a lead generation tool and do not result in transactions that go through eBay, Feedback cannot be exchanged between buyer and seller.

Posted by Laura Dawson, 12:51 pm, Comments (1), permalink

No Lightning Source at Amazon?

Category: Bookselling, Company News, Digital Publishing, E-commerce

The intertubes have been flapping today about Amazon’s latest move to get its POD publishers and self-published authors to exclusively use BookSurge for printing their titles. I just posted a piece about it over at O’Reilly’s Tools of Change for Publishing blog.

Peter Brantley’s listserv is all over this, as is Michael Cader. It’s pretty huge.

Posted by Laura Dawson, 11:32 am, Comments (0), permalink

Google Book Search Releases API

Category: Digital Libraries, Digital Publishing, E-commerce, Google

Via Peter Brantley’s listserv – apparently Google has released an API that allows developers to link directly to a book in the Google Book Search database. The link is a little touchy, but ultimately Google gives an example of their API at the Deschutes Public Library. In the words of the Google blog:

Web developers can use the Books Viewability API to quickly find out a book’s viewability on Google Book Search and, in an automated fashion, embed a link to that book in Google Book Search on their own sites.

Posted by Laura Dawson, 2:10 pm, Comments (0), permalink

Amazon to buy Audible

Category: Department of Holy Shit, Digital Publishing, E-commerce

For $300 million, Amazon will be acquiring Audible.com – Amazon issued the press release this morning at 7 a.m. This is on the heels of the departure of COO Glenn Rogers.

Posted by Laura Dawson, 9:32 am, Comments (0), permalink

Ebooks up, audiobooks down

Category: Bookselling, Digital Publishing, E-commerce

The AAP released sales figures for the fiscal year ending in November 2007, reports Shelf Awareness this morning. Notable stats (to us, anyway):

Sales of ebooks rose 36.4% over 2006. Sales of audiobooks declined by 24.1%, which I found quite surprising given the hype around audiobooks in the previous year. I’m wondering if it’s because the only downloadable games in town are Overdrive (which does not have a commercial application, only one for institutions) and Audible.com (which does not have an institutional strategy, only a commercial one). MediaBay went out of business last year. It may also be due to the migration from CD audiobooks to downloadable ones – there’s bound to be a dip as people learn new technologies. And, as belts tighten in this economy, it may also be that audiobooks are proving to be a luxury that consumers are deciding they can live without.

Posted by Laura Dawson, 10:22 am, Comments (1), permalink

Borders staffs up in IT

Category: Bookselling, E-commerce

Borders announced that it has hired Gary E. Baker to serve as VP of IT Delivery Services. With a deep background in IT (he hosts a radio program called "Internet Advisor" on Saturday nights), Baker will be responsible for

the development and execution of IT strategic processes related to the delivery of technology as well as leading teams to ensure that business goals are met through delivery of necessary IT products and services, among other duties.

Posted by Laura Dawson, 10:02 am, Comments (0), permalink

Glenn Rogers leaving Audible

Category: Department of Holy Shit, E-commerce

Glenn Rogers, COO of Audible and an incredible, reasonable, smart, kind, awesome, all-around-good-guy-mensch, is leaving the company to go back to consulting. I worked with Glenn when I was consulting at Audible and he is fantastic.

Good luck, Glenn!

Posted by Laura Dawson, 11:50 am, Comments (1), permalink

David Cully at B&T

Category: Bookselling, Company News, Digital Libraries, E-commerce, Libraries, Publishing

David Cully, formerly of B&N, has gone over to Baker & Taylor as…well, his title’s far too long so you can go to the press release here. According to this,

Cully’s primary responsibilities include managing all merchandising and purchasing functions, managing BTMS, and managing Baker & Taylor’s new Specialty Markets Group.

Posted by Laura Dawson, 11:35 am, Comments (0), permalink
