Monday, October 13. 2014
My slides from DCMI 2014: schema.org in the wild: open source libraries++.
Last week I was at the Dublin Core Metadata Initiative 2014 conference, where Richard Wallis, Charles McCathie Nevile and I were slated to present on schema.org and the work of the W3C Schema.org Bibliographic Extension Community Group (#schemabibex). As a first-timer at DCMI, I wasn't sure what kind of an audience to expect: there is a peer-reviewed papers track, and a series of sessions on a truly intimidating topic (RDF Application Profiles), but on the other hand our own topic was fairly basic. As it turned out, there was an invigoratingly mixed set of backgrounds present, and Eric Miller's opening keynote, which gave an oral history of the origins of DCMI and a look towards the future challenges for the organization, reassured me that I wasn't going to be out of my depth.
Special kudos to Eric for his analogy of the Web to a credit card, which offers both human-readable and machine-readable data. A nice, clean image!
Richard, Charles and I opted to structure our 1.5-hour session as a series of short talks followed by a long period of discussion. However, as often happens, the excitement of speaking in front of a room that drew so many attendees that we had to jam in more chairs led to that plan breaking down. I cut my own materials back to illustrating how one of my primary contributions to the #schemabibex effort--representing library holdings using schema.org's GoodRelations-based Product/Offer model--had been implemented in free software library systems, including Evergreen, Koha, and VuFind. I walked from a basic bibliographic record (represented as a Product), through to the associated borrowable items (represented as Offers with a price of $0.00, call numbers as SKUs, and barcodes as serialNumbers) that were offered by a specific Library with its own set of operating hours, address, and contact information... all published out of the box as RDFa in modern Evergreen systems.
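For readers who want to see the shape of that Product/Offer mapping, here is a rough sketch in Python. It builds the structure as JSON-LD purely for compactness (Evergreen itself publishes RDFa), and the helper name `holding_to_offer`, the use of `seller` to link an Offer to its Library, and all of the sample values (title, call numbers, barcodes, library details) are my own assumptions for illustration, not anything taken from a real catalogue.

```python
import json


def holding_to_offer(call_number, barcode, library):
    """Map one borrowable item to a schema.org Offer (hypothetical helper)."""
    return {
        "@type": "Offer",
        "price": "0.00",          # borrowing costs nothing
        "sku": call_number,       # call number expressed as the SKU
        "serialNumber": barcode,  # barcode expressed as the serialNumber
        "seller": library,        # assumption: link to the Library via `seller`
    }


# Made-up library details: name, address, contact info, operating hours.
library = {
    "@type": "Library",
    "name": "Example Branch Library",
    "address": "123 Example Street",
    "telephone": "555-0100",
    "openingHours": "Mo-Fr 09:00-17:00",
}

# The bibliographic record is the Product; each copy becomes an Offer.
record = {
    "@context": "http://schema.org",
    "@type": "Product",
    "name": "An Example Title",
    "offers": [
        holding_to_offer("PR 6039 .E93", "30000000000001", library),
        holding_to_offer("PR 6039 .E93 c.2", "30000000000002", library),
    ],
}

print(json.dumps(record, indent=2))
```

Each physical copy becomes its own Offer, so a record with two copies at the same branch simply carries two Offers pointing at the same Library.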
I did stray a little to posit that the use case for schema.org is not and should not be limited to "search engine optimization", but that this very simple level of structured data could fairly easily form the basis of an API. In the rather limited discussion that we were able to hold at the end of the session (and encroaching on break time), Charles counselled that libraries shouldn't really bother with dumbing down their beautiful metadata simply to publish schema.org... while I countered that the pursuit of publishing beautiful metadata in the past has generally led librarians to publish no metadata at all, and that schema.org was a great first step towards building a web of cultural heritage metadata meant for machine consumption.
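To make the "basis of an API" idea a little more concrete, here is a minimal sketch of a client pulling property values back out of a page's markup using only the Python standard library. It is emphatically not a conforming RDFa processor: it ignores nesting, `typeof`, and `resource`, and simply pairs each `property` attribute with either its `content` attribute or the next run of text. The class name `PropertyCollector` and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs from RDFa-style markup (naive sketch)."""

    def __init__(self):
        super().__init__()
        self.values = []      # (property, value) pairs in document order
        self._pending = None  # property still waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "content" in attrs:
            # Machine-readable value supplied directly, e.g. on <meta>.
            self.values.append((prop, attrs["content"]))
        else:
            self._pending = prop

    def handle_data(self, data):
        text = data.strip()
        if self._pending and text:
            self.values.append((self._pending, text))
            self._pending = None


# Hypothetical markup, loosely in the style of a catalogue record page.
SAMPLE = (
    '<div vocab="http://schema.org/" typeof="Product">'
    '<h1 property="name">An Example Title</h1>'
    '<div property="offers" typeof="Offer">'
    '<meta property="price" content="0.00">'
    '<span property="sku">QA 76.9 .E93</span>'
    '<span property="serialNumber">30000000000001</span>'
    '</div></div>'
)

collector = PropertyCollector()
collector.feed(SAMPLE)
collector.close()

for prop, value in collector.values:
    print(prop, "=", value)
```

A real client would want a proper RDFa parser, but even this naive pass shows that the data a machine needs is already sitting in the page.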
I wish I could have stayed longer at DCMI, but it was Thanksgiving in Canada and there were families to visit and feast with--not to mention children to help take care of--so I had to depart after just a day and a half. I'm encouraged by the steps the organization is taking to renew itself, and I hope to be able to participate again in the future.
Saturday, September 13. 2014
Version 1.91 of the http://schema.org vocabulary was released a few days ago, and I once again had a small part to play in it.
With the addition of the workExample and exampleOfWork properties, we (Richard Wallis, Dan Brickley, and I) realized that examples of these CreativeWork example properties were desperately needed to help clarify their appropriate usage. I had developed one for the blog post that accompanied the launch of those properties, but the question was, where should those examples live in the official schema.org docs? CreativeWork has so many children, and the properties are so broadly applicable, that it could have been added to dozens of type pages.
It turns out that an until-now unused feature of the schema.org infrastructure is that examples can live on property pages; even Dan Brickley didn't think this was working. However, a quick test in my sandbox showed that it _was_ in perfect working order, so we could locate the examples on their most relevant documentation pages... Huzzah!
I was then able to put together a nice, juicy example showing relationships between a Tolkien novel (The Fellowship of the Ring), subsequent editions of that novel published by different companies in different locations at different times, and movies based on that novel. From this librarian's perspective, it's pretty cool to be able to do this; it's a realization of a desire to express relationships that, in most library systems, are hard or impossible to accurately specify. (Should be interesting to try and get this expressed in Evergreen and Koha...)
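The general pattern of pairing workExample with exampleOfWork can be sketched roughly like this. This is not the example that went into the schema.org docs; it is a small Python illustration that emits JSON-LD, and the work identifier, the helper name `add_example`, and the placeholder edition details are all hypothetical. Treating the film as a workExample of the novel is likewise my assumption for the purposes of the sketch.

```python
import json

WORK_ID = "http://example.org/work/fellowship"  # hypothetical identifier

# The abstract work: the novel itself.
work = {
    "@context": "http://schema.org",
    "@id": WORK_ID,
    "@type": "Book",
    "name": "The Fellowship of the Ring",
    "author": "J. R. R. Tolkien",
    "workExample": [],
}


def add_example(work, example_type, **props):
    """Attach an example to the work, with the inverse link back to it."""
    example = {"@type": example_type, "exampleOfWork": {"@id": work["@id"]}}
    example.update(props)
    work["workExample"].append(example)
    return example


# Placeholder editions: publisher names are invented for illustration only.
add_example(work, "Book", name="The Fellowship of the Ring",
            publisher="Placeholder Publisher A", bookFormat="Hardcover")
add_example(work, "Book", name="The Fellowship of the Ring",
            publisher="Placeholder Publisher B", bookFormat="Paperback")

# A film adaptation, linked with the same pair of properties.
add_example(work, "Movie",
            name="The Lord of the Rings: The Fellowship of the Ring",
            datePublished="2001")

print(json.dumps(work, indent=2))
```

Because every example carries an exampleOfWork link back to the same identifier, the relationship can be followed from either direction.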
In an ensuing conversation on public-vocabs about the appropriateness of this approach to work relationships, I was pleased to hear Jeff Young say "+1 for using exampleOfWork / workExample as many times as necessary to move vaguely up or down the bibliographic abstraction layers."... To me, that's a solid endorsement of this pragmatic approach to what is inherently messy bibliographic stuff.
Kudos to Richard for having championed these properties in the first place; sometimes we're a little slow to catch on!
Friday, July 18. 2014
Since returning from my sabbatical, I've felt pretty strongly that one of the things our work place is lacking is open communication about the work that we do--not just outside of the library, but within the library as well. I'm convinced that the more that we know about the demands on each other's time and the goals that we're trying to achieve, the more likely we'll be able to work together towards the same goals and have a better understanding of each other's challenges.
Towards that end, I've decided to try maintaining a work blog so that my colleagues will have a better idea about what I've been up to. I wouldn't be surprised if some of my peers think that I sit in my office all day browsing the internet (which, actually, happens sometimes but I swear I'm doing it to try and find a solution for a problem!), because the day-to-day work of a systems librarian can be pretty esoteric. And when your colleagues expect you to fix the many small annoyances they have to deal with, it might help them to develop some empathy if they understand what you are actually spending your time on.
Anyway, I decided not to mirror the content here because, well, it's probably too site-specific to really be of interest to you, my dear readers. Whoever you are. However, I will link to the two entries that I've cranked out so far; you can decide if you want to follow along from there:
Tuesday, July 1. 2014
On Sunday, June 29th Jenn Riley, Jason Clark, and I presented at the ALCTS/LITA jointly sponsored session Understanding schema.org. The build-up to the session was pretty amazing; I was delighted to learn that Jason and I had been working on pretty much parallel efforts over the past couple of years. Jenn did a great job of organizing the session, and by the time we started talking 276 people had indicated their interest in attending: that was two more than those who had indicated an interest in attending the BIBFRAME Forum Update scheduled in the same time slot. Our room was large and quite full.
Jenn started the session out strong by advancing her concept that libraries need to target discovery elsewhere: that is, that there is no way that libraries can compete directly with major search engines like Google, Bing, and Yahoo, either through the discovery tools that we have to offer, our presence in the consciousness of most of the population as the starting point for discovery, or in the resources we can direct towards closing the huge gap in technology, usability, and mindshare that the search engines have opened up over the past two decades. But we can take steps to start working with the search engines to enable our resources to be discovered and accessed more directly by them.
That led quite naturally to my own part of the session, in which I talked about my attempt to turn cataloguing's efforts to provide access points in our niche catalogues into access points for the open web by publishing schema.org structured data from library catalogues like Evergreen, Koha, and VuFind. I started things out by pointing out the legacy of restrictive
For this talk I used visualizations generated by the RDFa playground to illustrate the structured data contained in some real examples of a production Evergreen system (thanks to Bibliomation). Given that I'm normally a text-and-talk kind of guy, the illustrations seemed to help out--particularly in showing how holdings map quite readily to the
Of course, the evolution from unstructured, to structured, to linked data had its payoff beginning with the link from holdings to the libraries that hold the resources. We have plenty more we can and must do, but unlike other efforts which are still crystallizing and which will require significant architectural work to happen before libraries can even begin trying out real systems, you can use schema.org-enabled systems today. And adapting systems to publish schema.org structured data only requires access to the HTML templates for your system (which, hopefully, you have: otherwise you have bigger problems to deal with!) and following the patterns that have already been established by Evergreen, Koha, and VuFind.
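As a rough idea of what "access to the HTML templates" buys you, here is a small Python sketch that renders one holdings row with RDFa attributes baked into the template. It is not lifted from Evergreen, Koha, or VuFind; the template, the function name `render_holding`, the choice of `seller` for the Library link, and the sample values are all assumptions for illustration. It also assumes an enclosing element elsewhere on the page already declares `vocab="http://schema.org/"` and the record's own `typeof`.

```python
from html import escape
from string import Template

# Hypothetical holdings-row template: ordinary table markup plus RDFa attributes.
HOLDING_ROW = Template(
    '<tr property="offers" typeof="Offer">'
    '<td property="seller" typeof="Library">'
    '<span property="name">$library</span></td>'
    '<td property="sku">$call_number</td>'
    '<td property="serialNumber">$barcode</td>'
    '<td><meta property="price" content="0.00">$status</td>'
    '</tr>'
)


def render_holding(holding):
    """Render one holding as an RDFa-annotated table row (illustrative only)."""
    # Escape every value so catalogue data cannot break the markup.
    safe = {key: escape(str(value)) for key, value in holding.items()}
    return HOLDING_ROW.substitute(safe)


row = render_holding({
    "library": "Town & Gown Library",  # made-up library name
    "call_number": "QA 76.9 .E93",     # made-up call number
    "barcode": "30000000000001",       # made-up barcode
    "status": "Available",
})
print(row)
```

The visible page looks the same as before; the only change is a handful of extra attributes (and one meta element for the price) on markup the template was already producing.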
Jason did a great job showing both a broader use case for schema.org, including work he has led on digital collections such as embedding the
Perhaps the best part of the session, however, was the set of insightful questions from the audience (along with the genuinely enthusiastic response to our talks). We had deliberately left 15 minutes for questions, and we were not disappointed: they ranged from questions about how we move from structured data to more linked data (I riffed on the Dodds/Davis Progressive Enrichment linked data pattern, suggesting that we should be able to store links for each field or value of interest directly in our MARC records), to questions about what proprietary systems are doing this with schema.org today (alas, none that I'm aware of, unless something has changed since February).
This work is licensed under a Creative Commons Attribution-Share Alike 2.5 Canada License.