Coffee|Code: Dan Scott's blog - evergreenhttps://coffeecode.net/2017-08-24T16:00:00-04:00Librarian · DeveloperOur nginx caching proxy setup for Evergreen2017-08-24T16:00:00-04:002017-08-24T16:00:00-04:00Dan Scotttag:coffeecode.net,2017-08-24:/our-nginx-caching-proxy-setup-for-evergreen.htmlDetails of our nginx caching proxy settings for Evergreen
<p>
A long time ago, I experimented with using nginx as a caching proxy in front of
Evergreen but never quite got it to work. Since then, a lot has changed in both
nginx and Evergreen, and Bill Erickson figured out how to get nginx to proxy
the websockets that Evergreen now needs for its web-based staff client. This
spring, as part of my work towards <a href="https://coffeecode.net/evergreen-progressive-web-app.html">building prototype offline support for the Evergreen catalogue's <em>My Account</em> section</a>, I dug in and started
figuring out some of the final pieces that are needed to enable nginx to proxy
most of the static content that Apache (with its bloated processes) would
otherwise have to serve up, and wrote a <a href="http://git.evergreen-ils.org/?p=contrib/Conifer.git;a=blob;f=Open-ILS/src/support-scripts/webserver_config.py;h=0b74e6c2f764961088ae2136793ba845fd1cff17;hb=400ce5b233d33d08badcdc0887e97e748020301b">configuration generator script</a> for the
nginx and Apache pieces. And in July, we went live with the configuration.
</p>
<p>
This post documents what we are currently running (as of August 2017) on our
Evergreen 2.12 server with Ubuntu 16.04. If you have any questions about this
or our corresponding Apache configuration, please let me know and I'll attempt
to answer them!
</p>
<h2>/etc/nginx/sites-enabled/evergreen.conf</h2>
<p>
This is the core configuration for the nginx server:
</p>
<pre>proxy_cache_path /tmp/nginx_cache levels=1:2 keys_zone=my_cache:10m max_size=1g
                 inactive=60m use_temp_path=off;
proxy_cache_key $scheme$http_host$request_uri;

server {
    listen 80;
    server_name clients.concat.ca;
    include /etc/nginx/concat_ssl.conf;
    include /etc/nginx/osrf_sockets.conf;
    location / {
        proxy_pass https://localhost:7443;
        rewrite ^/?$ /updates/manualupdate.html permanent;
        include /etc/nginx/concat_headers.conf;
    }
}</pre>
<ul>
<li>
The <code><a href="http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_path">proxy_cache_path</a></code>
directive tells nginx where to store the data it is caching, what kind
of directory structure it should create (<em>levels</em>), the name of
the shared memory zone to use (<em>keys_zone</em>), the maximum size of
the disk cache (<em>max_size</em>), how long to retain a cached copy of
the file (<em>inactive</em>), and whether to use the value of the
<code><a href="http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_temp_path">proxy_temp_path</a></code>
directive as a parent directory for the cache.
</li>
<li>
The <code><a href="http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_key">proxy_cache_key</a></code>
directive tells nginx to use a combination of the request scheme (typically
HTTP or HTTPS), the hostname, and the full request URI (including GET
arguments) to store and look up the cached data. Apache's response
tells nginx how long the response should be cached: whether it should
expire immediately or, as of
<a href="https://bugs.launchpad.net/evergreen/+bug/1681095">#1681095 "Extend browser cache-busting support"</a>,
be cached for a full year for images, JavaScript, and CSS (at least until
you run <code>autogen.sh</code> again).
</li>
<li>
We currently include one <code>server</code> directive per hostname
that we support, which is quite repetitive. Looking at this with fresh
eyes, we should probably simply use something like <code>server_name
*.concat.ca</code> to cover all of our hostnames on our domain with a
single directive; see the sketch after this list.
</li>
<li>
In this block, we only listen to port 80, which seems odd given that
we're an HTTPS-only site. Read on!
</li>
<li>
<code>include /etc/nginx/concat_ssl.conf;</code> keeps all of the
TLS-related configuration in one place, including listening to port
443. We'll pry open this file later.
</li>
<li>
<code>include /etc/nginx/osrf_sockets.conf;</code> keeps all of the
OpenSRF websockets translator proxy configuration in one place. We'll
also pry open this file later.
</li>
<li>
The <code>location /</code> block handles the proxying. At first I was nervous
and wanted to proxy the actual hostname instead of
<code>localhost</code> to ensure we got the right templates, etc, but
it turns out the proxy headers guide the request to the right host. So
now I'm relaxed and we simply pass the request on to
<code>https://localhost:7443</code>. Be very careful with those trailing
slashes!
</li>
</ul>
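<p>
To make the consolidation concrete, here's a minimal sketch (hypothetical;
we haven't yet run this through our configuration generator) of a single
<code>server</code> block with a wildcard <code>server_name</code>:
</p>
<pre>server {
    listen 80;
    server_name *.concat.ca;
    include /etc/nginx/concat_ssl.conf;
    include /etc/nginx/osrf_sockets.conf;
    location / {
        proxy_pass https://localhost:7443;
        include /etc/nginx/concat_headers.conf;
    }
}</pre>
<p>
Two caveats: the TLS certificate would need to cover the wildcard, and the
per-host <code>rewrite</code> of the bare hostname (to
<code>/updates/manualupdate.html</code> on the staff-client host, versus
<code>/eg/opac/home</code> in <code>concat_headers.conf</code>) would need to
be handled some other way.
</p>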
<h2>/etc/nginx/concat_ssl.conf</h2>
<pre>listen 443 ssl http2;
ssl_certificate /etc/apache2/ssl/server.crt;
ssl_certificate_key /etc/apache2/ssl/server.key;
if ($scheme != "https") {
    return 301 https://$host$request_uri;
}
# generate with openssl dhparam -out dhparams.pem 2048
ssl_dhparam /etc/apache2/dhparams.pem;
# From https://mozilla.github.io/server-side-tls/ssl-config-generator/
ssl_prefer_server_ciphers on;
ssl_session_timeout 1d;
ssl_session_cache shared:SSL:50m;
ssl_session_tickets off;
# intermediate configuration. tweak to your needs.
ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
ssl_ciphers 'ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS';
# HSTS (ngx_http_headers_module is required) (15768000 seconds = 6 months)
add_header Strict-Transport-Security max-age=15768000;
# OCSP Stapling ---
# fetch OCSP records from URL in ssl_certificate and cache them
ssl_stapling on;
ssl_stapling_verify on;</pre>
<p>
There's a fair bit going on here, but it's almost entirely related to TLS
support and a lot of the content comes either from the
<a href="https://mozilla.github.io/server-side-tls/ssl-config-generator/">Mozilla TLS configuration generator</a>
or from Certbot's configuration plugin for nginx. Perhaps most interesting
is the <code>listen 443 ssl http2;</code> line that enables listening on
the standard HTTPS port and also supports <a href="https://http2.github.io/">HTTP/2</a>
for browsers that support it--effectively a way to use a single connection
from a browser to a server to issue many parallel requests for resources,
amongst other performance enhancements.
</p>
<p>
We also force any HTTP request to use an HTTPS connection using the
<code>if ($scheme != "https") {</code> block.
</p>
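<p>
As an aside, nginx offers a more idiomatic way to express that redirect: a
dedicated catch-all <code>server</code> block for port 80. We use the
<code>if</code> test because it keeps everything for a host in one
<code>server</code> block, but a sketch of the alternative (hypothetical; not
what we actually run) would look like:
</p>
<pre># redirect every plain-HTTP request to its HTTPS equivalent
server {
    listen 80 default_server;
    server_name _;
    return 301 https://$host$request_uri;
}</pre>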
<h2>/etc/nginx/osrf_sockets.conf</h2>
<p>
This is extracted from the <a href="http://git.evergreen-ils.org/?p=OpenSRF.git;a=blob;f=examples/nginx/osrf-ws-http-proxy;h=d079230c62c3580c25435413570a6cc95b4bbd8a;hb=refs/heads/master">sample nginx configuration</a> shipped with OpenSRF:
</p>
<pre>location /osrf-websocket-translator {
    proxy_pass https://localhost:7682;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    # Needed for websockets proxying.
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";

    # Raise the default nginx proxy timeout values to an arbitrarily
    # high value so that we can leverage osrf-websocket-translator's
    # timeout settings.
    proxy_connect_timeout 5m;
    proxy_send_timeout 1h;
    proxy_read_timeout 1h;
}</pre>
<h2>/etc/nginx/concat_headers.conf</h2>
<p>
This is not perfectly named; while we do set up the proxy headers in this
file, we also include some of the other statements we would otherwise have
to repeat inside the <code>server</code> block. Here's what the contents
look like:
</p>
<pre>proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_cache my_cache;
proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;
proxy_cache_lock on;
rewrite ^/?$ /eg/opac/home permanent;</pre>
<ul>
<li>
The <code><a href="http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_set_header">proxy_set_header</a></code>
directive adds headers to the requests forwarded to Apache, so that Apache
can figure out which host was actually requested, accurately log requests
(instead of saying everything is coming from <code>localhost</code>),
etc. These directives were copied directly from the <a href="http://git.evergreen-ils.org/?p=OpenSRF.git;a=blob;f=examples/nginx/osrf-ws-http-proxy;h=d079230c62c3580c25435413570a6cc95b4bbd8a;hb=refs/heads/master">sample nginx configuration</a> shipped with OpenSRF.
</li>
<li>
<code><a href="http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache">proxy_cache</a></code>
tells this server to use the cache we previously named in our <code>keys_zone</code> parameter (a sketch for verifying that the cache is working follows this list).
</li>
<li>
<code><a href="http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_use_stale">proxy_cache_use_stale</a></code>
tells this server to return stale data (if it has a cached copy) if Apache
returns an error or a timeout or any of the specified HTTP status codes
while trying to fetch a fresh copy.
</li>
<li>
<code><a href="http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_lock">proxy_cache_lock</a></code>
tells this server, when multiple identical requests arrive for data that
needs to be cached or refreshed, to allow only a single request through
to Apache while the other requests wait. This can be one
way to avoid the "someone set a book down on a keyboard and caused 100
identical requests in one second" problem.
</li>
<li>
The <code>rewrite</code> simply directs the request for a bare hostname
(with or without a trailing slash) to the catalogue home page.
</li>
</ul>
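<p>
To verify that the cache is actually doing its job, a handy addition to this
file (a stock nginx facility, though hypothetical for our setup--we haven't
deployed it) is a header exposing nginx's
<code>$upstream_cache_status</code> variable:
</p>
<pre>add_header X-Cache-Status $upstream_cache_status;</pre>
<p>
With that in place, <code>curl -sI</code> against a cacheable URL should
report <code>X-Cache-Status: MISS</code> on the first request and
<code>HIT</code> on subsequent requests, until the cached copy expires.
</p>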
Enriching catalogue pages in Evergreen with Wikidata2017-08-12T16:00:00-04:002017-08-12T16:00:00-04:00Dan Scotttag:coffeecode.net,2017-08-12:/enriching-catalogue-pages-in-evergreen-with-wikidata.htmlAn openly licensed JavaScript widget that enriches library catalogues with Wikidata data
<p>
I'm part of the Music in Canada @ 150 Wikimedia project, organizing wiki
edit-a-thons across Canada to help improve the presence of Canadian music and
musicians in projects like Wikipedia, Wikidata, and Wikimedia Commons. It's
going to be awesome, and it's why I invested time in developing and delivering
the <a href="https://coffeecode.net/wikidata-workshop-for-librarians.html"><em>Wikidata
for Librarians</em></a> presentation at the
<abbr title="Canadian Association of Music Libraries, Archives, and Document Centres">CAML</abbr> preconference.
</p>
<p>
Right now I'm at the Wikimania 2017 conference, because it is being held in
Montréal--just down the road from me when you consider it is an international
affair. The first two days were almost entirely devoted to a massive hackathon
with hundreds of participants and a very welcoming, friendly ambiance.
It was inspiring, and I joined in a couple of activities:
</p>
<ul>
<li>installing Wikibase--the technical foundation for Wikidata--from scratch</li>
<li>an ad-hoc data modelling session with <a href="https://meta.wikimedia.org/wiki/User:Ainali">Jan</a> and
<a href="https://meta.wikimedia.org/wiki/User:smallison">Stacy
Allison-Cassin</a> that resulted in enhancing the periodicals structure on
Wikidata</li>
</ul>
<p>
But I also had the itch to revisit and enhance the JavaScript widget that runs
in our Evergreen catalogue and delivers on-demand cards of additional
metadata about contributors to recorded works. I had originally developed the widget
as a proof-of-concept for the potential value to cultural institutions of
contributing data to Wikidata--bearing in mind a challenge put to the room at
an Evergreen 2017 conference session that asked what tangible value linked open
data offers--but it was quite limited:
</p>
<ul>
<li>it would only show a card for the first listed contributor to the work</li>
<li>it was hastily coded, and thus duplicated code, used shortcuts, and had no comments</li>
<li>the user interface was poorly designed</li>
<li>it was not explicitly licensed for reuse</li>
</ul>
<p>
So I spent some of my hackathon time (and some extra time stolen from various sessions)
fixing those problems--so now, when you look at the <a href="https://laurentian.concat.ca/eg/opac/record/738234">catalogue record
for a musical recording by the most excellent Canadian band Rush</a>, you will find
that each of the contributors to the album has a musical note (♩) which, when clicked,
displays a card based on the data returned from Wikidata using <a href="http://tinyurl.com/ya8gxcpo">a SPARQL query</a> matching the
contributor's name (limited in scope to bands and musicians to avoid too many
ambiguous results).
</p>
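<p>
The shortened link above has the real query; purely as illustration, a
minimal sketch of the general shape of such a lookup (not necessarily the
exact query we run) against the Wikidata Query Service looks like:
</p>
<pre>SELECT ?subject ?subjectLabel ?subjectDescription WHERE {
  ?subject rdfs:label "Geddy Lee"@en .
  # restrict matches to humans (Q5) or bands (Q215380)
  { ?subject wdt:P31 wd:Q5 . } UNION { ?subject wdt:P31 wd:Q215380 . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}</pre>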
<p>
<img src="/uploads/pics/rush_wikidata_enriched.png" />
</p>
<p>
I'm not done yet: the design is still very basic, but I'm happier about the code quality
and it now supports queries for all of the contributors to a given album. It is also
licensed for reuse under the GPL version 2 or later license, so as long as you
can load the script in your catalogue and tweak a few CSS query selector
statements to identify where the script should find contributor names and where
it should place the cards, it should theoretically be usable in any catalogue of musical
recordings. And with the clear <em>"Edit on Wikidata"</em> link, I hope that it
encourages users to jump in and contribute if they find one of their favourite performers
lacks (or shows incorrect!) information.
</p>
<p>
You can find the code on the <a href="https://goo.gl/XvEemr">Evergreen contributor git repository</a>.
</p>
Evergreen as a Progressive Web App?2017-04-14T00:35:00-04:002017-04-14T00:35:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2017-04-14:/evergreen-progressive-web-app.html<p><a class="reference external" href="https://developer.mozilla.org/en-US/Apps/Progressive">Progressive Web Apps</a>
are pretty cool, and for good reason: the idea is to take advantage of the
advanced features of our web browsers to provide capabilities that rival native
apps, while still offering good performance and functionality to users of other
browsers.</p>
<p>However, if you've done much reading about PWAs, you could be forgiven for
thinking they require a client-side JavaScript framework like <a class="reference external" href="https://facebook.github.io/react/">React</a> or <a class="reference external" href="https://angular.io">Angular</a> to
be possible. So last week at the <a class="reference external" href="https://evergreen-ils.org/conference/2017-evergreen-international-conference/">2017 Evergreen International Conference</a>,
I demonstrated that it <em>is</em> possible to graft PWA attributes onto Evergreen's
classic Perl-based Template Toolkit web architecture--to the point of scoring
100/100 on Google's <a class="reference external" href="https://developers.google.com/web/tools/lighthouse/">Lighthouse web site audit tool</a> (from a baseline of
37/100).</p>
<p>You might enjoy my presentation, <a class="reference external" href="https://stuff.coffeecode.net/2017/evergreen-progressive-web-app/">We aim to misbehave - Evergreen: Progressive
Web App</a> (yes,
that's a <em>Firefly</em> reference), or you might enjoy poking around the code I posted
in the <a class="reference external" href="http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/dbs/progressive_web_app_example">corresponding branch</a>.
Check out the new <a class="reference external" href="http://git.evergreen-ils.org/?p=working/Evergreen.git;a=tree;f=Open-ILS/examples/pwa;h=278a2eafa948bbf09d164617f337346127fc7a7a;hb=fc584eddbe695e08befed621802aaea281019c9a">pwa examples directory</a>
for a README and the core examples.</p>
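<p>
The branch above has the real changes; as a generic illustration of the
service-worker piece of a PWA (the cached path is a placeholder, and this is
not the code in the branch), a cache-first worker can be as small as:
</p>
<pre>// a generic cache-first service worker sketch
var CACHE = 'eg-static-v1';
self.addEventListener('install', function (event) {
  // pre-cache an entry point; the path is a placeholder
  event.waitUntil(caches.open(CACHE).then(function (cache) {
    return cache.addAll(['/eg/opac/home']);
  }));
});
self.addEventListener('fetch', function (event) {
  // serve from the cache when we can, fall back to the network
  event.respondWith(
    caches.match(event.request).then(function (cached) {
      return cached || fetch(event.request);
    })
  );
});</pre>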
<p>It's far from perfect at this point, but as a proof of concept, I'm quite
pleased, and I think it offers a possible vision of the way forward, particularly
for the <strong>My Account</strong> section of the public catalogue, which really deserves to
become its own app. If nothing else, it has refocused attention on enhancing
Evergreen's web performance, and that can only be a good thing, right?</p>
Querying Evergreen from Google Sheets with custom functions via Apps Script2016-04-15T18:36:00-04:002016-04-15T18:36:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2016-04-15:/querying-evergreen-from-google-sheets-with-custom-functions-via-apps-script.html<p>Our staff were recently asked to check thousands of ISBNs to find out if
we already have the corresponding books in our catalogue. They in turn
asked me if I could run a script that would check it for them. It makes
me happy to work with people who believe in <em>better living through
automation</em> (and saving their time to focus on tasks that only humans
can really achieve).</p>
<p>Rather than taking the approach that I normally would, which would be to
just load the ISBNs into a table in our Evergreen database and then run
some queries to take care of the task as a one-off, I opted to try for
an approach that would enable others to run these sorts of ad hoc reports
themselves. As with most libraries, I suspect, we work with spreadsheets
a lot--and as our university has adopted Google Apps for Education, we
are slowly using Google Sheets more to enable collaboration. So I was
interested in figuring out how to build a custom function that would
look for the ISBN and then return a simple "Yes" or "No" value according
to what it finds.</p>
<p>Evergreen has a robust SRU interface, which makes it easy to run complex
queries and get predictable output back, and it normalizes ISBNs in the
index so that a search for a 10-digit ISBN will return results for the
corresponding 13-digit ISBN. That made figuring out the lookup part of
the job easy; after that, I just needed to figure out how to create a
custom function in Google Sheets.</p>
<p>As it turns out, there's a dead-simple <a class="reference external" href="https://developers.google.com/apps-script/quickstart/macros">introductory tutorial for
creating a custom function in Apps
Script</a>
which tells you how to create a new function. And to make a call to a
web service, there's the
<a class="reference external" href="https://developers.google.com/apps-script/reference/url-fetch/url-fetch-app">URLFetchApp</a>
class. After that, it's a matter of basic JavaScript. In the end, my
custom function looks like the following:</p>
<div class="highlight"><pre><span></span><span class="cm">/**</span>
<span class="cm">* A custom function that checks for an ISBN in Evergreen</span>
<span class="cm">*</span>
<span class="cm">* Returns "Yes" if there is a match, or "No" if there is no match</span>
<span class="cm">*/</span>
<span class="kd">function</span> <span class="nx">checkForISBN</span><span class="p">(</span><span class="nx">isbn</span><span class="p">)</span> <span class="p">{</span>
<span class="kd">var</span> <span class="nx">hostname</span> <span class="o">=</span> <span class="s1">'https://example.org'</span><span class="p">;</span>
<span class="kd">var</span> <span class="nx">urlBase</span> <span class="o">=</span> <span class="nx">hostname</span> <span class="o">+</span> <span class="s1">'/opac/extras/sru'</span><span class="p">;</span>
<span class="cm">/* Supply a numeric or shortname library identifier</span>
<span class="cm"> * to restrict the search to that part of the organization</span>
<span class="cm"> */</span>
<span class="kd">var</span> <span class="nx">libraryID</span> <span class="o">=</span> <span class="s1">'103'</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">libraryID</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">urlBase</span> <span class="o">+=</span> <span class="s1">'/'</span> <span class="o">+</span> <span class="nx">libraryID</span><span class="p">;</span>
<span class="p">}</span>
<span class="nx">urlBase</span> <span class="o">+=</span> <span class="s1">'?version=1.1&operation=searchRetrieve&maximumRecords=1&query='</span><span class="p">;</span>
<span class="kd">var</span> <span class="nx">q</span> <span class="o">=</span> <span class="nb">encodeURIComponent</span><span class="p">(</span><span class="s1">'identifier|isbn:'</span> <span class="o">+</span> <span class="nx">isbn</span><span class="p">);</span>
<span class="kd">var</span> <span class="nx">url</span> <span class="o">=</span> <span class="nx">urlBase</span> <span class="o">+</span> <span class="nx">q</span><span class="p">;</span>
<span class="kd">var</span> <span class="nx">response</span> <span class="o">=</span> <span class="nx">UrlFetchApp</span><span class="p">.</span><span class="nx">fetch</span><span class="p">(</span><span class="nx">url</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">response</span><span class="p">.</span><span class="nx">getContentText</span><span class="p">().</span><span class="nx">search</span><span class="p">(</span><span class="s1">'1'</span><span class="p">)</span> <span class="o">></span> <span class="o">-</span><span class="mf">1</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="s2">"Yes"</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">return</span> <span class="s2">"No"</span><span class="p">;</span>
<span class="p">}</span>
</pre></div>
<p>Then I just add a column beside the column with ISBN values and invoke
the function as (for example) <tt class="docutils literal">=CheckForISBN(C2)</tt>.</p>
<p><img alt="CheckForISBN() function being invoked in a Google Sheet" class="serendipity-image-center" src="/uploads/pics/check_for_isbn.png" style="width: 462px; height: 346px;" /></p>
<p>Given a bit more time, it would be easy to tweak the function to make it
more robust, offer variant search types, and contribute it as a module
to the <a class="reference external" href="https://chrome.google.com/webstore">Chrome Web Store</a> "Sheet
Add-ons" section, but for now I thought you might be interested in it.</p>
<p><strong>Caveats</strong>: With thousands of ISBNs to check, occasionally you'll get
an HTTP response error ("<tt class="docutils literal">#ERROR</tt>") in the column. You can just paste
the formula back in again and it will resubmit the query. The sheet also
seems to resubmit the request on a periodic basis, so some of your "Yes"
or "No" values might change to "<tt class="docutils literal">#ERROR</tt>" as a result.</p>
Library catalogues and HTTP status codes2014-12-29T16:50:00-05:002014-12-29T16:50:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2014-12-29:/library-catalogues-and-http-status-codes.html<p>I noticed in Google's <a class="reference external" href="https://www.google.com/webmasters/tools/">Webmaster
Tools</a> that our catalogue
had been returning some <em>Soft 404s</em>. Curious, I checked into some of the
URIs suffering from this condition, and realized that Evergreen returns
an HTTP status code of <tt class="docutils literal">200 OK</tt> when it serves up a record details
page for a record that has been deleted. The HTML itself has a nice big
red alert box warning users that the record has been deleted to help
humans realize that what was once there is no longer, but machines
typically don't read English. However, at some point in the past few
months, Google started parsing the HTML and recognizing when HTTP status
codes are misleading.</p>
<p>That led me to wonder what happens when you request a record detail page
by ID for a record that doesn't exist in Evergreen. As it turns out, it
currently returns HTTP status code <tt class="docutils literal">200</tt> with a detail page devoid of
any details. Also not good! Being a good little Evergreen community
member, I <a class="reference external" href="https://bugs.launchpad.net/evergreen/+bug/1406025">opened a
bug</a> and put
together a fairly simple fix so that the catalogue will return a
<tt class="docutils literal">404 Not Found</tt> for non-existent records and <tt class="docutils literal">410 Gone</tt> for deleted
records. Huzzah for HTTP standards compliance. We build a better web one
small step at a time.</p>
<p>That, in turn, led me to wonder what happens when you request record
details for non-existent records in other library systems. Here's what I
found:</p>
<ul class="simple">
<li><strong>Bibliocommons</strong>: Status <tt class="docutils literal">302 Moved temporarily</tt> that then leads
back to an empty search form. Not good.</li>
<li><strong>Blacklight</strong>: Status <tt class="docutils literal">404 Not Found</tt>. Good!</li>
<li><strong>Encore</strong>: N/A - appears to serve up session-based URLs for records.
Really?</li>
<li><strong>III</strong>: Status <tt class="docutils literal">200 OK</tt>. Not good.</li>
<li><strong>Koha</strong>: Status <tt class="docutils literal">302 Found</tt> with a <tt class="docutils literal">Location:</tt> header leading to
a page with a status <tt class="docutils literal">404 Not Found</tt>. That redirect probably makes
it harder for machines to recognize that the resource does not exist
at all than if it directly returned a <tt class="docutils literal">404</tt>.</li>
<li><strong>Polaris</strong>: N/A - it seems that the normal web interface doesn't
link directly to titles; instead it serves up titles in the context
of search results by position. The mobile web interface offers
persistent URLs, but requests for non-existent records return a
status <tt class="docutils literal">302 Found</tt> that redirects back to an empty search form. Not
good.</li>
<li><strong>Primo (using a permalink)</strong>: Status <tt class="docutils literal">302 Found</tt> that then leads
to an empty record details page with a status <tt class="docutils literal">200 OK</tt>. Not good.</li>
<li><strong>Symphony</strong>: N/A - I tried a few systems (Houston Public Library,
Oxnard Public Library) and it seems SirsiDynix still doesn't use
persistent URLs, nor surface permalinks for records in the default
interface.</li>
<li><strong>Voyager</strong>: Status <tt class="docutils literal">200 OK</tt>. Not good.</li>
<li><strong>VuFind</strong>: Status <tt class="docutils literal">404 Not Found</tt>. Good!</li>
<li><strong>WorldCat</strong>: Status <tt class="docutils literal">200 OK</tt>. Not good.</li>
</ul>
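<p>
If you want to repeat this survey against your own catalogue, a quick check
(the hostname and the deliberately non-existent record ID below are just
placeholders) looks like:
</p>
<pre>curl -s -o /dev/null -w "%{http_code}\n" "https://example.org/eg/opac/record/999999999"</pre>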
<p>Overall, this is a pretty dismal picture of the state of some of the
most commonly used library catalogue systems when it comes to compliance
with basic web standards. Kudos to Blacklight and VuFind for getting it
right--and assuming that my branch gets integrated, Evergreen should
join them in the near future.</p>
<p><img alt="404 Library Catalogue Web Standards Compliance Not Found" src="/uploads/files/404-web-standards-compliance.png" /></p>
Putting the "Web" back into Semantic Web in Libraries 20142014-12-04T21:15:00-05:002014-12-04T21:15:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2014-12-04:/putting-the-web-back-into-semantic-web-in-libraries-2014.html<p>I was honoured to lead a workshop and speak at this year's edition of <a class="reference external" href="http://swib.org/swib14">Semantic
Web in Bibliotheken (SWIB)</a> in Bonn, Germany. It was
an amazing experience; there were so many rich projects being described with
obvious dividends for the users of libraries, once again the European library
community fills …</p><p>I was honoured to lead a workshop and speak at this year's edition of <a class="reference external" href="http://swib.org/swib14">Semantic
Web in Bibliotheken (SWIB)</a> in Bonn, Germany. It was
an amazing experience; there were so many rich projects being described with
obvious dividends for the users of libraries. Once again, the European library
community fills me with hope for the future success of the semantic web.</p>
<p>The subject of my talk "Cataloguing for the open web with RDFa and schema.org"
(<a class="reference external" href="/swib14/talk">slides</a> and
<a class="reference external" href="http://www.scivee.tv/node/63282">video recording</a> - <em>gulp</em>) pivoted while
I was preparing materials for the workshop. I was searching library catalogues
around Bonn looking for a catalogue with persistent URIs that I could use for
an example. To my surprise, catalogue after catalogue used session-based URLs;
it took me quite some time before I was able to find ULB, which hosted a
VuFind front end for their catalogue. Even then, the <tt class="docutils literal">robots.txt</tt> restricted
crawling by any user agent. This reminded me rather depressingly of my findings
from current "discovery layers", which entirely restrict crawling and therefore
put libraries into a black hole on the web.</p>
<p>These findings in the wild are so antithetical to the basic principles of
enabling discovery of web resources that, in a conference about the semantic
web, I opted to spend over half of my talk making the argument that libraries
need to pay attention to the old-fashioned web of documents first and foremost.</p>
<p>The basic building blocks that I advocated were, in priority order:</p>
<ul class="simple">
<li>Persistent URIs, on which everything else is built</li>
<li>Sitemaps, to facilitate discovery of your resources</li>
<li>A robots.txt file to filter portions of your website that should not
be crawled (for example, search results pages; see the sketch after this list)</li>
<li>RDFa, microdata, or JSON-LD only after you've sorted out the first
three</li>
</ul>
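<p>
A minimal robots.txt sketch for an Evergreen-style catalogue (the hostname is
a placeholder; the paths mirror Evergreen's <code>/eg/opac</code> structure)
might look like:
</p>
<pre>User-agent: *
# keep crawlers out of search results pages
Disallow: /eg/opac/results
# record detail pages remain crawlable by default
Sitemap: https://example.org/sitemap.xml</pre>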
<p>Only after setting that foundation did I feel comfortable launching into my
rationale for RDFa and schema.org as a tool for enabling discovery on the web:
a mapping of the access points that cataloguers create to the world of HTML and
aggregators. The key point for SWIB was that RDFa and schema.org can enable
full RDF expressions in HTML; that is, we can, should, and must go beyond
surfacing structured data to surfacing linked data through <tt class="docutils literal">@resource</tt>
attributes and <a class="reference external" href="http://schema.org/sameAs">schema:sameAs</a> properties.</p>
<blockquote>
The Semantic Web is an extension of the current web in which information is
given well-defined meaning, better enabling computers and people to work in
cooperation. --Tim Berners-Lee, Scientific American, 2001</blockquote>
<p>I also argued that using RDFa to enrich the document web was, in fact,
truer to Berners-Lee's 2001 definition of the semantic web, and that we should
focus on enriching the document web so that both humans and machines can
benefit before investing in building an entirely separate and disconnected
semantic web.</p>
<p>I was worried that my talk would not be well received; that it would be
considered obvious, or scolding, or just plain off-topic. But to my relief I
received a great deal of positive feedback. And on the next day, both Eric
Miller and Richard Wallis gave talks on a similar, but more refined, theme:
that libraries need to do a much, much better job of enabling their resources
to be found on the web--not by people who already use our catalogues, but by
people who are <em>not</em> library users today.</p>
<p>There were also some requests for clarification, which I'll try to address
generally here (for the benefit of anyone who wasn't able to talk with me, or
who might watch the livestream in the future).</p>
<div class="section" id="when-you-said-anything-could-be-described-in-schema-org-did-you-mean-we-should-throw-out-marc-and-bibframe-and-ead">
<h2>"When you said anything could be described in schema.org, did you mean we should throw out MARC and BIBFRAME and EAD?"</h2>
<p><em>tldr:</em> I intended <strong>and</strong>, not <strong>instead of</strong>!</p>
<p>The first question I was asked was whether there was anything that I had not
been able to describe in schema.org, to which I answered "No"--especially given
the work that the W3C SchemaBibEx group had done to ensure that some of the
core bibliographic requirements were added to the vocabulary. It was not as
coherent or full a response as I would have liked to have made; I blame the
livestream camera <img alt=":-)" class="emoticon" src="/images/smile.png" /></p>
<p>But combined with a part of the presentation where I countered a myth about
schema.org being a very coarse vocabulary by pointing out that it actually
contained 600 classes and over 800 properties, a number of the attendees
interpreted one of the takeaways of my talk as suggesting that libraries should
adopt schema.org as <em>the</em> descriptive vocabulary, and that MARC, BIBFRAME, EAD,
RAD, RDA, and other approaches for describing library resources were no longer
necessary.</p>
<p>This is not at all what I'm advocating! To expand on my response, you <em>can</em>
describe anything in schema.org, but you might lose significant amounts of
richness in your description. For example, short stories and poems would best
be described in schema.org as a <a class="reference external" href="http://schema.org/CreativeWork">CreativeWork</a>. You would have to look at the associated
description or keyword properties to be able to figure out the form of the
work.</p>
<p>What I was advocating was that you should map your rich bibliographic
description into corresponding schema.org classes and properties in RDFa at the
time you generate the HTML representation of that resource and its associated
entities. So your poem might be represented as a
<a class="reference external" href="http://schema.org/CreativeWork">CreativeWork</a>, with <a class="reference external" href="http://schema.org/name">name</a>, <a class="reference external" href="http://schema.org/author">author</a>,
<a class="reference external" href="http://schema.org/description">description</a>, <a class="reference external" href="http://schema.org/keywords">keywords</a>, and <a class="reference external" href="http://schema.org/about">about</a> values
and relationships. Ideally, the <tt class="docutils literal">author</tt> will include at least one link
(either via <a class="reference external" href="http://schema.org/sameAs">sameAs</a>, <a class="reference external" href="http://schema.org/url">url</a>, or <tt class="docutils literal">@resource</tt>) to an entity on the web; and you
could do the same with <tt class="docutils literal">about</tt> if you are using a controlled vocabulary.</p>
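<p>
To make that mapping concrete, here is a minimal RDFa sketch of such a poem
(every URI below is a placeholder, not a real identifier):
</p>
<pre>&lt;div vocab="http://schema.org/" typeof="CreativeWork" resource="#work"&gt;
  &lt;h2 property="name"&gt;An Example Poem&lt;/h2&gt;
  &lt;span property="author" typeof="Person" resource="https://example.org/person/1"&gt;
    &lt;span property="name"&gt;Jane Example&lt;/span&gt;
    &lt;link property="sameAs" href="https://example.org/authorities/jane-example" /&gt;
  &lt;/span&gt;
  &lt;div property="description"&gt;A short poem about examples.&lt;/div&gt;
  &lt;meta property="keywords" content="poetry" /&gt;
  &lt;link property="about" href="https://example.org/subjects/poetry" /&gt;
&lt;/div&gt;</pre>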
<p>If you take that approach, then you can serve up schema.org descriptions of
works in HTML that most web-oriented clients will understand (such as search
engines) and provide basic access points such as name / author / keywords,
while retaining and maintaining the full richness of the underlying
bibliographic description--and potentially providing access to that, too, as
part of the embedded RDFa, via content negotiation, or <tt class="docutils literal">&lt;link rel=""&gt;</tt>, for
clients that can interpret richer formats.</p>
</div>
<div class="section" id="what-makes-you-think-google-will-want-to-surface-library-holdings-in-search-results">
<h2>"What makes you think Google will want to surface library holdings in search results?"</h2>
<p>There is a perception that Google and other search engines just want to sell
ads, or their own products (such as Google Books). While Google certainly does
want to sell ads and products, they also want to be the most useful tool for
satisfying users' information needs--possibly so they can learn more about
those users and put more effective ads in front of them--but nonetheless, the
motivation is there.</p>
<p>Imagine marking up your resources with the Product / Offer portion of
schema.org you are able to provide search engines with availability information
in the same way that Best Buy, AbeBooks, and other online retailers do (as
Evergreen, Koha, and VuFind already do). That makes it much easier for the
search engines to use everything they may know about their users, such as their
current location, their institutional affiliations, their typical commuting
patterns, their reading and research preferences... to provide a link to a
library's electronic or print copy of a given resource in a knowledge graph box
as one of the possible ways of satisfying that person's information needs.</p>
<p>We don't see it happening with libraries running Evergreen, Koha, and VuFind
yet, realistically because the open source library systems don't have enough
penetration to make it worth a search engine's effort to add that to their set
of possible sources. However, if we as an industry make a concerted effort to
implement this as a standard part of crawlable catalogue or discovery record
detail pages, then it wouldn't surprise me in the least to see such suggestions
start to appear. The best proof that we have that Google, at least, is
interested in supporting discovery of library resources is the continued
investment in Google Scholar.</p>
<p>And as I argued during my talk, even if the search engines never add direct
links to library resources from search results or knowledge graph sidebars,
having a reasonably simple standard like the GoodRelations product / offer
pattern for resource availability enables new web-based approaches for building
applications. One example could be a fulfillment system that uses sitemaps to
intelligently crawl all of its participating libraries, normalizes the item
request to a work URI, and checks availability by parsing the offers at the
corresponding URIs.</p>
</div>
Putting the "Web" back into Semantic Web in Libraries 20142014-12-04T21:15:00-05:002014-12-04T21:15:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2014-12-04:/putting-the-web-back-into-semantic-web-in-libraries-2014.html<p>I was honoured to lead a workshop and speak at this year's edition of <a class="reference external" href="http://swib.org/swib14">Semantic
Web in Bibliotheken (SWIB)</a> in Bonn, Germany. It was
an amazing experience; there were so many rich projects being described with
obvious dividends for the users of libraries, once again the European library
community fills …</p><p>I was honoured to lead a workshop and speak at this year's edition of <a class="reference external" href="http://swib.org/swib14">Semantic
Web in Bibliotheken (SWIB)</a> in Bonn, Germany. It was
an amazing experience; there were so many rich projects being described with
obvious dividends for the users of libraries, once again the European library
community fills me with hope for the future success of the semantic web.</p>
<p>The subject of my talk "Cataloguing for the open web with RDFa and schema.org"
(<a class="reference external" href="/swib14/talk">slides</a> and
<a class="reference external" href="http://www.scivee.tv/node/63282">video recording</a> - <em>gulp</em>) pivoted while
I was preparing materials for the workshop. I was searching library catalogues
around Bonn looking for a catalogue with persistent URIs that I could use for
an example. To my surprise, catalogue after catalogue used session-based URLs;
it took me quite some time before I was able to find ULB, who had hosted a
VuFind front end for their catalogue. Even then, the <tt class="docutils literal">robots.txt</tt> restricted
crawling by any user agent. This reminded me rather depressingly of my findings
from current "discovery layers", which entirely restrict crawling and therefore
put libraries into a black hole on the web.</p>
<p>These findings in the wild are so antithetical to the basic principles of
enabling discovery of web resources that, in a conference about the semantic
web, I opted to spend over half of my talk making the argument that libraries
need to pay attention to the old-fashioned web of documents first and foremost.</p>
<p>The basic building blocks that I advocated were, in priority order:</p>
<ul class="simple">
<li>Persistent URIs, on which everything else is built</li>
<li>Sitemaps, to facilitate discovery of your resources</li>
<li>A robots.txt file to filter portions of your website that should not
be crawled (for example, search results pages)</li>
<li>RDFa, microdata, or JSON-LD only after you've sorted out the first
three</li>
</ul>
<p>Only after setting that foundation did I feel comfortable launching into my
rationale for RDFa and schema.org as a tool for enabling discovery on the web:
a mapping of the access points that cataloguers create to the world of HTML and
aggregators. The key point for SWIB was that RDFa and schema.org can enable
full RDF expressions in HTML; that is, we can, should, and must go beyond
surfacing structured data to surfacing linked data through <tt class="docutils literal">@resource</tt>
attributes and <a class="reference external" href="http://schema.org/sameAs">schema:sameAs</a> properties.</p>
<blockquote>
The Semantic Web is an extension of the current web in which information is
given well-defined meaning, better enabling computers and people to work in
cooperation. Tim Berners-Lee, Scientific American, 2001</blockquote>
<p>I also argued that using RDFa to enrich the document web was, in fact,
truer to Berners-Lee's 2001 definition of the semantic web, and that we should
focus on enriching the document web so that both humans and machines can
benefit before investing in building an entirely separate and disconnected
semantic web.</p>
<p>I was worried that my talk would not be well received; that it would be
considered obvious, or scolding, or just plain off-topic. But to my relief I
received a great deal of positive feedback. And on the next day, both Eric
Miller and Richard Wallis gave talks on a similar, but more refined, theme:
that libraries need to do a much, much better job of enabling their resources
to be found on the web--not by people who already use our catalogues, but by
people who are <em>not</em> library users today.</p>
<p>There were also some requests for clarification, which I'll try to address
generally here (for the benefit of anyone who wasn't able to talk with me, or
who might watch the livestream in the future).</p>
<div class="section" id="when-you-said-anything-could-be-described-in-schema-org-did-you-mean-we-should-throw-out-marc-and-bibframe-and-ead">
<h2>"When you said anything could be described in schema.org, did you mean we should throw out MARC and BIBFRAME and EAD?"</h2>
<p><em>tldr:</em> I intended <strong>and</strong>, not <strong>instead of</strong>!</p>
<p>The first question I was asked was whether there was anything that I had not
been able to describe in schema.org, to which I answered "No"--especially since
the work that the W3C SchemaBibEx group had done to ensure that some of the
core bibliographic requirements were added to the vocabulary. It was not as
coherent or full a response as I would have liked to have made; I blame the
livestream camera <img alt=":-)" class="emoticon" src="/images/smile.png" /></p>
<p>But combined with a part of the presentation where I countered a myth about
schema.org being a very coarse vocabulary by pointing out that it actually
contained 600 classes and over 800 properties, a number of the attendees
interpreted one of the takeaways of my talk as suggesting that libraries should
adopt schema.org as <em>the</em> descriptive vocabulary, and that MARC, BIBFRAME, EAD,
RAD, RDA, and other approaches for describing library resources were no longer
necessary.</p>
<p>This is not at all what I'm advocating! To expand on my response, you <em>can</em>
describe anything in schema.org, but you might lose significant amounts of
richness in your description. For example, short stories and poems would best
be described in schema.org as a <a class="reference external" href="http://schema.org/CreativeWork">CreativeWork</a>. You would have to look at the associated
description or keyword properties to be able to figure out the form of the
work.</p>
<p>What I was advocating was that you should map your rich bibliographic
description into corresponding schema.org classes and properties in RDFa at the
time you generate the HTML representation of that resource and its associated
entities. So your poem might be represented as a
href="<a class="reference external" href="http://schema.org/CreativeWork">http://schema.org/CreativeWork</a>">CreativeWork, with a <a class="reference external" href="http://schema.org/name">name</a>, <a class="reference external" href="http://schema.org/author">author</a>,
<a class="reference external" href="http://schema.org/description">description</a>, <a class="reference external" href="http://schema.org/keywords">keywords</a>, and <a class="reference external" href="http://schema.org/about">about</a> values
and relationships. Ideally, the <tt class="docutils literal">author</tt> will include at least one link
(either via <a class="reference external" href="http://schema.org/sameAs">sameAs</a>, <a class="reference external" href="http://schema.org/url">url</a>, or <tt class="docutils literal">@resource</tt>) to an entity on the web; and you
could do the same with <tt class="docutils literal">about</tt> if you are using a controlled vocabulary.</p>
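<p>To make that concrete, here is a minimal sketch of the kind of RDFa markup I have in mind. The entity URIs are invented for illustration; in practice you would use whatever authority or identifier URIs your system knows about:</p>
<pre class="literal-block">
<div vocab="http://schema.org/" typeof="CreativeWork" resource="#work">
  <h1 property="name">The Cremation of Sam McGee</h1>
  <!-- @resource gives the author an identity; sameAs links it to the wider web -->
  <span property="author" typeof="Person" resource="http://example.org/people/robert-w-service">
    <span property="name">Robert W. Service</span>
    (<a property="sameAs" href="https://en.wikipedia.org/wiki/Robert_W._Service">Wikipedia</a>)
  </span>
  <p property="description">A ballad of the Klondike gold rush.</p>
  <meta property="keywords" content="poetry, ballad" />
  <!-- A controlled vocabulary term, linked rather than left as a literal -->
  <span property="about" resource="http://example.org/subjects/yukon">Yukon</span>
</div>
</pre>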
<p>If you take that approach, then you can serve up schema.org descriptions of
works in HTML that most web-oriented clients will understand (such as search
engines) and provide basic access points such as name / author / keywords,
while retaining and maintaining the full richness of the underlying
bibliographic description--and potentially providing access to that, too, as
part of the embedded RDFa, via content negotiation, or <tt class="docutils literal"><link <span class="pre">rel=""></span></tt>, for
clients that can interpret richer formats.</p>
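<p>For instance, a record detail page could advertise a richer serialization to linked data clients with something like the following sketch (the paths and formats are hypothetical):</p>
<pre class="literal-block">
<!-- In the page <head>: point richer clients at full RDF serializations -->
<link rel="alternate" type="text/turtle" href="/record/1234.ttl" />
<link rel="alternate" type="application/rdf+xml" href="/record/1234.rdf" />
</pre>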
</div>
<div class="section" id="what-makes-you-think-google-will-want-to-surface-library-holdings-in-search-results">
<h2>"What makes you think Google will want to surface library holdings in search results?"</h2>
<p>There is a perception that Google and other search engines just want to sell
ads, or their own products (such as Google Books). While Google certainly does
want to sell ads and products, they also want to be the most useful tool for
satisfying users' information needs--possibly so they can learn more about
those users and put more effective ads in front of them--but nonetheless, the
motivation is there.</p>
<p>By marking up your resources with the Product / Offer portion of
schema.org, you are able to provide search engines with availability information
in the same way that Best Buy, AbeBooks, and other online retailers do (as
Evergreen, Koha, and VuFind already do). That makes it much easier for the
search engines to use everything they may know about their users, such as their
current location, their institutional affiliations, their typical commuting
patterns, their reading and research preferences... to provide a link to a
library's electronic or print copy of a given resource in a knowledge graph box
as one of the possible ways of satisfying that person's information needs.</p>
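<p>A rough sketch of that availability markup (the URIs are invented), with each copy expressed as an Offer whose seller is the holding library:</p>
<pre class="literal-block">
<div vocab="http://schema.org/" typeof="Book Product" resource="#record">
  <span property="name">Example title</span>
  <!-- One Offer per copy; seller points at the holding library -->
  <div property="offers" typeof="Offer">
    <link property="availability" href="http://schema.org/InStock" />
    <link property="seller" href="http://example.org/library/main" />
  </div>
</div>
</pre>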
<p>We don't see it happening with libraries running Evergreen, Koha, and VuFind
yet, realistically because the open source library systems don't have enough
penetration to make it worth a search engine's effort to add that to their set
of possible sources. However, if we as an industry make a concerted effort to
implement this as a standard part of crawlable catalogue or discovery record
detail pages, then it wouldn't surprise me in the least to see such suggestions
start to appear. The best proof that we have that Google, at least, is
interested in supporting discovery of library resources is the continued
investment in Google Scholar.</p>
<p>And as I argued during my talk, even if the search engines never add direct
links to library resources from search results or knowledge graph sidebars,
having a reasonably simple standard like the GoodRelations product / offer
pattern for resource availability enables new web-based approaches for building
applications. One example could be a fulfillment system that uses sitemaps to
intelligently crawl all of its participating libraries, normalizes the item
request to a work URI, and checks availability by parsing the offers at the
corresponding URIs.</p>
</div>
How discovery layers have closed off access to library resources, and other tales of schema.org from LITA Forum 20142014-11-08T16:41:00-05:002014-11-08T16:41:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2014-11-08:/how-discovery-layers-have-closed-off-access-to-library-resources-and-other-tales-of-schemaorg-from-lita-forum-2014.html<p>At the LITA Forum yesterday, I accused
(<a class="reference external" href="http://stuff.coffeecode.net/2014/lita_forum">presentation</a>) most
discovery layers of not solving the discoverability problems of
libraries, but instead exacerbating them by launching us headlong to a
closed, unlinkable world. Coincidentally, Lorcan Dempsey's opening
keynote contained a subtle criticism of discovery layers. I wasn't that
subtle.</p>
<p>Here's why I believe commercial discovery layers are not "of the web":
check out their <a class="reference external" href="http://robotstxt.org">robots.txt</a> files. If
you're not familiar with robots.txt files, these are what search engines
and other well-behaved automated crawlers of web resources use to
determine whether they are allowed to visit and index the content of
pages on a site. Here's what the <tt class="docutils literal">robots.txt</tt> files look like for a
few of the best-known discovery layers:</p>
<pre class="literal-block">
User-Agent: *
Disallow: /
</pre>
<p>That effectively says "Go away, machines; your kind isn't wanted in
these parts." And that, in turn, closes off access to your libraries
resources to search engines and other aggregators of content, and is
completely counter to the overarching desire to evolve to a linked open
data world.</p>
<p>During the question period, Marshall Breeding challenged my assertion as
being unfair to what are meant to be merely indexes of library content.
I responded that most libraries have replaced their catalogues with
discovery layers, closing off open access to what have traditionally
been their core resources, and he rather quickly conceded that that
was indeed a problem.</p>
<p>(By the way, a possible solution might be to simply offer two different
URL patterns, something like <tt class="docutils literal">/library/*</tt> for library-owned resources
to which access should be granted, and <tt class="docutils literal">/licensed/*</tt> for resources to
which open access to the metadata is problematic due to licensing
issues, and which robots can therefore be restricted from accessing.)</p>
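<p>In robots.txt terms, that split might look something like this sketch (the path prefixes are hypothetical):</p>
<pre class="literal-block">
User-Agent: *
# /library/* pages remain crawlable by default
Disallow: /licensed/
</pre>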
<p>Compared to commercial discovery layers on my very handwavy usability
vs. discoverability plot, general search engines rank pretty high on
both axes; they're the ready-at-hand tool in browser address bars. And
they grok schema.org, so if we can improve our discoverability by
publishing schema.org data, maybe we get a discoverability win for our
users.</p>
<p>But even if we don't (SEO is a black art at best, and maybe the general
search engines won't find the right mix of signals that makes them
decide to boost the relevancy of our resources for specific users in
specific locations at specific times) we get access to that structured
data across systems in an extremely reusable way. With sitemaps, we can
build our own specialized search engines (Solr or ElasticSearch or
Google Custom Search Engine or whatever) that represent specific use
cases. Our more sophisticated users can piece together data to, for
example, build dynamic lists of collections, using a common,
well-documented vocabulary and tools rather than having to dip into the
arcane world of library standards (Z39.50 and MARC21).</p>
<p>So why not iterate our way towards the linked open data future by
building on what we already have now?</p>
<p>As <a class="reference external" href="http://kcoyle.blogspot.ca/2014/10/schemaorg-where-it-works.html">Karen Coyle
wrote</a>
in a much more elegant fashion, the transition looks roughly like:</p>
<ul class="simple">
<li>Stored data -> transform/template -> human readable HTML page</li>
<li>Stored data -> transform/template (tweaked) -> machine & human
readable HTML page</li>
</ul>
<p>That is, by simply tweaking the same mechanism you already use to
generate a human readable HTML page from the data you have stored in a
database or flat files or what have you, you can embed machine readable
structured data as well.</p>
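<p>As a sketch of how small that tweak can be, compare a plain template fragment with an RDFa-enriched equivalent (the title and author are just placeholders):</p>
<pre class="literal-block">
<!-- Before: human readable only -->
<h1>A Tale of Two Cities</h1>
<p>by Charles Dickens</p>

<!-- After: the same page, now also machine readable -->
<div vocab="http://schema.org/" typeof="Book">
  <h1 property="name">A Tale of Two Cities</h1>
  <p>by <span property="author" typeof="Person"><span property="name">Charles Dickens</span></span></p>
</div>
</pre>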
<p>That is, in fact, exactly the approach I took with Evergreen, VuFind,
and Koha. And they now expose structured data and generate sitemaps out
of the box using the same old MARC21 data. Evergreen even exposes
information about libraries (locations, contact information, hours of
operation) so that you can connect its holdings to specific locations.</p>
<p>And what about all of our resources outside of the catalogue? Research
guides, fonds descriptions, institutional repositories, publications...
I've been lucky enough to be working with Camilla McKay and Karen Coyle
on applying the same process to the Bryn Mawr Classical Review. At this
stage, we're exposing basic entities (<a class="reference external" href="http://schema.org/Review">Reviews</a>
and <a class="reference external" href="http://schema.org/Person">People</a>) largely as literals, but
we're laying the groundwork for future iterations where we link them up
to external entities. And all of this is built on a Tcl + SGML
infrastructure.</p>
<p>So why schema.org? It has the advantage of being a de-facto generalized
vocabulary that can be understood and parsed across many different
domains, from car dealerships to streaming audio services to libraries,
and it can be relatively simply embedded into existing HTML as long as
you can modify the templating layer of your system.</p>
<p>And schema.org offers much more than just static structured data;
schema.org Actions are surfacing in applications like Gmail as a way of
providing directly actionable links--and there's no reason we shouldn't
embrace that approach to expose "SearchAction", "ReadAction",
"WatchAction", "ListenAction", "ViewAction"--and "OrderAction"
(Request), "BorrowAction" (Borrow or Renew), "Place on Reserve", and
other common actions as a standardized API that exists well beyond
libraries (see Hydra for a developing approach to this problem).</p>
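<p>As a speculative sketch, a BorrowAction could be expressed in RDFa with schema.org's Action / EntryPoint pattern (the URL template is invented):</p>
<pre class="literal-block">
<div vocab="http://schema.org/" typeof="Book" resource="#record">
  <div property="potentialAction" typeof="BorrowAction">
    <meta property="name" content="Borrow" />
    <!-- The EntryPoint tells a client where it can invoke the action -->
    <div property="target" typeof="EntryPoint">
      <meta property="urlTemplate" content="https://catalogue.example.org/borrow?copy={copy_id}" />
    </div>
  </div>
</div>
</pre>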
<p>I want to thank Richard Wallis for inviting me to co-present with him;
it was a great experience, and I really enjoy meeting and sharing with
others who are putting linked data theory into practice.</p>
DCMI 2014: schema.org holdings in open source library systems2014-10-14T01:07:00-04:002014-10-14T01:07:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2014-10-14:/dcmi-2014-schemaorg-holdings-in-open-source-library-systems.html<p>My slides from DCMI 2014:
<a class="reference external" href="http://stuff.coffeecode.net/2014/dcmi_schemabibex/#/">schema.org in the wild: open source libraries++</a>.</p>
<p>Last week I was at the
<a class="reference external" href="http://dcevents.dublincore.org/IntConf/dc-2014">Dublin Core Metadata</a>
Initiative 2014 conference, where Richard Wallis, Charles McCathie Nevile and
I were slated to present on schema.org and the work of the W3C Schema.org
Bibliographic Extension Community Group (#schemabibex). As a first-timer at
DCMI, I wasn't sure what kind of an audience to expect: there is a
peer-reviewed papers track, and a series of sessions on a truly intimidating
topic (RDF Application Profiles), but on the other hand our own topic was
fairly basic. As it turned out, there was an invigoratingly mixed set of
backgrounds present, and Eric Miller's opening keynote, which gave an oral
history of the origins of DCMI and a look towards the future challenges for the
organization, reassured me that I wasn't going to be out of my depth.</p>
<p>Special kudos to Eric for his analogy of the Web to a credit card, which offers
both human-readable and machine-readable data. A nice, clean image!</p>
<p>Richard, Charles and I opted to structure our 1.5 hour session as a series of
short talks followed by a long period of discussion. However, as often happens,
the excitement of speaking in front of a room that drew so many attendees that
we had to jam in more chairs led to that plan breaking down. I cut my own
materials back to illustrating how one of my primary contributions to the
#schemabibex effort--representing library holdings using schema.org's
GoodRelations-based Product/Offer model--had been implemented in free software
library systems, including Evergreen, Koha, and VuFind. I walked from a basic
bibliographic record (represented as a
<a class="reference external" href="http://schema.org/Product">Product</a>), through to the associated borrowable
items (represented as <a class="reference external" href="http://schema.org/Offer">Offers</a> with a price of
$0.00, call numbers as <a class="reference external" href="http://schema.org/sku">SKUs</a>, and barcodes as
<a class="reference external" href="http://schema.org/serialNumber">serialNumbers</a>), that were offered by a
specific <a class="reference external" href="http://schema.org/Library">Library</a> with its own set of operating
hours, address, and contact information... all published out of the box as RDFa
in modern Evergreen systems.</p>
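<p>In outline, that holdings markup looks something like the following sketch (the call number, barcode, and library are invented; what Evergreen actually emits is richer):</p>
<pre class="literal-block">
<div vocab="http://schema.org/" typeof="Book Product" resource="#record">
  <span property="name">Example title</span>
  <!-- Each borrowable copy is an Offer with a price of $0.00 -->
  <div property="offers" typeof="Offer">
    <meta property="price" content="0.00" />
    <meta property="priceCurrency" content="USD" />
    Call number: <span property="sku">PS8537 .E73</span>
    Barcode: <span property="serialNumber">31234000123456</span>
    <div property="seller" typeof="Library">
      <span property="name">Example Branch Library</span>
      <meta property="openingHours" content="Mo-Fr 09:00-17:00" />
    </div>
  </div>
</div>
</pre>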
<p>I did stray a little to posit that the use case for schema.org is not and
should not be limited to "search engine optimization", but that this very
simple level of structured data could fairly easily form the basis of an API.
In the rather limited discussion that we were able to hold at the end of the
session (and encroaching on break time), Charles counselled that libraries
shouldn't really bother with dumbing down their beautiful metadata simply to
publish schema.org... while I countered that the pursuit of publishing
beautiful metadata in the past has generally led librarians to publish no
metadata at all, and that schema.org was a great first step towards building a
web of cultural heritage metadata meant for machine consumption.</p>
<p>I wish I could have stayed longer at DCMI, but it was Thanksgiving in Canada
and there were families to visit and feast with--not to mention children to
help take care of--so I had to depart after just a day and a half. I'm
encouraged by the steps the organization is taking to renew itself, and I hope
to be able to participate again in the future.</p>
Cataloguing for the open web: schema.org in library catalogues and websites2014-07-01T20:00:00-04:002014-07-01T20:00:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2014-07-01:/cataloguing-for-the-open-web-schemaorg-in-library-catalogues-and-websites.html<div vocab="http://schema.org/" typeof="Article"><p><em>tldr;</em> my slides are
href="<a class="reference external" href="http://stuff.coffeecode.net/2014/understanding_schema">http://stuff.coffeecode.net/2014/understanding_schema</a>">here, and the
slides from Jenn and Jason are also available from
href="<a class="reference external" href="http://connect.ala.org/node/222959">http://connect.ala.org/node/222959</a>">ALA Connect.</p>
<p>On Sunday, June 29th Jenn Riley, Jason Clark, and I presented at the ALCTS/LITA
jointly sponsored session <a class="reference external" href="http://ala14.ala.org/node/14382">Understanding
schema.org</a>. The build-up to the session was pretty amazing; I was delighted to
learn that Jason and I had been working on pretty much parallel efforts over
the past couple of years. Jenn did a great job of organizing the session, and
by the time we started talking 276 people had indicated their interest in
attending: that was two more than those who had indicated an interest in
attending the BIBFRAME Forum Update scheduled in the same time slot. Our room
was large and quite full.</p>
<p>Jenn started the session out strong by advancing her concept that libraries
need to target <em>discovery elsewhere</em>: that is, that there is no way that
libraries can compete directly with major search engines like Google, Bing, and
Yahoo, either through the discovery tools that we have to offer, our presence
in the consciousness of most of the population as the starting point for
discovery, or in the resources we can direct towards closing the huge gap in
technology, usability, and mindshare that the search engines have opened up
over the past two decades. <em>But</em>, we can take steps to start working with the
search engines to enable our resources to be discovered and accessed more
directly by them.</p>
<p>That led quite naturally to my own part of the session, in which I talked about
my attempt to turn cataloguing's efforts to provide access points in our niche
catalogues into access points for the open web by publishing schema.org
structured data from library catalogues like Evergreen, Koha, and VuFind. I
started things out by pointing out the legacy of restrictive <tt class="docutils literal">robots.txt</tt>
files that still live on in many catalogues today, then worked through some
basics like how sitemaps enable search engines--which strive to provide
relevant, useful results that matter to users in their context at a particular
place and time--to efficiently crawl just the most recently changed pages of
interest. Then I launched into the heart of the talk that showed how catalogues
that publish schema.org structured data can turn an undifferentiated mass of
presentation-oriented HTML and words into machine-comprehensible entities:
classes like <tt class="docutils literal">Book</tt> and <tt class="docutils literal">Organization</tt>, connected by properties like
<tt class="docutils literal">publisher</tt>, and with values for properties like <tt class="docutils literal">author</tt>,
<tt class="docutils literal">datePublished</tt>, and <tt class="docutils literal">isbn</tt>.</p>
<p>For this talk I used visualizations generated by the
<a class="reference external" href="http://rdfa.info/play">RDFa playground</a> to illustrate the structured data
contained in some real examples of a production Evergreen system (thanks to
<a class="reference external" href="http://biblio.org">Bibliomation</a>). Given that I'm normally a text-and-talk
kind of guy, the illustrations seemed to help out--particularly in showing how
holdings map quite readily to the <tt class="docutils literal">Product</tt> / <tt class="docutils literal">Offer</tt> structure more
commonly used by commercial enterprises to reflect the prices, locations, and
availability of their products.</p>
<p>Of course, the evolution from unstructured, to structured, to linked data had
its payoff beginning with the link from holdings to the libraries that hold the
resources. We have plenty more we can and must do, but unlike other efforts
which are still crystallizing and which will require significant architectural
work to happen before libraries can even begin trying out real systems, you can
use schema.org-enabled systems <em>today</em>. And adapting systems to publish
schema.org structured data only requires access to the HTML templates for your
system (which, hopefully, you have: otherwise you have bigger problems to deal
with!) and following the patterns that have already been established by
Evergreen, Koha, and VuFind.</p>
<p>Jason did a great job showing both a broader use case for schema.org, including
work he has led on digital collections such as embedding the <tt class="docutils literal">Recipe</tt> type in
a book of recipes. And he covered some of the evolution of the vocabulary,
including the exciting possibilities introduced by the <tt class="docutils literal">Action</tt> type and
<tt class="docutils literal">potentialAction</tt> property for describing RESTful APIs... which naturally led
to an off-the-top-of-the-head enumeration of such actions as <tt class="docutils literal">BorrowAction</tt>
and <tt class="docutils literal">LendAction</tt> that are perfect for libraries.</p>
<p>Perhaps the best part of the session, however, was the insightful questions
from the audience (along with the genuinely enthusiastic response to our
talks). We had deliberately left 15 minutes for questions, and we were not
disappointed: from questions about how we move from structured data to more
linked data (I riffed on the Dodds/Davis
href="<a class="reference external" href="http://patterns.dataincubator.org/book/progressive-enrichment.html">http://patterns.dataincubator.org/book/progressive-enrichment.html</a>">Progressive
Enrichment linked data pattern, suggesting that we should be able to
href="/archives/278-Broadening-support-for-linked-data-in-MARC.html">store
links for each field or value of interest directly in our MARC records), to
questions about what proprietary systems are doing this with schema.org today
(alas, none that I'm aware of, unless something has changed since
href="/archives/282-Were-not-waiting-for-the-ILS-to-change.html">February).</p>
</div>Linked data interest panel, part 12014-06-28T16:14:00-04:002014-06-28T16:14:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2014-06-28:/linked-data-interest-panel-part-1.html<p>Good talk by Richard Wallis this morning at the ALA Annual Conference on
publishing entities on the web. Many of his points map extremely closely
to what I've been saying and will be saying tomorrow during my own
session (albeit with ten fewer minutes).</p>
<p>I was particularly heartened to hear him talk about the great potential
for disintermediation of discovery of library resources, from
aggregation by national and global providers like OCLC to directly
crawling a library's own data and providing links directly to the
library resources. This was one of the conclusions of the paper I
published earlier this year.</p>
<p>I would have liked to have heard some mention of Evergreen, Koha, VuFind
and other open source systems that are already publishing schema.org
linked data, either in the context of SchemaBibEx where they served as
reference implementations and proofs of concept, or in the context of
system procurement. But you can't win them all!</p>
RDFa introduction and codelabs for libraries2014-06-27T15:06:00-04:002014-06-27T15:06:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2014-06-27:/rdfa-introduction-and-codelabs-for-libraries.html<p>My <a class="reference external" href="http://stuff.coffeecode.net/2014/lld_preconference">RDFa introduction and codelab
materials</a> for
the ALA 2014 preconference on <a class="reference external" href="http://ala14.ala.org/node/14524">Practical linked data with open
source</a> are now online!</p>
<p>And now I've finished leading the RDFa + schema.org codelab that I've
been stressing over and refining for about a month at the American
Library Association annual conference <em>Practical linked data with open
source</em> preconference. Long story short, most people got about as far as
I expected (part-way through the first exercise), but they all got
through the initial hurdles and learned enough to keep learning on their
own. My hopes are that this leads to:</p>
<ul class="simple">
<li>the implementation of structured or even linked data in existing
systems, for those that at least have systems that give them the
ability to edit their HTML templates</li>
<li>the addition of linked data to library web pages the next time they
get refreshed or redesigned (it happens pretty often!)</li>
<li>some patterns of implementation, so that we hopefully arrive at a
relatively standard way of marking up the same metadata (given the
many alternatives that we have just within schema.org for something
like a publisher)</li>
<li>when tweaking templates for display or design purposes, to avoid
mangling existing structured data that a system like Evergreen, Koha,
or VuFind publishes by default</li>
<li>more awesomeness in the world of library metadata!</li>
</ul>
<p>Oh, and for posterity, I temporarily marked up this page to link to our
pizza order form as a really lame short URL service, and as I did that
impishly polluted the schema.org vocabulary with the new type
<tt class="docutils literal">PizzaOrderPreferences</tt>. I don't think that's going to make it into
the official vocab though! The code was:</p>
<pre class="literal-block">
<p vocab="http://schema.org/" typeof="PizzaOrderPreferences"> And <a href="http://doodle.com/exampleblahblah" property="url">order pizza here</a>.</p><p>If our pizza order doesn't get gamed, that just shows how few people visit my blog!
</p>
</pre>
The state of structured data in Evergreen: 2.6 edition2014-03-22T15:23:00-04:002014-03-22T15:23:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2014-03-22:/the-state-of-structured-data-in-evergreen-26-edition.html<p>Yesterday at the <a class="reference external" href="http://evergreen-ils.org/conference/eg14/">2014 Evergreen International
Conference</a> I presented
<a class="reference external" href="http://goo.gl/hDxUep">Structured library data: holdings, libraries, and
beyond</a>--a talk about the work I've done
specifically with Evergreen and making some of the connections with Koha
and VuFind's capabilities. Lots of attendees seemed happy with the talk
and the direction that we're going with Evergreen, and have hope for the
future relevance of our libraries' resources within normal search
engines, as well as all of the possibilities opened up by exposing this
open data about our libraries (locations, hours, branch relationships,
contact information) and their resources in a much more consumable form.</p>
<p>There was so much energy in the room, I could have talked for another
hour... I love the Evergreen community!</p>
We're not waiting for the ILS to change2014-02-21T15:30:00-05:002014-02-21T15:30:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2014-02-21:/were-not-waiting-for-the-ils-to-change.html<p>Over at the <strong>Metadata Matters</strong> blog, Diane Hillmann wrote <a class="reference external" href="http://managemetadata.com/blog/2014/02/18/why-are-we-waiting-for-the-ils-to-change/">Why Are We
Waiting for the ILS to
Change?</a>,
asking (in the context of the difficulties libraries experience in
making their systems work with RDA):</p>
<blockquote>
What I saw underlying that conversation was the assumption that the
only way change could happen was if the ILS’s themselves changed; in
other words if the ILS vendors decided to lead rather than follow.
The situation now is that system vendors say they’ll build RDA
compliant systems when their customers ask for them, and libraries
say that they’ll use ‘real’ RDA when there are systems that can
support it. This is a dance of death, and nobody wins.</blockquote>
<p>I took this as a jumping-off point to discuss the state of linked data
support in library systems and discovery software and posted the
following comment (currently awaiting moderation):</p>
<blockquote>
<div class="line-block">
<div class="line">Who's waiting? Sweden's LIBRIS took essentially the approach you
suggested back in 2007, and Bibliothèque Nationale de France and
Deutsche Nationalbibliothek have also followed similar paths.</div>
</div>
<div class="line-block">
<div class="line">On the smaller-scale, traditional library "integrated" side of
things Evergreen and Koha, and on the "disintegrated discovery
layer" side VuFind and Blacklight, have integrated RDFa or
microdata to publish structured data using schema.org. Here's
hoping these open source systems can spur the proprietary
alternatives to start competing and doing better.</div>
</div>
<div class="line-block">
<div class="line">Ross Singer mentioned that Capita Prism offers linked data in N3 /
Turtle / RDF/XML / JSON from record details pages like
<a class="reference external" href="http://capitadiscovery.co.uk/surrey-ac/items/1173856">http://capitadiscovery.co.uk/surrey-ac/items/1173856</a>, so happily
there is at least one proprietary catalogue in the smaller-scale
library space doing work in this field.</div>
</div>
</blockquote>
<p>Jumping from RDA to linked data might be a bit of a stretch, but the
lack of movement by proprietary vendors in particular hit a sore point
that I developed during some of our early W3C Schema.org Bibliographic
Extension Community Group discussions. I had asked if anyone else was
trying to actually implement what we were discussing. A response from
one of the proprietary software representatives was "No, we're waiting
to see what develops..." -- which is exactly the attitude that leads to
the "dance of death" that Diane described. It can also lead to decisions
that are suboptimal, ambiguous, or unimplementable because nobody
actually tried to put theory into practice.</p>
<p>Thankfully, a small investment of effort into modifying open source
systems to serve as reference implementations can provide a significant
amount of insight into flaws or possibilities with otherwise theoretical
directions, as well as delivering practical benefits to everyone who
uses that software if those modifications are accepted by the parent
projects. Here's hoping that the more agile options like Koha,
Evergreen, VuFind, and Blacklight continue to push the evolution of
their proprietary competitors.</p>
Mapping library holdings to the Product / Offer mode in schema.org2014-02-03T18:35:00-05:002014-02-03T18:35:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2014-02-03:/mapping-library-holdings-to-the-product-offer-mode-in-schemaorg.html<p>Back in August, I
<a class="reference external" href="%20/archives/271-RDFa-and-schema.org-all-the-library-things.html">mentioned</a>
that I taught Evergreen, Koha, and VuFind how to express library
holdings in schema.org via the <tt class="docutils literal"><span class="pre">http://schema.org/Offer</span></tt> class. What I
failed to mention was how others can do the same with their own library
systems (well, okay, I linked to the <a class="reference external" href="http://www.w3.org/community/schemabibex/wiki/Holdings_via_Offer">W3C Schema.org Bibliographic
Extension Community Group proposal for representing holdings via
Offer</a>
but didn't focus on how one would go about doing that). This might have
led to Diane Hillman recently <a class="reference external" href="http://managemetadata.com/blog/2014/02/03/talking-points-report/">finding the wrong, abandoned holdings
proposal</a>
(thankfully Richard Wallis helped clear things up!). So, better late
than never, here is a quick summary:</p>
<ol class="arabic">
<li><p class="first">Each copy that the library holds is marked up as an individual
<tt class="docutils literal">`Offer</tt> <<a class="reference external" href="http://schema.org/Offer">http://schema.org/Offer</a>>`__.</p>
</li>
<li><p class="first">The <tt class="docutils literal">`itemOffered</tt> <<a class="reference external" href="http://schema.org/itemOffered">http://schema.org/itemOffered</a>>`__ property
attaches an <tt class="docutils literal">Offer</tt> to a corresponding
<tt class="docutils literal">`Product</tt> <<a class="reference external" href="http://schema.org/Product">http://schema.org/Product</a>>`__ that contains the main
description of the goods. In most library systems, this is going to
be the title of the item, list of creators, abstract, subject
classifications, etc; that which we generally refer to as the
bibliographic record. While this will probably have its own type
already (<tt class="docutils literal">Book</tt> or <tt class="docutils literal">Movie</tt> or <tt class="docutils literal">MusicAlbum</tt> or the like), you
can also include <tt class="docutils literal">Product</tt> as a secondary type (either via a
whitespace-delimited list or via the schema.org <tt class="docutils literal">additionalType</tt>
property).</p>
</li>
<li><p class="first">Mapping more familiar library terminology to the pertinent properties
from <tt class="docutils literal">Offer</tt> goes something like this:</p>
</p><ul class="simple">
<li>Library = <tt class="docutils literal">seller</tt> - the range of <tt class="docutils literal">seller</tt> is <tt class="docutils literal">Organization</tt>, which includes
<tt class="docutils literal">Library</tt> as a child type, so you can link to a highly
structured description of the library including hours of
operation, contact information, location... and that's exactly
what we now do in Evergreen</li>
<li>Call number / shelf number = <tt class="docutils literal">sku</tt> - because a stock-keeping
unit number is "a merchant-specific identifier for a product or
service", and what is a call number if not a means by which you
identify stock on the shelf?</li>
<li>Barcode = <tt class="docutils literal">serialNumber</tt> - the unique "alphanumeric identifier
of a particular product", am I right?</li>
<li>Shelving location = <tt class="docutils literal">availableAtOrFrom</tt> - "[t]he place(s) from
which the offer can be obtained"; with a range of <tt class="docutils literal">Place</tt> this
really should be linked to sub-units of the <tt class="docutils literal">Library</tt> type you
pointed to for the <tt class="docutils literal">seller</tt> property, but schema.org does accept
reality and the inevitability that some plain text values are
going to be supplied where a typed range was indicated.</li>
<li>Item status = <tt class="docutils literal">availability</tt></li>
<li>Borrowing terms = <tt class="docutils literal">businessFunction</tt> - another enumeration, for
which the most likely value for a library is
<tt class="docutils literal"><span class="pre">http://purl.org/goodrelations/v1#LeaseOut</span></tt>. After all, what is
a library loan other than a lease with a limited period during
which the price is $0.00?</li>
<li>Price = <tt class="docutils literal">price</tt> - while theoretically unnecessary, explicitly
specifying a price of $0.00 may satisfy search engines that always
expect to see a price attached to an offer (I'm looking at you,
<a class="reference external" href="http://www.google.com/webmasters/tools/richsnippets">Google Structured Data Testing
Tool</a>!)</li>
</ul>
</li>
</ol>
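<p>To make the list above concrete, here is a minimal RDFa sketch of a
single copy marked up as an <tt class="docutils literal">Offer</tt>. All of the values (title,
call number, barcode, library URL) are invented for illustration, and I
chain from the record down to the copy via the <tt class="docutils literal">offers</tt> property;
<tt class="docutils literal">itemOffered</tt> is simply the inverse direction:</p>
<pre class="literal-block">
<div vocab="http://schema.org/" typeof="Book Product">
  <span property="name">An Example Title</span>
  <!-- one Offer per copy; every value below is invented -->
  <div property="offers" typeof="Offer">
    <link property="businessFunction" href="http://purl.org/goodrelations/v1#LeaseOut">
    <link property="seller" href="http://example.org/library/main">
    <span property="sku">QA76.9 .E94 2014</span>
    <span property="serialNumber">31234000123456</span>
    <span property="availableAtOrFrom">Main stacks</span>
    <link property="availability" href="http://schema.org/InStock">
    <span property="price" content="0.00">$0.00</span>
  </div>
</div>
</pre>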
<p>The language for some of the terminology may seem a little overly
commercial right now, but the next iteration of the schema.org standard
will adopt language that more broadly supports non-commercial
activities... and this broadening of a number of schema.org definitions
is also an outcome of the Schema BibEx community efforts. I'm pretty
happy with the results of the group over the last six months! Hopefully
this sheds some long-overdue light on some of the results of our
efforts, and helps other systems adopt our group's recommended practices
for exposing metadata via schema.org.</p>
<h1>What would you understand if you read the entire world wide web?</h1>
<p><em>2014-02-03</em></p>
<div vocab="http://schema.org/"><p>On Tuesday, February 4th, I'll be participating in Laurentian
University's Research Week lightning talks. Unlike most five-minute
lightning talk events in which I've participated, the time limit for
each talk tomorrow will be <strong>one</strong> minute. Imagine 60 different
researchers getting up to summarize their research in one minute each,
and you have what is likely to be a brain-melting hour. Should be fun!</p>
<p>Here's a rough draft of what I'm planning to say (which, when read at an
even cadence with decent intonation, comes out to exactly one minute):</p>
<blockquote>
<p>What would you understand if you read the <em>entire</em> world wide web?</p>
<p>As humans, we would understand a lot: we can rely on the
context, structure, and significance of elements of web pages to
derive meaning.</p>
<p>The algorithms behind search engines adopt a similar approach, but
struggle with ambiguity; when a web page mentions "Dan Scott", is
it:</p>
<ul class="simple">
<li>"Dan Scott" the character from the One Tree Hill TV show</li>
<li>"Dan Scott" the artist from the Magic the Gathering card game</li>
<li>"Dan Scott" the Ontario academic professor from the University of
Waterloo</li>
<li>"Dan Scott" the Ontario academic librarian from Laurentian
University</li>
</ul>
<p>schema.org is a vocabulary for embedding explicit meaning and intent
within web pages that offers a way to disambiguate those entities.</p>
<p>My research is a collaborative effort--under the auspices of the
World Wide Web Consortium--to define bibliographic extensions for
schema.org where necessary, and best practices based on concrete
implementations in three different library systems.</p>
</blockquote>
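<p>A tiny, purely illustrative RDFa sketch of the kind of markup that
makes this disambiguation possible; the <tt class="docutils literal">sameAs</tt> URL below is a
placeholder rather than a real identifier:</p>
<pre class="literal-block">
<p vocab="http://schema.org/" typeof="Person">
  <span property="name">Dan Scott</span>,
  <span property="jobTitle">librarian</span> at
  <span property="affiliation">Laurentian University</span>
  <!-- an authoritative identifier lets machines tell this Dan Scott
       apart from the others; placeholder URL for illustration -->
  <link property="sameAs" href="https://viaf.org/viaf/0000000000">
</p>
</pre>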
</div>
<h1>Ups and downs</h1>
<p><em>2014-01-30</em></p>
<p>Tuesday was not the greatest day, but at least each setback resulted in
a triumph...</p>
<p>First, the <a class="reference external" href="http://www.w3.org/community/schemabibex/wiki/Article">periodical proposal for
schema.org</a>--that I
have poured a good couple of months of effort into--took a step closer to
reality when Dan Brickley <a class="reference external" href="http://lists.w3.org/Archives/Public/public-vocabs/2014Jan/0180.html">announced on the public-vocabs list</a> that
he had created a test build that incorporated the RDFS that I had written up.
Excitement rapidly turned to horror, though, as I realized that I had made a
classic copy/paste error, in which I had changed the displayed name of the
<tt class="docutils literal">domainIncludes</tt> value but had not changed the actual URI... Long story
short, the test build looked nothing like what the schemabibex group had agreed
on, and I was terribly embarrassed.</p>
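<p>For anyone who hasn't poked at the vocabulary source: schema.org terms
are defined as RDFS expressed in RDFa, so a hypothetical property
definition (names and URIs invented here) looks roughly like the
following; note how easily the human-readable link text and the
machine-readable URI can silently diverge, which is exactly the mistake
I made:</p>
<pre class="literal-block">
<div typeof="rdf:Property" resource="http://schema.org/exampleProperty">
  <span class="h" property="rdfs:label">exampleProperty</span>
  <span property="rdfs:comment">An invented property for illustration.</span>
  <!-- the link text was edited to read "Article", but the href still
       points at CreativeWork -- the classic copy/paste mismatch -->
  <a property="http://schema.org/domainIncludes"
     href="http://schema.org/CreativeWork">Article</a>
</div>
</pre>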
<p>Luckily, after I fixed the RDFS, Dan was able to put together a revised test
build later that day that actually reflected our intentions. So that can
continue moving forward...</p>
<p>Second, our Evergreen instance started acting up rather badly. All of the
connections to the database server were being gobbled up, and we were
scrambling to figure out why. While I'm on sabbatical I'm not really supposed
to be involved in the day-to-day operations, but when a core service stops
running it's okay for research to wait for a little bit... Eventually I tracked
down a fix for a potential denial of service problem (
<a class="reference external" href="https://bugs.launchpad.net/evergreen/+bug/1200770">Search result rendering can crush the system</a>) that hadn't been merged
into our production system (the fix came out after the start of my sabbatical),
and shortly after I put that into production we were back up and running.</p>
<p>Third, after the Evergreen problem was resolved, Bill Dueber pinged me
innocently on IRC. He had run into a problem with File_MARC; when serializing
MARC as MARC-in-JSON format, fields with a subfield <tt class="docutils literal">$0</tt> were getting
trashed. Data corrupting bugs are one of the most serious classes of bugs for
any package maintainer, so I jumped on this problem too... After a little bit
of analysis, I figured out that PHP's type coercion for integer-like keys when
creating arrays and its <a class="reference external" href="http://php.net/json_encode">json_encode()</a>
implementation were combining to ruin the MARC-in-JSON serialization in this
one particular case. Faced with rewriting the entire serialization logic, I did
what any (in)sane programmer would and ended up running a regex against the
result of <tt class="docutils literal">json_encode()</tt> to turn the array-ified subfield <tt class="docutils literal">$0</tt> back into a
key/value pair. File_MARC 1.1.1 is now available at your nearest PEAR
mirror...</p>
<h1>A slice of sabbatical</h1>
<p><em>2014-01-21</em></p>
<p>Yesterday I tested, signed off, and pushed a
<a class="reference external" href="https://bugs.launchpad.net/evergreen/+bug/1047485">bunch</a>
<a class="reference external" href="https://bugs.launchpad.net/evergreen/+bug/803817">of</a>
<a class="reference external" href="https://bugs.launchpad.net/evergreen/+bug/1235474">bug</a>
<a class="reference external" href="https://bugs.launchpad.net/evergreen/+bug/1238240">fixes</a>
<a class="reference external" href="https://bugs.launchpad.net/evergreen/+bug/1192058">for</a> the
Evergreen library system. Not going to lie; I'm hoping that by clearing
up some of the backlog, a few of my own code contributions (like <a class="reference external" href="https://bugs.launchpad.net/evergreen/+bug/1261939">"Add
per-library info pages with schema.org structured data
support"</a> and
<a class="reference external" href="https://bugs.launchpad.net/evergreen/+bug/1267231">Enhanced title
display</a>) might
get some attention... both branches go a long way towards improving the
state of <a class="reference external" href="http://schema.org">schema.org structured data</a> support in
Evergreen.</p>
<p>Today, I took the W3C Schema.org Bibliographic Extension <a class="reference external" href="http://www.w3.org/community/schemabibex/wiki/Article">proposal for
adding support for
periodicals</a>
and converted it from wiki format into the RDF Schema format desired by
the <a class="reference external" href="http://www.w3.org/wiki/WebSchemas">W3C Web Schemas</a> group. That
draft lives
<a class="reference external" href="https://github.com/dbs/schemabibex/blob/master/schema.org/ext/periodicals.html">here</a>
and doesn't look like much. Funny to think that that represents a few
months of committee work (two hundred emails or thereabouts, with three
conference calls in the mix as well).</p>
<p>I also pushed updated versions of the Perl MARC::Charset and
MARC::Record packages to the Fedora Linux distribution. We library types
need our tools in top condition, and I had let the packages lag behind
the released versions for a while. Nice to clear that off my plate.</p>
<h1>RDFa and schema.org all the library things</h1>
<p><em>2013-08-30</em></p>
<p><em>TLDR</em>: The <a class="reference external" href="http://evergreen-ils.org">Evergreen</a> and
<a class="reference external" href="http://koha-community.org">Koha</a> integrated library systems now express
their record details in the schema.org vocabulary out of the box using RDFa.</p>
<p>Individual holdings are expressed as <a class="reference external" href="http://schema.org/Offer">Offer</a>
instances per the
<a class="reference external" href="http://www.w3.org/community/schemabibex/wiki/Holdings_via_Offer">W3C Schema Bib Extension community group proposal</a> to
parallel commercial sales offers. <em>And</em> I have published a branch to give the
same capabilities to the <a class="reference external" href="http://vufind.org">VuFind</a> discovery layer, as
well.</p>
<p>In the spring of 2012, I took my first steps in the structured data world by
teaching Evergreen 2.2 how to express some record details in
<a class="reference external" href="http://schema.org">schema.org</a>. It was a small step towards taking the
machine-readable data that we had made useful to humans on the record detail
catalogue page and marking it up so that it was once again machine readable. At
that time, Evergreen only knew how to map MARC data to two schema.org types
(<tt class="docutils literal">Book</tt> and <tt class="docutils literal">MusicRecording</tt>--which should have been <tt class="docutils literal">MusicAlbum</tt>, but I
eventually fixed that) and a handful of attributes: <tt class="docutils literal">name</tt>, <tt class="docutils literal">ISBN</tt>,
<tt class="docutils literal">publisher</tt>, <tt class="docutils literal">publication date</tt>, <tt class="docutils literal">author</tt>, <tt class="docutils literal">contributor</tt>, and
<tt class="docutils literal">keywords</tt>. Pretty barebones, but a start nonetheless.</p>
<p>I used the HTML5 microdata approach because I was new to structured data and
microdata was what was demonstrated in all of the schema.org examples, so it
seemed like the obvious choice. Over the last year, however, I realized that
<a class="reference external" href="http://www.w3.org/TR/rdfa-syntax/">RDFa</a> is a W3C standard for
accomplishing the same goals as microdata, bolstered by an open community
standards-making process, and featuring the ability to mix in properties and
types from multiple vocabularies. I touched on this in my Evergreen 2013
conference presentation: <a class="reference external" href="/archives/264-Structured-data-making-metadata-matter-for-machines.html">Structured data: making metadata matter for machines</a>.
While <a class="reference external" href="http://www.w3.org/TR/rdfa-lite/">RDFa Lite</a> is extremely easy to get
started with, I have been diving deeper into RDFa proper to make use of some of
the more advanced properties, such as <tt class="docutils literal">@about</tt> to work around unwanted
chaining introduced by <tt class="docutils literal">@href</tt> attributes.</p>
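<p>For readers who haven't hit this yet, here is a minimal sketch (with
invented values) of the chaining behaviour in question. With
<tt class="docutils literal">@typeof</tt> and <tt class="docutils literal">@href</tt> together, the linked resource becomes the
subject for everything nested inside the link, while <tt class="docutils literal">@about</tt> pins a
property back onto an explicit subject when that chaining isn't what you
want:</p>
<pre class="literal-block">
<!-- "name" below describes /authors/42, because @typeof + @href make
     the link target the subject for the link's contents -->
<a property="author" typeof="Person" href="/authors/42">
  <span property="name">Jane Doe</span>
</a>
<!-- @about states the subject explicitly, so this triple attaches to
     the record no matter what markup surrounds it -->
<span about="/record/123" property="datePublished">2013</span>
</pre>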
<p>Over the last few weeks, I was able to concentrate on improving the schema.org
mapping for Evergreen--introducing holdings as instances of the
<a class="reference external" href="http://schema.org/Offer">http://schema.org/Offer</a> class, providing much more granular author and
contributor data--and cut over to RDFa. While the tools at
<a class="reference external" href="http://rdfa.info/tools">RDFa Tools</a> were quite useful for debugging my
efforts, I also have to thank the denizens of the
<a class="reference external" href="irc://irc.w3.org:6665/#rdfa">#rdfa IRC channel</a> (and
<a class="reference external" href="http://manu.sporny.org/about">Manu Sporny</a> in particular) for patiently
helping me understand some of my rookie mistakes. Ben Shum also kept me honest
by patiently testing multiple iterations of my branches with the Google Rich
Snippets tool and reporting any issues that he encountered; this led to my
realization that using <tt class="docutils literal">@resource</tt> and <tt class="docutils literal">@about</tt> was necessary in some
contexts.</p>
<p>Once I had worked out a decent mapping in Evergreen (a library system I have
been contributing to for over six years now), I decided to tackle the VuFind
discovery layer. VuFind uses a straightforward template system, and I was able
to put together a branch that integrated schema.org as RDFa (details at
<a class="reference external" href="http://vufind.org/jira/browse/VUFIND-425">VuFind bug 425</a>), building on
Eoghan Ó Carragáin's initial efforts. Once again I included holdings-as-Offers,
as the Evergreen driver for VuFind made that easy enough to test. As part of my
work, I contributed some enhancements for the Evergreen driver that have
already been integrated into VuFind. The initial reception from the VuFind
community was positive, although my branch arrived too late for the VuFind 2.1
release; if all goes well, it will be integrated for the VuFind 2.2 release. In
the mean time, sites running VuFind that want schema.org structured data can
integrate my branch themselves--and please provide feedback!</p>
<p>As I was on a roll, I also opted to tackle the Koha integrated library system.
With some initial pointers from Galen Charlton and Chris Cormack to the
XSLT-based templating system that Koha uses, I was able to implement schema.org
with holdings-as-Offers in a matter of hours for the first iteration. Jared
Camins then worked patiently with me as I added small commits to address issues
that came up on the Evergreen side, but in under a week from start to finish
the branch was signed off, passed QA, and pushed to master.</p>
<p>(It actually broke the build due to a coding violation--<strong>doh!</strong>--but that
was quickly cleaned up.)</p>
<p>The upshot? We now have two library systems set to publish rich
schema.org structured data--including holdings--in RDFa, out of the box by
default, in their record detail pages on the Web, and a third system ready to
go.</p>
<p>Let me simply say that I <em>love</em> the agility of open source software. So, for
the future, I intend to tackle a few more library systems; digital repositories
seem like they would be worthwhile targets. On that front, I have
<a class="reference external" href="http://sourceforge.net/mailarchive/message.php?msg_id=31245519">inquired</a>
on the DSpace developers' list about whether there is still interest in
integrating schema.org (as had been expressed a year ago), but have not yet
received a reply. Perhaps ArchivesSpace, or furthering the existing support on
Islandora? Let me know if you're interested!</p>
<h1>Making the Evergreen catalogue mobile-friendly via responsive CSS</h1>
<p><em>2013-04-22</em></p>
<p>Back in November the Evergreen community was discussing the desire for a
mobile catalogue, and <a class="reference external" href="http://markmail.org/message/tdvihpd63lu6ksbs">expressed a strong
opinion</a> that the right
way forward would be to teach the current catalogue to be
mobile-friendly by applying principles of responsive design. In fact, I
stated:</p>
<blockquote>
<p>Almost all of this can be achieved via CSS, possibly with some
changes to the underlying HTML (e.g. tables to divs or whatever so
that "Place Hold" appears under the bib info instead of way over to
the right).</p>
</blockquote>
<p>I have this bad habit of talking more than doing. So when I saw the
Beanstalk mobile catalogue resurrected again at the Evergreen 2013
lightning talks, it bugged me that I still hadn't put any effort into a
proof of concept of what was possible with <a class="reference external" href="https://developer.mozilla.org/en-US/docs/CSS/Media_queries">CSS media
queries</a>.
Thus, today, on the last day of my holidays, I spent a few hours trying
things out on our development server and came up with <a class="reference external" href="http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/dbs/responsive_tpac">this *rough*
branch</a>
to work towards making the exact same HTML that we serve up for desktops
provide an experience similar to that of the Beanstalk generation of
catalogues for mobile, just via CSS.</p>
<p>As you can see from the commits, I made one change to the HTML to define
a viewport, and added one set of CSS rules wrapped in a media query; in
essence:</p>
<pre class="literal-block">
...<head>...<meta content="initial-scale=1.0,width=device-width" name="viewport"><style>@media only screen and (max-width: 600px) { #header { padding: 0px; margin: 0px; } .facet_sidebar { display: none; } ...}</style><head>...
</pre>
<div class="section" id="results-and-trade-offs">
<h2>Results and trade-offs</h2>
<p>Here are a few example URLs from our test server (which is slow, and
might get wiped any day, so test them quickly if you have a mobile
device around!):</p>
<ul class="simple">
<li><a class="reference external" href="http://laurentian-test.concat.ca/eg/opac/results?query=open+source&qtype=keyword&locg=103&detail_record_view=1">Search
results</a>
- sacrificed facets, per-item actions, and the language picker</li>
<li><a class="reference external" href="https://laurentian-test.concat.ca/eg/opac/record/729926?query=open%20source;qtype=keyword;locg=103;detail_record_view=1">Record
details</a>
- sacrificed per-item actions, flattened the item table vertically</li>
</ul>
<p>In general, I removed a lot of the frippery from the header, while
trying to retain the most valuable pieces. However, some bits are
broken: <strong>Another Search</strong> doesn't actually let you do another search
because the search bar is totally hidden. Other bits haven't been
touched (<strong>Advanced search</strong> is still overwhelming, and <strong>My Account</strong>,
while functional, could be much prettier).</p>
<p>What I've done so far is oriented towards our 2.3-ish lightly customized
Laurentian skin (we force full details in search results, for example)
but the principles should be applicable to an out-of-the-box Evergreen
catalogue. In working through some of the challenges, I've determined
that I was pretty much on target back in November; with a few HTML
tweaks that would improve the layout for desktops as well, we could keep
the per-item actions and facets around, but just move them to a
different location.</p>
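<p>As a purely illustrative sketch (not from the branch above), moving
rather than hiding the facets could be as simple as letting the sidebar
drop out of its column on narrow screens, using the same class the
branch already targets:</p>
<pre class="literal-block">
<style>
/* On narrow viewports, let the facet sidebar flow below the results
   instead of hiding it outright; illustrative rules only */
@media only screen and (max-width: 600px) {
  .facet_sidebar {
    float: none;   /* drop out of the sidebar column */
    width: auto;   /* fill the narrow viewport */
  }
}
</style>
</pre>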
</div>
<div class="section" id="less-talk-more-action">
<h2>Less talk, more action</h2>
<p>So who's with me? What we have to gain is a single set of HTML to
maintain for TPAC, and a single set of CSS, all available from the same
URL, rather than trying to maintain overlays and monkeying about with
mobile-vs-desktop URLs and the like. Feel free to dig in and start
pushing branches with improvements over my rough attempts and let's make
this thing happen for Evergreen 2.5.</p>
</div>
<div class="section" id="with-thanks-to-firefox">
<h2>With thanks to Firefox...</h2>
<p>I would be remiss if I did not mention the marvellous <a class="reference external" href="https://developer.mozilla.org/en-US/docs/Tools/Responsive_Design_View">Responsive Design
View</a>
introduced in Firefox 15, along with the <a class="reference external" href="https://developer.mozilla.org/en-US/docs/Tools/Style_Editor">Style
Editor</a>;
together, these tools (built into Firefox) made my developing and
testing work <em>so</em> much easier.</p>
<p>If you want to live on the cutting edge of Firefox, you want Aurora - go
and get it <img alt=":-)" class="emoticon" src="/images/smile.png" /></p>
<p><a class="reference external image-reference" href="http://affiliates.mozilla.org/link/banner/36536"><img alt="Download Aurora" src="http://affiliates.mozilla.org/media/uploads/banners/6f0132062588b248d44968734668226f9c19d994.png" /></a></p>
</div>
<h1>Structured data: making metadata matter for machines</h1>
<p><em>2013-04-12</em></p>
<p><strong>Update 2013-04-18:</strong> Now with <a class="reference external" href="https://archive.org/details/Microdata">video of the
presentation</a>, thanks to the
awesome #egcon2013 volunteers!</p>
<p>I've been attending the <a class="reference external" href="http://eg2013.evergreen-ils.org">Evergreen 2013
Conference</a> in beautiful Vancouver.
This morning, I was honoured to be able to give a presentation on some
of the work I've been doing on implementing linked data via
<a class="reference external" href="http://schema.org">schema.org</a> in Evergreen. I <em>think</em> I did a good
job of explaining the potential value of linked data and arguing for
improving Evergreen's schema.org publishing ninja skills.</p>
<p>My slides, with a reasonable number of useful speaker notes to provide
context, are available in <a class="reference external" href="/uploads/talks/2013/structured_data_matters.odp">LibreOffice
format</a>.<a class="reference external" href="#fn1">[1]</a></p>
<p>In addition, the amazing organizers of the conference also streamed
most<a class="reference external" href="#fn2">[2]</a> of the talk and the recording will be available on
the <a class="reference external" href="http://eg2013.evergreen-ils.org">conference web site</a> in a week
or two.</p>
<div class="section" id="footnotes">
<h2>Footnotes</h2>
<ol class="arabic">
<li><div class="first"><div id="fn1"></div></div><p>I felt pretty dirty not using HTML5 + RDFa Lite to actually mark the
whole thing up; there was some question close to the time of the
conference as to whether anything but PPT or perhaps PDF would be an
acceptable format... a concern that was subsequently removed, but a
little too late to make changing course worthwhile.</p>
</li>
<li><div class="first"><div id="fn2"></div></div><p>The room was standing-room only (well, sitting-on-the-floor-room
only), and one of the organizers accidentally sat on and unplugged
the Ethernet cable, so something like ten minutes were lost. Heh!</p>
</li>
</ol>
</div>
Introducing SQL to Evergreen administrators, round two2013-02-16T02:32:00-05:002013-02-16T02:32:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2013-02-16:/introducing-sql-to-evergreen-administrators-round-two.html<p><a class="reference external" href="/archives/212-Introduction-to-SQL-for-Evergreen-administrators.html">Three years ago</a> I was
asked to create and deliver a two-day course introducing SQL to Evergreen
users. Things went well and I was able to share the resulting materials with
the Evergreen and PostgreSQL community. Perhaps one of my happiest moments at
the Evergreen conference last year was when …</p><p><a class="reference external" href="/archives/212-Introduction-to-SQL-for-Evergreen-administrators.html">Three years ago</a> I was
asked to create and deliver a two-day course introducing SQL to Evergreen
users. Things went well and I was able to share the resulting materials with
the Evergreen and PostgreSQL community. Perhaps one of my happiest moments at
the Evergreen conference last year was when one of the participants in that
course told me that many of his fellow participants were still successfully
writing SQL queries and getting work done. Huzzah!</p>
<p>Time goes by and another group, <a class="reference external" href="http://www.ohionet.org">OHIONET</a>, was
running into difficulties getting started with PostgreSQL and Evergreen. They
asked me if I would be willing to give the same sort of training I had given a
few years back. "Sure", I said, thinking it would be a great opportunity to
polish the materials and add some updates to cover new features in PostgreSQL
and Evergreen. We also opted to skip the travel and do an entirely virtual
training session via Google Hangouts, which worked out rather nicely (but
that's a different story).</p>
<p>As it turned out, I probably ended up putting about four days' worth of effort
(crammed into lots of late nights, weekends, and vacation days) into
overhauling the instruction materials. But the results were worth it, in my
opinion; I'm rather proud of the content, and while I believe it stands up on
its own, the guidance that I was able to provide during the live instruction
sessions was well-received by the participants.</p>
<p>Thus, I am pleased to be able to offer to the broader community the latest
version of the Introduction to SQL for Evergreen Administrators, under a
Creative Commons Attribution-ShareAlike 3.0 (Unported) license.</p>
<ul class="simple">
<li>Reference documentation--30 pages introducing SQL with examples drawn
from the Evergreen schema (see the sample query after this list):
(<a class="reference external" href="http://bzr.coffeecode.net/intro_to_sql/v2/introduction_to_sql.html">HTML</a>)
(<a class="reference external" href="http://bzr.coffeecode.net/intro_to_sql/v2/introduction_to_sql.pdf">PDF</a>)
(<a class="reference external" href="http://bzr.coffeecode.net/intro_to_sql/v2/introduction_to_sql.epub">ePub</a>)
(<a class="reference external" href="http://bzr.coffeecode.net/intro_to_sql/introduction_to_sql.txt">AsciiDoc</a>)</li>
<li>Presentation:
(<a class="reference external" href="http://bzr.coffeecode.net/intro_to_sql/SQL_instruction.odp">LibreOffice Impress</a>)
(<a class="reference external" href="http://bzr.coffeecode.net/intro_to_sql/v2/SQL_instruction.pdf">PDF</a>)</li>
<li>Solutions to exercises:
(<a class="reference external" href="http://bzr.coffeecode.net/intro_to_sql/solutions_day_1.txt">Day 1</a>)
(<a class="reference external" href="http://bzr.coffeecode.net/intro_to_sql/solutions_day_2.txt">Day 2</a>)</li>
</ul>
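<p>To give a taste of the approach (the examples are all drawn from real
Evergreen tables), here is a minimal sketch of a first query, run against a
hypothetical local database; your connection details will certainly differ:</p>
<pre class="literal-block">
# list a handful of patrons from the core actor.usr table
psql -U evergreen -d evergreen -c "SELECT id, usrname, family_name FROM actor.usr LIMIT 10;"
</pre>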
<p>So, a huge thanks to OHIONET for giving me the impetus to overhaul this
material, and for giving me a chance to introduce them to the wonders of SQL
with PostgreSQL, and to the inner workings of the Evergreen schema. It was a
blast! And thanks for agreeing to let me share these materials with the broader
community.</p>
Leaving SELinux in enforcing mode with Evergreen on Fedora 172012-09-02T05:26:00-04:002012-09-02T05:26:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2012-09-02:/leaving-selinux-in-enforcing-mode-with-evergreen-on-fedora-17.html<p>Ever since I switched over to Fedora a few years back (hi Fedora 13!),
I've been guilty of a dirty secret: to run Evergreen, I've had to run
<tt class="docutils literal">setenforce 0</tt> to disable the most excellent SELinux security policies
before I could start up the Apache web server to serve up …</p><p>Ever since I switched over to Fedora a few years back (hi Fedora 13!),
I've been guilty of a dirty secret: to run Evergreen, I've had to run
<tt class="docutils literal">setenforce 0</tt> to disable the most excellent SELinux security policies
before I could start up the Apache web server to serve up the Evergreen
goodness. This worked for development purposes, but tonight something
snapped and I decided that it was no longer acceptable to throw away a
great layer of operating system security simply for the sake of hacking
on Evergreen. So... I stepped into the world of what had formerly seemed
to be inscrutable SELinux concepts, and came out with something that
seems to work (at least for my fairly limited purposes thus far of
searching the TPAC catalogue).</p>
<p>This was a pretty iterative process that involved trying to start the
<strong>httpd.service</strong>, then checking <tt class="docutils literal">/var/log/messages</tt> and
<tt class="docutils literal">/var/log/audit/audit.log</tt> for clues as to why httpd.service was
either not starting, or (once I passed that hurdle) was simply returning
internal server errors.</p>
<p>First, due to my recent experience with running a web.py script under
Fedora, I had learned that the httpd SELinux policy had a number of
booleans for enforcing or allowing particular behaviours, so I
immediately ran the following command to enable httpd to connect to the
network:</p>
<pre class="literal-block">
# add -P if you want the boolean to persist across reboots
setsebool httpd_can_network_connect on
</pre>
<p>I then needed to change the labels on many of the OpenSRF and Evergreen
files that were installed and which Fedora gave a default type of
<tt class="docutils literal">unconfined_t</tt>, which is understandably restrictive:</p>
<pre class="literal-block">
# Mark web content as, well, web content
chcon -R --type=httpd_sys_content_t /openils/lib/javascript
chcon -R --type=httpd_sys_content_t /openils/var/web
chcon -R --type=httpd_sys_content_t /openils/var/templates*
chcon -R --type=httpd_sys_content_t /openils/var/data
chcon -R --type=httpd_sys_content_t /openils/var/xsl
chcon --type=httpd_sys_content_t /openils/conf/opensrf_core.xml
chcon --type=httpd_sys_content_t /openils/conf/fm_IDL.xml

# Mark the custom Apache modules
chcon --user=system_u --type=httpd_modules_t /usr/lib64/httpd/modules/mod_xmlent.so
chcon --user=system_u --type=httpd_modules_t /usr/lib64/httpd/modules/osrf_*

# Mark the dynamic libraries we need to load
# "-h" changes the context of symlinks as well as files
chcon -h --type=lib_t /openils/lib/*

# Mark executable scripts
chcon -t httpd_sys_script_exec_t /openils/bin/openurl_map.pl
chcon -t httpd_sys_script_exec_t /openils/bin/offline-blocked-list.pl

# Might not have been necessary
chcon -R --user=system_u /usr/local/share/perl5/
chcon --user=system_u /etc/httpd/conf.d/eg.conf
chcon --user=system_u /etc/httpd/startup.pl
chcon --user=system_u /etc/httpd/eg_vhost.conf
chcon -R --user=system_u /etc/httpd/ssl/
</pre>
<p><strong>Note:</strong> I'm aware that simply running <tt class="docutils literal">chcon</tt> won't survive a
relabelling of the files. We really need to turn this into a policy, or
alternatively use <tt class="docutils literal">semanage</tt> to make the changes permanent...</p>
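<p>For the record, a minimal sketch of what the persistent equivalent might
look like with <tt class="docutils literal">semanage</tt>, shown for just one of the
directories above (repeat for the others; the regular-expression form is how
SELinux file-context rules are written):</p>
<pre class="literal-block">
# record a persistent file-context rule for the web content...
semanage fcontext -a -t httpd_sys_content_t "/openils/var/web(/.*)?"
# ...then relabel the tree so the rule takes effect immediately
restorecon -Rv /openils/var/web
</pre>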
<p>Next, I opted to finally start running Apache as the stock apache:apache
user/group rather than as the <tt class="docutils literal">opensrf</tt> user. This turned out to
require only a few steps:</p>
<ol class="arabic">
<li><p class="first">Change the <tt class="docutils literal">User</tt> setting in <tt class="docutils literal">/etc/httpd/conf/httpd.conf</tt> back to
<tt class="docutils literal">apache</tt>, reverting the change we made following the default
install documentation.</p>
</li>
<li><p class="first">To avoid errors writing to the <tt class="docutils literal">/openils/var/log</tt> directory, cut
over to using syslog - which, on Fedora, is provided by <strong>rsyslogd</strong>.</p>
</p><ol class="arabic simple">
<li>Copy the very handy <tt class="docutils literal"><span class="pre">Open-ILS/examples/evergreen-rsyslog.conf</span></tt>
file that Bill Erickson created into <tt class="docutils literal">/etc/rsyslog.d/</tt></li>
<li>Restart the <tt class="docutils literal">rsyslogd</tt> service with
<tt class="docutils literal">systemctl restart rsyslog.service</tt>.</li>
<li>Edit <tt class="docutils literal">/etc/httpd/eg_vhost.conf</tt> and
<tt class="docutils literal">/openils/conf/opensrf_core.xml</tt> to use syslog instead of
writing to log files.</li>
<li>Restart the OpenSRF services.</li>
</ol>
</li>
<li><p class="first">One more restart of the httpd service and I was in business.</p>
</li>
</ol>
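<p>Condensed into shell form, the rsyslog steps amount to something like the
following (a sketch that assumes you are sitting in the root of an Evergreen
source tree; the edits to <tt class="docutils literal">eg_vhost.conf</tt> and
<tt class="docutils literal">opensrf_core.xml</tt> still have to be made by hand):</p>
<pre class="literal-block">
# install the example rsyslog rules and pick up the change
cp Open-ILS/examples/evergreen-rsyslog.conf /etc/rsyslog.d/
systemctl restart rsyslog.service
# after pointing the Evergreen configs at syslog, bounce Apache
systemctl restart httpd.service
</pre>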
<p>So this is a start. I think this has broader implications than for just
Fedora; we should stop using the <tt class="docutils literal">opensrf</tt> user to run the Apache
service in the default configuration on all distributions (we've
discussed this several times in the past, but never really done anything
about it).</p>
<p>I hope to update the README accordingly, and I also hope to take the
SELinux work a step further to provide a modified policy so that Fedora
and Red Hat (and derivative) distributions can offer a more secure
environment for running Evergreen.</p>
<p>Oh, and some handy resources:</p>
<ul class="simple">
<li><a class="reference external" href="http://wiki.centos.org/HowTos/SELinux">CentOS SELinux HOWTO</a></li>
<li><a class="reference external" href="http://wiki.centos.org/TipsAndTricks/SelinuxBooleans">CentOS SELinux
Booleans</a></li>
<li><a class="reference external" href="http://docs.fedoraproject.org/en-US/Fedora/13/html/SELinux_FAQ/">Fedora SELinux
FAQ</a></li>
</ul>
Enabling mod_wsgi with LDAP access under Fedora 172012-07-11T14:23:00-04:002012-07-11T14:23:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2012-07-11:/enabling-mod_wsgi-with-ldap-access-under-fedora-17.html<p>Continuing my path of <em>new problem to solve = opportunity to try
something new</em>, I opted to give <a class="reference external" href="http://webpy.org">web.py</a> a shot as
a Web front-end for an existing script I had put together to <a class="reference external" href="http://git.evergreen-ils.org/?p=contrib/Conifer.git;a=blob;f=tools/patron-load/ldap_osrf_sync">provision
users in our Evergreen library system based on their LDAP
entry</a>.
The goal was to …</p><p>Continuing my path of <em>new problem to solve = opportunity to try
something new</em>, I opted to give <a class="reference external" href="http://webpy.org">web.py</a> a shot as
a Web front-end for an existing script I had put together to <a class="reference external" href="http://git.evergreen-ils.org/?p=contrib/Conifer.git;a=blob;f=tools/patron-load/ldap_osrf_sync">provision
users in our Evergreen library system based on their LDAP
entry</a>.
The goal was to provide access to the functionality of the script,
without having me as a single point of failure... something I have
intended to put in place for a long time, but which jumped up in
priority once I went on vacation and received a few requests (surprise,
surprise).</p>
<p>Creating a web.py front end was easy enough. It was a bit more
challenging to put it into production, because my production box for
this task runs Fedora 17, and that means SELinux. In the past, my
knee-jerk reaction during development would be to <tt class="docutils literal">setenforce 0</tt> and
be done with it, but exposing it to more than just me at the terminal
means taking some care. So, fortunately, it was pretty easy to sort out
(thanks largely to the assistance gleaned from <a class="reference external" href="http://www.packtpub.com/article/selinux-secured-web-hosting-python-based-web-applications">this Packtpub.com
article</a>,
minus the compiling mod_wsgi from source bits).</p>
<p>The pertinent bits for my case were:</p>
<ol class="arabic simple">
<li>Install mod_wsgi and web.py: <tt class="docutils literal">yum install mod_wsgi <span class="pre">python-webpy</span></tt></li>
<li>Configure <tt class="docutils literal">/etc/httpd/conf/httpd.conf</tt> to include the appropriate
<tt class="docutils literal">WSGIScriptAlias</tt> line (see the sketch after this list)</li>
<li>Fix the SELinux label on the WSGI files:
<tt class="docutils literal">chcon <span class="pre">-R</span> <span class="pre">-t</span> httpd_user_content_t <span class="pre">patron-load</span></tt></li>
<li>Allow Apache to connect to an LDAP server:
<tt class="docutils literal">setsebool <span class="pre">-P</span> httpd_can_connect_ldap=1</tt></li>
</ol>
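<p>For step 2, the directive ends up looking something like this sketch (the
<tt class="docutils literal"><span class="pre">/patron-load</span></tt> mount point and
script path are purely illustrative; web.py scripts work under mod_wsgi
because <tt class="docutils literal">app.wsgifunc()</tt> supplies the WSGI entry
point):</p>
<pre class="literal-block">
# drop a one-line Apache config into conf.d, then restart Apache
echo 'WSGIScriptAlias /patron-load /var/www/patron-load/app.py' | tee /etc/httpd/conf.d/patron-load.conf
systemctl restart httpd.service
</pre>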
<p>And poof: my server still has the protection of SELinux, and my desired
functionality works. Yay!</p>
Running libraries on PostgreSQL: PGCon 2012 talk2012-05-20T17:57:00-04:002012-05-20T17:57:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2012-05-20:/running-libraries-on-postgresql-pgcon-2012-talk.html<p>On Friday, May 18th I gave <a class="reference external" href="http://www.pgcon.org/2012/schedule/events/465.en.html">a talk</a> at the PGCon 2012
conference on the use of PostgreSQL by the Evergreen project. My talk fell in
the <em>case study</em> track, which meant that I had been asked to describe to
PostgreSQL developers what Evergreen was, why it was a project …</p><p>On Friday, May 18th I gave <a class="reference external" href="http://www.pgcon.org/2012/schedule/events/465.en.html">a talk</a> at the PGCon 2012
conference on the use of PostgreSQL by the Evergreen project. My talk fell in
the <em>case study</em> track, which meant that I had been asked to describe to
PostgreSQL developers what Evergreen was, why it was a project they might want
to care about, enumerate the advantages that Evergreen gets from using
PostgreSQL, and where our project has some difficulties with PostgreSQL.</p>
<p>I have given a lot of talks before, but I’m used to being on the developer side
of the discussion. In this case, the tables were turned; with noted PostgreSQL
contributors like Josh Berkus, Chris Brown, Simon Riggs, and Robert Treat in
the audience, I was a user talking to the developers of something that I was
very much dependent on and which I understood at a much more basic level than
they did. This was both liberating <em>and</em> humbling; it definitely adds some
perspective to my experiences as a developer in the Evergreen project.</p>
<p>Along with my slides, the whole talk has been professionally recorded - both
video and audio - thanks to Heroku’s sponsorship, so you will be able to relive
each and every word if you really want to. I’ll summarize the main points that
I wanted to convey to the PostgreSQL developers:</p>
<ul class="simple">
<li>I was quite candid that most libraries can’t afford dedicated database
administrators, and that therefore the more that PostgreSQL can provide
reasonable out-of-the-box configuration settings, the better. For example,
results from <a class="reference external" href="http://evergreen-ils.org/~denials/postgresql_survey.html">the survey that I sent out at the last minute</a> (THANK YOU to
the nine sites that responded!) showed many sites running with a default
statistics target of 50, whereas the default had been increased to 100 back
in PostgreSQL 8.4, and much higher settings are often recommended to help the
planner make its decisions. That said, my survey didn’t ask for table-level
statistics settings (did you <strong>know</strong> that you could change the statistics
for particular tables? see the sketch after this list), so perhaps some sites
are using higher statistics levels for particular tables and a lower default threshold.</li>
<li>It was probably hokey, but I noted that, as libraries are often called the
heart of their community, PostgreSQL was effectively the heart of
Evergreen — and I invited the PostgreSQL community to help our heart beat
faster. With the Evergreen Oversight Board contemplating a strategic
investment fund for initiatives that will have a long-term benefit to
Evergreen, this might be an avenue for getting PostgreSQL experts to assist
us on areas that represent particular bottlenecks (beyond helping us out of
the goodness of their own hearts). As well, I invited the PostgreSQL
community to join in advocacy efforts to get their local libraries to
consider adopting Evergreen.</li>
<li>I described, at a high-level, many of the PostgreSQL features that Evergreen
relies on (full-text search, stored procedures, Hstore, inheritance) and
tried to convey why our schema takes up 355 tables (and counting) to deal
with what, from outside a library perspective, must seem like a relatively
simple problem to deal with. And of course I gave most of the credit for
Evergreen’s PostgreSQL-savviness on multiple levels to Mike Rylander.</li>
</ul>
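<p>Since I mentioned per-table statistics above, here is a sketch of what that
looks like in practice; the table is a real Evergreen one, but the column and
target value are purely illustrative:</p>
<pre class="literal-block">
# raise the sampling target for one column, then refresh the statistics
psql -d evergreen -c "ALTER TABLE actor.usr ALTER COLUMN home_ou SET STATISTICS 500;"
psql -d evergreen -c "ANALYZE actor.usr;"
</pre>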
<p>The talk was well-received, based on a number of people who approached me
afterward to continue the discussion. Josh called it one of the first times he
had seen a presentation designed to solicit assistance directly from the
developers in attendance (I probably overplayed the "help us poor harried
library system administrators" hand) and thought that it hit the mark for a
case study; similarly, Simon was quite interested in Evergreen’s adoption
patterns with (I suspect) an eye towards offering possible consulting in
administration and optimization efforts.</p>
<p>On the "immediate takeaways" from that talk:</p>
<ul class="simple">
<li>For straightforward connection pooling, pgbouncer is the current
recommendation over the more flexible but more complicated pgpool-II.</li>
<li>Recent versions of Slony have lifted limitations that bit us in the
past, like the inability to replicate a TRUNCATE command.</li>
<li>Solr, as a potential alternative to PostgreSQL’s full-text search, is
seen as fast but brittle to manage, and adds overhead to maintain
consistency with the contents of the database. (I’m not so sure about the
brittleness, given HathiTrust’s ability to run a massive Solr index, but it
is worth following up on…)</li>
<li>Streaming replication in 9.1 has improved significantly over 9.0,
although you’ll still want to have WAL archiving in case of disaster.</li>
</ul>
<p>I have a lot more to say about the intersection of the PostgreSQL and Evergreen
communities in general, but on the whole I think that a closer relationship has
been long overdue. I was delighted that Ben Shum and Robin Isard were both able
to attend the conference, and I firmly believe that building more PostgreSQL
development and administration expertise within the Evergreen community is
critical to our long-term success. While I have long been an advocate of
pointing community members to the documentation of the underlying
infrastructure components for specific administration recommendations, I
believe that effective PostgreSQL tuning and administration is so critical to
the successful implementation of a production Evergreen site that we should add
a section to the Evergreen documentation containing a small set of
considerations and/or processes for going into production—and I hope to start
that relatively soon.</p>
The State / Stats of Evergreen development: 2011-20122012-04-30T00:56:00-04:002012-04-30T00:56:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2012-04-30:/the-state-stats-of-evergreen-development-2011-2012.html<p>On Thursday, April 26, I was part of <strong>The State of Evergreen</strong> talk,
organized by Grace Dunbar, that also included sections by the dynamic
combo of Kathy Lussier, Ben Hyman, and <a class="reference external" href="http://tararobertson.ca">Tara
Robertson</a>. We opened the <a class="reference external" href="http://evergreen2012.org/">Evergreen 2012
conference</a> and led into the day's
featured keynote speaker <a class="reference external" href="http://jonobacon.org">Mr. Jono Bacon …</a></p><p>On Thursday, April 26, I was part of <strong>The State of Evergreen</strong> talk,
organized by Grace Dunbar, that also included sections by the dynamic
combo of Kathy Lussier, Ben Hyman, and <a class="reference external" href="http://tararobertson.ca">Tara
Robertson</a>. We opened the <a class="reference external" href="http://evergreen2012.org/">Evergreen 2012
conference</a> and led into the day's
featured keynote speaker <a class="reference external" href="http://jonobacon.org">Mr. Jono Bacon</a> (who,
by the way, gave a good talk about community at an important time in
Evergreen's growth).</p>
<p>My assigned mission was, with a time limit of 5 minutes, to give the
audience an update on the progress in Evergreen development since the
2011 conference. Naturally, I turned to
<a class="reference external" href="http://code.google.com/p/gource/">gource</a> to generate a
visualization of the <a class="reference external" href="http://archive.org/details/Evergreen2011-2012SourceCodeVisualization">changes committed to the Evergreen git
repository</a>
since April 2011.</p>
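<p>If you want to try this at home, the recipe is roughly the following sketch;
the <tt class="docutils literal"><span class="pre">--start-date</span></tt> option and
the ffmpeg settings are from memory, so check <tt class="docutils literal">gource
<span class="pre">--help</span></tt> against your version:</p>
<pre class="literal-block">
# render a year of git history and pipe the frames through ffmpeg
gource --start-date "2011-04-01" -s 0.5 -1280x720 --title "Evergreen 2011-2012" -o - | ffmpeg -y -r 60 -f image2pipe -vcodec ppm -i - -vcodec libx264 evergreen.mp4
</pre>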
<p>With the visualization running in the background, I ran over the
following numbers (<em>statistics</em> is probably too strong a word) with
the audience...</p>
<hr class="docutils" />
<div id="header" class="slide"><p class="rubric" id="state-of-evergreen-development-2012">State of Evergreen Development, 2012</p>
<div class="line-block">
<div class="line">Dan Scott</div>
</div>
<p></div><div id="preamble" class="slide"><div class="sectionbody" style="max-width:45em"><div class="paragraph"><p>Let’s go with <strong>*Stats</strong> of Evergreen development*</p>
</div>
<p></div>
<p></div><div class="sect1 slide"><p class="rubric" id="code-contributors">Code contributors</p>
<div class="sectionbody" style="max-width:45em"><div class="paragraph"><p>Over the past year, we have seen:</p>
</div><ul>
<li><div class="first"></p></div><p>2209 commits from a total of <strong>29</strong> different authors (8 active core
committers)</p>
<p></li>
<li><div class="first"></p></div><p>9 contributors outside of the core committer group with 5 or more
commits:</p>
<ul>
<li><div class="first"></p></div><p><em>Jason Stephenson</em> - 48</p>
<p></li>
<li><div class="first"></p></div><p><em>Michael Peters</em> - 26</p>
<p></li>
<li><div class="first"></p></div><p><em>Scott Prater</em> - 20</p>
<p></li>
<li><div class="first"></p></div><p><em>Joseph Lewis</em> - 19</p>
<p></li>
<li><div class="first"></p></div><p><em>James Fournie</em> - 16</p>
<p></li>
<li><div class="first"></p></div><p><em>Robin Isard</em> - 12</p>
<p></li>
<li><div class="first"></p></div><p><em>Liam Whalen</em> - 6</p>
<p></li>
<li><div class="first"></p></div><p><em>Ben Shum</em> - 6</p>
<p></li>
<li><div class="first"></p></div><p><em>Steven Callender</em> - 5</p>
<p></li>
</ul>
</p>
<p></li>
<li><div class="first"></p></div><p>One female contributor - <em>Sarah Chodrow</em> (More, please!)</p>
<p></li>
</ul>
<div class="paragraph"><p><a class="reference external" href="http://archive.org/details/Evergreen2011-2012SourceCodeVisualization">Source code
visualization</a></p>
</div>
<p></div>
<p></div><div class="sect1 slide"><p class="rubric" id="features">Features</p>
<div class="sectionbody" style="max-width:45em"><ul class="simple">
<li>Autosuggest for searches</li>
<li>TPAC - a sane, fast, functional catalogue
- Print & email & SMS record details
- Opt-in circulation & hold history</li>
<li>Authentication proxy - with example support for LDAP authentication in JSPAC</li>
<li>Custom library hierarchies, library visibility, and copy location
groups</li>
<li>Staff client enhancements: secondary sorting columns, row numbers,
double-clickery, configurable toolbars</li>
<li>Patron statistical categories: defaults, freetext control,
required-ness</li>
<li>Acquisitions, MARC Batch Import/Export, and serials UI enhancements</li>
<li>Circulation limits</li>
</ul>
<p class="rubric" id="policies-and-procedures">Policies and procedures</p>
<div class="sectionbody" style="max-width:45em"><ul>
<li><div class="first"></p></div><p><em>Master is always stable</em></p>
<ul>
<li><div class="first"></p></div><p>To avoid time-wasting regressions, every commit must be reviewed</p>
<p>and tested by a second developer</p>
<p></li>
</ul>
</p>
<p></li>
<li><div class="first"></p></div><p><em>Timed releases</em> - for predictability</p>
<ul>
<li><div class="first"></p></div><p>One major release every six months, starting with 2.2.0</p>
<p></li>
<li><div class="first"></p></div><p>Patch releases - no timed policy as of yet</p>
<p></li>
</ul>
</p>
<p></li>
<li><div class="first"></p></div><p><em>Community support policy</em></p>
<ul>
<li><div class="first"></p></div><p>Each major release gets 12 months of full support, followed by 3</p>
<p>months of security patches</p>
<p></li>
<li><div class="first"></p></div><p>Therefore, sites should plan on one major upgrade per year</p>
<p></li>
</ul>
</p>
<p></li>
<li><p class="first">Database upgrade script sanity</p>
</li>
</ul>
<p class="rubric" id="communication">Communication</p>
<ul>
<li><p class="first"><a class="reference external" href="http://libmail.georgialibraries.org/mailman/listinfo/open-ils-dev">Developer mailing
list</a>
- 970 messages</p>
</li>
<li><p class="first"><a class="reference external" href="http://evergreen-ils.org/irc.php">Internet relay chat (IRC)
channel</a>
- 76,476 lines <a class="reference external" href="http://goo.gl/E0fxd">and other stats</a></p>
<blockquote>
<ul class="simple">
<li><strong>tsbere</strong> and <strong>dbs</strong> in a neck-and-neck race with 13,474 and 12,062 lines, respectively</li>
<li>26 people averaged more than one lines per day</li>
</ul>
</blockquote>
</li>
<li><p class="first"><a class="reference external" href="http://evergreen-ils.org/dokuwiki/doku.php?id=dev:meetings">Developer IRC
meetings</a>
- 19 meetings held</p>
</li>
</ul>
<p class="rubric" id="documentation">Documentation</p>
<div class="sectionbody" style="max-width:45em"><div class="paragraph"><p>Since last year:</p>
</div><ul>
<li><div class="first"></p></div><p>12 meetings</p>
<p></li>
<li><div class="first"></p></div><p>200 commits, covering 2.0, 2.1, and 2.2</p>
<p></li>
<li><div class="first"></p></div><p>Conversion from DocBook to AsciiDoc</p>
<p></li>
<li><div class="first"></p></div><p>Single sourcing install documentation and release notes</p>
<p></li>
</ul>
<div class="paragraph"><p>Kudos to:</p>
</div><ul>
<li><div class="first"></p></div><p>Karen Collier for direction and organization</p>
<p></li>
<li><div class="first"></p></div><p>Robert Soulliere for tirelessly formatting and publishing</p>
<p></li>
<li><p class="first">Yamil Suarez for picking up the torch from Karen</p>
</li>
<li><p class="first">Many other members of the Documentation Interest Group (<em>DIG</em>)</p>
</li>
</ul>
<p class="rubric" id="releases">Releases</p>
<div class="sectionbody" style="max-width:45em"><ul class="simple">
<li><strong>2.0 series</strong></li>
</ul>
<blockquote>
<ul>
<li><p class="first"><em>April 2011</em> - 2.0.5</p>
</li>
<li><p class="first"><em>May 2011</em> - 2.0.6</p>
</li>
<li><p class="first"><em>June 2011</em> - 2.0.7</p>
<p></li>
<li><div class="first"></p></div><p><em>August 2011</em> - 2.0.8, 2.0.9</p>
<p></li>
<li><div class="first"></p></div><p><em>October 2011</em> - 2.0.10, 2.0.10a</p>
<p></li>
</ul>
</p>
<p></blockquote>
<ul>
<li><div class="first"></p></div><p><strong>2.1 series</strong></p>
<ul>
<li><div class="first"></p></div><p><em>October 2011</em> - 2.1.0, 2.1.0a</p>
<p></li>
<li><div class="first"></p></div><p><em>November 2011</em> - 2.1.1</p>
<p></li>
</ul>
</p>
<p></li>
<li><div class="first"></p></div><p><strong>2.2 series</strong></p>
<ul>
<li><div class="first"></p></div><p><em>November 2011</em> - 2.2 alpha1</p>
<p></li>
<li><div class="first"></p></div><p><em>March 2012</em> - 2.2 alpha2, 2.2 alpha3</p>
<p></li>
<li><div class="first"></p></div><p><em>April 2012</em> - 2.2 beta1, 2.2 beta2</p>
<p></li>
</ul>
</p></li>
</ul>
</div>
<p></div>Why I donated to the Software Freedom Conservancy2011-12-26T14:15:00-05:002011-12-26T14:15:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2011-12-26:/why-i-donated-to-the-software-freedom-conservancy.html<p>A few days ago I made a small donation to the <a class="reference external" href="http://sfconservancy.org">Software Freedom Conservancy</a>, a 501(c)(3) non-profit organization registered
in the United States. There are many organizations to which I could have
donated, and indeed Lynn and I have donated to a number of charities again this
year …</p><p>A few days ago I made a small donation to the <a class="reference external" href="http://sfconservancy.org">Software Freedom Conservancy</a>, a 501(c)(3) non-profit organization registered
in the United States. There are many organizations to which I could have
donated, and indeed Lynn and I have donated to a number of charities again this
year, but I felt it was important to direct some funds to the Conservancy for a
number of reasons - which I will attempt to describe and hopefully convince you
as well.</p>
<p>First, for those who know that the Evergreen open source integrated library
system is a member project of the Conservancy and the project on which I
invest much of my professional and personal time, an obvious question might be:
"Why didn't you just <a class="reference external" href="http://evergreen-ils.org/sfc.php">donate to
Evergreen?</a>". Donating to Evergreen does result in a small percentage of those
funds being directed to the Conservancy. Currently, Evergreen directs 5% of its
income to the Conservancy, but I feel that even with $20,000 passing through
the project's hands for the purposes of the 2012 Evergreen conference, that
$1,000 that goes to the Conservancy is far below the value our project has
received in return in the form of Conservancy services. One of those services
is the provision of a trusted third-party home for project assets such as the
aforementioned finances, but also including domain names, trademarks, logos,
and (if desired) copyright. While distributed ownership of these assets is not
a problem for projects when everything is going fine, personal disputes, a
change of business strategy, or new ownership of a contributing company can
lead to severe difficulties for a project. Evergreen's sister project, <a class="reference external" href="http://koha-community.org">Koha</a>, found itself forced to change its domain name
and fight trademark battles over its very name when one company adopted an
aggressive business strategy.</p>
<p>Another service from which Evergreen has thus far derived great benefit is
access to legal counsel familiar with software freedom issues. In September the
Conservancy <a class="reference external" href="http://sfconservancy.org/news/2011/sep/30/general-counsel/">added Tony Sebro as General Counsel</a> to offer basic
legal assistance to its member projects. The Conservancy was most recently
involved in a discussion about Evergreen documentation licensing that evolved
from an unfortunately adversarial position to, shortly after the Conservancy
became involved, a mutually satisfactory agreement. I believe this result was
due not only to the Conservancy's legal expertise and familiarity with the specific
licenses in question and the general mechanism of granting licenses, but also
to their ability to understand the goals of the project and its participants
in helping to guide all parties to their desired goals.</p>
<p>The Conservancy also has a wealth of experience to draw upon to offer guidance
and expertise on many matters that free software projects have in common, but which
each project tends to rediscover on its own. For example, the Evergreen project
has been able to run conferences on an annual basis for the past three years,
but has historically relied on Equinox's willingness to assume the financial
risks when signing venue contracts. This year, due to the positive results of
the previous conferences, the Conservancy was able to provide the deposit for
the Evergreen 2012 conference in Indiana. While personally I deeply appreciate
the role that Equinox has played in helping to build such a core part of our
community experience, it is an important step for our project that the
Conservancy be able to assume this role.</p>
<p>In addition, the Conservancy's experience with various conference management
packages and the payment fees associated with online financial services such as
Google Checkout and PayPal provided some important guidance early on in the
Evergreen conference 2012 planning process. That advice probably paid for
itself!</p>
<p>I expect that the Evergreen project will continue to benefit from our
membership in the Software Freedom Conservancy as we work towards a mechanism
for electing members of the Evergreen Oversight Board and continue growing and
evolving the project. The $1,000 or so that the Conservancy has earned as a
result of the 5% of revenue that Evergreen directs its way is far below the
value that we have derived from our relationship thus far, and that is why I
have chosen to donate to the Conservancy again this year.</p>
<p>P.S. Because the Conservancy is a 501(c)(3) non-profit, donations to it are
tax-deductible for American citizens. As a Canadian, this particular benefit does not apply to
me - however, the rest of the benefits that the Conservancy provides to free
software projects are international in scope and deserve to be supported.</p>
Current state of academic reserves support for Evergreen2011-09-08T03:09:00-04:002011-09-08T03:09:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2011-09-08:/current-state-of-academic-reserves-support-for-evergreen.html<p>One of the relatively frequent questions that I run into with Evergreen
is "Does Evergreen have an academic reserves module?" And the answer is:
well, yes, and no. There is no official academic reserves module that is
part of the standard Evergreen package that you download and install
from <a class="reference external" href="http://evergreen-ils.org">http://evergreen-ils.org</a>.</p>
<p>However, I am aware of two free-and-open-source modules that are
available as extensions to Evergreen:</p>
<ol class="arabic">
<li><p class="first">A relatively simple, straightforward module, written by my colleague
Kevin Beswick, is in use at Laurentian University and recently was
adopted by the <strong>emily carr university of art + design</strong>. It builds
on Evergreen's bookbags feature to organize reserves of physical
items by class code and instructor name. The module for that code--a
mix of PHP, Dojo, and SQLite--is available on
<a class="reference external" href="https://github.com/kbeswick/library/tree/master/reserves">Github</a>,
and you can see it in action at <a class="reference external" href="http://biblio.laurentian.ca/reserves/">Laurentian
University</a>.</p>
<ul class="simple">
<li><strong>UPDATE 2012-12-21</strong>: See the version I forked at
<a class="reference external" href="https://github.com/dbs/library/tree/master/reserves">https://github.com/dbs/library/tree/master/reserves</a> with updates
supporting TPAC integration</li>
</ul>
</li>
<li><p class="first"><strong>Syrup</strong> is a more sophisticated reserve system (you know it's a
serious project when it has a name!), which supports all kinds of
features - such as mixes of electronic and physical materials,
organizing course content by arbitrary groupings (e.g. readings per
week), limiting user access to the content of specific courses based
on LDAP integration, and much much more. You can see a running
instance at the <a class="reference external" href="http://reserves.uwindsor.ca/syrup/browse/">University of
Windsor</a> and the code
(primarily written in Python) is freely available from the <a class="reference external" href="http://git.evergreen-ils.org/?p=Syrup.git;a=summary">Syrup git
repository</a>
on Evergreen's git server. If you need help getting up and running,
Syrup's <a class="reference external" href="http://groups.google.com/group/syrup-reserves-discuss">mailing
list</a> is
probably a good place to start.</p>
</li>
</ol>
<p>So, there are at least two choices for academic reserves for Evergreen.
Go ahead and pick the one that meets your needs!</p>
The wonderful new OpenLibrary Read API and Evergreen integration2011-06-02T20:06:00-04:002011-06-02T20:06:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2011-06-02:/the-wonderful-new-openlibrary-read-api-and-evergreen-integration.html<p>Back in early May, I was in San Francisco for Google I/O. I had booked an extra
day with the hopes of either doing some sightseeing or meeting up with the
<a class="reference external" href="http://openlibrary.org">OpenLibrary</a> team. After firing off an email to
find out if anyone there was interested in working on some tighter integration
between OpenLibrary and Evergreen, the answer from George Oates was an
enthusiastic "Yes!". So, we spent a beautiful sunny day inside the Internet
Archive headquarters discussing possible directions for this integration.
Alcatraz, you can wait for my next trip...</p>
<p>As it turned out, the timing was great. I had spent a day hacking on the
OpenLibrary "added content" module for Evergreen during the Evergreen hackfest
(which I spent in an airport due to an eight-hour fog delay... different
story), so I was quite familiar with the existing OpenLibrary Book API and
their patterns of use were fresh in my brain. The biggest problem with the
existing Book API, from my perspective, was that I had to make two calls for
each work that I was interested in retrieving information about; one call
returned the <em>data</em> (stable elements) and one call returned the <em>details</em>
(unstable, but quite interesting elements like the table of contents, excerpts,
etc).</p>
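<p>For the curious, that two-call pattern looked roughly like the following
sketch (the <tt class="docutils literal">bibkeys</tt>, <tt class="docutils literal">jscmd</tt>, and <tt class="docutils literal">format</tt> parameters are the Book
API's real knobs, but the ISBN here is purely illustrative):</p>
<pre>import json
import urllib.request

BOOKS_API = "https://openlibrary.org/api/books"

def books_api_call(isbn, jscmd):
    # One HTTP round trip per jscmd value: "data" for the stable
    # elements, "details" for the unstable-but-interesting ones.
    url = "%s?bibkeys=ISBN:%s&jscmd=%s&format=json" % (BOOKS_API, isbn, jscmd)
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read())

isbn = "0596156715"                         # illustrative ISBN
stable = books_api_call(isbn, "data")       # call #1
unstable = books_api_call(isbn, "details")  # call #2</pre>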
<p>The OpenLibrary team had this in their sights as well - but they wanted to
tackle a bigger target. Rather than making one or more calls per work, they
wanted to expose an API that would let users request info for multiple works in
one shot: the <em>Shotgun API</em> (known amongst more polite company as the <em>Read</em>
API). Loosely modelled on the Hathitrust API, it would also focus on exposing
URLs for reading or borrowing exact matches or similar editions (using the
relatively recent OpenLibrary borrowing program). It sounded great, and we
spent the afternoon fleshing out how we wanted it to look and work. My role was
largely that of the third-party developer - the API customer - and we had great
discussions.</p>
<div class="section" id="working-code-wins">
<h2>Working code wins</h2>
<p>Of course, discussions are one thing, and working code is another. OpenLibrary
developer Mike McCabe was riding shotgun on the development of the Read API,
and once he had enough working code in place, he contacted me to ask me to
start developing against it. It was the usual development process: I started
with a hard-coded sample JSON output, then as Mike pushed more functionality
into a server environment I was able to test and expand my client-side code.</p>
<p>So where are we now? I can vouch that working with the all-in-one Read API, as
a developer, is sweet. All of the data elements are readily visible in sweet,
sweet JSON, in a single call, and it is utterly simple to pull the bits that
you want to expose. I had been trying to pull together ebook links and the like
from the Books API, and the use of the <tt class="docutils literal">items</tt> list makes that absolutely
painless for developers. Kudos!</p>
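<p>To give a flavour of what that looks like from the client side, here is a
minimal sketch of a multi-work lookup (the request format - identifiers joined
with semicolons within a request, and pipes between requests - reflects my
understanding of the Read API at the time of writing; the identifiers
themselves are just examples):</p>
<pre>import json
import urllib.request

READ_API = "https://openlibrary.org/api/volumes/brief/json/"

def read_api_multi(requests):
    # Each inner list holds the identifiers for one work; the whole
    # batch goes out in a single HTTP call.
    path = "|".join(";".join(ids) for ids in requests)
    with urllib.request.urlopen(READ_API + path) as response:
        return json.loads(response.read())

results = read_api_multi([
    ["isbn:0385472579"],
    ["lccn:50006784", "olid:OL6179000M"],
])
for request_key, match in results.items():
    # "records" carries the matching metadata; "items" carries the
    # read/borrow URLs along with their availability status.
    for item in match.get("items", []):
        print(request_key, item.get("status"), item.get("itemURL"))</pre>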
<p>Evergreen has a largely rewritten OpenLibrary added content module built
against the Read API sitting in the Evergreen working repository
<tt class="docutils literal"><span class="pre">user/dbs/openlibrary-read-api</span></tt> branch. As the <strong>Borrow</strong> and <strong>Read</strong>
functions depend on IP address range matching, I have added the ability to
proxy the Read API requests via the Evergreen server - so that if an Evergreen
institution has special access rights to the OpenLibrary collection, their
patrons will see the appropriate levels of access in the catalogue. Oh yes, the
catalogue; as we were already using OpenLibrary by default for cover art,
tables of content, and excerpts in Evergreen since the 2.0 release, the major
difference that will be visible to Evergreen users will be in search results:</p>
<p><a class="reference external image-reference" href="/uploads/files/openlibrary-evergreen.png"><img alt="Search results showing OpenLibrary Read integration" class="serendipity-image-left" src="/uploads/files/openlibrary-evergreen.serendipityThumb.png" style="width: 110px; height: 66px;" /></a></p>
<p>As you can see, if you have left the <tt class="docutils literal">OpenLibraryLinks</tt> variable turned on in
the <tt class="docutils literal">result_common.js</tt> file, Evergreen will search for a matching record in
OpenLibrary and tell you if an online version is available. It tells you
whether the online version is an exact match, or similar, and will also expose
items that you can borrow from OpenLibrary. Given the preponderance of print
materials that still remains in our collections, and our users' general
preference for anything electronic, I think this will be an extremely popular
feature.</p>
</div>
<div class="section" id="moving-forward">
<h2>Moving forward</h2>
<p>There are a number of areas that could use more polish and tender loving
care.</p>
<p>First and foremost, OpenLibrary supports matching based on ISBNs, LCCNs,
OCLC numbers, and OpenLibrary IDs; right now, the Evergreen support is based
strictly on ISBNs, which of course don't exist for many of the older materials
in our collections. So a fruitful direction would be to take the regular dump
of data that OpenLibrary thoughtfully provides (yay for open data) and use that
to augment our records to include OpenLibrary ID numbers to use as a match
point.</p>
<p>There is the small matter of merging these changes back into Evergreen
proper.</p>
<p>I developed against the Evergreen 2.0 branch because I wanted to be able to put
this code into production as soon as possible, so there will be a tiny bit of
merging pain to get this into master and backported properly. However, the
changes are quite localized and should be agreeable, so hopefully this will not
sit in a branch for too long.</p>
<p>At this early stage in the Read API's release, I have also found that it can be
a bit slow to respond to requests containing a number of identifiers (or
perhaps a large number of records and items). It is to be expected that
functionality comes first and optimization comes later, so I have great hopes
for improved performance once the Read API settles down.</p>
<p>Of course, once you have the Read API, you need a Write API - and I hope to be
able to help pilot that as well, because the potential communal benefit of a
Write API for library systems that have integrated with OpenLibrary is huge.
Imagine a system where, when you ask for added content based on a given
identifier, if the system says "Huh, I don't know anything about that
identifier" it follows up with "Hey, can you POST what you know about it to
this URL?".</p>
<p>OpenLibrary could then run its algorithms and either add an edition to an
existing work or generate a new work. We should also be able to expose
OpenLibrary's metadata editing tools for our users, so they can flag bad cover
art, or add a table of contents to works that they are passionate about, or
post a favourite excerpt... Enabling a bi-directional give and take between
systems has the potential to quickly make OpenLibrary a huge knowledgebase of
open data. It would be a great boon for libraries, and I hope we can make it
happen.</p>
<p><strong>Update 2011-06-02 21:54 EDT</strong>: The omission of Mike McCabe's name has been
corrected. Also, I forgot to thank my employer, Laurentian University, and the
University of Windsor for allowing me to invest some of my time on
strengthening Evergreen's ties to OpenLibrary. I believe this is the beginning
of a solid, mutually beneficial partnership!</p>
</div>
Reducing cached content pain after Evergreen upgrades2011-05-23T15:16:00-04:002011-05-23T15:16:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2011-05-23:/reducing-cached-content-pain-after-evergreen-upgrades.html<p>If you have been through an Evergreen upgrade, you know that the days
after the upgrade can be painful. Users complain that the catalogue
doesn't work right, there are mysterious glitches that happen on some
machines and not others (even though the browser and operating systems
are identical on each machine!), rebooting doesn't help... and then
eventually the problem goes away.</p>
<p>The problem isn't all that mysterious, really, it's the result of the
browser caching content. Normally, browser caching is a very positive
experience: when a browser requests a file from a Web server, the Web
server tells it how long the browser should hold onto the file via a
<tt class="docutils literal"><span class="pre">Cache-control</span></tt> directive. This means that if a page on your Web site
is composed of dozens or hundreds of images and CSS and JavaScript files, your
browser doesn't have to download every one of those files on every page
you visit; as long as the file hasn't expired, the browser can just
serve it up from the local cache and only the fresh content needs to be
fetched from the server. It's how the Web works, and it's really
important for performance reasons.</p>
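<p>As a concrete (hypothetical) example, a response carrying headers like the
following tells the browser that it may reuse its cached copy of a stylesheet
for thirty days without asking the server again:</p>
<pre>HTTP/1.1 200 OK
Content-Type: text/css
Cache-Control: public, max-age=2592000</pre>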
<p>However, if your Web server has told your browser to cache files for a
month, and then during that month you upgrade your Web site so that
there are new JavaScript and CSS files that your fresh content depends
on, then you can run into trouble until those cached files expire. And
that is exactly the case that we run into with Evergreen upgrades - only
the problem is amplified by how heavily the Evergreen catalogue (which
is just a Web site) relies on JavaScript for basic operations.</p>
<p>On the user side, you can handle the problem a few ways:</p>
<ul class="simple">
<li>Doing a <em>hard refresh</em> to force the browser to fetch fresh versions
of all the files in its cache. You can force a hard refresh on most
browsers by holding down the <tt class="docutils literal">Shift</tt> key and clicking the
<tt class="docutils literal">Refresh</tt> or <tt class="docutils literal">Reload</tt> button.</li>
<li>Emptying the browser cache.</li>
</ul>
<p>Neither of these user-side approaches is particularly convenient. Doing
a hard refresh may work for one page, but as the user navigates to a
different page that uses different CSS and JavaScript, they will have to
do another hard refresh... and so on, which in the case of Evergreen
means users will have to refresh around a half-dozen different pages
(home page, search results, record details, account, advanced search).
Hard refreshes are also not reliable, as resources fetched by XHR are
not actually refreshed (this is <a class="reference external" href="http://code.google.com/p/chromium/issues/detail?id=37711">a long-standing bug with Chrome and
Firefox</a>).
If you don't know what XHR is, just know that Evergreen uses a lot of
them. And emptying the browser cache is both painful (every browser has
a different way of emptying browser cache) and overkill (you just want
to discard the cache for one site, but most browsers will discard the
cache for every site they have visited).</p>
<p>The "right" solution is to have the server tell the browser to fetch a
new version of the resource. You could change the caching settings to be
very short-lived - for example, change the cache time from one month
down to one day for JavaScript and CSS - but unless you upgrade your
site very frequently, that would mean that 99% of the time your users'
browsers will be making unnecessary requests, and their experience of
your catalogue will be that it is slower to load than other sites on the
Web. Not so good.</p>
<p>The other approach is to change the pathname for the cached resources at
upgrade time so that the browser doesn't find a match in its local cache
and has to fetch the new version. There's some good news: some work has
been going on in the Evergreen 2.1 release to tackle this problem, but
it is not yet complete. And most sites are only looking at moving to 2.0
right now. As it happens, we made the jump from 1.6.1.8 to 2.0.6+
yesterday and boy howdy the browser cache was a problem after the
upgrade, as one would expect. I took a quick stab at identifying the
most likely paths that needed to be refreshed and threw together some
shell commands to "munge" our catalogue skins so that browsers would be
forced to pick up the new versions of the content.</p>
<p>Once the post-upgrade panic subsided, I refactored those commands into <a class="reference external" href="http://git.evergreen-ils.org/?p=contrib/Conifer.git;a=blob;f=tools/migration-scripts/cache-munger.pl;h=aa2a49a030e9b4d9aeb1213562609dc640d3e453;hb=master">a Perl script
named
cache-munger.pl</a>
(well, more precisely, a Perl script that generates shell commands). The
Perl script has two hardcoded variables: a datestamp (which is really
any uniquely identifying string that can appear in a directory name and
URL) and a list of catalogue skins to munge. When you run the script, it
generates a set of shell commands that you should be able to run on your
Evergreen 2.0 instance to force browsers to fetch the new version of
your catalogue's JavaScript and CSS files.</p>
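<p>If you would rather roll your own, the core idea is small enough to sketch
in a few lines of Python (this is an illustration of the approach, not
cache-munger.pl itself; the skin path and datestamp are hypothetical):</p>
<pre>import os
import re

DATESTAMP = "20110523"  # any unique string will do
SKIN_DIR = "/openils/var/web/opac/skin/default"  # hypothetical path

# 1. Move the cacheable assets into datestamped directory names...
for subdir in ("js", "css"):
    os.rename(os.path.join(SKIN_DIR, subdir),
              os.path.join(SKIN_DIR, "%s-%s" % (subdir, DATESTAMP)))

# 2. ...then rewrite every reference in the skin's templates so that
# browsers see brand-new URLs that cannot be satisfied from cache.
for root, dirs, files in os.walk(SKIN_DIR):
    for name in files:
        if not name.endswith((".html", ".xml")):
            continue
        path = os.path.join(root, name)
        with open(path) as fh:
            text = fh.read()
        text = re.sub(r"/(js|css)/", r"/\1-%s/" % DATESTAMP, text)
        with open(path, "w") as fh:
            fh.write(text)</pre>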
<p>Some limitations: I haven't written a script to convert your skins back
to pristine mode (that's mostly a matter of updating the ack-grep
commands and reversing the sed commands). And I haven't written a script
to update a munged set of skins. And, I'm not 100% sure that I've hit
every set of JavaScript and CSS that needs to be refreshed after an
upgrade from 1.6 to 2.0. But it's a reasonable start, in my opinion, and
hopefully it helps inform the Evergreen 2.1 effort so that we can have a
standard, supported, painless means of telling browsers to fetch new
resources as an automatic part of any upgrade in the future.</p>
Authority support in Evergreen 2.02011-04-29T15:09:00-04:002011-04-29T15:09:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2011-04-29:/authority-support-in-evergreen-20.html<p>I'm at the Evergreen 2011 conference in balmy Decatur, Georgia... which wasn't
a sure thing yesterday, given that the day started with an eight hour delay at
the Sudbury airport due to fog - not to mention having to fly through the storm
that spawned a tornado in Alabama. After all that, though, it's great to be back
in the same physical space as the vibrant Evergreen community!</p>
<p>Yesterday afternoon I gave a presentation on <a class="reference external" href="http://bzr.coffeecode.net/eg2011_authorities">Authorities in Evergreen 2.0</a>, covering (as the title
suggests) Evergreen's support for authority records in the 2.0 release (as well
as a peek at the future of Evergreen 2.2).</p>
<p>The session appeared to be well-received - yay! - and I tried recording it on
my colleague Rick Scott's Sansa Clip+. Hopefully that worked out and I'll be
able to update this post with the audio, so you can have the full-on audio and
slide experience.</p>
<p>The presentation is available under the <a class="reference external" href="http://creativecommons.org/licenses/by-sa/2.5/ca/">Creative Commons Attribution Share
Alike license</a>, in the
hopes that others will be able to use it for training purposes, to extend and
improve it, and generally help out with the adoption of Evergreen.</p>
Evergreen's continuous integration servers - past, present, and future2011-03-14T02:13:00-04:002011-03-14T02:13:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2011-03-14:/evergreens-continuous-integration-servers-past-present-and-future.html<p><em>tldr</em> version: the Evergreen project now has a <a class="reference external" href="http://testing.evergreen-ils.org">continuous integration
server and build farm</a> and needs
testcases to make the best use of that infrastructure to help us provide
higher-quality releases in the future.</p>
<div class="section" id="evergreen-buildbot-past">
<h2>Evergreen buildbot - past</h2>
<p>Back in November 2009, Evergreen developer Shawn Boyette <a class="reference external" href="http://markmail.org/message/j6hd6634oimpum6x">launched</a> the Evergreen <em>buildbot</em> -
a continuous integration server that ran basic tests with every commit to the
OpenSRF and Evergreen repositories and created nightly tarballs of the code. It
was a promising start towards a system that would provide us with instant
feedback about the state of our code - at least as much as we had tests for it.
Unfortunately, the server ran for only a few months before disappearing when
Shawn parted ways with Equinox in early 2010.</p>
<p>I always thought it was a shame we had lost this piece of the development
infrastructure, but Equinox had offered accounts on a server for anyone in the
Evergreen community interested in taking on the task of setting up a new
continuous integration test server - and through the rest of 2010, nobody
stepped up to take on that responsibility. Most of us were busy developing and
testing Evergreen 2.0, I suspect. So, in January of 2011, when I had a bit of
breathing room, I scoped out the current state of continuous integration
frameworks and discovered that the <a class="reference external" href="http://buildbot.net">buildbot</a> project
(no relation to Shawn's code, other than a serendipitous name) was written in
Python and therefore was much more approachable to me than the other leading
alternative, Hudson... so I wrote up <a class="reference external" href="http://markmail.org/message/2ke455rplbrpcxuv">my findings and a quick proposal</a>.</p>
</div>
<div class="section" id="evergreen-buildbot-present">
<h2>Evergreen buildbot - present</h2>
<p>A few days later I had the buildbot running on the <a class="reference external" href="http://testing.evergreen-ils.org">server provided by Equinox</a>, providing reports on the status of the
OpenSRF builds on Ubuntu Lucid. After putting out a call to the community for
build servers to provide coverage for Evergreen on different operating systems,
I had enough responses to focus my mind on improving the Evergreen build.
Evergreen now has the same standard layout for Perl modules that we adopted a
year ago for OpenSRF, along with some basic sanity tests in Perl (such as are
there any syntax errors in this module?).</p>
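<p>For anyone curious about what drives those builds, a build factory in the
master configuration looks roughly like this (a sketch against the buildbot
0.8-era API; the repository URL and steps are illustrative rather than a copy
of our actual master.cfg):</p>
<pre># Sketch of a buildbot build factory: check out OpenSRF and run the
# configure step plus the basic sanity tests on every commit.
from buildbot.process.factory import BuildFactory
from buildbot.steps.source import Git
from buildbot.steps.shell import ShellCommand

factory = BuildFactory()
factory.addStep(Git(repourl="git://git.evergreen-ils.org/OpenSRF.git",
                    branch="master"))
factory.addStep(ShellCommand(command=["autoreconf", "-i"],
                             description="autoreconf"))
factory.addStep(ShellCommand(command=["./configure"],
                             description="configure"))
factory.addStep(ShellCommand(command=["make", "check"],
                             description="run tests"))</pre>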
<p>So, thanks to <a class="reference external" href="http://esilibrary.com">Equinox</a> for providing the testing
server that serves as the mothership for controlling all of the build tests.
And many thanks to the University of Prince Edward Island Robertson Library and
the <a class="reference external" href="http://georgialibraries.org">Georgia Public Library Service</a> for providing
build servers for the build farm. We now have Evergreen test coverage on the
Ubuntu Lucid and Debian Squeeze Linux distributions (huzzah) and OpenSRF test
coverage on Ubuntu Lucid. If you have an interest in getting test coverage for
a different distribution and have a server to spare, please feel free to
<a class="reference external" href="mailto:dan@coffeecode.net">contact me</a> and we can get your server
<a class="reference external" href="http://evergreen-ils.org/dokuwiki/doku.php?id=dev:testing:buildbot#setting_up_the_buildbot_slave">added to the build farm</a>.</p>
</div>
<div class="section" id="checking-build-status">
<h2>Checking build status</h2>
<p>You can check the current state of the code for various OpenSRF and Evergreen
branches at any time by visiting the <a class="reference external" href="http://testing.evergreen-ils.org">Evergreen buildbot page</a> and choosing one of the menu options.</p>
<p><a class="reference external" href="http://testing.evergreen-ils.org/buildbot/one_line_per_build">Recent builds</a> provides
a simple list of the success or failure of the 20 most
recent builds.</p>
<p><a class="reference external" href="http://testing.evergreen-ils.org/buildbot/waterfall">Waterfall</a>, on the
other hand, provides the detailed status of every tested combination of Linux
distribution and code branch.</p>
</div>
<div class="section" id="evergreen-buildbot-future">
<h2>Evergreen buildbot - future</h2>
<p>We still have work to do to deliver on the promise of the buildbot. Most
important, I think, is that a continuous integration server can only run the
tests that it has been given - and we have not given it many tests.</p>
<p>It kills me that people discovered some fairly fundamental problems with the
Evergreen 2.0 release (some recent examples include most identifier searches
not working and limitations with Unicode in patron names). Now that we have a
continuous integration server, we need a testing framework so that it becomes
easy to add tests along the lines of "Import a set of sample bibliographic
records, then run the following sets of searches (ISSN and ISBN with and
without hyphens; EAN; UPC...) and ensure that the returned results match these
expected results". It should be a human's job to set up that automated test
<em>once</em> so that we're forever confident in the future that we're not screwing up
those basic features, no matter what we change in our database schema or
underlying code.</p>
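<p>To make that concrete, here is the shape of the kind of testcase I have in
mind, as a hypothetical Python sketch (the <tt class="docutils literal">opac_search</tt> helper and the
expected record IDs are placeholders for whatever the eventual framework
provides):</p>
<pre>import unittest

def opac_search(search_class, term):
    # Placeholder: the eventual framework would load the sample
    # bibliographic records and run a real catalogue search here.
    raise NotImplementedError

class IdentifierSearchTest(unittest.TestCase):
    # Hypothetical record IDs that the sample data set would contain.
    EXPECTED = [42]

    def test_isbn_with_and_without_hyphens(self):
        for isbn in ("978-0-19-852011-5", "9780198520115"):
            self.assertEqual(opac_search("isbn", isbn), self.EXPECTED)

if __name__ == "__main__":
    unittest.main()</pre>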
<p>Now, there are very few people that can currently create that sort of a test.
There might be none at the moment, in fact, because we need that previously
mentioned testing framework to be sorted out and integrated into the buildbot.</p>
<p>However, in the short term we <em>can</em> create these testing scenarios so that
humans can reproduce them during testing blitzes, until such time as we have
the testing framework sorted out and can begin automating these tests.</p>
<p>Otherwise, I fear that we'll go into the Evergreen 2.1 alpha/beta/release
candidate cycle and get reports from testing that indicate that all is well -
but only because some of the more complex tasks haven't actually been attempted
- and we'll find ourselves scrambling once again after the release to fix
problems that become evident when sites actually start moving to the release.</p>
<p>Beyond tests, we need to teach it to create cleanly packaged tarballs on a
regular basis - although that should arguably be nothing more than, or not much
more than, the equivalent of running <tt class="docutils literal">make package</tt> rather than pushing all
kinds of specialized packaging logic into the buildbot itself.</p>
<p>Autotools wizards, your assistance would be greatly appreciated.</p>
</div>
<div class="section" id="spreading-evergreen-buildbot-knowledge">
<h2>Spreading Evergreen buildbot knowledge</h2>
<p>To ensure that our project can survive the loss of the current master build
server (or me, for that matter!), I've been committing a password-sanitized
copy of the buildbot configuration to the examples directory of the OpenSRF
repository. In addition to reducing the dependency on one person and one
server, it also gives anyone else interested in contributing to the Evergreen
buildbot the ability to easily define a build master and build slaves in a
local environment.</p>
</div>
Evergreen 2.0.0: What it has (and does not have)2011-02-05T15:08:00-05:002011-02-05T15:08:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2011-02-05:/evergreen-200-what-it-has-and-does-not-have.html<p>Back in early 2010, I responded to the call for proposals for the OLA
SuperConference with the following
<a class="reference external" href="http://www.accessola.com/superconference2011/showSession.php?lsession=620&usession=620">proposal</a>
for a session called <strong>Evergreen 2.0: What doesn't it have?</strong>:</p>
<blockquote>
<p>The first release of the Evergreen library system in September 2006
brought circulation, cataloguing, reports, and a modern OPAC.
Evergreen 2.0, expected in early 2011, promises deep support for
acquisitions, serials, telephony, and more. The range of features
will be highlighted and weaknesses exposed.</p>
</blockquote>
<p>Talk about great timing! My talk was accepted and scheduled for February
3rd, and of course Evergreen 2.0.0 was released exactly one week before
that! So not only was I able to accurately predict when Evergreen 2.0
would be available, I was actually able to deliver a presentation based
on reality. I believe I provided a balanced look at Evergreen's current
strengths and weaknesses, and as with my sessions in previous years at
the OLA SuperConference, all of the seats in the room (25 or so) were
filled. There were unfortunately a number of people who poked their
heads in the door and, seeing the lack of available seats, moved on to
some other presentation. So, interest in Evergreen remains strong
amongst the Ontario crowd - and maybe next year I can swing a larger
venue! I was also really fortunate to meet several people after the
session who expressed interest in contributing to the Evergreen project
in various ways; I'm always eager to welcome new members to the
community of Evergreen contributors, so here's hoping that works out
<img alt=":-)" class="emoticon" src="/images/smile.png" /></p>
<p>If you were one of the people who couldn't get a seat, or you're just
interested in catching up with the state of Evergreen at the 2.0.0
release, the presentation itself is available in <a class="reference external" href="http://bzr.coffeecode.net/ola_2011/ola_2011_slidy.html">HTML
form</a> (ahh
Slidy). I have also made the <a class="reference external" href="http://bzr.coffeecode.net/ola_2011/ola_2011_slidy.txt">ASCIIDOC
source</a> and
<a class="reference external" href="http://bzr.coffeecode.net/ola_2011/images">screenshots</a> for the
presentation available in a <a class="reference external" href="http://bzr.coffeecode.net/ola_2011/">Bazaar repository</a>. The
presentation is licensed under the Creative Commons
Attribution-ShareAlike license in the hope that others in the
Evergreen community may find the material useful for learning and
sharing with their own libraries, and may want to fill in some areas
where I may have left gaps (feel free to fork the repository and send
patches my way!). It would be great if we could collectively pull
together a kick-butt presentation for Evergreen advocacy, and I would be
delighted if my material served as a starting point for that effort.</p>
Standard Social Sharing and Aggregation on the Go: Access 2010 presentation2010-10-17T14:44:00-04:002010-10-17T14:44:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-10-17:/standard-social-sharing-and-aggregation-on-the-go-access-2010-presentation.html<p>Earlier this week, I had the honour of speaking at the <a class="reference external" href="http://access2010.lib.umanitoba.ca/">Access 2010
conference</a> in Winnipeg,
Manitoba. The title of my talk was rather unwieldy, but what it boiled
down to was:</p>
<ul class="simple">
<li>An environmental scan of how libraries are currently offering users
of their services the ability to share their thoughts and to connect
with one another around library activities</li>
<li>A brief overview of some relevant emerging standards for
socially-enabled applications (<a class="reference external" href="http://activitystrea.ms">Activity
Streams</a>, <a class="reference external" href="http://gmpg.org/xfn/">XHTML Friends Network
(XFN)</a>, and the HTML5 browser geolocation
API)</li>
<li>Some of my thoughts about how library software could adopt these
standards to knit together experiences across library system
boundaries, and outside of library systems altogether</li>
<li>Some findings from an <a class="reference external" href="http://markmail.org/message/nb2w7fslmsi33x33">initial
implementation</a> of
one of these standards (Activity Streams) in the <a class="reference external" href="http://evergreen-ils.org">Evergreen library
system</a></li>
</ul>
<p>Here are the slides
(<a class="reference external" href="/uploads/talks/2010/social_sharing.odp">OpenDocument</a>,
<a class="reference external" href="/uploads/talks/2010/social_sharing.pdf">PDF</a>)
and the accompanying recording (<a class="reference external" href="/uploads/talks/2010/sharing_talk.ogg">OGG
Vorbis</a>,
<a class="reference external" href="/uploads/talks/2010/sharing_talk.mp3">MP3</a>).
Thanks to Bill Denton for the use of his recorder for the audio!</p>
<p>One quick reflection is that, in the interest of using a familiar
example, I think I focused too much on sharing and aggregating <em>objects</em>
(such as reviews) between libraries and didn't make a good argument for
the value of enabling connections between <em>people</em> based on their
activities.</p>
On avoiding accusations of forking a project2010-09-29T02:16:00-04:002010-09-29T02:16:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-09-29:/on-avoiding-accusations-of-forking-a-project.html<p>Sometimes forking a project is necessary to reassert community control
over a project that has become overly dominated by a single corporation:
see <a class="reference external" href="http://openindiana.org/">OpenIndiana</a> and
<a class="reference external" href="http://www.documentfoundation.org/download/">LibreOffice</a> for recent
examples. And in the world of distributed version control systems,
forking is viewed positively; it's a form of evolution, where
experimental branches that lead to new features or a stabler system or
better performance get grafted back onto the accepted authoritative
branch.</p>
<p>Yet a negative connotation can also be associated with forking a
project, particularly if the word is whispered behind closed doors as an
accusation of the behaviour of one or more parties in the community.
Particularly in a small community, where development resources for a
project built on the principles of software freedom from the ground up
are relatively scarce, the spectre of a development effort based on that
project that is not publicly visible can be troubling and opens the door
to the accusation: <strong>FORK</strong>! Organizations that have staked their
customers' satisfaction, and their own reputation, on free software that
they expected to see flourish as others joined in the development
effort, fret and worry that they'll be left behind with just the base
for another organization's project and no easy way to reconcile the two.</p>
<p>In the Evergreen community, we're fortunate that we're small enough that
we should be able to avoid these concerns. The Evergreen "trunk" code
repository has been hopping; just take a peek at the <a class="reference external" href="http://svn.open-ils.org/trac/ILS/log/trunk">revision
log</a> to see the rather
torrid pace of development. Some <a class="reference external" href="http://evergreen-ils.org/dokuwiki/doku.php?id=faqs:evergreen_roadmap">major
features</a>
are taking shape, such as acquisitions with EDI support, first-class
serials management, outbound telephony, and more - evident in the
<a class="reference external" href="http://evergreen-ils.org/downloads.php">Evergreen 2.0 alpha 3
release</a> that the development
team put together today. This is not a minor release!</p>
<p>And yet, and yet... during the exciting KCLS migration live-blog, Lori
Ayre felt it necessary <a class="reference external" href="http://rscel.evergreen-ils.org/node/1526">to
write</a>:</p>
<blockquote>
<p>The answer is that much of it is already in trunk, and if it isn't
there now, it will be very soon. None of this work is being held
back. There is no KCLS fork. This is all Evergreen and anyone who
knows how to download from trunk will be able to get at this code in
very short order.</p>
</blockquote>
<p>Well, I know that everyone involved with the KCLS enhancements are good
people, and that it is certainly their intention to make any of the
enhancements available, and there is no intention to fork Evergreen. I
know! Ironically enough, however, due to the prior actions of
proprietary companies such as Relais' "<a class="reference external" href="http://www.relais-intl.com/relais/home/Relais%20Open%20Source%20Update%20Feb%208_2010.pdf">announce that we will open
source our ILL product in 2008; freeze the market; announce in 2010 that
maybe we'll have something by the end of
2010</a>"
strategy, the broader library community has become more skeptical and
susceptible to disinformation and FUD. I can't imagine who would want to
sow discontent among the community of a rapidly maturing ILS project,
other than perhaps proprietary competitors who have forgotten how to
compete on the merits of their product rather than
<a class="reference external" href="http://thebookpile.wordpress.com/2010/04/01/sirsidynix-opensource-paper-pdf/">negative</a>
<a class="reference external" href="http://www.galecia.com/sirsidynix-and-the-fud-factor/">marketing</a>.
(Just a guess, mind you!)</p>
<p>Still: until the code for any remaining enhancements is available under
an open source license, the possibility remains that those whispering,
Saruman-like voices could be right. My
suggested remedy, and the easiest way to dispel those concerns, now and
in the future for any project (Evergreen or otherwise), is to simply
develop in the open:</p>
<ul class="simple">
<li>Create a public repository - SVN (<a class="reference external" href="http://svn.open-ils.org/trac/ILS-Contrib">Evergreen
contributions</a>), or
Bazaar (<a class="reference external" href="https://code.launchpad.net/evergreen">Evergreen
LaunchPad</a>), or git
(<a class="reference external" href="http://gitorious.org">Gitorious</a>), or what have you. Put a
README in the top directory of the repository specifying that the
contents are licensed under the "GPL v2 or later" or GPL-compatible
license.</li>
<li>Announce the repository on the Evergreen development mailing list. If
you tuck your repository in an obscure location and don't tell
anybody about it, it might technically be open, but that's not really
the spirit of openness. You're also depriving your effort of possible
collaborators, and possibly duplicating effort if somebody else is
working on the same feature.</li>
<li>Watch the rumours disappear and the fame, glory, and accolades roll
in. (Oh, and don't forget to invite us to integrate the fruit of your
labour into the core of Evergreen!)</li>
</ul>
<p>Sure, there might be some material that you don't want to share:
trademarked institutional logos or the like. But the bulk of what we
collectively create should be able to be openly shared, not just when
things are perfectly baked, but all the way through the process. Release
early, release often, and keep the spooky whisperers at bay.</p>
Responding to the Evergreen "research" article in Information Technology and Libraries2010-09-20T02:25:00-04:002010-09-20T02:25:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-09-20:/responding-to-the-evergreen-research-article-in-information-technology-and-libraries.html<p><strong>Update 2010-09-28</strong>: Fixed link</p>
<hr class="docutils" />
<p>The <a class="reference external" href="http://www.ala.org/ala/mgrps/divs/lita/ital/italinformation.cfm">home page for <em>Information Technology and
Libraries</em></a>
states:</p>
<blockquote>
<p><strong>Information Technology and Libraries</strong> (<em>ITAL</em>) (ISSN
0730-9295) is a refereed journal published quarterly by the Library
and Information Technology Association (LITA), a division of the
American Library Association.</p>
</blockquote>
<p>The September 2010 issue of ITAL contained an article by Sharon Q. Yang
and Melissa A. Hofmann called "The Next Generation Library Catalog: A
Comparative Study of the OPACs of Koha, Evergreen, and Voyager". As an
Evergreen developer, I wonder just how much refereeing happened before
this article was published. Certainly I am biased, but there are a
number of problems with the study from my perspective:</p>
<ol class="arabic">
<li><p class="first">The article stated "The latest releases at the time of the study was
Koha 3.0, Evergreen 2.0, WebVoyage 7.1." Grammatical problems with
that sentence aside, the first alpha release of Evergreen 2.0 was
created on August 23, 2010. For an article published in September
2010, I find it highly unlikely that the authors were able to find
any running instances of this version of Evergreen on which to base
their information. Which leads to a problem with the methodology:</p>
</li>
<li><p class="first">The stated methodology in the article was</p>
<blockquote>
<p>The OPACs used in this study included three examples from each
system. They may have been product demos and live catalogs
randomly chosen from the user list on the product websites. ...
In case of discrepancies between product descriptions and
reality, we gave precedence to reality over claims. In other
words, even if the product documentation lists and describes a
feature, this study does not include it if the feature is not in
action either in the demo or live catalogs.</p>
</blockquote>
<p>This sounds like a thorough, pragmatic approach. But the product
versions associated with each of the chosen examples are not listed.
So while the article mentions the latest releases of each product,
the actual reported experience might be based on an outdated version
of the product. In the case of Evergreen, one of the chosen examples
is two major versions behind the actual current stable release of
1.6.1.2, and another of the chosen examples is one major version
behind the current stable release. In addition, one of the desired
features of modern OPACs is customizability: not just the ability to
turn features on and off, but also the ability to change the user
experience significantly as a small matter of programming. Depending
on which example OPACs were chosen for each system, the features the
authors were looking for might not have been turned on or exposed.</p>
</li>
<li><p class="first">On the "Single Point of Entry for All Library Information" feature,
the authors state:</p>
<blockquote>
<p>While WebVoyage and Evergreen only display journal-holdings
information in their OPACs, Koha links journal titles from its
catalog to ProQuest’s Serials Solutions, thus leading users to
full- text journals in the electronic databases.</p>
</blockquote>
<p>As far as I can tell, however, this is not a special integration
feature of Koha; it appears to just be the use of an 856 with a URL
that points to a link resolver for a lookup of a given ISSN. While it
is a reasonable cataloguing practice, any other library system should
be capable of that; Evergreen certainly is. However, check out this
link for an example of how one can <a class="reference external" href="http://ur1.ca/1oj6f">make an OPAC work harder by
bringing resolver results right into the OPAC
display</a>. I built an Evergreen service,
called <tt class="docutils literal"><span class="pre">open-ils.resolver</span></tt>, for caching resolver requests for ISSNs
and used that service as the basis of <a class="reference external" href="http://www.accessola.com/superconference2010/showSession.php?lsession=8&usession=8">a developer tutorial for
writing Evergreen
services</a>.
The idea isn't new; <a class="reference external" href="http://bibwild.wordpress.com/2007/03/04/online-coveragelink-info-in-your-opac-via-sfx/">Jonathan Rochkind wrote about doing
this</a>
back in 2007. But having a caching server-side implementation freely
available for your library system is relatively novel. We've been
using it since the summer of 2009. If you use Evergreen, then you can
add this feature to your system too; it is written up in the
<a class="reference external" href="http://evergreen-ils.org/~denials/workshop.html">developer
workshop</a> and is
licensed under the GPL v2 or later, but if there's interest I can add
it to Evergreen's core.</p>
</li>
<li><p class="first">On "Enriched Content", the authors found that Evergreen offered only
cover art. Of course, enriched content depends heavily on the content
supplier and the chosen item. Since launching in 2006, Evergreen has
provided enriched content such as cover art, abstracts, author notes,
reviews, and tables of contents from Syndetic (requiring a Syndetic
subscription, of course). In addition, Evergreen has offered Google
Books integration in the form of partial & full previews (if
available) inline in the detail page since Evergreen 1.6.0.0, thanks
to the initial efforts of Alexander O'Neill at the University of
Prince Edward Island. And as of Evergreen 2.0, the default content
provider for cover art and tables of content will be
<a class="reference external" href="http://openlibrary.org/">OpenLibrary</a>. Here's <a class="reference external" href="http://ur1.ca/1ojdy">an
example</a> from our catalogue that brings in
enriched content including cover art and a book review from Syndetic
and a Google Preview.</p>
</li>
<li><p class="first">On "RSS Feeds", the authors make the bold statement: "Koha provides
RSS feeds, while Evergreen and WebVoyage do not". In the case of
Evergreen, that's a laughable statement, because significant parts of
the OPAC are built on RSS feeds. For example, in any Evergreen
system, click on the "Basic catalogue" link and you'll find that it
is nothing more than an RSS feed with a simple search form. If you
are using Internet Explorer or Firefox on an Evergreen site, you
might notice the search source selector widget is highlighted; that's
because Evergreen is an OpenSearch provider, so you can easily add an
Evergreen site to your browser as a search source. The OpenSearch
results, of course, are built on RSS/Atom just like the examples in
the <a class="reference external" href="http://www.opensearch.org/Specifications/OpenSearch/1.1#OpenSearch_response_elements">OpenSearch description
document</a>.
The default format that a user's custom bookbags are exposed in is
also an RSS feed. I suppose the authors didn't find an RSS feed icon
lighting up in the search results in the dynamic Evergreen OPAC and
made the assumption that no RSS feeds were provided. To address this
gap, I have added the one line of JavaScript to the default OPAC skin
that adds the Atom feed link necessary to make the RSS feed icon
light up (see the markup sketch after this list). Not that many humans actually use RSS feeds directly - but
it will help make it easier to find for future feature comparison
articles.</p>
</li>
<li><p class="first">On "Relevancy", while Evergreen does not currently use circulation
data or "popularity" to affect relevancy rankings, I would happily
argue that the out-of-the-box relevancy ranking algorithm is good
enough to keep relevancy as the default sort option, while the
relevancy algorithm of our previous ILS was simply terrible. Combine
that with your ability to <a class="reference external" href="/archives/218-Adjusting-relevancy-rankings-in-Evergreen-1.6,-some-explorations.html">customize the relevancy
algorithm</a>,
and I think an argument could be made that, while "Relevancy has not
worked well in OPACs", it works well in this one.</p>
</li>
</ol>
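<p>To expand on the feed-discovery change mentioned in point 5: a browser's
feed icon lights up when the page advertises a feed via a
<tt class="docutils literal">&lt;link&gt;</tt> element in the document head - markup along these
lines (the feed URL here is hypothetical):</p>
<pre>&lt;link rel="alternate" type="application/atom+xml"
      title="Search results (Atom)"
      href="/opac/extras/opensearch/1.1/-/atom-full?searchTerms=evergreen" /&gt;</pre>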
<p>As the article mentioned the latest release of Evergreen was 2.0, let me
show you a screenshot of the default OPAC in Evergreen 2.0 as of the
upcoming alpha2 release. Notice a few things:</p>
<ul class="simple">
<li>Those facets on the left are much closer to what the world has come
to expect from a faceting interface. You get a narrowing effect on
your current search results, rather than firing off a brand new
search. There are pros and cons to this, but oh well.</li>
<li>Notice that the RSS feed icon is lit up in the URL bar. Yes,
Virginia, Evergreen has RSS feeds for search results, amongst many
other things.</li>
<li>The inline advanced search interface shows that Evergreen 2.0 offers
an <strong>OR</strong> option, and clearly labels the relationships between the
search terms.</li>
<li>The OpenSearch source has been added to the list of Firefox search
sources in the top right box, just by clicking on the icon and
selecting "Add Evergreen catalogue"</li>
</ul>
<div class="serendipity_imageComment_left" style="width: 970px"><div class="serendipity_imageComment_img"><p><img alt="Evergreen 2.0 inline advanced search interface showing AND and OR options" class="serendipity-image-left" src="/uploads/files/Evergreen2_advanced.png" style="width: 970px; height: 709px;" /></p>
</div><div class="serendipity_imageComment_txt"><p>Evergreen 2.0 inline advanced search interface showing AND and OR
options</p>
</div></div>Evergreen on FLOSS Weekly: the aftermath!2010-09-01T02:50:00-04:002010-09-01T02:50:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-09-01:/evergreen-on-floss-weekly-the-aftermath.html<p><strong>Update 2010-09-28</strong>: pedantic XHTML fix</p>
<hr class="docutils" />
<p>The recorded version of the <a class="reference external" href="http://twit.tv/floss132">Evergreen episode of the FLOSS
Weekly</a> show was released over the weekend.
I'm happy to say that Lynn watched it without looking too pained at any
given point, and the Evergreen project has already had several responses
to our plea for assistance so far, particularly on the packaging front,
which is fantastic! Just having one more skilled helping hand makes all
the preparation for and stress about the show worth it.</p>
<p>Several points that amused me about the show as I glanced over Lynn's
shoulder:</p>
<ul class="simple">
<li>In Randal's introduction, he said that I "worked for Coffee|Code",
full-stop. Aside to Leila Wallenius, the University Librarian of
<a class="reference external" href="http://laurentian.ca">Laurentian University</a>: no, there's nothing
I need to tell you, I'm still a full-time employee at the University
and I'm not planning on going anywhere! (That said, <strong>Coffee|Code
Consulting</strong> is a registered sole proprietorship that provides small
blocks of consulting services for Evergreen software in my spare
time).</li>
<li>For the first half of the show, my affiliation was shown as the
(misspelled) <a class="reference external" href="http://coffecode.net">http://coffecode.net</a>. So of course I immediately ran out
and bought that domain.</li>
<li>Co-host Dan Lynch expressed a wish that his own show, <a class="reference external" href="http://linuxoutlaws.com/">Linux
Outlaws</a>, had a guest list like FLOSS
Weekly. Oddly enough, some time ago when the subject of librarians
and their fanatical devotion to open access to information came up on
Linux Outlaws, I had submitted a feedback form on their site saying
(essentially) "hey, if you want to talk to a Linux-loving free
software-developing librarian some time, I'm around..." but I think
that comment went into the ether.</li>
</ul>
<p>If I ever do a video interview like this again, I'm going to try to:</p>
<ul class="simple">
<li>Prop my laptop up on a couple of LCSH books or attach a separate web
cam at a proper height so it doesn't appear that I'm looking down on
the viewer.</li>
<li>Cut down on the "uhm"s and "ahh"s and stare a bit more robotically at
the camera instead of rolling my eyes as I rack my brains to come up
with an answer</li>
<li>Stop rambling and trust the interviewers to take the show in the
direction that their audience will be interested in instead of trying
to jam in points that I think are important or interesting</li>
<li>Ensure that my erstwhile partner in crime has a better Internet
connection</li>
</ul>
Non-stop Evergreen, or "What I'm doing on my summer vacation"2010-08-26T00:40:00-04:002010-08-26T00:40:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-08-26:/non-stop-evergreen-or-what-im-doing-on-my-summer-vacation.html<p>Last week, I started my summer vacation with a weekend at a friend's
cottage. By Tuesday I was deeply engrossed in some Evergreen enhancement
work for the <a class="reference external" href="http://www.iisg.nl/">International Institute of Social
History</a>. I'm building an authorities management
user interface that properly exposes Evergreen's powerful authority
support in the 2.0 …</p><p>Last week, I started my summer vacation with a weekend at a friend's
cottage. By Tuesday I was deeply engrossed in some Evergreen enhancement
work for the <a class="reference external" href="http://www.iisg.nl/">International Institute of Social
History</a>. I'm building an authorities management
user interface that properly exposes Evergreen's powerful authority
support in the 2.0 release: browsing authority lists, editing
authorities and having the updates ripple through to the bibliographic
records with controlled fields, merging and deleting authorities...
here's a screenshot of the interface in progress:
<a class="reference external" href="/uploads/pics/Authority_record_list.png">|image0|</a>.
The numbers represent the number of bibliographic records linked to each
authority record. These are still early days, but I think there are some
cataloguers that are going to be pretty excited about this functionality
when they get their hands on it.</p>
<p>This week, I'm on location in the <a class="reference external" href="http://library.upei.ca/">Robertson Library at the University
of Prince Edward Island</a> doing some
Evergreen consulting work for them. The good people at UPEI have put my
family and me up in a nice cottage on the island, so I'm toiling away at
improving Evergreen during the day while my family explores the island.
Melissa Belvadi and Grant Johnson have put together a list of pain
points that they would like me to address that happen to mesh nicely
with general pain points that have come up over the years on the
Evergreen mailing lists. My first priority has been to make working with
spine labels a little less aggravating. I'm happy to say that after a
day and a half, I've been able to teach the spine label editor how to
(<em>gasp</em>) move up and down with the arrow keys and (<em>ooh-ahh</em>) insert
and delete new lines and (<em>w00t</em>) have the spine label defaults come
from library settings that only have to be set once instead of being
individually set by each cataloguer. Oh, and I've added font size, font
weight, and font family to those settings so that you can have 20 pt.
bold Helvetica spine labels if you want them.</p>
<p>All of this code is being committed to Evergreen trunk as I hit
functionality milestones; much of the authority work has made its way
into the Evergreen 2.0 alpha release that was cut on Monday (although
not yet announced officially). On Monday I also cut the OpenSRF
1.6.0-alpha release and uploaded a virtual image built on Debian Squeeze
reflecting the OpenSRF/Evergreen alpha releases to
<a class="reference external" href="http://evergreen-ils.org/~denials/Evergreen_trunk_2010_08_23.zip">http://evergreen-ils.org/~denials/Evergreen_trunk_2010_08_23.zip</a> (note
that it's 500 MB, and does not come with X installed, so it's primarily
aimed at users that are already familiar with Evergreen and just want to
see the new stuff without having to go through the entire install
process).</p>
<p>I did take some time off of Evergreen development this afternoon, as I
was honoured to be one of the two guests on the <a class="reference external" href="http://twit.tv/floss">FLOSS Weekly
podcast</a>. Mike Rylander and I were there to
discuss Evergreen with the hosts, <a class="reference external" href="http://twitter.com/merlyn">Randal
Schwartz</a> and <a class="reference external" href="http://danlynch.org/">Dan
Lynch</a>. Unfortunately for Mike, me, and the
audience, Mike's Skype connection kept dropping and I had to do the bulk
of the talking. Despite missing the contributions from Mike's massive
brain, I'm told that the show went well. So if you're interested in
hearing a bit about Evergreen and why I do what I do, keep an eye open
for the interview at <a class="reference external" href="http://twit.tv/floss132">http://twit.tv/floss132</a> - it should be edited and
online by Friday, August 27th at the latest. I tried not to swear too
often so they wouldn't have to do much editing work - heh.</p>
<p>Finally, somewhere in there I celebrated another birthday. Oh yeah!
Older? Yes! Wiser? Probably not.</p>
Classification scheme-aware call number sorting in Evergreen2010-08-09T01:37:00-04:002010-08-09T01:37:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-08-09:/classification-scheme-aware-call-number-sorting-in-evergreen.html<p>As a librarian who works at a library that primarily uses the <a class="reference external" href="http://www.loc.gov/catdir/cpso/lcco/">Library
of Congress classification
scheme</a>, I have been interested
for <a class="reference external" href="http://svn.open-ils.org/trac/ILS/ticket/51">a long time</a> in
teaching Evergreen to be aware of call number schemes other than Dewey.
The problem, in a nutshell, is that Evergreen simply applies an
alphabetical sort against the uppercased version of the call number
when generating call number browser displays - resulting in LC call
numbers that sort incorrectly, like:</p>
<ul class="simple">
<li>K 215 .E53 W37 1997</li>
<li>K 22 .U748 v.18</li>
</ul>
<p>When the subject recently came up on the <a class="reference external" href="http://article.gmane.org/gmane.education.libraries.open-ils.general/2891">open-ils-general mailing
list</a>,
I decided to follow up with some code. So, <a class="reference external" href="http://svn.open-ils.org/trac/ILS/changeset/17130">as of this
weekend</a>, Evergreen
trunk now has a generalized infrastructure for generating sort keys for
call numbers. The broad strokes of the current implementation are:</p>
<ul class="simple">
<li>The classification scheme is set at the level of the call number.</li>
<li>Classification schemes are defined in the
<tt class="docutils literal">asset.call_number_classification</tt> table with a pointer to a
database function to call to generate a normalized sort key for the
given call number.</li>
<li>Three classification schemes are available out of the box:<ul>
<li><em>Generic</em> (the default) - a simple normalization approach that
produces reasonable results in the absence of special rules for
Cutters, etc</li>
<li><em>Dewey (DDC)</em> - a normalization routine taken from the <a class="reference external" href="http://git.koha-community.org/gitweb/?p=koha.git;a=blob;f=C4/ClassSortRoutine/Dewey.pm;h=b4ba92199e7d425e3c4cfdb5082a4f36b486e3c9;hb=HEAD">Koha
C4::ClassSortRoutine::Dewey</a>
Perl module</li>
<li><em>Library of Congress (LC)</em> - a normalization routine that simply
wraps Bill Dueber's excellent
<a class="reference external" href="http://code.google.com/p/library-callnumber-lc/">Library::CallNumber::LC</a>
Perl module</li>
</ul>
</li>
<li>and adding more classification schemes is just a matter of adding
another row to the <tt class="docutils literal">asset.call_number_classification</tt> table and the
appropriate sortkey-generating database function (see the sketch after
this list).</li>
</ul>
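<p>To make that concrete, here's a minimal sketch of registering a new
scheme. Fair warning: I'm assuming a <tt class="docutils literal">name</tt>
column and a <tt class="docutils literal">normalizer</tt> column that holds
the sort key function's name - check the actual columns of
<tt class="docutils literal">asset.call_number_classification</tt> on your
own system first - and the normalizer itself is deliberately naive:</p>
<pre class="literal-block">
-- Sketch only: a trivial sort key function that uppercases the label
-- and collapses runs of whitespace; a real scheme needs smarter rules
CREATE OR REPLACE FUNCTION asset.label_normalizer_example(TEXT) RETURNS TEXT AS $$
    SELECT UPPER(REGEXP_REPLACE(BTRIM($1), E'\\s+', ' ', 'g'));
$$ LANGUAGE SQL IMMUTABLE;

-- Register the scheme; the column names here are assumptions
INSERT INTO asset.call_number_classification (name, normalizer)
    VALUES ('Example scheme', 'asset.label_normalizer_example');
</pre>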
<p>Note that this is the first time, to my knowledge, that Koha code has
been adopted directly by Evergreen. I included attribution for the
copyright holders in both the Generic and Dewey normalization functions.
I wrote the Generic implementation in Evergreen from scratch shortly
after taking a look at Koha's approach, so in some corners my work would
be considered a "derived work". Koha's Dewey normalization function was
(somewhat surprisingly) the only open-source implementation that I could
find for Dewey, so it made perfect sense to me to adopt that for use in
Evergreen. Many thanks to Koha for their use of the GPL v2 or later
licence!</p>
<p>There are still some limitations and low-hanging fruit that I hope to
address in the near future:</p>
<ul class="simple">
<li>Right now you can only manipulate classification schemes via SQL. The
<strong>Holdings Maintenance</strong> dialogue needs to give cataloguers the
ability to set the classification scheme for each call number,
because I'm sure they don't want to drop down to the command line.
This setting should probably be sticky during a given session, so
that if they're processing a cart of government docs, they won't have
to change the scheme from the default to CODOC for each item.</li>
<li>Speaking of defaults, each library needs to be able to define a
default classification scheme - so your consortium can have a Dewey
library and an LC library and a SUDOC library, and their preferences
won't trample each other. This can just be a simple org-unit setting
(see the sketch below this list).</li>
<li>Following on Mike Rylander's
<a class="reference external" href="http://svn.open-ils.org/trac/ILS/ticket/51">advice</a>, the
<tt class="docutils literal">asset.call_number_classification</tt> table should gain a new column
that lists the field/subfield combinations used to find the
appropriate call number (if any) for each scheme in a given
bibliographic record. Then the <strong>Holdings Maintenance</strong> dialogue can
offer the appropriate call number based on the classification scheme.</li>
</ul>
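<p>For the org-unit setting idea, a purely hypothetical sketch - the
setting name is invented for illustration, and
<tt class="docutils literal">actor.org_unit_setting</tt> stores its values
as JSON-encoded text:</p>
<pre class="literal-block">
-- Hypothetical: give org unit 4 a default classification scheme of 2
-- (whatever setting name we actually ship would be decided on-list)
INSERT INTO actor.org_unit_setting (org_unit, name, value)
    VALUES (4, 'cat.default_classification_scheme', '2');
</pre>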
Authorities in Evergreen: an Amsterdam trip report2010-07-19T18:52:00-04:002010-07-19T18:52:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-07-19:/authorities-in-evergreen-an-amsterdam-trip-report.html<p>As part of the informal partnership between the <a class="reference external" href="http://iisg.nl">International Institute
of Social History (IISH)</a> and <a class="reference external" href="http://projectconifer.ca">Project
Conifer</a>, I was pleased to be able to spend
the last two weeks in Amsterdam, working side-by-side with one of the
Institute's developers, Ole Kerpel, on augmenting the support for MARC21
authorities in Evergreen. To prepare for the work session, I had posted
a
<a class="reference external" href="https://blueprints.launchpad.net/evergreen/+spec/respect-my-authorities">blueprint</a>
for the authorities work on the Evergreen Launchpad instance and
circulated <a class="reference external" href="http://evergreen-ils.org/dokuwiki/doku.php?id=dev:proposal:authorities">the list of
requirements</a>
we had been asked to address to the broader Evergreen development
community. We were fortunate to have the attention of Mike Rylander on
the proposal, who not only supplied suggestions for how to implement
some of the items, but also committed significant code contributions to
the effort that greatly assisted our efforts. Here is a summary of the
goals we accomplished in the current development branch of Evergreen
(targeted for the 2.0 release), followed by a list of the outstanding
items and my finger-in-the-air estimate of how much more time it would
take to accomplish each of the tasks:</p>
<div class="section" id="accomplishments">
<h2>Accomplishments</h2>
<ul>
<li><p class="first">Controllable control numbers</p>
<p>While not, strictly speaking, a requirement for authority control in
and of itself, the ability to ensure that the behaviour of the
001/003/035 fields all conformed to the MARC21 specifications was an
important requirement for IISH. They plan to provide external access
to their authority and bibliographic records, so making the official
identifier fields linkable based on the underlying record ID was an
important aspect of the work. We implemented this feature as an
optional database-level trigger to ensure that the control numbers
and control number identifiers are always perfectly in sync with the
internal identifier of the particular system on which the records are
stored.</p>
</li>
<li><p class="first">Links</p>
<p>Where having Mike Rylander participate in your review process pays
off, part one... Before I even arrived in Amsterdam, Mike implemented
a tricky database trigger that tracks the links between a given
bibliographic record and the authority records to which it links. The
links are tracked at the database level, as well as directly in one
or more <tt class="docutils literal">0</tt> subfields in each field that is controlled by an
authority record. Yes, a given field in a bibliographic record can be
controlled by two authority records and it all works. Nice, Mike!</p>
</li>
<li><p class="first">Syncs</p>
<p>Where having Mike Rylander participate in your review process pays
off, part two... Mike also implemented the bulk of the logic for
automatically updating bibliographic records that are linked to a
given authority record when that authority record is modified. Yes,
folks, when you add a death date to an authority record, it will
automatically appear in the corresponding bib records.</p>
</li>
<li><p class="first">Control an uncontrolled set of bibliographic records</p>
<p>You may have dealt with library systems in the past that use some
sort of string matching to implement authority support. As noted
above, Evergreen is not like that. However, this means that many of
us, when migrating to Evergreen, have bibliographic records lacking
the <tt class="docutils literal">0</tt> subfields that are required for full authority support.
Towards that end, I wrote <a class="reference external" href="http://svn.open-ils.org/trac/ILS/browser/trunk/Open-ILS/src/support-scripts/authority_control_fields.pl">a
script</a>
that will walk through a set of bibliographic records, search for
matching authority records for each controllable field in each
bibliographic record, and add the required <tt class="docutils literal">0</tt> subfields to the
bibliographic records. It certainly won't be a fast solution, but you
should only need to do it once, and it worked on the limited test
cases that we had ready at hand.</p>
</li>
<li><p class="first">Teach the MARC editor about authority records</p>
<p>The MARC editor knew all about fixed fields for bibliographic
records, and provided a handy grid for editing those fields. However,
it didn't even know how to recognize authority records, and presented
a fixed field grid that was absolutely meaningless. I spent a chunk
of time laboriously transcribing the fixed field rules from MARC
documentation into the MARC editor and now the MARC editor presents a
reasonable fixed field grid for your editing convenience.</p>
</li>
<li><p class="first">Merge authority records</p>
<p>Something that often happens in a library is that two authority
records are created that identify the same thing. Eventually somebody
notices the problem and wants to merge the authority records
together. Towards this end, I added a database-level stored procedure
that supports the merging of authority records, such that the linked
bibliographic records will automatically point to the winning
authority record (a sketch of the invocation follows this list).</p>
</li>
<li><p class="first">Authority browse interfaces</p>
<p>Where having Mike Rylander participate in your review process pays
off, part the third... Mike also implemented basic browse interfaces
that presents a series of authority records in MARCXML format
matching your requested authority type (author, title, subject,
topic) and the matching substring at the <tt class="docutils literal">/opac/extras/browse</tt> and
<tt class="docutils literal">/opac/extras/startwith</tt> URL entry points. While still raw at this
point, these can provide the basis for classic authority browse
interfaces for those who desperately desire them.</p>
</li>
</ul>
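<p>To give a flavour of the merge support, here's a sketch of an
invocation. I haven't pinned down the procedure's exact name and
signature in this post, so treat both as assumptions and check the
schema on your own system:</p>
<pre class="literal-block">
-- Assumed signature: merge the second (losing) authority record into the
-- first (winning) one; linked bib records follow the winner automatically
SELECT authority.merge_records(1234, 5678);
</pre>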
</div>
<div class="section" id="remaining-to-do-items">
<h2>Remaining to-do items</h2>
<p><em>Note that any estimates are based on how long I think it would take me
to implement, based on my own familiarity with MARC and Evergreen and
all things Perl and JavaScript and PostgreSQL, and provided with the
granularity of no less than one day. Actual implementation times may
vary, of course; if related work items are worked on consecutively, then
it is likely to take less time to achieve than if the items are tackled
sporadically.</em></p>
<ul>
<li><p class="first">Add an authority in the flow</p>
<p>When you're working in the MARC Editor and you find that there is no
match for an entry that you really think should be controlled, IISH
wants to make it easy for a cataloguer to add an authority record for
that entry. We thought that there might be two options that we would
want to expose - a direct "create an authority record from this
field" option that takes no further input, and a "create an authority
record from this field and open it in another MARC editor to let me
tweak it" option. <strong>Estimate</strong>: 2 person days</p>
</li>
<li><p class="first">Highlight controlled fields</p>
<p>This is really a two-part problem. First, for uncontrolled fields, we
want to teach the <strong>Validate</strong> button to offer the kind of automatic
matching that the script does and add the required <tt class="docutils literal">0</tt> subfield.
Second, we want to highlight fields that are explicitly controlled by
authority records with a <tt class="docutils literal">0</tt> subfield differently from fields that
simply match an authority record, but which are not controlled by it.
<strong>Estimate:</strong> 1 person day</p>
</li>
<li><p class="first">Simplify authority record selection</p>
<p>This two-part requirement would mask many of the fields that are
currently offered as options when you right-click on an uncontrolled
subfield to display matching authority records. For example, it is a
little weird to offer a "See from" heading to a cataloguer; we're
trying to avoid adding new records with those headings, right? Heh.
Second, we want to introduce the ability to invoke the authority
browse list in this interface so that the cataloguer can see a given
set of headings in context and select the heading to apply from
there. <strong>Estimate:</strong> 2 person days</p>
</li>
<li><p class="first">Delete authority record</p>
<p>There is currently no cataloguer-friendly way to delete authority
records. We need to expose a list of authority records (probably
reusing that browse list again) and make it possible for cataloguers
to delete an authority record. When that record is deleted, all
bibliographic records that link to it need to have their links
removed - and ideally, the cataloguer would be able to tell how many
bibliographic records link to that authority before the delete takes
place. <strong>Estimate:</strong> 1 person day</p>
</li>
<li><p class="first">Edit and merge authority records</p>
<p>Although the database-level support now exists for merging authority
records, we need to expose a means for cataloguers to select the
authority records that they want to edit or merge. This could just be
a slightly evolved version of the "Delete" interface. <strong>Estimate:</strong> 1
person day</p>
</li>
<li><p class="first">Expose authority records via SRU/Z39.50/crawlable interface</p>
<p>One of the goals of the IISH is to be able to share their authority
records with other institutions. One of the standard methods is SRU +
Z39.50 server support; we should be able to build on the SRU/Z39.50
server support for bibliographic records in Evergreen to provide a
basic solution for authority records. Interest has also been
expressed in having a crawlable implementation that would give the
linked data crowd something to play with. <strong>Estimate:</strong> 2 person days
for an SRU/Z39.50 server, 1 person day for a very basic crawlable
linked-data implementation</p>
</li>
</ul>
<p>In summary - hurray for Mike Rylander for helping us out to such an
extent, and many thanks, again, to IISH for giving me an opportunity to
focus on Evergreen development for an extended period of time, and to
Laurentian University for supporting my efforts. I hope that between Ole
and myself that it will be possible to finish the rest of these work
items prior to the Evergreen 2.0 release. It has been exhilarating to
see how far Evergreen's authority support has come in less than a month, and
given a little more time I suspect that Evergreen's authority support
will be the envy of other library systems.</p>
</div>
Got funds for enhancing Evergreen? Looking for places to spend it?2010-07-02T19:37:00-04:002010-07-02T19:37:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-07-02:/got-funds-for-enhancing-evergreen-looking-for-places-to-spend-it.html<p>As an Evergreen developer, I believe our project has a few significant
gaps that projects like <a class="reference external" href="http://rscel.org">RSCEL</a> might be able to
help address for the overall good of the community by bringing in
outside resources to the project. Or perhaps there are skills within the
community that don't feel like they've been called on yet; when I say
that we lack skills, I'm basing that on the lack of patches and offers
of assistance that I've seen in these areas. I would be delighted to be
proven wrong! Either way, I submit this for the community's
consideration.</p>
<ul class="simple">
<li><strong>3rd party security audit</strong>: Before Conifer adopted Evergreen, I had
hoped that we would be able to fund a security audit of the code by a
trusted and competent 3rd party like
<a class="reference external" href="http://omniti.com/does/web-application-security">OmniTI</a> (from a
previous life, I believe that OmniTI employs some of the best people
in the business, thus the plug - but there are certainly other
options out there). As developers, we try our best to avoid
vulnerabilities, but as the <a class="reference external" href="http://evergreen-ils.org/blog/?p=406">recently disclosed vulnerability in
open-ils.pcrud</a> attests,
we're not experts in security. An audit of the public-facing
interfaces (the catalogue, feeds, etc) would be a great help to the
project. I would expect a prioritized list of areas that need to be
addressed, along with recommendations on how to address those
problems (whether they be cross-site scripting, session fixation
attacks, authentication encryption attacks, etc). Our community's
process (or lack thereof) for reporting and addressing security
vulnerabilities might be an appropriate subject for an audit as well.</li>
<li><strong>Testing framework</strong>: Our project is woefully short on tests, either
human-powered or automated, for determining the state of the code at
any given point in a release cycle. Thus, we have put out release
after release that either won't install cleanly, or won't upgrade
successfully from a previous release. The trunk version of the code
had an error that meant that Evergreen couldn't be compiled; that
problem existed for three weeks before somebody noticed and fixed it.
I'm not pointing fingers, here; if I did that, I wouldn't have enough
fingers to point back at myself for all of the problems I've
introduced that other people have had to fix. Johnathan Nightingale
in <a class="reference external" href="http://blog.johnath.com/2008/07/02/the-most-important-thing/">The Most Important Thing … or How Mozilla Does Security and What
You Can
Steal</a>
provides a great overview of Mozilla's philosophy about and approach
to testing. There is all kinds of goodness in this presentation, but
one of the most interesting points is that "money can be exchanged
for services" -- that is to say, if your existing development team
doesn't have the skills or time to implement a testing
infrastructure, there are companies that do have the ability to put
together a test infrastructure for a given project. Once that
infrastructure is in place, it tends to get extended and used by the
existing development team because it makes their lives easier; they
don't need to manually test a given code path every time in the
future, or deal with regressions that aren't noticed until months in
the future when the changes they were making are no longer fresh in
their minds. It sometimes requires a culture change, though.</li>
<li><strong>Continuous integration</strong>: Hand in hand with a testing framework is
a continuous integration server that provides testing feedback on
every commit to the Evergreen repository for a given set of branches.
Even without a testing framework, it is possible to have a continuous
integration server run through the process of installing all
prerequisites, configuring the code, building and installing the
code, and creating the database schema to at least determine whether
the basics can be accomplished successfully to confirm that a branch
is ready for release. This arguably also goes hand in hand with a
team's process for addressing a security vulnerability: if you have a
continuous integration server that can tell you if a given fix does
not introduce basic build and install errors, then you can get a new
release out with much more confidence that you're not going to be
encouraging your users to jump to a broken package. Note that Equinox
ran a continuous integration server for OpenSRF and Evergreen trunk
<a class="reference external" href="http://markmail.org/message/j6hd6634oimpum6x">for a while</a>, but
that was <a class="reference external" href="http://markmail.org/message/evjm4rdsricwz5qh">killed</a>
and replaced by a <a class="reference external" href="http://markmail.org/message/zfqnec2b6n6wjcsk">call for volunteers to build a new continuous
integration service</a>
(I can't find a more to-the-point call for volunteers, so perhaps it
just hasn't been advertised widely enough - or again, perhaps we lack
the skills in the community to get a standard CI service like
<a class="reference external" href="http://hudson-ci.org/">Hudson</a> running.)</li>
<li><strong>Packaging</strong>: To decrease the difficulty of installing and
configuring Evergreen, we need more investment in packaging Evergreen
and all of its unpackaged dependencies. The idea is that a user
should be able to run "aptitude install evergreen" or "yum install
evergreen" and have the entire system installed and configured, and
then run "aptitude upgrade" or "yum upgrade" to have newer versions
installed. Right now the process is still rather onerous and requires
a great deal of manual effort, although it has improved significantly
since the early days of 2007. Again, this requires a particular set
of skills that the Evergreen community does not appear to possess in
depth: autoconf, automake, APT and RPM packaging - and perhaps some
redesign of elements like skins to make local customizations easier
to incorporate and keep up to date. This would be a natural
complement to a continuous integration service, but much of the
effort could also be done on its own.</li>
</ul>
Random useful Evergreen database queries2010-07-02T16:20:00-04:002010-07-02T16:20:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-07-02:/random-useful-evergreen-database-queries.html<p>Occasionally I drop down to the database level to generate some
reporting information. You could probably get the same information
through the reporter but I like the precision of SQL. Here are a couple
of queries that I've put together recently.</p>
<div class="section" id="list-titles-for-periodicals-published-by-human-kinetics-with-subscriptions-owned-by-library-id-osul">
<h2>List titles for periodicals published by "Human Kinetics" with subscriptions owned by library ID "OSUL"</h2>
<pre class="literal-block">
SELECT rsr.id, rsr.title
  FROM metabib.full_rec mfr
  INNER JOIN metabib.rec_descriptor mrd ON mfr.record = mrd.record
  INNER JOIN asset.call_number acn ON acn.record = mrd.record
  INNER JOIN reporter.super_simple_record rsr ON rsr.id = mrd.record
  INNER JOIN actor.org_unit aou ON aou.id = acn.owning_lib
  WHERE mfr.tag = '260' AND mfr.subfield = 'b'
    AND mfr.value ILIKE 'Human Kinetics%'
    AND mrd.bib_level = 's'
    AND aou.shortname = 'OSUL';
</pre>
</div>
<div class="section" id="strip-out-urls-for-an-online-resource-to-which-we-no-longer-subscribe">
<h2>Strip out URLs for an online resource to which we no longer subscribe</h2>
<p>Occasionally we drop subscriptions to an online resource that we
happened to catalogue with an inline 856 field. Our new approach relies
on just-in-time results from our link resolver to display accurate
access to online resources (or at least consistent representations of
what we have access to!), but our legacy records placed all of that
information directly in the 856 field in the corresponding bibliographic
record. The PostgreSQL
<a class="reference external" href="http://www.postgresql.org/docs/current/static/functions-string.html">regexp_replace()</a>
function lets you use regular expressions to match subsets of the MARC
record and replace it with... well... nothing, in this case.</p>
<p>As we want to subsequently reingest the MARC records, and we're not
running Evergreen trunk yet in which a reingest will automatically be
triggered by an update to the biblio.record_entry table, I first push
the list of affected IDs into a scratch table. This also lets me put
limits on the MARC records that I'm going to touch, so that I don't
inadvertently destroy content in another library's set of bibliographic
records.</p>
<pre class="literal-block">
CREATE TABLE scratchpad.urls_to_delete (id BIGINT);

INSERT INTO scratchpad.urls_to_delete
  SELECT acn.record
    FROM asset.uri au
    INNER JOIN asset.uri_call_number_map aucnm ON au.id = aucnm.uri
    INNER JOIN asset.call_number acn ON aucnm.call_number = acn.id
    INNER JOIN actor.org_unit aou ON acn.owning_lib = aou.id
    WHERE au.href ILIKE '%/search.ebscohost.com/direct.asp?db=rch%'
      AND aou.shortname = 'OSUL';

BEGIN;

UPDATE biblio.record_entry
  SET marc = regexp_replace(
    marc,
    E'<datafield tag="856" ind1="4" ind2="0"><subfield code="z">Available online from Ebsco.*?search.ebscohost.com/direct.asp\\?db=rch.*?</datafield>',
    ''
  )
  WHERE id IN (SELECT id FROM scratchpad.urls_to_delete);
</pre>
<p>Note that the UPDATE statement is preceded by a BEGIN statement so that
we can check our results and issue a ROLLBACK if we inadvertently
changed too much, or created mangled records. Once you check your work
with a SELECT statement or two, you can issue a COMMIT statement to make
the changes take effect.</p>
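<p>For example, the pre-commit check can be as simple as eyeballing a few
of the affected records (a sketch - pick whatever fields help you judge):</p>
<pre class="literal-block">
-- Spot-check a handful of the records we just touched
SELECT id, marc FROM biblio.record_entry
    WHERE id IN (SELECT id FROM scratchpad.urls_to_delete) LIMIT 5;

COMMIT; -- or ROLLBACK; if the 856 surgery went wrong
</pre>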
</div>
OpenSRF article in code4lib Journal has been published2010-06-22T18:44:00-04:002010-06-22T18:44:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-06-22:/opensrf-article-in-code4lib-journal-has-been-published.html<p>Many of you have undoubtedly seen previous drafts of <a class="reference external" href="http://journal.code4lib.org/articles/3284">this
article</a> as I worked on it
over the past three or four months, but I'm pleased to say that the
<em>Easing Gently into OpenSRF</em> article has now been officially published
by the <a class="reference external" href="http://journal.code4lib.org">code4lib Journal</a>. The goal of
the article is to introduce the OpenSRF infrastructure for building
applications on a scale-out architecture -- which is a high-faluting
mouthful -- using a 10-line Perl module that implements a standalone
OpenSRF service as the entry point. Along the way, the article covers a
little bit of the Evergreen-specific functionality that is built on top
of OpenSRF; hopefully enough to act as a teaser for follow-on articles
in the future. My naked desire is to get more development talent to join
us at the OpenSRF + Evergreen tables. The buffet is rich and the food
(and available tasks) are plentiful!</p>
<p>I would be remiss if I did not profusely thank my editors, <a class="reference external" href="http://bibwild.wordpress.com/">Jonathan
Rochkind</a> and <a class="reference external" href="http://rc98.net/">Gabriel
Farrell</a>, for their probing questions, requests for
more content and examples, and suggestions. They helped shape a much
more comprehensive and useful article than I would have produced on my
own.</p>
Building more informative record displays in Evergreen with BibTemplate2010-04-23T11:14:00-04:002010-04-23T11:14:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-04-23:/building-more-informative-record-displays-in-evergreen-with-bibtemplate.html<p><strong>Update: 2011-04-24</strong> Just noticed the link to the Laurentian detail
file was broken - that's what I get for posting early in the morning
under the influence of a cold, eh? All fixed up now, though.</p>
<p>This is a quick link to the updated version of the presentation
<a class="reference external" href="/uploads/talks/2010/BibTemplate_EG2010.odp">(OpenOffice.org)</a>
<a class="reference external" href="/uploads/talks/2010/BibTemplate_EG2010.pdf">(PDF)</a>
I'll be giving in a few hours (no, for real this time!).</p>
<p>Also of interest is the <a class="reference external" href="http://svn.open-ils.org/trac/ILS-Contrib/browser/conifer/branches/rel_1_6_1/web/opac/skin/lul/xml/rdetail/rdetail_summary.xml">customized
rdetail_summary.xml</a>
file used in the Laurentian University catalogue - which, with one minor
change to the ISSN display (you don't really want to be displaying
electronic holdings for Laurentian in your catalogue, do you?) should be
a drop-in replacement for any Evergreen 1.6 site.</p>
Evergreen self-serve password reset interface coming in 1.6.1.02010-04-22T19:28:00-04:002010-04-22T19:28:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-04-22:/evergreen-self-serve-password-reset-interface-coming-in-1610.html<p><strong>Update: 2010-04-22 16:24:56</strong>: Evidently, "in a few minutes" means
tomorrow morning... avid Coffee|Code readers get the early scoop.</p>
<p>I'm going to give a lightning talk in a few minutes about the self-serve
password reset mechanism that I added to Evergreen last month, that
should see the light of day in the Evergreen 1.6.1.0 release in May
2010. Here's the presentation in <a class="reference external" href="/uploads/talks/2010/PasswordResetLightningTalk.odp">OpenOffice.org Impress
format</a></p>
Setting up secure self-check connections using SIP tunneled through SSH2010-04-16T19:17:00-04:002010-04-16T19:17:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-04-16:/setting-up-secure-self-check-connections-using-sip-tunneled-through-ssh.html<p><strong>Updated 2010-04-27</strong>: Fix corrupted characters introduced by copying
from my GroupWise client. Thanks to Joe Atzberger for pointing that out.</p>
<p>I set up a secure SIP connection from our self-check machine to our
Evergreen server located about 450km away, and thought I would put
together a quick blog post on how things are working in production with
SIP in <a class="reference external" href="http://projectconifer.ca">Conifer</a>. It seems a lot of sites
run SIP without a secured connection, based on how our self-check sales
rep and technical support person talked to me on the phone as though
they were talking to someone with two heads when I mentioned my concerns
about security - and they had no advice to offer on setting up an
encrypted connection. So I guess the subject doesn't come up too often.</p>
<p>That doesn't excuse us as proper systems librarians from protecting as
much patron information from exposure as possible. So here's how we do
things at <a class="reference external" href="http://laurentian.ca">Laurentian University</a> - some
hostnames / IP addresses changed to protect the innocent:</p>
<ol class="arabic">
<li><p class="first">The SIP server runs on one of our Evergreen server boxes; let's call
it carbon.example.com. carbon itself has no direct access to or from
the Internet.</p>
</li>
<li><p class="first">carbon has been set up with an iptables rule allowing access via port
6001 from starburst.example.com. starburst lives out in the
demilitarized zone of our ISP.</p>
</li>
<li><p class="first">starburst has been set up to allow access via port 22 from two
specific addresses at our library - no VPN connection required. We're
keeping this as locked down as possible, hence the source IP address
restriction. We opted for no VPN connection because most VPN clients
require manual steps to authenticate, and we need the self-check to
make the connection automatically when it boots up. Don't worry,
we'll get to the encryption part.</p>
</li>
<li><p class="first">From the self-check machine, we set up port-forwarding of carbon:6001
to localhost:6001 via the sipuser user on starburst. I have set up a
hostname called "sip.example.com" that points at
starburst.example.com; our ISP sysadmin has added a local user on
starburst named "sipuser". We have then set up the SSH
authorized_keys file so that SSH logins can't actually log in, and
in fact the only thing they can do is forward port 6001 on carbon.</p>
<p>In /home/sipuser/.ssh/authorized_keys, each entry should therefore
begin with:</p>
<pre class="literal-block">
command="/bin/false",no-port-forwarding,no-agent-forwarding,permitopen="carbon:6001" <key-type> <key> <name>
</pre>
<ol class="arabic">
<li><p class="first">On the self-check machine, I used ssh-keygen to generate an SSH
key and then appended the public key to
/home/sipuser/.ssh/authorized_keys on starburst to enable logins
without using the UNIX password.</p>
</li>
<li><p class="first">On the self-check machine, the SSH command looks like:</p>
</p><pre class="literal-block">
ssh -f -N -L 6001:carbon.example.com:6001 sipuser@sip.example.com
</pre>
</li>
<li><p class="first">Our self-check machine is running Windows Vista inside, so I've
actually implemented it using Cygwin's "run" command in a shortcut
and dropped it into the user's Start folder so that it
automatically sets up the connection at startup time. The shortcut
command is:</p>
</p><pre class="literal-block">
C:\cygwin\bin\run.exe -p /bin ssh -f -N -L 6001:carbon.example.com:6001 sipuser@sip.example.com
</pre>
</li>
</ol>
</li>
<li><p class="first">We're using SIP2 over raw sockets to communicate. We found that we
had to supply the SIP username and password in the 3M self-check
software. Apparently authentication is unnecessary for Unicorn's SIP
implementation, and also apparently no library has ever been
concerned about SIP2 being a clear-text protocol before!</p>
</li>
<li><p class="first">And all of that has worked exactly once so far, starting from a cold
boot. I'm going to be giving it a bunch of tests tomorrow, but I'm
very excited to have an end-to-end encrypted connection working out
of the box.</p>
</li>
</ol>
<p>Well - that was the substance of the email I wrote four months ago.
Since then, the self-check has been turned off every night and has
connected flawlessly every morning - with the exception of one weekend
when we brought down Evergreen for system maintenance and someone
<em>cough our vendor cough</em> forgot to start the SIP server again. I'm
happy with the results, and it's really not <em>that</em> complicated. If your
library uses self-check machines and runs SIP over the network in
clear-text, isn't it time your library beefed up its security?</p>
Adjusting relevancy rankings in Evergreen 1.6, some explorations2010-03-17T21:13:00-04:002010-03-17T21:13:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-03-17:/adjusting-relevancy-rankings-in-evergreen-16-some-explorations.html<p><strong>Update: 2010-03-18</strong> - I just realized that now that I have a separate
keyword index for titles, I can assign a stronger boost in general to
keywords that appear in the title for a keyword search; see <a class="reference external" href="#updated_index_weight">update:
index weight</a> below.</p>
<p>One of my colleagues just asked me:</p>
<blockquote>
<p>So, how does relevancy ranking work in Evergreen, anyway?</p>
</blockquote>
<p>I've been poking around in the area recently, as one of our users
complained about the relevance of some results with a basic keyword
search, so I thought I would throw my thoughts out there. It might give
other people a good jumping off point, and it provides a bit more of an
answer to questions <a class="reference external" href="http://markmail.org/message/t32c5zwkzp2sicgg">like
these</a> on the Evergreen
mailing lists. There are a number of factors, but cover density plays a
significant role - how often the terms you're looking for appear within
the target index, where index = keyword, author, title, subject, or
series (at least, those are the indexes that Evergreen supplies you with
out of the box). Then there are a number of tweakable boosts that appear
in the <tt class="docutils literal">search.relevance_ranking</tt> table:</p>
<ul class="simple">
<li><tt class="docutils literal">full_match</tt>: for an exact match of the terms you're looking for,
from beginning to end, in the target index</li>
<li><tt class="docutils literal">first_word</tt>: for a match of the first search term with the first
term in the target index</li>
<li><tt class="docutils literal">word_order</tt>: for a match between the order of the search terms and
the order of the terms in the target index</li>
</ul>
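<p>If you want to see which bumps are live on your system before tweaking
anything, a quick query does it (a sketch; the columns match the
<tt class="docutils literal">search.relevance_adjustment</tt> inserts shown
later in this post):</p>
<pre class="literal-block">
-- List the active relevancy bumps and their multipliers
SELECT field, bump_type, multiplier
    FROM search.relevance_adjustment
    WHERE active IS TRUE;
</pre>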
<p>The problem with searching the out of the box "keyword" index is that
there's no way of boosting the ranking for terms appearing in, say, the
title or subject, because out of the box there's just one
<strong>keyword|keyword</strong> index. For a keyword search, you can't tell
Evergreen that terms that appear in the title should be more relevant
than terms that appear in something like the content notes. In
comparison, the <strong>title</strong> index is actually composed of a number of
separate indexes: <strong>title|proper</strong>, <strong>title|uniform</strong>,
<strong>title|alternative</strong>, <strong>title|translated</strong>, etc, that collectively
form the title index. You can see this in the <tt class="docutils literal">config.metabib_field</tt>
table.</p>
<p>Given some relatively horrible results for a keyword search like
"programming languages" that returns <em>Regular expression recipes for
Windows developers</em> as the most relevant hit (are you <em>kidding</em> me? No,
it's because "Programming languages" appears in the subjects about 10
times... sigh), on our test server I added a <strong>keyword|title</strong> index
that is identical to the <strong>title|proper</strong> index, and then added some
entries to the <tt class="docutils literal">search.relevance_adjustment</tt> table to modify the
relevancy ranking accordingly, as follows:</p>
<pre class="literal-block">
-- Clone the title|proper index to create a keyword|title index
-- 6 = the title|proper index
INSERT INTO config.metabib_field (field_class, name, xpath, weight, format, search_field, facet_field)
  SELECT 'keyword', 'title', xpath, weight, format, search_field, facet_field
    FROM config.metabib_field WHERE id = 6;

-- Populate the keyword|title index with a set of index entries cloned
-- from the metabib.title_field_entry table
-- 6 = the title|proper index
INSERT INTO metabib.keyword_field_entry (source, field, value)
  SELECT source, 17, value
    FROM metabib.title_field_entry WHERE field = 6;

-- Bump the relevance when the first search term appears first in the
-- title in a keyword search
-- 17 = our new keyword|title index
INSERT INTO search.relevance_adjustment (active, field, bump_type, multiplier)
  VALUES (true, 17, 'first_word', 5);
</pre>
<p>It feels dirty, because we're creating such a massively duplicated set
of rows. But it works... at least the <tt class="docutils literal">first_word</tt> relevance
adjustment works. When I tried using a multiplier of 1000 for the
<tt class="docutils literal">word_order</tt> relevance adjustment, it did not affect the search
results in the least. Perhaps there's a bug there?</p>
<p>In any case, by combining some of the findings of this post with my
previous post on <a class="reference external" href="/archives/217-More-granular-identifier-indexes-for-your-Evergreen-SRU-Z39.50-servers.html">adding more granular
indexes</a>,
perhaps this will help people get deeper into customizing the search
experience for their Evergreen installations.</p>
<p><strong>Update: adjusting search weight of terms in title in
general</strong>: So, now that we have the <tt class="docutils literal">keyword|title</tt> index, we can
boost the relevancy ranking for records in which the search terms appear
in the <tt class="docutils literal">keyword|title</tt> index rather than the general
<tt class="docutils literal">keyword|keyword</tt> index. Here's how to shake things up:</p>
<pre class="literal-block">
-- Boost the relevance for search terms appearing in the title in general
-- 17 = our new keyword|title index
UPDATE config.metabib_field SET weight = 10 WHERE id = 17;
</pre>
<p>Some quick testing suggests that a weight of 10 works reasonably well...
but that is obviously going to be subject to further testing and
tweaking. But hey: we have the ability to tweak now! Yay!</p>
More granular identifier indexes for your Evergreen SRU / Z39.50 servers2010-03-10T04:27:00-05:002010-03-10T04:27:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-03-10:/more-granular-identifier-indexes-for-your-evergreen-sru-z3950-servers.html<p>In June of 2009 I was
<a class="reference external" href="/archives/194-SFX-target-parser-for-Evergreen-and-some-thoughts-about-searching-identifiers.html">moaning</a>
about how “Evergreen, by default, has no identifier index for limiting
searches by ISBN / ISSN / LCCN / OCLCnum” and that “if [fixing this
problem] requires work from me, it will probably be 2010 before any of
it happens”. Due to some of the tools our consortium relies on, we
really needed a solution for identifier searches in Z39.50 that was
better than just a general keyword search: we were returning too many
false positives that caused extra work and frustration for everyone.</p>
<p>Well, here it is, 2010, and as of today Conifer's Evergreen server now
has a very handy identifier index. Most of the required pieces were
already there, in one form or another, but they all needed to be brought
together. This blog post is going to try to do that (and serve as
documentation for my ever-decaying brain, too). At the time of this
post, we're running a 1.6.0.4-ish Evergreen system; you'll need to be
running 1.6.0.4 to get ISSN searching to work properly, too.</p>
<p>First, we need to create the identifier index. Evergreen comes with the
following indexes out of the box:</p>
<ul class="simple">
<li><tt class="docutils literal">author</tt></li>
<li><tt class="docutils literal">title</tt></li>
<li><tt class="docutils literal">series</tt></li>
<li><tt class="docutils literal">subject</tt></li>
<li><tt class="docutils literal">keyword</tt></li>
</ul>
<p>Pretty standard. With the exception of <tt class="docutils literal">keyword</tt>, each of these
indexes is composed of more granular indexes; for example, the <tt class="docutils literal">title</tt>
index is composed of the following specific indexes, with the XML format
that the MARCXML is converted to and then the XPath expression that
extracts the text from the pertinent XML format:</p>
<ul class="simple">
<li><tt class="docutils literal">abbreviated</tt> - MODS32 -
<tt class="docutils literal">//mods32:mods/mods32:titleInfo[mods32:title and <span class="pre">(@type='abbreviated')]</span></tt></li>
<li><tt class="docutils literal">translated</tt> - MODS32 -
<tt class="docutils literal">//mods32:mods/mods32:titleInfo[mods32:title and <span class="pre">(@type='translated')]</span></tt></li>
<li><tt class="docutils literal">alternative</tt> - MODS32 -
<tt class="docutils literal">//mods32:mods/mods32:titleInfo[mods32:title and <span class="pre">(@type='alternative')]</span></tt></li>
<li><tt class="docutils literal">uniform</tt> - MODS32 -
<tt class="docutils literal">//mods32:mods/mods32:titleInfo[mods32:title and <span class="pre">(@type='uniform')]</span></tt></li>
<li><tt class="docutils literal">proper</tt> - MODS32 -
<tt class="docutils literal">//mods32:mods/mods32:titleInfo[mods32:title and <span class="pre">(@type='proper')]</span></tt></li>
</ul>
<p><strong>Aside</strong>: You can search against these more granular indexes in the
Evergreen OPAC, by the way, by appending the granular index name to the
index class name with a <tt class="docutils literal">|</tt> as a delimiter. For example, a search
query of <tt class="docutils literal">title|uniform: canada</tt> will search only the uniform titles
for the term "canada". Okay, sorry for that detour, but I bet you
weren't aware of that - we haven't done a good job of exposing, in the
OPAC interface, some of the magic that has been in Evergreen for a long
time.</p>
<p>Back to understanding the configuration - as you can see above, the
conversion to <a class="reference external" href="http://www.loc.gov/standards/mods/">MODS</a> does the
heavy lifting in pulling out the fields of interest to us from the
MARCXML. The full set of indexed fields and their definitions is visible
in the database via the query:</p>
<pre class="literal-block">
SELECT * FROM config.metabib_field;
</pre>
<p>For our purposes, we're interested in pulling the raw 010 (LCCN), 020
(ISBN), and 022 (ISSN) <tt class="docutils literal">a</tt> subfields directly from the MARCXML source.
Our first step is to add an entry to the <tt class="docutils literal">config.metabib_field</tt> table
defining our new index. We'll create a new granular index under the
"keyword" index class and call it "identifier", because that's what it
is, right? That's as easy as:</p>
<pre class="literal-block">
INSERT INTO config.metabib_field (field_class, name, xpath, weight, format, search_field, facet_field)
  VALUES ('keyword', 'identifier',
    '//marcxml:datafield[@tag="010" or @tag="020" or @tag="022"]/marcxml:subfield[@code="a"]',
    1, 'marcxml', true, false);
</pre>
<p>Next, we need to restart the <tt class="docutils literal"><span class="pre">open-ils.storage</span></tt> and
<tt class="docutils literal"><span class="pre">open-ils.ingest</span></tt> services to make them aware of this new entry. Go
ahead, I'll wait while you run <tt class="docutils literal">osrf_ctl.sh <span class="pre">-a</span> restart_perl</tt> or use
<tt class="docutils literal"><span class="pre">opensrf-perl.pl</span></tt> to restart the services individually. Done? Good.</p>
<p>We have to make up for lost time, now, as all of the bibliographic
records in your system didn't have this definition in place when they
were first ingested. The easiest thing to do is to just pull the
pertinent data directly from the <tt class="docutils literal">metabib.full_rec</tt> view (which is a
shredded version of the source MARCXML from your bibliographic records,
with one tag/subfield value per row). Ergo:</p>
<pre class="literal-block">
-- Get the ID from the row that you just inserted for the new index;
-- we'll use this in the INSERT statement
SELECT id FROM config.metabib_field
  WHERE field_class = 'keyword' AND name = 'identifier';

-- Let's say the ID was 18; we'll use that to identify the index
INSERT INTO metabib.keyword_field_entry (field, source, value)
  SELECT 18, record, agg_text(value)
    FROM metabib.full_rec
    WHERE tag IN ('010', '020', '022') AND subfield = 'a'
    GROUP BY 1, 2;
</pre>
<p>All right! Now you can run some test searches in the OPAC for ISSNs,
ISBNs, and LCCNs in your OPAC using the
<tt class="docutils literal">keyword|identifier: some_identifier</tt> prefix. Cool. So that's part
one, mostly lifted from the <a class="reference external" href="http://evergreen-ils.org/dokuwiki/doku.php?id=scratchpad:random_magic_spells#how_to_include_a_specific_marc_field_with_a_specific_search_class">"magic
spell"</a>
in the Evergreen wiki.</p>
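<p>Before moving on, you may want to confirm that the backfill actually
landed; here's a quick sanity check (assuming, as in the example above,
that your new <tt class="docutils literal">config.metabib_field</tt> row got ID 18):</p>
<pre class="literal-block">
SELECT COUNT(*) FROM metabib.keyword_field_entry WHERE field = 18;
</pre>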
<p>Part two is configuring SRU to use the new identifier index. The bulk of
the Evergreen SRU implementation is contained in the Perl module
OpenILS::WWW::SuperCat (located in your install directory at
<tt class="docutils literal">/openils/lib/perl5/OpenILS/WWW/SuperCat.pm</tt>). Get out your
patch tool or open up the Perl module in a text editor, we're going to
make a few changes. The pertinent diff follows:</p>
<pre class="literal-block">
--- old/OpenILS/WWW/SuperCat.pm 2010-03-09 17:26:20.000000000 -0500
+++ new/OpenILS/WWW/SuperCat.pm 2010-03-10 00:11:58.000000000 -0500
@@ -1410,6 +1410,7 @@
     'bib.titlealternative' => 'title',
     'bib.titleseries' => 'series',
     'eg.series' => 'title',
+    'eg.identifier' => 'keyword|identifier',
 
     # Author/Name class:
     'eg.author' => 'author',
@@ -1438,7 +1439,7 @@
     'srw.serverchoice' => 'keyword',
 
     # Identifiers:
-    'dc.identifier' => 'keyword',
+    'dc.identifier' => 'keyword|identifier',
 
     # Dates:
     'bib.dateissued' => undef,
@@ -1497,6 +1498,7 @@
         subject => ['subject'],
         keyword => ['keyword'],
         series => ['series'],
+        identifier => ['keyword|identifier'],
     },
     dc => {
         title => ['title'],
@@ -1504,7 +1506,7 @@
         contributor => ['author'],
         publisher => ['keyword'],
         subject => ['subject'],
-        identifier => ['keyword'],
+        identifier => ['keyword|identifier'],
         type => [undef],
         format => [undef],
         language => ['lang'],
</pre>
<p>Essentially, we've defined a new qualifier (<tt class="docutils literal">eg.identifier</tt>) and
pointed it and the <tt class="docutils literal">dc.identifier</tt> indexes at the new, more specific
<tt class="docutils literal">keyword|identifier</tt> index. Once the updated file is in place, reload
your Apache configuration (<tt class="docutils literal">/etc/init.d/apache2 reload</tt>) and SRU
requests using those qualifiers will now point at the identifier index.
FABULOUS.</p>
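<p>For example - assuming the standard SRU searchRetrieve GET parameters
and a database named FOOBAR, as in the simple2zoom configuration below -
a request along these lines will now search the identifier index:</p>
<pre class="literal-block">
http://localhost/opac/extras/sru/FOOBAR/holdings?version=1.1&operation=searchRetrieve&query=eg.identifier=some_identifier
</pre>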
<p>Our last step is to teach our simple2zoom-based Z39.50 configuration
about the new index by mapping the corresponding BIB-1 attributes to the
new <tt class="docutils literal">eg.identifier</tt> qualifier, like so:</p>
<pre class="literal-block">
<database name="FOOBAR">
    <zurl>http://localhost/opac/extras/sru/FOOBAR/holdings</zurl>
    <option name="sru">get</option>
    <charset>marc-8</charset>
    <search>
        <querytype>cql</querytype>
        <map use="4"><index>eg.title</index></map>
        <map use="7"><index>eg.identifier</index></map>
        <map use="8"><index>eg.identifier</index></map>
        <map use="9"><index>eg.identifier</index></map>
        <map use="21"><index>eg.subject</index></map>
        <map use="1003"><index>eg.creator</index></map>
        <map use="1018"><index>eg.publisher</index></map>
        <map use="1035"><index>eg.keyword</index></map>
        <map use="1016"><index>eg.keyword</index></map>
    </search>
</database>
</pre>
<p>Kill your simple2zoom processes and restart simple2zoom and you should
be in heaven - farewell, false positive matches! Oh, and about that <a class="reference external" href="/archives/194-SFX-target-parser-for-Evergreen-and-some-thoughts-about-searching-identifiers.html">SFX
target parser for
Evergreen</a>;
now you can remove all of the gimmickry around exact searches and
worrying about ISSNs that contain an 'X' and just point at the
identifier index. For example:</p>
<pre class="literal-block">
if (defined($ISSN)) {
    $searchString .= "keyword|identifier: $ISSN";
} elsif (defined($ISBN)) {
    $ISBN =~ s/-//g; # Most of our ISBNs are normalized to no hyphens
    $searchString .= "keyword|identifier: $ISBN";
}
</pre>
<p>Things still aren't perfect in Evergreen identifier-land: we still need
to do some work to normalize hyphenation of our ISBNs, for example, and
ensure we have 10-digit & 13-digit ISBN equivalents. But we're a lot
closer to perfection now - and with the work that Mike Rylander is doing
in trunk, normalization of that kind should be relatively
straightforward to implement on both the indexing and query-parsing
side.</p>
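<p>As a teaser, the arithmetic for deriving a 13-digit ISBN from a
10-digit ISBN is simple enough to sketch as a PostgreSQL function. This
is just my illustration, not the normalization work underway in trunk:</p>
<pre class="literal-block">
-- Sketch: prefix "978" to the first nine digits of the ISBN-10,
-- then recompute the check digit with the ISBN-13 1-3-1-3... weights
CREATE OR REPLACE FUNCTION isbn10_to_isbn13(isbn10 TEXT) RETURNS TEXT AS $$
DECLARE
    core TEXT;
    total INT := 0;
BEGIN
    core := '978' || substring(regexp_replace(isbn10, '-', '', 'g') from 1 for 9);
    FOR i IN 1..12 LOOP
        total := total + substring(core from i for 1)::int
            * CASE WHEN i % 2 = 1 THEN 1 ELSE 3 END;
    END LOOP;
    RETURN core || ((10 - (total % 10)) % 10)::text;
END
$$ LANGUAGE plpgsql;

-- isbn10_to_isbn13('0-306-40615-2') => 9780306406157
</pre>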
Evergreen 1.6: Z39.50 target servers for academics2010-03-05T02:21:00-05:002010-03-05T02:21:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-03-05:/evergreen-16-z3950-target-servers-for-academics.html<p><strong>UPDATE 2010-03-05</strong> I just backported Warren's patch for sorting
Z39.50 servers to rel_1_6_0 (it counts as a bug fix), so expect to
see it in the Evergreen 1.6.0.4 release. Yay!</p>
<p>In Evergreen 1.6, Z39.50 target server configuration (for
copy-cataloguing targets) moves …</p><p><strong>UPDATE 2010-03-05</strong> I just backported Warren's patch for sorting
Z39.50 servers to rel_1_6_0 (it counts as a bug fix), so expect to
see it in the Evergreen 1.6.0.4 release. Yay!</p>
<p>In Evergreen 1.6, Z39.50 target server configuration (for
copy-cataloguing targets) moves into the database. This makes it pretty
easy for sites to share their Z39.50 target servers with one another.</p>
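<p>Under the hood, each target is now just a row in
<tt class="docutils literal">config.z3950_source</tt> (plus per-target rows in
<tt class="docutils literal">config.z3950_attr</tt>), so adding one by hand looks something like the
following - a sketch using the stock Library of Congress settings, with
the column list trimmed to the essentials:</p>
<pre class="literal-block">
INSERT INTO config.z3950_source (name, label, host, port, db, auth)
    VALUES ('loc', 'Library of Congress', 'z3950.loc.gov', 7090, 'Voyager', FALSE);
</pre>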
<p>I recently added a number of target servers to our configuration, and
thought that other academic Evergreen sites might be interested in our
set (because we're primarily pointing at other academic libraries) -
particularly if they haven't added many of their own yet. You can find a
PostgreSQL dump of our current configuration in the ILS-Contrib
repository at
<a class="reference external" href="http://svn.open-ils.org/trac/ILS-Contrib/browser/conifer/branches/rel_1_6_0/tools/config/config_z3950.sql">conifer/branches/rel_1_6_0/tools/config/config_z3950.sql</a>.</p>
<p>I generated this dump of the data using the following command:</p>
<pre class="literal-block">
pg_dump --data-only --table config.z3950_source --table config.z3950_attr evergreen > config_z3950.sql
</pre>
<p>(where <em>evergreen</em> is the name of the Evergreen database, naturally!).
You should be able to load the data into a clean Evergreen database via
<tt class="docutils literal">psql</tt> inside a transaction as follows:</p>
<pre class="literal-block">
BEGIN;
\i config_z3950.sql
COMMIT;
</pre>
<p>If you already have other Z39.50 servers in your database configuration,
you might need to adjust the ID values in the <tt class="docutils literal">config.z3950_attr</tt>
rows. Just prepending a <tt class="docutils literal">1</tt> to them ought to do the trick, unless you
have masses of Z39.50 servers. In which case, you probably don't need
ours!</p>
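<p>If you'd rather do that adjustment in SQL than in a text editor,
here's a minimal sketch - where <tt class="docutils literal">scratch.z3950_attr</tt> stands in for a
hypothetical staging copy of the dumped rows, not a real Evergreen
table:</p>
<pre class="literal-block">
-- Prepend a "1" to each imported attribute ID so it can't collide
-- with the IDs already in config.z3950_attr
UPDATE scratch.z3950_attr SET id = ('1' || id::text)::bigint;
</pre>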
<p>Oh, one final tip: when you start adding a bunch of Z39.50 target
servers, you'll notice that the order in the <strong>Import from Z39.50</strong>
screen is random; it will drive your cataloguers crazy. Quite some time
ago, <a class="reference external" href="http://thebookpile.wordpress.com/">Warren Layton</a> from Natural
Resources Canada submitted a patch for sorting the servers
alphabetically that has been committed to trunk and the 1.6 branch, but
which hasn't made its way into a 1.6.0 release yet. If, at the time
you're reading this, you're on a 1.6 release but your list isn't sorted,
get <a class="reference external" href="http://svn.open-ils.org/trac/ILS/browser/branches/rel_1_6/Open-ILS/xul/staff_client/server/cat/z3950.js">the
file</a>
and drop it into <tt class="docutils literal">/openils/var/web/xul/server/cat/z3950.js</tt> - your
cataloguers will thank you. You, in turn, can thank Warren.</p>
Fun with Evergreen and SQL: representative record samples2010-03-04T04:35:00-05:002010-03-04T04:35:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-03-04:/fun-with-evergreen-and-sql-representative-record-samples.html<p>Let's pretend your national library asked you to submit a set of records
with holdings representing all of the various formats in your library
system. Let's also pretend that you're really lucky and you're running
Evergreen. Here's what you would do to get one example of each
combination of item …</p><p>Let's pretend your national library asked you to submit a set of records
with holdings representing all of the various formats in your library
system. Let's also pretend that you're really lucky and you're running
Evergreen. Here's what you would do to get one example of each
combination of item type, item form, bibliographic level, literary form,
cataloguing form, and video recording format into a scratch table for a
given library (ID = 103) in your system:</p>
<pre class="literal-block">
CREATE TABLE scratchpad.osul_export (record BIGINT);

INSERT INTO scratchpad.osul_export
    SELECT record FROM (
        SELECT DISTINCT ON (mrd.item_type, mrd.item_form, mrd.bib_level,
                mrd.lit_form, mrd.cat_form, mrd.vr_format)
            mrd.record, mrd.item_type, mrd.item_form, mrd.bib_level,
            mrd.lit_form, mrd.cat_form, mrd.vr_format
          FROM biblio.record_entry bre
          INNER JOIN asset.call_number acn ON acn.record = bre.id
          INNER JOIN asset.copy ac ON ac.call_number = acn.id
          INNER JOIN metabib.rec_descriptor mrd ON mrd.record = bre.id
          WHERE bre.deleted IS FALSE
            AND acn.deleted IS FALSE
            AND ac.deleted IS FALSE
            AND acn.owning_lib = 103
          ORDER BY mrd.item_type, mrd.item_form, mrd.bib_level,
            mrd.lit_form, mrd.cat_form, mrd.vr_format
    ) AS formats
    ORDER BY record;
</pre>
<p>And then, because you were asked to provide a total of 2000 records for
this representative sample, you might fill up the remaining 1800 records
as follows:</p>
<pre class="literal-block">
INSERT INTO scratchpad.osul_export
    SELECT bre.id
      FROM biblio.record_entry bre
      INNER JOIN asset.call_number acn ON acn.record = bre.id
      INNER JOIN asset.copy ac ON ac.call_number = acn.id
      INNER JOIN reporter.super_simple_record rsr ON rsr.id = bre.id
      WHERE bre.deleted IS FALSE
        AND acn.deleted IS FALSE
        AND ac.deleted IS FALSE
        AND acn.owning_lib = 103
        AND bre.id NOT IN (
            SELECT record FROM scratchpad.osul_export
        )
        AND substring(bre.id::text from (length(bre.id::text)) for 1)::int = 8
        AND bre.id % 17 = 0
      ORDER BY rsr.author DESC
      LIMIT 1800;
</pre>
<p>... which, of course, gives you the records with a record ID ending in
'8' and (to whittle it down further) records where record ID <em>modulo</em> 17
is 0 - and finally, just the first 1800 records ordered by author name
in descending order.</p>
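<p>If you want a feel for just how selective those two predicates are in
combination, you can test them against a stand-in sequence (using
<tt class="docutils literal">generate_series()</tt> here in place of real record IDs):</p>
<pre class="literal-block">
SELECT id FROM generate_series(1, 2000) AS id
  WHERE substring(id::text from length(id::text) for 1)::int = 8
    AND id % 17 = 0;
-- 68, 238, 408, 578, ... : one qualifying ID in every 170
</pre>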
<p>All of this will give you 2000 record IDs in <tt class="docutils literal">scratchpad.osul_export</tt>
that you can then extract into a text file and feed into Evergreen's
<tt class="docutils literal"><span class="pre">Open-ILS/src/support-scripts/marc_export</span></tt> script to dump the MARC
records with holdings in the 852 field from your system. Beautiful, eh?</p>
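<p>In case you're wondering how to get those IDs out into the text file,
psql's <tt class="docutils literal">\copy</tt> is one way to do it (the output path here is just an
example):</p>
<pre class="literal-block">
\copy (SELECT record FROM scratchpad.osul_export ORDER BY record) TO '/tmp/osul_records.txt'
</pre>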
Wrap-up: Evergreen developer workshop at OLA SuperConference 20102010-03-01T23:48:00-05:002010-03-01T23:48:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-03-01:/wrap-up-evergreen-developer-workshop-at-ola-superconference-2010.html<p>To summarize the results of the Evergreen developer workshop at the OLA
SuperConference, I think things went pretty well. The primary focus this
time was on the nuts and bolts of building a minimal OpenSRF service and
I saw the lights go on in a number of faces as I …</p><p>To summarize the results of the Evergreen developer workshop at the OLA
SuperConference, I think things went pretty well. The primary focus this
time was on the nuts and bolts of building a minimal OpenSRF service and
I saw the lights go on in a number of faces as I broke it down. Things
got a little hand-wavy in the final half-hour when I leapt into the Dojo
JavaScript widgets that have been custom-built for Evergreen interfaces
such as the administration and acquisitions functionality. In
retrospect, the first half of the session deserves its own half-day, and
the second half of the session similarly deserves its own half-day, and
something had to give this time around.</p>
<p>I focused on getting hands-on, and for the most part I think it was a
success - even though I had packaged up a virtual image, we
still ran into some problems getting it running on some laptops. And due
to some communications problems, about half of the participants weren't
ready for a hands-on session (read: no laptop, or a netbook that
couldn't handle a virtual image). I have real hopes that we'll see some
contributions in the next few months from some of the participants,
which would be a <strong>huge</strong> win for Evergreen.</p>
<p>Without any further ado, here are the materials for the session (all of
which are made available to you under a <a class="reference external" href="http://creativecommons.org/licenses/by-sa/2.5/ca/">Creative Commons By
Attribution-Share-Alike Canada 2.5
license</a>):</p>
<ul class="simple">
<li>Slides: <a class="reference external" href="/uploads/talks/2010/OLA_2010_slides.odp">(OpenOffice.org
Impress)</a>
<a class="reference external" href="/uploads/talks/2010/OLA_2010_slides.pdf">(PDF)</a></li>
<li>Workshop tutorial:
<a class="reference external" href="http://evergreen-ils.org/~denials/workshop.html">(HTML)</a>
<a class="reference external" href="http://evergreen-ils.org/~denials/workshop.pdf">(PDF)</a></li>
<li>JavaScript and Perl files:
<a class="reference external" href="/uploads/talks/2010/OLA_2010_files.zip">OLA_2010_files.zip</a></li>
</ul>
OLA SuperConference 2010 - Evergreen developer workshop update2010-02-23T03:31:00-05:002010-02-23T03:31:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-02-23:/ola-superconference-2010-evergreen-developer-workshop-update.html<p>Hey all - if you're coming to the <a class="reference external" href="/archives/211-Evergreen-developer-workshop-at-OLA-SuperConference,-February-24,-2010.html">Evergreen developer workshop at the
OLA SuperConference
2010</a>,
there's one thing you can do to prepare. As this is a hands-on workshop
(how else can you learn!), I'm hoping many or most of you will have
laptops. And ideally, your laptop will have …</p><p>Hey all - if you're coming to the <a class="reference external" href="/archives/211-Evergreen-developer-workshop-at-OLA-SuperConference,-February-24,-2010.html">Evergreen developer workshop at the
OLA SuperConference
2010</a>,
there's one thing you can do to prepare. As this is a hands-on workshop
(how else can you learn!), I'm hoping many or most of you will have
laptops. And ideally, your laptop will have a current version of
<a class="reference external" href="http://www.virtualbox.org/wiki/Downloads">VirtualBox</a> or VMWare
installed on it, as I plan to bring a virtual image for the attendees to
use.</p>
<p>I'm hoping the virtual image will sidestep the configuration hassles
people seem to run into with installing OpenSRF / Evergreen natively and
enable us to just focus on the code and architecture during the limited
time we will have together. *sniff*</p>
Introduction to SQL for Evergreen administrators2010-02-20T10:16:00-05:002010-02-20T10:16:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-02-20:/introduction-to-sql-for-evergreen-administrators.html<p>I've been a bit quiet for the last two weeks, ostensibly because I've
been on vacation. However, much of the time I was preparing to deliver a
two-day introduction to SQL for Evergreen to the good people at
<a class="reference external" href="http://www.biblio.org/">Bibliomation</a>. On Wednesday I flew down to
Middlebury, CT - Bibliomation central - and …</p><p>I've been a bit quiet for the last two weeks, ostensibly because I've
been on vacation. However, much of the time I was preparing to deliver a
two-day introduction to SQL for Evergreen to the good people at
<a class="reference external" href="http://www.biblio.org/">Bibliomation</a>. On Wednesday I flew down to
Middlebury, CT - Bibliomation central - and on Thursday and Friday of
this week, I led nine great people* through the ropes of SQL: from
understanding the basics of how SQL databases operate all the way
through inner and outer joins and set operations. I also walked through a
set of SQL queries I had developed to help Bibliomation with the
recurring reports they need to provide to their member libraries.</p>
<p>Other than an episode of grievous illness on Thursday night that led to
zero food intake and very little sleep on my part, I think things went
well; it was gratifying to see lights go on in people's heads as we
worked through hands-on exercises and tackled the same problem with
different (but valid) approaches, and (with a few minor adjustments) the
canned SQL queries seemed to meet their requirements. The feedback I
received was positive, and by the time I left I had the sense that they
had significantly increased their confidence in their ability to
understand the queries I had written for them and to create their own
queries. The major remaining learning curve is understanding how all of
the pieces of the Evergreen database schema fit together; over the two
days I tried to tie together pieces like the user tables, the library
tables, the circulation and holds tables, and the record / call number /
copy tables to help them find the right tables to meet their needs.</p>
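<p>To give a flavour of the kind of joins involved, here's a sketch in
the spirit of those exercises (not one of the canned Bibliomation
reports): counting circulations per owning library by walking from the
circulation table through copies and call numbers to the org unit:</p>
<pre class="literal-block">
SELECT aou.shortname, COUNT(circ.id) AS circ_count
  FROM action.circulation circ
  INNER JOIN asset.copy ac ON ac.id = circ.target_copy
  INNER JOIN asset.call_number acn ON acn.id = ac.call_number
  INNER JOIN actor.org_unit aou ON aou.id = acn.owning_lib
  GROUP BY aou.shortname
  ORDER BY circ_count DESC;
</pre>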
<p>I am happy to say that Bibliomation agreed to my condition that I be
allowed to release the materials for this workshop under a CC-BY-SA
license, so others can take these materials, adapt or enhance them, and
deliver similar training to other Evergreen libraries (as long as the
attribution remains and the materials are offered under the same
share-alike license). Many thanks to Bibliomation for this contribution
to the community! Without further ado, here are the materials:</p>
<ul class="simple">
<li>Reference documentation (25-ish pages introducing SQL, ending with
the canned SQL queries Bibliomation required):
<a class="reference external" href="http://bzr.coffeecode.net/intro_to_sql/introduction_to_sql.html">(HTML)</a>
<a class="reference external" href="/uploads/files/introduction_to_sql.pdf">(PDF)</a></li>
<li>Presentation: <a class="reference external" href="/uploads/files/SQL_instruction.odp">(OpenOffice.org
Impress)</a>
<a class="reference external" href="/uploads/files/SQL_instruction.pdf">(PDF)</a></li>
</ul>
<p>* Including people like Kate Sheehan, Melissa Lefebvre, and Benjamin
Shum who I previously only knew from the Evergreen mailing lists and
other online presences</p>
Evergreen developer workshop at OLA SuperConference, February 24, 20102010-01-28T20:45:00-05:002010-01-28T20:45:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-01-28:/evergreen-developer-workshop-at-ola-superconference-february-24-2010.html<p>Given the <a class="reference external" href="/archives/210-Conifer-garners-two-awards-from-the-Ontario-Library-Association.html">the
awards</a>
that Project Conifer will be presented with at the OLA SuperConference,
this might be a good opportunity to mention the <a class="reference external" href="http://www.accessola.com/superconference2010/showSession.php?lsession=8&usession=8">Customizing and
Extending Evergreen: a guide for
geeks</a>
workshop that I'll be giving on Wednesday, February 24th. The workshop
description promises:</p>
<blockquote>
Together, we will break OpenSRF …</blockquote><p>Given the <a class="reference external" href="/archives/210-Conifer-garners-two-awards-from-the-Ontario-Library-Association.html">the
awards</a>
that Project Conifer will be presented with at the OLA SuperConference,
this might be a good opportunity to mention the <a class="reference external" href="http://www.accessola.com/superconference2010/showSession.php?lsession=8&usession=8">Customizing and
Extending Evergreen: a guide for
geeks</a>
workshop that I'll be giving on Wednesday, February 24th. The workshop
description promises:</p>
<blockquote>
Together, we will break OpenSRF down into its constituent parts
(JSON, XMPP) and put it back together again in Perl, Python, and
JavaScript so that you can define new services, or integrate
existing services into other applications and websites. You will
learn how PostgreSQL underpins Evergreen's search indices and how to
access and modify any data in the system with permission-based
storage APIs; plus we will build new interfaces with the Dojo
JavaScript framework Evergreen extensions.</blockquote>
<p>That's a hefty agenda for a half-day workshop, but I promise to do my
best to deliver on that promise... <img alt=":-)" class="emoticon" src="/images/smile.png" /></p>
Conifer garners two awards from the Ontario Library Association2010-01-28T20:28:00-05:002010-01-28T20:28:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2010-01-28:/conifer-garners-two-awards-from-the-ontario-library-association.html<p>The Ontario Library Association (OLA) announced its <a class="reference external" href="http://www.accessola3.com/index.php?app=blog&module=display&section=blog&blogid=9&showentry=681">2010 OLA and OLA
Divisional Award
winners</a>
today, and to my great surprise Project Conifer was named the winner of
two awards:</p>
<ol class="arabic simple">
<li>The Ontario College and University Library Association (OCULA)
Special Achievement Award</li>
<li>The Ontario Library Information Technology Association (OLITA) Award
for Technical …</li></ol><p>The Ontario Library Association (OLA) announced its <a class="reference external" href="http://www.accessola3.com/index.php?app=blog&module=display&section=blog&blogid=9&showentry=681">2010 OLA and OLA
Divisional Award
winners</a>
today, and to my great surprise Project Conifer was named the winner of
two awards:</p>
<ol class="arabic simple">
<li>The Ontario College and University Library Association (OCULA)
Special Achievement Award</li>
<li>The Ontario Library Information Technology Association (OLITA) Award
for Technical Innovation</li>
</ol>
<p>All of the libraries in the Project Conifer consortium have been listed
in the award announcement, and for good reason: everyone using the
<a class="reference external" href="http://evergreen-ils.org">Evergreen</a> library system <a class="reference external" href="/archives/191-Conifer-lives-Ontario-launches-a-consortial-academic-library-system-built-on-Evergreen.html">since May
2009</a>
has contributed to the project, be it by bug reports, or suggestions for
enhancement, or sharing approaches to solving problems, or contributing
code. This has been a real team effort, and make no mistake: the road
has been bumpy at times, and there's a <strong>lot</strong> of road left to travel
before we get to our destination. <em>Dan furtively glances at the open
list of requested enhancements on the Conifer ticket system and gets
back to finishing off this blog post...</em> The continuing support of staff
and librarians across the consortium has been critical to keeping things
moving in a very positive direction, and I'm delighted that they're
being recognized for their efforts.</p>
Doing useful things with periodical holdings, part 2: comparing with print holdings in Evergreen2009-11-17T17:05:00-05:002009-11-17T17:05:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-11-17:/doing-useful-things-with-periodical-holdings-part-2-comparing-with-print-holdings-in-evergreen.html<div class="section" id="doing-interesting-things-with-evergreen-serials-data">
<h2>Doing interesting things with Evergreen serials data</h2>
<p><strong>Update: 2010-05-31</strong> Running through the process again, I found a few
typos in the <tt class="docutils literal">pg_dump</tt> commands, so I fixed those up.</p>
<p>I'm working on a project to compare our electronic journal holdings with
our print journal holdings. This is probably a task …</p></div><div class="section" id="doing-interesting-things-with-evergreen-serials-data">
<h2>Doing interesting things with Evergreen serials data</h2>
<p><strong>Update: 2010-05-31</strong> Running through the process again, I found a few
typos in the <tt class="docutils literal">pg_dump</tt> commands, so I fixed those up.</p>
<p>I'm working on a project to compare our electronic journal holdings with
our print journal holdings. This is probably a task that most academic
libraries have been working on over the past few years, as collection
space dwindles, the duplication of holdings in electronic and print
formats increases, and electronic delivery and 24/7 access becomes the
default expectation of our patrons.</p>
<p><a class="reference external" href="/archives/205-Doing-useful-things-with-the-TXT-dump-of-SFX-holdings,-part-1-database.html">In my previous
post</a>,
I worked through the hoops required to get our SFX holdings into a
usable database for query purposes. In this post, I'll walk through the
steps required to get the serials holdings from Evergreen into the same
database so that we can generate reports based on the authoritative
sources for both our electronic and print holdings.</p>
<p>We'll start by dumping the schema for the biblio.record_entry and
serial.record_entry tables from our Evergreen database. In the previous
post, we could have added the tables from the SFX export to the
Evergreen database, but I don't like mixing these more experimental
projects with our production system - so we'll work with a database
named <strong>periodicals</strong> instead.</p>
<pre class="literal-block">
pg_dump --no-owner --schema-only --table biblio.record_entry \
    --table serial.record_entry evergreen > bre_sre_schema.dump
</pre>
<p>We have to munge the schema so that it doesn't create the indexes on
the tables, which should lead to faster loads. Also, remove any triggers
that point at objects that don't exist in this limited subset of data.
Then create the schema in our periodicals holdings database:</p>
<pre class="literal-block">
psql -f bre_sre_schema.dump -d periodicals
</pre>
<p>Now dump the data for those tables from the Evergreen database. If you
have a large set of bibliographic records like we do, make sure you have
a few gigabytes of space available in the output location.</p>
<pre class="literal-block">
pg_dump --no-owner --data-only --table biblio.record_entry \
    --table serial.record_entry evergreen \
    > bre_sre_data.dump
</pre>
<p>Okay, now you can load the data into your serials holdings database:</p>
<pre class="literal-block">
psql -f bre_sre_data.dump -d periodicals
</pre>
<p>And now we add the indexes that we previously culled from the schema.
You can be more selective in the indexes you create, if you know what
you're doing.</p>
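<p>For example - a minimal sketch only, since the authoritative list is
whatever you culled from the schema dump, and the index name here is my
own invention - these two are the ones the joins below lean on:</p>
<pre class="literal-block">
ALTER TABLE biblio.record_entry ADD PRIMARY KEY (id);
CREATE INDEX sre_record_idx ON serial.record_entry (record);
</pre>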
<p>For some reason, I opted to play with PostgreSQL's support for XML as a
native column type and converted the plain text marc column into an XML
column:</p>
<pre class="literal-block">
ALTER TABLE biblio.record_entry ALTER COLUMN marc SET DATA TYPE XML USING marc::XML;
</pre>
<p>Now we add the Evergreen holdings to the <tt class="docutils literal">holdings.conifer</tt> table. We
use the <tt class="docutils literal">xpath()</tt> function to retrieve the desired values from the
MARC XML in biblio.record_entry, and wrap the results in the
<tt class="docutils literal">unnest()</tt> function to return the nodeset as a plain text string,
rather than an array of values. The WHERE clause restricts the holdings
to those owned by the library in which I am interested.</p>
<pre class="literal-block">
CREATE TABLE holdings.conifer (
    record BIGINT,
    issn TEXT,
    coverage TEXT,
    call_number TEXT
);

INSERT INTO holdings.conifer (record, issn)
    SELECT bre.id,
        UNNEST(XPATH('//*[local-name()="datafield"][@tag="022"]'
            || '/*[local-name()="subfield"][@code="a"]/text()', bre.marc))
      FROM biblio.record_entry bre
      INNER JOIN serial.record_entry sre ON sre.record = bre.id
      WHERE sre.owning_lib = 103;
</pre>
<p>We'll populate the call number based on the 852 field in the serial
record. We could pull this from the <tt class="docutils literal">asset.call_number</tt> table, but
this will be good enough for the first pass.</p>
<pre class="literal-block">
UPDATE holdings.conifer SET call_number = UNNEST(
    XPATH(
        '//*[local-name()="datafield"][@tag="852"]/'
            || '*[local-name()="subfield"][@code="h"]/text()',
        (
            SELECT sre.marc::xml
              FROM serial.record_entry sre
              INNER JOIN holdings.conifer hc ON hc.record = sre.record
              WHERE hc.record = holdings.conifer.record
              LIMIT 1
        )
    )
);
</pre>
<p>Now we need to generate usable holdings statements for the print.
Evergreen includes a great MFHD parsing library written in Perl, and
PostgreSQL thankfully enables you to create functions written in Perl,
but to get the following to work on a non-Evergreen machine, I had to
copy <tt class="docutils literal"><span class="pre">Open-ILS/src/perlmods/OpenILS/Utils/MFHD/*</span></tt> to
<tt class="docutils literal">/usr/local/share/perl/5.10.0</tt> and edit the occurrences of
<tt class="docutils literal"><span class="pre">OpenILS::Utils::MFHD::*</span></tt> to *.</p>
<pre class="literal-block">
CREATE OR REPLACE FUNCTION holdings.parse_mfhd ( xml TEXT ) RETURNS TEXT AS $_$
    use MARC::Record;
    use MARC::File::XML;
    use MFHD;

    my $xml = shift;
    my $text;
    my $captions;

    my $marc = MARC::Record->new_from_xml( $xml );
    my $mfhd = MFHD->new($marc);

    foreach my $field ($marc->field('866')) {
        my $holdings = $field->subfield('a');
        if ($holdings) {
            my $public_note = $field->subfield('z');
            if ($public_note) {
                $text .= "$holdings - $public_note";
            } else {
                $text .= "$holdings";
            }
        }
    }

    foreach my $cap_id ($mfhd->captions('853')) {
        my @curr_holdings = $mfhd->holdings('863', $cap_id);
        next unless scalar @curr_holdings;
        foreach (@curr_holdings) {
            if ($captions) {
                $captions .= ', ';
            }
            $captions .= $_->format();
        }
    }

    if ($text and $captions) {
        $text = "$text / $captions";
    } else {
        $text = "$text$captions";
    }

    return $text;
$_$ LANGUAGE PLPERLU;
</pre>
<p>And update the table:</p>
<pre class="literal-block">
UPDATE holdings.conifer SET coverage = (
    SELECT holdings.parse_mfhd(marc)
      FROM serial.record_entry
      WHERE serial.record_entry.record = holdings.conifer.record
      LIMIT 1
);
</pre>
<p>That almost works, but it only retrieves the coverage from a single
serial holdings record for a given bibliographic record, even though
there might be multiple serial holdings records. To amend that, we'll
create a PL/pgSQL function that concatenates all of the coverage
statements from all of the pertinent serial holdings records for a given
bibliographic record:</p>
<pre class="literal-block">
CREATE OR REPLACE FUNCTION holdings.print_coverage(marc_record BIGINT) RETURNS TEXT AS $$
DECLARE
    r RECORD;
    coverage TEXT;
BEGIN
    -- If coverage is NULL to begin with, then concatenating to it results in NULL
    coverage := '';
    -- RAISE NOTICE 'marc_record = %', marc_record;

    -- Loop over the serial records attached to the targeted bib record
    FOR r IN SELECT marc FROM serial.record_entry WHERE record = marc_record ORDER BY id LOOP
        coverage := coverage || holdings.parse_mfhd(r.marc);
        -- RAISE NOTICE 'r.marc = %', r.marc;
    END LOOP;
    -- RAISE NOTICE 'coverage = %', coverage;

    RETURN coverage;
END
$$ LANGUAGE 'plpgsql';
</pre>
<p>And we'll use this fancy new function to update the print holdings
statements again with the more complete coverage:</p>
<pre class="literal-block">
UPDATE holdings.conifer SET coverage = (
    SELECT holdings.print_coverage(record)
      FROM serial.record_entry
      WHERE serial.record_entry.record = holdings.conifer.record
      LIMIT 1
);
</pre>
<p>Now the payoff: generating a list of matching ISSNs from the electronic
holdings and our print holdings, with the coverage statements for each,
for a subset of the SFX collections to which we have access:</p>
<pre class="literal-block">
-- Set the display to expanded format for easy reading
\x

-- Basic report for perusal
SELECT hsfx.issn AS "ISSN", hsfx.title AS "Title",
        hsfx.collection AS "SFX Collection",
        hsfx.coverage AS "Electronic Coverage",
        hc.coverage AS "Print Coverage",
        hc.call_number AS "Call Number"
    FROM holdings.sfx hsfx
    INNER JOIN holdings.conifer hc ON hsfx.issn = hc.issn
    WHERE (hsfx.collection ILIKE '%JStor%' OR hsfx.collection LIKE '%Scholars%')
        AND hc.coverage > ''
    LIMIT 5;
</pre>
<p>That results in:</p>
<pre class="literal-block">
-[ RECORD 1 ]-------+--------------------------------------------------------------------------
ISSN                | 0142-2774
Title               | Journal of Occupational Behavior
SFX Collection      | JSTOR Arts and Sciences 4
Electronic Coverage | Available from 1980 until 1987.
Print Coverage      | Vol. 1 No. - Vol. 8 No. 4 (1980-1987)
Call Number         | DESM-PER
-[ RECORD 2 ]-------+--------------------------------------------------------------------------
ISSN                | 0741-6261
Title               | The Rand Journal of Economics
SFX Collection      | JSTOR Arts and Sciences 2
Electronic Coverage | Available from 1984 until 2006.
Print Coverage      | V.17 (1986) - v.23 (1992)
Call Number         | DESM-PER
-[ RECORD 3 ]-------+--------------------------------------------------------------------------
ISSN                | 0002-8614
Title               | Journal of the American Geriatrics Society
SFX Collection      | Scholars Portal
Electronic Coverage | Available from 2001 volume: 49 issue: 1 until 2009 volume: 57 issue: 10.
Print Coverage      | Vol. 1 - 37 (1953-1989)
Call Number         | DESM-PER
-[ RECORD 4 ]-------+--------------------------------------------------------------------------
ISSN                | 0023-7639
Title               | Land Economics
SFX Collection      | JSTOR Arts and Sciences 7
Electronic Coverage | Available from 1948 until 2005.
Print Coverage      | v.62 (1986) - v.68 (1992)
Call Number         | DESM-PER
-[ RECORD 5 ]-------+--------------------------------------------------------------------------
ISSN                | 0090-2616
Title               | Organizational dynamics
SFX Collection      | Scholars Portal
Electronic Coverage | Available from 1995 volume: 23 issue: 3 until 2009 volume: 38 issue: 3.
Print Coverage      | Vol. 15 No. - Vol. 23 No. 5 (Summer 1986-Spring 1995)
Call Number         | DESM-PER
</pre>
<p>Looks pretty good to these eyes. Okay, now we'll get serious and dump
the output to a tab-delimited file so we can easily open it in
OpenOffice.org Calc or another spreadsheet:</p>
<pre class="literal-block">
-- Set delimiter to TAB (type CTRL-V then TAB to get a literal tab)
\f '	'
-- Set the output to being unaligned
\a
-- Dump the output to a file
\o /tmp/periodicals.tsv
-- Generate URLs for quick catalogue lookups
SELECT 'http://laurentian.concat.ca/opac/en-CA/skin/lul/xml/rdetail.xml?r='
        || hc.record || '&l=105&d=1' AS "URL",
        hsfx.issn AS "ISSN", hsfx.title AS "Title",
        hsfx.collection AS "SFX Collection",
        hsfx.coverage AS "Electronic Coverage",
        hc.coverage AS "Print Coverage",
        hc.call_number AS "Call Number"
    FROM holdings.sfx_complete hsfx
    INNER JOIN holdings.conifer hc ON hsfx.issn = hc.issn
    WHERE (hsfx.collection ILIKE '%JStor%' OR hsfx.collection LIKE '%Scholars%')
        AND hc.coverage > '';
</pre>
<p>And that's it. It might seem complex, but I've found that investing the
effort into learning how to lean on PostgreSQL to do the hard work pays
plenty of dividends. This exploration should help me contribute more
functionality to Evergreen core; for example, I hope to use my
experiments with the pl/Perl function to start populating the
<tt class="docutils literal">serial.bib_summary</tt> tables using an INSERT/UPDATE/DELETE trigger on
<tt class="docutils literal">serial.record_entry</tt> so that we don't have to generate the summaries
for every item details request in the catalogue.</p>
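<p>For the curious, the trigger side of that idea might look something
like the following - purely a sketch, as the column names on
<tt class="docutils literal">serial.bib_summary</tt> are guesses on my part:</p>
<pre class="literal-block">
CREATE OR REPLACE FUNCTION serial.refresh_bib_summary() RETURNS TRIGGER AS $$
DECLARE
    bib BIGINT;
BEGIN
    -- On DELETE there is no NEW row to inspect, so fall back to OLD
    IF TG_OP = 'DELETE' THEN
        bib := OLD.record;
    ELSE
        bib := NEW.record;
    END IF;
    UPDATE serial.bib_summary
       SET summary = holdings.print_coverage(bib)
     WHERE record = bib;
    RETURN NULL; -- return value is ignored for AFTER triggers
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER refresh_bib_summary
    AFTER INSERT OR UPDATE OR DELETE ON serial.record_entry
    FOR EACH ROW EXECUTE PROCEDURE serial.refresh_bib_summary();
</pre>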
</div>
FSOSS 2009: Project Conifer update2009-11-10T04:00:00-05:002009-11-10T04:00:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-11-10:/fsoss-2009-project-conifer-update.html<p><strong>Update: 2009-11-24</strong> James Forrester of the Ontario Academy of Art and
Design has posted a <a class="reference external" href="http://www.archive.org/details/DanScottpresentationEvergreeninOntarioMigratingAcademicLibraries">short
video</a>
(Internet Archive) of the presentation. Thanks, James!</p>
<p>On Friday, October 30th, I presented a status update on Project Conifer
at the <a class="reference external" href="http://fsoss.ca">Free Software Open Source Symposium
(FSOSS)</a>. This was a follow-up to the …</p><p><strong>Update: 2009-11-24</strong> James Forrester of the Ontario Academy of Art and
Design has posted a <a class="reference external" href="http://www.archive.org/details/DanScottpresentationEvergreeninOntarioMigratingAcademicLibraries">short
video</a>
(Internet Archive) of the presentation. Thanks, James!</p>
<p>On Friday, October 30th, I presented a status update on Project Conifer
at the <a class="reference external" href="http://fsoss.ca">Free Software Open Source Symposium
(FSOSS)</a>. This was a follow-up to the talk I gave
with John Fink at <a class="reference external" href="/archives/170-Evergreen-deOSSification-of-library-software.html">last year's
FSOSS</a>,
with the hopefully interesting twist that instead of talking about what
we were going to do, I talked about what we had done, and the lessons
learned along the way.</p>
<p>This was a slightly modified version of the talk I gave at the
<a class="reference external" href="/archives/201-Presentation-at-the-Lyrasis-Open-Source-in-Your-Library-conference.html">Lyrasis/NELINET open source
conference</a>
earlier in October, aimed at a more general audience. The talk was
recorded and will be posted online at the FSOSS site at some point.</p>
<p>Here are the slides in
<a class="reference external" href="/uploads/talks/2009/FSOSS_Conifer_update.odp">(ODP)</a>
and
<a class="reference external" href="/uploads/talks/2009/FSOSS_Conifer_update.pdf">(PDF)</a>
format. The speaker notes on the slides will give you the meat of the
content.</p>
Evergreen development workshop at FSOSS 20092009-10-30T17:24:00-04:002009-10-30T17:24:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-10-30:/evergreen-development-workshop-at-fsoss-2009.html<p><strong>Update 2009-11-24</strong> Robert Soulliere has also made the videos
available via the <a class="reference external" href="http://www.archive.org/details/EvergreenDeveloperSeries-Fsoss2009%20">Internet
Archive</a>
- thanks again, Robert!</p>
<p><strong>Update 2009-11-09</strong> As promised, Robert Soulliere has <a class="reference external" href="http://markmail.org/message/4ntfbshzhhmjn2tw">posted the
video recordings</a> he
made of the workshop - thanks, Robert!</p>
<p>Yesterday, I led a three-hour Evergreen development workshop at the
<a class="reference external" href="http://fsoss.ca">Free Software Open Source Symposium …</a></p><p><strong>Update 2009-11-24</strong> Robert Soulliere has also made the videos
available via the <a class="reference external" href="http://www.archive.org/details/EvergreenDeveloperSeries-Fsoss2009%20">Internet
Archive</a>
- thanks again, Robert!</p>
<p><strong>Update 2009-11-09</strong> As promised, Robert Soulliere has <a class="reference external" href="http://markmail.org/message/4ntfbshzhhmjn2tw">posted the
video recordings</a> he
made of the workshop - thanks, Robert!</p>
<p>Yesterday, I led a three-hour Evergreen development workshop at the
<a class="reference external" href="http://fsoss.ca">Free Software Open Source Symposium</a>. I had
promised Nick Ruest from McMaster that it wouldn't be three hours of me
talking... but in prepping for the workshop, I ran out of time putting
together the virtual image that was going to include all of the tutorial
materials... and therefore, ended up talking for almost three hours. Not
ideal. Interestingly, there were a number of non-library-world attendees
who were interested in OpenSRF, so I was able to spend most of the first
hour covering that framework and (I think) managed to successfully keep
their attention for that period of time. I wasn't surprised to see them
leave once we hit more library-centric content <img alt=":-)" class="emoticon" src="/images/smile.png" /></p>
<p>That said, there is a stake in the ground now for developers who are
relatively new to Evergreen. The assumption is that the developer is
already comfortable with basic install and configuration of OpenSRF and
Evergreen, at least as far as following the <a class="reference external" href="http://evergreen-ils.org/dokuwiki/doku.php?id=server:1.6.0:install">install
instructions</a>,
and that the developer is comfortable writing one or both of Perl or
JavaScript. I posit that such a person should be able to work through
the <a class="reference external" href="http://evergreen-ils.org/~denials/workshop.html">workshop
tutorial</a> and follow
the workshop slides through the evolution of a CGI program to an OpenSRF
service that eventually taps into the Evergreen IDL (see <a class="reference external" href="http://evergreen-ils.org/~denials/workshop.tar.gz">workshop
tarball</a>).</p>
<p>In writing this down and trying to provide basic examples that can be
building blocks for bigger applications, I surprised myself by how much
I had to re-learn or in some cases learn for the first time. But now
it's written down, and the re-learning path (because my brain is full
and constantly rids itself of even painfully learned lessons) will be
much shorter. And I hope that this makes it easier for others to become
productive OpenSRF and Evergreen developers as well.</p>
<p>This content will continue to evolve and improve over time, as I'm
betting that my fellow Evergreen developers will suggest improvements to
the materials. Note that I'm delivering a four-hour workshop covering
much of the same material at the OLA SuperConference in 2010. The extra
hour should give us time to complete some hands-on exercises, and I'll
incorporate the feedback that I've received from the FSOSS workshop for
the OLA workshop. (Your feedback is always welcome, either in comments
to this post or via email at <a class="reference external" href="mailto:dan@coffeecode.net">dan@coffeecode.net</a>). It would be great to
see other people take these materials and improve and deliver them as
well - they're under a CC-BY-SA license - so if there's interest, I'll
be happy to check them into a public source repository (hmm, maybe a bzr
branch at the <a class="reference external" href="http://code.launchpad.net/evergreen">Evergreen
Launchpad</a> project).</p>
<p>Oh! And Robert Soulliere from Mohawk College recorded the entire
workshop and plans to make it available online. So if you need some
sleep, those video segments will be available!</p>
Presentation at the Lyrasis "Open Source in Your Library" conference2009-10-10T17:34:00-04:002009-10-10T17:34:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-10-10:/presentation-at-the-lyrasis-open-source-in-your-library-conference.html<p>On Friday, October, 9th, I had the pleasure of (along with Joe Lucia and
Karen Coombs) speaking at the Lyrasis "Open Source in Your Library"
conference at the Olin College of Engineering in Needham, MA. First, a
note about Olin College - it is a very modern campus that makes an …</p><p>On Friday, October, 9th, I had the pleasure of (along with Joe Lucia and
Karen Coombs) speaking at the Lyrasis "Open Source in Your Library"
conference at the Olin College of Engineering in Needham, MA. First, a
note about Olin College - it is a very modern campus that makes an
excellent venue for a single-track conference (New England
#code4libbers, take note!). Second, this had originally been a NELINET
conference, but as of last week NELINET had merged with Lyrasis to
create a regional library non-profit organization that spans most of the
East Coast of the United States.</p>
<p>My presentation slides (with copious speaker notes) are available in
<a class="reference external" href="/uploads/talks/2009/NELINET_2009_Developing.odp">OpenOffice.org
Impress</a>,
<a class="reference external" href="/uploads/talks/2009/NELINET_2009_Developing.pdf">PDF</a>,
and <a class="reference external" href="/uploads/talks/2009/NELINET_2009_Developing.ppt">PowerPoint
format</a></p>
<p>I had been asked to talk about Conifer's experiences implementing
Evergreen, as there is certainly some interest on the part of Lyrasis
member organizations in open source library systems. I chose to tell the
unvarnished story of Conifer: how we decided to build a consortial
academic library system on Evergreen, what steps we have taken in the
past two years, and probably more importantly what missteps we have
taken over the past two years. I told some cautionary tales that were
hopefully useful to others considering the same path, and then discussed
the state of the Evergreen community.</p>
<p>As a quick recap, the biggest challenges we hit on the road to adopting
Evergreen were:</p>
<ul class="simple">
<li>Finding skilled developer resources that could commit time to help us
develop solutions for some of our requirements was challenging, even
when we did have financial resources.</li>
<li>Our largest founding partner withdrew from the project months before
we were set to go live.</li>
<li>Due to the effects of the recession on provincial and therefore
university finances, and the increased burden on the remaining
Conifer partners for the shared costs that weren't reduced after the
partner's withdrawal, our collective budget was slashed and we ended
up having to pay opportunity costs by focusing on migrating our own
data rather than outsourcing that role and focusing several months of
effort on development.</li>
</ul>
<p>I noted that our efforts to build a reserves system
(<a class="reference external" href="http://svn.open-ils.org/trac/ILS-Contrib/wiki/SyrupReserves">Syrup</a>)
have thus far resulted in a loosely coupled reserves system that none of
us have been able to use - but that for the time being Evergreen's
bookbags have served as a reasonable replacement for lists of
monographic reserve items, and that the discussion about how to more
tightly couple Syrup with Evergreen has resumed (and is currently
waiting on me for a response)... so there's hope that we might be able
to deploy the all-singing, all-dancing reserves system next term.</p>
<p>I confessed that we're using spreadsheets to track acquisitions while
Evergreen's native acquisitions system solidifies (although, given the
current state of our budget, spreadsheets are all that we need for the
time being - sigh). Joe Lucia had remarked during his own presentation
that an acquisitions system that can handle the rather complex
requirements of academic institutions was a showstopper for his library.
In the Evergreen 1.6 release, you can see that the acquisitions system
is almost ready; we loaded six years of historical acquisitions data
into a test server and were able to do most of what we need, subject to
some refinements. I think it has been an extremely challenging balancing
act for Bill Erickson to juggle the requirements of academic libraries
with those of large consortial public library systems to come up with
something that can make everyone happy (as happy as you can possibly be
with acquisitions), but the progress over the summer has been
encouraging.</p>
<p>On a more positive note, one of the great advantages of adopting a
consortial library system is that I was able to take two months of
parental leave and not worry about the state of the system at all. We
have shared responsibilities across the consortial partners, such that I
can actually turn off my cell phone when it's not my turn to respond to
problem reports. And during my absence, my colleagues (Art Rhyno, Robin
Isard, Kevin Beswick) all gained a lot of confidence in their own
understanding of the system. This shared responsibility should also pay
dividends when we put together processes for reporting records to our
various consortial catalogues (such as AMICUS): rather than each of us
having to rediscover the process on our own, we can collaborate and
improve upon each other's work. It's a lot less lonely being a systems
librarian in a consortial library system, let me tell you!</p>
<p>I also shared our positive experiences with Evergreen's uptime and with
<a class="reference external" href="http://esilibrary.com">Equinox</a> as a support provider. The few times
that we have had outages, they have been relatively brief and when we
have opened a problem ticket with Equinox, they have responded quickly.
Robin measured our uptime over the last two months at 99.5% - which
isn't five nines, but is still far better than the 75% (maximum) that we
had with our previous system due to the six hours it was down every
night for backups. We also chalk up some of the downtime so far to
learning experiences; we're refining the configuration of the system and
improving our own knowledge of how to maintain the system without
incurring an outage. So, I expect that we'll eke our way back up over
the next few months to an even better uptime percentage.</p>
<p>On the topic of the Evergreen community, I compared several
commonly-used objective measures of the health of a given open source
community, such as mailing list volume, number of contributors and
contributing organizations, and release frequency with Evergreen's track
record. We're doing reasonably well on the mailing list front, and we've
seen a small increase in the number of patch contributors, but I think
we need to make the on-ramp to Evergreen development slightly easier to
ascend. This is why I'm trying to create a set of tutorials for new
developers, starting with basic OpenSRF, extending through database
access methods such as open-ils.cstore and open-ils.pcrud, rounding off
with the IDL-aware custom Dojo widgets that Bill Erickson has put
together, and perhaps giving people enough XUL to know how to add a new
menu entry to the staff client. (I really can't tackle XUL, too, in just
one half-day workshop!) If our community has a broader set of developers
capable of contributing to the project, then we can expect to see more
customization and extensions available - and possibly more committers.</p>
<p>On the release front, I got a rueful laugh from the audience when I said
that the Evergreen 1.6 release was expected within a few days - "just
like we [the developers] said at the Evergreen International
Conference". I acknowledged that we've had trouble getting high quality
releases out the door - that it took months, and five point releases,
before the 1.4 release was really usable out of the box, and that it had
taken even longer to get 1.6 out for a release. But I also promised that
we (the core committers) had been discussing ways that we can improve
the release process; for example, Mike Rylander had committed resources
from Equinox to help build a suite of regression tests so that we could
have automated nightly builds with known pass/fail rates, and on the
mailing list we had been discussing different approaches to bug-tracking
and development (including the possibility of using distributed version
control systems to do feature development in branches instead of trunk).</p>
<p>On the state of the community, I applauded the Evergreen Documentation
Interest Group (DIG) for leading the charge in taking a team-based
approach to tackling a problem. I pointed to this as a sign the
community was maturing beyond its origins of a core set of contributors
who did everything from maintaining servers to creating Web site content
to development, to a set of more focused teams that would be able to
achieve more through close collaboration on their objectives. We're
seeing that in discussions about a Quality Assurance (QA) team, as well,
that would be responsible for tracking and verifying bugs in a public
repository and (probably) enhancing the tests that let us measure the
quality of the project code at any given time. I can imagine other
possible teams charged with Web site design and content maintenance,
perhaps as a more focused spin-off of the DIG; an internationalization
team, focused on enabling translations and managing contributed
translations; and an infrastructure team responsible for maintaining the
health of the project servers.</p>
<p>Speaking of the community, this is probably a good time to suggest <a class="reference external" href="http://www.artofcommunityonline.org/2009/09/18/the-art-of-community-now-available-for-free-download/">The
Art of
Community</a>
by Jono Bacon (Ubuntu Community Manager) as an excellent read - at least
based on the first half of the book that I've managed to get through
during my travels.</p>
<p>So, with that, I head back home (thanks <a class="reference external" href="http://bpl.org">Boston Public
Library</a> for the free wifi). We have challenges to
tackle in both Project Conifer and in the growth of the Evergreen
community, but knowing the people involved in both of these efforts, I'm
confident that we're going to make a huge amount of progress over the
next few months.</p>
Using nginx to serve static content with Evergreen2009-10-04T04:51:00-04:002009-10-04T04:51:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-10-04:/using-nginx-to-serve-static-content-with-evergreen.html<p><strong>Update 2009-10-04</strong> Added a title to the post; oops!</p>
<p>A long time ago, when I discovered that Evergreen was chewing up and
spitting out Apache backends at a furious pace because Apache was being
used to serve up static content like CSS, JavaScript, and image files, I
<a class="reference external" href="http://markmail.org/message/ndzjweq4luenjvzj">suggested</a> that using …</p><p><strong>Update 2009-10-04</strong> Added a title to the post; oops!</p>
<p>A long time ago, when I discovered that Evergreen was chewing up and
spitting out Apache backends at a furious pace because Apache was being
used to serve up static content like CSS, JavaScript, and image files, I
<a class="reference external" href="http://markmail.org/message/ndzjweq4luenjvzj">suggested</a> that using
<a class="reference external" href="http://nginx.org/">nginx</a> to serve up the static content and
proxying the dynamic requests to Apache would be a good solution to a
number of problems we were facing. Here we are, five months later, and
I've managed to put in a few hours tonight (amidst stomach-wrenching
laughter at SNL's "Threw it on the ground" tune) to get a proof of
concept configuration working on the Ubuntu 9.10 beta release.</p>
<p>The following nginx configuration hasn't been tested in a production
environment yet, and isn't tuned beyond the defaults that ship with
Ubuntu Karmic, but it works on my laptop in a virtual image for both
regular HTTP and SSL requests - so what could possibly go wrong?</p>
<p>Steps to get this working on Ubuntu Karmic, assuming that nginx and
Apache are running on the same server:</p>
<ol class="arabic simple">
<li>Install nginx: <tt class="docutils literal">sudo aptitude install nginx</tt></li>
<li>Copy the <a class="reference external" href="http://evergreen-ils.org/dokuwiki/doku.php?id=server_installation:nginx_proxy">configuration
file</a>,
changing "192.168.69.107" to match your server's IP address or host
name, into a file called <tt class="docutils literal"><span class="pre">/etc/nginx/sites-available/evergreen</span></tt> and
create a symbolic link to the file at
<tt class="docutils literal"><span class="pre">/etc/nginx/sites-enabled/evergreen</span></tt></li>
<li>Modify <tt class="docutils literal">/etc/apache2/ports.conf</tt> to change port 80 to 9080 and port
443 to 9443.</li>
<li>Modify <tt class="docutils literal">/etc/apache2/eg_vhost.conf</tt> to change the "Listen 443"
directive to "Listen 9443"</li>
<li>Restart nginx and Apache to put the new configuration in place</li>
<li>Enjoy!</li>
</ol>
<p>As I said, there's probably plenty of room for improvement; I have only
a few hours of experimentation with nginx under my belt at this point.
But assuming no showstoppers turn up after further testing, I would
expect to see this going into production in Conifer sooner rather than
later, and potentially becoming a standard part of any production
Evergreen system.</p>
Evergreen Developer Basics Workshop at FSOSS 20092009-10-03T01:04:00-04:002009-10-03T01:04:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-10-03:/evergreen-developer-basics-workshop-at-fsoss-2009.html<p>If you're working on or interested in working on the
<a class="reference external" href="http://evergreen-ils.org">Evergreen</a> open source library system, and
you can be in the Toronto area on October 29th, 2009, you might want to
spend $75 and register for the <a class="reference external" href="http://fsoss.senecac.on.ca/2009/">Free Software Open Source Symposium
(FSOSS)</a> to be held at the
<a class="reference external" href="mailto:Seneca@York">Seneca@York …</a></p><p>If you're working on or interested in working on the
<a class="reference external" href="http://evergreen-ils.org">Evergreen</a> open source library system, and
you can be in the Toronto area on October 29th, 2009, you might want to
spend $75 and register for the <a class="reference external" href="http://fsoss.senecac.on.ca/2009/">Free Software Open Source Symposium
(FSOSS)</a> to be held at the
<a class="reference external" href="mailto:Seneca@York">Seneca@York</a> campus. You'll get a three hour workshop introducing you to
Evergreen development out of the deal, plus your choice of another
workshop on the 29th and the ability to attend all of the FSOSS
presentations on the 30th. I attended FSOSS last year for the first time
and was stunned at the high quality of the conference.</p>
<p>I apologize for the late notice that means that you missed out on the
$30 early registration special; I did not hear until this morning that
my workshop proposal had been accepted. This seems in keeping with this
year's edition of FSOSS, as the conference Web site also seems to be a
bit behind where one would expect with only four weeks to go (heh). The
late notice will also mean that most of my spare minutes will be soaked
up for the rest of the month preparing the workshop materials, but
building a collection of Evergreen development tutorials for the
community is high on my personal list of goals, so it will definitely be
worth it. Expect a high-energy presentation!</p>
<p>Here are the particulars for the workshop:</p>
<p><strong>Workshop title</strong>: Evergreen Library System Development Basics</p>
<p><strong>Workshop description</strong>: Over the past year, Evergreen has been
adopted by a number of libraries in Ontario. While it is built on a
flexible, scalable architecture and offers an impressive set of
features, the Evergreen community needs a broader base of developers
who are able to contribute to the base functionality and create
customized Evergreen instances. This workshop will provide developers
with the tools they need to contribute to the Evergreen project and
better serve their libraries, tackling subjects such as creating a new
OpenSRF service, accessing data with permission-based methods,
customizing the database schema and IDL, and building AJAX interfaces
with the OpenILS Dojo widgets.</p>
Two podcasts of potential interest to Evergreen fans2009-09-15T20:59:00-04:002009-09-15T20:59:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-09-15:/two-podcasts-of-potential-interest-to-evergreen-fans.html<p>Most recently, the latest <a class="reference external" href="http://www.softwarefreedom.org/podcast/">Software Freedom Law
Show</a> focuses on the subject
of how to choose a license for your software project's documentation.
The episode was a direct response to a
<a class="reference external" href="http://identi.ca/notice/8976792">dent</a> I had sent to one of the
hosts, Bradley Kuhn, suggesting the subject. I thought the <a class="reference external" href="http://www.evergreen-ils.org/dokuwiki/doku.php?id=evergreen-docs:dig">Evergreen
Documentation Interest
Group</a>
might find it a useful treatment from two of the most knowledgeable
folks in the free software licensing world. As a bonus, when I started
listening to the episode today, I was pleased to hear Bradley lead in
with a very positive mention of Evergreen. Many thanks, Bradley, both
for the show and for the shout-out to Evergreen!</p>
<p>Also, back in July, I had the opportunity to travel to <a class="reference external" href="http://algomau.ca">Algoma
University</a> in Sault Ste. Marie to spend a few
days locked in a room with my fellow Conifer propeller-heads (Art,
Kevin, and Robin) to dump the Evergreen-related content of my brain out
onto the table in preparation for my parental leave. As part of the
visit, we joined in the <a class="reference external" href="http://tangentialconvergence.blogspot.com">Tangential
Convergence</a> crew to put
together a <a class="reference external" href="http://tangentialconvergence.blogspot.com/2009/08/episode-17-searching-for-evergreens.html">podcast about Conifer and
Evergreen</a>
in the standard Tangential Convergence style: having a few beers while
sitting around a table in Dave Brodbeck's backyard. We ended up veering
off onto other subjects rather quickly, but such is the nature of the
show!</p>
<p><strong>Addendum @ 20:44</strong></p>
<ol class="arabic simple">
<li>In the SFLC podcast, Bradley was riffing about my role in Evergreen
based on his memory of my FSOSS presentation from almost a year ago,
so to set the record straight - I'm a relative newcomer to Evergreen,
having joined the project in 2007 after Mike Rylander, Bill Erickson,
and Jason Etheridge had already accomplished the miracle of
delivering the first release of Evergreen to the public libraries of
the state of Georgia.</li>
<li>Also, in the opening moments of the SFLC podcast, there's a mention
of how Evergreen filled a gap in the free software universe (library
systems); one should note that <a class="reference external" href="http://koha.org">Koha</a> tackled
that gap a lot earlier (starting in 1999) and is also a thriving
project today.</li>
</ol>
SFX target parser for Evergreen and some thoughts about searching identifiers2009-06-29T17:00:00-04:002009-06-29T17:00:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-06-29:/sfx-target-parser-for-evergreen-and-some-thoughts-about-searching-identifiers.html<p><strong>UPDATE 2010-03-10</strong> See <a class="reference external" href="/archives/217-More-granular-identifier-indexes-for-your-Evergreen-SRU-Z39.50-servers.html">More granular identifier indexes for your
Evergreen SRU / Z39.50
servers</a>
for some recommended enhancements to the target parser and Evergreen's
identifier index capabilities</p>
<p><a class="reference external" href="http://laurentian.ca">Laurentian University</a> is part of the <a class="reference external" href="http://www.ocul.on.ca/">Ontario
Council of University Libraries (OCUL)</a>, and
a user of the centrally hosted <a class="reference external" href="http://scholarsportal.info">Ontario Scholars
Portal</a> SFX link resolver, so one of the
things we needed when we migrated to Evergreen was a target parser for
our link resolver. This is the target associated with <em>Search the
library catalogue</em> that is the last resort when the resolver fails to
turn up any full-text resources for a given OpenURL - so hopefully it
won't need to be invoked too often, as we have a very rich set of
full-text electronic resources at Laurentian University.</p>
<div class="section" id="the-code">
<h2>The code</h2>
<p>Here is a quick implementation of a target parser that generates search
URLs based on ISSN, ISBN, book title, or journal title. Pretty
impoverished from an OpenURL perspective, but it maintains the same
level of functionality from our previous system. In
<strong>TargetParser/Evergreen/Conifer.pm</strong> I created a target parser called
Evergreen::Conifer that implements a subset of the Parsers::TargetParser
API for SFX as follows:</p>
<pre class="literal-block">
package Parsers::TargetParser::Evergreen::Conifer;

use Parsers::TargetParser;
use base qw(Parsers::TargetParser);
use strict;

sub getHolding {
    my ($this, $genRequestObj) = @_;

    my $objectType = $genRequestObj->{'objectType'};
    my $ISBN = $genRequestObj->{'ISBN'};
    my $eISBN = $genRequestObj->{'eISBN'};
    my $ISSN = $genRequestObj->{'ISSN'};
    my $eISSN = $genRequestObj->{'eISSN'};
    my $CODEN = $genRequestObj->{'CODEN'};
    my $bookTitle = $genRequestObj->{'bookTitle'};
    my $journalTitle = $genRequestObj->{'journalTitle'};

    # Canonical search results URL for simple searches:
    # http://laurentian.concat.ca/opac/en-CA/skin/lul/xml/rresult.xml?rt=keyword&tp=keyword&t=0895-2779&l=105&d=2&f=&av=
    my $svc = $this->{svc};
    my $egHost = $svc->parse_param('eg_host');
    my $egLocale = $svc->parse_param('eg_locale');
    my $egSkin = $svc->parse_param('eg_skin');
    my $egOrgUnit = $svc->parse_param('eg_org_unit');
    my $egDepth = $svc->parse_param('eg_depth');

    my $path = "http://${egHost}/opac/${egLocale}/skin/${egSkin}/xml/rresult.xml?l=${egOrgUnit}&d=${egDepth}";
    my $searchString = '&rt=keyword&tp=keyword&t=';

    if (defined($ISSN)) {
        if ($ISSN =~ m/x/i) {
            # Current indexer doesn't deal well with ISSNs containing
            # an X, so break it up
            $ISSN =~ s/^(\d{4})-?(\d+)x/$1 -$2 x/i;
            $searchString .= $ISSN;
        } else {
            $searchString .= "\"$ISSN\""; # format 9999-9999 for MARC
        }
    } elsif (defined($ISBN)) {
        # Evergreen doesn't force ISBNs to be stripped of hyphens, so take whatever
        $searchString .= "\"$ISBN\"";
    } elsif (defined($journalTitle)) {
        # Restrict searches to title index, with bibliographic level = s
        $searchString .= "ti:${journalTitle}&bl=s";
    } elsif (defined($bookTitle)) {
        # Restrict searches to title index, with bibliographic level = m
        $searchString .= "ti:${bookTitle}&bl=m";
    }

    return ($path . $searchString);
}

1;
</pre>
<p>And here's the help that I added to the corresponding <strong>Conifer.hlp</strong>
file:</p>
<div class="line-block">
<div class="line"><strong>General Information</strong></div>
</div>
<div class="line-block">
<div class="line">Target - LOCAL_CATALOGUE_EVERGREEN_CONIFER</div>
</div>
<div class="line-block">
<div class="line">Service - getHolding</div>
</div>
<div class="line-block">
<div class="line">Parser - Evergreen::Conifer</div>
</div>
<div class="line-block">
<div class="line"><strong>Information needed in the Target Service:</strong></div>
</div>
<div class="line-block">
<div class="line">In the PARSE_PARAM field, replace the following information:</div>
</div>
<div class="line-block">
<div class="line">eg_host = $$LOCAL_CATALOGUE_SERVER</div>
</div>
<div class="line-block">
<div class="line">eg_locale = Locale (en-US, en-CA, fr-CA, etc)</div>
</div>
<div class="line-block">
<div class="line">eg_skin = algoma, default, lul, nohin, uwin</div>
</div>
<div class="line-block">
<div class="line">eg_org_unit = 103, 1, etc</div>
</div>
<div class="line-block">
<div class="line">eg_depth = 0, 1, 2, 3, etc</div>
</div>
</div>
<div class="section" id="findings-and-wishlists">
<h2>Findings and wishlists</h2>
<p>While it's quite easy to set up Evergreen as a searchable resource,
thanks to its straightforward URL syntax, one of the things that leaps
out at me is that Evergreen, by default, has no identifier index for
limiting searches by ISBN / ISSN / LCCN / OCLCnum. Ideally, we would
disable full-text indexing on this index so that we can more accurately
search for ISSNs that include an <strong>x</strong>. Right now we have to split ISSNs
with an "x" into constituent parts and generate searches on those parts,
which results in false hits from across the database. This would also be
useful for limiting Z39.50 searches.</p>
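<p>To make that concrete, here is a minimal sketch - in Python rather than
the parser's Perl, with names of my own invention, so treat it as an
illustration rather than anything in Evergreen - of the normalization and
check-digit validation an identifier index could apply, so that a trailing
<strong>x</strong> is handled as a legitimate check character instead of
being split apart:</p>
<pre class="literal-block">
# issn_normalize.py: hypothetical illustration, not Evergreen code
def normalize_issn(issn):
    """Return a canonical 9999-999X form, or None if validation fails."""
    raw = issn.replace('-', '').strip().upper()
    if len(raw) != 8 or not raw[:7].isdigit():
        return None
    # ISSN check digit: weights 8..2 over the first seven digits;
    # check = (11 - sum mod 11) mod 11, with 10 written as 'X'
    total = sum(int(c) * w for c, w in zip(raw[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    if raw[7] != ('X' if check == 10 else str(check)):
        return None
    return raw[:4] + '-' + raw[4:]

print normalize_issn('0895-2779')   # 0895-2779
print normalize_issn('0378-5955')   # 0378-5955
</pre>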
<p>I would also like to teach Evergreen about ISBN-10/ISBN-13 equivalence,
to broaden the search while maintaining precision. And I would like to
automatically normalize ISSN and ISBN formats so that I don't have to
worry about whether a cataloguer entered hyphens or not - and the same
for incoming search terms.</p>
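<p>The ISBN-10 to ISBN-13 mapping itself is mechanical enough to sketch in
a few lines of Python (again, hypothetical illustration code, not anything
shipping in Evergreen): prefix the nine significant digits with 978,
recompute the ISBN-13 check digit, and the two forms can be indexed and
compared as one:</p>
<pre class="literal-block">
# isbn_equiv.py: hypothetical illustration, not Evergreen code
def isbn10_to_isbn13(isbn10):
    """Convert a (possibly hyphenated) ISBN-10 to its ISBN-13 form."""
    # Keep the nine significant digits; the old check digit is discarded
    core = '978' + isbn10.replace('-', '').strip()[:9]
    # ISBN-13 check digit: alternating weights of 1 and 3,
    # then (10 - sum mod 10) mod 10
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(core))
    return core + str((10 - total % 10) % 10)

print isbn10_to_isbn13('0-306-40615-2')   # 9780306406157
</pre>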
<p>Finally, to support services like
<a class="reference external" href="http://www.worldcat.org/affiliate/webservices/xisbn/app.jsp">xISBN</a>
that search for multiple formats and editions of a given work by
generating a shotgun blast of ISBNs for all known representations, I
would love to teach Evergreen how to accept a list of identifiers as
search input.</p>
<p>Don't ask me when these things will happen, though; if it requires work
from me, it will probably be 2010 before any of it happens.</p>
</div>
Globalization presentation at Evergreen International Conference 20092009-06-05T02:12:00-04:002009-06-05T02:12:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-06-05:/globalization-presentation-at-evergreen-international-conference-2009.html<p>I was fortunate to be invited to give a talk (<a class="reference external" href="/uploads/talks/2009/Globalization1.odp">OpenOffice.org
Impress</a> / <a class="reference external" href="/uploads/talks/2009/Globalization1.pdf">PDF</a>)
on Evergreen's progress on the
globalization front at the first ever <a class="reference external" href="http://evergreen-ils.org/dokuwiki/doku.php?id=eg09:main">Evergreen International
Conference</a>. My friend <a class="reference external" href="http://www.eifl.org/cps/sections/services/eifl-foss/foss-blog/2009_05_29_first-evergreen">Tigran
Zargaryan</a> from the <a class="reference external" href="http://www.flib.sci.am">Fundamental Science Library of the National Academy of
Sciences of the Republic of Armenia</a> gave a talk at almost the same
time about his library's progress in adopting Evergreen. Tigran himself
was responsible for the translation of the Evergreen catalogue and staff
client into Armenian, and he confided that he also expected to make
significant progress towards a Russian translation during the lengthy
layovers at airports that are part of his normal travel routine.</p>
<p>So, my goal was to provide an overview of the progress we have made in
taking Evergreen from its American English roots and enabling it to
support not just translated interfaces, but properly localized content
display - and to provide some pointers towards where we need to go next.
We have been making progress towards a more formalized translation
process, so keep an eye out for a call for translations in the next week
or two when the Evergreen 1.6 release candidate is made available for
testing. We currently sport Armenian, Canadian English, Canadian French,
and Czech translations, and welcome both new translations and revisions
to our current translations.</p>
<p>To make it easier for translators to collaborate, we need to take our
<a class="reference external" href=":8080">Pootle translation server</a> from a beta service running on my
poor little VPS to a real server. We have some technical challenges to
overcome - providing translation support for the Template::Toolkit
framework, for example. And we have some basic grunt work to do to
replace the hard-coded display of numbers, currencies, dates, and times
with localized variations throughout our code.</p>
<p>I was pleasantly surprised by the number of people attending the
session; I hadn't expected such an interest in the topic, despite it
nominally being an international conference. My only regret was that I
rushed off the stage without taking questions, in the mistaken belief
that I had used up all of my time and was eating into my successor's
presentation timeslot; as it turned out, there was a built-in 15-minute
buffer that I had overlooked. Ah well. Thanks to everyone who came out;
for everyone who wasn't able to make it to the session, I hope you'll
find the slides a good introduction to the state of globalization in
Evergreen. And if you have the skills to contribute, please consider
pitching in to the globalization enablement effort!</p>
Evergreen International Conference hackfest results: Evergreen serials support2009-05-27T18:23:00-04:002009-05-27T18:23:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-05-27:/evergreen-international-conference-hackfest-results-evergreen-serials-support.html<p>Yes, all of a sudden and rather quietly, Evergreen has serials support.</p>
<p>A few weeks ago, I finished hooking up a rudimentary serials holdings
display based on <a class="reference external" href="http://lisletters.fiander.info/">David Fiander's</a>
<a class="reference external" href="http://svn.open-ils.org/trac/ILS/browser/trunk/Open-ILS/src/perlmods/OpenILS/Utils/MFHD.pm">MFHD parsing
code</a>
to our production instance of Evergreen. We loaded our MFHD records from
our legacy system into Evergreen and that gave us enough breathing room
to keep working on the problem. By rudimentary I mean:</p>
<ul class="simple">
<li>limited to displaying one MFHD record per bibliographic record (a
problem for journals for which you have separate sets of holdings in
microfiche, print, etc)</li>
<li>serials holdings were displayed for a given bibliographic record no
matter what library scope you were searching in (more of a problem in
theory than in practice as we currently have one copy of a given
bibliographic record per library... that will change over time...)</li>
<li>no way to edit the MFHD records, which is a problem as the issues we
have received since migrating to Evergreen three weeks ago are
starting to pile up</li>
<li>limited to English labels in the interface</li>
</ul>
<p>Here's the rudimentary serials holdings display: <a class="reference external image-reference" href="/uploads/talks/2009/serials_display.png"><img alt="image0" class="serendipity-image-left" src="/uploads/talks/2009/serials_display.serendipityThumb.png" style="width: 110px; height: 89px;" /></a></p>
<p>The operative phrase is <em>was rudimentary</em>. In the past two weeks, things
have come a long way in Evergreen. The primary result of my afternoon of
work at the Evergreen International Hackfest, with lots of help from
Mike Rylander and Bill Erickson in navigating the impressive new <a class="reference external" href="http://dojotoolkit.org">Dojo
toolkit</a>-based Evergreen JavaScript widgets
and services in the upcoming Evergreen 1.6 release, was to add an
<strong>Edit</strong> button to the holdings display that shows up when the record is
viewed in the staff client. When pressed, the Edit button invokes a MARC
editor so that you can copy an 86[345] field and fill in the pertinent
information; or collapse holdings in the 86[678] fields, etc. It seems
like a minor victory, but it was a real result from the hackfest, and
that cannot be discounted!</p>
<p>Here's the MARC editor in action: <a class="reference external image-reference" href="/uploads/talks/2009/MFHD_editor.png"><img alt="image1" class="serendipity-image-left" src="/uploads/talks/2009/MFHD_editor.serendipityThumb.png" style="width: 110px; height: 89px;" /></a></p>
<p>Since then, I've been on fire... or maybe on a slow burn, as I put a few
hours in here and there, and am happy to say that when Evergreen 1.6 is
released, serials support will feature:</p>
<ul class="simple">
<li>support for displaying an unlimited number of MFHD records per bibliographic record</li>
<li>holdings display scoped by library search context - so you'll only
see holdings for the part of the library hierarchy that you're
searching, rather than the whole consortium</li>
<li>the <strong>Edit</strong> button for editing the raw MFHD record</li>
<li>internationalization support for interface labels, based on Dojo
string substitution</li>
</ul>
<p>I have already committed these features to the Evergreen trunk, but I
hope to add a few more pieces to the mix before the Evergreen 1.6
release is cut. We need to display the 852 field contents to identify
the location of each set of holdings, and we need to give cataloguers
the ability to edit some of the attributes (such as owning library).</p>
<p>Here are <a class="reference external" href="/uploads/talks/2009/Hackfest_results.pdf">the
slides</a>
I presented (largely screenshots of the serials display and edit button)
for the hackfest results lightning talk that I gave with Jeff Godin of
<a class="reference external" href="http://www.tadl.org/">Traverse Area District Library</a>. Jeff did some
interesting work in his own right on generating feeds for recently added
titles based on copy location during the hackfest.</p>
Conifer lives: Ontario launches a consortial academic library system built on Evergreen2009-05-11T21:21:00-04:002009-05-11T21:21:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-05-11:/conifer-lives-ontario-launches-a-consortial-academic-library-system-built-on-evergreen.html<p>I awoke around 4:48 am today. At the time, I thought it was just our baby kicking away excitedly. However, later this afternoon, I realized that it had been almost exactly a week ago, around 4:30 am on Monday, May 4th, that I sent a broadcast email message to librarians and staff at 24 different libraries. The Conifer consortial library system, built on the solid foundations of the Evergreen open-source library system, had gone live - and I was exhausted after a long weekend of migrating all of that data. I was proud to see the <a class="reference external" href="http://laurentian.concat.ca">Laurentian catalogue</a> sporting a completely different look and new functionality - reviews! book covers! sharable book bags! format & edition grouping! - and excited by the promise of more to come.</p>
<p>Conifer represents the first flowering of an effort that began back in July 2007 with a hand-shake agreement between <a class="reference external" href="http://laurentian.ca">Laurentian University</a>, <a class="reference external" href="http://mcmaster.ca">McMaster University</a>, and the <a class="reference external" href="http://uwindsor.ca">University of Windsor</a> to build a provincial, primarily academic, library system on Evergreen. The system is centrally hosted by the top-notch IT team at the <a class="reference external" href="http://www.uoguelph.ca/ccs/">University of Guelph</a>.</p>
<p>Things change, and along the way <a class="reference external" href="http://algomau.ca">Algoma University</a> and the <a class="reference external" href="http://nosm.ca">Northern Ontario School of Medicine</a> joined us as full partners, and McMaster University opted to continue contributing to the common development effort but withdrew from the centrally hosted system.</p>
<p>As noted, we went live on Monday, May 4th and we survived the first day. On Tuesday, May 5th we corrected a problem in our configuration that had caused some instability (thanks to Mike Rylander for providing the patch that set things straight). Since then, we have been slowly refining aspects of the system - setting up circulation rules, migrating records and items that had been missed over the weekend, polishing the Z39.50 server, fine-tuning the permissions scheme - but the core of the system is solid. We have a consortial system that stretches from the southern-most tip of Ontario to the north-west corner of the province (hello, Thunder Bay!), and so far connectivity seems good and the reliability of the system - which, upon launch, has probably become the second largest Evergreen implementation by number of bibliographic records - has been superb.</p>
<p>A few interesting statistics about Conifer... (have I mentioned how much I love that Evergreen is built on PostgreSQL because it becomes so simple to generate basic reports in plain SQL?):</p>
<div class="section" id="number-of-staff-and-user-accounts-per-library-in-conifer">
<h2>Number of staff and user accounts per library in Conifer</h2>
<pre class="literal-block">
conifer=# SELECT aou.name, count(au.id)
FROM actor.org_unit aou
INNER JOIN actor.usr au
ON aou.id = au.home_ou
GROUP BY aou.name
ORDER BY 2 DESC;
name | count
-------------------------------------------+-------
Leddy Library | 19468
J.N. Desmarais Library | 11921
Algoma University, Wishart Library | 2431
University of Sudbury | 1100
Hearst, Bibliothèque Maurice-Saulnier | 1043
Huntington College Library | 834
Paul Martin Law Library | 592
Northern Ontario School of Medicine (West) | 284
HRSRH Health Sciences Library | 261
Northern Ontario School of Medicine (East) | 224
Xstrata Process Support Centre Library | 122
NOHIN | 121
Instructional Media Centre | 9
Laboratoire de didactiques, E.S.E. | 7
Vale Inco | 4
Mines Library, Willet Green Miller Centre | 2
Art Gallery of Sudbury | 1
Curriculum Resource Centre | 1
Sault Area Hospital | 1
Centre Franco-Ontarien de Folklore | 1
Conifer | 1
(21 rows)
</pre>
</div>
<div class="section" id="number-of-copies-held-per-library-in-conifer">
<h2>Number of copies held per library in Conifer</h2>
<pre class="literal-block">
conifer=# SELECT aou.name, count(ac.barcode)
FROM actor.org_unit aou
INNER JOIN asset.copy ac
ON aou.id = ac.circ_lib
GROUP BY aou.name
ORDER BY 2 DESC;
name | count
-------------------------------------------+---------
Leddy Library | 1373197
J.N. Desmarais Library | 614380
Paul Martin Law Library | 229391
Algoma University, Wishart Library | 115156
University of Sudbury | 42154
Hearst, Bibliothèque Maurice-Saulnier | 34276
Huntington College Library | 12517
Laboratoire de didactiques, E.S.E. | 10284
Mining and the Environment Database | 9940
HRSRH Health Sciences Library | 7512
Music Resource Centre | 7511
Xstrata Process Support Centre Library | 5477
Centre Franco-Ontarien de Folklore | 4365
Northern Ontario School of Medicine (East) | 3779
Northern Ontario School of Medicine (West) | 3301
NOHIN | 2647
Mines Library, Willet Green Miller Centre | 2617
Curriculum Resource Centre | 2583
Sault Area Hospital | 2515
Art Gallery of Sudbury | 2237
Hearst Timmins, Centre de Ressources | 2202
Hearst Kapuskasing, Centre de Ressources | 2007
Vale Inco | 1106
Instructional Media Centre | 1095
(24 rows)
</pre>
</div>
<div class="section" id="what-about-acquisitions-serials-and-reserves">
<h2>What about acquisitions, serials, and reserves?</h2>
<p>One of the reasons we had a hard migration date of early May was that it matches nicely with the fiscal year-end for those institutions who were running a traditional acquisitions system on their legacy ILS. We normally shut down all purchases for a period of weeks while we roll over the encumbrances into the next fiscal year and set up our budgets. This year, we're migrating all of the old financial data twice: first, and foremost, into the most sophisticated set of spreadsheets you'll ever see attached to a library system (as pulled together by the inestimable Art Rhyno); and second, into the Evergreen acquisitions system that will launch with Evergreen 1.6.</p>
<p>The first migration of a given set of data is always the hardest part, so once we have the fund / order / provider data in spreadsheets, the migration into Evergreen proper will be trivial. This will give us the summer to use both systems side-by-side and refine what we need from Evergreen. We have migrated all of our serials data from the legacy system; I just haven't enabled the display of that data in our live system. A prototype was running on my laptop for a few days until I accidentally blew it away - ah well, anything worth doing is better the second time around anyway. This, too, will be part of the Evergreen 1.6 release, and will feature full MFHD compliance built on the code that David Fiander has been writing on behalf of Equinox. I should note that this first cut at serials is in some ways relatively basic; while the system in Evergreen 1.6 will be fully MFHD compliant, down to the point of letting you edit an MFHD record to "check in" a new issue by adding a new 863 field, it won't associate barcodes with individual issues. Most of the database schema exists to support that, but there's still a large amount of code to be written on top of the schema, and we need Something That Works Right Now <img alt=":-)" class="emoticon" src="/images/smile.png" />. I'm confident that that's coming not too far down the road, though.</p>
<p>Finally, what would an academic library be without reserves? Art Rhyno (again!) has been working with Graham Fawcett for the past six months on <a class="reference external" href="http://svn.open-ils.org/trac/ILS-Contrib/wiki/SyrupReserves">Syrup</a> - a really impressive melding of the world of electronic reserves and traditional physical library system reserves that uses SIP and Z39.50 to talk to Evergreen. Syrup is just about at a full boil now, so in a few more weeks we should have it deployed so that we can savour its sweetness through the relatively slow summer months before ensuring that the taste is just right for all of our incoming students and faculty in the fall.</p>
</div>
Evergreen iPhone application? Unnecessary!2009-04-13T04:29:00-04:002009-04-13T04:29:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-04-13:/evergreen-iphone-application-unnecessary.html<p>This Easter weekend I had the opportunity to play with someone's iPod
Touch. Of course, the only thing I tried was the Evergreen 1.4 catalogue
interface. Lo and behold, it came up just fine on Safari in all of its
heavily dynamic JavaScript and less-than-XHTML-compliant glory - even
sporting several Dojo widgets. Nice. So we don't have to worry about
writing an iPhone-specific application to access Evergreen; users of
such devices can just use the normal dynamic catalogue with full
functionality.</p>
<p>Evergreen doesn't fare quite as well with Microsoft's rather decrepit
<em>PocketExplorer</em> browser on my HTC Touch smartphone (it's a Windows
Mobile monstrosity, sigh), but it does work well with the <a class="reference external" href="http://www.opera.com/mobile/">Opera
Mobile</a> 9.5 beta browser. I eagerly
anticipate the first good release of
<a class="reference external" href="https://wiki.mozilla.org/mobile">Fennec</a> for Windows Mobile (<a class="reference external" href="http://starkravingfinkle.org/blog/2009/03/fennec-windows-mobile-update/">coming
soon!</a>),
as I'm confident that's going to improve my mobile Web browsing
experience even further.</p>
<p>I predict that in another year or two the idea of building
mobile-specific Web portals to complement your full-function Web site
will be pretty passé. I already get really irritated when Web sites
think they're being helpful by automatically redirecting my smartphone
to an extremely limited interface; in most cases, the full site runs
fine. Give me the option, sure, but don't force me down that path. As
hardware costs continue to drop, and 3G networks expand, and more people
upgrade to more capable mobile devices, one full-function Web site will
be all we need--as long as that site is written in (X)HTML and CSS and
JavaScript.</p>
<p>Those sites that decide to push core functionality into Flash or
SilverLight, on the other hand, can go straight to hell,
thankyouverymuch. I'm looking at you,
<a class="reference external" href="http://ptonthenet.com">PTOnTheNet</a>. This is a site to which Lynn has
been a paying customer for years. It recently announced that it was
revising the Web site, which is all well and good. What's not so good is
that they adopted SilverLight: not just for pretty effects here and
there, but as a core technology. Problem: Lynn has been using Linux at
home since I introduced her to it somewhere around eight years ago, and
last year bought one of the early models of the Linux-based Asus EEE
netbook. Not only did the site redesign destroy the personal training
programs she had set up for her clients over the years (breaking site
redesign rule #1: <em>Thou shalt not destroy your clients' data</em>), but it
also renders her netbook useless for that site.</p>
<p>Even with the <a class="reference external" href="http://www.go-mono.com/moonlight/">Moonlight plugin</a>
installed, it looks like the cretinous site developers are using
detection scripts to prevent the plugin from even trying to render the
content. With Linux-based netbooks on the rise--and with netbooks being
the right form factor and price for personal trainers who want to throw
them into their backpacks and not weep too bitterly if their netbook
suffers the misfortune of being knocked around or sweated to death--this
seems very much like a technology choice that was not based on the needs
of the customers. Worst of all, they <a class="reference external" href="http://www.ptonthenet.com/techhelp.aspx">deliberately chose to exclude
Linux</a>, when a (X)HTML, CSS,
and JavaScript platform would have supported almost any modern platform:
not just Linux netbooks, but other mobile devices like the iPhone and
smartphones that are so well-suited to the personal trainer. So, at
least one customer is going to be walking away, and if there's a
competing Web site out there that caters to a broader clientele, I bet
there will be far more customers moving in that direction.</p>
One big library, one little device: Evergreen staff client on Nokia N8102009-03-02T05:23:00-05:002009-03-02T05:23:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-03-02:/one-big-library-one-little-device-evergreen-staff-client-on-nokia-n810.html<div class="serendipity_imageComment_left" style="width: 480px"><div class="serendipity_imageComment_img"><p><img alt="image0" class="serendipity-image-left" src="/uploads/pics/n810.jpg" style="width: 480px; height: 379px;" /></p>
</div><div class="serendipity_imageComment_txt"><p>It's hard to take good photos of these devices</p>
</div></div><p>Almost exactly a year ago, Jason Etheridge (the primary developer of the
Evergreen staff client) and I managed to get our hands on a developer
edition of the <a class="reference external" href="http://www.nseries.com/products/n810/#l=products,n810">Nokia
N810</a> Internet
tablet device. It's a nifty little handheld computer that packs 128 MB
of memory, a touch screen, and a beautiful 800x480 screen, and I've had
my hands on it from almost the beginning. The primary rationale of the
Nokia developer program was to encourage developers to put together
useful applications for their platform, of course... and as the months
ticked by and I did nothing of interest, my guilt slowly grew.</p>
<p>Well, today I feel a little bit better. Here's what happened: when I was
attending <a class="reference external" href="http://fsoss.senecac.on.ca/2008/">FSOSS 2008</a> at Seneca
College, I ran into <a class="reference external" href="http://madhava.com/egotism/">Madhava Enros</a>.
Madhava and I had worked together on some help UI designs back when we
were both DB2 employees; since then, he had joined the Mozilla
Foundation and was working on
<a class="reference external" href="https://wiki.mozilla.org/Mobile/Fennec">Fennec</a>, the mobile version
of Firefox targeting the N810 device (to begin with, at least). The
first alpha of Fennec had been released to coincide with FSOSS 2008, so
I gave it a shot a few days later. Madhava's team made some great
innovative decisions for Fennec's UI, but what really caught my eye was
that they had packaged a port of XULRunner-1.9 to the N810.</p>
<p>See, the Evergreen staff client is built on XUL, the same
XML/JavaScript/CSS foundation as Firefox and Thunderbird and Fennec -
and to run XUL, you need XULRunner. At the time, though, the Evergreen
staff client needed the 1.8 version of XULRunner; it simply wouldn't
work with 1.9. So, I stuffed the N810 back into its case and forgot
about it for a few more months while I focused on other things like the
never-ending effort to improve Evergreen's internationalization support.</p>
<p>Over the last few weeks, though, Jason has been steadily enhancing the
staff client in Evergreen trunk - and the comment for one of his <a class="reference external" href="http://svn.open-ils.org/trac/ILS/changeset/12275">recent
commits</a> was “we're
kicking xulrunner 1.8 to the curb with trunk”. I had a spare hour or two
on my hands today, so I copied a staff client build from Conifer's
Evergreen trunk test box to the N810, kicked off the XULRunner command,
and waited... expecting failure. Instead, I found that the staff client
worked almost exactly as it does on my laptop - the major difference
being that some of the default function key mappings on the staff client
conflict with the mappings of special buttons on the N810 (like the full
screen toggle gets mapped to F6 - Record In-House Use on the staff
client). Otherwise, the client did a great job of adjusting to the
available screen width, and even Dojo-based interfaces like the Vandelay
MARC batch importer/exporter and the pop-up calendar worked. Very cool!</p>
<p>So, if I can find a barcode scanner with a mini-USB attachment, I could
have a nice little inventory tool on my hands. Or a mobile circulation
station. All because the Evergreen developers made the decision years
ago to build on XUL as a cross-platform framework... this should be
sweet confirmation that they made a good choice. XUL continues to be
ported to more platforms, and anyone using the Evergreen staff client
benefits from the optimizations and bug fixes that go into XULRunner.
Nice. When we cut a release from Evergreen trunk that supports XULRunner
1.9, I'll do my best to package up a version of the staff client for the
N810, and some of my guilt will be assuaged. Yes!</p>
<p><strong>Updated 2009-03-02 10:35 am</strong>: Correcting Madhava's name; I shouldn't
write past midnight without proof-reading! Sorry, Madhava.</p>
Unicorn to Evergreen migration: rough notes2009-02-08T21:32:00-05:002009-02-08T21:32:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-02-08:/unicorn-to-evergreen-migration-rough-notes.html<p><strong>Updated 2009-02-25 00:29 EST</strong>: Corrected setuptools installation
step.</p>
<p><strong>Updated 2009-02-08 23:39 EST</strong>: Trimmed width of some of the <pre>
code sections for better formatting. Created bzr repository for
unicorn2evergreen scripts at <a class="reference external" href="http://bzr.coffeecode.net/unicorn2evergreen">http://bzr.coffeecode.net/unicorn2evergreen</a></p>
<p>I did this once a long time ago for the <a class="reference external" href="http://library.upei.ca/">Robertson
<p>I did this once a long time ago for the <a class="reference external" href="http://library.upei.ca/">Robertson
Library</a> at the University of Prince Edward
Island. For our own migration to Evergreen, I have to load a
representative sample of records from our Unicorn system onto one of our
test servers. This has been a good refresher of the process... and a
reminder to myself to post the other part of the Unicorn to Evergreen
migration scripts in a publicly available location. Okay, they're posted
to this bzr repository called
<a class="reference external" href="http://bzr.coffeecode.net/unicorn2evergreen">unicorn2evergreen</a></p>
<ol class="arabic">
<li><p class="first">Export bibliographic records from Unicorn using Unicorn's catalog key
(basic sequential accession number) as the unique identifier (I
plopped the catalog key into the 935a field/subfield combo). I use
the catalog key because the "flexkey" is not guaranteed to be unique
within a single Unicorn instance - and because the catalog key makes
it easy for us to match call numbers and copies.</p>
</li>
<li><p class="first">For each item, export call number / barcode / owning library /
current location / home location / item type using the catalog key as
the identifier.</p>
</li>
<li><p class="first">Set up the organization unit hierarchy on your Evergreen system. You
can dump it from an existing Evergreen system into a file named
"orgunits.dump" like so:</p>
<pre class="literal-block">
pg_dump -U evergreen --data-only --table actor.org_unit_type \
    --table actor.org_unit > orgunits.sql
</pre>
<p>Then drop all of the existing org_units and org_unit_types and
load your custom data in a psql session:</p>
<pre class="literal-block">
BEGIN;
SET CONSTRAINTS ALL DEFERRED;
DELETE FROM actor.org_unit;
DELETE FROM actor.org_unit_type;
\i orgunits.sql
COMMIT;
</pre>
</li>
<li><p class="first">Import bibliographic records using the <a class="reference external" href="http://open-ils.org/dokuwiki/doku.php?id=evergreen-admin:importing:bibrecords">standard marc2bre.pl /
direct_ingest.pl / pg_loader.pl
process</a>.
Point the --idfield / --idsubfield and --tcnfield / --tcnsubfield
options for marc2bre.pl at 935a (yes, this sucks for title control
numbers, but as noted above they are not guaranteed to be unique in
Unicorn and we need uniqueness in Evergreen). We need the
bibliographic record entry ID field to be the catalog key to set up
subsequent call number/barcode matches.</p>
</li>
<li><p class="first">Enable the subsequent addition of new bibliographic records by
setting the sequence object values to avoid conflicting ID / TCN
values by issuing the following SQL statements:</p>
<pre class="literal-block">
SELECT setval('biblio.autogen_tcn_value_seq',
    (select max(id) from biblio.record_entry) + 100);
SELECT setval('biblio.record_entry_id_seq',
    (select max(id) from biblio.record_entry) + 100);
</pre>
</li>
<li><p class="first">Process holdings records.</p>
<ol class="arabic">
<li><p class="first">Call numbers might have MARC8 encoded characters, so process'em
and convert to UTF8. Theoretically "yaz-iconv -f MARC-8 -t UTF-8 <
holdings.lst > holdings_utf8.lst" should do it, but instead it
eats linefeeds and creates an unusable field. Ugh. We use a little
Python script instead (a minimal sketch appears after this list); it
requires pymarc, which in turn requires a version of setuptools (0.6c5)
newer than Debian Etch's packaged version (0.6c3). So:
<pre class="literal-block">
wget http://pypi.python.org/packages/2.4/s/setuptools/setuptools-0.6c9-py2.4.egg
sudo sh setuptools-0.6c9-py2.4.egg
sudo easy_install pymarc
</pre>
</li>
<li><p class="first">Now actually generate the 'holdings_utf8.lst' file.</p>
<pre class="literal-block">
cat holdings.lst | python marc8_to_utf8.py > holdings_utf8.lst
</pre>
</li>
<li><p class="first">Adjust parse_unicorn.py to match up the holdings fields (added
flexkey to the start). Then parse the holdings_utf8.lst to
generate an SQL file (holdings_eg.sql) that we can load into the
import staging table.</p>
<pre class="literal-block">
python parse_unicorn.py
</pre>
<p>Note that the holdings data for the item with barcode
30007007751786 didn't process cleanly and won't load. Weird -
possibly a corrupt character in the item data? Augh, no - there
are flexkeys and callnumbers that contain '|' characters (16
occurrences for "|z", 37 for "|b"), which is of course also what
we are using as our delimiters. ARGH. I deleted it for now with:</p>
<pre class="literal-block">
grep -v '|z' holdings_utf8.lst > holdings_clean.lst
grep -v '|b' holdings_clean.lst > holdings_clean.lst2
mv holdings_clean.lst2 holdings_clean.lst
</pre>
<p>Adjust parse_unicorn.py to match the new input name and generate
a new holdings_eg.sql.</p>
</li>
</ol>
</li>
<li><p class="first">Create the import staging table:</p>
<pre class="literal-block">
psql -f Open-ILS/src/extras/import/import_staging_table.sql
</pre>
</li>
<li><p class="first">Load the items into the import staging table:</p>
<pre class="literal-block">
psql -f holdings_eg_clean.sql
</pre>
<p>We discover that some more of our data sucks - for example, one item
("Research in autism spectrum disorders", HIRC PER-WEB) has a create
date of '0', which is not a valid date format, because the barcode
field contains "1750-9467|21". For now, grep it out as above and reload.</p>
</li>
<li><p class="first">Investigate possibilities of collapsing unnecessary duplicate item
types:</p>
<pre class="literal-block">
SELECT item_type, COUNT(item_type) AS item_count
FROM staging_items
GROUP BY item_type
ORDER BY item_type;

 item_type  | item_count
------------+------------
 ATLAS      |        162
 AUDIO      |        792
 AUD_VISUAL |       1790
 AV         |         69
 AV-EQUIP   |        182
 BOOK       |        996
 BOOKS      |     581592
 BOOK_ART   |          1
 BOOK_RARE  |       4949
 BOOK_SHWK  |          5
 BOOK_WEB   |      49163
 COMPUTER   |         33
...
(40 rows)
</pre>
<p>How about locations?</p>
<pre class="literal-block">
SELECT location, COUNT(location)
FROM staging_items
GROUP BY location
ORDER BY location;

  location  | count
------------+-------
 ALGO-ACH   |    13
 ALGO-ATLAS |   148
 ALGO-AV    |  1837
...
(212 rows)
</pre>
<p>Now we can collapse categories pretty simply inside the staging
table. For example, if we want to collapse all of the BOOK types into
a single type of BOOK:</p>
<pre class="literal-block">
UPDATE staging_items
SET item_type = 'BOOK'
WHERE item_type IN ('BOOKS', 'BOOK_ART', 'BOOK_RARE',
                    'BOOK_SHWK', 'BOOK_WEB', 'REF-BOOK');
</pre>
</li>
<li><p class="first">Update legacy library names to new Evergreen library short names
(we're using OCLC codes where possible). Some will be straightforward
old names to new names. Others will require a little more logic based
on location + legacy library name; we're splitting the DESMARAIS
collection into multiple org-units (Music Resource Centre, Hearst
locations, hospital locations, etc).</p>
<pre class="literal-block">
-- Laurentian Music Resource Centre
UPDATE staging_items
SET owning_lib = 'LUMUSIC'
WHERE location = 'DESM-MRC';

-- Hearst - Kapuskasing location
UPDATE staging_items
SET owning_lib = 'KAP'
WHERE location LIKE 'HRSTK%';

-- Hearst - Timmins location
UPDATE staging_items
SET owning_lib = 'TIMMINS'
WHERE location LIKE 'HRSTT%';
</pre>
</li>
<li><p class="first">Generate the copies in the system:</p>
<pre class="literal-block">
psql -f generate_copies.sql
</pre>
</li>
<li><p class="first">Make the metarecords:</p>
<pre class="literal-block">
psql -f quick_metarecord_map.sql
</pre>
</li>
</ol>
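<p>For reference, here is a minimal sketch of what the marc8_to_utf8.py
conversion step could look like, assuming one record per line on stdin and
the Python 2 / pymarc stack described in step 6; the real script lives in
the unicorn2evergreen repository, so treat this as an illustration rather
than the canonical version:</p>
<pre class="literal-block">
# marc8_to_utf8.py (sketch): convert MARC8-encoded lines on stdin to UTF-8
# Usage: cat holdings.lst | python marc8_to_utf8.py > holdings_utf8.lst
import sys

from pymarc import marc8_to_unicode

for line in sys.stdin:
    # Decode the MARC8 byte sequence to a Unicode string, then write it
    # back out as UTF-8, preserving the record-per-line structure that
    # yaz-iconv mangled
    sys.stdout.write(marc8_to_unicode(line.rstrip('\n')).encode('utf-8') + '\n')
</pre>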
<p>Ah, recognize that any electronic resources (which don't have associated
copies) won't appear. Check for 856 40 and change the bre source to a
transcendent one mayhaps?</p>
<pre class="literal-block">
-- Create a new transcendant resource;
-- this autogenerates an ID of 4 in a default, untouched system
INSERT INTO config.bib_source (quality, source, transcendant)
VALUES (10, 'Electronic resource', 't');

-- Make the electronic full text resources (856 40) transcendant
-- by setting their bib record source to the new bib_source value of 4
UPDATE biblio.record_entry
SET source = 4
WHERE id IN (
    SELECT DISTINCT(record)
    FROM metabib.full_rec
    WHERE tag = '856'
      AND ind1 = '4'
      AND ind2 = '0'
);
</pre>
<p>And no transcendence. Hmm. Oh well, worry about that later.</p>
Evergreen Exposed: introduction to Evergreen development (OLA 2009)2009-02-01T20:21:00-05:002009-02-01T20:21:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-02-01:/evergreen-exposed-introduction-to-evergreen-development-ola-2009.html<p><strong>Update 2009-02-19</strong>: uploaded diffs from Evergreen 1.4.0.2
(<a class="reference external" href="/uploads/files/EG_exposed.tar.gz">EG_exposed.tar.gz</a>)
for adding details to record summary; and Bill Erickson's slides and
code examples are <a class="reference external" href="http://acq.open-ils.org/~erickson/berick_ola.zip">also available for
download</a></p>
<p>The slides: <a class="reference external" href="/uploads/talks/2009/Evergreenexposed.odp">Evergreen exposed, part
1</a>
(OpenOffice)</p>
<p>My second presentation at the OLA SuperConference 2009 was <strong>Evergreen
Exposed: hacking the open library system</strong>, which promised to “take
attendees on a tour of the architecture and source code of the
<a class="reference external" href="http://evergreen-ils.org">Evergreen library system</a>”. I was very
fortunate to have Bill Erickson, one of the original Evergreen
developers, agree to join me as a co-presenter. Given the
hour-and-fifteen-minute time slot that we were allotted, we opted to
take an incremental approach to introducing parts of Evergreen to the
audience, starting with basic tasks and working up to more complex
customisations. We also tried to focus on answering questions that had
been posted to the <a class="reference external" href="http://evergreen-ils.org/listserv.php">Evergreen mailing
lists</a> to ensure that we would
satisfy our target audience's interests.</p>
<div class="section" id="dan-starts-with-the-basics">
<h2>Dan starts with the basics</h2>
<p>I started the session with an introduction of how to create a different
skin for the catalogue, starting with text, CSS, JavaScript, and images
and extending to the translation and customization framework. We talked
about how to future-proof your customizations against future upgrades
and how consortia can use skins to provide not just different
look-and-feel, but different functionality, for each member of the
consortium. Not much more than XML entities defined by DTDs, massaged
via Apache server side includes (SSI), but it's an important conceptual
building block for both the catalogue and the staff client.</p>
<p>I then ran through the exercise of <a class="reference external" href="/archives/181-Adding-a-new-metadata-format-to-Evergreen-in-a-dozen-lines-of-code.html">adding a new metadata export
format</a>
that brought the Federal Geographic Data Committee's Content Standard
for Geospatial Data Metadata (<a class="reference external" href="http://www.fgdc.gov/metadata/csdgm/">FGDC
CSGDM</a>) format to Evergreen's
existing list of supported formats. On the one hand: big deal, another
metadata format. Hold that thought in that one hand; we'll come back to
it later.</p>
<p>I also walked through two other common requests on the mailing lists:
<em>how do I define a new index or tweak the behaviour of an existing
index</em> and <em>how do I hide or show more information on the detailed
record display page</em>? I'll follow up with separate posts for each of
these pieces to augment what you have before you in the slides; suffice
to say that there's a lot of
<a class="reference external" href="http://www.loc.gov/standards/mods">MODS</a>, a little bit of
JavaScript, a smidgin of XPath, a dollop of Evergreen's interface
definition language (IDL), and a slice of Perl mixed together. Along the
way, I peeled back the covers to show a bit of OpenSRF in operation,
setting up Bill's part of the show...</p>
</div>
<div class="section" id="bill-leads-us-into-the-promised-land">
<h2>Bill leads us into the promised land</h2>
<p><strong>Note</strong> I'll update this with a link to Bill's slides when he manages
to post them!</p>
<p>Bill gave a quick "big picture" view of how OpenSRF operates, including
a much clearer overview of Evergreen's object-relational IDL that maps
objects to relational tables. He also covered the cstore OpenSRF
application that offers access to the underlying database without
requiring SQL but still with support for full transactions
(commit/rollback) and sub-transactions (savepoints). During Bill's
demonstrations of these features, he exercised srfsh in a way that was
new to me - he used the <strong>introspect</strong> command with a partial method
name to perform a left-anchored search for matching method names. Cool!</p>
<p>Oh, and he also showed that if OpenSRF would normally return a reference
to an object defined in the IDL, you can ask it to <em>flesh</em> the object
in-place with its complete set of attributes instead; and of course if
any of those attributes are object references, you have the option of
fleshing those as well. It's a lovely way to cut down on chattiness in
your application.</p>
<p>From there, Bill whipped out DojoSRF, the OpenSRF-aware extensions for
<a class="reference external" href="http://dojotoolkit.org">dojo, the JavaScript toolkit</a> that Evergreen
adopted as its core JavaScript framework in release 1.4. In 90 lines of
HTML and JavaScript code, he implemented a basic but workable catalogue
- and then, with a few more lines of code, he gave the audience the
payoff for that FGDC CSGDM (geographic metadata) format that I had
earlier hacked into Evergreen. As part of the transform separates out
the geographic coordinates of the subject matter (in the case of our
demo data, maps of Northern California), Bill was able, in just a few
more lines of code, to easily extract the coordinates from the FGDC
CSGDM representation of the bibliographic material and plot the bounding
box for the coverage area on a Google Map image. Very cool.</p>
<p>We had about 15 to 20 people attend our session, and I was happy with
that attendance given the extremely technical content and relatively
niche product. If as a result we end up adding just one more developer
to the Evergreen community, that would be a great outcome. And for
myself, I was forced to learn much more of Evergreen - just in time for
Project Conifer, I hope <img alt=":-)" class="emoticon" src="/images/smile.png" /></p>
</div>
Project Conifer update session at OLA SuperConference 20092009-01-30T16:28:00-05:002009-01-30T16:28:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-01-30:/project-conifer-update-session-at-ola-superconference-2009.html<p><strong>Updated</strong> 2009-02-02 to add PDF-formatted slides because the <a class="reference external" href="http://go-oo.org">free
and libre formats</a> just aren't good enough for some
people - heh</p>
<p>The slides, up front and center:</p>
<ul class="simple">
<li><a class="reference external" href="/uploads/talks/2009/OLA2009Coniferupdate.odp">OpenOffice Impress format
(ODP)</a></li>
<li><a class="reference external" href="/uploads/talks/2009/OLA2009Coniferupdate.pdf">Portable Document Format
(PDF)</a></li>
</ul>
<p>Last year I <a class="reference external" href="/archives/149-The-State-of-Evergreen-OLA-Presentation.html">gave a
presentation</a>
at the OLA SuperConference 2008 on <em>The State of Evergreen</em>. Yesterday,
<a class="reference external" href="http://libgrunt.blogspot.com">John Fink</a> and I gave an update on the
state of <a class="reference external" href="http://conifer.mcmaster.ca">Project Conifer</a>, the
partnership between Algoma University, Laurentian University, Northern
Ontario School of Medicine, and the University of Windsor to mount a
consortial instance of Evergreen for our respective academic libraries.</p>
<p>McMaster University (John Fink's employer) is another Project Conifer
institutional partner, albeit with a slightly different relationship.
They are contributing resources towards development of academic
features, but working towards their own Evergreen instance on their own
timeline. Their relationship in the project changed the week before our
presentation, so John and I had a fun time adjusting our presentation to
match the new reality <img alt=":-)" class="emoticon" src="/images/smile.png" /></p>
<p>In comparison to last year, which was largely an introduction to
Evergreen and the state of its various features, this session was much
more focused on Project Conifer. John gave the background of the project
and the importance of having an open source library system at the core
of our academic libraries, particularly given the short-term challenges
that most of the Project Conifer participants face with their/our
current library systems. I focused on the challenges and lessons learned
in managing the project, with most of the challenges being the
difficulty of getting skilled resources to work on our development
requirements, and most of the lessons learned being in working out
cost-sharing agreements and priority-setting procedures early on in the
project.</p>
<p>The session was well-attended, and there is clearly growing interest in
Evergreen as a viable option, as well as a bit of frustration at the pace
of development of some of the features that academics in particular are
interested in. These are "interesting times" for academic libraries -
this week an announcement has been rippling through the Ontario library
community that the <a class="reference external" href="http://bibliocentre.ca">BiblioCentre</a> consortial
library system that has served many Ontario college libraries since 2003
is being shut down. If Evergreen's academic features were already in
place, it would have been a slam-dunk to put together a business case
for a centrally hosted Evergreen system to serve the same constituency.
As those features are still in active development, it's not quite as
easy to make that business case.</p>
<p>Happily, Art Rhyno and Graham Fawcett have taken support for academic
reserves - managing both print and electronic materials - from ground
zero to a reasonable interface in just a few months. They expect to
start wiring in direct Evergreen support over the next few months so
that we will have a functioning reserves system that goes far beyond our
current library system's capabilities ("our" being Laurentian
University, in this case).</p>
<p>After an exciting drive from Buffalo on a very snowy Wednesday
afternoon, Bill Erickson of <a class="reference external" href="http://esilibrary.com">Equinox Software
Incorporated</a> gave Project Conifer
participants a demo of the current state of acquisitions on Wednesday
night, and it's not too far from meeting our base requirements. Equinox
has hired a second developer to contribute to acquisitions development,
documentation is being concurrently produced, and one of Project
Conifer's contractors is working on adding EDI support. So we're
optimistic that a functioning base acquisitions system will be in place
in May - although, as one of our collection development librarians has
wryly noted, our budgets might not have any room for book purchases in
the coming fiscal year in any case.</p>
<p>A highlight of the session was when I asked Susan Downs, CEO of the
<a class="reference external" href="http://www.innisfil.library.on.ca/tsuga/">Innisfil Public Library</a>,
to talk about their success story. In October 2008, Innisfil announced
to the library world that they had migrated to Evergreen without any
vendor assistance - certainly the first known instance in Ontario, and
possibly the first self-migrated and self-supported public library on
Evergreen in the world. It was great to meet the people behind that
project and I was glad to let Susan share some of her energy,
enthusiasm, and insights with our audience.</p>
<p>I had some feedback from one attendee who was happy to see a
presentation on an in-process project, with warts and all exposed,
rather than the usual post-project stories that quickly put the rough
patches behind them (or forget them entirely). I'm happy to do as good a
job as I can to represent an objective look at the project - for one
thing, it's my job as project manager - and I hope that in some small
way I've been able to help others prepare for similar projects.</p>
Adding a new metadata format to Evergreen in a dozen lines of code2009-01-26T05:29:00-05:002009-01-26T05:29:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-01-26:/adding-a-new-metadata-format-to-evergreen-in-a-dozen-lines-of-code.html<p>Just like my <a class="reference external" href="/archives/180-Fetching-item-availability-from-Evergreen-using-the-OpenSRF-HTTP-gateway.html">last
entry</a>,
this is a preview of one part of my upcoming session at the OLA
SuperConference, <a class="reference external" href="http://www.accessola.com/superconference2009/showSession.php?lsession=1017&usession=1017">Evergreen Exposed: Hacking the open source library
system</a>.
We know from the last entry that Evergreen internally converts MARC21 to
MODS to support item display; and in fact it also includes support for
exposing records as OAI, RDF, SRW, and HTML. Today, we're going to be
looking at adding support for an entirely new metadata format to
Evergreen.</p>
<p><a class="reference external" href="http://article.gmane.org/gmane.education.libraries.open-ils.devel/2366/match=fgdc">Back in November,
2008</a>,
George Duimovich requested "I would like to hear from anyone on the
process for adding an additional supported format" in the specific
context of the <a class="reference external" href="http://www.fgdc.gov/">FGDC</a> metadata format for
digital geospatial data. George did a great thing to support his request
and included links to the metadata format itself, along with a pointer
to an <a class="reference external" href="http://ir.library.oregonstate.edu/dspace/handle/1957/16">XSLT
stylesheet</a>
that the inestimable <a class="reference external" href="http://oregonstate.edu/~reeset/">Terry Reese</a>
had written and published for converting MARC21 to FGDC XML. His request
has been burning at the back of my mind since then, partially because I
had quickly responded with the oh-so-helpful:</p>
<blockquote>
<p>Assuming that we can get over the licensing hump, it should be a
relatively straightforward matter of dropping the transform into
Open-ILS/src/perlmods/OpenILS/Application/SuperCat.pm and
Open-ILS/src/perlmods/OpenILS/WWW/SuperCat/Feed.pm (using something
like MODS32 as a template).</p>
</blockquote>
<p>Simple and straightforward, right? Well... yes and no. I had just gone
through the process of adding MODS 3.2 support because I needed the more
granular treatment of URLs to fix an item display problem, so I was
pretty comfortable with the code at the time. After a few months, that
familiarity goes away and one gets to go through the discovery process
again. (Oh, and about a week after the MODS 3.2 support went in and Mike
Rylander went the extra mile to update all of the indexes to use MODS
3.2, MODS 3.3 was released to the world. Sigh).</p>
<p>Without further ado, following are the diffs required to roughly support
FGDC as a SuperCat format:</p>
<pre class="literal-block">
dbs@dbs-laptop:~/source/Evergreen-rel_1_4$ svn diff Open-ILS/src/perlmods/
Index: Open-ILS/src/perlmods/OpenILS/Application/SuperCat.pm
===================================================================
--- Open-ILS/src/perlmods/OpenILS/Application/SuperCat.pm (revision 11952)
+++ Open-ILS/src/perlmods/OpenILS/Application/SuperCat.pm (working copy)
@@ -143,6 +143,18 @@
     # and stash a transformer
     $record_xslt{rss2}{xslt} = $_xslt->parse_stylesheet( $rss_xslt );
 
+    # parse the FGDC xslt ...
+    my $fgdc_xslt = $_parser->parse_file(
+        OpenSRF::Utils::SettingsClient
+            ->new
+            ->config_value( dirs => 'xsl' ).
+        "/MARC21slim2FGDC.xsl"
+    );
+    # and stash a transformer
+    $record_xslt{fgdc}{xslt} = $_xslt->parse_stylesheet( $fgdc_xslt );
+    $record_xslt{fgdc}{docs} = 'http://www.fgdc.gov/metadata/csdgm/index_html';
+    $record_xslt{fgdc}{schema_location} = 'http://www.fgdc.gov/metadata/fgdc-std-001-1998.xsd';
+
     register_record_transforms();
 
     return 1;
</pre>
<p>If you're still with me after that whack of code, and you're counting,
that's about 12 lines of code. Okay, I'm cheating - the diff doesn't
include the MARC21 to FGDC stylesheet - for one thing, I'm still waiting
to see a version of the stylesheet with a license attached to it. For
another, do you <em>really</em> want to see all that XSL? After you patch
your copy of OpenILS::Application::SuperCat.pm, copy the MARC21 to FGDC
stylesheet into /openils/var/xsl, and restart the Evergreen Perl
services, you'll be able to take advantage of the new functionality.
That's it!</p>
<p>What's going on in this code? This patch against
Open-ILS/src/perlmods/OpenILS/Application/SuperCat.pm enables SuperCat
(and therefore unAPI) support for the new format. We just add an entry
to the hash of XSLT stylesheets that SuperCat knows about, and the rest
is visible in URLs like:</p>
<ul class="simple">
<li><a class="reference external" href="http://localhost/opac/extras/supercat/formats/record">http://localhost/opac/extras/supercat/formats/record</a> - list of
supported record formats</li>
<li><a class="reference external" href="http://localhost/opac/extras/supercat/retrieve/fgdc/record/1">http://localhost/opac/extras/supercat/retrieve/fgdc/record/1</a> -
display record #1 in FGDC format</li>
<li><a class="reference external" href="http://localhost/opac/extras/unapi?id=tag:localhost,2009:biblio-record_entry/1">http://localhost/opac/extras/unapi?id=tag:localhost,2009:biblio-record_entry/1</a>
- display the record formats that unAPI can return</li>
<li><a class="reference external" href="http://localhost/opac/extras/unapi?id=tag:localhost,2009:biblio-record_entry/1&format=fgdc">http://localhost/opac/extras/unapi?id=tag:localhost,2009:biblio-record_entry/1&format=fgdc</a>
- return record #1 in FGDC format via unAPI</li>
</ul>
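<p>If you prefer scripts to browsers for this kind of smoke test, a trivial
(hypothetical) Perl check might look like the following - the only assumptions
are that LWP::Simple is installed and that your patched Evergreen is answering
on localhost:</p>
<pre class="literal-block">
#!/usr/bin/perl
# Smoke test: fetch record #1 in the newly registered FGDC format
use strict;
use warnings;
use LWP::Simple;

my $fgdc = get('http://localhost/opac/extras/supercat/retrieve/fgdc/record/1')
    or die "No FGDC response - is the transform registered and Apache restarted?\n";
print $fgdc;
</pre>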
<p>So who cares about this? Well, George cares, and (I'm guessing wildly
here), perhaps it's because his organization has tools that can import
FGDC but also wants to maintain the data in their library catalogue
because they love MARC. That might be sufficient reason. Another
reasonable use case would be to use the FGDC transform to populate
spatial data tables built on the geospatial extensions offered by
<a class="reference external" href="http://www.postgis.org">PostGIS</a> and index these for lightning-fast
retrieval of maps and map data that cover a given range of coordinates.</p>
<p>I'm sure the same approach could be used for other specialized metadata
formats. This is just one example of why I'm sold on Evergreen's
capability as a platform for the future of our library.</p>
Fetching item availability from Evergreen using the OpenSRF HTTP gateway2009-01-20T15:57:00-05:002009-01-20T15:57:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2009-01-20:/fetching-item-availability-from-evergreen-using-the-opensrf-http-gateway.html<p>This is a preview of one part of my upcoming session at the OLA
SuperConference, <a class="reference external" href="http://www.accessola.com/superconference2009/showSession.php?lsession=1017&usession=1017">Evergreen Exposed: Hacking the open source library
system</a>.
In the Conifer implementation of Evergreen, at least one of the partners
plans to use a decoupled discovery layer rather than the Evergreen OPAC.
So we needed to answer the typical question "How do I retrieve the
availability of copies for a given work at my institution?" Note that
this mini-tutorial is based entirely on OpenSRF 1.0 / Evergreen 1.4;
OpenSRF 0.9 will generate different JSON output, and the URL for the
OpenSRF gateway will be different.</p>
<div class="section" id="learning-from-the-old-masters-how-the-evergreen-opac-does-it">
<h2>Learning from the old masters: how the Evergreen OPAC does it</h2>
<p>The Evergreen OPAC itself relies heavily on JavaScript to dynamically
flesh out item details and retrieve item status, so it's actually pretty
easy to work out how to do this without even delving too deeply into
OpenSRF. First, let's use the <a class="reference external" href="http://www.getfirebug.com/">Firebug</a>
Mozilla extension to follow network requests for a given "title details"
page in the OPAC search results for the title: <a class="reference external" href="http://dev.gapines.org/opac/en-US/skin/default/xml/rdetail.xml?r=8526&t=beer&tp=keyword&d=0&hc=33&rt=keyword">The new world guide to
beer</a>.
Open up Firebug, enable network monitoring for the OPAC site, and watch
the requests flood past for the title details page. We can see that
there are a number of POST requests to
<a class="reference external" href="http://dev.gapines.org/osrf-gateway-v1">http://dev.gapines.org/osrf-gateway-v1</a>:</p>
<ul>
<li><p class="first"><strong>POST request #1 parameters</strong></p>
<ul class="simple">
<li>method = open-ils.search.biblio.record.mods_slim.retrieve</li>
<li>service = open-ils.search</li>
<li>locale = en-US</li>
<li>param = 8526</li>
</ul>
<p>This is how we retrieve the title / author / ISBN and other
bibliographic details of interest for display; as we're talking about
a decoupled discovery layer, we won't need to worry about this piece
of the puzzle.</p>
</li>
<li><p class="first"><strong>POST request #2 parameters</strong></p>
<ul class="simple">
<li>method = open-ils.search.config.copy_status.retrieve.all</li>
<li>service = open-ils.search</li>
<li>locale = en-US</li>
</ul>
<p>This is how we retrieve the list of all possible copy statuses that
have been configured for this Evergreen system; here's the response
(truncated for legibility):</p>
<pre class="literal-block">
{ "status" : 200, "payload" : [ [ { "__c" : "ccs", "__p" : [ null, null, null, "f", 3, "Lost", "f" ] }, { "__c" : "ccs", "__p" : [ null, null, null, "t", 0, "Available", "t" ] }, { "__c" : "ccs", "__p" : [ null, null, null, "t", 1, "Checked out", "t" ] }, { "__c" : "ccs", "__p" : [ null, null, null, "f", 2, "Bindery", "t" ] } ] ]}
</pre>
<p>We're getting a response in JavaScript Object Notation
(<a class="reference external" href="http://www.json.org">JSON</a>) format - the nice, compact,
easy-to-read data interchange format that almost every programming
language under the sun can interpret and generate. Yay!</p>
</li>
<li><p class="first"><strong>POST request #3 parameters</strong></p>
<ul class="simple">
<li>method = open-ils.search.biblio.copy_counts.summary.retrieve</li>
<li>service = open-ils.search</li>
<li>locale = en-US</li>
<li>param = 8526</li>
<li>param = 1</li>
<li>param = 0</li>
</ul>
<p>This is how we retrieve the call numbers, copies, and copy status for
a given title. We pass in the TCN input parameter ("8526"), the
numeric ID of the organization being searched ("1" = "every branch"),
and the depth of the organization ("0" = top of the hierarchy). The
response for this request is:</p>
<pre class="literal-block">
{ "status" : 200, "payload" : [ [ [ "127", "663.42 JACKSON, MICHAEL", { "0" : 1 } ], [ "130", "663.42 JACKSON, MICHAEL", { "0" : 1 } ], [ "125", "663.42 JACKSON, MICHAEL", { "0" : 1 } ], [ "34", "R 641.23 JACKSON, MICHAEL", { "0" : 1 } ] ] ]}
</pre>
</li>
</ul>
</div>
<div class="section" id="interpreting-the-http-requests-and-responses">
<h2>Interpreting the HTTP requests and responses</h2>
<p>Okay, so we've found a couple of requests that are pertinent to our
goal. And you might be able to guess that the fifth element of the
<strong>__p</strong> entry in the copy status response is the numeric identifier
for the copy status, while the sixth element is the copy status name
(which, as of OpenSRF 1.0 / Evergreen 1.4, can be returned as a
translated value if you pass a different <strong>locale</strong> value).</p>
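<p>By way of illustration, here is a minimal Perl sketch that replays POST
request #2 against the public dev.gapines.org test server and builds a status
ID-to-name lookup table from those positions. LWP::UserAgent and JSON::XS are
simply my choice of HTTP client and JSON parser, not anything Evergreen
requires:</p>
<pre class="literal-block">
#!/usr/bin/perl
# Replay POST request #2: map each copy status ID to its name
use strict;
use warnings;
use LWP::UserAgent;
use JSON::XS;

my $res = LWP::UserAgent->new->post('http://dev.gapines.org/osrf-gateway-v1', [
    service => 'open-ils.search',
    method  => 'open-ils.search.config.copy_status.retrieve.all',
    locale  => 'en-US',
]);
die $res->status_line unless $res->is_success;

# __p position 4 is the status ID, position 5 is the status name
my %status_name = map { $_->{__p}[4] => $_->{__p}[5] }
    @{ decode_json($res->content)->{payload}[0] };
print "$_ => $status_name{$_}\n" for sort { $a <=> $b } keys %status_name;
</pre>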
<p>You might even be able to guess that the response from the
copy_counts.summary request returns an array of responses consisting of
the organization ID, the call number, and a hash of copy status and the
respective counts for each copy status. And you would be guessing
correctly. But why guess, when you can get an authoritative
interpretation by looking up the class hint (the <strong>__c</strong> value in the
copy_status response of "ccs") in Evergreen's interface definition
language file <strong>/openils/conf/fm_IDL.xml</strong>:</p>
<pre class="literal-block">
<class id="ccs" controller="open-ils.cstore" oils_obj:fieldmapper="config::copy_status" oils_persist:tablename="config.copy_status"> <fields oils_persist:primary="id" oils_persist:sequence="config.copy_status_id_seq"> <field name="isnew" oils_obj:array_position="0" oils_persist:virtual="true" /> <field name="ischanged" oils_obj:array_position="1" oils_persist:virtual="true" /> <field name="isdeleted" oils_obj:array_position="2" oils_persist:virtual="true" /> <field name="holdable" oils_obj:array_position="3" oils_persist:virtual="false" reporter:datatype="bool"/> <field name="id" oils_obj:array_position="4" oils_persist:virtual="false" reporter:selector="name" reporter:datatype="id"/> <field name="name" oils_obj:array_position="5" oils_persist:virtual="false" reporter:datatype="text" oils_persist:i18n="true"/> <field name="opac_visible" oils_obj:array_position="6" oils_persist:virtual="false" reporter:datatype="bool"/> </fields>
</pre>
<p>So now, by taking our first steps into Evergreen's object persistence
model, we can determine authoritatively that the order of values in the
<strong>__p</strong> array maps to "isnew", "ischanged", "isdeleted", "holdable",
"id", "name", and "opac_visible". As for the response from the
copy_counts.summary call, well, these are not Evergreen objects (they
don't have a <strong>__c</strong> class hint) - but you can use the OpenSRF shell
"srfsh" introspect command to view the documentation for the applicable
method:</p>
<pre class="literal-block">
bash$ srfsh
srfsh# introspect open-ils.search
... (truncated for legibility) ...
Received Data: {
  "__c":"OpenILS_Application",
  "__p":{
    "api_level":1,
    "stream":0,
    "object_hint":"OpenILS_Application_Search_Biblio",
    "package":"OpenILS::Application::Search::Biblio",
    "remote":0,
    "api_name":"open-ils.search.biblio.copy_counts.summary.retrieve",
    "signature":{
      "params":[ ],
      "desc":"returns an array of these: [ org_id, callnumber_label, , , ... ] where statusx is a copy status name. the statuses are sorted by id.",
      "return":{ "desc":null, "type":null, "class":null }
    },
    "server_class":"open-ils.search",
    "notes":"\treturns an array of these:\n\t\t[ org_id, callnumber_label, , , ... ] \n\t\twhere statusx is a copy status name. the statuses are sorted\n\t\tby id.\n",
    "method":"copy_count_summary",
    "argc":0
  }
</pre>
<p>The introspect output is a bit rough - it's really intended for the
<a class="reference external" href="http://dev.gapines.org/opac/extras/docgen.xsl?service=open-ils.search&param=%22copy_counts.summary.retrieve%22">doxygen API help
interface</a>
- but it's good enough for our purposes. If we want to dig into what's
going on under the covers, we can follow the "package" value
"OpenILS::Application::Search::Biblio" to read the source code for the
<a class="reference external" href="http://svn.open-ils.org/trac/ILS/browser/branches/rel_1_4/Open-ILS/src/perlmods/OpenILS/Application/Search/Biblio.pm">OpenILS::Application::Search::Biblio</a>
Perl module, and look up the method "copy_count_summary" as indicated
by the "method" value in the introspect output. That reveals that the
input arguments are "($self, $client, $rid, $org, $depth)". Every
OpenSRF method automatically receives $self and $client as the first two
arguments, so $rid (record ID), $org (organization unit ID), and $depth
(organization unit depth) are the variables over which we have control.</p>
</div>
<div class="section" id="zeroing-in-on-the-copies-for-a-particular-library-or-library-system">
<h2>Zeroing in on the copies for a particular library or library system</h2>
<p>If we want to retrieve the visible copies for just a single organization
unit in the entire Evergreen system, we just have to adjust the values
of the organization unit ID and organization unit depth parameters
accordingly. If we ask for the visible copies for <a class="reference external" href="http://dev.gapines.org/osrf-gateway-v1?service=open-ils.search&method=open-ils.search.biblio.copy_counts.summary.retrieve&locale=en-US&param=8526&param=125&param=2">just org_unit ID
"125" at depth
"2"</a>,
we narrow down our results to a single hit:</p>
<pre class="literal-block">
{ "status" : 200, "payload" : [ [ [ "125", "663.42 JACKSON, MICHAEL", { "0" : 1 } ] ] ]}
</pre>
<p>So, with all of that ammunition at your disposal, you can write an
Evergreen copy status lookup in any decoupled discovery layer that
supports HTTP POST or GET requests. Which should be pretty much any
discovery layer, right?</p>
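<p>Here is one minimal (and purely illustrative) way to do it in Perl, reusing
the demo values from above; as before, LWP::UserAgent and JSON::XS are just my
module choices:</p>
<pre class="literal-block">
#!/usr/bin/perl
# Copy availability lookup: call number plus per-status copy counts
use strict;
use warnings;
use LWP::UserAgent;
use JSON::XS;

my ($record, $org_unit, $depth) = (8526, 1, 0);  # demo values from above

my $res = LWP::UserAgent->new->post('http://dev.gapines.org/osrf-gateway-v1', [
    service => 'open-ils.search',
    method  => 'open-ils.search.biblio.copy_counts.summary.retrieve',
    locale  => 'en-US',
    param   => $record, param => $org_unit, param => $depth,
]);
die $res->status_line unless $res->is_success;

# payload[0] is an array of [ org_id, callnumber, { status_id => count } ]
for my $row (@{ decode_json($res->content)->{payload}[0] }) {
    my ($org, $callnum, $counts) = @$row;
    print "org $org / $callnum: ",
        join(', ', map { "$counts->{$_} copies with status $_" } sort keys %$counts), "\n";
}
</pre>
<p>Swap in the status ID-to-name map from the earlier sketch and you have a
human-readable availability display.</p>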
</div>
<div class="section" id="frequently-used-tools-and-methods-for-evergreen-opensrf-hacking">
<h2>Frequently used tools and methods for Evergreen / OpenSRF hacking</h2>
<p>Note, the first: you can easily play with different parameter values for
the HTTP POST requests using the
<a class="reference external" href="http://curl.haxx.se/">curl</a> command. If you
have a recent version of the Perl JSON::XS module installed, you can pipe the
output from curl to the <strong>json_xs</strong> command to pretty print the
JSON response:</p>
<pre class="literal-block">
curl -d service=open-ils.search -d locale=en-US \
  -d method=open-ils.search.biblio.copy_counts.summary.retrieve \
  -d param=8526 -d param=1 -d param=0 \
  http://dev.gapines.org/osrf-gateway-v1 | json_xs -t json-pretty
</pre>
<p>Note, the second: the OpenSRF gateway also supports GET requests; simply
concatenate the request parameters in <a class="reference external" href="http://dev.gapines.org/osrf-gateway-v1?service=open-ils.search&method=open-ils.search.biblio.copy_counts.summary.retrieve&locale=en-US&param=8526">a single URL like
this</a>.</p>
</div>
Evergreen 1.4.0.0 RC2 and OpenSRF 1.0.1 are out2008-11-21T03:41:00-05:002008-11-21T03:41:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-11-21:/evergreen-1400-rc2-and-opensrf-101-are-out.html<p>As I announced on the Evergreen mailing lists last night:</p>
<div style="margin-left: 3em;"><p>One month after the first release candidate of Evergreen 1.4.0.0, the</p>
<p>Evergreen development team is pleased to announce the availability of</p>
<p>Evergreen 1.4.0.0, release candidate 2, from</p>
<p><a class="reference external" href="http://open-ils.org/downloads.php">http://open-ils.org/downloads.php</a></p>
<p>A partial overview of the changes since 1.4.0.0 RC1:</p>
<ul>
<li><p class="first">MARC importer / exporter enhancements</p>
</li>
<li><p class="first">Improved support for marking long overdue items</p>
</li>
<li><p class="first">Z39.50 client enhancements</p>
</li>
<li><p class="first">An interface for switching locales in the staff client</p>
</li>
<li><p class="first">Localization in every interface - although we have undoubtedly</p>
</p>
<p><p>missed a few strings</p>
</li>
<li><p class="first">Bundled Armenian and French (Canadian) translations</p>
</li>
<li><p class="first">Performance improvements for new and changed item feeds</p>
</li>
<li><p class="first">Various staff client, build, and source tree fixes</p>
</li>
</ul>
<p>The complete change log between 1.4.0.0 RC1 and 1.4.0.0 RC2 can be</p>
<p>found here:
<a class="reference external" href="http://open-ils.org/downloads/ChangeLog-1.4.0.0rc1-1.4.0.0rc2">http://open-ils.org/downloads/ChangeLog-1.4.0.0rc1-1.4.0.0rc2</a></p>
<p>Please help us reach a solid 1.4.0.0 final release by testing out</p>
<p>1.4.0.0 RC2 with the freshly released OpenSRF 1.0.1 and reporting</p>
<p>problems, sending patches for improvements or fixes, or sending new or</p>
<p>updated translations to the Evergreen Development mailing list.</p>
<p>Coming soon for the 1.4.0.0 RC2 release:</p>
<ul>
<li><p class="first">Windows staff client</p>
</li>
<li><p class="first">Updated install instructions at</p>
</p>
<p><p><a class="reference external" href="http://open-ils.org/dokuwiki/doku.php?id=server:1.4.0.0:ubuntu804:install">http://open-ils.org/dokuwiki/doku.php?id=server:1.4.0.0:ubuntu804:install</a></p>
</li>
<li><p class="first">VMWare image
<p></p>
</li>
</ul>
</div><p>This release has been a long time in the making, and we'd love to have
your help in testing it and flushing out bugs. Also, if you would like
to contribute a translation, this is your chance to step up! We already
have Brazilian Portuguese (pt_BR), Georgian (ka), and Canadian English
(en_CA) translations in the works, along with a commitment to update
the Canadian French (fr_CA) translation. As this is the first real
round of translations for Evergreen, I fully expect that there will be
some work ahead of us to smooth out the translation process - but we
have to take the plunge some time. Many thanks to Tigran Zargaryan and
Natural Resources Canada for their respective contributions of the
Armenian (hy_AM) and Canadian French (fr_CA) translations this summer;
their willingness to be early guinea pigs for the translation process
helped immensely.</p>
<p><strong>Update:</strong> I noticed that the speedy Warren Layton <a class="reference external" href="http://thebookpile.wordpress.com/2008/11/20/evergreen-14-rc2/">beat me to the
punch</a>
on the blog announcement of the releases. Warren's been very helpful
with testing and suggestions for improvements to the documentation, so I
don't mind being scooped at all <img alt=":-)" class="emoticon" src="/images/smile.png" /></p>
An Evergreen track at the OLA SuperConference 2009?2008-10-28T20:10:00-04:002008-10-28T20:10:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-10-28:/an-evergreen-track-at-the-ola-superconference-2009.html<p>Just poked at the <a class="reference external" href="http://www.accessola.com/superconference2009/">OLA SuperConference
2009</a> schedule
(January 28 - 31, 2009) and found four sessions listed that are all
about Evergreen. Wow! Check this out:</p>
<table style="border: solid black; border-width: 0px 0px 1px 1px; border-collapse: collapse;">
<th style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;" ead>
<tr>
<th style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>Date</p>
</th>
<th style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>Time</p>
</th>
<th style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>Title</p>
</th>
<th style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>Description (may be abridged)</p>
</th>
<th style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>Presenters</p>
</th>
</tr>
</p>
<p>
</thead>
</p>
<p>
<tbody>
</p>
<p>
<tr>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>Thursday, January 29</p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>9:05 am</p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p><a class="reference external" href="http://www.accessola.com/superconference2009/showSession.php?lsession=410&usession=410">It.s Just a Little Bit of Programming Isn.t
It?</a></p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;">
</p>
<p><p>“Follow the progress of the Library @ Mohawk.s development of the open
source ILS Evergreen. Hear the trials and tribulations and learn from
the mistakes and successes that have occurred along the way . we are
truly a learning organization on this project. We went live in summer
2008 . come and hear about where we.ve been, where we are and where we
hope to be soon.”</p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>Robert Soulliere, Systems Librarian; Cynthia Williamson, Collection &
Access Librarian, Mohawk College of Applied Arts and Technology</p>
</td>
</p>
<p>
</tr>
</p>
<p>
<tr>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>Thursday, January 29</p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>3:45 pm</p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;">
</p>
<p><p><a class="reference external" href="http://www.accessola.com/superconference2009/showSession.php?lsession=614&usession=614">Project Conifer: Evergreen library system for Ontario
Universities</a></p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>“Find out how the Evergreen open source library system, originally
developed for a public library consortium, is being adapted for academic
libraries by three Ontario universities. Discussion will focus on the
challenges, successes and mistakes (err, .learning opportunities.) of
the project.”</p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>John Fink, Digital Technologies Development Librarian, McMaster
University; Dan Scott, Systems Librarian, Laurentian University</p>
</td>
</p>
<p>
</tr>
</p>
<p>
<tr>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>Friday, January 30</p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>9:05 am</p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p><a class="reference external" href="http://www.accessola.com/superconference2009/showSession.php?lsession=1017&usession=1017">Evergreen exposed: hacking the open source library
system</a></p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>“Join an Evergreen developer on a tour of the architecture and source
code of the Evergreen library system [...] Get ready to get your hands
dirty with Evergreen . this will be a session filled with code!”</p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>William Erickson, Vice President, Software Development & Integration,
Equinox Software Inc; Dan Scott, Systems Librarian, Laurentian
University</p>
</td>
</p>
<p>
</tr>
</p>
<p>
<tr>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>Saturday, January 31</p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>10:40 am</p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p><a class="reference external" href="http://www.accessola.com/superconference2009/showSession.php?lsession=1808&usession=1808">Multilingual Language Issues of Open Source
ILS</a></p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>“Discover the Chinese version of Evergreen along with various
multilingual issues related MARC standards, encoding, indexing,
searching, and sorting especially associated with Chinese language.”</p>
</td>
</p>
<p>
<td style="border: solid black; border-width: 1px 1px 0px 0px; margin: 0px; padding: 4px;"><p>Jason Zou, Systems Librarian, Lakehead University; Guoying (Grace) Liu,
Systems Librarian, Leddy Library, University of Windsor</p>
</td>
</p>
<p>
</tr>
</p>
<p>
</tbody>
</p>
<p>
</table>
</p><p>I was responsible for the sole Evergreen presentation at OLA
SuperConference 2008 - it's awesome to see a lot more people jumping in
this year! I'm keenly anticipating this conference - we'll have to set
up at least one Evergreen "Birds of a Feather" session.</p>
Evergreen: deOSSification of library software2008-10-23T17:45:00-04:002008-10-23T17:45:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-10-23:/evergreen-deossification-of-library-software.html<p>In a few minutes I'll be giving a talk with John Fink at the <a class="reference external" href="http://www.fsoss.ca">Free
Software Open Source Symposium</a> at Seneca
College on <a class="reference external" href="http://fsoss.senecac.on.ca/2008/?q=node/32">Evergreen: an enterprise-strength OSS solution for library
ossification</a>. I'm
jazzed!</p>
<p>Here are the slides: (<a class="reference external" href="/uploads/talks/2008/Evergreen_OSSification.odp">ODP
format</a>)
(<a class="reference external" href="/uploads/talks/2008/Evergreen_OSSification.pdf">PDF
format</a>).</p>
Access 2008 hackfest report: Zotero vs Evergreen2008-10-07T04:36:00-04:002008-10-07T04:36:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-10-07:/access-2008-hackfest-report-zotero-vs-evergreen.html<p><strong>Update: 2008-10-07</strong> As of <a class="reference external" href="http://svn.open-ils.org/trac/ILS/changeset/10774">changeset
10774</a>, the
detailed record view in Evergreen's dynamic catalog is now recognized by
Zotero.</p>
<p>I really like Zotero. And it works really well with Evergreen's current
"basic search" because it embeds <a class="reference external" href="http://unapi.info">unAPI</a>
links that enable Zotero to consume
<a class="reference external" href="http://loc.gov/mods">MODS</a> representations of the
underlying bibliographic records and generate a complete citation based
on that.</p>
<p>However, Zotero doesn't work with Evergreen's current "dynamic search"
interface - which is a problem, because it is the default search
interface. Evergreen embeds a link to the unAPI server, and fills in the
unAPI link via an AJAX call after the underlying XHTML has been loaded -
but it seems that
<a class="reference external" href="http://forums.zotero.org/discussion/4069/detecting-unapi-in-dynamic-content/">Zotero</a>
doesn't recognize that the DOM has been changed by the AJAX event and
never discovers the unAPI link. So... I had submitted a challenge to
Hackfest to fix this, because I really want to be able to use Zotero
with Evergreen when Project Conifer launches.</p>
<p>And, as with every other Hackfest I have attended, I end up working on
my own challenge.</p>
<p>In discussing the problem with William from
<a class="reference external" href="http://canadiana.org">canadiana.org</a> and
Walter Lewis from
<a class="reference external" href="http://www.knowledgeontario.ca">Knowledge Ontario</a>,
I described how the dynamic interface doesn't use any templating (apart
from entity substitution for localization support), that there wasn't
really any way to inject content server side into the underlying XHTML,
and that I really didn't want to have to dig into the guts of Zotero to
enable it to parse the DOM after events had completed. William asked "so
you can't even do a server side include?", which ended up breaking the
problem wide open - because yes, we already use server side includes to
identify which DTD to load for localization purposes.</p>
<p>Step 1 was to modify the detailed record display to put the unAPI link
template in place, and to modify the Apache configuration to pass in
hardcoded values for each of the SSI variables. A quick test and - it
didn't work. Uh oh.</p>
<p>That led to much scratching of the head. Was Zotero getting tripped up
by the masses of XHTML elements in the dynamic template that are simply
hidden? Did it give up after trying to parse 100K or so of content? Were
there differences in the content types being served up by Apache? The
next step was to compare the content of the "basic search" output
against the "dynamic search" output - and that led to one seemingly
innocent difference.</p>
<p>The unAPI server link in the "basic search" output included an absolute
link to the server, while the corresponding link in the "dynamic search"
output used a relative link to point to the root of the server. I didn't
think that would be a problem, but eliminating variables is always good -
and when I tested with a hardcoded server link, the Zotero hint icon lit
up and the mystery was solved. Between enabling the record unAPI link to
appear in the static XHTML via SSI and changing the unAPI server link to
use an absolute value, Zotero and Evergreen could work together in
harmony.</p>
<p>I haven't committed the fix for this yet to the repository, as I haven't
finalized the exact SSI incantations that will be needed to embed the
record ID in the unAPI link. But now you know the solution, and could
tackle the problem yourself if you get tired of waiting for me and feel
inspired. And once the problem is fixed, I'll update the post to let you
know what version of Evergreen carries the fix.</p>
<p>Oh, and my hackfest report slides <a class="reference external" href="/uploads/talks/2008/Cite_me_bite_me.pdf">are
attached</a>,
in case anyone cares.</p>
Access 2008 presentation: Project Conifer report2008-10-04T23:13:00-04:002008-10-04T23:13:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-10-04:/access-2008-presentation-project-conifer-report.html<p>On Friday, October 3rd, I had the honour of presenting the progress of
Project Conifer with my colleague <a class="reference external" href="http://libgrunt.blogspot.com">John
Fink</a> to my peers at Access 2008.
Project Conifer is the effort to bring the
<a class="reference external" href="http://evergreen-ils.org">Evergreen</a> open source library system to a
consortium of academic libraries in Ontario (Algoma, Laurentian,
McMaster, Northern Ontario School of Medicine, and Windsor).</p>
<p>I'm just going to link quickly to the slides for now, as I'm a little
bit brain-dead after the conference. John led off the talk with an
overview of what Conifer is all about and why we were motivated to
tackle such a large project - he has <a class="reference external" href="http://www.slideshare.net/adr/access2008-presentation-v3-presentation">posted his
slides</a>
via the SlideShare thingy. Editorial comment: I really enjoy John's
presentation style and content. He's a hard act to follow!</p>
<p>And then I rambled on with an overview of the ups and downs of the
project so far, the resources we have invested in the project, our
progress towards our target go-live date (May 2009), and some sneak
previews of the goodies that are included in the
any-day-now-if-I-would-just-stop-going-to-conferences-and-apply-myself-for-a-few-days-dangit
Evergreen 1.4 release. Well - they're not really sneak previews, because
of course you could check the code out of the repository and build it
yourself - but it's so much easier when somebody else already has it
running, right?</p>
<p>Anyway, my slides are available in both <a class="reference external" href="/uploads/talks/2008/Access2008Conifer.odp">OpenOffice.org Impress
format</a>
and
<a class="reference external" href="/uploads/talks/2008/Access2008Conifer.pdf">PDF</a>.</p>
Heating up Evergreen search2008-08-25T16:23:00-04:002008-08-25T16:23:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-08-25:/heating-up-evergreen-search.html<p>So, after loading 3.7 million records into the Project Conifer test
server, we have found that search can be slow. Not really a big
surprise, because I've spent very little time tuning the database beyond
running a VACUUM FULL and tweaking just a few parameters. But one of the
extremely useful hints that Mike Rylander gave me about PostgreSQL a
long time back is that it relies primarily on file system caching to
cache access to data, from the reasonable perspective that your file
system already knows which files are being accessed most often.
PostgreSQL's data is stored in files that map back to individual tables
and indexes; unlike some other database systems that I've worked with,
you don't dedicate system memory specifically to caching those database
files (hello, DB2 buffers!); instead, you just trust the file system to
know what's best.</p>
<p>That caching approach works great on a system that's in production and
getting a steady stream of queries reflecting what users actually search
for on a day to day basis. However, if you've just loaded a test system,
then it doesn't have much opportunity to cache and the first dozen (or
hundreds, or thousands!) of queries will be slow as your database goes
out and loads up files from disk. Even worse, if you have a system like
ours where backups have temporarily been set up as "tar czf
/backups/backup.tar.gz /", then on a nightly basis your file system
cache is going to be filled with all kinds of irrelevant data.</p>
<p>So what are we to do? Well, actually, another extremely useful hint that
Mike Rylander gave me was to just run the pertinent data files through
/dev/null to load up the file system cache. On the surface, it seems
like a dirty hack, but it's a smart one, and we can even make it
elegant. Let's walk through the process:</p>
<ol class="arabic">
<li><p class="first">You need to know where your data files are. You (or your system
installer) will have created a PostgreSQL cluster. In my case (on
Debian Etch), I can find it at /var/lib/postgresql/main/base. Then,
by running "du -hs /var/lib/postgresql/main/base" I can see that one
of our databases (represented by a directory name that's just an
integer - "16385") weighs in at 60GB. That's our 3.7 million record
baby. If you run an "ls" command on that directory, you'll see that
it's filled with hundreds of files of differing sizes, most of them
with just plain integers for their names. This is where the data is
stored.</p>
</li>
<li><p class="first">You need to know the base filenames that you want to use to warm up
the file system cache. For my first stab at this, I decided to warm
up the cache with the full-text search indexes, as I know those are
frequently used by Evergreen's search. To figure out the base
filenames for these indexes, we can query PostgreSQL's catalog of its
own objects:</p>
<pre class="literal-block">
evergreen=# SELECT relfilenode, relname, relpages
evergreen-# FROM pg_class WHERE relname LIKE '%vector%';
 relfilenode |                    relname                    | relpages
-------------+-----------------------------------------------+----------
      648864 | authority_full_rec_index_vector_idx           |    59282
      649137 | metabib_title_field_entry_index_vector_idx    |    29766
      649149 | metabib_author_field_entry_index_vector_idx   |    20125
      649161 | metabib_subject_field_entry_index_vector_idx  |    23481
      649173 | metabib_keyword_field_entry_index_vector_idx  |    90709
      649185 | metabib_series_field_entry_index_vector_idx   |     8682
      649210 | metabib_full_rec_index_vector_idx             |   452980
(7 rows)
</pre>
<p><strong>relfilenode</strong> is the basename of the files that we want to load
into the file system cache.</p>
</li>
<li><p class="first">The maximum size of your file system cache cannot be more than the
physical RAM installed on your system, so you'll want to tally up the
size of the index data files to ensure that their total is less than
the total amount of your physical RAM. Note that in the example from
our system, below, I'm using "*" because database objects with lots
of data will be split between multiple files with extensions like
".1" and ".2" in sequential order:</p>
</p><pre class="literal-block">
# cd /var/lib/postgresql/main/base/16385# du -hs 649185* 649210* 1065608*68M 6491851.1G 6492101.1G 649210.11.1G 649210.21.1G 10656081.1G 1065608.11.1G 1065608.2842M 1065608.3# du -hs 649207*1.1G 6492071.1G 649207.11.1G 649207.2467M 649207.3
</pre>
</p><p>Adding all of this up, we're getting close to the 16GB of RAM
installed on our database server. If we add any more data, we will
want to add more RAM to the system.</p>
<p></li>
<li><p class="first">Now we warm up the cache by outputting the contents of each file into
/dev/null.</p>
<pre class="literal-block">
# cd /var/lib/postgresql/main/base/16385
# cat 648864* > /dev/null
# cat 649137* > /dev/null
# cat 649149* > /dev/null
# cat 649161* > /dev/null
# cat 649173* > /dev/null
# cat 649185* > /dev/null
# cat 649210* > /dev/null
</pre>
</li>
</ol>
<p>After running through this relatively simple exercise, searches were
definitely much snappier on our test system. I plan to automate the
process so it runs after every one of those cache-killing backups. If
there is interest, I could package it into a simple Perl script that
other sites could use to assist with their testing - or to help warm up
the file system cache after a large data load, for example.</p>
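<p>In the meantime, here's a rough sketch of what that Perl script might look
like - a starting point only, with the cluster path, database name, and
credentials as assumptions you would need to adjust for your own system, and
DBI / DBD::Pg as my module choices:</p>
<pre class="literal-block">
#!/usr/bin/perl
# Hypothetical cache warmer: find the full-text index files and read
# them so the file system cache is primed after a backup or data load
use strict;
use warnings;
use DBI;

# Assumption: the directory for your database within the cluster
my $base = '/var/lib/postgresql/main/base/16385';

my $dbh = DBI->connect('dbi:Pg:dbname=evergreen', 'evergreen', '',
    { RaiseError => 1 });

# Step 2 from above: look up the base filenames of the index files
my $nodes = $dbh->selectcol_arrayref(
    "SELECT relfilenode FROM pg_class WHERE relname LIKE '%vector%'"
);

# Step 4 from above: read each file (and its .1, .2, ... extents);
# note that a bare prefix glob could also match longer OIDs, which is
# good enough for a sketch but worth tightening in a real script
for my $node (@$nodes) {
    for my $file (glob("$base/$node*")) {
        open my $fh, '<', $file or next;
        1 while read $fh, my $buf, 1 << 20;  # 1MB chunks into the void
        close $fh;
    }
}
</pre>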
Academic reserves for Evergreen: request for comments2008-07-12T20:02:00-04:002008-07-12T20:02:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-07-12:/academic-reserves-for-evergreen-request-for-comments.html<p>I've posted a second revision of the <a class="reference external" href="http://open-ils.org/dokuwiki/doku.php?id=feature:academic_reserves">"academic reserves" requirements
RFC</a>.
I'm not looking to boil the ocean with the first iteration of academic
reserves for Evergreen (that's what third-party systems like
<a class="reference external" href="http://reservesdirect.org">ReservesDirect</a> and Ares are for), but I
am hoping that by engaging the community in a discussion we can ensure
that we build something that satisfies the core set of requirements for
academic institutions in the area of reserves. My lack of familiarity
with what other institutions are doing - whether with more capable
systems, local workarounds, or third-party reserves systems - makes me
nervous that I'm missing something obvious. So if you feel like weighing in on
the discussion, please address your comments to the <a class="reference external" href="http://open-ils.org/listserv.php">Evergreen General
mailing list</a>, add a comment here,
or send me email if you prefer to keep your comments private.</p>
<p>The biggest change in the second revision of the RFC is the inclusion of
a base set of requirements for electronic reserves. For physical items
alone, the requirements expressed in the RFC go far beyond the
capabilities of the ILS we currently use at Laurentian; getting even
basic support for electronic reserves in Evergreen would be a huge win
for us when we migrate.</p>
<p>That said, I'll probably start working on implementing a subset of the
requirements real soon now; it should be easy enough to make a course
correction should something significant turn up during the second round
of comments.</p>
(unofficial) bzr repositories for Evergreen branches2008-07-12T19:46:00-04:002008-07-12T19:46:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-07-12:/unofficial-bzr-repositories-for-evergreen-branches.html<p>I wrote a long blog post about the distributed version control workflow
that the two Laurentian students working on
<a class="reference external" href="http://open-ils.org">Evergreen</a> (Kevin Beswick and Craig Ricciuto)
are using successfully this summer, only to lose the post to a session
timeout and my own lack of caution (note to self: if writing directly in
the browser text field, CTRL-A CTRL-C before hitting preview!). So the
gist of the blog post was:</p>
<ul class="simple">
<li><a class="reference external" href="http://bazaar-vcs.org">bzr</a>, with the <a class="reference external" href="http://bazaar-vcs.org/BzrSvn">bzr-svn
plugin</a>, works quite well for
cloning and updating from a centralized Subversion repository like
Evergreen's; just watch out for memory consumption issues due to
memory leaks in the Python bindings for Subversion
(<a class="reference external" href="http://jelmer.vernstok.nl/blog/archives/218-bzr-svn-now-with-its-own-Subversion-Python-bindings.html">fixed</a>
in the development version of bzr-svn)</li>
<li>there's no compelling reason for Evergreen to move to a different
version control system; it's easy to use a distributed version
control workflow with the Evergreen Subversion repository as-is</li>
<li>you can tar up a bzr branch and untar it wherever you like and "bzr
up" will immediately and happily work (which is how I worked around the
severe memory constraints on this server that ended up repeatedly
running into the Linux out of memory killer when I was trying to
create a bzr-svn checkout from scratch)</li>
<li>it's a hell of a lot faster to check out or branch from a bzr
repository than it is from a Subversion repository, so if you're
going to take this approach set up one clean bzr repository using
bzr-svn and check out or branch from that using bzr, rather than
repeatedly using bzr-svn to create new branches</li>
</ul>
<p>To enable you to get a bzr repo of Evergreen quickly, I've set up
(unofficial, of course, but updated hourly) bzr repositories of the most
useful Evergreen branches as follows:</p>
<p><strong>UPDATE 2009-10-14:</strong> I've stopped updating these repositories because
the version of bzr-svn on my server is too old and decrepit to be able
to handle the updates. Sorry <img alt=":-(" class="emoticon" src="/images/sad.png" /></p>
<ul class="simple">
<li><a class="reference external" href="http://bzr.coffeecode.net/ILS/trunk">Evergreen trunk</a></li>
<li><a class="reference external" href="http://bzr.coffeecode.net/ILS/acq-experiment">Evergreen
acq-experiment</a>
(acquisitions and serials branch)</li>
<li><a class="reference external" href="http://bzr.coffeecode.net/OpenSRF/trunk">OpenSRF trunk</a></li>
</ul>
<p>Enjoy!</p>
eIFL-FOSS ILS workshop on Evergreen, day one2008-06-24T00:34:00-04:002008-06-24T00:34:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-06-24:/eifl-foss-ils-workshop-on-evergreen-day-one.html<p>The following summary is taken almost directly from an email I wrote to
one of the would-be participants who was, sadly, prevented from making
it to Yerevan due to travel complications. I meant to clean this up
earlier and post it, but have not yet found the time - so I …</p><p>The following summary is taken almost directly from an email I wrote to
one of the would-be participants who was, sadly, prevented from making
it to Yerevan due to travel complications. I meant to clean this up
earlier and post it, but have not yet found the time - so I might as
well just post it as is with most names obfuscated and possibly some
additional editorial comments. Those who are new to installing and
configuring Evergreen might find this useful; and reading through it, I
remembered a few challenges I planned to tackle <img alt=":-)" class="emoticon" src="/images/smile.png" /></p>
<hr class="docutils" />
<p>Shortly after I arrived on Monday, I was able to try out the install of
Evergreen 1.2.1.4 that A. and G. from the Fundamental Science Library (FSL)
had completed with only two email exchanges with me. I was very happy to see
that they had successfully completed the install! There was only one minor
problem with the structure of the "organizational unit" hierarchy that I had
to fix. After that, we confirmed that we were able to import bibliographic
records from Z39.50 and attach call numbers and copies to those records.
Finally, we tried searching for the records in the catalogue and were
delighted to see that everything was working as we had hoped. That allowed me
to sleep well on Monday, in preparation for the first day of the workshop on
Tuesday.</p>
<p>After the introductions of the workshop participants on Tuesday, I gave
the introduction to Evergreen presentation and Henri Damien Laurent of
BibLibre demonstrated Koha. Both Henri Damien Laurent and I showed our
respective library systems running with an Armenian interface, thanks to the
translation efforts of Tigran! Then we broke into separate Koha and Evergreen
groups to work together on our respective library systems.</p>
<p>Of the attendees of the workshop, E. was the most interested in migrating
his library (with 40,000 volumes) to Evergreen. A., from one of the 29
branches of the American University of Armenia (AUA), also attended most of
the Evergreen session. Even though his institution is mostly interested in
Koha, he wanted to be able to compare the two systems. Albert's colleague S.
attended the Koha training session so they would be able to compare their
experiences later. Our group also had R. from the Netherlands and A., G., and
A. from FSL -- apparently Tigran is considering running Evergreen as a union
catalogue, so his IT people are very interested in learning more.</p>
<p>Our first exercise was to model the organizational unit hierarchy using
the configuration bootstrap interfaces in /cgi-bin/config.cgi. We began by
drawing the hierarchy on a whiteboard. The "Yerevan Consortium" represented
the Evergreen system as a whole; we added the FSL, MSU, and AUA systems as
children of the Yerevan Consortium, and then added specific branches as
children of each of these systems. While we were creating this hierarchy, I
showed the participants how the organization unit type defines the labels
used in the catalogue as well as the respective depth in the hierarchy for
each type.</p>
<p>We then ensured that the systems and branches in the hierarchy had the
right types, and that the types were defined with valid parent-child
relationships. We found a few types that were children of themselves, which
causes a problem in searching. There was also some confusion about the
relationship of types to organization units, resulting in the creation of
types with labels like "FSL" rather than "Library System". After a few
minutes of explanation and working through correcting the exercises, I think
the participants were better able to understand the relationship between
types and organization units.</p>
<p>After we were satisfied with the structure of the organization unit
hierarchy, I ran the autogen.sh script to update the catalogue and staff
client representations of the hierarchy. Well, first I demonstrated how
search in the catalogue will quickly be broken if you do not run the
autogen.sh script <img alt=":-)" class="emoticon" src="/images/smile.png" /></p>
<p>Our next step was to register new users with the Evergreen staff client.
This helped introduce the participants to the staff client, as well as giving
them a quick introduction to some parts of Evergreen that still need to be
localized to allow regional variations on postal code formats, telephone
numbers, and forms of identification. The default Evergreen staff client
still enforces American conventions, but fortunately I have had to create
patches for Evergreen to support my own country's standards, so I can assure
you that it is relatively easy to change or remove these format checks. In
the future, it would be wonderful to include a localization pack for each
locale interested in using Evergreen that supports regional variations on
date formats, phone number patterns, etc. The participants were pleased with
the feedback mechanism in the staff client that summarized all of the
remaining problems with the current patron record (missing address, invalid
phone number, etc.) and made it easy to switch between screens without losing
any of the data they had already entered.</p>
<p>Once we had registered new users for each of our branches, we went to work
importing new bibliographic records and attaching call numbers and copies to
those records. This gave us a good opportunity to see how changing the scope
of a search in Evergreen from "Everywhere" down to a specific branch changes
the search results, and demonstrated how the organization type labels are
displayed in the catalogue. As an aside, I should point out that in Evergreen
1.4 (due by the end of this summer), the labels are internationalized so that
different labels can be displayed depending on the locale in which you are
using the catalogue or staff client. Good news for those of us who work in
bilingual or multilingual libraries!</p>
<p>Now that we had records with copies attached and patrons registered in our
Evergreen instance, we were able to use the catalogue's "My Account" features
to try out sharable bookbags, account preferences, and the account summary.
Users also have the ability to specify their own user names and to log in
with those instead (which means that they can simply remember their unique
nickname rather than, say, a 14-digit barcode). The first feature that the
participants discovered, of course, was the strong password enforcement
feature. When a patron is registered, the system automatically generates a
random 4-digit password; however, this is not considered to be a safe
password, so when they log in they are forced to change it to a longer
password containing both numbers and letters.</p>
<p>At this point, we also discovered a data validation bug: in the staff
client, it is possible to enter a user barcode that consists of letters and
numbers. However, in the catalogue, user barcodes containing letters are
considered invalid and the system will not even attempt to log that user in;
it simply rejects the barcode. I plan to ask E. to report this bug to the
Evergreen mailing list; it would be an excellent outcome of the workshop if
participants felt comfortable reporting problems to the mailing list, and
reporting this problem in particular would help improve the quality of
Evergreen.</p>
<p>Things were going reasonably well, but we noticed that the system was
running into a problem if you tried to edit a bibliographic record after you
had already created or imported the record. I had rather fortunately already
experienced this problem (it is a result of different behaviour regarding XML
namespaces between different versions of LibXML2) and knew that it had been
fixed in 1.2.2.1. So rather than trying to fix the problem with the installed
version of 1.2.1.4, I decided to try upgrading our Evergreen system to the
recently released 1.2.2.1 to demonstrate to the participants that the upgrade
process was fast, reasonably well documented, and not nearly as complicated
as the install process. This was, by the way, something Randy had urged me to
do, so I blame him for the subsequent problems we experienced (hah!).</p>
<p>The first problem is that the change from 1.2.1.x to 1.2.2.x requires the
installation of a new Perl module from CPAN (JSON::XS). This is not much of a
problem in itself, as the module is very easy to install and compile;
however, given our internet connection I had to wait a long time for the CPAN
repository metadata to be downloaded. The participants were still able to use
the system while this was happening, but we ended up hitting the coffee break
still waiting for CPAN to finish. (As an aside, Irakli and I were discussing
the possibility of having the eIFL-FOSS coordinators investigate setting up
local mirrors of FOSS resources like CPAN to speed up access to frequently
used resources.)</p>
<p>When we returned from the coffee break, the JSON::XS install had finished
but the participants were having problems searching and using the staff
client. I checked the logs (starting with the "grep ERR /openils/var/log/*"
command) and saw that our database connections were dying for some reason. On
a hunch, I checked the system logs ("dmesg") and discovered that the Linux
"out of memory" (OOM) killer had started killing random processes to try to
free up memory. It was killing the PostgreSQL processes, the Evergreen
processes - anything! I was lucky, because I had been reading about the OOM
killer on Linux after hearing about a Linux user who had run into a similar
problem, and knew that the way to disable it was to prevent Linux from
overcommitting memory to processes in the first place. Wondering why our
system had started running out of memory at all, I ran "free" and saw that it
had been set up with no swap space; I confirmed this by running fdisk to see
that there were no swap partitions.</p>
<p>Here, however, I made a mistake. I ran "echo '2' >
/proc/sys/vm/overcommit_memory" to prevent Linux from overcommitting memory
to new processes and to prevent the OOM killer from killing any more random
processes. But this also meant that I was immediately unable to launch any
new programs - so I could not safely shut down PostgreSQL and Evergreen, and
we had to turn the power off to the system.</p>
<p>Fortunately, the system started up cleanly again (hurray for journalled
filesystems) and I was able to complete the upgrade before the rest of our
hands-on session for the day was finished. A few things that are missing in
the current upgrade instructions:</p>
<ol class="arabic">
<li><p class="first">You have to compile the new version of Evergreen. The easiest way to
do</p>
</p><p>this is to copy install.conf over from your previous version of
Evergreen and</p>
<p>run "make config" to ensure that all of the settings are still
correct, then</p>
<p>run "make" to build the new version of Evergreen.</p>
<p></li>
<li><p class="first"><strong>Very important</strong>: Before installing the new version of Evergreen,
you must</p>
</p><p>prevent the database schema from being completely recreated or it
will destroy</p>
<p>any data that is already in your system. One way of doing this is,
during the</p>
<p>"make config" step, to list all of the Evergreen targets _except
for_</p>
<p>openils_db. I am simply incapable of remembering all of those
targets, so my</p>
<p>dirty workaround is to open Open-ILS/src/Makefile in an editor and
modify the</p>
<p>"install: " make target by removing the "storage-bootstrap" make
target. What</p>
<p>we really need is an "upgrade" target for "make config" that simply
installs</p>
<p>everything except for the database schema.</p>
<p></li>
<li><p class="first">Confirm that the new version of Evergreen has been installed by
running</p>
</p><p>the srfsh command "request open-ils.storage open-ils.system.version".</p>
<p></li>
</ol>
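<p>For those who have not used srfsh before, that version check looks roughly
like the following - the prompt and output format here are from memory, so
they may differ slightly between releases:</p>
<pre class="literal-block">
$ srfsh
srfsh# request open-ils.storage open-ils.system.version

Received Data: "1.2.2.1"

------------------------------------
Request Completed Successfully
Request Time in seconds: 0.042
------------------------------------
</pre>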
<p>For tomorrow (today, by the time you receive this), A. and G. are going to
create a swap file to enable the system to swap memory to disk if need be;
the system has 1 GB of RAM, which is enough for a small Evergreen system, but
when one is compiling programs at the same time as running Evergreen, swap
space really is necessary. This was a very good lesson learned for all of
us!</p>
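<p>For reference, creating and enabling a swap file on a Linux system of that
era looks something like this (run as root; the 2 GB size and the /swapfile
path are arbitrary examples):</p>
<pre class="literal-block">
# create a 2 GB file, format it as swap, and enable it
dd if=/dev/zero of=/swapfile bs=1M count=2048
mkswap /swapfile
swapon /swapfile
# to survive reboots, add this line to /etc/fstab:
# /swapfile none swap sw 0 0
</pre>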
<ol class="upperalpha simple" start="5">
<li>also interested in learning more about basic Linux</li>
</ol>
<p>administration. His institution currently runs on an entirely Windows</p>
<p>infrastructure, so the requirement to learn Linux is a fairly high
hurdle.</p>
<p>I'm hoping that the eIFL-FOSS list will be a good resource for him to
start</p>
<p>that journey. He has also asked to go over the step-by-step instructions
for</p>
<p>installing Evergreen, so I'm considering starting that in a VMWare
session so</p>
<p>that we can run through the steps. Our major goal for tomorrow is to
migrate</p>
<p>some data from FSL's legacy system into Evergreen. Wish us luck!</p>
<p><em>Editorial comment:</em> The combination of Armenian and Russian MARC
records refused to load into the Evergreen 1.2.2.1 system, but on the
flight home I confirmed that they loaded perfectly and were searchable
on my Evergreen development system. As the development version will
become this summer's 1.4 "internationalization" release, we are in good
shape.</p>
<p><em>Editorial comment 2:</em> On the second day, while running in circles
trying to figure out why the records were refusing to load into the
1.2.2.1 system, I decided to try the
<a class="reference external" href="irc://chat.freenode.net/#openils-evergreen">#openils-evergreen</a> IRC
channel. Yerevan is 9 hours ahead of the Toronto/Atlanta time zone, so
at noon Yerevan time I was hardly expecting any of the current core
Evergreen developers to be online - yet, to our amazement, Mike Rylander
responded. This was a pretty convincing demonstration to the attendees
that the core developers really aren't far away or hard to contact at
all.</p>
Get out of jail, go free, part I2008-06-16T20:38:00-04:002008-06-16T20:38:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-06-16:/get-out-of-jail-go-free-part-i.html<p>As Mark Leggott mentioned in <a class="reference external" href="http://loomware.typepad.com/loomware/2008/05/vendor-to-open.html">Vendor to Open Source ILS in 1 Month
#1</a>,
I had the pleasure of assisting the migration of the University of
Prince Edward Island library system from Unicorn to Evergreen. <a class="reference external" href="/archives/123-Evergreen-and-the-business-case-for-choosing-an-open-source-ILS.html">A little
over a year
ago</a>,
in discussing the business case for open source library …</p><p>As Mark Leggott mentioned in <a class="reference external" href="http://loomware.typepad.com/loomware/2008/05/vendor-to-open.html">Vendor to Open Source ILS in 1 Month
#1</a>,
I had the pleasure of assisting the migration of the University of
Prince Edward Island library system from Unicorn to Evergreen. <a class="reference external" href="/archives/123-Evergreen-and-the-business-case-for-choosing-an-open-source-ILS.html">A little
over a year
ago</a>,
in discussing the business case for open source library systems, I
stated that one of the problems we faced with migrations is that the
license for a proprietary system often inhibits the open sharing of
information about how to export data from those systems in
machine-usable formats. Thus, the open source library community needs to
encourage the development of "migration ninjas". Little did I know that
I would soon join the guild of ninjas and become <em>deadly and silent, and
unspeakably violent</em>(1)(2).</p>
<p>As a result, I have created a utility script that should be of
assistance to SirsiDynix Unicorn or Symphony sites who are interested in
exploring the possibilities offered by other library systems. The rather
dryly named "export_unicorn.pl" script was added to the <a class="reference external" href="http://sirsiapi.org">Unicorn API
repository</a> as entry # 228 today under a GPL v2
license(3). As the script uses the Unicorn/Symphony API, however, I am
sadly (to the best of my knowledge) not free to simply share the script
with anyone. Therefore, to gain access to the script you must be an
API-certified Unicorn or Symphony customer. Still, by making an export
script available to SirsiDynix customers that provides the raw data in a
relatively standard output format, it should ease the effort required by
the migration ninjas for open source systems to massage the data into
the needed input formats, and to avoid the
<a class="reference external" href="http://www.google.ca/search?q=define%3Atetsubishi">tetsu-bishi</a>
scattered by the proprietary systems in defence of "their" data(4)(5).</p>
<ol class="arabic simple">
<li><a class="reference external" href="http://www.bnlmusic.com">Barenaked Ladies</a>, "The Ninjas". <em>Their
website is horrible Flash and JavaScript overkill but damnit Jim,
they're musicians, not webmasters; the "Snacktime" album is
especially recommended if you have kids.</em></li>
<li><em>Although I have to say I'm nowhere near as violent as Mike Rylander,
who with his PostgreSQL-fu can carve seemingly any piece of data into
the shape needed for import into Evergreen.</em></li>
<li><em>Thanks to Mark Leggott for insisting that I retain copyright over
the scripts created during the UPEI migration and for allowing me to
share those scripts in the appropriate avenues. It's another weapon
(shuriken? ninja-to?) in the migration ninja arsenal.</em></li>
<li><em>This data does, after all, belong to the libraries who license a
library system, but at least one company reportedly has a pattern of
repeatedly removing interfaces that enable easy machine-readable
access to library data...</em></li>
<li><em>I find myself being thankful that Unicorn does provide an API for
generating machine-readable data exports; all that it cost our
library was a week of my life and the associated training fees and
travel expenses</em></li>
</ol>
Introduction to Evergreen at eIFL-FOSS ILS workshop2008-06-16T20:04:00-04:002008-06-16T20:04:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-06-16:/introduction-to-evergreen-at-eifl-foss-ils-workshop.html<p>I was in Armenia last week, leading a <a class="reference external" href="http://www.eifl.net/cps/sections/services/eifl-foss/ils/ils-project-workshop">workshop on open source library
systems</a>
along with Henri Damien Laurent from <a class="reference external" href="http://biblibre.com">BibLibre</a>.
My charge was to introduce Evergreen and lead participants in two days
of hands-on experience with the system; Henri took on the same task for
Koha. I cannot say …</p><p>I was in Armenia last week, leading a <a class="reference external" href="http://www.eifl.net/cps/sections/services/eifl-foss/ils/ils-project-workshop">workshop on open source library
systems</a>
along with Henri Damien Laurent from <a class="reference external" href="http://biblibre.com">BibLibre</a>.
My charge was to introduce Evergreen and lead participants in two days
of hands-on experience with the system; Henri took on the same task for
Koha. I cannot say enough good things about our host for the workshop,
the <a class="reference external" href="http://www.sci.am">Fundamental Library of the National Academy of Sciences of
Armenia</a> headed up by Tigran Zargaryan; nor can I
offer enough compliments to Randy Metcalfe on his skills in ensuring
that everything ran smoothly; nor can I express how rewarding it was to
meet representatives of so many different countries and how much I
enjoyed their company! I look forward to helping the pilot sites succeed
with their implementations.</p>
<p>So, for the short term, I'll simply link to the "Introduction to
Evergreen" presentation that I gave at the start of the workshop in
<a class="reference external" href="/uploads/talks/2008/Evergreen-eIFL-FOSS.odp">OpenOffice</a>
and
<a class="reference external" href="/uploads/talks/2008/Evergreen-eIFL-FOSS.ppt">PowerPoint</a>
formats (as I promised to the participants). In the next day or two I plan to
post a summary of the workshop activities; some of the lessons learned; and
where I think I'll focus my attention next.</p>
Weeding 2.02008-05-11T04:07:00-04:002008-05-11T04:07:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-05-11:/weeding-20.html<p>Okay, this is definitely a lame thing to be thinking about at midnight
on a Saturday, but I was just playing with the shelf browser in the
<a class="reference external" href="http://open-ils.org">Evergreen</a> representation of our 780,000
bibliographic records (okay, that is definitely the wrong thing to be
<em>doing</em> at midnight on a Saturday …</p><p>Okay, this is definitely a lame thing to be thinking about at midnight
on a Saturday, but I was just playing with the shelf browser in the
<a class="reference external" href="http://open-ils.org">Evergreen</a> representation of our 780,000
bibliographic records (okay, that is definitely the wrong thing to be
<em>doing</em> at midnight on a Saturday). For some reason, I was wandering
through the subject collection pertinent to librarians (pray for my
soul), noticed a book that probably should have been discarded years
ago, and thought "Gee, i don't want to deal with this right now, but
wouldn't it be nice if I could just mark this <strong>Weed me</strong> and forget
about it until Monday?"</p>
<p>Then I realized that that wouldn't be a stretch at all. In Evergreen,
users have "bookbags" to which they can add items. These bookbags can be
shared as RSS feeds and otherwise easily exported into other formats. If
we were running Evergreen for real, I could create a "Weed me!" bookbag,
add in the suspect along with a bunch of other festering tomes, and send
the RSS feed to a student to perform the manual labour. Or perhaps the
RSS feed gets aggregated with other weeders' feeds and a weeding list
gets generated on a monthly basis for efficient labour practices. You
get the idea.</p>
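<p>As a purely hypothetical sketch (the URL pattern is from my memory of
Evergreen's feed support, and the bookbag ID is invented), harvesting a
shared "Weed me!" bookbag could be as simple as:</p>
<pre class="literal-block">
# fetch bookbag 42 as an Atom feed for the weeding student to work from
curl "http://biblio.example.ca/opac/extras/feed/bookbag/atom-full/42"
</pre>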
<p>Of course, you would really want to have more information than just the
stock shelf browsing interface at hand when making weeding decisions.
For example, you would need a tally of recorded uses displayed beside
the item, with the ability to drill down for totals by year. If you
participate in a consortial "last copy standing" program, you would want
a quick check to see if any other institutions still hold a copy of the
resource. So, an enhanced interface would be needed to provide an
experience that combines the traditional weeding approach of roaming the
stacks and generating reports of items matching some minimum age and
minimum usage criteria.</p>
<p>Think about it a little further though (I'm sure you're thinking a lot
faster than me at this point; you're probably having the luxury of
reading this at the beginning of the day, coffee in hand, invigorated
after an early morning run in the lingering late spring chill... or
not), and there are points in our institutional workflows where we could
naturally introduce weeding activities. How do we get to the point of
having three editions of a given text on the shelf? If I have the 1995,
2003, and 2007 editions of a text, I can assure you that when I ordered
the 2007 edition I had already checked our ILS to see if we had a copy
of that edition already, and would have noticed the previous editions.
At that point, I should have the ability to say "Oh - get rid of the
1995 edition <strong>now</strong> and once the 2007 edition is processed and on the
shelf, cull the 2003 edition to boot." If I was designing an
acquisitions module today, that's certainly something I would consider
as a nice-to-have. Ahem.</p>
<p>Weeding 2.0 may not be a sexy subject.
<a class="reference external" href="http://www.google.ca/search?q=%22weeding+2.0%22">Google</a> and
<a class="reference external" href="http://search.yahoo.com/search?p=%22weeding+2.0%22">Yahoo</a> each turn
up exactly four hits, none of them related to libraries, which is
remarkable in this overly-hyped everything 2.0 world. But it's something
we should consider in the design and tailoring of our library systems;
and while it's not going to rank in my top level of priorities for
Evergreen, it will work its way in there somewhere, sometime. Hopefully
before the stacks in my subject areas buckle under the weight of unused,
out-of-date books.</p>
Tuning PostgreSQL for Evergreen on a test server2008-04-14T18:48:00-04:002008-04-14T18:48:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-04-14:/tuning-postgresql-for-evergreen-on-a-test-server.html<p><strong>Update 2008-05-01</strong>: Fixed a typo for sysctl: -a parameter simply
shows all settings; -w parameter is needed to write the setting. Duh.</p>
<p>Once you have decided on and acquired your <a class="reference external" href="http://www.coffeecode.net/archives/155-Test-server-strategies.html">test hardware for
Evergreen</a>,
you need to think about tuning your PostgreSQL database server. Once you
start loading bibliographic records …</p><p><strong>Update 2008-05-01</strong>: Fixed a typo for sysctl: -a parameter simply
shows all settings; -w parameter is needed to write the setting. Duh.</p>
<p>Once you have decided on and acquired your <a class="reference external" href="http://www.coffeecode.net/archives/155-Test-server-strategies.html">test hardware for
Evergreen</a>,
you need to think about tuning your PostgreSQL database server. Once you
start loading bibliographic records, you might notice that after 100,000
records or so, your search response times aren't too snappy. Don't snarl at
Evergreen. By default, PostgreSQL ships with very conservative settings
(suited to a machine with something like 256 MB of RAM!), so if you don't
tune those settings you're getting a false representation of your system's
capabilities.</p>
<p>The "right" settings for PostgreSQL depend significantly on your
hardware and deployment context, but in almost any circumstance you will
want to bump up the settings from the delivered defaults. To give you an
idea of what you need to consider, I thought I would share the settings
that we're currently using on our Evergreen test server at Laurentian
University. You might be able to use these as a starting point and
adjust them accordingly once you've run some representative load tests
against your configuration. And it's useful documentation for me to fall
back on in a few months, when all of this has escaped my grasp <img alt=":-)" class="emoticon" src="/images/smile.png" /></p>
<div class="section" id="the-defaults-as-shipped-in-debian-etch">
<h2>The defaults (as shipped in Debian Etch)</h2>
<p>The defaults in Debian Etch are quite conservative. Consider that our
test server has 12GB of RAM. The default only allocates 1MB of RAM to
work memory (which is critical for sorting performance) and only 8MB of
RAM to shared buffers. Following are the defaults set in
/etc/postgresql/8.1/main/postgresql.conf:</p>
<pre class="literal-block">
# - Memory -
#shared_buffers = 1000            # min 16 or max_connections*2, 8KB each
#temp_buffers = 1000              # min 100, 8KB each
#max_prepared_transactions = 5    # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
#work_mem = 1024                  # min 64, size in KB
#maintenance_work_mem = 16384     # min 1024, size in KB
#max_stack_depth = 2048           # min 100, size in KB

# - Free Space Map -
#max_fsm_pages = 20000            # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000         # min 100, ~70 bytes each
</pre>
</div>
<div class="section" id="our-test-server-settings">
<h2>Our test server settings</h2>
<p>Our test server has 12 GB of RAM. Assuming that the PostgreSQL defaults
were set for a system with 1 GB of RAM, we should be able to multiply
the memory-based settings by at least a factor of 12. We're a little bit
more aggressive than that in our settings. Note, however, that this is a
single-server install of Evergreen, so we're also running memcached,
ejabberd, Apache, and all of the Evergreen services as well as the
database - oh, and a test instance of an institutional repository, among
other apps - so we're not nearly as aggressive as we would be in a
dedicated PostgreSQL server configuration. Please note that I'm making
no claims that this is the optimal set of configuration values for
PostgreSQL even on our own hardware!</p>
<pre class="literal-block">
# shared_buffers: much of our performance depends on sorting, so we'll set it 100X the default
# some tuning guides suggest cranking this up to as much as 30% of your available RAM
shared_buffers = 100000           # 8K * 100000 = ~0.8 GB

# work_mem: how much RAM each concurrent process is allowed to claim before swapping to disk
# your workload will probably have a large number of concurrent processes
work_mem = 524288                 # 512 MB

# max_fsm_pages: increased because PostgreSQL demanded it
max_fsm_pages = 200000
</pre>
<p>After you change these settings, you will need to restart PostgreSQL to
make the settings take effect.</p>
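<p>On Debian Etch, that restart and a quick sanity check look something like
the following - note that the init script name carries the major version, and
the psql invocation assumes the default local authentication setup:</p>
<pre class="literal-block">
/etc/init.d/postgresql-8.1 restart
# confirm that the new value took effect
psql -U postgres -c 'SHOW work_mem;'
</pre>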
</div>
<div class="section" id="kernel-tuning">
<h2>Kernel tuning</h2>
<p>In addition to PostgreSQL complaining about max_fsm_pages not being
high enough, your operating system kernel defaults for SysV shared
memory might not be high enough to support the amount of RAM PostgreSQL
demands as a result of your modifications. In one of our test
configurations, we had cranked up work_mem to 8GB; Debian complained
about an insufficient SHMMAX setting, so we were able to adjust that by
running the following command as root to set the kernel SHMMAX to 8GB
(8*1024^3 = 8,589,934,592 bytes):</p>
<pre class="literal-block">
sysctl -w kernel.shmmax=8589934592
</pre>
<p>To make this setting sticky through reboots, you can simply modify
/etc/sysctl.conf to include the following line:</p>
<pre class="literal-block">
# Set SHMMAX to 8GB for PostgreSQL
kernel.shmmax=8589934592
</pre>
</div>
<div class="section" id="other-measures">
<h2>Other measures</h2>
<p>Debian Etch comes with PostgreSQL 8.1. The first version of PostgreSQL
8.1 was released in November 2005. That's a long time in computer years.
Version 8.2, which was released less than a year later, "adds many
functionality and performance improvements" (according to the <a class="reference external" href="http://www.postgresql.org/docs/8.2/static/release-8-2.html">release
notes</a>).
If you're not getting the performance you expect from your hardware with
Debian Etch, perhaps a <a class="reference external" href="%20http://packages.debian.org/etch-backports/postgresql-8.2">backport of PostgreSQL
8.2</a>
would help out.</p>
</div>
<div class="section" id="further-resources">
<h2>Further resources</h2>
<p>This is just a shallow dip into PostgreSQL tuning for Evergreen -
hopefully enough to alert you to some of the factors you need to
consider if you're putting Evergreen into a serious testing environment
or production environment. Here are a few places to dig deeper into the
art of PostgreSQL tuning:</p>
<ul class="simple">
<li>PostgreSQL manual, resource consumption section of server
configuration: <a class="reference external" href="http://www.postgresql.org/docs/8.1/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY">version
8.1</a>
and <a class="reference external" href="http://www.postgresql.org/docs/8.2/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY">version
8.2</a></li>
<li>An annotated version of the 8.0 parameters with more explicit advice
is available at
<a class="reference external" href="http://www.powerpostgresql.com/Downloads/annotated_conf_80.html">http://www.powerpostgresql.com/Downloads/annotated_conf_80.html</a></li>
<li>Some good advice is buried about halfway down <a class="reference external" href="http://cbbrowne.com/info/postgresql.html">Christopher Browne's
page</a> under the heading
"Tuning PostgreSQL", along with links to further resources</li>
<li>The "Performance Whack-A-Mole" presentation at
<a class="reference external" href="http://www.powerpostgresql.com/Docs">PowerPostgreSQL</a> is a great
tutorial for holistic system tuning</li>
</ul>
</div>
Test server strategies2008-04-10T00:39:00-04:002008-04-10T00:39:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-04-10:/test-server-strategies.html<p>Occasionally on the <a class="reference external" href="http://open-ils.org/irc.php">#OpenILS-Evergreen IRC
channel</a>, a question comes up about what kind
of hardware a site should buy if they're getting serious about trying
out Evergreen. I had exactly the same chat with Mike Rylander back in
December, so I thought it might be useful to share the strategy we …</p><p>Occasionally on the <a class="reference external" href="http://open-ils.org/irc.php">#OpenILS-Evergreen IRC
channel</a>, a question comes up about what kind
of hardware a site should buy if they're getting serious about trying
out Evergreen. I had exactly the same chat with Mike Rylander back in
December, so I thought it might be useful to share the strategy we
developed in case other organizations are interested in piggy-backing on
our research. We came up with three different scenarios, depending on
the funding available to the organization and how serious the
organization is about testing, developing, and deploying Evergreen.</p>
<p>You can also look at the scenarios as stages, as the scenarios enable
progressively more realistic testing. An organization can always start with a
single server and add more servers over time; if you can swing a significant
discount for buying in bulk, however, it might make sense to bite the bullet
early.</p>
<p>Some pertinent facts about our requirements: we will eventually be
loading around 5 million bibliographic records onto the system. We're an
academic organization, so concurrent searching and circulation loads
will be low relative to public libraries.</p>
<div class="section" id="scenario-1-a-single-bargain-basement-testing-server">
<h2>Scenario 1: A single bargain-basement testing server</h2>
<p>In this scenario, the organization purchases a single server for the
short term, and configures it to run the entire Evergreen + OpenSRF stack:</p>
<ul class="simple">
<li>database</li>
<li>Web server</li>
<li>Jabber messaging</li>
<li>memcached</li>
<li>OpenSRF applications</li>
</ul>
<p>This server needs to have powerful CPUs, large amounts of RAM, and many
fast (10K RPM or higher) hard drives in a striped RAID configuration (the
latter because database performance typically gets knee-capped by disk
access). A "higher education" quote online from a reputable big-name vendor
for a rack-mounted 2U database server with 2x4-core CPU, 16GB RAM, and 6x73GB
RAID 5 drives comes in at approximately $7000.</p>
<p>This scenario is fine for development and testing with a limited number of
users, but if you intend to do any sort of stress testing with this server or
throw it open to the public, performance will likely grind to a halt.
<strong>Note:</strong> This is close to the system that we're currently
running at <a class="reference external" href="http://biblio-dev.laurentian.ca">http://biblio-dev.laurentian.ca</a> - 12 GB of RAM, 2
dual-core CPUs - with 800K bibliographic records and pretty snappy search
performance. It's certainly nothing to sneeze at.</p>
</div>
<div class="section" id="scenario-2-one-database-server-one-network-server">
<h2>Scenario 2: one database server, one network server</h2>
<p>In this scenario, you purchase a database server and a network server.
We'll use the same specs from scenario 1 for the database server, and a CPU +
RAM-oriented server for the network server (disk access isn't a factor for
the network apps, so you just buy two small mirrored drives). The stock
higher education quote for a rack-mounted 1U network server with 2x4-core
CPU, 16GB RAM, and 2x73GB RAID 1 drives is approximately $5250.</p>
<p>This scenario will support development and testing, as well as enable you
to perform relatively representative stress-testing runs with a significant
number of simultaneous users.</p>
</div>
<div class="section" id="scenario-3-two-database-servers-two-or-three-network-servers">
<h2>Scenario 3: two database servers, two or three network servers</h2>
<p>In this scenario, you purchase two database servers - so that you can test
database replication and split database loads between search and reporting -
and two or three network servers to test different distributions of the
caching and network apps across the servers, to determine the configuration
that best meets your expected demands. The cost of the five servers adds up
to less than $30,000 - less than a single traditional proprietary UNIX server
- and would be less if you can negotiate a bulk discount.</p>
<p>The third scenario supports development and testing, and will give you
practical experience with a configuration that would approximate your
production deployment of servers. When you go live, you could move one of the
database servers and all but one of the network servers over to the
production cluster, and revert back to scenario one for your ongoing test and
development environment.</p>
</div>
<div class="section" id="the-conifer-approach">
<h2>The Conifer approach</h2>
<p>We opted to go with the third scenario to build a serious test cluster
for our consortium. However, the "scenarios as stages" approach ended up
being our strategy as our original choice of Dell servers came with RAID
controllers that do not work well under Debian. After returning the
servers to Dell, we were forced to press one of our backup servers into
service as a scenario-one style server while waiting for our new order
from HP to arrive.</p>
</div>
Progress with Project Conifer2008-03-27T02:15:00-04:002008-03-27T02:15:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-03-27:/progress-with-project-conifer.html<p>Project Conifer is the effort by McMaster University, University of
Windsor, and Laurentian University to put together a consortial instance
of Evergreen. <a class="reference external" href="http://conifer.mcmaster.ca/node/15">A few weeks back</a>,
we agreed that May 2009 would be our go-live date. So the clock is
ticking quite loudly in my ears.</p>
<p>Today I got an …</p><p>Project Conifer is the effort by McMaster University, University of
Windsor, and Laurentian University to put together a consortial instance
of Evergreen. <a class="reference external" href="http://conifer.mcmaster.ca/node/15">A few weeks back</a>,
we agreed that May 2009 would be our go-live date. So the clock is
ticking quite loudly in my ears.</p>
<p>Today I got an <a class="reference external" href="http://biblio-dev.laurentian.ca">Evergreen test
server</a> up and running, loaded with
the records from the consortium of Laurentian partners. I hit a few
bumps on the road, but eventually successfully loaded about 800,000
bibliographic records and about 500,000 items. I also turned on the
Syndetics enrichment data, so some items offer cover images, tables of
contents, reviews, and author information. The response time is pretty
snappy (it's running on a 4-core server with 12GB of RAM).</p>
<p>Things that made my task harder than it probably should have been:</p>
<ul>
<li><p class="first">yaz-marcdump generated invalid XML when I converted our MARC records
from MARC21 to MARC21XML format. Maybe this problem is fixed in later
versions of yaz-marcdump (I was using the stable Debian Etch version,
2.1.56, which is <em>crazy</em> old), or I could have tried
<a class="reference external" href="http://marc4j.tigris.org/">marc4j</a> or
<a class="reference external" href="http://oregonstate.edu/~reeset/marcedit/html/index.html">MarcEdit</a>
instead to try for better results, but I didn't, and it cascaded into
problems with...</p>
</li>
<li><p class="first">Dumping all of the holdings as part of the bibliographic records
threw things off when some of the records had so many holdings
attached (think a weekly periodical that a library circulates and
therefore each issue has its own barcode) that they spilled over
MARC's record length limit, resulting in multiple MARC records just
to hold the holdings - which causes some problems for the basic
import process. I eventually punted on trying to parse the MARC21XML
for holdings and just dumped the data I needed directly from Unicorn
in pipe-delimited format.</p>
</li>
<li><p class="first">Not tuning PostgreSQL <em>before</em> starting to load data into the
database was just plain stupid. The defaults for PostgreSQL are
incredibly conservative, and must be modified to handle large
transactions and to perform. Here are the tweaks I made for our 12GB
machine, starting with the Linux kernel memory settings:</p>
<pre class="literal-block">
# -- in /etc/sysctl.conf --# Set SHMMAX to 8GB for PostgreSQLkernel.shmmax=8589934592
</pre>
</p><pre class="literal-block">
# -- in /etc/postgresql/8.1/main/postgresql.conf --# Crank up shared_buffers and work_memshared_buffers = 10000work_mem=8388608 # 8 GB, equal to our kernel.shmmaxmax_fsm_pages = 200000
</pre>
</li>
<li><div class="first"></p></div><p>Evergreen depends on accurate fixed fields to determine the format of
an item. Unfortunately, many of our electronic resources appear not
to have been coded as such... so we have some data clean-up to do.</p>
<p></li>
</ul>
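<p>For anyone attempting the same conversion, the basic yaz-marcdump
invocation looks something like this - the option names below are taken from
recent yaz documentation, and the ancient 2.1.56 release may spell them
differently:</p>
<pre class="literal-block">
# convert binary MARC21 records (MARC-8 encoding) to MARC21XML in UTF-8
yaz-marcdump -f MARC-8 -t UTF-8 -i marc -o marcxml records.mrc > records.xml
</pre>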
<p>Ah well: as Jerry Pournelle used to say in his Chaos Manor column, "I do
these things so that you don't have to." Hopefully it makes a smoother
path for others to get to Evergreen.</p>
Evergreen Acquisitions at VALE's Next Generation Academic Library System Symposium2008-03-15T16:34:00-04:002008-03-15T16:34:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-03-15:/evergreen-acquisitions-at-vales-next-generation-academic-library-system-symposium.html<p>On Wednesday, I was fortunate enough to join a distinguished panel
of speakers and a crowded music hall at <a class="reference external" href="http://www.valenj.org/newvale/ols/symposium2008/">VALE's Next Generation Academic
Library System
Symposium</a> at <a class="reference external" href="http://www.tcnj.edu">The
College of New Jersey</a>. I had been invited to
present an update on the state of acquisitions support in Evergreen, as
well …</p><p>On Wednesday, I was fortunate enough to join a distinguished panel of
speakers and a crowded music hall at <a class="reference external" href="http://www.valenj.org/newvale/ols/symposium2008/">VALE's Next Generation Academic
Library System Symposium</a> at <a class="reference external" href="http://www.tcnj.edu">The College of New Jersey</a>. I had been
invited to present an update on the state of acquisitions support in
Evergreen, as well as to provide a brief overview of Project Conifer (the
collaboration between Laurentian University, McMaster University, and the
University of Windsor to create a consortial implementation of Evergreen).</p>
<p>To summarize what I intended to be the main points of my presentation
(which may or may not have come through in real life):</p>
<ul>
<li><p class="first">Project Conifer is an existing effort to create a shared consortial
implementation of Evergreen for academic institutions; we would be
delighted to have others join forces with us</p>
</li>
<li><p class="first">If acquisitions isn't as far along as we would have hoped by now,
it's because:</p>
<ul>
<li><p class="first">We (the Project Conifer institutions) haven't contributed enough
development resource to the effort thus far - although we are planning to
correct this problem in the near term by hiring one or more developers to
work on the requirements that we, as academic institutions, need for a
successful Evergreen experience. If you're interested in a position as an
Evergreen developer for Project Conifer,
<a class="reference external" href="mailto:dan@coffeecode.net">let's talk</a>.</p>
</li>
<li><p class="first">Creating an enterprise-grade acquisitions system demands much more
effort and attention to detail than creating a simplistic acquisitions
system that would be acceptable for a small library. If it took two years
to build Evergreen's circulation, cataloging, reporting, and OPAC
functionality from scratch, it's not unreasonable that it should take a
year or more to build an acquisitions system to the same standards as the
rest of Evergreen.</p>
</li>
</ul>
</li>
<li><p class="first">Evergreen acquisitions has made significant progress since December
2007, and at this pace we expect a complete set of basic functionality to
be in place by the end of April. By "basic functionality" I mean that the
manual acquisitions mode should be supported with a minimalist user
interface. MARC order record batch loading, EDI send/receive support, and a
more polished user interface will take some more time - probably
September-ish 2008. You can see the in-development, regularly updated
bare-bones interface at <a class="reference external" href="http://acq.open-ils.org/oils/acq/base/index">http://acq.open-ils.org/oils/acq/base/index</a>.</p>
</li>
</ul>
<p>I have to say that Equinox is making incredible progress considering that
they're still doing the bulk of the work with the same amount of development
resource that they had before Georgia PINES went live on Evergreen, and they
started their own company, and they started bringing BC PINES on line, and
they began receiving an onslaught of requests for visits and presentations
and conference calls... imagine what we could do with Evergreen, together, if
a few more sites or consortiums were able to devote human or financial
resources to enhancing Evergreen.</p>
<p>Here are my slides in
<a class="reference external" href="/uploads/talks/2008/Evergreen_acquisitions_VALE.odp">OpenOffice</a>
and
<a class="reference external" href="/uploads/talks/2008/Evergreen_acquisitions_VALE.ppt">PowerPoint</a>
format. If you're going to look at my slides, I highly recommend reading the
presenter notes that I wrote; I've recently realized that presenter notes are
as much for the benefit of a disconnected audience as they are useful
preparation material for the presenter. In the absence of a full paper on the
subject matter at hand, presenter notes should help flesh out the brevity
forced by slideware.</p>
<p>A huge thanks to Ed Corrado, Anne Hoang, and Kurt Wagner for making the
overall experience so enjoyable. I was honoured to be part of such a
high-quality panel of speakers.</p>
<p>Oh, and as an aside - the entire symposium was videotaped, and the
presentations and question and answer sessions will be made available from
the VALE Web site. I will update this post when those become available. I
wonder if Ed got this idea from code4lib... in any case, I certainly applaud
the initiative.</p>
<p><strong>Update:</strong> Umm, more polished acquisitions will likely be available in
Sept. 2008, not 2007... thanks to Brad Lajeunesse for pointing out that
time travel would be required to make that happen</p>
Evergreen workshop at code4lib 20082008-02-26T13:44:00-05:002008-02-26T13:44:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2008-02-26:/evergreen-workshop-at-code4lib-2008.html<p>Yesterday morning we (Bill Erickson, Sally Murphy <em>aka</em> "Murph", and I)
ran an <a class="reference external" href="http://open-ils.org/dokuwiki/doku.php?id=advocacy:evergreen_workshop">Evergreen
workshop</a>
(rough agenda, presentation, and links to associated resources from that
page) for the code4lib 2008 preconference session. My personal goals
were:</p>
<ol class="arabic simple">
<li>Walk people through a simple Evergreen install</li>
<li>Get a small set of bib records …</li></ol><p>Yesterday morning we (Bill Erickson, Sally Murphy <em>aka</em> "Murph", and I)
ran an <a class="reference external" href="http://open-ils.org/dokuwiki/doku.php?id=advocacy:evergreen_workshop">Evergreen
workshop</a>
(rough agenda, presentation, and links to associated resources from that
page) for the code4lib 2008 preconference session. My personal goals
were:</p>
<ol class="arabic simple">
<li>Walk people through a simple Evergreen install</li>
<li>Get a small set of bib records and holdings imported</li>
<li>Attract some more developers to the project by demonstrating how
seductively simple it is to add a new service to Evergreen at the
OpenSRF layer and then expose it in the catalogue or staff client</li>
<li>Show off some of the great features of Evergreen that haven't had
nearly enough exposure (reports, "fresh meat" feeds, exporter
interface)</li>
</ol>
<div class="section" id="problems">
<h2>Problems</h2>
<p id="problem1">Problem #1: I started organizing the pre-conference too late. To
save time on the install section, I asked attendees to prepare by
setting up a VMWare image or bootable Debian or Ubuntu partition and get
a bunch of the prerequisite packages installed ahead of time. But by the
time I sent my request out, the attendees only had a few days to prepare
- and many of them probably hadn't worked with VMWare before, so they
suddenly had another learning barrier to overcome. I wasn't too
surprised when only about 25% of the room had been able to "do their
homework".</p>
<p>Problem #2: I lost at least six hours of preparation time when, due to
my own stupidity, I left my passport in a hotel in Atlanta and ended up
having to drive across the border from Vancouver to Portland, Oregon.
Six hours, man... that's almost a full day thrown away, which is
critical when you've left things too late (see <a class="reference internal" href="#problem1">problem1</a>).
Continuing on the negative side, all I could listen to during the drive
was completely formulaic rock stations and political rhetoric worthy of
10-year-olds as I drove through Washington. If radio is a dying medium,
I have a very good idea why...</p>
<p>Problem #3: We ran into bizarre projector problems that, for some
reason, prevented us from being able to see our laptop screens at the
same time as the projected screen. This laptop worked fine with the
projector at the OLA Superconference just a few weeks ago, and Bill was
afflicted by the same problem - so it really put a crimp in my ability
to switch from the presentation to the live install image. My neck was
wrecked from constantly twisting around to peer up at the screen while
trying to do some minor mousing around.</p>
<p id="problem4">Problem #4: I severely underestimated how long the install
process would take when trying to support a whole group of people at
once; you're guaranteed to have a question on almost every step. When we
were preparing for the workshop, we had this idea that we would take a
hard line and spend no more than one or two minutes on each step - which
certainly would have saved a lot of time. But when you've made a
connection with the audience, and people have made it through the first
dozen steps, it suddenly becomes a lot, lot harder to simply abandon
them with the promise that you'll help them later. So we ended up
spending something like 2 hours on the install (including a break)
rather than the 45 minutes we had been aiming for.</p>
<p>Problem #5: We were overly optimistic about how much we could get done
in 2.5 hours. Even without the severe compounding of our time crunch by
<a class="reference internal" href="#problem4">problem4</a>, in retrospect it's clear we would still have been rushing through
all of the other pieces. I think we knew that anyways, but we were just so
excited about showing off Evergreen that we wanted to show off as much as
possible.</p>
<p>It's not really all that bleak though. There were successes, too.</p>
</div>
<div class="section" id="successes">
<h2>Successes</h2>
<p>Success #1: We have at least one person who successfully made it through
the install phase and who successfully imported the bib records and
holdings, and several others who feel they are <em>very</em> close to
finishing. I'm hoping that we can spend a few minutes over the course of
the conference to help them reach that finish line.</p>
<p>Success #2: We have a real example of <a class="reference external" href="http://open-ils.org/dokuwiki/doku.php?id=importing:holdings:import_via_staging_table">how to import
holdings</a>
into Evergreen now. This is something that people have been asking for
on the list, and I'm really happy to have been able to package up what
Mike Rylander provided with a set of sample records and a sample "parse
holdings" script that hopefully others will be able to adopt to their
own needs.</p>
<p>Success #3: I had feedback from a number of people who, even though they
weren't trying to go through the install, still felt it was worthwhile
getting an explanation of all the pieces that OpenSRF and Evergreen
depend on and how they fit together. I think it was clear that the
complexity involved in installing Evergreen isn't so much OpenSRF or
Evergreen themselves as it is a few finicky details involving networking
- largely ejabberd and Net::Domain's insistence on specific and
sometimes conflicting definitions of hostnames.</p>
<p>Success #4: Bill did get to quickly demonstrate <a class="reference external" href="http://open-ils.org/dokuwiki/doku.php?id=advocacy:evergreen_workshop#customizing_evergreennew_service">how to add a new
OpenSRF
service</a>
("reset my password and email it to me") and how to integrate that into
the catalogue. It was rough and dirty code, but at approximately one
page of Perl code and about 10 lines of JavaScript I think it was a
convincing demonstration of how easy it is to extend Evergreen.</p>
<p>Success #5: We have laid the groundwork for an Evergreen workshop now,
and having gone through the experience once we'll be able to refine the
concept for future events. One idea that we've already kicked around is
to split it into several tracks so that attendees can self-select what
they're interested in and so that we can give enough time to each
section. Say, two (or three) hours for an installfest; two hours for
"exploring the dark corners of Evergreen"; and two hours on developing
and extending Evergreen (OpenSRF, catalogue, staff client). Or we could
have spent the entire pre-conference day on Evergreen.</p>
</div>
<div class="section" id="reflection">
<h2>Reflection</h2>
<p>I think it might have been really cool if we had worked with LibraryFind
and Zotero to set up an ongoing theme throughout the three
pre-conference sessions. We could have collaborated on pre-requisites,
so that the LibraryFind install could go on top of the same image as the
Evergreen install, and then the newly installed Evergreen image could
have been added as a LibraryFind source during the LibraryFind
administration section. Then, during the Zotero session, Evergreen and
LibraryFind could have been added as new sources for capturing citation
information (by making Evergreen and LibraryFind generate COInS objects
that Zotero understands or giving Zotero the ability to understand the
various formats that Evergreen offers via unAPI).</p>
<p>Of course, it also would have required a heck of a lot of pre-conference
planning. A suggestion I would make for next year's pre-conference
organizers would be to communicate as much as possible ahead of time to
set expectations and help your attendees determine what your agenda
should be. We could have just thrown out the entire Evergreen install
section, had people get comfortable with a pre-installed VMWare image ahead of
time, and focused most of the session on developing and exposing OpenSRF
services, for example, if that's what our attendees wanted.</p>
</div>
<h2>The State of Evergreen: OLA Presentation (2008-02-02)</h2>
<p>Well, despite getting less than four hours of broken sleep before my
9:00 am presentation, I think I successfully delivered an update on
<strong>Evergreen: State of the Open ILS</strong> to approximately 45 people at the
<a class="reference external" href="http://www.accessola.com/superconference2008">OLA Super Conference</a>
today. There were some great questions from the audience that kept me on
my toes. Thank heavens David Fiander was there to provide colour
commentary and solid advice. Overall, the talk seemed to be well
received.</p>
<p>Perhaps the most pleasant surprise of the session was when I discovered
that one of the libraries close to my old home town has been working for
the last six months on migrating to Evergreen. Marvelous!</p>
<p>If you want the slides from my presentation, I've licensed them under a
Creative Commons 2.5 By Attribution license. Presentations available
below in two different formats:</p>
<ul class="simple">
<li><a class="reference external" href="/uploads/talks/2008/TheStateofEvergreen.odp">Evergreen:State of the Open ILS (OpenOffice
Impress)</a></li>
<li><a class="reference external" href="/uploads/talks/2008/TheStateofEvergreen.ppt">Evergreen:State of the Open ILS (Microsoft
PowerPoint)</a></li>
</ul>
<h2>As if you didn't see it coming... (2008-01-08)</h2>
<p>My employer, Laurentian University, <a class="reference external" href="http://laurentian.ca/Laurentian/Home/News/Evergreen+library+system+08jan08.htm">issued a press
release</a>
today announcing that we have selected
<a class="reference external" href="http://open-ils.org">Evergreen</a> as our future library system. I
wrote more about this on the <a class="reference external" href="http://open-ils.org/blog/?p=111">Evergreen
blog</a>, but what I didn't say was ...
yay!</p>
<p>We still have a long road ahead of us, but knowing that we'll be
migrating to a system that I can poke with a sharp stick and make it do
my bidding goes a long way towards making me feel warm and fuzzy inside.</p>
<p>I predict that we'll see a few more announcements from universities and
colleges in North America joining the Evergreen development effort /
adoption process in 2008. Outside of Ontario, I know about the
<a class="reference external" href="http://blog.benostrowsky.com/2007/10/04/my-new-job-horizon-vs-evergreen-cage-match/">University of
Utah</a>'s
interest and interest from a <a class="reference external" href="http://valenews.wordpress.com/2007/12/03/conference-agenda/">New Jersey consortium of academic
institutions</a>
(see session h. "Open Library Systems and NJ: From Vision to
Transformation")... are there other academics who have made public
statements of interest in Evergreen that I'm missing out on?</p>
<h2>Ain't no way to treat your CODI (2007-11-16)</h2>
<p>Wow. Eileen R. Kontrovitz, a board member of CODI (Customers of Dynix,
Inc.) wrote, as part of <a class="reference external" href="http://letterman.oplib.org/blog/?p=26">her summary of the recent CODI
conference</a>:</p>
<blockquote>
Many very nice things happened at the conference but the buzz, the
thing everyone who was not there wants to hear about is the
unannounced, invitation only meeting with Martin Taylor from Vista
Equity Partners about open source. Mr. Taylor began the meeting in a
very cordial way and with a charm about him that belied the basic
message he proceeded to give. That message was, we know more than
you do and if you don't like it you can go to some other "happy
place". He actually said that on more than one occasion.</blockquote>
<p>I guess Vista feels quite threatened by open source library systems,
even though those systems currently account for <a class="reference external" href="http://lisnews.org/node/22251">only 1% of the US public library
market</a> today, if Vista is willing to
have their top brass lay on the fear, uncertainty, and doubt campaign to
their top customers behind closed doors. It also seems that Mr. Taylor's
tactics have backfired, at least in this case; after praising the common
SirsiDynix employees, Eileen goes on to say that Vista's attitude has
almost ensured that her library (Ouachita Parish Public Library) will
not move to Symphony. This, even though Eileen says:</p>
<blockquote>
...I happen to agree with his assessment of open source for a
library ILS at this point in time and the many problems with the
very nature of the beast without some kind of regulating body...</blockquote>
<p>Perhaps this should be the subject of another blog post, but I believe
there are actually a number of ways Evergreen is regulated. First is that
an open source project is not an "anything goes" project: the committers
for the project act as a level of quality control for Evergreen. If code
doesn't further <a class="reference external" href="http://open-ils.org/mission.php">the Evergreen
mission</a> by contributing towards
stability, robustness, flexibility, security, or user-friendliness, then
it's simply not going to go into the project proper. Second is that at
least one company (<a class="reference external" href="http://esilibrary.com">Equinox</a>) is staking its
success on Evergreen, and others are starting to build up business
around Evergreen. They're not going to sit back and rest easy; they know
that they have to enhance Evergreen beyond its current core strengths if
they want to build inroads into markets like academic libraries. Third
is that Evergreen's open source license also acts as a regulator - the
code that comprises Evergreen can never be pulled from the market, so
the future of Evergreen is always in the hands of the community.</p>
<h2>Laurentian goes ever greener (2007-10-12)</h2>
<p>This is slightly in advance of our official press release, which is
currently in translation, but I will be giving / have given a lightning
talk at Access 2007 on this subject and have decided to make the
following materials available:</p>
<ul class="simple">
<li>Report: <a class="reference external" href="http://www.coffeecode.net/uploads/reports/Assessing_Evergreen.pdf">Assessing Evergreen for an academic bilingual
library</a></li>
<li>Evergreen Business Readiness Rating:
(<a class="reference external" href="http://www.coffeecode.net/uploads/reports/BRR-Evergreen.ods">OpenOffice</a>)
(<a class="reference external" href="http://www.coffeecode.net/uploads/reports/BRR-Evergreen.xls">Excel</a>)
- see <a class="reference external" href="http://openbrr.org">OpenBRR</a> for more information about the
Open Business Readiness Rating templates</li>
<li>Presentation: <a class="reference external" href="http://www.coffeecode.net/uploads/talks/2007/Ever_greener_at_LU.pdf">Lightning talk: Going Ever Greener at Laurentian
University</a></li>
</ul>
<h2>Committing to Evergreen (2007-09-09)</h2>
<p>Yesterday, over on the Evergreen blog, Mike announced that I am now <a class="reference external" href="http://open-ils.org/blog/?p=100">a
full committer</a> to the Subversion
repository for Evergreen. (It was blog post #100 for Evergreen, by the
way - two milestones in one!). The road to getting here was pretty
standard fare for open-source projects: submit patches that do useful
things (like simplify build processes or add i18n support); listen to
feedback about those patches and incorporate those lessons learned into
the next patches; and repeat, as described in Evergreen's <a class="reference external" href="http://open-ils.org/documentation/contributing.html">contribution
process</a>:</p>
<blockquote>
<p>From time to time, and as individual community members become more
familiar and skilled with the complete codebase of Evergreen, some
individuals may be asked to join the core team. We see this as both
an honor and a responsibility, as this group is charged with being
the final quality control mechanism for the source code, as well as
helping other less experienced community members come up to speed.
It is not simply a way to get code into Subversion, but also about
mentoring new contributors and helping to keep the overall vision of
the project in focus, tempered by the history and evolution of the
code and lessons learned from past successes and failures.</p>
</blockquote>
<p>I'm not just tooting my own horn, here. I think it's important to
emphasize that the Evergreen community is healthy, welcoming to
newcomers, and growing. I am honoured to join the Evergreen team (as
Mike says, "again"), this time as a committer - and I look forward to
helping the Evergreen community continue to grow.</p>
<p>If you're interested, there are plenty of ways to help us - through your
contribution of use cases, documentation, graphics and design, patches,
translations, testing... Hmm. I'm <a class="reference external" href="/archives/137-Open-source-in-libraries-community-strength.html">repeating
myself</a>
a bit here <img alt=":-)" class="emoticon" src="/images/smile.png" /> See you on the lists / IRC!</p>
<h2>Open source in libraries: community = strength (2007-08-31)</h2>
<p>Karen G. Schneider has a great post on <a class="reference external" href="http://www.techsource.ala.org/blog/2007/08/enterprise-open-source.html">Enterprise Open
Source</a>
on the <a class="reference external" href="http://techsource.ala.org">ALA TechSource</a> blog:</p>
<blockquote>
<p>But the truly significant activity in LibraryLand technology hasn't
been vendor-driven. It has been the maturation of what I call
"enterprise open source": products such as Evergreen and Koha that
are robust, well-implemented library automation packages with strong
development communities and equally strong funded-support models.</p>
</blockquote>
<p>Hear hear! Karen examines the value of open source, and finds that it's
not so much in that it's a lower-cost alternative (although that can be
a persuasive argument), and not so much that you have the ability to
modify the code (although that can also be a persuasive argument), but
that it depends on the strength of the community to continue to exist
and improve. And that makes it a very good match for libraries, because
we seem to do "community" better than most other industries.</p>
<p>So let me take a different tack than Karen, and assume that if you've
read this far that you're interested in supporting open source for your
library, but maybe you don't have a programmer on staff. How can you
help?</p>
<p>Well, there are many ways other than programming to contribute to an
open source community. <a class="reference external" href="http://open-ils.org">Evergreen</a>, for example,
just posted a call on its development mailing list for <a class="reference external" href="http://list.georgialibraries.org/pipermail/open-ils-dev/2007-August/001696.html">help in defining
and prioritizing the requirements for its acquisitions and serials
modules</a>.
If you have experience with these areas, and have blue-sky ideas for how
you could build a better system, this is a great opportunity to step
into the conversation. There's a bit of a parallel here to proprietary
systems, although with a proprietary system it's called a "request for
enhancement" and most of those tend to get filed in the distant future.
With Evergreen's invitation for discussion, you <em>know</em> the developers
are happy to listen to the ideas for making the best possible product.
They don't have prior baggage holding them back, so they really can
start from square one - and they have a huge incentive to do better than
the existing options, because they want to convince you to pick
Evergreen the next time you're thinking about your next ILS.</p>
<p>Or you can contribute your hard-won experience and knowledge to the
documentation wiki.
<a class="reference external" href="http://open-ils.org/dokuwiki/doku.php">Evergreen</a> and
<a class="reference external" href="http://wiki.koha.org">Koha</a> both have wikis to which you can
contribute. Interestingly, there is a parallel here to at least one
proprietary vendor, which set up a wiki (behind a password-protected
site) after many requests from their user group. It boggles <em>my</em> mind,
but some of these same customers have also argued that they (the
customers) should pool their efforts and write a new set of manuals for
the product for which they are paying support and licensing fees. I'm
sure the Evergreen and Koha projects would really appreciate your
assistance in writing a good set of manuals, and they won't charge you
for the privilege, either.</p>
<p>You can also participate simply by joining in the conversations on the
mailing lists or chat rooms (#OpenILS-Evergreen on Freenode for
Evergreen, #koha on Freenode for Koha). You'll take some time to get
familiar with the products, no doubt, but once you've climbed over the
brick walls (with the help of the others on the mailing list), you will
have the opportunity to pay it back a dozen times over as others face
the same walls that you faced. I see this same principle on our
proprietary vendor's mailing lists. The customers do a far better job of
supporting each other than the vendor to whom they're paying support and
licensing fees.</p>
<p>And it feels good, working together to build something that belongs to
an entire community. For the little bits that I've been able to do, I
get a huge sense of satisfaction. It's a nice little addiction.</p>
<h2>Wrapping up the AcqFest (2007-07-24)</h2>
<p>Well, I'm finally back from Atlanta and the Evergreen AcqFest. I'll
apologize right off the top for not providing more blog updates over the
course of the weekend, but the requirements and design discussions were
pretty intense so I didn't want to risk continuously missing subtle but
important details and not being able to participate intelligently by
live-blogging the event. After each full day of work, we "unwound" with
a serious meal--which, after some socializing, usually involved slipping
back into kicking more design and implementation ideas, problems, and
potential solutions around. By the time I got back to the hotel room, I
was either completely wiped out, or itching to commit something to the
group document or play with some code. So I hope you understand (all
three of you that are reading this!).</p>
<p>On top of everything else, it was a bit of a gruelling trip back. In
order to save a few hundred dollars and make a greener transportation
choice for the final leg of my journey, I took the bus back to Sudbury.
Hello, five-hour layover in Toronto and a packed six-hour bus ride
(thanks to Hwy 401 construction) back home! It was quite a relief to get
back and see the family.</p>
<p>Anyways, here is a mini-summary of what we accomplished:</p>
<ul class="simple">
<li>Agreement on some realistic time frames for acquisitions and serials
development: call the stages one philosopher, two philosopher, and
three philosophers</li>
<li>After kicking around the left-of-field idea of using a calendar server
to handle serials schedules, coverage information, predictions, and
claiming events for a day or so, and clearly provoking the concern of
at least one library blogger, Mike had a brilliant idea for how to
represent all of this natively in PostgreSQL. He's going to take a
few days to work through a proof-of-concept to ensure that it's as
solid as it sounds, so I won't give away the details just yet...</li>
<li>Agreement on the requirements for basic item-at-a-time acquisition
workflow support (to be implemented first) and more advanced
acquisitions support (batch orders via MARC record import, integrated
vendor discovery API support, EDI support)</li>
<li>Agreement on adding internationalization support to OpenSRF. Right
now OpenSRF (the messaging infrastructure on which Evergreen depends)
knows nothing about locales. We've been able to use URL tricks to
support translation of the catalog interfaces thus far, but Mike
worked through the changes that will be required to pass locale as a
property of each session. This will enable the service being invoked
to "do the right thing" if locale is of a concern to whatever output
it returns.</li>
<li><em>Almost</em> agreement on how to add internationalization support to
Evergreen. We worked through a number of different scenarios for
supporting translation of dynamic strings (library names, for
example) that reside in the database, from a single table that holds
all of the translated strings, to an i18n schema that holds tables
that parallel any table in another schema that holds translatable
content. We settled on the latter approach (see the sketch after this
list). I say "almost agreement"
because until something gets committed to code, I have a feeling that
this is still subject to change a little bit <img alt=":-)" class="emoticon" src="/images/smile.png" /></li>
<li>Exposure to some parts of Evergreen that many of us hadn't seen
before -- in particular, the reporting interface that Evergreen
provides is extremely powerful and well-designed. It even supports
basic line and bar charts for adding punch to your presentations.</li>
<li>Art introduced us to OFBiz and OpenTaps via an online training video,
and later on Art and Ed successfully played around with the Java
OpenSRF client via BeanShell. My takeaway lesson about OFBiz if I
ever need to customize something built on it: it's all in
controller.xml!</li>
</ul>
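<p>
To make the parallel-table idea concrete, here is a minimal sketch of what
such an i18n schema might look like in PostgreSQL. To be clear, this is my
own illustration and not the design we actually committed; the schema,
table, and column names here are assumptions for the sake of the example:
</p>
<pre>-- Hypothetical source table holding a translatable string in the
-- default locale.
CREATE SCHEMA example;
CREATE TABLE example.org_unit (
    id   SERIAL PRIMARY KEY,
    name TEXT NOT NULL
);

-- A parallel table in an i18n schema: one row per translated string,
-- keyed by the source row and a locale code.
CREATE SCHEMA i18n;
CREATE TABLE i18n.org_unit (
    id     SERIAL PRIMARY KEY,
    source INTEGER NOT NULL REFERENCES example.org_unit (id),
    locale TEXT NOT NULL,   -- e.g. 'fr-CA', as passed on the session
    name   TEXT NOT NULL,   -- translated value of example.org_unit.name
    UNIQUE (source, locale)
);

-- Retrieve the name for org unit 1 in the session locale, falling back
-- to the default-locale value when no translation exists:
SELECT COALESCE(t.name, o.name) AS name
  FROM example.org_unit o
  LEFT JOIN i18n.org_unit t
    ON t.source = o.id AND t.locale = 'fr-CA'
 WHERE o.id = 1;</pre>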
<p>We covered a lot more than that, but for now I have to
get some shut-eye. For any skeptics out there, the actual acquisitions
and serials workflows employed at our constituent libraries were used as
testbed scenarios for the discussions about Evergreen's serials and
acquisitions requirements and design. I'm feeling good about the work we
accomplished, I think we found some elegant solutions for some of the
age-old problems in these areas, and I think we have a common
understanding of the path forward.</p>
<h2>On the road again: Evergreen acqfest (2007-07-18)</h2>
<p>So I'm taking off tomorrow for Atlanta to spend four days deeply
immersed in discussing, designing, planning, and implementing
Evergreen's <a class="reference external" href="http://open-ils.org/dokuwiki/doku.php?id=scratchpad:acq_serials">acquisitions and
serials</a>
support. At least that's the plan. In our spare time (heh), we're going
to tackle the internationalization infrastructure as well. The spirit of
the event is modelled loosely after the Access conference hackfests, and
has therefore been dubbed <strong>Acqfest</strong>. Unlike hackfest, however, where
the journey itself is usually the goal, with Acqfest there's a much
stronger emphasis on actually getting things done. I may, err, <em>acquire</em>
a slight southern accent after a few days, but I mostly hope to increase
my understanding of Evergreen while kicking in some design suggestions,
code, and documentation here and there.</p>
<p>Laurentian is covering my travel--at this point in our evaluation of the
future of our systems, it's in the library's interests to give me an
opportunity to stare deep into the heart of Evergreen--and my local
arrangements are being covered by BC Public Libraries, Georgia PINES,
and Equinox Software. I'm contributing my time and, uh, expertise. All
round, I think the whole community is going to benefit from the
Evergreen Acqfest. Assuming I have a few minutes, I'll try to post some
updates on our progress over the next few days.</p>
<h2>Evergreen 1.2.0-rc1 is out! And so is the Gentoo VMWare image... (2007-07-07)</h2>
<p>So, yesterday afternoon Mike Rylander from the Evergreen (a.k.a.
Open-ILS) project pushed out the <a class="reference external" href="http://open-ils.org/blog/?p=96">first release candidate of Evergreen
1.2.0</a>. Hurrah! If you tried
installing Evergreen before, but got hung up on some of the build,
install, or configuration steps, I think you'll find this release a lot
easier to deal with. For example, there's one less configuration file to
deal with now -- bootstrap.conf is a thing of the past.</p>
<p>I'm happy to point out that I've updated my Gentoo-based VMWare image of
Evergreen as well: <a class="reference external" href="http://open-ils.org/~denials/Evergreen_1.2.0-rc1_Gentoo_x86.zip">Evergreen
1.2.0-rc1</a>
(479M). Along with that, I've updated my instructions for <a class="reference external" href="http://open-ils.org/dokuwiki/doku.php?id=installing_prerequisites_on_gentoo">installing
Evergreen on
Gentoo</a>
to reflect the newer, simpler install process.</p>
<h2>Know your sources: Evergreen / Koha comparisons (2007-06-24)</h2>
<p><strong>Correction update: 2007/06/26</strong> Wow. I am incredibly embarrassed.
Somehow, I made a very stupid mistake in my summary of the State Library
of Ohio ILS Options Discussion Meeting Minutes - April 24, 2007. The
mistake was that I incorrectly attributed Joshua Ferraro of LibLime with
making statements about Evergreen at that meeting when he was not even
present. All of the statements about Evergreen should have been
attributed to Stephen Hedges. I apologize profusely to Josh for this
mistake, and will repeat this correction and apology in the section of
the blog entry closer to the text.</p>
<p>I have been in the process of gathering information about the possible
future of our library system, with a focus on SirsiDynix's Rome, Evergreen,
and Koha, for a number of months now. This results in having to sift through
claims from a number of different sources about the capabilities (present
and future) for all of these systems. In the context of a recent post on the
open-ils.org blog (<a class="reference external" href="http://open-ils.org/blog/?p=90">Lies, Damned Lies, and Library Automation
Software</a>),
as well as all of the shilling that will undoubtedly be going on on the
exhibit floor and in the hospitality rooms of ALA, and finally building on
Karen Coombs' post on <a class="reference external" href="http://www.librarywebchic.net/wordpress/2007/06/22/bias-objectivity-and-authority/">Bias, Objectivity and
Authority</a>, I would like to make a point that ideally shouldn't need to
be made (especially for librarians!), but sadly seems to be necessary in the
context of discussions about Evergreen and Koha.</p>
<p>The point? "Know your sources." And "Check your facts." When you've been
given information about</p>
<p>something, you don't blindly accept the information as given - you check
the</p>
<p>references and determine the authority of the source. This is par for
the</p>
<p>course for reference librarians educating patrons performing research in
their</p>
<p>libraries, but oddly enough seems to be a common blind spot when it
comes to</p>
<p>performing evaluations of the software that powers your libraries.</p>
<p><strong>Problem #1</strong>: Evaluating information about a given product
from the company or organization that stands to benefit from your adoption
of that product.</p>
<p>This is especially hard when dealing with companies offering proprietary
products that won't hand you an evaluation copy to try out in your own
organization; or that run closed mailing lists; or that don't make their
documentation or support infrastructure openly available.</p>
<p>But it can also be hard with companies or organizations offering open
source products or support for open source products. The company or
organization may point you at an online demo of their product, but that demo
may reflect a heavily customized, bleeding-edge version of the product that
mere mortals cannot install - or even more insidiously, may include code
that is not currently included in the open source repository.</p>
<p>The good news on the open source front is that independent contributors
have made VMWare images of some of the most popular library systems
available (<a class="reference external" href="http://open-ils.org/cvs.php">Evergreen</a>,
<a class="reference external" href="http://kylehall.info/index.php/projects/koha/koha-virtual-appliance/">Koha</a>)
for download that reflect a standard install of the product directly from
the open source repository.</p>
<p>Note that a common approach to marketing a product is to provide a
feature list - basically a checklist of features. A naive decision maker
might assume that more is better, which often results in products breaking
down the features that they do well into many sub-features. It's a form of
checklist inflation - but as long as you've got your eyes open, at least
it's more information rather than less. For each feature you're actually
interested in, you have to ask a couple of additional questions:</p>
<ul>
<li><p><em>How many other libraries are currently using this feature?</em>
It may be great that the software you're looking at includes LDAP
authentication as an option, but if there's only one other library using the
feature, it's unfortunately likely that they will be using a different LDAP
directory product than you (Novell eDirectory vs. MS ActiveDirectory vs.
OpenLDAP vs. IBM Directory) and they will be using it in a different
way.</p></li>
<li><p><em>Is this feature part of the base package, or is it an optional
extra that's going to cost me more?</em> Not such a problem with the open
source options, although it depends in that case on whether you're buying
commercial support for the product. The model that the company uses may be
all-inclusive or a menu of different costed support options.</p></li>
<li><p><em>Is this a massive feature that hasn't been broken down into
sub-features?</em> The danger here is that, for the purposes of looking good
in feature comparisons, a product may have added a number of "features" that
really just scratch the surface of what comparable products offer.</p></li>
</ul>
<p>For example, if a library systems product says it has a serials module
and an acquisitions module, you have to dig into what that really means.
Does the "serials module" just mean that it will spit out a routing list
whenever you check in a new issue of a given serial? Or does it mean that it
handles predictions, claiming, holdings, etc., in the way that meets your
library needs? By "acquisitions module", does the product mean that it
simply records the cost of each item that you have acquired? Does it allow
you to make on-order items visible in the catalogue, with the ability to
place holds? Does it include EDI capabilities? Does it provide a complete
fiscal management system with funds and reporting and electronic record
import / export hooks for the ERP system that your university or
municipality uses, so that costs and invoices don't have to be manually
entered multiple times in multiple systems?</p>
<p>Perhaps most importantly, does the system have the flexibility to adapt
to your needs, or does the system require you to adapt to its needs? Can you
live with the 80% of functionality that most sites need, or does your site
live in the long tail of requirements? In the case of serials, for example,
do you need the ability to specify any pattern, or can you just treat
irregular patterns as exceptions?</p>
<p><strong>Problem #2</strong>: Evaluating information about a given product
from a company or organization that offers a competitive product.</p>
<p>Sales people make it their business to know their competitors so that
they can accomplish two goals:</p>
<ol class="arabic">
<li><div class="first"></p></div><p>Focus attention on (and often embellish) their own product's
strengths,</p>
<p>and know how to spin responses to their own product's weaknesses that
might</p>
<p>be identified by a competitor.</p>
<p></li>
<li><div class="first"></p></div><p>Identify the weaknesses of their competitor's product, particularly
when</p>
<p>their own product has comparable strengths in that same area. Note
that these</p>
<p>weaknesses don't necessarily have to be real, they just have to be
believable</p>
<p>and hard to disprove.</p>
<p></li>
</ol>
<p>Among the proprietary options, without having access to a hands-on test
system, the system documentation, or the product mailing lists, it is
incredibly hard to verify claims about a product's strengths or weaknesses.
Even for claims about the future development of a proprietary product that
you already have access to, the company cannot be held liable if plans
change. Companies can, for example, cancel an entire product even after beta
versions of the product have been released into the wild for "development
partners." Horizon 8, anyone?</p>
<div class="section" id="development-partners-test-our-beta-for-us">
<h2>Development partners: test our beta for us!</h2>
<p>Oh - on the topic of "development partners" - this is typically a
euphemism for "we'll give you a discount on product XXX if you put it into
production and report all the bugs you find." Companies love this approach
because it gives them visibility in the marketplace ("Look, we have already
deployed product XXX to five sites! It's proven and ready for you!") while
enabling them to effectively continue development on product XXX and hope to
have a polished product ready in time for the bulk of their potential
customer base to actually adopt it. In the past, Microsoft very effectively
used the "product announce" to prevent customers from purchasing a
competitor's product that offered compelling features, stalling the decision
long enough to then develop and bring their own product to market.</p>
</div>
<div class="section" id="disinformation-and-open-source-projects">
<h2>Disinformation and open source projects</h2>
<p>Surprisingly, the disinformation approach works even in open source
projects. For example, I have read and heard claims about Evergreen like:
"Oh, Evergreen is just for massive consortiums / it needs 40 servers to run
/ it doesn't scale down to just a single library." You can see how this
could be believable if you don't push too hard on the claims, because
Evergreen was developed for a consortial library system and much has been
made about the impressive server cluster that GPLS runs Evergreen on --
however, having run Evergreen on a single VMWare machine on my laptop, I can
personally attest to its ability to scale down to a single server (or
portion thereof). And you can run that same VMWare image on your own laptop
or spare desktop machine and disprove that claim yourself; but many of the
decision makers do not have the technical skills, time, or interest to get
hands-on with products like Evergreen. So they have to trust what they read
or hear, hopefully from the most trustworthy of sources.</p>
<p>Another swipe at Evergreen is that it is not a true open source project;
that its history as a top-down project initiated by GPLS means that there is
no real development community around Evergreen. If you've followed the
Evergreen development mailing list, you wouldn't believe a claim like this,
and you would proclaim it a blatant lie. To disprove this claim, you just
need to browse through the open-ils-dev mailing list and look for the emails
with the subject keyword "PATCH" and you'll see that some of us have indeed
been contributing patches to the source code. Beyond that, you'll also see
that there are many volunteer contributors for install and configuration
support, documentation, and the creation of VMWare images. So how could
someone make such a claim about Evergreen and get away with it?</p>
<p>It's all about trusting "authorities", not checking sources, and
integrity (or perhaps a lack thereof). Here's an excerpted quote about
Evergreen from the State Library of Ohio ILS Options Discussion Meeting
Minutes - April 24, 2007:</p>
<blockquote>
<p>The documentation for the process is very poor, which is typical because
it is the last thing developers are thinking about. ... The source code is
open but they don't really follow the "playground" rules for the open
source production process.</p>
</blockquote>
<p>Here's where you need to really know your sources and check your
references. Note that the claims about the nature of Evergreen as not being
a true open source project are credited to the introductory speaker, Stephen
Hedges. Who is Stephen Hedges? He was the director of Nelsonville Public
Library (NPL) when he worked with Joshua Ferraro to install Koha as the NPL
integrated library system. In addition, he is listed as the contact for Koha
documentation submissions. It seems, then, that he has a fairly significant
personal stake in the success of Koha, and if the meeting minutes accurately
capture his statements about Evergreen, it sounds like he was interested in
dissuading attendees from seriously considering Evergreen as an option.
<em>Subjectivity alert</em>: as one of the volunteer contributors of code,
documentation, install assistance, and a VMWare image of Evergreen from
outside GPLS, this quote got me pretty hot under the collar; I've
contributed to other open source projects, such as the Linux
Documentation Project and PHP, and you always have to prove that you
understand the project before being granted commit access.</p>
<p><strong>Correction update: 2007/06/26</strong> Wow. In the following paragraph, I
somehow made a very stupid mistake by incorrectly attributing Joshua
Ferraro of LibLime with making statements about Evergreen at that
meeting when he was not even present. All of the statements about
Evergreen should have been attributed to Stephen Hedges. I apologize
profusely to Josh and LibLime for this mistake.</p>
<p>Who was Stephen introducing as the guest speaker of honour on the
subject of open source ILS options in libraries? The speaker was Joshua
Ferraro, president of LibLime, the company best known for offering
commercial support for Koha. LibLime did announce that they would offer
commercial support for Evergreen, and have added sections about Evergreen to
their Web site, so it would on the surface seem to be a logical choice to
invite a LibLime employee as a one-speaker-fits-all host to cover both Koha
and Evergreen. However, LibLime has a rather unusual relationship with
Evergreen. It seems that LibLime has positioned Evergreen among their other
offerings as such a high-end product that only a handful of potential
customers would qualify for that market:</p>
<blockquote>
<div class="line-block">
<div class="line"><strong>Evergreen</strong></div>
<div class="line">For consortia who need:</div>
<div class="line">* Scalability to hundreds of libraries, tens of millions of
records</div>
</div>
<p>[<a class="reference external" href="http://liblime.com/products">LibLime Products</a>]</p>
</blockquote>
<p>It sounds impressive, but way too high-end for the vast majority of
libraries. So of course people browsing the LibLime Web site will focus on
the Koha options instead. It seems like a deliberate bait-and-switch move to
attract libraries interested in Evergreen after the successful launch in
Georgia, but to get them to buy support for Koha instead. Consider: LibLime
has not contributed a single patch to the Evergreen development
(open-ils-dev) mailing list. LibLime has not contributed a single line of
documentation to the Evergreen wiki. LibLime does not include Evergreen
among their demos. LibLime hasn't made an Evergreen sale. So I think it's a
fair question to ask how committed LibLime really is to Evergreen - is
LibLime's claim to support Evergreen just a means to get people in the door,
in hopes that they'll walk out with a copy of Koha under their arms? I think
so. You can come to your own conclusions.</p>
<p>In case you think that Ohio quote was just an unfortunate one-off, and
that I'm making a big deal about nothing, here's a more recent quote from
the Open Source Session Q&amp;A of the "Everything You Ever Wanted to Know
about Open Source" conference held on June 6th, 2007 that caught my
attention (and which apparently no-one in attendance at the meeting was
capable of providing a rebuttal to):</p>
<blockquote>
<div class="line-block">
<div class="line">Q. Contrast Koha &amp; Evergreen?</div>
</div>
<p>A. Major difference: Koha was grassroots: started w/rural libraries,
distributed organization, bottom-up decision making. Evergreen:
PINES library system; top-down decision making. Koha: 800 libs
worldwide, 8 years old; Evergreen: 1 year old, 1 consortium.</p>
</blockquote>
<p>So there's the comment about the "top-down" nature of Evergreen again,
and this time Evergreen is being attacked for being immature and not very
widely used. (Note: on that very day, the British Columbia Ministry of
Education announced the <a class="reference external" href="http://pines.bclibrary.ca/">BC PINES Website</a> - so
another consortium is getting on board the Evergreen express.) If there
really are 800 libraries using Koha, I'm shocked at how many basic install,
config, and runtime problems are being reported on the Koha mailing lists
with the current 2.2.9 release... but I'm getting off-topic. The speaker was
<del>once again</del> Joshua Ferraro, who:</p>
<blockquote>
<p>... talked with us about open source integrated library systems,
specifically Koha and Evergreen, and about his company, LibLime...
[<a class="reference external" href="http://blogs.umass.edu/ealling/2007/06/06/open-source-session-reflections/">reference</a>]</p>
</blockquote>
<p>If your definition of "talking about" is "praise the product that pays
your bills and criticize the product that represents a major threat", then
mission accomplished. You can't blame the speaker for being in a perfect
position to pitch his product at the expense of a competing product, while
being credited with being an objective authority on both products. But I
suspect the audience actually wanted a balanced presentation about the two
products.</p>
<p>So what's my point? <em>Know your sources</em>. If you invite someone to
speak on a broad topic, such as the State Library of Ohio meeting, where
"[t]he invitation was expanded to include any library interested in the
possibility of open source integrated library systems (ILS)", you might want
to ensure that any personal biases are very much out in the open for your
audience (both the in-person audience and the audience reading the meeting
minutes at home). If you're the speaker in such a situation, you should
reveal any such biases.</p>
<p>If you're a company selling a product or services related to a product,
perhaps it's inevitable that the profit motive is going to override ethics
in such opportunities - but I can dream. If you read the full minutes from
the State Library of Ohio meeting, you can see that in terms of an open
source ILS option, Evergreen is given only the most cursory coverage and the
major focus is on selling Koha.</p>
</div>
<div class="section" id="getting-a-fair-comparison">
<h2>Getting a fair comparison</h2>
<p>For a fair comparison of Koha and Evergreen, please consider either
hosting two separate presentations (you wouldn't consider asking SirsiDynix
to give a balanced presentation on all of the proprietary ILS options, would
you?), or try to find an independent speaker who can provide a more
objective analysis of the products at hand. Ask the speaker if they have any
financial ties to the products at hand. Heck, has anybody asked Marshall
Breeding and Andrew Pace if they've had any financial ties to ILS companies?
I assume the answer is no, but our community relies so much on their
analysis of the overall library systems landscape, with so much financial
implication for the companies in question, that it would be comforting to
have a positive assertion accompany any "state of the ILS landscape"
articles in the future.</p>
<p>Ideally, you would find a member of the development community for each of
Evergreen and Koha. At the moment, I'm afraid that I can only qualify as a
member of the Evergreen community, but I plan to become more familiar with
Koha's codebase over the course of the summer - so maybe I can grow into
that position. Of course, then you would have to trust me. C'mon, you can
trust me! <em>grin</em></p>
</div>
<div class="section" id="make-technology-not-war">
<h2>Make technology, not war</h2>
<p>What would I like to avoid? I would really like to avoid negative energy
being invested in a Koha vs. Evergreen or LibLime vs. Equinox battle royale.
That doesn't interest me, but I'm sure it greatly interests the companies
offering proprietary products. Instead, I hope that this energy can continue
to be invested in making both Koha and Evergreen better by those with the
technical skills. Let's have a competition on product design and
implementation, rather than on marketing spin or dirty tricks. Everyone
benefits from strong open source library systems - even if you don't adopt
an open source system, it raises the bar for the proprietary systems to
differentiate themselves.</p>
</div>
<h2>Evergreen and the business case for choosing an open source ILS (2007-04-22)</h2>
<p>Due to a sad event, <a class="reference external" href="http://infoservices.uwindsor.ca/ils/">Art Rhyno</a>
asked me to be his co-presenter at the <a class="reference external" href="http://odyssey2007.wordpress.com">OLITA Digital Odyssey
2007</a>. Our broad subject was
<a class="reference external" href="http://open-ils.org">Evergreen</a>, more specifically introducing the
Evergreen ILS to an audience that was aware of Evergreen's existence but
wanted to know more about it from both a technical and a business
perspective. I had two days' notice to prepare for the presentation, so
I split my time between polishing the <a class="reference external" href="/archives/122-Evergreen-VMWare-image-oh-so-close!.html">VMWare image of
Evergreen</a>
and creating the <a class="reference external" href="/uploads/talks/EG_business.pdf">slides for my presentation
(PDF)</a>.</p>
<p>Art gave a general introduction to open source development, told the
story of how Evergreen came about, and described its architecture and
the capabilities currently demonstrated on the in-production system at
PINES. Perhaps of most interest to the audience, Art talked a bit about
the direction that he's taking
<a class="reference external" href="http://open-ils.org/dokuwiki/doku.php?id=scratchpad:acq_serials">Woodchip</a>,
the serials and acquisitions module based on <a class="reference external" href="http://ofbiz.apache.org">Apache
OFBiz</a> that the University of Windsor <a class="reference external" href="http://open-ils.org/blog/?p=80">has
agreed to develop</a> for Evergreen. No
pressure, Art <img alt=":-)" class="emoticon" src="/images/smile.png" /></p>
<p>Then the presentation was handed off to me. I started by asking for
demographic information from the audience; to no surprise, about half of
the audience of approximately 60 ran Horizon systems. Many of the
attendees in the audience paid more than $20,000 annually for support
and licensing costs. Most of the sites had the equivalent of one
full-time position devoted to the care and feeding of their current
library system.</p>
<p>The goals of my presentation were to:</p>
<ol class="arabic simple">
<li>Demonstrate that the library community has a strong culture of
self-support with respect to library systems (based on the volume of
email on our closed library systems mailing lists)</li>
<li>Suggest that the quality of official support we receive from our
closed library systems does not warrant the annual support fees we
pay</li>
<li>Point out that we already devote personnel to the care and feeding of
our closed library systems, so the refrain of "open source is like
getting a free kitten" is fine given that we're currently paying for
a dog of a closed system</li>
<li>Urge the audience to consider what a waste of money and time it is to
train staff to learn the proprietary API, templating language, etc
for a closed system when that knowledge can become useless if the
system is pulled from the market -- while investing money and time in
learning the API and templating language for Evergreen results in
reusable skills for your personnel because those are based on open
standards.</li>
<li>Let the community know the preliminary results of my evaluation of
the internationalization support offered by Evergreen</li>
<li>List some of the challenges that we face in achieving a wide adoption
of Evergreen</li>
<li>Notify the community that my VMWare image is really, truly, close to
being released and suggest that it would be a great way to get
started with Evergreen</li>
<li>Run a quick live demo with the VMWare image to prove that a full
install of Evergreen can scale down to running in a virtual machine
with 512M of RAM</li>
</ol>
<p>My self-assessment? I did not want to come across as an open source
zealot; rather, I wanted to point out where our current relationships
with our vendors are failing us and how open source can fill in some of
those gaps. Unfortunately, I feel that I probably veered a little too
much towards the rant side of the continuum a couple of times -- my
passion for this subject came through, no doubt, but it was perhaps a
little too strong.</p>
<p>I knew my presentation was text-heavy, but I didn't beat myself up too
much because a good visual presentation needs more than just a couple of
days to come together and I didn't have a variation of this already in
the can somewhere... this was brand new content. I was pleased that I
came up with and shared the visual image of <strong>migration ninjas</strong>. As the
closed vendors' licensing terms might prevent us from openly sharing
migration kits or migration how-tos, the “migration ninjas” would be the
community's system gurus who would slip into a library and perform the
secret, inhuman feats necessary to migrate from a closed system to an
open system.</p>
<p>I wasn't at all happy with my live demo. First, I failed to arrange with
the conference hosts to obtain an Internet connection, so the cover art
in the catalog and the Z39.50 copy cataloging in the staff client -- both
facets of the demo -- were a bust. Second, while I knew it would be an exploratory
live demo, given that I had just achieved a full working install a few
days prior to the session, it's not very impressive for an audience to
watch a presenter fumbling around the command line in response to a
question about the API. Third, I failed to show off some really cool
features of Evergreen such as the shelf-browser (although without cover
art it wouldn't have been nearly as impressive). I tried firing up the
reports Web interface and failed. So, now that I have a working install,
I'll be able to prepare a much better live demo in the future - I just
hope that our audience didn't take away a bad impression from our
session on Friday.</p>
<div class="section" id="questions-from-the-audience">
<h2>Questions from the audience</h2>
<p>We had some good questions from the audience; here's what I can
remember. Please add more to the comments on this post, if you have
them!</p>
<div class="section" id="why-is-there-so-much-interest-in-evergreen-and-why-aren-t-we-hearing-much-about-koha">
<h3>Why is there so much interest in Evergreen and why aren't we hearing much about Koha?</h3>
<p><strong>Dan</strong> said something about how his first investigation of Koha
revealed evidence of classic MySQL dependencies and assumptions in the
codebase that, as a former product planner for IBM DB2 relational
database, made him cringe. Evergreen, in comparison, is built on
PostgreSQL which was reassuring. <em>I failed to note at the time that
Evergreen has been developed so that it can support other databases,
although some work would be required to convert to the SQL dialect and
full-text search required by the target database.</em></p>
<p><strong>Art</strong> mentioned that while Koha had been quite popular internationally
for the past number of years, it had not been as popular in North
America. Part of that reason may have been a severe scalability problem
that kicked in somewhere around 450,000 records. Dan suggested that
problem could be traced directly to MySQL 3 / 4, but that it might have
been alleviated in MySQL 5 (which Koha does not yet support). Art noted
that Koha ZOOM, using indexdata.dk's Zebra indexing engine, overcame
that performance problem but some extra care was required to commit
updates to the index.</p>
</div>
<div class="section" id="what-about-the-dangers-of-someone-forking-the-code">
<h3>What about the dangers of someone forking the code?</h3>
<p>In my opinion, we didn't really answer this question well. Art didn't
think that a fork was likely as Evergreen had been built with the
best-of-breed components and plenty of input from the PINES library
staff and community. What I should have added was that the ability to
create a fork of a project is actually a wonderful feature of open
source - it enables communities to route around projects that become
overly bureaucratic, closed to new developers, or uninterested in
outside input and new directions.</p>
</div>
<div class="section" id="you-dan-talked-a-lot-about-the-benefits-of-a-system-built-on-standards-can-you-show-us-what-the-web-templating-language-looks-like">
<h3>You (Dan) talked a lot about the benefits of a system built on standards. Can you show us what the Web templating language looks like?</h3>
<p>I fumbled this one badly. I quickly brought up footer.xml, but that
doesn't contain any dynamic content so it was a bad example. I then
suffered from presentation brain and couldn't remember the word
“introspect” to demonstrate srfsh's ability to introspect its objects.
Finally I (lamely) showed an example of a srfsh API request.</p>
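<p>For the record, here's roughly the kind of thing I could have shown: an
srfsh session that introspects a service and then makes a simple API request.
Consider this a sketch from memory rather than a verbatim transcript -- it
uses OpenSRF's stock <code>opensrf.math</code> test service rather than a real
Evergreen service, and the output is paraphrased.</p>
<pre># srfsh is the interactive OpenSRF shell that ships with Evergreen

# ask a service to describe the methods it publishes
srfsh# introspect opensrf.math

# call one of those methods; parameters are comma-separated JSON values
srfsh# request opensrf.math add 1,2
Received Data: 3</pre>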
</div>
</div>
<div class="section" id="summary">
<h2>Summary</h2>
<p>I believe that a solid business case needs to be developed on a
library-by-library basis or on a consortial basis for migrations to
Evergreen. I think that my presentation provided some useful input to
those business cases, but in and of itself is not enough. Certainly, as
our own library considers its options in the coming years, we're going
to need a much more solid set of criteria before we can make any
decision. I encourage you to take what you can from the presentation and
improve, polish, and contribute your own analysis back to the Evergreen
community so that you can help other libraries make an informed
decision.</p>
</div>
Evergreen VMWare image -- oh so close!2007-04-18T00:11:00-04:002007-04-18T00:11:00-04:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2007-04-18:/evergreen-vmware-image-oh-so-close.html<p>Many of you know that I have been working on step-by-step instructions
for <a class="reference external" href="http://open-ils.org/dokuwiki/doku.php?id=installing_prerequisites_on_gentoo">installing Evergreen on
Gentoo</a>
on the official Evergreen documentation wiki. At the same time, I have
been working on using that documentation to create a VMWare image of
Evergreen -- this avocation dates all the way back to the <a class="reference external" href="http://infoservices.uwindsor.ca/ils/?p=29">ILS
Symposium</a> hosted by the
University of Windsor in November, 2006. I owe endless thanks to miker,
berick, bradl, and phasefx from the Evergreen development team for all
of their assistance with my annoying questions over the past months.</p>
<p>Obligatory defining of terms: <em>What is a VMWare image, and why should I
care?</em> VMWare is a virtualization product (the "VM" stands for "virtual
machine"). <em>Virtualization</em> is a technology that allows you to run one
or more "guest" operating systems on top of a "host" operating systems.
So, let's say you're really interested in trying out Evergreen, but
don't have a spare computer to install Linux on, or don't have the time
or interest in learning how to compile packages from source on Linux, or
don't have much Linux experience -- you can install the free (zero
dollar, but not open source) <a class="reference external" href="http://vmware.com/products/server/">VMWare
Server</a> on any Windows computer,
download an Evergreen VMWare image to your computer, and start up the
Evergreen image. In less than an hour (assuming you have good bandwidth
to download VMWare Server and the Evergreen image), you can have
Evergreen-running-on-Linux, running in a virtual machine on top of
Windows. That's the basic testing / evaluation use case for
virtualization, anyway. For some small libraries, this may in fact be
all that they need for a production library system -- but that's a
discussion for another blog post.</p>
<p>One more note on virtualization technology: there are other
virtualization options, like <a class="reference external" href="http://www.xensource.com/">Xen</a> or
<a class="reference external" href="http://bochs.sourceforge.net/">Bochs</a>. But VMWare is the 900-pound
gorilla on the scene, and it's what I happen to have the most experience
and success using, so that's why I'm working with it. But it's an open
community, so if you've got the skills to create images for other
virtualization software, go for it!</p>
<p>The good news is that Evergreen appears to be running cleanly on my
system. The OPAC works, albeit without any bibliographic entries at the
moment as I'm still pestering <strong>miker</strong> with questions about the MARC
record and holdings import process. But getting a working install seemed
like the more important first task. Importing holdings and patron
information is going to require different steps depending on which ILS
you are currently using, so this should be a reasonable starting point
for an image.</p>
<p>In my documentation, I haven't attached the exact set of configuration
files that I have used in the VMWare image, but I can do that if people
indicate that they are desired. If you have questions about anything
that seems missing from my documentation or why I made certain choices,
I would be glad to share that information with you and correct the docs.
But rather than supplying just the docs and config files, I suspect the
whole VMWare image would be more generally useful in the short term. I'm
guessing that most libraries interested in kicking the tires of
Evergreen don't want to spend a large chunk of their evaluation period
working out the installation kinks, but just want to get right to the
hands-on portion of the evaluation.</p>
<p>So, Sunday night I uploaded my first version of the image and shared the
URL with a few close contacts, asking them to flush out any bugs. Kudos
to <strong>dmcmorris</strong> for indirectly leading me to discover that I had missed
a minor dependency. Another upload last night, and I'm anxiously
awaiting the feedback from my comrades in arms. If all goes well, a
VMWare image of Evergreen should be available for download by the end of
the week. <em>Crossing fingers...</em></p>
Evergreen internationalization chat2006-11-17T05:11:00-05:002006-11-17T05:11:00-05:00dan@coffeecode.net (Dan Scott)tag:coffeecode.net,2006-11-17:/evergreen-internationalization-chat.html<p>I managed to corner Mike Rylander after Brad Lajeunesse waved his hands
in surrender and offered Mike up as a sacrifice to my questions about
Evergreen's support for internationalization. If you're travelling to
Canada to tout a piece of (or multiple components of) software, you can
be sure that somebody in the crowd is going to be interested in knowing
how capable that software is of supporting a bilingual community. As
Laurentian University is a bilingual institution, I took it upon myself
to be "that guy" and grill Mike a bit on that topic. The good news is
that he survived the grilling, and didn't earn the nickname "pork chop";
the better news is that it sounds like Evergreen hits most of the
internationalization requirements on the head.</p>
<ul>
<li><p class="first">OPAC interface can be multilingual; Georgia has a large Spanish
community and PINES is in the process of translating the OPAC
interface into Spanish</p>
<ul>
<li><p class="first">Sorting results alphabetically (for browsing by author / title) is
problematic, however:</p>
</p><ul class="simple">
<li>PostgreSQL doesn't have a good locale implementation for
collating sequences</li>
<li>Probably not as much of an issue for French / English as it
would be for Finnish</li>
</ul>
</li>
</ul>
</li>
<li><p class="first">Search currently ignores diacritics (e == é == è), but this setting
can be changed in TSearch2</p>
</li>
<li><p class="first">Subject heading equivalency is possible for the simple use case of
"when I search for History--United States--19th century, also show me
records with Histoire--États-Unis--19e siècle" (or whatever the
real LCSH/RVM equivalence would be)</p>
</p><ul class="simple">
<li>This possibility is based on authority records containing both
sets of headings -- we can probably rely on, or possibly
participate in, the EU project to generate equivalences among
LCSH, RVM, and German subject headings to seed this data</li>
</ul>
</li>
<li><p class="first">Staff client is mostly multilingual-ready (hasn't been a priority
requirement for PINES):</p>
</p><ul class="simple">
<li>Most strings are contained in XML files, but there are still
pockets of hardcoded strings</li>
<li>Switching the locale would immediately load the new strings in the
staff client interface</li>
<li>"JavaScript doesn't have a good sprintf() implementation" -- check
to see whether this suggests that token order can't be rearranged.
LibX seems to manage to be able to do this.</li>
</ul>
</li>
<li><p class="first">Forgot to ask about boolean operators (e.g. AND / ET, OR / OU)</p>
</li>
</ul>
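<p>A concrete sketch of the diacritics point above: TSearch2 was eventually
folded into PostgreSQL's built-in full-text search, and the modern way to make
e, é, and è either match or not match is to add (or omit) the
<code>unaccent</code> filtering dictionary in the text search configuration.
The example below uses today's built-in syntax rather than TSearch2's old
table-driven configuration, so treat it as illustrative only.</p>
<pre>-- unaccent ships in PostgreSQL contrib and strips diacritics
CREATE EXTENSION unaccent;

-- clone the French configuration and put unaccent ahead of stemming
CREATE TEXT SEARCH CONFIGURATION fr ( COPY = french );
ALTER TEXT SEARCH CONFIGURATION fr
  ALTER MAPPING FOR hword, hword_part, word
  WITH unaccent, french_stem;

-- with unaccent in the chain, 'Hôtels' matches a search for 'hotel'
SELECT to_tsvector('fr', 'Hôtels') @@ to_tsquery('fr', 'hotel');</pre>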