Wednesday, August 1. 2012
John Mark Ockerbloom recently said, while trying to buy a DRM-free copy of John Scalzi's Redshirts on the Google Play Store: "The catalog page doesn’t tell me what format it’s in, or whether it has DRM; it instead just asks me to sign in to buy it." I noted that if you read the full description (you have to click More), it actually does say:
At the publisher's request, this title is being sold without Digital Rights Management software (DRM) applied.
And that led to the realization that if you search for the magic phrase "this title is being sold without Digital Rights Management software", you'll find plenty of books for sale on Google Play without DRM. When you buy one, you can then download the ePub from the How to Read tab for the book. Pretty straightforward, no?
Monday, June 11. 2012
Update 2012-06-19: And here's how to implement stream-oriented XML parsing
Many academic libraries are already generating electronic resource holdings summaries in the Google Scholar XML holdings format, and it seems to provide most of the metadata you would need to provide a discovery layer summary in a nice, granular format... but unfortunately EBSCO doesn't want that for their Discovery Service. They want a tab-delimited file with just a few fields, and they don't want to go and fetch the Google Scholar XML holdings file and parse it themselves--even though that would seem to be a nice way to avoid having to teach each potential library client how to export holdings in their own desired input format. Lots of libraries don't expose their Google Scholar XML holdings publicly for some reason; I don't get why not, but have to investigate locally...
That gave me the excuse and opportunity to go off and invest some time in learning something new. I could have whipped up a script in Perl or Python or PHP, or written an XSL transform, but I opted to try out Go for this task. I've been introduced to Go twice in the past two years and was impressed by the language, but there's only so much you absorb in a one-hour workshop, and unless you need to get real work done, you never really learn a programming language.
Thus, I present to you my first crappy Go program: eds_holdings.go. As I wrote this, I noted:
- I appreciate the clear reference documentation at http://golang.org but it would really benefit from more inline examples. I ended up writing the XML parsing portion of this using xml.Unmarshal primarily because there was an example for it. Of course, Unmarshal parses the entire document into structures in memory at once... I knew that wasn't what I wanted, but for whatever reason I didn't find any mention of "SAX" or "event" or "pull" that would lead me to a stream-oriented, low-memory parsing option on the page.
- However, the #go-nuts IRC channel on Freenode gave me a reply within minutes, pointing me at xml.Decoder for that purpose. Which is really great - a supportive community is critical. My only problem is that without a simple example to follow, I didn't want to dive into rewriting my quasi-working code, so I ploughed onward.
- My current approach to I/O is far from optimal. Not only am I parsing the entire XML structure in memory, I'm also reading the entire XML file into memory to begin with (via ioutil.ReadFile, as a natural outgrowth of my "begin by parsing a hard-coded string" approach). Don't follow my example! Please point me at a better example.

- The standards and consistency wonk in me likes that Go delegates whitespace wars to go fmt <filename>. Problem solved and time-wasting arguments averted for everyone.
- Go is (subjectively) fast and for all of my worrying about in-memory work, it was pretty lean -- at its maximum, consuming less than 200 MB of RAM while iterating over a 32 MB XML file. For comparison, Firefox was consuming around 750 MB the entire time.
In summary: I enjoyed writing in Go, and hope to find an excuse to do it again. Also, I have much to learn!
Tuesday, November 22. 2011
Preface: I'm talking to my daughter's kindergarten class tomorrow about my job. Exciting! So I prepped a little bit; it will probably go entirely differently, but here's how it's going to go in my mind...
My name is Dan Scott. I’m Amber’s dad. I’m a systems librarian at Laurentian
University.
Today you’re going to learn what a systems librarian does. Exciting, eh?
I bet you have all been to a library. When you think about a library, what do
you think of first?
Yep - books! I do a lot of work with books! Can you guess what sorts of things
I do with books?
- Choose which books to add to the library: Selection
- And when we get new books, where do we put them? Organization
- And when we have too many books, choose which books to give away: Weeding
But there’s more to libraries than books! What else can you think of that are
in libraries?
- Movies
- Music
- Magazines
- Newspapers
- Puppets
- Computers
Ah, computers. That’s where I really spend a lot of time. When I was a little
boy, I was a voracious reader - I would read anything, including cereal boxes
and encyclopedias - and I was fascinated by computers. Completely obsessed.
So, naturally, when I went to high school I also got a job at the children’s
library as a "computer page". I was a big kid, and I helped all the little kids
use the computers at the library.
Now that I’m grown up, I’m still doing pretty much the same thing - except now
I’m helping the adults use computers. And I’m helping them by making it
easier for them to get the books or magazines or music or movies or puppets
(yes! puppets!) that they need; and a lot of the time, they don’t even have to
come to the library. They can read or watch or listen to whatever they need
right on their computers - and sometimes they need help, but that’s what I’m
there for.
Thursday, May 26. 2011
Since the announcement of the new v1 Google Books API, I've been doing a bit of work with it in Python (following up on my part of the conversation). Today, Google announced that many of their older APIs were now officially deprecated. Included in that list are the Google Books Data API and the Google Books JavaScript API. These APIs will be retired as of December 1, 2011. (Thanks to jgeerdes in the #googleapis IRC channel for the heads up today).
There already has been some outrage expressed over the switch to new APIs; five months is not a lot of time to shift gears if you've built a significant architecture on top of the old APIs. But I have some sympathy for Google, in this case; the new "Discovery" APIs are based on a common, consistent architecture that will be easier for them to document, maintain, manage, and ... monetize, of course. (Good time for full disclosure, I suppose: I am a Google stockholder.)
So far, the only major concern I have with the new v1 Google Books API is one missing function that was available in the Data API: the ability to do a full-text search of a custom bookshelf. Accordingly, I've filed a bug in the AJAX APIs issue tracker. Here's hoping that the deprecation of the old APIs enables Google to focus on their anointed APIs on all fronts: documentation, features, and support. Bug 587 should be a good testcase.