Social Basil

Basil pot tweeting every few hours (thanks to a Raspberry Pi)

Source code

Twitter account

Building web services for the new Mobile Oxford

We have begun building an entirely new version of Mobile Oxford to replace the old one, which had built up technical debt over the years and, with the rapid changes in smartphone technology, was no longer able to target our market appropriately (more details in a different blog post).

The new architecture is centered around an API providing data to a JS application (client). We decided to build our API over HTTP (RESTful-ish), serving JSON as our primary representation and architecting in such a way that it is easy to extend this to other representations.
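One way to keep additional representations cheap to add is a small registry of renderers keyed by media type. The sketch below is only an illustration of that idea, not the actual Mobile Oxford code: the function names, the text/plain renderer and the simplified handling of the Accept header (quality values are ignored) are all assumptions.

```python
import json


def render_json(resource):
    """Primary representation: the resource serialised as JSON."""
    return json.dumps(resource, sort_keys=True)


def render_text(resource):
    """Hypothetical second representation, added without touching the rest."""
    return "\n".join(f"{key}: {value}" for key, value in sorted(resource.items()))


# Registry of supported representations (assumption: one renderer per media type).
RENDERERS = {
    "application/json": render_json,
    "text/plain": render_text,
}
DEFAULT_MEDIA_TYPE = "application/json"


def represent(resource, accept_header=""):
    """Return (media_type, body) for the first listed media type we support.

    Simplification: q-values and wildcards are ignored, and JSON is the
    fallback when nothing in the header matches.
    """
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type in RENDERERS:
            return media_type, RENDERERS[media_type](resource)
    return DEFAULT_MEDIA_TYPE, RENDERERS[DEFAULT_MEDIA_TYPE](resource)


if __name__ == "__main__":
    print(represent({"name": "Example place"}, "text/plain"))
    print(represent({"name": "Example place"}, "text/html, */*"))
```

Supporting a new representation then means registering one more renderer; the code that builds the resource itself does not change.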

The API presents information about different domains in the university, some of which are already provided by the current Mobile Oxford and some of which are new. After analysing our use cases, we found that most of them were search problems, so we organised the API as a search solution, for example:

  • searching for people
  • searching libraries (books)
  • searching graduate courses
  • searching for places

Towards a generic API

The current version of Mobile Oxford is already consumed by other clients (such as the Blavatnik School of Government internal iPad application). As the system was not originally designed for this purpose, it provides a sub-optimal experience for those trying to integrate with it (a JSON output of a template context for an HTML page).

With that in mind, we decided that our API should be generic enough to be consumed by other applications as well, Mobile Oxford being its first consumer. As an aggregator and integrator of many disparate systems in the university, it makes sense for us to expose their data to others in a consistent, easy-to-use and vendor-agnostic form.

This of course raises some issues: we now have to carefully maintain a clear separation between the data and our client (as opposed to having a "private" API tailored for our client alongside a public one, which we currently do not feel the need for).

Providing links in our API

Because we want our API to be easy to use and understand, we investigated options for embedding qualified links between resources in our API (hypermedia). Various options are available (JSON-LD, OData, Collection+JSON...); we chose HAL (Hypertext Application Language) because it clearly separates the properties of a resource from its metadata (links), using a standard syntax that is easy to understand and not coupled to JSON.

You can visit api.m.ox.ac.uk for an overview of the API using the HAL browser, a developer-friendly browser for viewing resources and following links (similar to the Google API Explorer).

For example, our API provides Points of Interest (POI) around Oxford, some of which have real-time information. A client can follow the link rti to discover the associated resource. Similarly, it can get the parent POI by following the link parent.

Our code is open-sourced on GitHub as "Moxie". It is a work in progress, and we will publish more technical details in a different blog post soon.
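As a rough sketch of what following such links could look like on the client side, assuming a HAL+JSON representation with the standard _links object: the POI document below, its name property and all the hrefs are invented for illustration; only the link names rti and parent come from the API described here.

```python
import json

# Hypothetical HAL+JSON representation of a POI. The property and hrefs are
# made up; only the link relations "rti" and "parent" are taken from the post.
POI_DOCUMENT = """
{
  "name": "Example bus stop",
  "_links": {
    "self": {"href": "/places/example-stop"},
    "parent": {"href": "/places/example-area"},
    "rti": {"href": "/places/example-stop/rti"}
  }
}
"""


def link_href(resource, rel):
    """Return the href of link relation `rel`, or None if the link is absent."""
    link = resource.get("_links", {}).get(rel)
    if link is None:
        return None
    # HAL allows a relation to hold a single link object or a list of them;
    # this sketch simply takes the first one.
    if isinstance(link, list):
        if not link:
            return None
        link = link[0]
    return link.get("href")


if __name__ == "__main__":
    poi = json.loads(POI_DOCUMENT)
    print(link_href(poi, "rti"))     # where to fetch real-time information
    print(link_href(poi, "parent"))  # where to fetch the parent POI
```

The client only needs to know the link names, not how the URLs are built, which is the main benefit of exposing links in the representation.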


Cross-posted from the Mobile Oxford blog

Building lightweight HTTP services in Java

Our telecoms project requires access to a SOAP web service called Cisco "CUCM" Administrative XML, used to manage Cisco IP phones. As our project is a Django (Python) application, we used Suds, the de facto SOAP library for Python.

Unfortunately, we encountered many problems, as the CUCM service is composed of a lot of custom types, and we reached a point where Suds was not good enough (short of switching to a customised version of Suds, such as the one discussed here). Having difficulty finding an alternative in Python, we began to look at alternatives on other platforms.

Java comes to the rescue

The Java platform is well known for working with SOAP web services, and the tooling seemed to be appropriate and free to use. Generating a usable client to consume the SOAP web service is very easy (using wsimport), but the main question was: how do we expose the web service in a better way?

Enter Dropwizard, a Java framework for developing HTTP services that is quite popular (it was originally developed by Yammer) and very easy to learn. It is very pragmatic and contains everything needed out of the box (configuration, logging, deployment) to build HTTP/JSON services.

After having spent days trying to access the web service in Python, we got an initial working version in a matter of hours. We now expose only the entities and methods we need, ready to be consumed by our front-end Django application.

The component is really easy to package and deploy: Dropwizard uses a shaded ("fat") JAR containing the Jetty web server, which can be run directly.

A good tradeoff?

Although it doesn't seem ideal at first sight, introducing this middleware seems to be a good tradeoff: it also gives us a better separation of concerns and lets us easily move part of the service to a different machine if need be. Having a separate component also means that we can open-source it so that it can potentially be reused.

We are quite happy with the result, as we have been able to develop an efficient service quite rapidly. You can see the result of our work on GitHub; this project is actively developed and may change in the near future.

Dropwizard appears to be part of a growing trend away from the J2EE days; the following article is an interesting read: J2EE is dead.


Cross-posted from the Mobile Oxford blog

Kindle Clippings to RDF

A simple script to convert Kindle clippings into structured RDF/XML.
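To give an idea of the approach, here is a minimal sketch rather than the script itself. It assumes the clippings file consists of entries separated by a line of ten "=" characters, each with a title line, a metadata line and then the clipped text; the sample data is invented, and the choice of Dublin Core properties (dc:source, dc:description) is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
SEPARATOR = "=========="  # assumed entry separator in the clippings file

# Invented sample in the assumed layout: title, metadata, blank line, text.
SAMPLE = """Example Book (Example Author)
- Highlight Loc. 10-11 | Added on Monday, January 3, 2011, 09:00 PM

A highlighted sentence.
==========
Example Book (Example Author)
- Note Loc. 42 | Added on Tuesday, January 4, 2011, 08:30 PM

A short note.
==========
"""


def parse_clippings(text):
    """Split raw text into dicts with 'source', 'meta' and 'content' keys."""
    clippings = []
    for chunk in text.split(SEPARATOR):
        lines = [line.strip() for line in chunk.strip().splitlines()]
        if len(lines) < 2:
            continue  # skip empty or malformed chunks
        clippings.append({
            "source": lines[0],
            "meta": lines[1].lstrip("- "),
            "content": "\n".join(lines[2:]).strip(),
        })
    return clippings


def to_rdf_xml(clippings):
    """Serialise clippings as RDF/XML, one blank-node description each."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    for clipping in clippings:
        node = ET.SubElement(root, f"{{{RDF_NS}}}Description")
        ET.SubElement(node, f"{{{DC_NS}}}source").text = clipping["source"]
        ET.SubElement(node, f"{{{DC_NS}}}description").text = clipping["content"]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_rdf_xml(parse_clippings(SAMPLE)))
```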

Source code

Thoughts on Kindle

I have been using a Kindle for a few months, and I am just starting to enjoy it.

What I like:

  • Integrated dictionary: getting a word's definition just by selecting it is GREAT!
  • Being able to read articles from the internet "offline", thanks to the great Instapaper service (select the articles that you want to read with a bookmarklet, then generate a mobi file to put on your Kindle).
  • Kindle is not just an e-reader, it is a platform. The latest version of the software and its "social" features (such as following "friends'" favorite quotes or reading activities) look like a good start.

What could be improved:

  • Playing with "named entities" (both of the following are possible given the current state of the art in named entity recognition):
      • Same principle as the dictionary: open an encyclopedia (or Wikipedia) entry when the selection is not a dictionary word (e.g. display contextual information for place names, people...)
      • Contextual navigation in the text itself (e.g. display information about characters. By the way, "Extracting social networks from literary fiction" is a very interesting paper!)
  • Exporting annotations: it would be great to be able to export clippings (highlights and notes) to a web service (or at least in an easier way than doing some not very beautiful things...)
  • Layout and typography: sometimes it gets horrible.
  • PDF reader! Reading research papers in two columns is nearly impossible (someone tried to create a converter to ebook, but it doesn't work very well). This raises the question of the separation of content and format, also addressed in a very interesting article, "Taking scientific publishing to the next level".
  • How to universally define a position in a text? (Amazon is introducing "real page numbers" on its Kindle, but I'm not sure of their value, as you can change the font, size and screen orientation...). I have been interested in that question for a long time, but I couldn't find any relevant literature on the topic...

Very interesting post by O'Reilly Radar too: "the future of the book".