Detailed depictions with IIIF, Wikidata and Wikimedia Commons

Extract from “High Street Oxford.” Ashmolean Museum WA2016.48

The International Image Interoperability Framework (IIIF) is a standard, developed by a consortium including the Bodleian Libraries, that allows images and associated metadata to be shared across the web. It’s used by many sites including Digital Bodleian and Wikimedia’s image server, Wikimedia Commons.

As of November this year, Wikidata can point to the IIIF manifests associated with a digitised object (example near the foot of this page). However, the opportunity offered by combining Wikidata and IIIF is not just about making the IIIF data itself discoverable. IIIF includes the ability to address a specific rectangular region of an image with a URL. Wikidata can use this to express statements about part of an image.
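To make the region addressing concrete, here is a minimal sketch in Python that builds such a URL using the path pattern from the IIIF Image API (identifier, region, size, rotation, quality, format). The server address and image identifier in the usage note are placeholders, not real Ashmolean or Bodleian endpoints, and the function name is my own. The default size keyword "max" is an assumption about the server: some older IIIF servers expect "full" instead.

```python
from urllib.parse import quote


def iiif_region_url(base_url, identifier, x, y, width, height,
                    percent=False, size="max", rotation=0,
                    quality="default", fmt="jpg"):
    """Build a IIIF Image API URL for a rectangular region of an image.

    x, y, width and height are pixels by default; with percent=True they
    are percentages of the full image, written with the "pct:" prefix.
    """
    if width <= 0 or height <= 0:
        raise ValueError("region width and height must be positive")
    if x < 0 or y < 0:
        raise ValueError("region offsets must not be negative")
    coords = f"{x},{y},{width},{height}"
    region = f"pct:{coords}" if percent else coords
    # The identifier is percent-encoded so that any slashes inside it
    # do not get mistaken for path separators.
    return (f"{base_url.rstrip('/')}/{quote(identifier, safe='')}"
            f"/{region}/{size}/{rotation}/{quality}.{fmt}")
```

For example, iiif_region_url("https://iiif.example.org/image", "painting-1", 100, 200, 300, 400) gives a URL for the 300 by 400 pixel rectangle whose top-left corner is at (100, 200). A URL like this is the kind of value that can accompany a "depicts" statement to say which part of the image is meant.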

Anyone familiar with Turner’s “High Street, Oxford” will recognise several landmarks included in the scene. In this sense, there is a lot of structure in the image that is obvious to humans but not naturally captured in the painting’s digital representation (image + catalogue record). My mission, should I choose to accept it, is to express in open data not just that the painting depicts the Church of St. Mary the Virgin but that a specific part of the image depicts the church.

A global collection of astrolabes in linked open data

I previously wrote about how easy it is to describe a GLAM collection item in Wikidata: it’s quicker than writing a blog post in WordPress and the resulting data are endlessly reusable. This time I’ll go into more detail about using Wikidata’s interface to describe items from museum collections, and announce a new tool to browse the aggregated collection.

The Museum of the History of Science recently shared catalogue data about its outstanding collection of 165 astrolabes on Wikidata. Although Wikidata already had the power to describe astrolabes, very few had been entered, so this donation is a huge leap forward. If nothing comes to mind when I say “astrolabes”, here’s an image gallery generated by a query on Wikidata.

I’m going to take a random entry from David A. King’s “A Catalogue of Medieval Astronomical Instruments” and describe it in Wikidata. Having checked that it isn’t already there, I click “Create new item” on the left-hand side of any Wikidata page. First I’ll be asked for a name and one-line description in my chosen language.


Translating a blog post into structured data

Timur Beg Gurkhani (1336-1405) plays a small role in our story. Public domain image via Wikimedia Commons

Recently my Bodleian colleague Alasdair Watson posted an announcement about an illuminated manuscript that is newly available online. To get the most long-term value out of the announcement, I decided to express it as Linked Open Data by representing its content in Wikidata. This blog post goes through that process.

Some ways Wikidata can improve search and discovery

I have written in the past about how Wikidata enables entity-based browsing, but search is still necessary and it is worth considering how a semantic web database can be useful to a search engine index.

This post is about three ways Wikidata could help search and discovery applications, without replacing them: 1) providing more or less specific terms (hypernyms and hyponyms), 2) providing synonyms for a search term, 3) structuring a thesaurus of topics to provide meaningful connections. I end with the real-world example of Quora.com, which is using Wikidata to manage a huge user-generated topic list.
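As a rough illustration of the first two ideas, the sketch below builds a SPARQL query that asks Wikidata for the narrower terms of a topic together with their aliases. It relies on two things that do exist in Wikidata: the "subclass of" property (P279) and aliases exposed as skos:altLabel. The function name and the choice to return a query string are my own assumptions. The query is written for the Wikidata Query Service, which predefines the wd:, wdt:, rdfs: and skos: prefixes.

```python
import re


def narrower_terms_query(qid, language="en", limit=50):
    """Return a SPARQL query for the subclasses (hyponyms) of an item.

    Each result row carries the item's label and, where present, one of
    its aliases (synonyms) in the requested language.
    """
    if not re.fullmatch(r"Q[1-9][0-9]*", qid):
        raise ValueError(f"not a Wikidata item identifier: {qid!r}")
    if not re.fullmatch(r"[a-z]+(-[a-z]+)*", language):
        raise ValueError(f"unexpected language code: {language!r}")
    # wdt:P279* follows "subclass of" zero or more times, so the starting
    # item itself is included alongside all of its narrower classes.
    return f"""
SELECT ?item ?itemLabel ?alias WHERE {{
  ?item wdt:P279* wd:{qid} .
  ?item rdfs:label ?itemLabel .
  FILTER(LANG(?itemLabel) = "{language}")
  OPTIONAL {{
    ?item skos:altLabel ?alias .
    FILTER(LANG(?alias) = "{language}")
  }}
}}
LIMIT {int(limit)}
""".strip()
```

Swapping the direction of the path (wd:QID wdt:P279* ?item) would return broader terms instead. A search application could run a query like this once per topic and store the labels and aliases alongside its own index, rather than querying Wikidata for every search.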

Hypernyms and hyponyms

Continue reading

A Reconciliation Recipe for Wikidata

We have a list of names of things, plus some idea of what type of things they are, and we want to integrate them into a database. I have been working on place names in Chinese, but it could just as well have been a list of author names in Arabic. This post reports on a procedure to get Wikidata identifiers — and thereby lots of other useful information — about the things in the list.

To recap a couple of problems with names covered in a previous post:

  • Things share names. As covered previously, “cancer” names a disease, a constellation, an academic journal, a taxonomic term for crab, an astrological sign and a death metal band.
  • Things have multiple names. One place is known to English speakers as “Beijing”, “Peking” or “Peiping”. Similarly, there are multiple names for that place even within a single variant of Chinese.
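Both problems show up as soon as you look a name up: one name returns several candidate items, and one item can be reached through several names. As a minimal sketch of the lookup step (not necessarily the procedure used in the full post), the code below builds a request URL for the wbsearchentities action of Wikidata's API, which searches labels and aliases. The function name is hypothetical, and fetching and filtering the results is left out.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def wikidata_search_url(name, language="en", limit=10):
    """Build a wbsearchentities URL that looks a name up in Wikidata.

    The search covers labels and aliases in the given language, so
    variant names for the same thing can lead to the same item.
    """
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "type": "item",      # search items rather than properties
        "limit": limit,
        "format": "json",
    }
    # urlencode percent-encodes non-ASCII names such as Chinese place names.
    return f"{WIKIDATA_API}?{urlencode(params)}"
```

Calling wikidata_search_url("北京", language="zh") gives a URL whose JSON response lists candidate items with their identifiers and descriptions. Choosing among the candidates still needs the "what type of thing is it" knowledge mentioned above, for example keeping only candidates that are places.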

There are some problems specific to historic names for places in China: Continue reading

Deletion is not the end: making an academic article stick on Wikipedia

Identity fusion is a concept central to a lot of research in social psychology and cognitive anthropology. So it is understandable that a member of an anthropology research group wrote an explanation of this concept for Wikipedia, explaining the idea to the widest possible audience and citing the key papers.

Unfortunately, writing an article and getting it accepted by Wikipedia are different things. The draft was rejected multiple times and eventually deleted, removing hours of work. Many academics have at least heard of a similar experience and it can be very discouraging. However, these stories can have a happy ending. We were able to get the draft back and post it as an article where it became one of the top two search engine hits for its topic. This article is about that process, and what academics can do to make sure their articles are accepted by Wikipedia.

Semantic data and the stories we’re not telling

One of my earliest memories of television was James Burke’s series Connections. It was fascinating yet accessible: each episode explored technology, history, science and society, jumping across topics based on historical connections or charming coincidences. One episode started with the stone fireplace and ended with Concorde.

In a digital utopia, we would each be our own James Burke, creating and sharing intellectual journeys by following the connections that interest us. We are not there yet. Many very valuable databases exist online, but the connections between them are obscured rather than celebrated, and this is an obstacle for anyone using those data in education or research. In a previous post I described the problems that come from the fact that things have different names in different databases, and described a semantic web approach to link them together.

Building on this approach, web applications can help people create their own stories; choosing their own path through sources of reliable information, building unexpected connections. In this post I describe three design principles behind these applications. Let’s start with a story.


Creating Wikipedia articles from research data

Hillfort images shared on Wikimedia Commons

The Atlas of Hillforts of Britain and Ireland is a collaboration between the Universities of Oxford, Edinburgh and Cork, funded by the Arts and Humanities Research Council. It provides a definitive list of hillfort sites in the British Isles: more than four thousand in total. As well as publishing a lot of fieldwork done by expert archaeologists, the site uses crowdsourcing: some of the sites were visited by volunteer investigators. The site invites users, expert or amateur, to submit their own photographs of the hillforts.

The Atlas launched in June 2017 and generated national media coverage. An issue for any newly launched site is how to get incoming links from other sites; how to plumb the site into the existing paths by which people find information. This case study describes how, by sharing selected data from the Atlas, we were able to create thousands of incoming links from Wikipedia and related apps and sites, and to encourage the creation and use of hillfort articles in Wikipedia.