I’ve been experimenting with a way to show how Wikidata represents knowledge, specifically how it makes pathways out of relationships between things. In a previous post I wrote about how Wikidata’s representation enables new pathways between entities. Since those pathways link into a giant web, they offer new ways to discover existing collection objects. Now that I have been describing Oxford’s GLAM collections on Wikidata, we can show concrete examples of this expanding knowledge graph.
Normally with Wikidata we specify properties and get results that are identifiable things. For example, if we ask for “female historians born in the 1730s with a biography in Electronic Enlightenment”, we get Catherine Macaulay. Here I’m using queries that specify a group of things and request the properties connecting them. So we get a tiny fragment of the Wikidata knowledge graph (which right now has just over 54 million people, places, publications, objects and concepts). We can see how different kinds of data (biographical, bibliographic, and catalogue data) are combined in the same model. I’ve captured these graphs as screenshots, but I recommend clicking through to the live query where you get a draggable, stretchy graph.
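To make the "specify a group of things, request the properties connecting them" idea concrete, here is a minimal sketch in Python that builds such a query for the Wikidata Query Service. It is not the query used for the graphs in the post. The function names are made up for illustration, and the two example identifiers (Q42, Douglas Adams, and Q5, human) are chosen only because they are well known.

```python
import re
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
_QID = re.compile(r"Q[1-9][0-9]*")


def connecting_properties_query(qids):
    """Build a SPARQL query for direct statements linking the given items.

    Hypothetical helper: the group of things is fixed with VALUES, and the
    connecting property is left as a variable for the service to fill in.
    """
    qids = list(qids)
    bad = [q for q in qids if not _QID.fullmatch(q)]
    if bad:
        raise ValueError(f"not Wikidata item identifiers: {bad}")
    values = " ".join(f"wd:{q}" for q in qids)
    return (
        "SELECT ?item ?prop ?other WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        f"  VALUES ?other {{ {values} }}\n"
        "  ?item ?p ?other .\n"
        # Map the statement predicate back to its property entity.
        "  ?prop wikibase:directClaim ?p .\n"
        "}"
    )


def query_url(sparql):
    """Return a GET URL that asks the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


if __name__ == "__main__":
    print(connecting_properties_query(["Q42", "Q5"]))
```

The sketch only builds the query text and URL. Fetching the results, or drawing them as a graph, is left to whichever client or viewer you prefer.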
Extract from “High Street Oxford.” Ashmolean Museum WA2016.48
The International Image Interoperability Framework (IIIF) is a standard, developed by a consortium including the Bodleian Libraries, that allows images and associated metadata to be shared across the web. It’s used by many sites including Digital Bodleian and Wikimedia’s image server, Wikimedia Commons.
As of November this year, Wikidata can point to the IIIF manifests associated with a digitised object (example near the foot of this page). However, the opportunity offered by combining Wikidata and IIIF is not just about discoverability of the IIIF data itself. Included in IIIF is the ability to address a specific rectangular region of an image with a URL. Wikidata can use this to express statements about part of an image.
Anyone familiar with Turner’s “High Street, Oxford” will recognise several landmarks included in the scene. In this sense, there is a lot of structure in the image that is obvious to humans but not naturally captured in the painting’s digital representation (image + catalogue record). My mission, should I choose to accept it, is to express in open data not just that the painting depicts the Church of St. Mary the Virgin but that a specific part of the image depicts the church.
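As a rough illustration of addressing a rectangular region with a URL, the sketch below assembles a URL in the pattern defined by the IIIF Image API: identifier, region, size, rotation, then quality and format. The server address, image identifier and coordinates are invented for the example and do not refer to the Ashmolean's actual image service. The `max` size keyword is assumed to be supported, which holds for Image API 2.1 and 3.0, while older servers may expect `full`.

```python
from urllib.parse import quote


def iiif_region_url(base_url, identifier, x, y, width, height, *,
                    percent=False, size="max", rotation=0,
                    quality="default", fmt="jpg"):
    """Build an IIIF Image API URL for one rectangular region of an image.

    Hypothetical helper. The region is given in pixels, or in percentages
    of the full image when percent=True.
    """
    if width <= 0 or height <= 0:
        raise ValueError("region width and height must be positive")
    region = f"{x},{y},{width},{height}"
    if percent:
        region = "pct:" + region
    # Slashes inside the identifier must be percent-encoded.
    return (f"{base_url.rstrip('/')}/{quote(identifier, safe='')}"
            f"/{region}/{size}/{rotation}/{quality}.{fmt}")


if __name__ == "__main__":
    # Made-up server, identifier and coordinates, for illustration only.
    print(iiif_region_url("https://iiif.example.org/image",
                          "high-street-oxford", 1200, 300, 400, 650))
```

Percentage regions have the advantage of staying valid if the image is re-served at a different resolution, which matters when the region is recorded as data rather than used once.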
Wikidata identifiers (Q-numbers) for common objects. Public Domain image by Bleeptrack
Wikidata celebrated its sixth birthday on Monday, with celebrations, “data-thons” and cake around the world. Things move quickly in the world of Wikidata, so it’s time for a sequel to my round-up from earlier this year.
I previously wrote about how easy it is to describe a GLAM collection item in Wikidata: it’s quicker than writing a blog post in WordPress and the resulting data are endlessly reusable. This time I’ll go into more detail about using Wikidata’s interface to describe items from museum collections, and announce a new tool to browse the aggregated collection.
The Museum of the History of Science recently shared catalogue data about its outstanding collection of 165 astrolabes on Wikidata. Although Wikidata already had the power to describe astrolabes, very few had been entered, so this donation is a huge leap forward. If nothing comes to mind when I say “astrolabes”, here’s an image gallery generated by a query on Wikidata.
I’m going to take a random entry from David A. King’s “A Catalogue of Medieval Astronomical Instruments” and describe it in Wikidata. Having checked that it isn’t already there, I click “Create new item” on the left-hand side of any Wikidata page. At first I’ll be asked for a name and one-line description in my chosen language.
Wikidata is a very rapidly developing topic, and month by month it is becoming a more mainstream part of how library and other cultural data are curated and accessed. This is a round-up of some recent activities and publications.
Timur Beg Gurkhani (1336-1405) plays a small role in our story. Public domain image via Wikimedia Commons
Recently my Bodleian colleague Alasdair Watson posted an announcement about an illuminated manuscript that is newly available online. To get the most long-term value out of the announcement, I decided to express it as Linked Open Data by representing its content in Wikidata. This blog post goes through that process.
I have written in the past about how Wikidata enables entity-based browsing, but search is still necessary and it is worth considering how a semantic web database can be useful to a search engine index.
This post is about three ways Wikidata could help search and discovery applications, without replacing them: 1) providing more or less specific terms (hypernyms and hyponyms), 2) providing synonyms for a search term, 3) structuring a thesaurus of topics to provide meaningful connections. I end with the real-world example of Quora.com, which is using Wikidata to manage a huge user-generated topic list.
Hypernyms and hyponyms
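One plausible way to fetch broader and narrower terms from Wikidata is to follow the "subclass of" property (P279) in either direction. The sketch below builds such a query. It is an assumption about how this could be done, not necessarily the approach taken in the post, and the function name and `relation` keywords are invented for the example.

```python
import re

_QID = re.compile(r"Q[1-9][0-9]*")
_LANG = re.compile(r"[A-Za-z]+(-[A-Za-z0-9]+)*")


def related_terms_query(qid, relation="broader", language="en"):
    """Build a SPARQL query for hypernyms ("broader") or hyponyms ("narrower").

    Hypothetical helper: both directions follow "subclass of" (P279).
    """
    if not _QID.fullmatch(qid):
        raise ValueError(f"not a Wikidata item identifier: {qid!r}")
    if not _LANG.fullmatch(language):
        raise ValueError(f"not a language code: {language!r}")
    patterns = {
        "broader": f"wd:{qid} wdt:P279 ?term .",    # item -> its superclasses
        "narrower": f"?term wdt:P279 wd:{qid} .",   # subclasses -> item
    }
    if relation not in patterns:
        raise ValueError("relation must be 'broader' or 'narrower'")
    return (
        "SELECT ?term ?termLabel WHERE {\n"
        f"  {patterns[relation]}\n"
        # The label service supplies ?termLabel in the requested language.
        "  SERVICE wikibase:label { bd:serviceParam wikibase:language "
        f'"{language}" . }}\n'
        "}"
    )


if __name__ == "__main__":
    # Q146 is the item for the house cat.
    print(related_terms_query("Q146", "broader"))
```

Some items are linked to their broader class by "instance of" (P31) rather than "subclass of", so a real search application would probably need to check both.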
We have a list of names of things, plus some idea of what type of things they are, and we want to integrate them into a database. I have been working on place names in Chinese, but it could just as well have been a list of author names in Arabic. This post reports on a procedure to get Wikidata identifiers — and thereby lots of other useful information — about the things in the list.
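The lookup step can be illustrated with Wikidata's `wbsearchentities` API action, which returns candidate items for a name. This is a minimal sketch and an assumption on my part: the procedure reported in the post may well use a different tool. The helper names are invented, and the code only builds the request URL and reads an already-decoded JSON response, so fetching is left to the reader.

```python
from urllib.parse import urlencode

# MediaWiki API endpoint for Wikidata.
WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def search_url(name, language="en", limit=5):
    """Build a wbsearchentities request URL for one name (hypothetical helper)."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,  # language of the name being searched, e.g. "zh"
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


def candidates(response):
    """Extract (id, label, description) tuples from a decoded JSON response."""
    return [
        (hit["id"], hit.get("label", ""), hit.get("description", ""))
        for hit in response.get("search", [])
    ]


if __name__ == "__main__":
    print(search_url("Peking"))
```

Because things share names, the search usually returns several candidates. The descriptions, and the known type of each thing in the list, are what let you pick the right one.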
To recap a couple of problems with names covered in a previous post:
- Things share names. As covered previously, “cancer” names a disease, a constellation, an academic journal, a taxonomic term for crab, an astrological sign and a death metal band.
- Things have multiple names. One place is known to English speakers as “Beijing”, “Peking” or “Peiping”. Similarly, there are multiple names for that place even within a single variant of Chinese.
There are some problems specific to historic names for places in China:
Identity fusion is a concept central to a lot of research in social psychology and cognitive anthropology. So it is understandable that a member of an anthropology research group wrote an explanation of this concept for Wikipedia, explaining the idea to the widest possible audience and citing the key papers.
Unfortunately, writing an article and getting it accepted by Wikipedia are different things. The draft was rejected multiple times and eventually deleted, removing hours of work. Many academics have at least heard of a similar experience and it can be very discouraging. However, these stories can have a happy ending. We were able to get the draft back and post it as an article where it became one of the top two search engine hits for its topic. This article is about that process, and what academics can do to make sure their articles are accepted by Wikipedia.
One of my earliest memories of television was James Burke’s series Connections. It was fascinating yet accessible: each episode explored technology, history, science and society, jumping across topics based on historical connections or charming coincidences. One episode started with the stone fireplace and ended with Concorde.
In a digital utopia, we would each be our own James Burke, creating and sharing intellectual journeys by following the connections that interest us. We are not there yet. Many very valuable databases exist online, but the connections between them are obscured rather than celebrated, and this is an obstacle for anyone using those data in education or research. In a previous post I described the problems that come from the fact that things have different names in different databases, and described a semantic web approach to link them together.
Building on this approach, web applications can help people create their own stories, choosing their own path through sources of reliable information and building unexpected connections. In this post I describe three design principles behind these applications. Let’s start with a story.