Wikimedia for public engagement

By Dr Martin Poulter, Wikimedian in Residence at the University of Oxford

A takin is a Himalayan goat-antelope with whom I feel a personal connection, and the reason goes back to an event I attended in 2011. The wildlife charity ARKive had allowed some of its descriptions of threatened species to be copied into Wikipedia. After presentations introducing both ARKive and Wikipedia, we split up the room. One table took birds, another took lizards, and I must have been among the mammals. After carefully reading what ARKive and Wikipedia said about the takin, I found a couple of sentences that could be copied from one to the other. Everyone in the room made a small but concrete improvement to their target Wikipedia article.

The trainer at the event, Andy Mabbett, thanked me afterwards with a message through Wikipedia. Making that change, and being recognised for it, connected me to the topic in a way that a film or lecture could not. Somehow the takin had become my endangered mammal. People had turned up with a general curiosity about threatened species, engaged with the question of how to describe a specific species, and had a positive experience with a peer-reviewed source – the ARKive site.

How do we create similar events where people are not just informed about a topic or a resource, but engage with it in a way that makes a lasting impression? Here are some suggested requirements for a public engagement event:

  • a collaborative task around a topic;
  • that requires thinking and reading, but not expertise, so anyone can take part;
  • that can be broken down into small chunks, identified beforehand;
  • that can be done in-person or remotely;
  • with a way to track individual contributions. We want to thank and reward contributors, and it’s also useful to assess the quality of their work. For a big, long-term project we might want something like a leader-board or a participation award.
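As a rough illustration of the last requirement, the sketch below tallies contributions per participant to produce a simple leader-board. The log format, usernames and article names are invented for the example; in practice the data would come from whatever record of contributions the chosen platform provides.

```python
from collections import Counter


def leaderboard(contributions, top=3):
    """Rank contributors by how many logged contributions they made."""
    counts = Counter(user for user, _target in contributions)
    # most_common() sorts by count, highest first
    return counts.most_common(top)


# Hypothetical log of (username, item improved) pairs
log = [
    ("Alice", "Takin"),
    ("Bob", "Kakapo"),
    ("Alice", "Komodo dragon"),
    ("Chen", "Saola"),
    ("Alice", "Okapi"),
    ("Bob", "Aye-aye"),
]

for user, edits in leaderboard(log):
    print(f"{user}: {edits}")
```

A count of edits says nothing about their quality, so a tally like this is best treated as a prompt for thanking people rather than as an assessment.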

Wikimedia platforms

Wikipedia and its sister projects are ideal platforms for meeting the above criteria. They

  • cover all academic subjects;
  • support collaboration between experts and non-experts;
  • have various tools to generate lists of “targets”: things that need improvement;
  • can be accessed by anyone with an internet connection;
  • have contributor records which publicly show what changes each user has made, even allowing ‘thank you’ messages for individual changes.

Perhaps most relevant is that Wikimedia resources do not exist in isolation but are derived from something else. A fact in Wikipedia or Wikidata needs to be backed up with a citation of a reliable source. A photo in Wikimedia Commons needs a description of where it was taken, or a citation of the collection it is drawn from. A transcribed text in Wikisource needs a pointer to the page scans that were transcribed, and ultimately to the physical copy of the book. So a Wikimedia event is always necessarily about a Wikimedia project and something else: a scholarly site or database, or physical exhibits, books or artworks.

Four Wikimedia projects hold distinct types of information about the same subject.

The best-established type of public event is a Wikipedia editathon, in which visitors are invited to write Wikipedia articles. Newcomer participants in editathons usually achieve little, because a lot of time and thought is needed to get to grips with Wikipedia’s interface, with Wikipedia’s culture and norms, and with the sources they will be using. Editathons can be very productive if participants are confident wiki editors, but that confidence does not come immediately. Fortunately, the Wikimedia projects offer simpler, less demanding ways for the public to engage with a subject.

An example: a Wikisource transcribe-a-thon

Looking to create events for the Ada Lovelace Bicentennial in 2015, I read about Mary Somerville, a 19th-century mathematician and scientist who tutored Lovelace and for whom Somerville College is named. I could find none of Somerville’s works in electronic form, but some were available as scanned documents in the Internet Archive. This suggested how we could engage an audience interested in women scientists.

Wikisource is a platform for sharing and connecting out-of-copyright or freely-licensed text. Wikipedia’s article about Lord Byron summarizes his life, with brief mentions or quotes from his work. Wikisource, on the other hand, has a brief description of who he was and the full text of many of his poems and other works. Naturally, the two profiles link to each other. Most works on Wikisource come from scanned books which have been put through Optical Character Recognition (OCR) and then manually fixed. Each page has to be checked and approved by at least two different users before it is considered “validated” and ready for public readers.

Attendees at the event were given a shortened URL for the transcription and each got a post-it note with the page number that they should fix. They adapted to Wikisource at different rates, but that worked out fine because the quicker, more confident people checked and tweaked the work of those who made slower progress. In two hours, we got through one paper and a large proportion of a book by Somerville. After some further checking, these texts were linked from the front page of Wikisource. Feedback on the event was very good: participants recognised they were doing something important; not just learning about Somerville but helping to republish her work. The “transcribe-a-thon” format has been repeated as a conference session.

A transcription project on Wikisource: pages are yellow when they’ve been approved by one user, and green for two users.
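The two-approver rule can be sketched in a few lines. This is only an illustration of the logic described above, not Wikisource's own code: the function names and the "proofread" and "not proofread" labels are chosen for the example, and the approver lists are made up.

```python
def page_status(approvers):
    """Return a status for one page, given the usernames who approved it."""
    distinct = set(approvers)  # the same user approving twice counts once
    if len(distinct) >= 2:
        return "validated"  # two different users: shown green
    if len(distinct) == 1:
        return "proofread"  # one user so far: shown yellow
    return "not proofread"


def validated_fraction(pages):
    """Fraction of pages in a work that have reached 'validated'."""
    if not pages:
        return 0.0
    done = sum(1 for approvers in pages if page_status(approvers) == "validated")
    return done / len(pages)


# Hypothetical four-page work: one list of approvers per page
work = [["Ann", "Raj"], ["Ann"], [], ["Raj", "Ann"]]
print([page_status(p) for p in work])
print(validated_fraction(work))
```

Tracking the fraction of validated pages during an event gives a quick sense of how much checking remains once the room has finished its first pass.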

A transcribe-a-thon needs some careful preparation in choosing the text, importing it into Wikisource and preparing it for transcription. The import process on Wikisource is documented, but not very intuitive, even for experienced wiki editors. Not all scans are suitable. If the images are poor quality, the OCR does not produce usable text, and if there is non-standard text such as mathematical formalism, transcription will be too difficult for newcomers. A little more work is necessary once all the pages are validated, to assemble them into a single work.

Photographs and WikiShootMe

Wikipedia articles about a place or building usually have a geographical point, defined by a latitude/longitude pair, attached to them in a machine-readable way. For example, the article about Oxford has coordinates that correspond to the central junction at Carfax. Wikidata, another sister project, has many more entries with locations, for items such as listed buildings. On some historic streets, almost every building has a Wikidata entry.

WikiShootMe is an online mapping tool that shows these articles and Wikidata items, colour-coded according to whether they have an image. It also allows users to upload images, but they need to register an account first on Wikimedia Commons. The images do not have to be professional quality, and photos taken with a smartphone or cheap digital camera are often suitable. As more images are uploaded, red dots disappear from the map.
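To make the idea concrete, here is a minimal sketch of the kind of filtering such a map performs: given items with coordinates and a flag for whether they already have an image, list those nearby that still need a photo. It is not WikiShootMe's implementation. The item names, coordinates and field names are invented, and distances use the standard haversine formula.

```python
import math


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def needs_photo(items, centre, radius_km):
    """Names of items within radius_km of centre that have no image yet."""
    lat0, lon0 = centre
    return [
        item["name"]
        for item in items
        if not item["has_image"]
        and distance_km(lat0, lon0, item["lat"], item["lon"]) <= radius_km
    ]


# Made-up items; coordinates are illustrative only
items = [
    {"name": "Building A", "lat": 51.7580, "lon": -1.2600, "has_image": False},
    {"name": "Building B", "lat": 51.7585, "lon": -1.2605, "has_image": True},
    {"name": "Building C", "lat": 51.9000, "lon": -1.3000, "has_image": False},
]
centre = (51.7582, -1.2602)
print(needs_photo(items, centre, radius_km=0.5))
```

A list like this could be printed and handed out before a photo walk, so each group knows which "red dots" it is responsible for.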

A WikiShootMe scan around the North end of St. Giles, Oxford

So for an event or campaign that gets people engaged with local history, public art, or architecture, the group can decide on places to photograph and describe, then go to the location, and either upload their images from home or return to a central computer room and transfer images from their devices.

A tip to monitor contributions: when uploading an image, users are prompted for image categories. If they all add the same category, then it is possible to track images uploaded with that tag using PetScan (explained below). However, categories are case-sensitive so you have to make sure people type the category tag exactly as instructed. Commons helps by colouring the text red if it does not correspond to an existing category.
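Because a single wrong capital letter sends an image to a different category, it can help to check what people typed. The sketch below is an illustrative helper, not part of Commons: it compares a typed category with the expected one (the category name is the one used in this post's example) and flags near misses that differ only in capitalisation. Ignoring stray spaces at either end is an assumption of the sketch.

```python
EXPECTED = "Buildings in Oxford"


def check_category(typed, expected=EXPECTED):
    """Classify what a participant typed in the Category box."""
    cleaned = typed.strip()  # assumption: surrounding spaces are harmless
    if cleaned == expected:
        return "ok"
    if cleaned.casefold() == expected.casefold():
        # Same letters, different case: a different category on Commons
        return "wrong capitalisation"
    return "different category"


for attempt in ["Buildings in Oxford", "buildings in oxford", "Oxford buildings"]:
    print(f"{attempt!r}: {check_category(attempt)}")
```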

Instructions to attendees:

  • Create an account on commons.wikimedia.org
    • A Wikipedia account will work if you already have one of those.
  • Open WikiShootMe in another tab
  • Click on the red dot on the map where your photo was taken
  • Press the button to ‘Authorise uploading’
  • Click ‘Allow’. This will permit WikiShootMe to accept your photo, and return you to the map.
  • Navigate through the map to the red dot again. This time when you click the dot, the button says ‘Upload image’
  • Select the image on the computer or device
  • Give the image a title, description (say what you’ve photographed, e.g. the address of the building) and date.
  • In the Category box, type “Buildings in Oxford” with that exact capitalization.

PetScan is a tool for customised queries. If you are running an event or campaign where people create or upload images to a given category, use this procedure to get an overview of their contributions.

  • Go to https://petscan.wmflabs.org/
  • Click ‘Commons’
  • Enter the category “Buildings in Oxford”
  • Select the Page properties tab and click the checkbox next to File.
  • Select the Output tab, then choose Sort by date, Sort order descending.
  • Click ‘Do it!’ and on the resulting page, bookmark the link ‘for the query you just ran’.

This gives you a list of images in the category, most recent additions first. Clicking on an entry in the list will take you to the full description of that file, including the user profile of the uploader.
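For anyone who prefers a script to a web form, a similar list can be requested from the MediaWiki Action API on Commons. This is a different route from PetScan, not a description of how PetScan works. The sketch only builds the request URL, using the API's `categorymembers` list; the function name and default limit are choices made for the example.

```python
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"


def category_files_url(category, limit=50):
    """Build a URL listing files in a Commons category, newest additions first."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmtype": "file",           # files only, no subcategories or pages
        "cmsort": "timestamp",      # order by when each file joined the category
        "cmdir": "desc",            # most recent first
        "cmprop": "title|timestamp",
        "cmlimit": limit,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


print(category_files_url("Buildings in Oxford"))
```

Opening the printed URL in a browser returns JSON with file titles and timestamps. Note that the timestamp reflects when a file was added to the category, which is usually, but not necessarily, when it was uploaded.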

Other quick ideas

  • Use a biographical source to add individual facts, such as universities attended or birthplaces, to Wikidata entries or Wikipedia articles.
  • Examine a free image source (e.g. Europeana’s World War I collection) and find Wikipedia articles that the images can illustrate.
  • Search through audio archives for short clips that can be uploaded to Commons and embedded in Wikipedia articles about a person or event.

This post is licensed under a CC-BY-SA 4.0 license

Value, metrics and action in publishing data

The funding community and other proponents of Open Science and Open Data have for some time been trying to persuade the mainstream research community to publish their data, with only partial success [1].

A key problem is that, although the arguments for doing so are logical – research becomes more reproducible, data can be cited and re-used, opportunities for cross-domain cooperation are increased, and so forth – concrete underlying evidence has until recently been in quite short supply, with a resulting lack of engagement from the wider research community.

It’s been possible to argue for a while that linking an open dataset to a primary publication is correlated with increased citation rates (of up to 30%) [2]. But this still doesn’t draw attention to the dataset itself. Researchers are busy and need to optimise their behaviour towards activities that will drive their research field, departments, institutions and personal career progress. To date, the proactive management, deposition and publication of their data has often simply not been a logical priority.

With Giving Researchers Credit for their Data we’re hoping to lower the barrier to action by automating and simplifying the process of submitting data papers to journals. The carrot of having a publishable, citable product at the end of the process is also part of the value proposition. And the proposition itself has been strengthened in recent weeks by the news of the data journal Earth System Science Data’s high citation rates. ESSD has been assigned an Impact Factor over 8, leapfrogging its primary research competitor titles to achieve a ranking of 2nd in Meteorology & Atmospheric Sciences, and 3rd in Geosciences, Multidisciplinary.

Whilst it can rightly be argued that the Impact Factor is a blunt instrument at best with which to measure the value of individual articles, this announcement does imply that researchers use and credit data papers in their work at levels comparable to, or exceeding, many traditional research articles (at least in Geosciences). Perhaps this development will lead to ‘write my data paper’ making its way on to the standard academic To Do list.

And that is certainly worth celebrating!

-Neil Jefferies (PI for Giving Researchers Credit for Their Data)

 

Giving Researchers Credit for their Data, funded as part of the Jisc Data Spring Initiative, aims to provide a button that can be added to a DataCite-compliant data repository which largely automates the process of data paper submission for an authenticated researcher. The project uses a cloud-based app and SWORD2-based APIs to link with multiple repositories and publishers, taking advantage of existing DataCite and ORCID metadata so that a paper can be automatically inserted into a publisher’s submission system without requiring any data re-entry by the author.
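The "no re-entry" idea can be illustrated with a small sketch: metadata the repository already holds is copied into a draft submission rather than typed again. Everything here is hypothetical, including the function name, the field names, the record and the identifiers; it does not reflect the project's actual app, the SWORD2 protocol, or any publisher's submission format.

```python
def draft_submission(dataset, orcid_id):
    """Pre-fill a data paper submission from metadata already on record."""
    return {
        "title": dataset["title"],
        "dataset_doi": dataset["doi"],
        "author": {"name": dataset["creator"], "orcid": orcid_id},
        "publication_year": dataset["publication_year"],
    }


# Hypothetical dataset record with placeholder identifiers
record = {
    "doi": "10.1234/example-dataset",
    "title": "Example sensor readings",
    "creator": "A. Researcher",
    "publication_year": 2016,
}

submission = draft_submission(record, orcid_id="0000-0000-0000-0000")
print(submission)
```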

References:
[1] Aleixandre-Benavent, R et al. Scientometrics (2016) 107: 1. doi:10.1007/s11192-016-1868-7
[2] Piwowar HA, Vision TJ. (2013) Data reuse and the open data citation advantage. PeerJ 1:e175 https://doi.org/10.7717/peerj.175. This analysis specifically concentrated on micro-array data.

Act on Acceptance reaches 1,250 deposits

The Act on Acceptance programme was launched on 1 October 2015. This programme is a mechanism allowing Oxford researchers to comply with the new mandated Open Access requirements set out by HEFCE for the next REF.

These requirements, which ask for a copy of the accepted manuscript to be deposited into an Institutional Repository within three months of acceptance, come into force on 1 April 2016. Launching on 1 October gives all researchers time to become familiar with the process before the requirements come into effect.
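As a worked example of the three-month window, the sketch below computes a latest deposit date from an acceptance date. It assumes "three months" means three calendar months, with the day clamped to the end of shorter months; how the policy counts the period in edge cases is not stated here, so treat the result as indicative.

```python
import calendar
from datetime import date


def deposit_deadline(accepted, months=3):
    """Date `months` calendar months after acceptance (day clamped to month end)."""
    month_index = accepted.month - 1 + months
    year = accepted.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(accepted.day, last_day))


print(deposit_deadline(date(2016, 4, 1)))
print(deposit_deadline(date(2016, 11, 30)))
```

Depositing at the point of acceptance, as the programme's name suggests, avoids having to track this date at all.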

The programme includes the Open Access team, Research Services, Symplectic Elements and ORA. To date we have had over 1,250 deposits, and the rate of deposits continues to climb.

ORA currently receives at least 30 deposits a day via Symplectic Elements. Follow our progress on Twitter with the “#actonacceptance” hashtag.

[Charts: divisional totals; divisional totals by type; running totals]

—Sarah Barkla

ORA-Data: managing research data

We’re pleased to announce that depositing in ORA-Data will now allow researchers and the University to comply with the EPSRC’s (Engineering and Physical Sciences Research Council) policy about research data management, which comes into effect from 1 May 2015.

The Bodleian Libraries recently launched a new service for the University, the Oxford Research Archive for Data (ORA-Data). A digital repository and catalogue for research data, ORA-Data offers a service to archive and enable the discovery, citation and sharing of data produced by researchers at Oxford.

ORA-Data is aimed especially at researchers who wish:

  • to deposit data that underpins publications
  • to deposit data that their funding body requires be preserved and made accessible
  • to add a record to the University’s catalogue of data

Any type of digital research data, from across all academic disciplines, may be deposited in ORA-Data, and we accept any file format. A permanent catalogue record is created for all data deposited in ORA-Data and a persistent unique identifier generated (a DOI, or Digital Object Identifier), which allows the dataset to be cited. Data and records will be discoverable through Google and other search engines, maximising the visibility and impact of the research. Researchers can also choose to set an embargo period for their files if they wish.

ORA-Data is running as a free pilot until July 2015. We’re keen for users to try it out, and would welcome any feedback to help us improve the service. ORA-Data can be accessed via the main ORA website: just select ‘Contribute’ followed by the ‘Data’ link. Our ‘How to deposit’ guide is available via our LibGuide and the ORA-Data team can be contacted at: ora@bodleian.ox.ac.uk – we would love to hear from anyone interested in finding out more.

—Amanda Flynn, Digital Scholarship Support Officer

ORA: more than you might think

ORA doesn’t just take work published in academic journals! We also include working papers, reports, Oxford doctoral theses and metadata records.

Our recent work with the Transport Studies Unit (TSU) is online.

We’ve been working with Computer Science to add another 5,000 records, including 1,300 full-text papers.

The working papers from the Oxford Institute for Energy Studies (OIES) as well as their publication the Oxford Energy Forum are in the final stages of being added.

Our Polonsky Foundation-funded theses digitization project is winding up and ORA already holds these records.

—Sarah Barkla