Photo by Jj Englert on Unsplash
Empowering Wikidata with Quickstatements 3.0: A Project Overview of My Outreachy Journey with Wikimedia
Quickstatements 3.0 in progress
Introduction
Hello! I’m excited for today. Over the past few weeks, I've been working as an Outreachy intern at the Wikimedia Foundation. Today, I'll share my experiences and help you understand the project I've been working on: Quickstatements 3.0. I will keep this engaging and not too technical.
What is Quickstatements?
Quickstatements is a tool used to batch edit Wikidata items. To truly understand what this means, you need to first understand what Wikidata is, the problem it’s trying to solve, and how Wikidata editing works. Trust me, it’s really easy and fun. Anyone can do this.
Let’s dive in.
Understanding Wikidata
If you are reading this, I’m certain that you have read an article on Wikipedia at some point. It’s estimated that there are over 62 million wiki pages containing a vast amount of knowledge. These articles are written and kept up to date by volunteers, people just like you.
Fun fact: Wikipedia is operated by the Wikimedia Foundation, a non-profit organization. It also oversees several other free knowledge projects like Wikimedia Commons, Wiktionary, and Wikibooks. The foundation’s mission is to make knowledge accessible to everyone while keeping it free.
Problems with Wikipedia
Let’s talk about Wikipedia for a bit. Wikipedia has some limitations that needed to be addressed, and addressing them gave birth to Wikidata.
Information on Wikipedia is not always up to date. For example, an article about a small English town might be translated into French, Russian, Albanian, and Croatian. Assume the article contains metrics such as the population of the town, names of the members of its council, and the name of the mayor. These articles may be correct at their time of creation and translation, but the data they contain is sure to change over time. How likely is it that they will be kept up to date 10 years later? Even if the English article is kept up to date, the Albanian version will probably not be updated as there is no trigger to update it whenever there are changes in the English article.
Even if a volunteer is dedicated to updating these articles, it is a hassle. Imagine that an article has 15 translations instead of five: the editor has to edit each of them by hand, one after the other. Wikipedia has millions of articles like this, each with numerous translations, and updating all of them manually does not scale. This is the first limitation.
Querying Wikipedia for information is not always possible, even when Wikipedia contains the desired information. For example, say you are writing an article and you need the names of all major English TV broadcasters whose fathers were footballers, or the names of all English footballers that have been capped by England whose surname starts with an “R.” The answers to these questions are contained in articles on Wikipedia, but querying for them is not exactly easy.
Not being able to effectively query the data we already have in Wikipedia is a serious limitation. Wikidata was created to solve these issues.
Now that we understand some limitations of Wikipedia, let’s look at Wikidata.
What is Wikidata?
Formally, Wikidata is a free and open knowledge base that acts as a central repository for structured data. It was launched by the Wikimedia Foundation in 2012 to support Wikipedia and other Wikimedia projects by providing a common source of data that can be used across multiple languages and platforms.
How Wikidata Solves These Problems
Wikidata is essentially a huge central store of data that Wikipedia and other Wikimedia projects pull from. When data is updated in the store, the change is automatically reflected in all linked Wikipedia articles and projects.
Think of Wikidata as a giant library in the heart of a big city. This library is structured, organized, and constantly updated with the latest information. Now, imagine that all the other Wikimedia projects, like Wikipedia, Wikimedia Commons, and Wiktionary, are different schools and research centers in the city. These institutions rely on the library to provide them with accurate and up-to-date information for their lessons and research.
Whenever a school needs information, it sends someone to the library to gather the necessary data. Similarly, Wikimedia projects pull data from Wikidata to ensure they have the most accurate and consistent information available.
Because Wikipedias in different languages all use Wikidata as a common data source, an update made once in Wikidata is automatically propagated to every linked article. This reduces the burden on editors and keeps information consistent and up to date across languages.
Wikidata has a Query Service that allows users to perform complex queries to extract specific information from the vast amount of data stored in Wikidata. This is mostly made possible because Wikidata uses a structured format to store data, hence its definition as “a central repository for structured data.”
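To make the idea of querying structured data a little more concrete, here is a minimal Python sketch that builds a query for the Wikidata Query Service. It is only an illustration, not an official recipe: the helper names `build_query` and `build_url` are made up for this post, and the identifiers I use (P106 for occupation, P27 for country of citizenship, Q937857 for association football player, Q145 for the United Kingdom) are my assumptions about the right IDs, so please check them on Wikidata before relying on them. The script only prints the query and a URL; it does not contact the service.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(occupation: str, citizenship: str, limit: int = 10) -> str:
    """Build a SPARQL query listing people with an occupation and a citizenship.

    Hypothetical helper for illustration. P106 is assumed to mean
    "occupation" and P27 "country of citizenship".
    """
    return f"""
SELECT ?person ?personLabel WHERE {{
  ?person wdt:P106 wd:{occupation} .
  ?person wdt:P27 wd:{citizenship} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}
""".strip()


def build_url(query: str) -> str:
    """Return a URL that asks the endpoint for JSON results (hypothetical helper)."""
    params = urlencode({"query": query, "format": "json"})
    return ENDPOINT + "?" + params


# Assumed IDs: Q937857 = association football player, Q145 = United Kingdom.
query = build_query("Q937857", "Q145")
print(query)
print(build_url(query))
```

You can paste the printed query into the Query Service's web editor to try it out. Narrowing it further, for example to surnames starting with "R", would need extra conditions that I have left out to keep the sketch short.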
We have now seen why Wikidata exists and how it solves the limitations of Wikipedia outlined earlier. Below is a short video that shows how Wikidata is edited manually.
Quickstatements 3.0
As mentioned earlier, Quickstatements is a tool used to batch edit Wikidata items. From the video above, one thing becomes quickly apparent: though editing Wikidata manually is relatively easy, it becomes repetitive, boring, and time-consuming when an editor has to edit hundreds of items, which is usually the case. Quickstatements was developed to partly automate this process and make editing more structured and seamless for editors.
Here’s a short video of how editing with Quickstatements works:
The tool allows editors to prepare the dataset to be edited and then carry out the edits as a batch. This is cool because it not only saves editors a significant amount of time but also provides a comprehensive record of their edits. The prepared datasets allow past edits to be revisited and referenced easily, ensuring that all changes are well-documented and traceable.
The Wiki Movement Brasil group is developing a new version of Quickstatements, and I have been privileged to work with them on it. An overview of my tasks includes:
Understanding the command syntax. Quickstatements has two command syntaxes: v1 command syntax and CSV command syntax.
Creating comprehensive datasets and testing the software extensively.
Writing comprehensive, novice-focused documentation for Quickstatements 3.0.
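To give a rough feel for the two command syntaxes named in the first task, here is a small Python sketch that writes the same edits both ways. Treat it as a sketch under assumptions rather than a reference: I am assuming the commonly documented forms, where a v1 command is one line of tab-separated fields (item, property, value) and a CSV batch has a header row starting with `qid` followed by one column per property. The helper names `to_v1` and `to_csv` and the sample statements are made up for illustration, and Quickstatements 3.0 may accept more than what is shown here.

```python
import csv
import io

# Hypothetical sample data: (item, property, value) triples.
# Q4115189 is the Wikidata Sandbox item, a safe place for test edits.
# The property and value IDs are illustrative only.
STATEMENTS = [
    ("Q4115189", "P31", "Q5"),
    ("Q4115189", "P27", "Q145"),
]


def to_v1(statements):
    """Assumed v1 form: one command per line, fields separated by tabs."""
    return "\n".join("\t".join(triple) for triple in statements)


def to_csv(statements):
    """Assumed CSV form: one row per item, header 'qid' plus one column per property."""
    properties = []  # property columns, in first-seen order
    by_item = {}     # item ID -> {property: value}
    for item, prop, value in statements:
        if prop not in properties:
            properties.append(prop)
        by_item.setdefault(item, {})[prop] = value

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["qid"] + properties)
    for item, values in by_item.items():
        writer.writerow([item] + [values.get(p, "") for p in properties])
    return buffer.getvalue()


print(to_v1(STATEMENTS))
print(to_csv(STATEMENTS))
```

Both outputs describe the same two statements. The v1 form repeats the item on every line, while the CSV form groups all properties of an item into a single row, which is often easier to prepare in a spreadsheet.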
This has been a non-technical overview of all the fun I’ve been having over the past few months. If you would love to become a volunteer and start editing Wikidata items, I’ve got you covered. The links below will get you started.