Wikibase Resistance: a six-step process
As we explained last November, CegeSoma is coordinating an ambitious project, Wikibase-Resistance, aimed at creating an advanced research tool on people involved in resistance activities in Belgium during the Second World War.
At present, data from more than 72,000 personal files of resistance fighters kept by CegeSoma have been encoded. This progress has been made thanks to the efforts of a team of 15 people (volunteers and administrative staff), who dedicate one to three days a week to this task.
However, before this data can be uploaded, thereby making information on the actors of the Belgian resistance searchable online, several steps are still necessary. In fact, encoding the data is only the first of six steps. Let's go through them:
- The first step, encoding, consists of entering data on these resistance fighters (such as their name, date of birth or membership in a resistance network) into a computer file. This work is based on the forms and personal files contained in archives relating to the resistance, such as the Archives of the Intelligence and Action Services.
- The second step, verification, involves checking the quality and coherence of the data, in both form and content, taking into account the notes entered in the remarks column.
- The third step, alignment, consists of establishing links between the names of people or places and external databases (such as GeoNames for places or Wikidata for people), in order to limit ambiguities and enrich the data.
- The fourth step, deduplication, aims to identify, in a semi-automated way, whether several records refer to the same person.
- The fifth step, formatting, is a technical step that adapts the encoded data to the destination format.
- The sixth and final step, importing, relies on the use of tools to upload all the content onto the data storage and publishing platform.
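To give an idea of what "semi-automated" can mean in the deduplication step, here is a minimal sketch in Python. It is not the project's actual tooling: the field names (`name`, `birth_date`), the sample records and the similarity threshold of 0.85 are all assumptions made for illustration. The script only flags pairs of records that look alike; a person still decides whether they really refer to the same resistance fighter.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name):
    """Lower-case a name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def candidate_duplicates(records, threshold=0.85):
    """Return index pairs of records that may describe the same person.

    Two records are flagged when their birth dates are identical and their
    normalized names are at least `threshold` similar (hypothetical rule).
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        if a["birth_date"] != b["birth_date"]:
            continue  # different birth dates: not flagged by this rule
        similarity = SequenceMatcher(
            None, normalize(a["name"]), normalize(b["name"])
        ).ratio()
        if similarity >= threshold:
            pairs.append((i, j))
    return pairs


# Invented sample records, for illustration only.
records = [
    {"name": "Jean Dupont", "birth_date": "1915-03-02"},
    {"name": "Jean  Dupond", "birth_date": "1915-03-02"},
    {"name": "Marie Claes", "birth_date": "1920-07-11"},
]

# Each flagged pair would then be reviewed by hand.
flagged = candidate_duplicates(records)
print(flagged)
```

Comparing every record with every other one is fine for a small sample like this, but for tens of thousands of files one would probably first group records (for example by birth date) before comparing names.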
If you want to follow the evolution of this project, follow us on the CegeSoma Facebook page, where we will be posting details about the content, technical aspects and behind-the-scenes of the project in the coming months.