World War I as Linked Open Data
Linked Data Finland
General Dataset Description
This dataset contains strictly quality-controlled rich information on events, actors and places related to the First World War. As such, it is meant to be used as a reference dataset to which other datasets (e.g. museum or library collections dealing with WW1 topics) can be linked.
The dataset is published using the CC-BY-SA 4.0 license, and is based mainly on joint work by the Semantic Computing Research Group at Aalto University and the University of Colorado Boulder (CU). For a general introduction to the dataset, the reader is referred to this article.
The dataset itself can be browsed here. A visual statistical overview can be accessed at http://ldf.fi/ww1lod/void/. Textual schema documentation generated from the dataset, on the other hand, can be viewed here. The dataset can be queried using SPARQL at http://ldf.fi/ww1lod/sparql.
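As a minimal sketch of how the endpoint might be queried from Python, the following builds a SPARQL 1.1 Protocol GET request using only the standard library. The query is a schema-agnostic illustration (it counts instances per class) and is not one of the published examples. The function names are hypothetical, and it is assumed that the endpoint can return the standard SPARQL JSON results format.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://ldf.fi/ww1lod/sparql"  # endpoint named in the text above

# Illustrative, schema-agnostic query: the ten most frequently used classes.
CLASS_COUNTS = """
SELECT ?type (COUNT(?s) AS ?n)
WHERE { ?s a ?type }
GROUP BY ?type
ORDER BY DESC(?n)
LIMIT 10
"""


def build_sparql_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a SPARQL 1.1 Protocol GET request."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Assumption: the endpoint honours this Accept header.
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_select(query, endpoint=ENDPOINT, timeout=30):
    """Send the query and return the result bindings as a list of dicts."""
    request = build_sparql_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["results"]["bindings"]
```

Calling `run_select(CLASS_COUNTS)` would send the query over the network; `build_sparql_request` alone can be used to inspect the URL and headers first.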
Sample Usage
- Applications:
- Example SPARQL queries:
- Works in the CU WW1 Collection related to themes that are also related to General Friedrich Adolf Julius von Bernhardi
- Items from Europeana, the CU WW1 Collection and Out of the Trenches relating to events that happened in West Flanders
- Units of the German 3rd Army ordered by the number of atrocities in Belgium in which they participated
- Population change in Belgian provinces during the war years as compared to the number of atrocities as well as total events that occurred there
Detailed Dataset Contents
Imperial War Museum (IWM) Toplevel Events
( View / Statistical Description / Schema Usage / Download )
To provide a useful common base, general events pertinent to the whole war were included. For this, an authoritative framework of 326 top-level wartime events was provided by the IWM’s First World War Centenary Partnership.
This event timeline was principally derived from the official British series on the history of the war, the History of the Great War Based on Official Documents, particularly:
Great Britain, Committee of Imperial Defence. Principal Events, 1914-1918. History of the Great War Based on Official Documents. London: HMSO, 1922.
Additional published works were used to verify dates and facts.
Information included: event name, description, date(s), and whether the event was military, naval, aviation, political, or social.
The primary schema used to model the data is the CIDOC-CRM.
Detailed source information:
Imperial War Museum
The IWM is often considered the premier cultural heritage institution in the English-speaking world relating to the war. Thus both historians and cultural heritage professionals consider the IWM’s vocabularies authoritative, and they are likely to be re-used by others who are preparing datasets in this subdomain.
Rich Events
( View / Statistical Description / Schema Usage / Download )
While authoritative, the IWM events did not contain place or actor information. To overcome this limitation, a separate catalogue of some 250 events selected by domain experts was built for richer description, including annotation of places, participating actors and temporal relationships.
The information entered is drawn from various sources, including approved terminologies from the Imperial War Museum and the British Army’s Battle Nomenclatures Committee, as well as a custom term list on Belgium and WW1.
These events were manually linked to the top-level events where appropriate, resulting in 46 owl:sameAs links. In addition, all events have been automatically linked to DBpedia, with a little over 100 owl:sameAs relationships. These latter links were validated by domain experts.
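The owl:sameAs links described above could be listed with a query along these lines. This is a sketch under assumptions: it presumes the links point from the local resource to the external one (the direction is not stated here), and the helper names are hypothetical. The second function reads bindings in the standard SPARQL JSON results format.

```python
def sameas_query(target_prefix):
    """Return a SELECT query for owl:sameAs links whose target IRI
    starts with target_prefix."""
    return (
        "PREFIX owl: <http://www.w3.org/2002/07/owl#>\n"
        "SELECT ?local ?external WHERE {\n"
        "  ?local owl:sameAs ?external .\n"
        f'  FILTER(STRSTARTS(STR(?external), "{target_prefix}"))\n'
        "}"
    )


def link_pairs(bindings):
    """Turn SPARQL JSON result bindings into (local, external) IRI pairs."""
    return [(b["local"]["value"], b["external"]["value"]) for b in bindings]
```

For example, `sameas_query("http://dbpedia.org/resource/")` produces a query restricted to DBpedia targets, which can then be submitted to the SPARQL endpoint.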
Information included: name, alternate names, description, agent, time of action, place of action, is contained in, contains, cause, effect, same as
The primary schema used to model the data is the CIDOC-CRM.
Detailed source information:
- “Indexing by Event: IWM Approved Termlist for First World War”.
- Great Britain, Battles Nomenclature Committee. Official Names of the Battles and Other Engagements Fought by the Military Forces of the British Empire during the Great War, 1914-1919. London: HMSO, 1921.
- Belgique et la Première Guerre mondiale: bibliographie. Edited by Patrick Lefevre. Brussels: Musée Royal de l'Armée, 1987-2001.
- New York Public Library. Subject Catalog of the World War I Collection. Boston: Hall, 1961.
Musée Royal de l'Armée et d’Histoire Militaire
The Musée Royal de l'Armée is considered the authority on matters relating to the war in Belgium. It published Patrick Lefevre's standard bibliography on this topic, from which a librarian and historian derived much of the term list on WWI Belgium. Historians specializing in WWI Belgium and France also reviewed and supplemented the term list.
Atrocity Events in Belgium
( View / Statistical Description / Schema Usage / Download )
WWI historians John Horne and Alan Kramer (Trinity College, Dublin) wrote the standard work on the “German atrocities” of 1914. With their permission, detailed data on these incidents in Belgium was sourced from Appendix 1 of their vast study:
Horne, John, and Alan Kramer. German Atrocities, 1914: A History of Denial. New Haven: Yale University Press, 2001.
Information included: name, agent, time of action, place of action, combat related, deportations, human shields used, panic, destroyed buildings, killing
The primary schema used to model the data is the CIDOC-CRM with additional properties created in the ww1lod-schema namespace.
German Army Structure
( View / Statistical Description / Schema Usage / Download )
Information on the naming and organization of the Imperial German army units mentioned in the dataset was derived from the following trusted reference source:
Tessin, Georg. Deutsche Verbände und Truppen. Osnabrück: Biblio-Verlag, 1974.
Information included: name, unit type, part of
The primary schema used to model the data is the CIDOC-CRM with additional properties created in the ww1lod-schema namespace.
Other Actors
( View / Statistical Description / Schema Usage / Download )
Other than the German Army structure, actor information has been input by CU domain specialists in conjunction with enriching the event network. During this work, actors have not only been linked to the events they participate in, but also to each other and the organizations they belong to.
Information included: name, alternate names, organizational information, relationship information
The primary schema used to model the data is the CIDOC-CRM. Additional vocabularies used for organizational and relationship information include the W3C organization ontology, the relationship ontology, FOAF and the schema.org vocabulary.
Geography of Belgium and France
( View / Statistical Description / Schema Usage / Download )
- Western Front place names from “Indexing by Geographical Keyword: IWM Approved Terminology for Northeastern France and Belgium”.
- Coordinate data was sourced from GeoNames, the major geographical hub in the LOD cloud, and verified by domain experts.
- Additional work by CU domain experts.
Information included: name, alternate names, part of, coordinates
The primary schema used to model the data is the CIDOC-CRM. Coordinate data is expressed using the W3C geo vocabulary.
Belgian Statistical Data for the War Years
( View / Statistical Description / Schema Usage / Download )
Statistics on the population of Belgian provinces during the war years were sourced from:
Belgium. Ministère de l’Intérieur et de l’Hygiène. Annuaire statistique de la Belgique et du Congo Belge, volume 46, Brussels, 1922
Information included: male and female population of each Belgian province for each of the war years
The statistics have been encoded using the W3C data cube vocabulary.
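Because Data Cube observations are plain RDF resources, they can be retrieved generically and regrouped on the client side. The following is a minimal sketch: only the qb:Observation class comes from the W3C vocabulary, while the dimension and measure properties of this particular dataset are not assumed and are simply returned as found. The function name is hypothetical, and the input is expected in the standard SPARQL JSON results format.

```python
from collections import defaultdict

# Generic query: every property/value pair of every observation.
OBSERVATIONS_QUERY = """
PREFIX qb: <http://purl.org/linked-data/cube#>
SELECT ?obs ?property ?value
WHERE { ?obs a qb:Observation ; ?property ?value . }
"""


def group_observations(bindings):
    """Collect the property/value pairs of each observation into one dict,
    keyed by the observation IRI. If a property repeats for the same
    observation, the last value wins."""
    observations = defaultdict(dict)
    for b in bindings:
        obs = b["obs"]["value"]
        observations[obs][b["property"]["value"]] = b["value"]["value"]
    return dict(observations)
```

The resulting dictionary has one entry per observation, which makes it straightforward to load into a table once the actual dimension properties have been looked up in the schema documentation.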
Wartime Polygons of Belgian Provinces
( View / Statistical Description / Schema Usage / Download )
Wartime boundaries for Belgian provinces were obtained from HISSTAT, a collaborative project of the Universities of Ghent, Brussels, and Louvain-la-Neuve, and the State Archives of Belgium. HISSTAT is developing a research infrastructure that brings together Belgian digital statistics and enables the creation of historical maps. Their geographies are highly accurate and extend down to the municipal level.
Information included: name of region, part of, polygon
The schemas used to model the data are the GeoRSS vocabulary for polygons and the W3C geo vocabulary for points.
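If the polygons are stored in the GeoRSS-Simple encoding (an assumption; the exact property and literal format should be checked against the schema documentation), each polygon is a whitespace-separated list of latitude/longitude pairs whose last pair repeats the first. A hypothetical parser for such strings might look like this:

```python
def parse_georss_polygon(text):
    """Parse a GeoRSS-Simple polygon string into a list of (lat, lon) tuples."""
    numbers = [float(token) for token in text.split()]
    if len(numbers) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    # Pair up consecutive values: latitude first, then longitude.
    points = list(zip(numbers[0::2], numbers[1::2]))
    if len(points) < 4 or points[0] != points[-1]:
        raise ValueError("a polygon needs at least four points and must be closed")
    return points
```

Note that the order is latitude first, then longitude, which is the reverse of what many mapping libraries expect.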
Wikipedia Event Timeline
( View / Statistical Description / Schema Usage / Download )
Wikipedia World War I event timeline as extracted by Jon Voss.
Information included: name of event, time, Wikipedia link
The primary schema used to model the data is the CIDOC-CRM.
Principal Events
( View / Statistical Description / Schema Usage / Download )
Timeline of principal events of the war, as automatically extracted from an OCR’d version of:
Great Britain, Committee of Imperial Defence. Principal Events, 1914-1918. History of the Great War Based on Official Documents. London: HMSO, 1922.
Note: this timeline contains many errors due to the automatic extraction.
Information included: name of event, time
The primary schema used to model the data is the CIDOC-CRM.
Dataset Links
In summary, the dataset contains the following links to external resources:
- 1248 place links to GeoNames
- 152 event and actor links to DBpedia
- 47 internal links between IWM top level events and rich events
- 29 event links to the Out of the Trenches project
- 7 event links to the Muninn project
Programmatic Use
Most of the access mechanisms to the dataset provide data in RDF when given suitable Accept headers. In particular:
- http://ldf.fi/ww1lod/sparql leads to the SPARQL Service Description and the extended VoID descriptions, which are also available at http://ldf.fi/ww1lod/void/
- All dataset URIs provide Linked Data browsing (e.g., http://ldf.fi/ww1lod/a74d369d)
- Complete graphs can be downloaded by their URIs (e.g., http://ldf.fi/ww1lod/iga/)
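The content negotiation described above can be sketched with the Python standard library as follows. The function names are hypothetical, and the choice of text/turtle is an assumption: the media types the server actually supports are not listed here, so another RDF serialization may be needed.

```python
import urllib.request


def rdf_request(uri, media_type="text/turtle"):
    """Build (but do not send) a GET request asking for an RDF serialization."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


def fetch_rdf(uri, media_type="text/turtle", timeout=30):
    """Fetch the URI and return the response body as text."""
    with urllib.request.urlopen(rdf_request(uri, media_type), timeout=timeout) as resp:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)
```

For instance, `fetch_rdf("http://ldf.fi/ww1lod/a74d369d")` would request the example resource mentioned above, and the same call with a graph URI such as http://ldf.fi/ww1lod/iga/ would request a complete graph.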



