Since the 18th century, mountaineering has developed into an activity that historians regard as extremely important for how the so-called Western world views concepts such as conquest, human achievement and wilderness. Mountains, and alpinism in particular, are significant in Europe, as are the discourses about them. In mountaineering especially, talking and writing about the activity are central to the activity itself.
The project Semantics for Mountaineering History (Sem4Hist, SEMOHI) aims to semantically enrich the Alpenwort corpus by identifying and tagging:
- places like mountains, regions, paths or huts
- persons like mountaineers, guides or scientists, and
- first ascent events, such as the first ascent of the Großvenediger by Josef Schwab in 1841
The project uses Alpenwort (2014–2016), a linguistically annotated corpus of the journal of the Austrian and German Alpine Association that was created in the predecessor project of the same name.
What does semantically enriched mean?
Semantic enrichment makes it possible to ask interesting questions about texts, such as:
- Which persons are mentioned in connection with the Venediger Group up to the year 1914?
- Which places were described between 1935 and 1960?
- What were the first ascents in the Julian Alps?
In order to answer such questions, vocabularies have to be created for places and persons that are important for the history of alpinism.
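A minimal sketch of how tagged mentions could answer a question of this kind. The `Mention` record, its fields, the text identifiers and the sample years are assumptions made for illustration; they are not the project's data model or actual corpus data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One tagged occurrence of a vocabulary entry in a corpus text."""
    entity: str   # label of the vocabulary entry
    kind: str     # "person" or "place"
    year: int     # publication year of the text (invented here)
    text_id: str  # hypothetical identifier of the text


# Invented sample annotations, for illustration only.
MENTIONS = [
    Mention("Josef Schwab", "person", 1891, "t1"),
    Mention("Großvenediger", "place", 1891, "t1"),
    Mention("Person B", "person", 1920, "t2"),
    Mention("Großvenediger", "place", 1920, "t2"),
    Mention("Person C", "person", 1900, "t3"),
    Mention("Julian Alps", "place", 1900, "t3"),
]


def persons_mentioned_with(place, until_year, mentions):
    """Return persons tagged in texts that also tag `place`, up to `until_year`."""
    # Step 1: collect the texts that mention the place within the time limit.
    texts = {
        m.text_id
        for m in mentions
        if m.kind == "place" and m.entity == place and m.year <= until_year
    }
    # Step 2: collect the persons tagged in exactly those texts.
    return sorted(
        {m.entity for m in mentions if m.kind == "person" and m.text_id in texts}
    )


print(persons_mentioned_with("Großvenediger", 1914, MENTIONS))
```

Such a query only works because persons and places are tagged as entities of a known kind; a plain full-text search has no way of knowing that a given string is a person's name.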
An important source for places has been prepared and will be made publicly accessible for the first time by this project: the index of photographs of the PES from 1927 to 1941, which contains descriptions of about 32,000 photographs, about 85% of which relate to some kind of place. In addition, sources that are already freely available on the Internet will be used to create vocabularies for places and persons. Links to Internet sources form a network of information that can be explored (keyword: Linked Open Data). Our vocabularies integrate these sources and thus refer to the related information on the Internet. In a further step, mentions of specific persons and places in the Alpenwort corpus are identified and matched to vocabulary entries, which links the places and persons on the Internet to the corpus texts that describe them.
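The linking step described above might look roughly like the following sketch, which uses simple dictionary lookup. The entry identifiers, the `same_as` field and the `example.org` links are placeholders chosen for illustration, not real project URIs or external records, and the project's actual recognition workflow may work differently.

```python
import re

# Hypothetical vocabulary: identifiers and example.org links are placeholders.
VOCABULARY = {
    "place/grossvenediger": {
        "labels": ["Großvenediger"],
        "same_as": ["https://example.org/external/grossvenediger"],
    },
    "person/josef-schwab": {
        "labels": ["Josef Schwab"],
        "same_as": ["https://example.org/external/josef-schwab"],
    },
}


def link_mentions(text, vocabulary):
    """Find vocabulary labels in `text`; return sorted (start, end, entry_id) tuples."""
    found = []
    for entry_id, entry in vocabulary.items():
        for label in entry["labels"]:
            # \b restricts matches to whole words; re.escape guards special characters.
            pattern = r"\b" + re.escape(label) + r"\b"
            for match in re.finditer(pattern, text):
                found.append((match.start(), match.end(), entry_id))
    return sorted(found)


sentence = "Josef Schwab reached the summit of the Großvenediger in 1841."
for start, end, entry_id in link_mentions(sentence, VOCABULARY):
    # Each hit connects a span of corpus text to a vocabulary entry and,
    # through `same_as`, to related information elsewhere on the Internet.
    print(sentence[start:end], "->", entry_id, VOCABULARY[entry_id]["same_as"])
```

Exact label matching is the simplest possible approach; it does not handle spelling variants, inflected forms or ambiguous names, which a real recognition and linking workflow would have to address.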
International standards and technologies from the Semantic Web community are used to generate our project data, which will be made available to researchers and the interested public.
The methodology developed in the project will make it possible to enrich and query text corpora in ways that go beyond traditional full-text search and even corpus search. The resulting vocabularies and the semantically enriched Alpenwort corpus will serve as sources for research and for the further enrichment of other sources, thus demonstrating the potential of the Digital Humanities.
The project objectives in brief:
- Identification of place and person names in the sources
- Workflow for automated place and person name recognition (Named Entity Recognition – NER / Named Entity Linking – NEL)
- Identification of first ascents in the Alpenwort corpus
- Semantic annotation and representation of first ascents, places and persons in machine readable form
- Visualization of the locations given in the sources in a spatial and temporal context
- Open access/online dissemination
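To illustrate what a machine-readable representation of a first ascent could look like, the sketch below describes the event as subject-predicate-object statements and serializes them in N-Triples syntax. The `https://example.org/semohi/` namespace and the property names `place`, `climber` and `year` are assumptions made for this example; the vocabulary actually used by the project is not specified here.

```python
EX = "https://example.org/semohi/"  # placeholder namespace, not the project's
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_GYEAR = "http://www.w3.org/2001/XMLSchema#gYear"


def first_ascent_triples(event_id, place_id, person_id, year):
    """Describe one first ascent as (subject, predicate, object) triples."""
    event = f"<{EX}event/{event_id}>"
    return [
        (event, f"<{RDF_TYPE}>", f"<{EX}FirstAscent>"),
        (event, f"<{EX}place>", f"<{EX}place/{place_id}>"),
        (event, f"<{EX}climber>", f"<{EX}person/{person_id}>"),
        # The year is a typed literal rather than a link to another resource.
        (event, f"<{EX}year>", f'"{year}"^^<{XSD_GYEAR}>'),
    ]


def to_ntriples(triples):
    """Serialize triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# The example from the text: Großvenediger, Josef Schwab, 1841.
triples = first_ascent_triples(
    "grossvenediger-1841", "grossvenediger", "josef-schwab", 1841
)
print(to_ntriples(triples))
```

Because the place and the person are referenced by identifier rather than by name, the same event can be reached from either vocabulary entry and combined with data from other sources.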