CLARIN Knowledge Centre for the Languages of Sweden (SWELANG)
Welcome to SWELANG – the CLARIN Knowledge Centre for the Languages of Sweden. The knowledge centre is an information service offering advice on the use of digital language resources and tools for Swedish and other languages in Sweden, as well as other parts of the intangible cultural heritage of Sweden.
The centre is based at the Language Council of Sweden (Stockholm) and is run in cooperation with the other sections of the Institute of Language and Folklore (ISOF) in Uppsala and Gothenburg. The institute has a government mission to collect, preserve, process and disseminate scientific knowledge and material concerning the Swedish language, the national minority languages, Swedish Sign Language and Swedish dialects, as well as other parts of the intangible cultural heritage of Sweden.
SWE-CLARIN and the National Language Bank of Sweden
The SWELANG knowledge centre cooperates closely with the National Language Bank of Sweden (Nationella språkbanken) and SWE-CLARIN to make digital language resources available to scholars, researchers, students and citizen-scientists from all disciplines, especially in the humanities and social sciences. This work is carried out as part of CLARIN, the European research infrastructure for language resources and technology.
The National Language Bank is made up of three departments that work in close collaboration with a number of SWE-CLARIN centres, of which the SWELANG knowledge centre is one. One of the three departments, Språkbanken Sam (Society), is operated by the Institute of Language and Folklore (ISOF).
The SWELANG knowledge centre gives you information about and access to tools, data and services at Språkbanken Sam and other parts of the National Language Bank and SWE-CLARIN. More information about these resources can be found in the SWE-CLARIN catalogue.
Språkbanken Sam is developing the means to annotate, process and explore text and speech data in the archives through language technology-based tools, together with the other two departments: Språkbanken Text at the University of Gothenburg and Språkbanken Tal (Speech) at the Royal Institute of Technology (KTH).
Below, you will find more information about the resources and activities at Språkbanken Sam.
The National Language Bank and SWE-CLARIN are funded by the Swedish Research Council for 2018–2024 with EUR 1.5 million per year (grant 2017-00626).
Development of digital tools and services
Språkbanken Sam is developing digital methods for collecting (a) dialect and folklore data, e.g. through web-based questionnaires, crowdsourcing and automatic speech recognition, and (b) official texts and terminology in Swedish and minority languages, through web crawling and harvesting.
We are developing methods to manage and make available contextualized digital archive material through a map-based research interface called Digitalt kulturarv (Digital Cultural Heritage). A limited public version of Digitalt kulturarv called Sägenkartan (Map of Legends) can be accessed on the web (only in Swedish). A login version with broader content for researchers is in preparation.
We are also developing dictionary infrastructure to store and make available official terminology and dialect words in collaboration with Språkbanken Text. Dictionary content for more than 20 language pairs (from and to Swedish) can be downloaded from the SWE-CLARIN catalogue.
Focus on two kinds of language data
The data collected, processed and disseminated at Språkbanken Sam is primarily of two kinds:
- Official texts and terminology for research in official communication and social conditions. The material is multilingual, with parallel texts in Swedish and translations into easy-to-read/plain language, national minority languages, and other minority languages.
- Folk narratives, as well as other text and speech material from the dialect and folklore archives. The material consists of inventories, dialect word databases, letters, recordings, transcriptions, etc. It is of interest for both its content and its linguistic qualities, with great geographical, social and stylistic variation.
Interdisciplinary collaboration within the Tilltal project
In the Tilltal project, we examine how speech technology methods can make historical speech recordings more accessible to research, in cooperation between data holders, researchers, and speech and language technologists.
Three case studies examine how speech technology can be used to investigate issues within different subject fields (ethnology, linguistics and conversation analysis). An activity‐theoretical perspective is applied to investigate how the archive material is used for research.
Based on the needs of researchers, language technology solutions are proposed, and their practical usefulness is assessed through the case studies.
In a paper presented at CLARIN2017 we describe how we have worked to involve researchers and to collaborate across disciplines within the Tilltal project. In a follow-up interview we talk more about the importance of involving researchers in making language resources available for researchers. The presentation can be viewed online.
Tilltal is a collaborative project involving Språkbanken Sam, Språkbanken Tal, Riksarkivet and SWE-CLARIN. The project is funded by the Swedish Foundation for Humanities and Social Sciences for 2017–2020 (SAF16‐0917:1).
Collection of resources
List of online databases at ISOF (only in Swedish)
Sägenkartan – Map of legends (only in Swedish)