swell-release-v1

SweLL-release

Final documentation for the SweLL corpora (as of August 2021)

Online version: https://spraakbanken.github.io/swell-release-v1/

Procedure for providing access to the SweLL corpora

If approved:


Dear XXX,

Thank you for your interest in the SweLL data!

You should by now have received

(1) an email invitation from “SBX” (or some variation of that name) to access the data in the folder “SweLL_release_v1”. If not, please check your Junk folder.

(2) an email invitation to log in to Korp. To search the available learner corpora in Korp, please see our webpage: https://spraakbanken.gu.se/en/projects/swell/l2korp

This access is personal and should not be shared with others.

Happy exploring,

SweLL team (swell@svenska.gu.se)


.zip files for download

Users whose access application has been approved will get access to the following two .zip files:

SweLL-pilot.zip contains

  1. folders for the TISUS, SW1203 and SpIn subcorpora in the following formats:
    • json (SVALA format)
    • xml (Korp format)
    • xml with linguistic annotations (Korp format)
    • raw text
  2. metadata in an Excel file, ordered by essay ID, with one spreadsheet per subcorpus
  3. metadata descriptions as pdf files
  4. readme file with links to the metadata descriptions for each subcorpus and links to articles: https://spraakbanken.github.io/swell-release-v1/Readme-SweLL-pilot

SweLL-gold.zip contains

  1. SweLL-gold corpus files (502 essays) in the following formats:
    • json (SVALA format)
    • two xml files in Korp format, one for the original version and one for the normalized version
    • two linguistically annotated xml files in Korp format, one for the original version and one for the normalized version
    • two files with raw text, one for the original version and one for the normalized version
  2. metadata in an Excel file, ordered by essay ID
  3. metadata description in pdf
  4. readme file with links to the metadata description and links to articles: https://spraakbanken.github.io/swell-release-v1/Readme-SweLL-gold
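
After downloading, it can help to check what an archive actually contains before unpacking it. The following is a minimal sketch, not part of the official release, that groups the members of a .zip file by file extension using only the Python standard library. The function name `group_members_by_extension` is hypothetical, and the sketch makes no assumption about the folder or file names inside the archives, which are not listed here.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_members_by_extension(zip_path):
    """Map each file extension in the archive to a sorted list of member names.

    `zip_path` may be a path to a .zip file or an open binary file object.
    The archive is only read; nothing is extracted to disk.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip folder entries, keep files only
            # Member names inside a zip always use forward slashes.
            suffix = PurePosixPath(info.filename).suffix.lower()
            groups[suffix or "(no extension)"].append(info.filename)
    return {ext: sorted(names) for ext, names in groups.items()}
```

Calling `group_members_by_extension("SweLL-gold.zip")` on a local copy of the archive (the path is an assumption about where the file was saved) returns a dictionary that can be compared against the formats listed above, for example by checking that entries for `.json` and `.xml` are present.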

ReadMe files for corpus users

Metadata description files