The texts for the text bank were collected and annotated by the Research Institute for the Languages of Finland and the Department of General Linguistics at the University of Helsinki. The texts were published between 1970 and 1997 and include electronic versions of books as well as major national newspapers and journals.
The texts are annotated according to the TEI (Text Encoding Initiative) P3 recommendations. The TEI is an application of SGML (Standard Generalized Markup Language) that recommends which textual features should be encoded (i.e., made explicit) in an electronic text, and how that encoding should be represented for loss-free, platform-independent interchange.
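To illustrate what "made explicit" means in practice, the following minimal Python sketch reads a small TEI-style fragment in which sentence boundaries are marked with <s> elements. The fragment, the class SentenceCollector, and the function extract_sentences are hypothetical and written for this illustration only; they are not taken from the text bank, whose actual tag set may differ. The standard-library HTMLParser is used here only as a lenient tokenizer for tag-like markup; it is not a validating SGML parser.

```python
from html.parser import HTMLParser

# Hypothetical TEI-style fragment (assumption: not taken from the text bank).
SAMPLE = """<body><div1 type="article">
<head>Esimerkki</head>
<p><s>Kissa nukkuu.</s> <s>Koira juoksee.</s></p>
</div1></body>"""


class SentenceCollector(HTMLParser):
    """Collect the text content of every <s> (sentence) element."""

    def __init__(self):
        super().__init__()
        self.sentences = []
        self._in_s = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        # Start buffering text when a sentence element opens.
        if tag == "s":
            self._in_s = True
            self._buf = []

    def handle_endtag(self, tag):
        # Store the buffered text when the sentence element closes.
        if tag == "s" and self._in_s:
            self.sentences.append("".join(self._buf).strip())
            self._in_s = False

    def handle_data(self, data):
        if self._in_s:
            self._buf.append(data)


def extract_sentences(markup):
    """Return the sentences marked up with <s> in the given fragment."""
    parser = SentenceCollector()
    parser.feed(markup)
    parser.close()
    return parser.sentences


if __name__ == "__main__":
    for sentence in extract_sentences(SAMPLE):
        print(sentence)
```

Because the sentence boundaries are encoded in the markup itself, a program does not have to guess them from punctuation, which is the kind of explicitness the TEI recommendations aim for.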
The next resource to be made available on the Language bank server will be the Oulu corpus of standard written Finnish of the 1960s. Large foreign resources, such as the British National Corpus, will also be licensed and made available on the Language bank server later.
CSC, the Center for Scientific Computing, is a service organization owned by the Finnish Ministry of Education. CSC offers researchers services in computing, databases, and the Funet network (the Finnish University and Research Network).