Localisation

All our projects use UTF-8 and have done so since the spring of 2005.

North Saami

Our web interface has a filter that lets users type c1, d1, s1, etc. instead of the correct Saami letters. Output to the user is in UTF-8.
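
A filter of this kind can be sketched in Python as a small substitution table. This is an illustration only, not the code behind the web interface: the name expand_digraphs and the table are hypothetical, the readings c1 = č, d1 = đ and s1 = š are assumed, and the letters hidden behind "etc" are left out.

```python
import re

# Assumed readings of the digit notation. Only c1, d1 and s1 are named in
# the text, so the table stops there; extend it to match the real filter.
DIGIT_NOTATION = {
    "c1": "č", "C1": "Č",
    "d1": "đ", "D1": "Đ",
    "s1": "š", "S1": "Š",
}

# One alternation over all keys, so the input is scanned only once.
_PATTERN = re.compile("|".join(re.escape(key) for key in DIGIT_NOTATION))


def expand_digraphs(text: str) -> str:
    """Replace every known letter+digit pair with its Saami letter."""
    return _PATTERN.sub(lambda match: DIGIT_NOTATION[match.group(0)], text)
```

With this table, expand_digraphs("c1uovga") returns "čuovga", and text without any letter+digit pair passes through unchanged.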

Lule Saami

The script spell-relax.regex lets the descriptive analyser accept ñ and ń as substitutes for ŋ.
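
The effect of such a relaxation can be approximated in Python by normalising the input before it reaches the analyser. This is a sketch under assumptions, not the contents of spell-relax.regex: the name relax_eng is hypothetical, the upper-case pairs are an addition, and a plain character mapping stands in for the regular expression that the analyser actually uses.

```python
import unicodedata

# ñ and ń are read as ŋ; the upper-case pairs are an assumption.
RELAX_TO_ENG = str.maketrans({"ñ": "ŋ", "ń": "ŋ", "Ñ": "Ŋ", "Ń": "Ŋ"})


def relax_eng(word: str) -> str:
    """Return the word with ñ/ń normalised to ŋ."""
    # Compose first (NFC), so that n + combining tilde or acute is seen
    # as a single letter and is caught by the mapping as well.
    composed = unicodedata.normalize("NFC", word)
    return composed.translate(RELAX_TO_ENG)
```

Because every ñ or ń is unambiguously a stand-in for ŋ here, a one-way mapping is enough; no alternative readings have to be kept.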

South Saami

The script spell-relax.regex is used for South Saami as well, but with a slightly different purpose: it accepts the widespread sloppy use of i for ï.
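
Unlike a one-way substitution, this relaxation is ambiguous: a typed i may be a genuine i or a stand-in for ï. The Python sketch below generates both readings for every i and keeps those found in a lexicon. The names relaxed_candidates, accept and TOY_LEXICON are hypothetical, and the set lookup is a stand-in for the analyser, which would handle this inside the transducer.

```python
from itertools import product


def relaxed_candidates(word: str) -> set[str]:
    """Return every spelling obtained by reading each i as either i or ï."""
    options = [("i", "ï") if char == "i" else (char,) for char in word]
    return {"".join(choice) for choice in product(*options)}


def accept(word: str, lexicon: set[str]) -> set[str]:
    """Return the lexicon entries that the typed word may stand for."""
    return relaxed_candidates(word) & lexicon


# Toy lexicon for illustration only; a real analyser would be queried instead.
TOY_LEXICON = {"gïele"}
```

Here accept("giele", TOY_LEXICON) returns {"gïele"}. The number of candidates doubles with every i in the word, which is acceptable for single words but is one reason to do this inside the transducer rather than by enumeration.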

Our other languages

Languages written in Cyrillic have their source code in Cyrillic. Languages written in both Latin and syllabic script (such as Plains Cree) have their source code in Latin script, with conversion transducers to and from syllabics.
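
The idea of a two-way script conversion can be sketched in Python with a lookup table and greedy longest match. This is not a finite-state transducer and not the project's converter: the names to_syllabics and to_latin are hypothetical, only eight syllables are covered, and the sketch assumes that the Unicode character names (for example CANADIAN SYLLABICS KA) match the Latin spelling, which holds for these plain syllables only.

```python
import unicodedata

# A handful of syllables, fetched by Unicode character name so the table
# cannot drift from the standard. Finals and long vowels are ignored here.
_SYLLABLES = ["A", "I", "KA", "KI", "MA", "MI", "NA", "NI"]
LATIN_TO_SYLLABIC = {
    name.lower(): unicodedata.lookup(f"CANADIAN SYLLABICS {name}")
    for name in _SYLLABLES
}
SYLLABIC_TO_LATIN = {syl: lat for lat, syl in LATIN_TO_SYLLABIC.items()}


def to_syllabics(text: str) -> str:
    """Convert Latin text to syllabics, trying two-letter syllables first."""
    out, pos = [], 0
    while pos < len(text):
        for width in (2, 1):
            chunk = text[pos:pos + width]
            if len(chunk) == width and chunk in LATIN_TO_SYLLABIC:
                out.append(LATIN_TO_SYLLABIC[chunk])
                pos += width
                break
        else:
            out.append(text[pos])  # pass unknown characters through
            pos += 1
    return "".join(out)


def to_latin(text: str) -> str:
    """Convert syllabics back to Latin, one character at a time."""
    return "".join(SYLLABIC_TO_LATIN.get(char, char) for char in text)
```

Within the covered syllables the two functions are inverses of each other, so to_latin(to_syllabics("nika")) gives back "nika".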

Note especially that for Iñupiaq we do not use the widespread 8-bit Interactive IñupiaQ Dictionary encoding, because it places the Iñupiaq characters in the ASCII range. A converter is available on the Iñupiaq page.
