All our projects use UTF-8, and have done so since the spring of 2005.

North Saami

Our web interface has an input filter that lets users type c1, d1, s1, etc. instead of the correct Saami letters. Output to the user is in UTF-8.
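A minimal sketch of what such an input filter could look like, written in Python for illustration only. The text names the ASCII forms c1, d1 and s1 but not the letters they stand for, so the table below (č, đ, š) is an assumption, as are the names DIGRAPHS, expand_digraphs and to_utf8; the actual filter may work differently.

```python
# Hypothetical table: the ASCII forms come from the text above,
# the target letters are an assumption made for this sketch.
DIGRAPHS = {
    "c1": "\u010d",  # č
    "d1": "\u0111",  # đ
    "s1": "\u0161",  # š
    "C1": "\u010c",  # Č
    "D1": "\u0110",  # Đ
    "S1": "\u0160",  # Š
}


def expand_digraphs(text: str) -> str:
    """Replace every ASCII stand-in with the letter it represents."""
    for ascii_form, letter in DIGRAPHS.items():
        text = text.replace(ascii_form, letter)
    return text


def to_utf8(text: str) -> bytes:
    """Expand the stand-ins and encode the result as UTF-8 for output."""
    return expand_digraphs(text).encode("utf-8")
```

A plain search-and-replace like this also rewrites any genuine "c1" in the input, so a real filter would probably restrict itself to the query field where such strings cannot occur.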

Lule Saami

The script spell-relax.regex lets the descriptive analyser accept ñ and ń as substitutes for ŋ.
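The effect of this relaxation can be approximated in Python as a preprocessing step. This is only a sketch: spell-relax.regex itself is a regular expression applied to the analyser, not a Python function, and the function name relax_eng, the uppercase mappings and the NFC normalisation step are all assumptions added here.

```python
import unicodedata

# ñ and ń (and, as an assumption, their uppercase forms) are read as ŋ / Ŋ.
RELAXED_ENG = str.maketrans({
    "\u00f1": "\u014b",  # ñ -> ŋ
    "\u0144": "\u014b",  # ń -> ŋ
    "\u00d1": "\u014a",  # Ñ -> Ŋ
    "\u0143": "\u014a",  # Ń -> Ŋ
})


def relax_eng(word: str) -> str:
    """Map the substitute letters to ŋ before the word is analysed."""
    # Compose "n" + combining tilde/acute into a single code point first,
    # so that decomposed input is caught by the table as well.
    composed = unicodedata.normalize("NFC", word)
    return composed.translate(RELAXED_ENG)
```

Words that already use ŋ pass through unchanged, so the function can be applied to all input.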

South Saami

The script spell-relax.regex is used for South Saami as well, but with a slightly different purpose: it accepts the widespread sloppy use of i for ï.
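Unlike a fixed substitution, this relaxation is ambiguous: a written i may be a real i or may stand for ï. One way to picture it, as a Python sketch rather than the actual regular expression, is to generate every spelling in which each i is optionally read as ï and keep those that the lexicon knows. The names relaxed_candidates and accept, and the toy lexicon used in the example, are hypothetical.

```python
from itertools import product

I_DIAERESIS = "\u00ef"  # ï


def relaxed_candidates(word: str) -> list[str]:
    """Return every spelling obtained by reading each i as either i or ï."""
    options = [("i", I_DIAERESIS) if ch == "i" else (ch,) for ch in word]
    return ["".join(combo) for combo in product(*options)]


def accept(word: str, lexicon: set[str]) -> list[str]:
    """Return the candidate spellings of word that are found in lexicon."""
    return [cand for cand in relaxed_candidates(word) if cand in lexicon]
```

For example, with a toy lexicon containing only gïele, accept("giele", lexicon) yields the single form gïele. The number of candidates doubles with every i in the word, which is acceptable for single words but is one reason a finite-state implementation is the better fit in practice.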

Our other languages

Languages written in Cyrillic have their source code in Cyrillic. Languages written in both Latin and Syllabic script (such as Plains Cree) have their source code in Latin, with transducers that convert to and from Syllabic.

Note especially that for Iñupiaq we do not use the widespread 8-bit Interactive IñupiaQ Dictionary encoding, because it places the Iñupiaq characters in the ASCII range. There is a converter on the Iñupiaq page.