Documentation on North Saami

Contents:

Source file documentation
Using the analysers
Projects involving North Saami
Tags used for analysis
Discussions on improving our linguistic analysis
Morphophonology, morphology and syntax
Pre- and postprocessing
Normativity issues
Speller optimisations
Obsolete test reports, for reference

Source file documentation

Documentation written in the source files
The source files themselves: stems / affixes / twolc / IPA / syntax

Using the analysers

In the terminal: analyse words with the command usme, generate word forms with dsme
Generation of: paradigms / text
For more info, see How to use the morphological parsers

Projects involving North Saami

Tags used for analysis

Discussions on improving our linguistic analysis

Morphophonology, morphology and syntax

Documentation of the twol-sme.txt rule file
Documentation of the lexicon files
The use of flag diacritics
Documentation of the disambiguation file (partly obsolete)
Syntax regression testing: run sh test/src/syntax/disambiguation_developertest.sh (you may need to adjust the path following $GTBIG; the files are in $GTBIG/gt/sme/corp)
See also the general disambiguation page.

Pre- and postprocessing

Documentation of the preprocessing of running text
1. The perl-based preprocess script, our current preprocessor
2. For reference: documentation of the old xfst-based preprocessor, tok.txt
Documentation of inituppercase.regex (initial capitalisation) and allcaps.xfst (words written in all caps). Note: the latter is not in use at present.
Translating from xerox-style to vislcg3-style is done with the script lookup2cg
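
To illustrate what the translation from xerox-style to vislcg3-style involves, here is a minimal Python sketch. It is not the lookup2cg script itself: the function name lookup_to_cg is hypothetical, the input is assumed to be tab-separated "wordform<TAB>lemma+Tag+Tag" lines with a blank line between words, and the tags in the sample are only illustrative. Cases such as unknown words, compound boundaries, or a literal "+" in a lemma are not handled.

```python
def lookup_to_cg(lookup_output):
    """Convert xerox-style lookup lines to vislcg3-style cohorts.

    Assumed input: one "wordform<TAB>lemma+Tag+Tag" line per analysis,
    with a blank line separating the analyses of different words.
    """
    lines_out = []
    new_cohort = True
    for line in lookup_output.splitlines():
        if not line.strip():
            # A blank line ends the current word's analyses.
            new_cohort = True
            continue
        wordform, analysis = line.split("\t", 1)
        if new_cohort:
            # The cohort header carries the surface form.
            lines_out.append('"<%s>"' % wordform)
            new_cohort = False
        # The first "+"-separated field is taken as the lemma, the rest as tags.
        lemma, *tags = analysis.split("+")
        reading = '\t"%s" %s' % (lemma, " ".join(tags))
        lines_out.append(reading.rstrip())
    return "\n".join(lines_out)


if __name__ == "__main__":
    sample = "viesut\tviessu+N+Pl+Nom\n\nlea\tleat+V+Ind+Prs+Sg3\n"
    print(lookup_to_cg(sample))
```

Each word becomes a cohort: a header line with the surface form in "<...>", followed by one tab-indented reading per analysis, with the lemma in quotes and the tags separated by spaces.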

Normativity issues

A list of issues for the Saami language board

Speller optimisations

There is a separate page on speller optimisations for SME.

Obsolete test reports, for reference

A test plan for sme (obsolete)
A test diary for sme (obsolete)
Bug report sheet from the days before we had a bug report system (obsolete)
Our earlier treatment of foreign words (obsolete)