root-morphology
Divvun & Giellatekno - open source grammars for North Sámi.
North Sámi morphological analyser
Multicharacter symbols
Tags for POS
- 
+Ex/N	 - This tag is not added in lexc. The POS tag before derivation is converted into this tag when compiling FST for disambiguation. 
- 
+Ex/A	 - This tag is not added in lexc. The POS tag before derivation is converted into this tag when compiling FST for disambiguation. 
- +Ex/V - This tag is not added in lexc. The POS tag before derivation is converted into this tag when compiling FST for disambiguation.
- 
+N        - Noun 
- 
+A        - Adjective 
- 
+Adv      - Adverb 
- 
+V        - Verb 
- 
+Pron     - Pronoun 
- 
+CS       - Subjunction 
- 
+CC       - Conjunction 
- 
+Adp      - Adposition, i.e. post- and preposition, NOT IN USE 
- 
+Po       - Postposition 
- 
+Pr       - Preposition 
- 
+Interj   - Interjection 
- 
+Pcle     - Particle 
- +Num - Numeral
Tags for sub-POS
- 
+Prop       - Proper noun 
- 
+Pers       - Personal Pronoun 
- 
+Dem        - Demonstrative Pronoun 
- 
+Interr     - Interrogative Pronoun 
- 
+Refl       - Reflexive Pronoun 
- 
+Recipr     - Reciprocal Pronoun 
- 
+Rel        - Relative Pronoun 
- 
+Indef      - Indefinite Pronoun 
- 
+Coll       - Collective numerals, subtag for +N 
- 
+Arab       - Arabic numeral, subtag for +Num 
- 
+Rom        - Roman numeral, subtag for +Num 
- 
+Pass       - hallat/haddat not in use 
- +Known - man (different from maid): mii+Pron+Rel+Sg+Acc+Known
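The tags listed here are concatenated after the lemma in an analysis string, as in the `mii+Pron+Rel+Sg+Acc+Known` example above. The following is a minimal Python sketch of how such a string could be split into lemma and tags. The function name `split_analysis` is hypothetical, and the sketch assumes that the lemma itself contains no literal `+`.

```python
def split_analysis(analysis: str) -> tuple:
    """Split 'lemma+Tag1+Tag2' into the lemma and its list of tags.

    Assumption: the lemma contains no literal '+' character.
    """
    lemma, *tags = analysis.split("+")
    # Put the leading '+' back so the tags match the spelling used in this list.
    return lemma, ["+" + tag for tag in tags]


lemma, tags = split_analysis("mii+Pron+Rel+Sg+Acc+Known")
print(lemma)  # mii
print(tags)   # ['+Pron', '+Rel', '+Sg', '+Acc', '+Known']
```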
Tags for Inflection
Tags for Case and Number Inflection
- 
+Sg - Singular 
- 
+Du - Dual 
- +Pl - Plural
- 
+Ess - Essive 
- 
+Nom - Nominative 
- 
+Gen - Genitive 
- 
+Acc - Accusative 
- 
+Ill - Illative 
- 
+Loc - Locative = Inessive and Elative 
- 
+Com - Comitative 
- +Com/Sh - Comitative Plural Hyphenated Short form (without -guin), i.e. Beatnagii-, Biillai-, Bohccui- etc.
Possessive tags
- 
+PxSg1 Singular First Person 
- 
+PxSg2 Singular Second Person 
- 
+PxSg3 Singular Third Person 
- 
+PxDu1 Dual First Person 
- 
+PxDu2 Dual Second Person 
- 
+PxDu3 Dual Third Person 
- 
+PxPl1 Plural First Person 
- 
+PxPl2 Plural Second Person 
- +PxPl3 Plural Third Person
Adjectival tags
- 
+Comp Comparative 
- 
+Superl Superlative 
- 
+Attr Attributive 
- 
+Card Cardinal Number Not in use 
- +Ord Ordinal Number
Moods
- 
+Ind Indicative 
- 
+Pot Potential 
- 
+Cond Conditional 
- +Imprt Imperative
Tenses
- 
+Prs Present Tense 
- +Prt Past Tense, Preterite
Verb person-number
- 
+Sg1 Singular First Person 
- 
+Sg2 Singular Second Person 
- 
+Sg3 Singular Third Person 
- 
+Du1 Dual First Person 
- 
+Du2 Dual Second Person 
- 
+Du3 Dual Third Person 
- 
+Pl1 Plural First Person 
- 
+Pl2 Plural Second Person 
- +Pl3 Plural Third Person
Infinite verb forms
- 
+Inf Infinitive 
- 
+Ger Gerund 
- 
+ConNeg Negation Form, ie Mana, Doalvvo, Juoge etc 
- 
+ConNegII Alternative, Rather Declamatory Negation Form - Infrequent 
- 
+Neg Negation Verb, Ii and its forms, ie Ale, Alli, Allot, Ehpet, Eat etc. 
- 
+ImprtII Alternative, Rather Declamatory Imperative Form - Infrequent not in use 
- 
+PrsPrc Present Participle 
- 
+PrfPrc Perfect Participle 
- 
+Sup Supine 
- 
+VGen Verb Genitive 
- 
+VAbess Verb Abessive 
- +Actio Action Verb Form
Other tags
- 
+ABBR Abbreviation, subtag for e.g. +N 
- +Symbol = independent symbols in the text stream, like £, €, © 
- 
+ACR Acronym, subtag for +N 
- 
+CLB Clause border (full stop, comma..)
- 
+PUNCT punctuation 
- 
+LEFT left parenthesis 
- 
+RIGHT right parenthesis 
- 
+Dyn Dynamically generated (acronyms) +ACR+Dyn
- +CLBfinal Sentence final abbreviated expression ending in full stop, so that the full stop is ambiguous
- 
+TV Transitive Verb, +V+TV 
- 
+IV Intransitive Verb, +V+IV 
- 
+G3 Grade 2-3 for homonymies with grade 1-2, +N+G3 
- 
+G7 Grade 3, no consonant gradation, +N+G7 
- +NomAg Actor Noun From Verb - Nomen Agentis, +N+NomAg
- +Gram/TAbbr - Transitive abbreviation (it needs an argument)
- +Gram/NoAbbr - Intransitive abbreviations that are homonymous
- +Gram/TNumAbbr - Transitive abbreviation if the following
- +Gram/NumNoAbbr - Transitive abbreviations for which numerals
- +Gram/TIAbbr - Both transitive and intransitive abbreviation
- +Gram/IAbbr - Intransitive abbreviation (it takes no argument)
- +Gram/3syll - trisyllabic verbs
Question and Focus particles:
- 
+Qst Question Particle: +Pcle+Qst 
- 
+Subqst Embedded Question Particle: +Adv+Subqst 
- 
+Foc/naj Focus clitic 
- 
+Foc/Neg-ge Focus clitic 
- 
+Foc/Pos-ge Focus clitic 
- 
+Foc/gen Focus clitic 
- 
+Foc/ges Focus clitic 
- 
+Foc/gis Focus clitic 
- 
+Foc/ba Focus clitic 
- 
+Foc/be Focus clitic 
- 
+Foc/hal Focus clitic 
- 
+Foc/han Focus clitic 
- 
+Foc/bai Focus clitic 
- 
+Foc/bas Focus clitic 
- 
+Foc/bat Focus clitic 
- 
+Foc/ban Focus clitic 
- 
+Foc/son Focus clitic 
- 
+Foc/bahal Focus clitic 
- 
+Foc/behal Focus clitic 
- 
+Foc/bahan Focus clitic 
- 
+Foc/behan Focus clitic 
- 
+Foc/bason Focus clitic 
- 
+Foc/beson Focus clitic 
- 
+Foc/mat Focus clitic 
- 
+Foc/mis Focus clitic 
- +Foc/s Focus clitic
Tags distinguishing different versions of the same lemma (before POS)
- +v1 
- +v2 
- +v3 
- +v4 
- +v5 
- +v6 
- +v7 
- +v8 
- +v9 
- +v10 
- +v11 
- +v12 
- +v13 
- +v14 
- +v15 
- +v16 
- +v17 
- +v18 
- +v19 
- +v20 
- +v21 
- +v22 
- +v23 
- +v24
Note: These high +v... numbers are in use for one word only: 
Escaped chars
- 
%         
- +Guess for the name guesser 
- +MWE - Multi-word expressions treated as such in the preprocessor. To be added as first tag after the lemma
- +PxCPlComRecipr used in pronoun-sme-morph.txt
Error (non-standard language) tags
- 
+Err/Orth substandard, not in normative fst 
- 
+Err/Orth-a-á substandard, not in normative fst 
- 
+Err/Orth-nom-gen substandard, not in normative fst 
- 
+Err/Orth-nom-acc substandard, not in normative fst 
- 
+Err/Lex substandard, not in normative fst, no normative lemma 
- 
+Err/DerSub substandard for derivation, not in normative fst, no normative lemma 
- 
+Err/CmpSub substandard for compounding, not in normative fst (wrong form or POS in first part)
- 
+Err/MissingSpace indicates that there is a missing space, causing an orthographic error 
- 
+Err/MissingHyph when there is no hyphen where it should have been 
- 
+Err/Hyph when there is a hyphen where none should have been 
- 
+Err/SpaceCmp used for compounds written apart - only retained in the HFST Grammar Checker disambiguation analyser 
- 
+Err/Spellrelax used to tag spellrelaxed typos (tag is inserted via flag diacritics)
- 
+Err/Confused grammar checking: real-word error confusion pairs 
- +Err/Confused-Ess grammar checking: real-word error confusion pairs
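Since every error tag shares the `+Err/` prefix, substandard analyses can be separated from normative ones by checking for that prefix. The sketch below shows one possible way to do so in Python. The function name `is_normative` and the two analysis strings are illustrative assumptions, not output from the actual analyser, and the position of the error tag within the string is also assumed.

```python
def is_normative(analysis: str) -> bool:
    """Return True if the analysis carries no +Err/... tag."""
    tags = analysis.split("+")[1:]  # drop the lemma
    return not any(tag.startswith("Err/") for tag in tags)


# Illustrative analysis strings (assumed, not taken from the real FST).
analyses = [
    "guolli+N+Sg+Nom",
    "guolli+N+Err/Orth+Sg+Nom",
]
normative = [a for a in analyses if is_normative(a)]
print(normative)  # ['guolli+N+Sg+Nom']
```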
Usage tags
- 
+Use/-Spell Orthographically correct, typically peripheral words, excluded from the speller because they cause trouble for frequent words 
- 
+Use/-PLX Excluded in PLX-speller 
- 
+Use/SpellNoSugg recognized but not suggested in speller 
- 
+Use/Circ circular paths (old ^C^)
- 
+Use/CircN circular paths for the numerals (old ^N^)
- 
+Use/MT Generate for MT only, for restricting analyses needed 
- 
+Use/LIA only for LIA-analyser 
- 
+Use/NG not-generate, for ped generation isme-ped.fst and MT 
- 
+Use/NGminip Not for miniparadigm in NDS dicts 
- 
+Use/PMatch means that the following is only used in the analyser feeding the disambiguator 
- 
+Use/-PMatch Do not include in fst's made for hfst-pmatch 
- 
+Use/GC only retained in the HFST Grammar Checker disambiguation analyser 
- 
+Use/-GC never retained in the HFST Grammar Checker disambiguation analyser 
- +MWESplit Split point for MWE
Dialect tags:
- 
+Dial/-KJ   forms not in use in KJ (Kárásjohka)
- 
+Dial/-GG   forms not in use in GG (Guovdageaidnu)
- 
+Dial/-GS   forms not in use in GS (Gárasavvon) NOT IN USE
- +South provisionally added to Sg Loc -n, which is a sub-form
Tags for indicating the orthography used
The above should either be used in pairs, or not at all. That is, if a word 
Multichars for marking start and end of IPA sequences
- %{%<ipa#%} - ipa text to the left
- %{#ipa%>%} - ipa text to the right 
- %<sent%> apertium
Compounding tags
The tags are of the following form: 
- +CmpNP/xxx - Normative (N), Position (P), i.e. the tag describes what
- +CmpN/xxx - Normative (N) form, i.e. the tag describes what
- +Cmp/xxx - Descriptive compounding tags, i.e. tags that describe
This entry / word should be in the following position(s):
- 
+CmpNP/All - ... in all positions,  default, this tag does not have to be written 
- 
+CmpNP/First - ... only be first part in a compound or alone 
- 
+CmpNP/Pref - ... only  first part in a compound, NEVER alone 
- 
+CmpNP/Last - ... only be last part in a compound or alone 
- 
+CmpNP/Suff - ... only  last part in a compound, NEVER alone 
- 
+CmpNP/None - ... does not take part in compounds 
- 
+CmpNP/Only - ... only be part of a compound, i.e. can never 
If unmarked, any position goes.
The tagged part of the compound should make a compound using:
- 
+CmpN/SgN Singular Nominative 
- 
+CmpN/SgG Singular Genitive 
- 
+CmpN/PlG Plural Genitive 
- +CmpN/PlN Plural Nominative, propers!
Unmarked = Default, ie +CmpN/SgN for SME.
The second part of the compound may require that the previous (left part) is:
- 
+CmpN/SgNomLeft Singular Nominative 
- 
+CmpN/SgGenLeft Singular Genitive 
- +CmpN/PlGenLeft Plural Genitive
Tags for descriptive compound analysis - this is what a compound actually is:
- +Cmp - Dynamic compound. This tag should always be part
- +Cmp/Attr - Attributive 
- 
+Cmp/SgNom - Singular Nominative 
- 
+Cmp/SgGen - Singular Genitive 
- 
+Cmp/PlGen - Plural Genitive 
- +Cmp/SplitR - This is a split compound with the other part to the right 
- +Cmp/SplitL - This is a split compound with the other part to the left 
- 
+Cmp/Sh - testing +Cmp/Sh 
- 
+Cmp/Hyph - on dynamic compounds that have a hyphen 
- 
+Cmp/NoHyph - On compounds that COULD have had a hyphen (and usually have), but don't
- 
+Cmp/SoftHyph - Tags compounds containing SOFT HYPHENS (U+00AD)
- 
+Cmp/Cit - Tags citation compounds, which can in principle 
Compounding tag ordering
- 
+CmpN/ tags 
- 
+CmpNP/ tags 
- 
+Cmp/ tags - this is always true since the descriptive tags are always 
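Assuming the list above gives the left-to-right order in which the three families of compounding tags appear in a tag sequence (`+CmpN/` first, then `+CmpNP/`, then `+Cmp/`), a simple consistency check could look like the Python sketch below. The function names are hypothetical, and the example tag sequences are made up for illustration.

```python
from typing import List, Optional


def compound_rank(tag: str) -> Optional[int]:
    """Rank a compounding tag by family; return None for any other tag."""
    if tag.startswith("+CmpN/"):
        return 0
    if tag.startswith("+CmpNP/"):
        return 1
    if tag == "+Cmp" or tag.startswith("+Cmp/"):
        return 2
    return None


def compound_tags_ordered(tags: List[str]) -> bool:
    """True if the compounding tags occur in the order CmpN, CmpNP, Cmp."""
    ranks = [r for r in map(compound_rank, tags) if r is not None]
    return ranks == sorted(ranks)


print(compound_tags_ordered(["+CmpN/SgG", "+CmpNP/First", "+Cmp/SgGen"]))  # True
print(compound_tags_ordered(["+Cmp/SgGen", "+CmpN/SgG"]))                  # False
```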
Semantic tags to help disambiguation and syntactic analysis (before POS):
- +Sem/Act          = Activity 
- +Sem/Adr          = Webadr 
- +Sem/Amount       = Amount 
- +Sem/Ani          = Animate 
- +Sem/Aniprod      = Animal Product 
- +Sem/Body         = Bodypart 
- +Sem/Body-abstr   = siellu, vuoigŋa, jierbmi (something one can use in physical activity like a body part, e.g. sight, voice, etc.)
- +Sem/Build        = Building 
- +Sem/Build-room   = Room in a building, typically place to be 
- +Sem/Buildpart   = Part of Building, like the wall 
- +Sem/Cat          = Category 
- +Sem/Clth         = Clothes 
- +Sem/Clth-jewl    = Jewellery 
- +Sem/Clthpart    = part of clothes, boallu, sávdnji... 
- +Sem/Ctain        = Container 
- +Sem/Ctain-abstr  = Abstract container like bank account 
- +Sem/Ctain-clth   = Soft container, like a rucksack 
- +Sem/Ctain-Obj    = Soft container, like a rucksack 
- +Sem/Curr         = Currency like dollár, Not Money 
- +Sem/Date         = Date 
- +Sem/Dance        = Dance 
- +Sem/Dir          = Direction like GPS-kursa 
- +Sem/Domain       = Domain like politics, reindeer herding (a system of actions)
- +Sem/Drink        = Drink 
- +Sem/Dummytag     = Dummytag 
- +Sem/Edu          = Educational event 
- +Sem/Event        = Event 
- +Sem/Feat         = Feature, like árvu (something one can have much or little of; it can be a scale and is in some way characterising, e.g. height, weight, colour, creativity, etc.)
- +Sem/Feat-phys    = Physiological feature, ivdni, fárda 
- +Sem/Feat-psych   = Psychological feature 
- +Sem/Feat-measr   = Psychological feature 
- +Sem/Fem          = Female name 
- +Sem/Food         = Food 
- +Sem/Food-med     = Medicine 
- +Sem/Fruit        = Fruits, vegetables, seeds, nuts 
- +Sem/Furn         = Furniture 
- +Sem/Game         = Game 
- +Sem/Geom         = Geometrical object 
- +Sem/Group        = Animal or Human Group 
- +Sem/Hum          = Human 
- +Sem/Hum-abstr    = Human abstract 
- +Sem/Hum-prof     = Human professional 
- +Sem/Ideol        = Ideology 
- +Sem/ID        = ID 
- +Sem/Lang         = Language 
- +Sem/Mal          = Male name 
- +Sem/Mat          = Material for producing things 
- +Sem/Measr        = Measure 
- +Sem/Money        = Has to do with money, like wages, not Curr(ency)
- +Sem/Obj          = Object 
- +Sem/Obj-clo      = Cloth 
- +Sem/Obj-cogn     = Cloth 
- +Sem/Obj-el       = (Electrical) machine or apparatus
- +Sem/Obj-ling     = Object with something written on it 
- +Sem/Obj-rope     = flexible ropelike object 
- +Sem/Obj-surfc    = Surface object 
- +Sem/Org          = Organisation 
- +Sem/Part         = Feature, oassi, bealli 
- Perc = (perception) is something one can feel for a limited period and that is caused by something external, e.g. Mus lea ballu. Mus lea bavččas.
- +Sem/Perc-cogn    =   
- +Sem/Perc-emo     = Emotional perception 
- +Sem/Perc-phys    = Physical perception 
- +Sem/Perc-psych   = Psychological perception 
- +Sem/Phonenr = Telephone number 
- +Sem/Plant        = Plant 
- +Sem/Plantpart   = Plant part 
- +Sem/Plc          = Place 
- +Sem/Plc-abstr    = Abstract place 
- +Sem/Plc-elevate  = Place 
- +Sem/Plc-line     = Place 
- +Sem/Plc-water    = Place 
- +Sem/Pos          = Position (as in social position job)
- +Sem/Process      = Process 
- +Sem/Prod         = Product 
- +Sem/Prod-audio   = Audio product 
- +Sem/Prod-cogn    = Cognition product 
- +Sem/Prod-ling    = Linguistic product 
- +Sem/Prod-vis     = Visual product 
- +Sem/Rel          = Relation 
- +Sem/Route        = Route 
- +Sem/Rule         = Rule or convention 
- +Sem/Semcon       = Semantic concept 
- +Sem/Sign         = Sign (e.g. numbers, punctuation)
- +Sem/Sport        = Sport 
- +Sem/State        = 
- +Sem/State-sick   = Illness 
- +Sem/Substnc      = Substance, like Air and Water 
- +Sem/Sur          = Surname 
- +Sem/Symbol       = Symbol 
- +Sem/Time         = Time 
- +Sem/Time-clock   = Time clock 
- +Sem/Tool         = Prototypical tool for repairing things 
- +Sem/Tool-catch   = Tool used for catching (e.g. fish)
- +Sem/Tool-clean   = Tool used for cleaning 
- +Sem/Tool-it      = Tool used in IT 
- +Sem/Tool-measr   = Tool used for measuring 
- +Sem/Tool-music   = Music instrument 
- +Sem/Tool-write   = Writing tool 
- +Sem/Txt          = Text (girji, lávlla...)
- +Sem/Veh          = Vehicle 
- +Sem/Wpn          = Weapon 
- +Sem/Wthr         = The Weather or the state of ground 
- +Sem/Year - year (i.e. 1000 - 2999), used only for numerals
Multiple Semantic tags:
- +Sem/Act_Fruit                       
- +Sem/Act_Group Activity and Group 
- +Sem/Act_Plc   A persons job is an activity, and a place as well 
- +Sem/Act_Route Activity and Route, ie johtolat 
- +Sem/Act_Tool-it 
- +Sem/Amount_Build   Amount and Building 
- +Sem/Amount_Semcon 
- +Sem/Ani_Body-abstr_Hum 
- +Sem/Ani_Build 
- +Sem/Ani_Buildpart 
- +Sem/Ani_Build_Hum_Txt 
- +Sem/Ani-fish 
- +Sem/Ani_Group 
- +Sem/Ani_Group_Hum 
- +Sem/Ani_Group_Prod-vis 
- +Sem/Ani_Hum 
- +Sem/Ani_Hum_Plc 
- +Sem/Ani_Hum_Time 
- +Sem/Ani_Plc 
- +Sem/Ani_Plc_Txt 
- +Sem/Ani_Time 
- +Sem/Ani_Veh 
- +Sem/Aniprod_Hum 
- +Sem/Aniprod_Obj-clo 
- +Sem/Aniprod_Perc-phys 
- +Sem/Aniprod_Plc 
- +Sem/Aniprod_Plc_Route 
- +Sem/Body-abstr_Feat-psych 
- +Sem/Body-abstr_Prod-audio_Semcon 
- +Sem/Body_Body-abstr 
- +Sem/Body_Clth 
- +Sem/Body_Food 
- +Sem/Body_Group_Hum 
- +Sem/Body_Group_Hum_Time 
- +Sem/Body_Hum 
- +Sem/Body_Mat 
- +Sem/Body_Measr 
- +Sem/Body_Obj_Tool-catch 
- +Sem/Body_Plc 
- +Sem/Body_Plc-elevate 
- +Sem/Body_Time 
- +Sem/Build_Clthpart 
- +Sem/Build_Edu_Org 
- +Sem/Build_Event_Org 
- +Sem/Build_Obj 
- +Sem/Build_Org 
- +Sem/Build_Route 
- +Sem/Build-room_Cat_Ctain_Mat 
- +Sem/Buildpart_Cat                  
- +Sem/Buildpart_Cat_Ctain            
- +Sem/Buildpart_Cat_Ctain_Mat        
- +Sem/Buildpart_Ctain                
- +Sem/Buildpart_Ctain_Mat            
- +Sem/Buildpart_Ctain_Obj            
- +Sem/Cat_Group_Hum                   
- +Sem/Cat_Group_Hum_Plc               
- +Sem/Cat_Edu                         
- +Sem/Cat_Obj                         
- +Sem/Clth-jewl_Curr 
- +Sem/Clth-jewl_Curr_Obj 
- +Sem/Clth-jewl_Curr_Obj_Org 
- +Sem/Clth-jewl_Fruit 
- +Sem/Clth-jewl_Money 
- +Sem/Clth-jewl_Plant 
- +Sem/Clth_Hum 
- +Sem/Clth_Obj-clo 
- +Sem/Ctain-abstr_Org 
- +Sem/Ctain-clth_Plant 
- +Sem/Ctain-clth_Veh 
- +Sem/Ctain_Feat-phys 
- +Sem/Ctain_Furn 
- +Sem/Ctain_Plc 
- +Sem/Ctain_Tool 
- +Sem/Ctain_Tool-measr 
- +Sem/Curr_Org 
- +Sem/Dance_Org 
- +Sem/Dance_Prod-audio 
- +Sem/Domain_Food-med 
- +Sem/Domain_Hum                      
- +Sem/Domain_Prod-audio 
- +Sem/Drink_Plant                     
- +Sem/Edu_Event 
- +Sem/Edu_Geom                        
- +Sem/Edu_Group_Hum 
- +Sem/Edu_Hum                         
- +Sem/Edu_Mat 
- +Sem/Edu_Org 
- +Sem/Event_Food 
- +Sem/Event_Hum 
- +Sem/Event_Plc 
- +Sem/Event_Plc-elevate 
- +Sem/Event_Time 
- +Sem/Feat-measr_Plc 
- +Sem/Feat-phys_Tool-write 
- +Sem/Feat-phys_Veh 
- +Sem/Feat-phys_Wthr 
- +Sem/Feat-psych_Hum 
- +Sem/Feat-psych_Plc 
- +Sem/Food_Obj-surfc 
- +Sem/Feat_Plant 
- +Sem/Food_Perc-phys 
- +Sem/Food_Plant 
- +Sem/Food_Sign 
- +Sem/Fruit_Hum                       
- +Sem/Game_Obj-play 
- +Sem/Geom_Hum_Plc 
- +Sem/Geom_Obj 
- +Sem/Group_Hum 
- +Sem/Group_Hum_Org 
- +Sem/Group_Hum_Plc 
- +Sem/Group_Hum_Plc-abstr 
- +Sem/Group_Hum_Prod-vis 
- +Sem/Group_Hum_Time 
- +Sem/Group_Org 
- +Sem/Group_Prod-vis                  
- +Sem/Group_Sign 
- +Sem/Group_Txt 
- +Sem/Hum_Lang 
- +Sem/Hum_Lang_Plc 
- +Sem/Hum_Lang_Time 
- +Sem/Hum_Mat_Tool 
- +Sem/Hum_Obj 
- +Sem/Hum_Org 
- +Sem/Hum_Sign 
- +Sem/Hum_Plant 
- +Sem/Hum_Plc 
- +Sem/Hum_Tool 
- +Sem/Hum_Tool-it                     = Human 
- +Sem/Hum_Veh 
- +Sem/Hum_Wthr 
- +Sem/Lang_Tool 
- +Sem/Mat_Plant 
- +Sem/Mat_Txt 
- +Sem/Measr_Obj_Time                  
- +Sem/Measr_Sign                      = Sign (e.g. numbers, punctuation)
- +Sem/Measr_Time 
- +Sem/Money_Obj 
- +Sem/Money_Org 
- +Sem/Money_Part 
- +Sem/Money_Txt 
- +Sem/Obj-play 
- +Sem/Obj-play_Sport 
- +Sem/Obj_Semcon 
- +Sem/Obj_Sign 
- +Sem/Obj_Veh 
- +Sem/Clth-jewl_Org 
- +Sem/Obj_Symbol 
- +Sem/Org_Rule 
- +Sem/Org_Txt 
- +Sem/Org_Veh 
- +Sem/Part_Prod-cogn 
- +Sem/Part_Substnc 
- +Sem/Perc-emo_Wthr 
- +Sem/Plant_Plantpart 
- +Sem/Plant_Tool 
- +Sem/Plant_Tool-measr 
- +Sem/Plc-abstr_Rel_State 
- +Sem/Plc-abstr_Route 
- +Sem/Plc_Pos 
- +Sem/Plc_Route 
- +Sem/Plc_Semcon 
- +Sem/Plc_State 
- +Sem/Plc_Substnc 
- +Sem/Plc_Substnc_Wthr 
- +Sem/Plc_Time 
- +Sem/Plc_Tool-catch 
- +Sem/Plc_Txt 
- +Sem/Plc_Wthr 
- +Sem/Prod-audio_Txt 
- +Sem/Prod-cogn_Txt 
- +Sem/Semcon_Txt 
- +Sem/Obj_State 
- +Sem/Substnc_Wthr 
- +Sem/Plc_Time_Wthr 
- +Sem/Time_Wthr 
- +Sem/State-sick_Substnc 
- +Sem/Obj-ling_Obj-surfc              
- +Sem/Org_Prod-audio 
- +Sem/Org_Prod-cogn 
- +Sem/Org_Prod-vis
- +Allegro from LEXICON GOADE-IU-
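In the multiple semantic tags listed above, the component categories appear to be joined with underscores, while hyphens belong to a single category name (for example `Body-abstr`). Under that assumption, a combined tag can be split into its parts as in this Python sketch. The function name `semantic_components` is hypothetical.

```python
def semantic_components(tag: str) -> list:
    """Split a combined '+Sem/A_B_C' tag into its component categories."""
    prefix = "+Sem/"
    if not tag.startswith(prefix):
        raise ValueError(f"not a semantic tag: {tag!r}")
    # Underscores separate categories; hyphens stay inside a category name.
    return tag[len(prefix):].split("_")


print(semantic_components("+Sem/Ani_Group_Hum"))          # ['Ani', 'Group', 'Hum']
print(semantic_components("+Sem/Body-abstr_Feat-psych"))  # ['Body-abstr', 'Feat-psych']
```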
All non-positional derivations should be preceded by this tag, to make it possible 
- +Der
Other/unclassified derivations, can appear in all positions:
- +Der/veara  NA# 
- +Der/viđá  NA# 
- +Der/viđi  NA# 
- +Der/has only one in the code
Miscellaneous list
- +Der/A Adjective derived from Noun or Verb 
- +Der/Adv Adverb derived from Adjective
Tags for originating language
The following tags are used to guide conversion to IPA: loan words 
- any untagged word is pronounced with SME orthographic conventions 
- NNO and NOB have identical pronunciation, NNO is only used if
- SWE has mostly the same pronunciation as NOB, and is only used
- Occasionally even SME (the default) may be tagged, to block other
- +OLang/SME - North Sámi 
- +OLang/SMJ - Lule Sámi 
- +OLang/SMA - South Sámi 
- +OLang/FIN - Finnish 
- +OLang/SWE - Swedish 
- +OLang/NOB - Norw. bokmål 
- +OLang/NNO - Norw. nynorsk 
- +OLang/ENG - English 
- +OLang/RUS - Russian 
- +OLang/UND - Undefined
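As stated above, any word without an originating-language tag is pronounced with SME orthographic conventions. A lookup that applies this default could be sketched in Python as follows. The function name `origin_language` and the example analysis strings are assumptions made for illustration, as is the position of the `+OLang/` tag within the string.

```python
def origin_language(analysis: str, default: str = "SME") -> str:
    """Return the language code of the +OLang/ tag, or the SME default."""
    for tag in analysis.split("+")[1:]:  # skip the lemma
        if tag.startswith("OLang/"):
            return tag[len("OLang/"):]
    return default


# Illustrative analysis strings (assumed, not taken from the real FST).
print(origin_language("Oslo+N+Prop+OLang/NOB+Sg+Nom"))  # NOB
print(origin_language("guolli+N+Sg+Nom"))               # SME
```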
Triggers for morphophonological rules
- X1  Diphthong Simplification, Metaphony 
- X2  Diphthong Simplification, Metaphony, Word Final Neutralization of g8, h8, m8 
- X3  Diphthong Simplification, Metaphony 
- X4  WeG, Vowel Shortening, Stem vowel alternations, Word Final Deletion of n8 m8 g8 h8 
- X5  WeG, Diphthong Simplification, Stem vowel alternations 
- X6  WeG, Diphthong Simplification, Metaphony, Word Final Deletion of n8 m8 g8 h8 
- X7  Vowel Shortening, Stem vowel alternations, Word Final Neutralization of g8, h8, m8 
- X8  WeG, Vowel Shortening, Metaphony, Stem Vowel alternations, Word Final Deletion of n8 m8 g8 h8 
- X9  WeG, Diphthong Simplification, Word Final Deletion of n8 m8 g8 h8 
- Y1  Lengthening of Central Consonants, Stem Vowel alternations, 
- Y2  Lengthening of Central Consonants, Stem Vowel alternations, 
- Y3  Lengthening of Central Consonants, Stem Vowel alternations, 
- Y4  Lengthening of Central Consonants, Stem Vowel alternations, 
- Y5  Lengthening of Central Consonants, Word Final Consonant Deletion, Diphthong Simplification, Stem vowel alternations 
- Y6  Lengthening of Central Consonants, Word Final Consonant Deletion, Diphthong Simplification, Stem vowel alternations 
- Y7  Lengthening of Central Consonants, Diphthong Simplification, Stem vowel alternations 
- Y8  Not in use 
- Y9  Lengthening of Central Consonants, Diphthong Simplification 
- Q1  Stem vowel alternations, 
- Q2  Diphthong Simplification, Stem vowel alternations, 
- Q3  Diphthong Simplification, Stem vowel alternations, 
- Q4  WeG, Stem vowel alternations, 
- Q5  WeG, Diphthong Simplification, Stem vowel alternations, 
- Q6  WeG, Vowel shortening, 
- Q7  WeG, Diphthong Simplification, Metaphony, 
- Q8  WeG, Diphthong Simplification, Stem vowel alternations, 
- Q9  Not in use 
- W1  WeG, Vowel Shortening 
- W2  Vowel Shortening, 
- W3  Stem vowel deletion in compounding, 
- W4  WeG, Word Final Cluster Simplification, Optional vowel-shortening, Word Final Deletion of n8 m8 g8 h8 
- W5  WeG, Diphthong Simplification, Stem vowel alternations 
- W6  Stem vowel alternations, WeG, 
- W7  Stem vowel alternations, WeG 
- W8  Stem vowel alternations, 
- W9  Not in use 
- %^DISIMP diphthong simplification
Morphophonemes and Sámi letters
- b9  twol rule override, so that b doesn't turn into t in front of hash 
- e7  shortened i = "e with dot below" from the dictionary
- e9  twol rule override, so that e doesn't turn into i in front of j 
- d9  twol rule override, so that d doesn't turn into t in front of hash 
- g8  Word Final Neutralization and Deletion 
- g9  twol rule override, so that g doesn't turn into t in front of hash 
- h7 
- h8  Word Final Neutralization and Deletion 
- h9  twol rule override, so that h doesn't turn into t in front of hash 
- i7  twol rule override, so that i doesn't turn into e in certain contexts 
- j9  twol rule override, so that j doesn't turn into i after i 
- k9  twol rule override, so that k doesn't turn into t in front of hash 
- m8  Word Final Neutralization and Deletion 
- m9  twol rule override, so that m doesn't turn into n in front of hash 
- n8  Word Final Neutralization and Deletion 
- n9  twol rule override, 
- o7  shortened u = "o with dot below" from the dictionary
- o9  twol rule override, so that o doesn't turn into u in front of j 
- p9  twol rule override, so that p doesn't turn into t in front of hash 
- s9  twol rule override, so that we can have two ss in front of hash 
- t9  twol rule override, so that we can have st in front of hash 
- u7 
- z9  twol rule override, to avoid Word Final Consonant Neutralization 
- ž9  twol rule override, to avoid Word Final Consonant Neutralization 
- š9  twol rule override, so that we can have two šš in front of hash 
- r9 
- æ7  in smi, for Lule Sámi 
- u6  twol rule override, so that u doesn't turn into o in certain contexts 
- æ9 in smi, for Lule Sámi
∑ - a symbol used in front of  # to block backtracking and 
Symbols that need to be escaped on the lower side (towards twolc):
- » 
- « 
- > (escaped with square brackets, to avoid collision with > as morpheme boundary)
- < (escaped with square brackets, to avoid collision with < as morpheme boundary)
Flag diacritics
| @P.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised | 
| @D.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised | 
| @C.NeedNoun@ | (Dis)allow compounds with verbs unless nominalised | 
| @P.Vgen.add@ | (Dis)allow VGen | 
| @R.Vgen.add@ | (Dis)allow VGen | 
| @P.12p.add@ | (Dis)allow 1. and 2. pers forms | 
| @R.12p.add@ | (Dis)allow 1. and 2. pers forms | 
| @P.Pmatch.Loc@ | Used on multi-token analyses; tell hfst-tokenise/pmatch where in the form/analysis the token should be split. | 
| @P.Pmatch.Backtrack@ | Used on single-token analyses; tell hfst-tokenise/pmatch to backtrack by reanalysing the substrings before and after this point in the form (to find combinations of shorter analyses that would otherwise be missed) | 
| @D.ErrOrth.ON@ | 
| @C.ErrOrth@ | 
| @P.ErrOrth.ON@ | 
For languages that allow compounding, the following flag diacritics are needed 
| @P.CmpFrst.FALSE@ | Require that words tagged as such only appear first | 
| @D.CmpPref.TRUE@ | Block such words from entering ENDLEX | 
| @P.CmpPref.FALSE@ | Block these words from making further compounds | 
| @D.CmpLast.TRUE@ | Block such words from entering R | 
| @D.CmpNone.TRUE@ | Combines with the next tag to prohibit compounding | 
| @U.CmpNone.FALSE@ | Combines with the prev tag to prohibit compounding | 
| @U.CmpNone.TRUE@ | Combines with the two previous ones to block compounding | 
| @P.CmpOnly.TRUE@ | Sets a flag to indicate that the word has passed R | 
| @D.CmpOnly.FALSE@ | Disallow words coming directly from root. | 
| @D.CmpHyph.TRUE@ | Flag to control hyphenated compounds like proper nouns | 
| @U.CmpHyph.FALSE@ | Flag to control hyphenated compounds like proper nouns | 
| @U.CmpHyph.TRUE@ | Flag to control hyphenated compounds like proper nouns | 
| @C.CmpHyph@ | Flag to control hyphenated compounds like proper nouns | 
Use the following flag diacritics to control downcasing of derived proper 
| @U.Cap.Obl@ | Allowing downcasing of derived names: deatnulasj. | 
| @U.Cap.Opt@ | Allowing downcasing of derived names: deatnulasj. | 
- @U.NeedsVowRed.OFF@ is used to force hyphenation/non-reduction: samediggi- 
- @U.NeedsVowRed.ON@ is used to force reduction w/o hyphen: samedigge#xxx 
- @C.NeedsVowRed@ Clearing this feature, so that it doesn't interfere with further compounding
- @C.Px@ 
- @C.Nom3Px@ 
- @P.Px.add@ 
- @R.Px.add@ 
- @P.Px.block@ 
- @D.Px.block@
- @R.SpellRlx.ON@ Flag used to tag spell-relax-analysed strings (and only those).
- @D.SpellRlx.ON@ Flag used to tag spell-relax-analysed strings (and only those).
- @C.SpellRlx@ Flag used to tag spell-relax-analysed strings (and only those).
- @R.SpaceCmp.ON@ Flag to tag compounds written with a space 
- @D.SpaceCmp.ON@ Flag to tag compounds written with a space 
- @C.SpaceCmp@ Flag to tag compounds written with a space
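The flag diacritics above follow the usual Xerox/HFST pattern `@OP.Feature.Value@`, where P sets a feature, R requires it, D disallows it, C clears it, and U unifies it. The Python sketch below is a simplified illustration of how those operators interact along one path, not the HFST implementation. The function name `path_is_allowed` is hypothetical, and the N (negative set) operator is left out because it does not occur in the lists above.

```python
import re

# @OP.Feature@ or @OP.Feature.Value@
FLAG = re.compile(r"@([PRDCU])\.([^.@]+)(?:\.([^.@]+))?@")


def path_is_allowed(flags):
    """Walk the flags on one path in order; return False as soon as one fails."""
    state = {}  # feature name -> current value
    for flag in flags:
        match = FLAG.fullmatch(flag)
        if match is None:
            raise ValueError(f"not a supported flag diacritic: {flag!r}")
        op, feature, value = match.groups()
        if op in "PU" and value is None:
            raise ValueError(f"{flag!r} needs a value")
        if op == "P":                      # set the feature
            state[feature] = value
        elif op == "C":                    # clear the feature
            state.pop(feature, None)
        elif op == "R":                    # require the feature (and value)
            if feature not in state or (value is not None and state[feature] != value):
                return False
        elif op == "D":                    # disallow the feature (and value)
            if feature in state and (value is None or state[feature] == value):
                return False
        elif op == "U":                    # unify: set if unset, else must match
            if state.setdefault(feature, value) != value:
                return False
    return True


print(path_is_allowed(["@P.NeedNoun.ON@", "@D.NeedNoun.ON@"]))                   # False
print(path_is_allowed(["@P.NeedNoun.ON@", "@C.NeedNoun@", "@D.NeedNoun.ON@"]))  # True
print(path_is_allowed(["@U.CmpNone.FALSE@", "@U.CmpNone.TRUE@"]))               # False
```

The first two calls show why a clearing flag such as `@C.NeedNoun@` matters: without it, a later `@D.NeedNoun.ON@` blocks the path.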
Basic lexica, pointing to the other lexicon files
- LEXICON Root is the basic lexicon starting everything
- LEXICON Acronym
- LEXICON ProperNoun
Lexicon ENDLEX
@D.CmpOnly.FALSE@@D.CmpPref.TRUE@@D.NeedNoun.ON@ ENDLEX2 ;
The @D.CmpOnly.FALSE@ flag diacritic is used to disallow words tagged 

