root-morphology
Morphology
Symbols used for analysis {{Multichar_Symbols}}
Parts-of-speech
Temporary list of added tags
Are there tags not declared in root.lexc or misspelled?
-
+Dash : XXX check this tag!
-
+Dial/Finland : XXX check this tag!
-
+Dial/standard : XXX check this tag!
-
+Gyr : XXX check this tag!
-
+Pref- : XXX check this tag!
-
+Pro : XXX check this tag!
-
+TruncPrefix : XXX check this tag!
-
+Use/N : XXX check this tag!
-
+Use/sub : XXX check this tag!
- +s : XXX check this tag!
- +CLBfinal Sentence final abbreviated expression ending in full stop, so that the full stop is ambiguous
Parts of speech
-
+V : Verb
-
+N : Noun
-
+A : Adjective
-
+ACR : Acronym
-
+ABBR : Abbreviation
- +Symbol = independent symbols in the text stream, like £, €, ©
-
+Acr : Acronym
-
+Num : Numerals
-
+Adv : Adverb
-
+Pron : Pronoun
-
+Pcle : Particle, except:
- +Interj : Interjection
The part-of-speech analyses are typically the first:
Analysis examples:
-
kutoo: kutoa+V+Act+Ind+Prs+Sg3 (Eng. # knits)
-
talo: talo+N+Sg+Nom (Eng. # house)
-
nopea: nopea+A+Sg+Nom (Eng. # fast)
-
yksi: yksi+Num+Card+Sg+Nom (Eng. # one)
-
nopeasti: nopeasti+Adv (Eng. # fast)
-
hän: hän+Pron+Pers+Sg+Nom (Eng. # he)
-
ahaa: ahaa+Pcle (Eng. # ah)
- äh: äh+Interj (Eng. # uh)
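The examples above share one pattern: a lemma followed by `+`-prefixed tags, with the part-of-speech tag typically first. A minimal Python sketch of splitting such a string follows. The function names `split_analysis` and `pos_of` are hypothetical and not part of any analyser API, and the sketch assumes the lemma itself contains no `+`.

```python
from typing import List, Tuple


def split_analysis(analysis: str) -> Tuple[str, List[str]]:
    """Split 'lemma+Tag1+Tag2' into the lemma and its list of tags.

    Assumption: the lemma itself contains no '+' character.
    """
    lemma, *tags = analysis.split("+")
    return lemma, tags


def pos_of(analysis: str) -> str:
    """Return the first tag, which is typically the part of speech."""
    _, tags = split_analysis(analysis)
    return tags[0] if tags else ""


print(split_analysis("kutoa+V+Act+Ind+Prs+Sg3"))
print(pos_of("nopeasti+Adv"))
```

Tags that contain a slash, such as `Use/Rare`, are kept whole because only `+` is used as a separator.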
Nouns
The code for proper nouns:
- +Prop : Proper noun
Proper noun tag follows noun analysis:
- Pekka: Pekka+N+Prop+Sg+Nom (Eng. # Pekka)
Pronouns
-
+Pers : Personal
-
+Dem : Demonstrative
-
+Interr : Interrogative
-
+Rel : Relative
-
+Qu : Quantor
-
+Qnt : Quantor (status unclear; compare +Qu)
-
+Refl : Reflexive
-
+Recipr : Reciprocal
- +Indef : Indefinite
Semantic tags follow pronoun analyses:
-
minä: minä+Pron+Pers+Sg+Nom (Eng. # I)
-
tämä: tämä+Pron+Dem+Sg+Nom (Eng. # this)
-
kuka: kuka+Pron+Interr+Sg+Nom (Eng. # who)
-
joka: joka+Pron+Rel+Sg+Nom (Eng. # which)
-
kaikki: kaikki+Pron+Qu+Sg+Nom (Eng. # every)
-
itse: itse+Pron+Refl+Sg+Nom (Eng. # self)
-
toistaan: toinen+Pron+Recipr+Sg+Par+PxSg3 (Eng. # each other)
- joku: joku+Pron+Qu+Indef+Sg+Nom (Eng. # someone)
-
+Sem/Human : Semantic class: Human
-
+Sem/Geo : Semantic class: Geographic
- +Sem/Org : Semantic class: Organisation
-
+Sem/Build :
-
+Sem/Build-room :
-
+Sem/Cat :
-
+Sem/Date :
-
+Sem/Domain :
-
+Sem/Dummytag :
-
+Sem/Event :
-
+Sem/Fem :
-
+Sem/Group_Hum :
-
+Sem/Hum :
-
+Sem/ID :
-
+Sem/Mal :
-
+Sem/Mat :
-
+Sem/Measr :
-
+Sem/Money :
-
+Sem/Obj :
-
+Sem/Obj-el :
-
+Sem/Obj-ling :
-
+Sem/Org_Prod-audio :
-
+Sem/Org_Prod-vis :
-
+Sem/Plc :
-
+Sem/Prod-vis :
-
+Sem/Route :
-
+Sem/Rule :
-
+Sem/State-sick :
-
+Sem/Substnc :
-
+Sem/Time-clock :
-
+Sem/Tool-it :
-
+Sem/Txt :
-
+Sem/Veh :
- +Sem/Year :
Numerals
-
+Card : Cardinal
- +Ord : Ordinals
-
kolme: kolme+Num+Card+Sg+Nom (Eng. # three)
- kolmas: kolmas+Num+Ord+Sg+Nom (Eng. # third)
Particles
-
+CC : Coordinating
- +CS : Adverbial
The conjunction tags take the place of part-of-speech tags for legacy reasons:
-
ja: ja+CC
- vaikka: vaikka+CS
Adpositions
Adposition syntax tags:
-
+Adp : Adposition
-
+Po : Postposition
- +Pr : Preposition
Adpositions are tagged in POS position:
- läpi: läpi+Po
Tags for sub-POS
-
+Arab :
-
+Attr :
-
+Coll :
- +Rom :
Bound root morphs
- +Pref : Prefixes
Suffixes are typically word forms or derivations that only appear as
- +Suff : Suffixes
Symbols
-
+Punct : any punctuation
- +Quote : quote marks
The analyses for symbols are like POSes:
- .: .+Punct (Eng. # .)
Nominal analyses
-
+Sg : Singular
- +Pl : Plural
Number tags are next to POSes in nominal analyses, and in order of morphs:
-
padassa: pata+N+Sg+Ine (Eng. # pot)
- padoissa: pata+N+Pl+Ine
The analyses of nominals have case inflection marked.
-
+Nom : (Mostly) Syntactic cases: Nominative
-
+Par : Partitive
-
+Gen : Genitive
-
+Ine : Inner Locative cases: Inessive
-
+Ela : Elative
-
+Ill : Illative
-
+Ade : Outer locative cases: Adessive
-
+Abl : Ablative
-
+All : Allative
-
+Ess : Others, semantic, marginal: Essive
-
+Ins : Instructive
-
+Abe : Abessive
-
+Tra : Translative
- +Com : Comitative
The case is next to number and is the last obligatory analysis in nominals:
-
taloa: talo+N+Sg+Par
-
talon: talo+N+Sg+Gen
-
talossa: talo+N+Sg+Ine
-
talosta: talo+N+Sg+Ela
-
taloon: talo+N+Sg+Ill
-
talolla: talo+N+Sg+Ade
-
talolta: talo+N+Sg+Abl
-
talolle: talo+N+Sg+All
-
talona: talo+N+Sg+Ess
-
taloin: talo+N+Pl+Ins
-
talotta: talo+N+Sg+Abe
-
taloksi: talo+N+Sg+Tra
- taloine: talo+N+Com
The analyses of the A infinitive short form have a lative ending; this is largely
- +Lat : Lative case
The analyses of certain nominals give explicit analysis for accusative case.
- +Acc : Explicit accusative analysis
- hänet: hän+Pron+Pers+Sg+Acc
Adverbs and adpositions may have some special analyses in diachronic
-
+Prl : Adverbial cases: Prolative
-
+Distr : Distributive
- +Tempr : Temporal
Possessives
-
+PxSg1 : Possessives: First singular (mine)
-
+PxSg2 : Second singular (yours)
-
+PxSg3 : Third singular (his)
-
+PxPl1 : First plural (ours)
-
+PxPl2 : Second plural (yours)
-
+PxPl3 : Third plural (theirs)
- +Px3 : Third ambiguous (his/theirs)
-
+Sg1 : Verbs: First singular (I)
-
+Sg2 : Second singular (you)
-
+Sg3 : Third singular (he)
-
+Pl1 : First plural (we)
-
+Pl2 : Second plural (you)
- +Pl3 : Third plural (they)
-
taloni: talo+N+Sg+Nom+PxSg1
-
talosi: talo+N+Sg+Nom+PxSg2
-
talonsa: talo+N+Sg+Nom+PxSg3
-
talomme: talo+N+Sg+Nom+PxPl1
- talonne: talo+N+Sg+Nom+PxPl2
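The nominal examples above place the tags in a fixed order: part of speech, number, case, and then an optional possessive. The following Python sketch assembles an analysis string in that order. The helper `nominal_analysis` is hypothetical and only illustrates the ordering; it does not check that the tags are declared in root.lexc.

```python
from typing import Optional


def nominal_analysis(lemma: str, pos: str, case: str,
                     number: Optional[str] = "Sg",
                     possessive: Optional[str] = None) -> str:
    """Join tags in the order POS, number, case, possessive."""
    tags = [pos]
    if number is not None:
        # The comitative example (talo+N+Com) carries no number tag.
        tags.append(number)
    tags.append(case)
    if possessive is not None:
        tags.append(possessive)
    return "+".join([lemma] + tags)


print(nominal_analysis("talo", "N", "Ine"))
print(nominal_analysis("talo", "N", "Nom", possessive="PxSg1"))
print(nominal_analysis("talo", "N", "Com", number=None))
```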
Compound forms
- naisien: nainen+N+Der/s#ien+N+Sg+Nom (Eng. # female gum)
Finite verbs
-
+Act : Active voice
- +Pss : Passive voice
Voice is the first analysis in verb strings:
- kudot: kutoa+V+Act+Ind+Prs+Sg2
Finite verb form analyses have a reading for tense. The tense has two values.
-
+Prs : Non-past (present)
- +Prt : Past (preterite)
The tense is marked in indicative forms after mood:
-
kudon: kutoa+V+Act+Ind+Prs+Sg1
- kudoin: kutoa+V+Act+Ind+Prt+Sg1
Finite verb form analyses have a reading for mood. Mood has four central
-
+Ind : Common moods: Indicative
-
+Cond : Conditional
-
+Pot : Potential
-
+Imprt : Imperative
-
+Opt : Archaic moods: Optative
- +Eventv : Eventive
The mood is after voice in the analysis string and in morph order:
- kutonen: kutoa+V+Act+Pot+Sg1
Finite verb form analyses have a reading for person. Personal ending of verb
-
+Sg1 : First singular
-
+Sg2 : Second singular
-
+Sg3 : Third singular
-
+Pl1 : First plural
-
+Pl2 : Second plural
- +Pl3 : Third plural
The person is the last required analysis for verbs, after the mood:
-
kudon: kutoa+V+Act+Ind+Prs+Sg1
-
kudot: kutoa+V+Act+Ind+Prs+Sg2
-
kutoo: kutoa+V+Act+Ind+Prs+Sg3
-
kudomme: kutoa+V+Act+Ind+Prs+Pl1
-
kudotte: kutoa+V+Act+Ind+Prs+Pl2
- kutovat: kutoa+V+Act+Ind+Prs+Pl3
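The finite verb examples above follow the order voice, mood, tense, person. A small Python sketch of that ordering is given here. The function `finite_verb_analysis` is hypothetical, and restricting tense to the indicative is an assumption drawn from the examples in this section (the potential form `kutoa+V+Act+Pot+Sg1` has no tense tag), not a rule confirmed for every mood.

```python
from typing import Optional


def finite_verb_analysis(lemma: str, voice: str, mood: str, person: str,
                         tense: Optional[str] = None) -> str:
    """Join tags in the order V, voice, mood, (tense), person."""
    tags = ["V", voice, mood]
    if tense is not None:
        # Assumption: tense is only marked after +Ind, as in the examples.
        if mood != "Ind":
            raise ValueError("this sketch only places tense after +Ind")
        tags.append(tense)
    tags.append(person)
    return "+".join([lemma] + tags)


print(finite_verb_analysis("kutoa", "Act", "Ind", "Sg1", tense="Prt"))
print(finite_verb_analysis("kutoa", "Act", "Pot", "Sg1"))
```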
Negation and verbs
- +ConNeg : Connegative form
- kudo: kutoa+V+Ind+Prs+ConNeg
The suitable negation verbs have sub-analysis that can be matched to negated
- +Neg : Negation verb
- ei: ei+V+Neg+Act+Sg3
Infinite verb forms
-
+InfA : A infinitive (first)
-
+InfE : E infinitive (second)
-
+InfMa : MA infinitive (third)
-
+Der/minen : minen derivation (fourth)
- +Der/maisilla : maisilla derivation (fifth)
Infinitive analysis comes after voice, followed by nominal analyses:
-
kutoa: kutoa+V+Act+InfA+Sg+Lat
-
kutoessa: kutoa+V+Act+InfE+Sg+Ine
-
kutomatta: kutoa+V+Act+InfMa+Sg+Abe
-
kutominen: kutoa+V+Der/minen+Sg+Nom
- kutomaisillani: kutoa+V+Act+Der/maisilla+PxSg1
Participles
-
+PrfPrc : NUT participle (first, perfect)
-
+PrsPrc : VA participle (second, present)
-
+NegPrc : Negation participle
- +AgPrc : Agent participle
Participle analyses are right after voice, followed by adjectival analyses:
-
kutonut: kutoa+V+Act+PrfPrc+Sg+Nom
-
kutova: kutoa+V+Act+PrsPrc+Sg+Nom
-
kutomaton: kutoa+V+NegPrc+Sg+Nom
- kutomani: kutoa+V+AgPrc+Sg+Nom+PxSg1
There are a number of implementations that mix up MA infinitives and Agent
Comparison
-
+Comp : Comparative
- +Superl : Superlative (was +Sup, now standardised)
The comparison analysis occupies derivation spot, after POS:
-
nopeampi: nopea+A+Comp+Sg+Nom
- nopein: nopea+A+Superl+Sg+Nom
Enclitic focus particles
-
+Foc/han : -hAn; affirmative etc.
-
+Foc/kaan : -kAAn; "neither"
-
+Foc/kin : -kin; "also"
-
+Foc/pa : -pA; "indeed"
-
+Foc/s : -s; polite?
-
+Foc/ka : -kA; "nor"
- +Qst : -kO: Question focus
Derivation
-
+Der/sti : Common derivations: A→Adv (in A manner)
-
+Der/ja : V→N (doer of V)
-
+Der/inen : N→A (containing N)
-
+Der/lainen : N→A (style of N)
-
+Der/tar : N→N (feminine N)
-
+Der/llinen : N→N (consisting of N)
-
+Der/ton : N→A (without N)
-
+Der/tse : N→Adp (via N)
-
+Der/vs : A→N (quality of A)
-
+Der/u : V→N (act of V)
-
+Der/ttain : N→Adv (by amounts of N)
-
+Der/ttaa : V→V (make someone do V)
-
+Der/tattaa : V→V (make someone do V; "first indirection")
-
+Der/tatuttaa : V→V (make someone do V; "second indirection")
-
+Der/uus : A→N (A-ness)
- +Der/nti : V→N (regular derivation from all but 2 -da/-dä V)
Usage
-
+Err/Orth : Sub-standard usage
-
+Err/Hyph :
-
+Err/Lex :
- +Err/SpaceCmp :
-
+Use/Marg : Marginal
-
+Use/Rare : Rare
-
+Use/NG : Do not generate
-
+Use/Hyphen : With hyphens
- +Use/NoHyphens : Without hyphens
- +Use/PMatch means that the following is only used in the analyser feeding the disambiguator. This is missing.
- +Use/-PMatch :
-
+Use/-Spell :
-
+Use/Arch :
- +Use/SpellNoSugg :
Usage tags are pushed wherever appropriate:
- nallein: nalle+N+Pl+Gen+Use/Rare
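Because usage tags can appear anywhere in the tag sequence, a consumer has to look for them by prefix rather than by position. The Python sketch below collects `Use/` tags and drops analyses marked `+Use/NG` ("Do not generate"). The helper names are hypothetical, and the second input string in the usage lines is an invented illustration, not analyser output.

```python
from typing import Iterable, List


def usage_tags(analysis: str) -> List[str]:
    """Return all tags of an analysis that start with 'Use/'."""
    return [t for t in analysis.split("+")[1:] if t.startswith("Use/")]


def drop_non_generated(analyses: Iterable[str]) -> List[str]:
    """Keep only analyses that are not marked with +Use/NG."""
    return [a for a in analyses if "Use/NG" not in usage_tags(a)]


print(usage_tags("nalle+N+Pl+Gen+Use/Rare"))
# Hypothetical input: the +Use/NG string is made up for illustration.
print(drop_non_generated(["talo+N+Sg+Nom", "talo+N+Sg+Nom+Use/NG"]))
```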
Homonym tags
-
+v1 :
- +v2 :
Dialects
-
+Dial : any unclassified dialect
-
+Dial/Standard : standard spoken Finnish
-
+Dial/East : Eastern dialects
-
+Dial/West : Western dialects
-
+Dial/Southwest : South-western dialects
-
+Dial/Häme : Tavastian dialects
-
+Dial/Eteläpohjalaiset : South Ostrobothnian dialects
-
+Dial/Keskipohjalaiset : Middle Ostrobothnian dialects
-
+Dial/Peräpohjalaiset : North Ostrobothnian dialects
-
+Dial/North : North Finnish dialects
-
+Dial/Savo : Savonian dialects
- +Dial/Southeast : South-eastern dialects
Tags for language of unassimilated name
-
+OLang/ENG :
-
+OLang/eng : is a typo, FIX
-
+OLang/FIN :
-
+OLang/NNO :
-
+OLang/NOB :
-
+OLang/RUS :
-
+OLang/SMA :
-
+OLang/SME :
-
+OLang/SWE :
- +OLang/UND :
Others
-
+Cmp - Dynamic compound. This tag should always be part
-
+Cmp/Hyph :
- +CmpNP/None :
The word and morpheme boundaries are used to limit the effective range of
-
##: Lexical boundary
-
# word boundary
-
> inflectional morph boundary
-
» derivational morph boundary
- _ weak boundary
Flag diacritics
@P.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
@D.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
@C.NeedNoun@ | (Dis)allow compounds with verbs unless nominalised |
@C.ErrOrth@ |
@D.ErrOrth.ON@ |
@P.ErrOrth.ON@ |
For languages that allow compounding, the following flag diacritics are needed
@P.CmpFrst.FALSE@ | Require that words tagged as such only appear first |
@D.CmpPref.TRUE@ | Block such words from entering ENDLEX |
@P.CmpPref.FALSE@ | Block these words from making further compounds |
@D.CmpLast.TRUE@ | Block such words from entering R |
@D.CmpNone.TRUE@ | Combines with the next tag to prohibit compounding |
@U.CmpNone.FALSE@ | Combines with the prev tag to prohibit compounding |
@P.CmpOnly.TRUE@ | Sets a flag to indicate that the word has passed R |
@D.CmpOnly.FALSE@ | Disallow words coming directly from root. |
Use the following flag diacritics to control downcasing of derived proper
@U.Cap.Obl@ | Allowing downcasing of derived names: deatnulasj. |
@U.Cap.Opt@ | Allowing downcasing of derived names: deatnulasj. |
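The flag diacritics listed above all have the shape `@OP.FEATURE.VALUE@` or `@OP.FEATURE@`, where the operator is a single letter. As a rough illustration, the Python sketch below parses that shape with a regular expression and strips flags from a string. The names `FLAG_RE`, `parse_flag` and `strip_flags` are hypothetical, the operator set `P N R D C U` is assumed from the general flag diacritic convention, and the sketch does not evaluate the flags.

```python
import re
from typing import Optional, Tuple

# Operator letter, feature name, and an optional value, wrapped in '@'.
FLAG_RE = re.compile(r"@([PNRDCU])\.([^.@]+)(?:\.([^.@]+))?@")


def parse_flag(flag: str) -> Tuple[str, str, Optional[str]]:
    """Split a flag diacritic into (operator, feature, value or None)."""
    match = FLAG_RE.fullmatch(flag)
    if match is None:
        raise ValueError(f"not a flag diacritic: {flag!r}")
    return match.group(1), match.group(2), match.group(3)


def strip_flags(text: str) -> str:
    """Remove every flag diacritic from a string."""
    return FLAG_RE.sub("", text)


print(parse_flag("@P.NeedNoun.ON@"))
print(parse_flag("@C.NeedNoun@"))
```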
The start of the dictionary Root
Parts-of-speech examples:
-
talo: talo+N+Sg+Nom (Eng. # house)
-
nopea: nopea+A+Sg+Nom (Eng. # fast)
- kutoa: kutoa+V+Act+InfA+Sg+Lat (Eng. # to knit)