rus
Free and Open source Russian analyser giella-rus
- Authors
- Divvun and Giellatekno teams, community members
- Software version
- 2012
- Documentation license
- GNU GFDL
- SVN Revision
- $Revision: 68217 $
- SVN Date
- $Date: 2013-01-16 11:31:33 +0200 (Wed, 16 Jan 2013) $
giella-rus
This is a free and open source morphological description of Russian.
Russian tags
Stressed vowels
- а́ е́ ё́ и́ о́ у́ ы́ э́ ю́ я́ Primary stress (lower)
- а̀ ѐ ё̀ ѝ о̀ у̀ ы̀ э̀ ю̀ я̀ Secondary stress (lower)
- А́ Е́ Ё́ И́ О́ У́ Ы́ Э́ Ю́ Я́ Primary stress (upper)
- А̀ Ѐ Ё̀ Ѝ О̀ У̀ Ы̀ Э̀ Ю̀ Я̀ Secondary stress (upper)
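When stressed output has to be compared with ordinary unstressed text, the stress marks can be stripped. The following is a minimal sketch, assuming that primary stress is encoded as the combining acute accent (U+0301) and secondary stress as the combining grave accent (U+0300), possibly in precomposed form. The function name `strip_stress` is hypothetical and not part of giella-rus.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent, assumed to mark primary stress
GRAVE = "\u0300"  # combining grave accent, assumed to mark secondary stress


def strip_stress(text: str) -> str:
    """Remove stress marks while keeping letters such as ё and й intact."""
    # Decompose first, so that precomposed letters like ѐ or ѝ expose their accent.
    decomposed = unicodedata.normalize("NFD", text)
    unstressed = "".join(ch for ch in decomposed if ch not in (ACUTE, GRAVE))
    # Recompose, so that е + diaeresis becomes ё again and и + breve becomes й.
    return unicodedata.normalize("NFC", unstressed)


if __name__ == "__main__":
    print(strip_stress("молоко\u0301"))
```

Normalizing to NFD before filtering and back to NFC afterwards is a design choice: it removes only the two stress accents and leaves the diaeresis of ё and the breve of й untouched.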
Symbols that need to be escaped on the lower side (towards twolc), copied from sme:
Markers
- ¹ ² ³ ⁴ ⁵ ⁶ ⁷ ⁸ ⁹ ⁰ = Used to enumerate homonymous lemmas
- %> = End-of-stem marker (nominals)
- %< = End-of-stem marker (verbs)
- %^F = Fleeting vowel marker
- %^o %^O = Verbal prefix fleeting vowel
- %^G = Irregular GenPl marker (to keep ов/ев on n stems, e.g. ов%^G)
- %^Z = Zero ending (resolves to 0/й/ь)
- %^M = Verb stem mutation
- %^D = archiphoneme for д~жд alternation in past passive participles
- %^T = archiphoneme for т~щ alternation in verbs
- %^d = archiphoneme for verb stems with -дший past active participles (-сти 7 (-д-))
- %^t = archiphoneme for verb stems with -тший past active participles (-сти 7 (-т-))
- %^R = archiphoneme for бороть and пороть
- %^U = Imperative ending (unstressed)
- %^S = Imperative ending (stressed)
- %^P = Attenuative comparative prefix: по~
- %^A = Attenuative comparative prefix: по~
- %^Y = Verbal prefix вы́-
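The superscript digits that enumerate homonymous lemmas can be separated from the lemma itself when lemmas are looked up elsewhere. The sketch below assumes that the digits appear as a suffix of the lemma; the example lemma and the helper name `split_homonym` are hypothetical.

```python
from typing import Optional, Tuple

SUPERSCRIPTS = "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079"  # ⁰¹²³⁴⁵⁶⁷⁸⁹
TO_ASCII = str.maketrans(SUPERSCRIPTS, "0123456789")


def split_homonym(lemma: str) -> Tuple[str, Optional[int]]:
    """Split a lemma into its bare form and its homonym number, if any."""
    bare = lemma.rstrip(SUPERSCRIPTS)
    suffix = lemma[len(bare):]
    if not suffix:
        return lemma, None  # no homonym index present
    return bare, int(suffix.translate(TO_ASCII))


if __name__ == "__main__":
    # Hypothetical entry: a lemma marked as the first of several homonyms.
    print(split_homonym("замок\u00b9"))
```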
POS
- +A = Adjective
- +Abbr = Abbreviation
- +Adv = Adverb
- +CC = Coordinating conjunction
- +CS = Subordinating conjunction
- +Det = Determiner
- +Interj = Interjection
- +N = Noun
- +Num = Numeral
- +Paren = Parenthetical (вводное слово)
- +Pcle = Particle
- +Po = Postposition (ради is the only postposition)
- +Pr = Preposition
- +Pron = Pronoun
- +V = Verb
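To show how these tags are typically consumed, here is a minimal sketch that splits an analysis string into a lemma and its tags and picks out the POS tag. It assumes the common convention of a lemma followed by `+`-prefixed tags; the analysis string in the example is hypothetical and is not actual output of giella-rus, and the names `parse_analysis` and `pos_of` are illustrative only.

```python
from typing import List, NamedTuple, Optional

# POS tags as listed in the section above.
POS_TAGS = {
    "+A", "+Abbr", "+Adv", "+CC", "+CS", "+Det", "+Interj", "+N",
    "+Num", "+Paren", "+Pcle", "+Po", "+Pr", "+Pron", "+V",
}


class Analysis(NamedTuple):
    lemma: str
    tags: List[str]


def parse_analysis(analysis: str) -> Analysis:
    """Split 'lemma+Tag1+Tag2' into the lemma and a list of '+Tag' strings."""
    # Assumption: the lemma itself contains no '+' character.
    lemma, *tags = analysis.split("+")
    return Analysis(lemma, ["+" + tag for tag in tags])


def pos_of(analysis: Analysis) -> Optional[str]:
    """Return the first tag that is a known POS tag, or None."""
    return next((tag for tag in analysis.tags if tag in POS_TAGS), None)


if __name__ == "__main__":
    parsed = parse_analysis("читать+V+Impf+TV+Inf")  # hypothetical analysis string
    print(parsed.lemma, pos_of(parsed))
```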
Sub-POS
- +All = All: весь
- +Coll = Collective numerals
- +Def = Definite
- +Dem = Demonstrative
- +Indef = Indefinite: кто-то, кто-нибудь, кто-либо, кое-кто, etc.
- +Interr = Interrogative: кто, что, какой, ли, etc.
- +Neg = Negative: никто, некого, etc.
- +Pers = Personal
- +Pos = Possessive, e.g. его, наш
- +Prcnt = Percent
- +Prop = Proper
- +Recip = Reciprocal: друг друга
- +Refl = Pronoun себя, possessive свой
- +Rel = Relativizer, e.g. который, где, как, куда, сколько, etc.
- +Symbol = independent symbols in the text stream, like £, €, ©
Verbal MSP
- +Impf +Perf = Imperfective, perfective
- +IV +TV = Intransitive, transitive (Zaliznjak does not mark trans-only, so transitive verbs all have both TV and IV)
- +Inf +Imp = Infinitive, imperative (imperatives: 2nd person = читай, 1st person = прочитаем)
- +Pst +Prs +Fut = Past, present, future
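- +Sg1 +Sg2 +Sg3 = 1st, 2nd, 3rd person singular
- +Pl1 +Pl2 +Pl3 = 1st, 2nd, 3rd person plural
- +PrsAct +PrsPss = Present active and present passive participles (+PrsAct+Adv and +PstAct+Adv are used for the verbal adverbs)
- +PstAct +PstPss = Past active and past passive participles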
- +Pass = Passive
- +Imprs = Impersonal (cannot have explicit subject)
- +Lxc = Lexicalized (for participial forms)
- +Der = Derived (for participial forms)
- +Der/PrsAct = Derived (for participial forms)
- +Der/PrsPss = Derived (for participial forms)
- +Der/PstAct = Derived (for participial forms)
- +Der/PstPss = Derived (for participial forms)
Nominal MSP
- +Msc +Fem +Neu +MFN = grammatical gender, +MFN = gender unspecifiable (pl tantum)
- +Inan +Anim +AnIn = animacy (+AnIn = ambivalent animacy for non-accusative modifiers)
- +Sem/Sur +Sem/Pat = Surname (фамилия), Patronymic
- +Sem/Ant +Sem/Alt = Anthroponym/Given name, Other
- +Sg +Pl = number
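- +Nom +Acc +Gen = Nominative, accusative, genitive
- +Loc +Dat +Ins = Locative (prepositional), dative, instrumental
- +Loc2 +Gen2 +Voc = Second locative, second genitive (partitive), vocative
- +Count = Count (for человек/людей or лет/годов, etc.; also шага́/шара́/часа́, etc.)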
- +Ord = Ordinal
- +Cmpar = Comparative
- +Sint = Synthetic comparative is possible, e.g. старее
- +Pred = "Predicate", also used for short-form adjectives
- +Cmpnd = "Compound", used for compounding adjectives, such as русско-английский
- +Att = Attenuative comparatives like получше, поновее, etc.
Punctuation
- +PUNCT = Punctuation
- +CLB = Clause boundary (TODO: SENT vs. CLB, which is which?)
- +SENT = Clause boundary
- +COMMA = Comma
- +DASH = Dash
- +LQUOT = Left quotation
- +RQUOT = Right quotation
- +QUOT = "Ambidextrous" quotation
- +LPAR = Left parenthesis/bracket
- +RPAR = Right parenthesis/bracket
- +LEFT = Left parenthesis/bracket/quote/etc.
- +RIGHT = Right parenthesis/bracket/quote/etc.
Other tags
- +Prb = Problematic: затруднительно ("difficult") - предположительно ("presumably") - нет ("no")
- +Fac = Facultative
- +PObj = Object of preposition (prothetic н: него нее них)
- +Epenth = epenthesis on prepositions (о~об~обо or в~во)
- +Leng = Lengthened доброй~доброю (marks less-canonical wordform that has more syllables)
- +Elid = Elided (Иванович~Иваныч, новее~новей, чтобы~чтоб, или~иль, коли~коль)
- +Use/NG = Do not generate (used for apertium, etc.)
- +Use/Obs = Obsolete
- +Use/Ant = Antiquated "устаревшее"
- +Err/Orth = Substandard
- +Err/L2_ii = L2 error: Failure to change ending ие to ии in +Sg+Loc or +Sg+Dat, e.g. к Марие, о кафетерие, о знание
- +Err/L2_Pal = L2 error: Palatalization: failure to place soft-indicating symbol after soft stem, e.g. земла (compare земля)
- +Err/L2_FV = L2 error: Presence of fleeting vowel where it should be deleted, e.g. отеца (compare отца)
- +Err/L2_NoFV = L2 error: Lack of fleeting vowel where it should be inserted, e.g. окн (compare окон)
- +Err/L2_SRo = L2 error: Failure to change о to е after hushers and ц, e.g. Сашой (compare Сашей)
- +Err/L2_SRy = L2 error: Failure to change ы to и after hushers and velars, e.g. книгы (compare книги)
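The `+Err/...` and `+Use/...` tags lend themselves to simple filtering of analyses. The following sketch assumes analyses arrive as `lemma+Tag+Tag` strings; the example string is hypothetical, and the helper names are illustrative rather than part of any Giella tool.

```python
from typing import List


def tags_of(analysis: str) -> List[str]:
    """Return the '+Tag' parts of a 'lemma+Tag1+Tag2' string."""
    return ["+" + tag for tag in analysis.split("+")[1:]]


def l2_errors(analysis: str) -> List[str]:
    """List the L2 error tags (those starting with +Err/L2_) on an analysis."""
    return [tag for tag in tags_of(analysis) if tag.startswith("+Err/L2_")]


def allowed_in_generation(analysis: str) -> bool:
    """An analysis tagged +Use/NG is marked as 'do not generate'."""
    return "+Use/NG" not in tags_of(analysis)


if __name__ == "__main__":
    # Hypothetical analysis of the learner form земла (compare земля).
    example = "земля+N+Fem+Inan+Sg+Nom+Err/L2_Pal"
    print(l2_errors(example), allowed_in_generation(example))
```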
Key lexicon
- LEXICON Root
- Abbreviation ;
- :%^P%^A Adjective ;
- Adverb ;
- Comparative ;
- Conjunction ;
- Interjection ;
- Noun ;
- Numeral ;
- Parenthetical ;
- Particle ;
- Predicative ;
- Preposition ;
- Pronoun ;
- Verb ;
- Propernoun ;
- Punctuation ;
- Symbols ;
- LexicalizedParticiple ;
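Each line of the Root lexicon is a lexc entry: an optional `upper:lower` pair followed by a continuation lexicon name and a semicolon. The sketch below reads such entries into tuples. It is a deliberately simplified reader, assuming no escaped spaces or colons, no comments, and no info strings; `parse_entry` is a hypothetical helper, not part of the lexc toolchain.

```python
from typing import Tuple


def parse_entry(line: str) -> Tuple[str, str, str]:
    """Parse a simplified lexc entry into (upper, lower, continuation)."""
    body = line.strip()
    if not body.endswith(";"):
        raise ValueError("lexc entries end with ';': %r" % line)
    parts = body[:-1].split()
    if len(parts) == 1:
        # Only a continuation lexicon, with empty upper and lower sides.
        return "", "", parts[0]
    if len(parts) == 2:
        pair, continuation = parts
        upper, sep, lower = pair.partition(":")
        if not sep:
            lower = upper  # without a colon, both sides are identical
        return upper, lower, continuation
    raise ValueError("unsupported entry in this simplified sketch: %r" % line)


if __name__ == "__main__":
    for entry in ["Abbreviation ;", ":%^P%^A Adjective ;", "Noun ;"]:
        print(parse_entry(entry))
```

In the Adjective entry the upper side is empty and the lower side carries the markers `%^P%^A`, so those markers appear only on the lower side, towards twolc.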