140627
Meeting 27.6.2014
Present: Heli, Heiki-Jaan, Jaak, Sjur, Trond
- Status
- Tag discussion
- Plan a physical meeting
- Dictionary
- Oahpa
- Plan
- Next meeting
Status
We have a taglist draft.
FST work has been slower due to midsummer etc. With --enable-oahpa some
The xfst/hfst instructions should be written in parallel in the Makefiles.
Tag discussion
- script in the new tags in all lexc files, and in the yaml etc. test files. (Heiki-Jaan, then Heli)
- Then have a newtag to plamk tag routine in src/tagsets/plamk.relabel (Sjur)
- Then have a newtag to apertium tag routine in src/tagsets/apertium.relabel (apertium has <n> for +N etc) - this is already in place
Heli would be happy to have this within a week.
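To illustrate the third step, here is a minimal sketch of turning a +Tag analysis string into Apertium-style angle-bracket tags. Only the +N to <n> correspondence comes from these notes. The function name, the lowercase fallback for all other tags, and the example word are assumptions for illustration, and this is not the format of the actual relabel file.

```python
# Only "+N" -> "<n>" is taken from the notes. The lowercase fallback for
# every other tag is an assumption made for this illustration.
EXPLICIT = {"N": "n"}


def to_apertium(analysis):
    """Convert 'lemma+Tag1+Tag2' into 'lemma<tag1><tag2>'."""
    lemma, *tags = analysis.split("+")
    return lemma + "".join("<%s>" % EXPLICIT.get(t, t.lower()) for t in tags)


print(to_apertium("maja+N+Sg+Nom"))  # maja<n><sg><nom>
```

A real conversion would list every tag pair explicitly instead of relying on a fallback.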
Plan a physical meeting
Not this time?
Dictionary
Heiki-Jaan did the work already, we have access to the 16669 lemma est-nor/nor-est,
me+aatom ge+-i tp+2e nn+atom gn+-et
me+aatomi+elektri+jaam ge+-a tp+22u nn+atomkraftverk gn+-et
--
Oahpa
Estonian Oahpa (Leksa) is online: http://testing.oahpa.no/eesti/leksa/
Plan
- Dictionary: Sjur, Trond
- Tags, as decided, next week (important for Oahpa progress)
- Oahpa: when tags are done
- FST: Work continues, mainly by Jaak
- CG: the eternal question
Next meeting
August 12th 1300 Swedish time?
Appendix: The notes from the tag discussion
Tag conversion table:
Old tag   New tag     Comment
+S        +N
+H        +N+Prop
+A        +A
+Num      +Num+Card
+Ord      +Num+Ord
+Pron     +Pron
+V        +V
(... see the documentation)
+prefix   +Pref       Stems used as a prefix. = compound?
+suffix   +Suf        Similar for suffixes?
G1        G1          These are multichar symbols for CG etc., for twolc.
(... see the documentation)
Heiki's suggestions for tag conversions:
+Num      +Num+Card
+Ord      +Num+Ord
+X        +Adv
+adit     +Ill
+G        +N+Gen
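A minimal sketch of how such conversions could be scripted, assuming the pairs read as old tag followed by new tag. The function name and the example analyses are hypothetical. Tags are matched as whole tokens so that +S does not also hit +Sg. The pass must be run only once, since some new tags (e.g. +Num+Ord) contain an old tag.

```python
import re

# Old tag -> new tag, as read from the conversion lists in these notes.
TAG_MAP = {
    "+S": "+N",
    "+H": "+N+Prop",
    "+Num": "+Num+Card",
    "+Ord": "+Num+Ord",
    "+X": "+Adv",
    "+adit": "+Ill",
    "+G": "+N+Gen",
    "+prefix": "+Pref",
    "+suffix": "+Suf",
}

# One tag = a plus sign followed by everything up to the next plus or space.
TAG_RE = re.compile(r"\+[^+\s]+")


def convert_tags(analysis):
    """Replace each old tag by its new counterpart in a single pass."""
    return TAG_RE.sub(lambda m: TAG_MAP.get(m.group(0), m.group(0)), analysis)


# Hypothetical input strings, for illustration only.
print(convert_tags("maja+S+Sg+Nom"))      # maja+N+Sg+Nom
print(convert_tags("kolmas+Ord+Sg+Nom"))  # kolmas+Num+Ord+Sg+Nom
```

Lexc files also use + in other contexts, so a real script would need to restrict the replacement to the tag positions.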
Bearing in mind that ordinals are adjectives, we would thus have:
+Num      +Num+Card
+Ord      +A+Ord
- But ordinals do not have comparative and superlative. Do all other adjectives have those?
- You could form a comparative for a regular noun as well: "majam" -- more house-like.
- Yes, but are there regular adjectives (nouns) that do not compare? Like many Swedish & Norwegian adjectives.
- Can you give an example of a Norwegian adjective?
- eksemplarisk - *eksemplariskere - *eksemplariskest
- Yes, actually there exist similar adjectives in Estonian, too.
- Exactly, and among them the ordinals ;)
kaksi     kaksi+Num+Card+Sg+Nom
toinen    toinen+Num+Ord+Sg+Nom
kolmas    kolmas+Num+Ord+Sg+Nom ==> +A+Ord
pari      pari+Num+Card+Sg+Nom
+N+Prop
Compare with http://nl.ijs.si/ME/Vault/V3/msd/html/
Tag sequence: +MainPOS +SubPOS
+prefix +Cmp
In the working document in est/doc/
When dust has settled, in root.lexc:
newtag !!≈ * @CODE@ comments plamktag !! More comments.
+prefix, +suffix
kultur-       kultuvra+Sem/Domain+N+Cmp/SgNom+Cmp/SplitR
kultuvra      kultuvra+Sem/Domain+N+Sg+Nom
kultur        kultur +?
kulturheasta  kultuvra+Sem/Domain+N+Cmp/SgNom+Cmp#heasta+Sem/Ani+Sem/Veh+N+Sg+Nom
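As an illustration of how the compound analysis above is structured, the following sketch splits an analysis string at the # boundary and separates each part into lemma and tags. The function name is hypothetical, and the sketch assumes that # only ever marks a compound boundary.

```python
def split_compound(analysis):
    """Split 'lemma+Tags#lemma+Tags' into a list of (lemma, [tags]) pairs."""
    parts = []
    for chunk in analysis.split("#"):
        lemma, *tags = chunk.split("+")
        parts.append((lemma, tags))
    return parts


example = "kultuvra+Sem/Domain+N+Cmp/SgNom+Cmp#heasta+Sem/Ani+Sem/Veh+N+Sg+Nom"
for lemma, tags in split_compound(example):
    print(lemma, tags)
```

The first part carries the +Cmp tags, the last part carries the inflection of the whole compound.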
Verbs
The first question in determining the tagset for verb categories is: what is
Below is the set of all the possible combinations of morphological categories
tegumood (voice), aeg (tense), kõneviis (mood), isik (person), arv (number), kõnelaad (aspect)
The categories are given in the order in which the allomorphs (if they can be
- voice: personal vs. impersonal (0-morph vs. t/d (aiu)), eg. elaks vs
- tense: present vs. past (0-morph vs. s/si/nu), e.g. elan vs. elasin;
- mood: indicative vs. conditional vs imperative vs quotative
- person+number: notice that in personal present imperative 3rd person
- aspect: affirmative vs. negative. The aspect manifests itself via
Below, brackets are used to list the set of non-specified alternative values.
personal present indic 1s affirmative  elan
personal present indic 2s affirmative  elad
personal present indic 3s affirmative  elab
personal present indic 1p affirmative  elame
personal present indic 2p affirmative  elate
personal present indic 3p affirmative  elavad
personal present indic (1s/2s/3s/1p/2p/3p) negative  ela, pole
personal present condit 1s affirmative  elaksin
personal present condit 2s affirmative  elaksid
personal present condit (1s/2s/3s/1p/2p/3p) (affirmative/negative)  elaks
personal present condit 1p affirmative  elaksime
personal present condit 2p affirmative  elaksite
personal present condit 3p affirmative  elaksid
personal present condit 1s negative  poleksin
personal present condit 2s negative  poleksid
personal present condit (1s/2s/3s/1p/2p/3p) negative  poleks
personal present condit 1p negative  poleksime
personal present condit 2p negative  poleksite
personal present condit 3p negative  poleksid
personal present imper 2s (affirmative/negative)  ela
personal present imper 3 (singular/plural) (affirmative/negative)  elagu
personal present imper 1p (affirmative/negative)  elagem
personal present imper 2p (affirmative/negative)  elage
personal present imper 2s negative  ära
personal present imper 3 (singular/plural) negative  ärgu
personal present imper 1p negative  ärgem
personal present imper 2p negative  ärge
personal present quotat (affirmative/negative)  elavat
personal present quotat negative  polevat
personal past indic 1s affirmative  elasin
personal past indic 2s affirmative  elasid
personal past indic 3s affirmative  elas
personal past indic 1p affirmative  elasime
personal past indic 2p affirmative  elasite
personal past indic 3p affirmative  elasid
personal past indic (1s/2s/3s/1p/2p/3p) negative  polnud
personal past condit 1s affirmative  elanuksin
personal past condit 2s affirmative  elanuksid
personal past condit (1s/2s/3s/1p/2p/3p) (affirmative/negative)  elanuks
personal past condit 1p affirmative  elanuksime
personal past condit 2p affirmative  elanuksite
personal past condit 3p affirmative  elanuksid
personal past condit (1s/2s/3s/1p/2p/3p) negative  polnuks
personal past quotat (affirmative/negative)  elanuvat
personal past quotat negative  polnuvat
personal present partic  elav
personal past partic  elanud
personal supine abessive  elamata
personal supine elative  elamast
personal supine illative  elama
personal supine inessive  elamas
personal supine translative  elamaks
impersonal present indic affirmative  elatakse
impersonal present indic negative  elata
impersonal present condit (affirmative/negative)  elataks
impersonal present imper (affirmative/negative)  elatagu
impersonal present quotat (affirmative/negative)  elatavat
impersonal present partic  elatav
impersonal past indic affirmative  elati
impersonal past indic negative  poldud
impersonal past condit (affirmative/negative)  elatuks
impersonal past partic  elatud
impersonal supine  elatama
gerund  elades
infinit  elada
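The bracket notation in the list above can be expanded mechanically into fully specified combinations. The following is a minimal sketch, assuming that every bracketed token is a slash-separated set of alternatives and contains no spaces. The function name is hypothetical.

```python
import itertools
import re


def expand(label):
    """Expand bracketed alternatives, e.g. '(1s/2s)', into all combinations."""
    slots = []
    for token in label.split():
        m = re.fullmatch(r"\((.+)\)", token)
        # A bracketed token yields several alternatives, a plain token one.
        slots.append(m.group(1).split("/") if m else [token])
    return [" ".join(combo) for combo in itertools.product(*slots)]


for line in expand("personal present indic (1s/2s/3s/1p/2p/3p) negative"):
    print(line)
```

The call above yields six labels, one per person and number, all sharing the same word form.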
Exceptional cases:
present personal (affirmative/negative), 3 words: kuulukse, tunnukse, näikse negative ei ?
Analytical forms (olen elanud, olin elanud, oleksin elanud, ei olnud elanud,
Suggestions to simplify the way verb categories are combined above:
- Omit "personal" where person and number are specified anyway.
- If there is underspecification (affirmative/negative), leave it out.
- If there is underspecification (1s/2s/3s/1p/2p/3p), leave it out, but keep
- For personal present imper 3 (singular/plural), keep only personal present
- If a combination contains "affirmative", leave "affirmative" out. This will
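A minimal sketch of the suggestions that are fully stated above: omitting "personal" when person and number are given, dropping the underspecified (affirmative/negative), and dropping "affirmative". The suggestions about (1s/2s/3s/1p/2p/3p) and about imper 3 (singular/plural) are left out of the sketch, because what should be kept in those cases is not specified here. The function name is hypothetical.

```python
PERSON_NUMBER = {"1s", "2s", "3s", "1p", "2p", "3p"}
# Accept the one-f spelling as well, so differently spelled labels still match.
POLARITY_DROP = {
    "affirmative", "afirmative",
    "(affirmative/negative)", "(afirmative/negative)",
}


def simplify(label):
    """Apply the three fully stated simplification suggestions to one label."""
    tokens = [t for t in label.split() if t not in POLARITY_DROP]
    # "personal" is redundant once person and number are given explicitly.
    if any(t in PERSON_NUMBER for t in tokens):
        tokens = [t for t in tokens if t != "personal"]
    return " ".join(tokens)


print(simplify("personal present indic 1s affirmative"))           # present indic 1s
print(simplify("personal present quotat (affirmative/negative)"))  # personal present quotat
print(simplify("personal present condit 1s negative"))             # present condit 1s negative
```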