SamEst meeting 07.10.2014
Present: Fran, Heiki, Heli, Sjur, Trond
Jaak checked in a todo list in langs/est/doc/.
- Proper gt-style tagging of components (+Cmpnd-stuff)
- Ask HJK for improved verb paradigm -- after a month...?
- Decide how to (not to?) encode "defaults"
Improvements waiting to happen
- proper gt-style punctuation in FST
- stress information in FST
Stress information in the lexicon would considerably simplify many (two-level?) rules and make the network of continuation lexica more logical, but we can also manage without it.
Do this for one word from each continuation class for the open classes, and for all words in the closed classes, e.g. expand "minä", "sinä", etc.
This can be done using hfst-intersection and a regex like "t a l o <n> ?*"
1) Expand a single word form, e.g. "talo"
2) Replace the lemma with the Estonian equivalent.
3) Try to generate, and see where generation errors occur.
Heiki + Jaak to start on it.
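The three steps above can be sketched roughly as follows. This is only an illustration in Python, not the actual HFST pipeline: the tag strings, the Estonian lemma "maja" and the toy generator table are assumptions made up for the example, and in practice the lookup would go through the real Estonian generator transducer.

```python
# Toy stand-in for the Estonian generator (hypothetical data):
# maps an analysis string to a surface form.
TOY_GENERATOR = {
    "maja<n><sg><nom>": "maja",
    "maja<n><sg><gen>": "maja",
    "maja<n><pl><nom>": "majad",
}


def swap_lemma(analysis, new_lemma):
    """Replace the lemma (everything before the first tag) and keep the tags."""
    tag_start = analysis.find("<")
    if tag_start == -1:
        return new_lemma
    return new_lemma + analysis[tag_start:]


def find_generation_errors(analyses, new_lemma, generator):
    """Return the lemma-swapped analyses that the generator cannot produce."""
    swapped = [swap_lemma(a, new_lemma) for a in analyses]
    return [a for a in swapped if a not in generator]


# Step 1: analyses from expanding one Finnish word (illustrative tags).
finnish_paradigm = [
    "talo<n><sg><nom>",
    "talo<n><sg><gen>",
    "talo<n><pl><nom>",
    "talo<n><sg><ine>",
]

# Steps 2 and 3: swap in the Estonian lemma and list what fails to generate.
errors = find_generation_errors(finnish_paradigm, "maja", TOY_GENERATOR)
for analysis in errors:
    print("cannot generate:", analysis)
```

With the toy table, only the inessive analysis is reported, which is the kind of gap the exercise is meant to expose.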
eng-fin wordnet, eng-est X
Reliable: (eng-)fin-est 1-1-1
- (eng-)fin-est 1-1-m
- (eng-)fin-est 1-m-1
Word triplets ordered by the frequency of the English word.
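A minimal sketch of how the triplets could be labelled and ordered, assuming that "1-1-m" means one Finnish word with several Estonian candidates for the same English word, and "1-m-1" the reverse. The word pairs and frequency counts below are invented for illustration, and the "1-m-m" label is an addition for the case the notes do not mention.

```python
from collections import defaultdict


def classify_triplets(triplets, eng_freq):
    """Label each (eng, fin, est) triplet and sort by English frequency."""
    fin_by_eng = defaultdict(set)
    est_by_eng = defaultdict(set)
    for eng, fin, est in triplets:
        fin_by_eng[eng].add(fin)
        est_by_eng[eng].add(est)

    def pattern(eng):
        fin_part = "1" if len(fin_by_eng[eng]) == 1 else "m"
        est_part = "1" if len(est_by_eng[eng]) == 1 else "m"
        return f"1-{fin_part}-{est_part}"

    # Most frequent English words first; sorted() keeps ties in input order.
    ordered = sorted(triplets, key=lambda t: -eng_freq.get(t[0], 0))
    return [(t, pattern(t[0])) for t in ordered]


# Invented sample data, for illustration only.
triplets = [
    ("house", "talo", "maja"),
    ("dog", "koira", "koer"),
    ("dog", "koira", "peni"),
    ("boy", "poika", "poiss"),
    ("boy", "poju", "poiss"),
]
eng_freq = {"house": 500, "boy": 400, "dog": 300}

result = classify_triplets(triplets, eng_freq)
for triplet, label in result:
    print(label, *triplet)
```

Only the triplets labelled 1-1-1 would go to the Reliable List.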
The Reliable List will be proofread at the end of this week.
The worker writes YES and NO
- YES from reliable
- YES from unreliable
- MAYBE from reliable
- MAYBE from unreliable
Cooperation with teachers
Heli and Kadri met some teachers of Estonian the day after the SamEst meeting in Tartu.
The teachers will cooperate and send material to put into Oahpa (raw material for Morfa-C templates, new exercise types in Morfa-C). The user interface for Leksa, Morfa-S and Morfa-C was also discussed.
Remember to send in your tickets and receipts for travel reimbursements!
The next meeting
In two weeks - Tuesday, 21st Oct at 12.00 Norwegian time.