170113
Samest meeting 13.1.2017
Participants: Heiki, Heli, Jaak, Jack, Trond
Agenda
- Status reports
- Jorgal
- Final conference of the project
Status reports
Heiki will make a presentation at IWCLUL in St. Petersburg.
Jaak has started to work on parallel forms.
E.g.
väär+A+Sg+Par väära, väärat
väärat is not a correct word, according to estmorf.
vaba+A+Pl+Par vabasid, vabu
Both are correct.
exp-langs:
echo 'vabasid' | hfst-lookup analyser-gt-desc.hfstol
> vabasid vaba+A+Pl+Par+Use/Rare
vaba+A+Pl+Par
vaba+A+Pl+Par vabu
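A minimal sketch of how such lookup output could be collected per word form for further processing. It assumes the usual tab-separated hfst-lookup layout (input, analysis, weight) and that unknown words are marked with a trailing "+?"; the function name parse_lookup_output is hypothetical and not part of any existing tool.

```python
def parse_lookup_output(text):
    """Group analyses by input word from tab-separated lookup output.

    Assumed line layout: input<TAB>analysis<TAB>weight.
    Words whose only analysis ends in "+?" (unknown) get an empty list.
    """
    analyses = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate the results of different inputs
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # not a result line
        word, analysis = fields[0], fields[1]
        readings = analyses.setdefault(word, [])
        if not analysis.endswith("+?"):
            readings.append(analysis)
    return analyses
```

Called on the output for 'vabasid', this would give a dictionary with one key and the analysis vaba+A+Pl+Par+Use/Rare in its list.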
For Saami we have:
- generator-gt-desc.xfst: gives all
- generator-gt-norm.xfst: gives only normative
Tag-wise, sme has:
- no usage tag: always ok (-norm, -desc)
- Both no tag and ...
- Use/NG: in speller, not in MT output (-norm, -desc)
- (= Use/Rare)
- Err/Orth: understood, but neither speller-accepted nor MT-output (-desc)
est:
+Use/Rare !!= * {{@CODE@}}: ! norm, but rare: puusid:puu+Pl+Par+Use/Rare
+Use/Hyp !!= * {{@CODE@}}: ! norm, but so rare that norm is probably wrong: tiivasse:tiib+Sg+Ill+Use/Hyp
+Use/NotNorm !!= * {{@CODE@}}: ! not norm, but sometimes used: pöidlates:pöial+Pl+Ine+Use/NotNorm
+Use/CommonNotNorm !!= * {{@CODE@}}: ! not norm, and used more than norm: peeneid:peen+Pl+Par+Use/CommonNotNorm
+Use/Rare = +Use/NG (accept, but do not tell anyone)
+Use/Hyp = +Use/NG (accept, but do not tell anyone)
+Use/NotNorm = Err/Orth
+Use/CommonNotNorm = Err/Orth
Cf a corresponding page for North Saami:
Trond created a link to the documentation of exp-langs/est.
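The correspondence between the est usage tags and the sme-style tags listed above could be expressed as a simple lookup table. The following is only a sketch: the dictionary copies the four equations from these notes, and the function name map_usage_tags is hypothetical.

```python
# Correspondence as written in the notes (tags given without the leading "+").
EST_TO_SME_TAG = {
    "Use/Rare": "Use/NG",
    "Use/Hyp": "Use/NG",
    "Use/NotNorm": "Err/Orth",
    "Use/CommonNotNorm": "Err/Orth",
}


def map_usage_tags(analysis):
    """Rewrite est usage tags in an analysis string such as 'puu+Pl+Par+Use/Rare'."""
    lemma, *tags = analysis.split("+")
    # Tags not in the table (A, Pl, Par, ...) are passed through unchanged.
    mapped = [EST_TO_SME_TAG.get(tag, tag) for tag in tags]
    return "+".join([lemma, *mapped])
```

Splitting on "+" instead of doing a plain string replacement keeps each tag as a unit, so a tag is never rewritten as part of a longer one.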
What is needed for MT:
- MT-generation (fin2est) needs to output one form
- MT-analysis (est2fin) needs to accept both forms
- speller wants to accept the normative
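The three requirements above can be sketched as filters over analyses. This is an illustration only, assuming the est tags have already been translated to the sme-style tags (Use/NG, Err/Orth) as in the correspondence in these notes; all function names are hypothetical, and in practice the filtering would more likely be done when compiling the FSTs.

```python
def usage_tags(analysis):
    """Return the usage and error tags found in an analysis string."""
    return {t for t in analysis.split("+")[1:] if t.startswith(("Use/", "Err/"))}


def accept_in_mt_analysis(analysis):
    """MT analysis (est2fin) accepts every form, tagged or not."""
    return True


def accept_in_speller(analysis):
    """Assumption from the sme tag list: Use/NG is accepted, Err/Orth is not."""
    return "Err/Orth" not in usage_tags(analysis)


def pick_mt_output(candidates):
    """MT generation (fin2est) outputs one form: the first one without usage tags.

    candidates is a list of (analysis, surface form) pairs;
    None is returned if every candidate carries a usage tag.
    """
    for analysis, form in candidates:
        if not usage_tags(analysis):
            return form
    return None
```

For the vaba example, pick_mt_output would skip the tagged vabasid and return vabu, while both forms would still be accepted in analysis.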
1st thing for Heiki to do:
write a script that removes usage tags from some words (list sent by Heli).
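A rough sketch of what such a script might look like. The actual input format is not specified in these notes, so the example assumes entries of the shape form:lemma+tags (as in puusid:puu+Pl+Par+Use/Rare) and a plain set of word forms standing in for Heli's list; the function name strip_usage_tags is hypothetical.

```python
import re

# Matches one usage tag such as +Use/Rare or +Use/CommonNotNorm.
USAGE_TAG = re.compile(r"\+Use/\w+")


def strip_usage_tags(entries, words):
    """Remove +Use/... tags from entries whose surface form is in `words`.

    Entries without a ':' separator, and entries for other words,
    are returned unchanged.
    """
    cleaned = []
    for entry in entries:
        form, sep, analysis = entry.partition(":")
        if sep and form in words:
            analysis = USAGE_TAG.sub("", analysis)
            cleaned.append(f"{form}:{analysis}")
        else:
            cleaned.append(entry)
    return cleaned
```

With the word set {"puusid"}, the entry for puusid loses its +Use/Rare tag while tiivasse keeps +Use/Hyp.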
Testing of the FSTs
We have hand-disambiguated corpora which we can use for testing.
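One possible way to use such a corpus, sketched under assumptions: the gold data is taken to be a list of (token, gold analysis) pairs, and the analyser is represented by any function from a token to a set of analyses (here a stand-in; a real test would call the FST). The name evaluate is hypothetical.

```python
def evaluate(gold, analyse):
    """Check how often the gold analysis is among the analyser's readings.

    gold: list of (token, gold_analysis) pairs from the hand-disambiguated corpus.
    analyse: function mapping a token to a set of analysis strings.
    Returns (recall, misses), where misses lists the pairs not covered.
    """
    misses = [(token, reading) for token, reading in gold
              if reading not in analyse(token)]
    recall = 1 - len(misses) / len(gold) if gold else 0.0
    return recall, misses
```

The list of misses is probably more useful than the number itself, since each miss points at a missing or differently tagged reading.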
Jorgal
sme-nob would like to keep the jorgal page for working languages only
Suggestion:
- gtweb.uit.no/jorgal for sme2X only
- gtweb.uit.no/mt/testing for the full list (but no web translation)
- A page for Finnic MT, called, hmm
- gtweb.uit.no/mt/
- gtweb.uit.no/kaantaminen
- gtweb.uit.no/tolkimine
- something even more (less?) nifty
- Possibly, gtweb.uit.no/mt/ could be a jorgal-type all-in-one page (?)
Final conference
Project end: the final conference of the Norwegian-Estonian scientific cooperation program will take place on 21.–22.09.2017 in Laulasmaa http://www.laulasmaa.ee/en/.
We will be there, and give an interesting presentation.
Next meeting