160105
Meeting 5.1.16
Agenda
- Bidix improvement
- Progress
- Next meeting
Bidix improvement
We will improve bidix by:
- frequency sorting: Ciprian
- checking sme lemmas: Lene
- checking smj lemmas:
- in dev: sh bidix-sanity.sh > sanityoutput (Sandra)
- if same structure in sme and smj, one can remove the word pair instead of lexicalizing it
- adding word pairs from Lene's giza experiments in 2008: Lene makes a csv file, Sandra proofreads it
- checking sme verbs IV vs TV: done
- change back to Po, Pr and Adv in smj-lexc
- add transfer word pairs from termlists (sme-nob + nob-smj) and compare them to bidix:
- Give priority to sme-nob/nob-smj (where nob-smj = Kintel) pairs that are already in bidix, and move them over to a priority bidix list
- evaluate separately, after the seeded round, the sme-smj-bidix pairs not in sme-nob/nob-smj and the sme-nob/nob-smj pairs not in sme-smj-bidix
- For the sme-smj-bidix residue: give priority to sme-smj pairs where the smj word is already in the FST
- make missing list from sme-corpus: Trond
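The term-list comparison above can be sketched in a few lines of Python. This is only an illustration, not the project's tooling: the function names, the pair representation (tuples of lemmas) and the toy entries such as "sme_a" are made-up placeholders, and it assumes the term lists and bidix have already been extracted as plain word pairs.

```python
from collections import defaultdict


def pivot_pairs(sme_nob, nob_smj):
    """Compose sme-nob and nob-smj pairs through the shared nob word."""
    by_nob = defaultdict(set)
    for nob, smj in nob_smj:
        by_nob[nob].add(smj)
    return {(sme, smj) for sme, nob in sme_nob for smj in by_nob.get(nob, ())}


def split_against_bidix(pivoted, bidix):
    """Return (priority, bidix_only, termlist_only) as sets of (sme, smj) pairs."""
    bidix = set(bidix)
    return pivoted & bidix, bidix - pivoted, pivoted - bidix


def fst_first(pairs, smj_fst_lemmas):
    """Order pairs so that those whose smj lemma is already in the FST come first."""
    return sorted(pairs, key=lambda pair: (pair[1] not in smj_fst_lemmas, pair))


# Placeholder data, not real dictionary entries.
sme_nob = [("sme_a", "nob_a"), ("sme_b", "nob_b")]
nob_smj = [("nob_a", "smj_a"), ("nob_b", "smj_b")]
bidix = [("sme_a", "smj_a"), ("sme_c", "smj_c")]

pivoted = pivot_pairs(sme_nob, nob_smj)
priority, bidix_only, termlist_only = split_against_bidix(pivoted, bidix)
```

Here priority holds the pairs for the priority bidix list, while bidix_only and termlist_only are the two residues to evaluate separately; fst_first can then order the bidix residue so that pairs with smj already in the FST are checked first.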
Progress
Pronouns
Transfer rules
- use the same names for variables and sets
- document the relevant rules on the smX MT web page
- suggest to Francis that we hold regular meetings to discuss transfer rules and improvements
Next meeting
We will meet once a week (but the next meeting will be in the last week of January)