120103
Komi dictionary
Present: Ciprian, Jack, Trond
Agenda
- Status quo
- Webdict
- Fileformat
Status quo
- All a), б) ... removed
- The komrus is online on apertiumdict
Webdict
Inversion
Do we need rus-kom, fin-kom and eng-kom? I don't think so.
src>grep '___processing___' y_errors.txt | sort | uniq -c | sort -nr
3468 ___processing___ i
1806 ___processing___ gov
624 ___processing___ register
575 ___processing___ com
93 ___processing___ field
47 ___processing___ clarif
38 ___processing___ sns
20 ___processing___ range
11 ___processing___ val
11 ___processing___ att
We will have to update the "about the dictionary" files (Trond).
TODO
- For fin-kom and eng-kom: yes, and both directions
- For rus-kom: Cip will do a quick-and-dirty version
- The 10372 kom/src/Not-V_kvru-lex.xml entries should not be added
- Linguistic deadline: Wednesday 20.00 (21.00 Finnish time)
Non-Russian Cyrillic Unicode characters
We need input help for two special characters for the Komi alphabet
Yes, we need two letters in addition to the Russian repertoire:
- Unicode: ӧ, і (Ӧ, І)
- x0406, x0456
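For reference, a minimal Python sketch that prints the code point and Unicode name of each of the four extra letters. The code points for Ӧ/ӧ (x04E6, x04E7) are taken from the Unicode standard rather than from the meeting itself, and the names KOMI_EXTRA and describe are only illustrative.

```python
import unicodedata

# The four letters Komi needs beyond the Russian repertoire.
# Written as escapes so the precomposed forms are unambiguous.
KOMI_EXTRA = ["\u04e6", "\u04e7", "\u0406", "\u0456"]  # Ӧ ӧ І і


def describe(ch):
    """Return (character, 'xNNNN' code point, Unicode name)."""
    return ch, "x%04X" % ord(ch), unicodedata.name(ch)


for ch in KOMI_EXTRA:
    print(*describe(ch))
```

A check like this can help when testing an input method or keyboard layout: the precomposed ӧ should come out as a single code point, not as о followed by a combining diaeresis.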
Paula and komfin
Paula Kokkonen should edit Kom-Fin with XMLMind in
TODO
- Add paula to svn (Trond)
- Set up the machine
- Teach Paula to use relevant programs (Jack, Trond)
- XMLMind, XMLEditor
- svn (either command line or the Versions.app)
Fileformat
Files
24th oct (???!!)
Jack: Everything™ is in working_files. Delete private
Oct. Jack's recollection:
- lemma, stem, contlex, stem, content derived from komfineng,
- Derived content joined with kom-rus
kt/kom/src/Not-V_kvru-lex.xml
Principles
- All structure is xml structure
- text and nodes should not be sisters
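The "text and nodes should not be sisters" principle (no mixed content) could be checked mechanically. The following is only a sketch, assuming the dictionary files parse with Python's standard xml.etree.ElementTree; the function name mixed_content and the element names e and note in the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET


def mixed_content(root):
    """Return tags of elements that hold both real text and child elements."""
    bad = []
    for elem in root.iter():
        children = list(elem)
        if not children:
            continue  # text-only or empty elements are fine
        # Text directly inside elem sits before the first child (elem.text)
        # or after any child (child.tail).
        pieces = [elem.text] + [c.tail for c in children]
        if any(p and p.strip() for p in pieces):
            bad.append(elem.tag)
    return bad


ok = ET.fromstring("<e><lemma>Войвывса кытшсайса</lemma><note>x</note></e>")
bad = ET.fromstring("<e>Войвывса <note>x</note></e>")
print(mixed_content(ok))   # []
print(mixed_content(bad))  # ['e']
```

Whitespace-only text between child elements is ignored, so ordinary indentation does not count as a violation.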
All underscores in the lemma field should be replaced by a space.
<lemma>Войвывса кытшсайса</lemma> <!-- ws -->
<lemma>Войвывса_кытшсайса</lemma>
<lemma>Войвывса% кытшсайса</lemma>
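The underscore replacement in the lemma field could be scripted along these lines. This is a sketch, not the script actually used: it assumes the lemma sits as plain text inside a lemma element, and the wrapper elements r and e and the function name normalise_lemmas are hypothetical.

```python
import xml.etree.ElementTree as ET


def normalise_lemmas(root):
    """Replace every underscore in <lemma> text with a space, in place.

    Returns the number of lemma elements that were changed.
    """
    changed = 0
    for lemma in root.iter("lemma"):
        if lemma.text and "_" in lemma.text:
            lemma.text = lemma.text.replace("_", " ")
            changed += 1
    return changed


root = ET.fromstring("<r><e><lemma>Войвывса_кытшсайса</lemma></e></r>")
n = normalise_lemmas(root)
print(n, root.find("e/lemma").text)
```

Only the lemma text is touched; underscores elsewhere in the entry (for example in attribute values or file names) are left alone.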