Moving PLX And Hunspell To The New Infra
Present building
- spellernonrec
- plxnonrecder
  - derivations that are lexicalised?
- plxnonrec = ( spellernonrec - plxnonrecder ) .o. remove-hyphen
- POS specific fst (spellerPOS > spellerPOS-plx)
  - spellermwe - text file with multi-word PLX entries
  - spellerverbs.fst - used by both PLX and Hunspell
  - spellerverbs.txt - PLX variant is made with PLX tags, Hunspell variant is
  - spellernouns. ... (see verbs)
  - spelleradjs. ... (see verbs)
  - spellerabbrs. ... (see verbs) - rename fst to others
  - spellerproper. ... (see verbs)
  - spellernums. This is unioned with spellernouns.fst
- concatenate txt files (4.2.b etc above)
- build final speller files:
  - Hunspell - two variants, with and without compounding
  - PLX
In the POS build targets, abbr = other POS's.
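The plxnonrec definition in the list above can be illustrated on plain word sets. This is only a sketch: the real build works on transducers, where `-` is subtraction and `.o.` is composition, while here finite Python sets stand in for the fsts. It also assumes that remove-hyphen simply deletes every hyphen from a word, which the notes do not spell out. The function names and sample words are hypothetical.

```python
def remove_hyphen(word):
    # Assumption: remove-hyphen deletes every "-" in the word.
    return word.replace("-", "")


def plx_nonrec(speller_nonrec, plx_nonrec_der):
    # ( spellernonrec - plxnonrecder ) .o. remove-hyphen, on finite word sets:
    # subtract the derivation entries first, then strip hyphens from the rest.
    return {remove_hyphen(w) for w in speller_nonrec - plx_nonrec_der}


# Hypothetical sample data, not taken from any real lexicon.
speller_nonrec = {"foo", "foo-bar", "baz"}
plx_nonrec_der = {"baz"}
print(sorted(plx_nonrec(speller_nonrec, plx_nonrec_der)))
```

The order matters: the subtraction happens on the hyphenated forms, so an entry in plxnonrecder only removes a word if both lists spell it with the same hyphens.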
New dir layout
    tools/spellcheckers/listbased/ <= build common hunspell/plx files here
    hunspell/ <= build final hunspell here
    plx/ <= build final plx here
Targets for each dir above:
- listbased:
  - spellernonrec (l. 121 in the old Makefile.plx)
  - POS fst's
- hunspell:
  - spellerPOS-plx.fst > spellerPOS-plx.txt
  - hyph-remove
  - cat all plx.txt | sort
  - convert to hunspell using wordlist2hunspell
- plx:
  - spellerPOS-plx.fst > spellerPOS-plx.txt
  - print version > revsort > mklex > upload
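The concatenate, sort and convert steps of the hunspell targets above could look roughly like the following sketch. It is not the actual wordlist2hunspell script, whose behaviour is not described here. It is a hypothetical stand-in that writes only the simplest kind of Hunspell .dic file (an entry count on the first line, then one word per line) with no affix flags. Unlike a plain `sort`, it also drops duplicate words. The function names are assumptions.

```python
from pathlib import Path


def merge_wordlists(paths):
    """Collect the non-empty lines of all files, deduplicated and sorted."""
    words = set()
    for path in paths:
        for line in Path(path).read_text(encoding="utf-8").splitlines():
            word = line.strip()
            if word:
                words.add(word)
    return sorted(words)


def to_hunspell_dic(words):
    """Render a flag-less .dic: the word count first, then one word per line."""
    return "\n".join([str(len(words)), *words]) + "\n"
```

Given the per-POS text files as input, `to_hunspell_dic(merge_wordlists(files))` returns the text of a .dic file. A real Hunspell dictionary also needs an .aff file, which this sketch does not produce.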
Work plan
- make targets for listbased/
- make targets for hunspell/
- add tests
- decide whether to allow PLX for all languages, or only SMA, SME, SMJ
- depending on the previous choice, integrate PLX building in separate or