Speller Configuration
This text documents the speller configuration that has turned out to be the
- options to the ./configure command
- settings in Makefile.am files
- flag elimination
The optimisations described here relate to speed and file size. Fine tuning the
This document is up to date as of 28.11.2016.
{{configure}} options
The following configuration seems to produce the optimal speller:
./configure --with-hfst --without-xfst --enable-alignment --enable-spellers
Note specifically that the following option does not improve the SME
--disable-minimised-spellers
The following can be added to increase compilation speed, although it should
--with-backend-format=foma --enable-reversed-intersect
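As a rough sketch of how the two option sets fit together, the snippet below joins the base options with the optional compilation-speed options and prints the resulting command instead of running it. The variable names base_opts, fast_opts and configure_cmd are illustrative assumptions; only the option strings come from this document, and the caveat attached to the extra options above still applies.

```shell
# Base options, copied from the recommended configuration above.
base_opts="--with-hfst --without-xfst --enable-alignment --enable-spellers"

# Optional extras that the text says can increase compilation speed.
fast_opts="--with-backend-format=foma --enable-reversed-intersect"

# Assemble the full command line as a string (dry run: nothing is executed).
configure_cmd="./configure $base_opts $fast_opts"

# Print the command so it can be inspected before use.
echo "$configure_cmd"
```

To configure for real, run the printed command from the directory that holds the configure script.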
Settings in {{Makefile.am}} files
The file tools/spellcheckers/fstbased/desktop/Makefile.am contains the
ENABLE_CORPUS_WEIGHTS=yes
CORPUS_SIZE=
Enabling corpus weights does help improve suggestion quality quite a bit. And
Flag elimination
Eliminating flag diacritics can have a tremendous effect on both speller speed
The flag elimination is done in tools/spellcheckers/fstbased/Makefile.am.
eliminate flag CmpHyph
eliminate flag CmpN
eliminate flag Der1
eliminate flag Der2
eliminate flag Der3
eliminate flag Der4
eliminate flag Der5
eliminate flag Der_PassL
eliminate flag Der_PassS
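When the set of flags changes often, the command list above can be generated from a plain list of flag names instead of being maintained by hand. The following is only a minimal sketch: the variable names flags and script are assumptions made for illustration, and the snippet merely prints the commands. How they are passed on to the FST tool inside the Makefile.am is not shown here.

```shell
# Flag names copied from the elimination list above.
flags="CmpHyph CmpN Der1 Der2 Der3 Der4 Der5 Der_PassL Der_PassS"

# Build one "eliminate flag NAME" command per flag, one command per line.
script=$(for flag in $flags; do printf 'eliminate flag %s\n' "$flag"; done)

# Print the generated commands for inspection.
printf '%s\n' "$script"
```

Keeping the names in a single variable makes it easy to add or drop a flag and compare speller speed and size between builds.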
There are more flags being used in SME, but eliminating them made the fst
eliminate flag NeedNoun
eliminate flag NeedsVowRed
eliminate flag Want_Left
NeedNoun and Want_Left cross word boundaries, and will most likely