Webdict Compilation
This text documents the compilation process of the web dictionaries.
For the moment, we use the Apertium dictionary format.
The files
The online files themselves are stored in the relevant catalogue for displaying
The conversion procedure
The files are converted from the original dictionary files,
You need the Apertium dixtools:
- download it from
- install it via the following commands:
- cd apertium-dixtools/
- ant jar
- sudo ant install
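After installation, it can help to confirm that the commands used in this document are on the PATH. The following is only a minimal sketch and not part of the Giellatekno or Apertium tooling: the function name missing_tools is made up for illustration, and the tool names are simply the commands that appear in this document.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Apertium or Giellatekno package):
# print every command name given as an argument that is not on the PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Commands named in this document; empty output means all were found.
missing_tools java ant apertium-dixtools
```

If the call prints nothing, all three commands were found; any name it prints still needs to be installed or added to the PATH.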
Then convert the Giellatekno xml format into Apertium xml format
(For an example of commands, see below)
- collect all relevant entries in a single file (this script does only that)
- in issue the following command (where INPUT_DIR is
-
java -Xmx2048m -Dfile.encoding=UTF8 net.sf.saxon.Transform -it:main collect-dict-parts.xsl inDir=INPUT_DIR/ > PATH_TO_OUTPUT_FILE
- different filters for different language pairs are possible/needed
- filtering takes place both in collect-dict-parts.xsl and
- convert the gt_format into the accepted apertium_xml format
- The output file from the previous command is the INPUT_FILE here.
- In the file gtdict2simple-apertiumdix.xsl, edit the variables inFile
- Then, issue the command:
-
java -Xmx2048m -Dfile.encoding=UTF8 net.sf.saxon.Transform -it:main gtdict2simple-apertiumdix.xsl
- compile the file using the Apertium tools (see above), with this command:
-
apertium-dixtools dix2trie INPUT_FILE lr LANG1-LANG2-lr-trie.xml
- update the file from the $GTHOME/apps/dicts/apertiumdict/ into the
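The three commands of the procedure can be sketched as a small dry-run helper that only prints them, so they can be checked before anything is executed. This is an illustration under stated assumptions, not an existing script: the function name print_pipeline, the file names collected.xml and converted.xml, and the language codes in the sample call are hypothetical. The converted file name is whatever the stylesheet is configured to write.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the three commands of the procedure
# without running them. All argument names are illustrative:
#   $1 INPUT_DIR, $2 collected file, $3 converted file, $4 LANG1, $5 LANG2
print_pipeline() {
    in_dir=${1%/}   # drop one trailing slash so the "/" is not doubled
    jvm='java -Xmx2048m -Dfile.encoding=UTF8 net.sf.saxon.Transform -it:main'

    # Step 1: collect all relevant entries in a single file
    printf '%s collect-dict-parts.xsl inDir=%s/ > %s\n' "$jvm" "$in_dir" "$2"
    # Step 2: convert to the Apertium format (the input is set inside the stylesheet)
    printf '%s gtdict2simple-apertiumdix.xsl\n' "$jvm"
    # Step 3: compile the converted file into a trie
    printf 'apertium-dixtools dix2trie %s lr %s-%s-lr-trie.xml\n' "$3" "$4" "$5"
}

# Sample call with made-up file names and language codes
print_pipeline ../smefin/src collected.xml converted.xml sme fin
```

Printing rather than executing keeps the sketch safe to run anywhere; the printed lines can be pasted into a shell once the paths have been checked.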
Here comes an example, again assuming you stand in
java -Xmx2048m -Dfile.encoding=UTF8 net.sf.saxon.Transform -it:main collect-dict-parts.xsl inDir=../smefin/src > tull/out_simple-apertium/tull.xml
see tull/out_simple-apertium/tull.xml
THEN DELETE THE LINES BETWEEN THE FIRST LINE AND THE <r> NODE
java -Xmx2048m -Dfile.encoding=UTF8 net.sf.saxon.Transform -it:main gtdict2simple-apertiumdix.xsl
tail tull/ut/tull.xml
apertium-dixtools dix2trie tull/ut/tull.xml lr ../../../apps/dicts/apertium_dict/dics/fin-smn-lr-trie.xml
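The manual step of deleting the lines between the first line and the <r> node could be scripted instead of done in an editor. The following is a minimal sketch, assuming that the <r> start tag appears on a line of its own or at least not on the first line. The function name strip_preamble is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: keep the first line of the file, drop everything
# after it up to (but not including) the first line that contains an
# <r> start tag, and keep all lines from there on.
strip_preamble() {
    awk '
        NR == 1   { print; next }  # always keep the first line
        /<r[ >]/  { found = 1 }    # an <r> or <r ...> start tag was seen
        found     { print }        # from that line on, keep everything
    ' "$1"
}

# Usage (writes to a new file and leaves the input untouched):
#   strip_preamble tull/out_simple-apertium/tull.xml > cleaned.xml
```

Writing the result to a separate file, here the made-up name cleaned.xml, makes it easy to compare it with the original before using it in the next step.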