Getting Started On The Mac

This page is a part of the overall Getting started page. It describes what you need to install on the Mac to be ready to develop language tools for your language.

System setup

  • You need a text editor. We recommend SEE, which lets you collaborate with us over the net straight from your hard disk (others see only the documents you explicitly share or invite people to).
  • Basic programming tools (in this order):

Then you need a number of tools for the build chain. On the Mac, you can get them by running the following commands:

sudo port install autoconf automake libtool python37 py37-pip wget bison cmake gawk saxon

sudo port select --set python3 python37

sudo pip-3.7 install PyYAML

python3 -m pip install pexpect --user

sudo cpan install Text::Brew XML::LibXML
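
After the commands above have finished, a quick check that the tools are visible on your PATH can save trouble later. The following is only a minimal sketch: check_tools is a hypothetical helper, and the command names are assumptions derived from the port names (some ports install their binaries under a different name).

```shell
#!/bin/sh
# Sketch: report which build-chain commands are visible on PATH.
# check_tools is a hypothetical helper, and the command names below are
# assumptions based on the port names; adjust them to your system.

check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    # Succeed only if every tool was found.
    [ "$missing" -eq 0 ]
}

check_tools autoconf automake python3 wget bison cmake gawk \
    || echo "Some tools were not found; check the port commands above."
```

Every line reported as missing points to a port that either failed to install or installs its command under another name.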

Additional software

Certain tasks require additional software. Install the following depending on what you need to do.

If you want to do corpus work, then also do this:

sudo port install \
python27 py27-pip py27-beautifulsoup4 py27-unittest2 py27-lxml py-pysvn \
py27-html5lib py27-feedparser p5-xml-twig antiword wv libxslt poppler tidy \
p5-xml-libxml texlive-bin-extra

sudo port select --set python python27
sudo pip-2.7 install pyth pytidylib

If you want to have documentation pages locally on your own machine, you need Forrest:

Article authoring using LaTeX

sudo port install \
       TeXShop3      \
       texlive-basic \
       texlive-latex-extra

Note for Java avoiders

Some of the tools above require or use Java, notably Saxon and Forrest. Saxon is used to convert XML-based source files into Lexc files, and Forrest is used to validate documentation extracted from the source files.

Neither of these functions is strictly required for developing language tools. The lexc files converted from XML are stored in svn, so if Saxon is not available, the stored lexc files are used as is. If Forrest is not available, the step that builds documentation from source code comments is simply skipped.

That is, Java is not required to do development using the Divvun/Giellatekno infrastructure, unless you specifically work with XML-based lexicons.
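
The fallback behaviour described in this note can be pictured roughly as follows. This is a sketch only, not the actual build scripts: have_tool and plan_build are hypothetical helpers, and the command names saxon and forrest are assumptions.

```shell
#!/bin/sh
# Sketch of the fallback described above; NOT the real build scripts.
# have_tool and plan_build are hypothetical helpers.

have_tool() { command -v "$1" >/dev/null 2>&1; }

# $1: command name for Saxon, $2: command name for Forrest (both assumed)
plan_build() {
    if have_tool "$1"; then
        echo "lexc: regenerate from XML sources"
    else
        echo "lexc: use the files already in svn"
    fi
    if have_tool "$2"; then
        echo "docs: build from source comments"
    else
        echo "docs: skip"
    fi
}

plan_build saxon forrest
```

Without Java, both checks take the second branch, and the rest of the build is unaffected.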

Linguistic software

You need tools to convert your linguistic source code (lexicons, morphology, phonology, syntax, etc.) into useful tools like analysers, generators, hyphenators and spellers. Install the following linguistic programming tools:

  • One or more of:
    • Xerox tools - Freely available, faster compilation, but not open source and no spellers. The software is available under the link NewSoftware; the Binaries Only package is enough. Unpack the files and store them in e.g. /usr/local/bin/.
    • HFST tools - Open source. Required for turning your morphology and lexicon into a spellchecker.
    • Foma - Open source. NB! Foma support is experimental at the moment.
  • VISL CG-3 (for syntactic disambiguation and analysis)
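
The "one or more of" requirement in the list above can be checked with a short script. This is a minimal sketch: have_tool and any_tool are hypothetical helpers, and it assumes that the three toolkits provide the commands xfst, hfst-xfst and foma, and that the CG-3 compiler is installed as vislcg3.

```shell
#!/bin/sh
# Sketch: check for at least one finite-state toolkit plus CG-3.
# have_tool and any_tool are hypothetical helpers; the command names
# (xfst, hfst-xfst, foma, vislcg3) are assumptions about your install.

have_tool() { command -v "$1" >/dev/null 2>&1; }

# Succeeds as soon as one of the given commands is found on PATH.
any_tool() {
    for tool in "$@"; do
        if have_tool "$tool"; then
            echo "found: $tool"
            return 0
        fi
    done
    return 1
}

any_tool xfst hfst-xfst foma \
    || echo "no finite-state compiler found (need Xerox, HFST or Foma)"
have_tool vislcg3 || echo "vislcg3 not found"
```

Note that any_tool stops at the first match, so it reports only one toolkit even if several are installed.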

If you want to work with proofing tools, see the Proofing tools page for what to install.