Our SVN Repositories

Overview

We currently have four online repositories:

  • langtech - our main source code repository, with all grammars, dictionaries, etc.
  • biggies - large datasets like spell checker test results, recordings and test corpora
  • freecorpus - freely available corpus files (the non-free corpus data is available for research and development purposes on request, subject to a signed user agreement); corpus files are organised first by format, conversion quality and purpose, then by language, and then by genre
  • speech - speech language technology data, presently speech synthesis recordings and accompanying text files
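
The nesting described for freecorpus (a format/quality/purpose category first, then language, then genre) can be illustrated with a small sketch. The function name and the example directory names ("converted", "sme", "news") are hypothetical and chosen only for illustration; the actual directory names in the repository may differ.

```python
from pathlib import PurePosixPath


def corpus_path(category: str, language: str, genre: str) -> PurePosixPath:
    """Build a relative path in the order: category, then language, then genre.

    'category' stands in for the format/quality/purpose level. All names
    passed in are illustrative assumptions, not actual repository paths.
    """
    for part in (category, language, genre):
        # Each level must be a single, non-empty directory name.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(category) / language / genre


# Hypothetical example values
print(corpus_path("converted", "sme", "news"))  # prints converted/sme/news
```

Keeping the three levels as separate arguments makes the ordering explicit and rejects values that would silently add or skip a directory level.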

Details

langtech

biggies

freecorpus

speech