:::freecorpus/sme/admin/sd/other_files:::
dc1990-4.pdf.xml:
dc1991-2.pdf.xml: enormous number of scanning errors; imo there is no point in having change-replace rules for t->č, S->š, etc. I guess every dc* file has the same problems.
satnelistu.doc.xml: this is a wordlist file that has different languages on each line; language recognition should be updated to check every word in order to correct it.
stedsnavn4.doc.xml: this is a report file which includes corrections for misspelled Sami words.
64547_1_P.doc.xml: spelling errors, wrong lang.
Corrected ccat to catch span elements.
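
A rough sketch of what per-word language checking for a mixed-language wordlist like satnelistu.doc.xml could look like. This is not the existing language recognition code: the lexicons, the function names (`guess_word_lang`, `tag_line`) and the "und" fallback label are all hypothetical, and a real setup would use a proper language identifier instead of tiny word sets.

```python
# Hypothetical toy lexicons, for illustration only.
LEXICONS = {
    "sme": {"sátni", "giella", "báiki"},
    "nob": {"ord", "språk", "sted"},
}


def guess_word_lang(word, default="und"):
    """Return the first language whose lexicon contains the word."""
    w = word.lower()
    for lang, words in LEXICONS.items():
        if w in words:
            return lang
    return default  # unknown words are left undetermined


def tag_line(line):
    """Tag every whitespace-separated word on a line, not the line as a whole."""
    return [(w, guess_word_lang(w)) for w in line.split()]


if __name__ == "__main__":
    print(tag_line("giella språk"))
```

The point of the sketch is only that the decision is made per word, so a line mixing two languages gets two tags instead of one wrong one.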