Chechen
- Parallel Bilingual Corpus. Some translations into Chechen of a standard set of newspaper stories that we used, and some translations from Chechen into English.
- Monolingual Corpus.
- Lexicon. About 19,800 inflected forms.
- Texts with named entities tagged, in two forms: (1) a one-word-per-line file plus a list of offsets for the proper names, a format used by a tagger we used in the lab; and (2) XML files with named-entity markup.
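
The two named-entity forms described above carry the same information, so one can in principle be derived from the other. The following is a minimal sketch of that conversion, not the lab's actual tooling. It assumes that each offset is a pair of token indices (start inclusive, end exclusive) into the one-word-per-line file; the real offset convention is not stated here and may differ. The function name `tag_entities`, the tag name `NE`, and the sample sentence are all hypothetical.

```python
from xml.sax.saxutils import escape


def tag_entities(tokens, offsets, tag="NE"):
    """Wrap proper-name spans in XML-style tags.

    tokens  : list of words, as read from a one-word-per-line file
    offsets : list of (start, end) token indices, end exclusive
              (an assumed convention, not confirmed by the source)
    tag     : element name to use (hypothetical placeholder)
    """
    # Map each span start to its end; ignore empty or inverted spans.
    starts = {s: e for s, e in offsets if e > s}
    out = []
    i = 0
    while i < len(tokens):
        if i in starts:
            end = min(starts[i], len(tokens))
            span = " ".join(escape(t) for t in tokens[i:end])
            out.append(f"<{tag}>{span}</{tag}>")
            i = end
        else:
            out.append(escape(tokens[i]))
            i += 1
    return " ".join(out)


# Illustrative data only; not taken from the corpus.
tokens = ["Grozny", "is", "the", "capital", "of", "Chechnya", "."]
offsets = [(0, 1), (5, 6)]
print(tag_entities(tokens, offsets))
```

If the offsets turn out to be character positions rather than token indices, the same idea applies, but the spans would have to be mapped onto token boundaries first.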