Language Resources

Chechen

  • Parallel Bilingual Corpus. Some Chechen translations of a standard set of newspaper stories that we used, and some translations from Chechen into English.
  • Monolingual Corpus.
  • Lexicon. About 19,800 inflected forms.
  • Texts with named entities tagged, in two forms: (1) a one-word-per-line file with a list of offsets for the proper names, which was used by a tagger in our lab; and (2) XML files with named-entity markup.
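
The one-word-per-line form of the named-entity texts can be read with a few lines of code. The sketch below is only an illustration: the listing does not document the layout of the offsets list, so it assumes each offset line holds two token indices, "start end", with the end exclusive. The function names and the sample data are invented for the example and are not taken from the corpus.

```python
from typing import List, Tuple


def read_tokens(text: str) -> List[str]:
    """Split the contents of a one-word-per-line file into tokens, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def read_offsets(text: str) -> List[Tuple[int, int]]:
    """Parse offset lines of the ASSUMED form 'start end' (token indices, end exclusive)."""
    spans = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            spans.append((int(parts[0]), int(parts[1])))
    return spans


def extract_names(tokens: List[str], spans: List[Tuple[int, int]]) -> List[str]:
    """Join the tokens covered by each span into one proper-name string."""
    return [" ".join(tokens[start:end]) for start, end in spans]


# Invented sample data, standing in for the real token and offset files.
sample_tokens = "Aslan\nMaskhadov\nvisited\nGrozny\nyesterday\n"
sample_offsets = "0 2\n3 4\n"

names = extract_names(read_tokens(sample_tokens), read_offsets(sample_offsets))
print(names)
```

If the real offsets turn out to be character positions or to use inclusive ends, only read_offsets and the slice in extract_names would need to change.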