Parallel corpora in languages of the Greater Himalayas

Welcome to the ANR HimalCo project website!

Through this project we have collected data on five languages from three sub-groups of the Tibeto-Burman family, all of them under-described oral languages. The data is presented here through two types of linguistic tools:

  • a comparable corpus of mythological narrative texts, aligned across languages for similarities in narrative and linguistic content, and
  • a number of talking dictionaries, which provide sound recordings of each dictionary entry and of its example sentences. For one of these languages, Khaling, the dictionary takes a novel form: a verb dictionary in which verbs are presented by root (rather than by infinitive form, which does not provide enough information to assign a given verb to a conjugation) and accompanied by conjugation tables that make it possible to generate all verb forms of Khaling.

Corpus and dictionaries compiled with ANR funding.