This project produced a parallel corpus between Swahili (Kiswahili) and two other Kenyan languages: Dholuo and Luhya. The Luhya language has several dialects, and three were chosen as a start for this project: Lumarachi, Logooli and Lubukusi. A total of 12,400 sentences from a sample of Dholuo and Luhya texts were translated into Kiswahili (1,500 Dholuo-Kiswahili sentence pairs and 10,900 Luhya-Kiswahili sentence pairs). Each document contains sentence pairs: the sentence in the original language starts with the letter “O” followed by a colon (“O:”), while the translated Kiswahili sentence below it starts with the letter “T” followed by a colon (“T:”).
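The “O:”/“T:” layout described above can be read with a few lines of Python. The following is only a minimal sketch, not an official loader for the corpus: the function name `parse_sentence_pairs` is hypothetical, the sample lines are placeholders rather than real corpus sentences, and it assumes each pair occupies exactly one “O:” line followed by one “T:” line, which may not hold for every file.

```python
from typing import Iterable, List, Optional, Tuple


def parse_sentence_pairs(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Collect (original, Kiswahili translation) pairs from O:/T: lines.

    Hypothetical helper: assumes one "O:" line followed by one "T:" line
    per pair. Lines that fit neither pattern are ignored.
    """
    pairs: List[Tuple[str, str]] = []
    pending_original: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("O:"):
            # Remember the original sentence until its translation arrives.
            pending_original = line[2:].strip()
        elif line.startswith("T:") and pending_original is not None:
            pairs.append((pending_original, line[2:].strip()))
            pending_original = None
    return pairs


# Placeholder text, not actual sentences from the corpus.
sample_lines = [
    "O: <original sentence 1>",
    "T: <Kiswahili translation 1>",
    "",
    "O: <original sentence 2>",
    "T: <Kiswahili translation 2>",
]

for original, translation in parse_sentence_pairs(sample_lines):
    print(original, "->", translation)
```

A “T:” line with no preceding “O:” line is skipped rather than raising an error; stricter validation may be preferable when checking the files for alignment problems.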

To cite this dataset:
