

Corpora


The Providence (English) Corpus
The Providence Corpus consists of twice-monthly digital audio/video recordings of hour-long spontaneous mother-child speech interactions from 6 English-speaking children aged approximately 1 to 3 years. The data were collected in and around southern New England from 2000 to 2004 and total approximately 363 hours. The child utterances are transcribed in broad phonetic transcription. The work was funded by NIH and carried out by Katherine Demuth and collaborators at the Child Language Lab at Brown University in Providence, RI. The data are available on the CHILDES database.

Those wishing to use the corpus should cite the following reference:

Demuth, K., Culbertson, J. & Alter, J. 2006. Word-minimality, epenthesis, and coda licensing in the acquisition of English. Language & Speech, 49, 137-174.




The Lyon (French) Corpus
The Lyon Corpus consists of twice-monthly digital audio/video recordings of hour-long spontaneous mother-child speech interactions from 4 French-speaking children aged approximately 1 to 3 years. The data were collected in Lyon, France from 2000 to 2004 and total approximately 185 hours. The child utterances are transcribed in broad phonetic transcription. The work was funded by NIH and carried out in collaboration with Harriet Jisa and colleagues at Dynamique du Langage at the University of Lyon 2, France. The data are available on the CHILDES database.

Those wishing to use the corpus should cite the following reference:

Demuth, K. & Tremblay, A. 2007. Prosodically-conditioned variability in children's production of French determiners. Journal of Child Language, 34, 1-29.




The Demuth Sesotho Corpus
The Demuth Sesotho Corpus contains 98 hours of spontaneous speech interactions with four children aged 2-4. Audio recording took place at monthly intervals for 3-4 hours during interactions with family and peers in rural Lesotho. The data are morphologically tagged and available as part of the CHILDES database. For a more detailed description of the Sesotho files, please refer to pages 23-30 in the CHILDES documentation. Corpus preparation and research have been funded by NSF, Fulbright, and SSRC.

Those wishing to use this corpus should notify Katherine Demuth and cite the following reference:

Demuth, K. 1992. Acquisition of Sesotho. In D. Slobin (ed.), The Cross-Linguistic Study of Language Acquisition, vol 3, 557-638. Hillsdale, N.J.: Lawrence Erlbaum Associates.

To download the Demuth Sesotho Corpus, click here.

Want to learn some Sesotho? Here are 13 easy lessons for getting started.



