Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence
This paper describes the stages involved in implementing a corpus of spoken Irish. This pilot project (consisting of approximately 140K of transcribed data) implements parts of the design of a larger corpus of spoken Irish, which is intended to contain approximately 2 million words when complete. It is hoped that such a corpus will provide material for linguistic research, lexicography, and the teaching of Irish, and for the development of language technology for the Irish language.
Uí Dhonnchadha, E., Frenda, A. & Vaughan, B. (2012) Issues in Designing a Corpus of Spoken Irish. LREC: 8th International Conference on Language Resources and Evaluation, Istanbul, 23-25 May.