Document Type

Conference Paper


Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence


Computer Sciences

Publication Details

LREC: 8th International Conference on Language Resources and Evaluation, Istanbul, 23-25 May 2012.


This paper describes the stages involved in implementing a corpus of spoken Irish. This pilot project (consisting of approximately 140K of transcribed data) implements parts of the design of a larger corpus of spoken Irish, which, it is hoped, will contain approximately 2 million words when complete. It is hoped that such a corpus will provide material for linguistic research, lexicography, and the teaching of Irish, and for the development of language technology for the Irish language.