Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence
Computer Sciences, Information Science, Linguistics
This paper describes a simple but competitive unsupervised system for hypernym discovery. The system uses skip-gram word embeddings with negative sampling, trained on specialised corpora. Candidate hypernyms for an input word are predicted based on cosine similarity scores. Two sets of word embedding models were trained separately on two specialised corpora: a medical corpus and a music industry corpus. Our system scored highest in the medical domain among the competing unsupervised systems but performed poorly on the music industry domain. Our approach does not depend on any external data other than raw specialised corpora.
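The cosine-similarity ranking step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `cosine` and `rank_candidates`, the `top_k` parameter, and the three-dimensional toy vectors are all hypothetical and chosen only for illustration. In the described system the vectors would come from skip-gram models trained on the specialised corpora.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def rank_candidates(query, embeddings, top_k=3):
    """Return the top_k vocabulary words most similar to `query`.

    `embeddings` maps each word to its vector. The query word itself
    is excluded from the candidate list.
    """
    query_vec = embeddings[query]
    scored = [
        (word, cosine(query_vec, vec))
        for word, vec in embeddings.items()
        if word != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up vectors for illustration only (assumption, not real model output).
toy_embeddings = {
    "aspirin": [0.9, 0.1, 0.0],
    "drug": [0.8, 0.2, 0.1],
    "analgesic": [0.85, 0.15, 0.05],
    "guitar": [0.0, 0.1, 0.9],
}

top = rank_candidates("aspirin", toy_embeddings, top_k=2)
for word, score in top:
    print(f"{word}\t{score:.3f}")
```

With these toy vectors the two medical terms rank above the unrelated word, which is the behaviour the ranking relies on. Note that nearest neighbours in embedding space are not guaranteed to be hypernyms; the sketch shows only the scoring step.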
Maldonado, A. & Klubicka, F. (2018) ADAPT at SemEval-2018 Task 9: Skip-Gram Word Embeddings for Unsupervised Hypernym Discovery in Specialised Corpora