SeCoDa: Sense Complexity Dataset

Proceedings of the 12th Language Resources and Evaluation Conference (2020)

Abstract

The Sense Complexity Dataset (SeCoDa) is a corpus annotated jointly for complexity and word senses. It thus provides a valuable resource for both word sense disambiguation and complex word identification. The dataset is intended to support identifying complexity at the level of word senses rather than word tokens. For word sense annotation, SeCoDa uses a hierarchical scheme based on information available in the Cambridge Advanced Learner’s Dictionary. This way we can offer more coarse-grained senses than those directly available in WordNet.

Analytics

Added to PP
2020-06-15