Building MUSCLE, a Dataset for MUltilingual Semantic Classification of Links between Entities accepted at LREC-COLING 2024

Exploiting Wikidata as a source of structured knowledge, we propose a paradigm shift in how linguistic datasets are built, building on semantic map theory. In this case, we present a new knowledge-based multilingual dataset for the classification of lexico-semantic relations (LRC) between words. Apart from providing the dataset, we thoroughly test it against different possible cases of memorization and provide a baseline using the state of the art for LRC. Our results are included in “Building MUSCLE, a Dataset for MUltilingual Semantic Classification of Links between Entities”, which has been accepted at LREC-COLING 2024. We will soon release the dataset and the camera-ready version. #COLING24

Lucía Pitarch, David Avián, Jorge Gracia, and Jordi Bernad … Great Work!!
