Our EACL 2024 paper promotes a strict definition of entity salience. It presents GUMsley, a 12-genre challenge dataset for entity salience evaluation, and shows that adding salient entities to summarization models yields higher-quality summaries with fewer hallucinated entities.
Research
-
GUM7 – four added genres, Wikification and more!
The first release of GUM series 7 now adds four more genres to our multilayer corpus, in addition to brand new annotation layers, corrections, and more. This post outlines the main changes and additions to the corpus.
-
Entities in the Coptic Treebank
With the release of Version 2.6 of Universal Dependencies, our focus has shifted to handling Named and Non-Named Entity Recognition (NER/NNER) in Coptic data. As a result of intensive work by the Coptic Scriptorium team in the past few months,...
-
New features in our NLP pipeline
-
A Neural Network Reads the Newspaper
... in search of discourse signals! We now know a lot about what cues people use to identify discourse relations, but can we teach computers to notice the same signals?
-
What you say where - a discourse heatmap
Does discourse structure constrain where we talk about what? Research on recurring mentions within discourse graphs shows back-reference is sensitive to the reasons why sentences and groups of sentences are uttered. In the image above, ...