An unsupervised approach that summarises and ranks the main changes between two versions of the same document: this is the research work that earned Ricardo Campos, a researcher at INESC TEC, and Adam Jatowt and Lukas Éder, researchers at the University of Innsbruck (Austria), the Best Demo Paper Award at CIKM’23, the ACM International Conference on Information and Knowledge Management.
“Documents with different versions are common in different situations and play an important role in allowing an overview of the revisions made to a given document or set of documents”, explained Ricardo Campos, researcher at INESC TEC. However, the larger the document, the harder it becomes both to summarise it and to understand the changes made across its versions. This problem led to the development of an easy-to-use comparison and summary tool.
To address it, the research team developed a prototype that lets users summarise the differences between two versions of the same document by extracting keywords. The result of this research and development work, described in the paper “Contrastive Keyword Extraction from Versioned Documents”, makes it possible “to understand the changes that have occurred in different types of documents”.
“The work stems from a collaboration with two researchers – Adam Jatowt and Lukas Éder – from the University of Innsbruck and presents an unsupervised approach that summarises and orders the main changes identified in two versions of the same document”, stated Ricardo Campos, adding that the solution is already available for use. “There is also a Python package available online”, concluded the researcher.
The paper received the Best Demo Paper Award (runner-up) at the 32nd ACM International Conference on Information and Knowledge Management, an A-rank conference in Artificial Intelligence and Data Science, which took place in late October at the University of Birmingham.
The award acknowledges the best scientific papers focusing on the demonstration of applications and software that involve innovative scientific ideas. In 2023, 2435 scientific papers were submitted (74 of which were on the demo track) and 629 were accepted for presentation (26 on the demo track), resulting in an average acceptance rate of 27% across the five tracks of the conference. The conference proceedings are published by the Association for Computing Machinery (ACM).
The researcher mentioned in this news piece is associated with INESC TEC, the University of Beira Interior and Ci2 (Smart Cities Research Centers – Polytechnic Institute of Tomar).