DeTermIt! 2024

Workshop theme

In today's interconnected world, where information dissemination knows no linguistic bounds, it is essential to ensure that knowledge is accessible to diverse audiences, regardless of their language proficiency. Automatic Text Simplification (ATS) is the process of reducing the linguistic complexity of a text to enhance its comprehensibility and readability. ATS plays a pivotal role in conveying clear, unambiguous information to diverse audiences.
The DeTermIt! workshop builds upon the recent achievements of several initiatives that addressed specific areas within our realm of interest. It is aligned with the CLEF SimpleText Track, which provides appropriate reusable data and benchmarks for scientific text summarization and simplification.
DeTermIt! aims to bring together researchers and practitioners in the field of text simplification, with a particular focus on the intersection of lexicography, terminology, and keyword extraction. This workshop will explore the theoretical and practical perspectives surrounding the evaluation of text difficulty in a multilingual context, and it will serve as a platform for discussing advancements, methodologies, and applications in simplification techniques that target different linguistic nuances and audiences.
We welcome contributions that present different viewpoints on automatic text simplification, considering document genres, diverse languages, and the challenges posed by linguistic complexities in general. In particular, we encourage authors to explore theoretical elements for identifying text or lexical complexity, as well as experimental analyses for aligning text with the reading proficiency of diverse audiences. The workshop seeks contributions including, but not limited to, the following themes:

Theoretical Perspectives:

Refinement of models and strategies for Automated Text Simplification.
Identification of common linguistic patterns and challenges in different languages for ATS.
Role of multilingual resources in simplifying complex terminology.
Exploration of innovative methodologies for simplifying complex terminologies without compromising meaning.
Study of the role of lexicography in simplifying texts, for example the development of lexicons and dictionaries tailored for simplification tasks.

Practical Applications:

Creation of effective tools and multilingual resources for linguistic inclusivity.
Development and utilization of language resources like bilingual and multilingual glossaries, translation memories, and terminology databases.
Evaluation of machine translation and NLP techniques in text simplification across languages.
Analysis of practical methods to adapt domain-specific terminology for enhanced accessibility in various fields such as medicine, law, or technology.
Creation of lexical resources that assist in the automatic generation of simplified texts across different domains and languages.
Enhancement of summarization techniques by effectively identifying and prioritizing key information in simplified content.

The provisional agenda of the workshop is:

(extended) March 10, 2024: deadline for submitting paper proposals (Anywhere on Earth)
~~March 25, 2024 (previously March 18): notification to the authors~~
~~April 5, 2024: deadline for submitting the camera-ready version of the paper~~
May 2, 2024: publication of provisional program
May 21, 2024: Workshop

Submissions

We invite original contributions, including research papers, case studies, and system demonstrations. Submissions may include previously unpublished work or work in progress.
When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e., also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of their research. Moreover, ELRA encourages all LREC-COLING authors to share the described language resources (LRs), such as data, tools, and services, to enable their reuse and the replicability of experiments (including evaluation ones).
Papers must be compliant with the stylesheet adopted for the LREC-COLING conference Proceedings. Workshop Proceedings will be published on the LREC-COLING 2024 website.

Paper types.

Submissions may be of three types:

Regular long papers – up to eight (8) pages, presenting substantial, original, completed, and unpublished work.
Short papers – up to four (4) pages, describing a small focused contribution, negative results, system demonstrations, etc.
Position papers – up to eight (8) pages, discussing key hot topics, challenges and open issues, as well as cross-fertilization between computational linguistics and other disciplines.

Authors must submit their papers via the SoftConf platform at the following link: DeTermIt! 2024.

Keynote Speaker

We have the pleasure to announce the following keynote speaker:

Prof. Sara Carvalho

University of Aveiro, Aveiro, Portugal

Title: Clear Communication, Better Healthcare: Leveraging Terminological Data for Automatic Text Simplification

Abstract: Effective communication lies at the heart of quality healthcare delivery. Yet, the complexity of medical terminology often creates barriers in patient-healthcare provider interactions and may hamper patient engagement. Automatic Text Simplification (ATS) has emerged as a promising approach for improving the readability and understanding of medical texts. However, the success of ATS systems relies on consistent and structured terminological data. Drawing upon a double-dimensional approach to terminology work, this talk explores how the systematic representation, organisation, and sharing of terminological data in healthcare can contribute to the development of ATS tools that can better address the unique needs of healthcare communication.
While there are still many challenges to overcome in this regard (namely ambiguity, polysemy, and terminological variation, along with conceptual multidimensionality and interoperability), there is undeniable potential in leveraging terminological data to help advance ATS tools in this subject field, e.g. by tailoring outputs to health literacy levels and background knowledge, enhancing the training of machine learning models, and improving the precision of simplification algorithms.
By underscoring the tangible benefits of incorporating terminological data and its underlying organisation principles into the development pipeline of ATS tools, this talk highlights the broader implications of clear communication in healthcare, emphasising its role in improving health literacy, fostering more effective patient-healthcare provider interactions, enhancing patient satisfaction and engagement, and ultimately driving better healthcare outcomes.

Biography: Sara Carvalho is an Assistant Professor at the Department of Languages and Cultures, University of Aveiro, where she teaches courses in the fields of terminology, specialised translation, English and German linguistics, and technical communication.
She holds a PhD in Linguistics, with a specialisation in Lexicology, Lexicography and Terminology, from the Faculty of Social Sciences and Humanities – Universidade NOVA de Lisboa (NOVA FCSH). Her thesis, entitled “A terminological approach to knowledge organisation within the scope of endometriosis: the EndoTerm project”, was developed within the scope of a co-tutelle agreement between the Universidade NOVA de Lisboa and the Communauté Université Grenoble Alpes. She holds an MA in German Studies – specialisation in German Linguistics – from the University of Aveiro (UA), and graduated in Modern Languages and Literature (English and German Studies) at the Faculty of Arts and Humanities – University of Coimbra.
She is a researcher at the Languages, Literatures and Cultures Research Centre of the University of Aveiro (CLLC-UA) and at the Linguistics Research Centre of the Universidade NOVA de Lisboa (NOVA CLUNL). In addition, she is a member of ISO/TC 37 "Language and terminology" and of the Portuguese mirror committee "CT 221 – Terminologia, Língua e Linguagens" at IPQ. She is also part of the COST Action 18209 - European network for Web-centred linguistic data science, where she currently leads Working Group 4 (Use cases and applications).

Program Outline

Tuesday 21 May 2024

08:30 - 09:00

Registration

09:00 - 09:10

Opening and welcome

09:10 - 10:00

Keynote speaker: Sara Carvalho, University of Aveiro, Aveiro, Portugal

Clear Communication, Better Healthcare: Leveraging Terminological Data for Automatic Text Simplification

10:00 - 10:30

Session 1 (short papers: 12 mins presentation + 3 mins questions)

10:00 - 10:15

Plain Language Summarization of Clinical Trials

Polydoros Giannouris, Theodoros Myridis, Tatiana Passali and Grigorios Tsoumakas

10:15 - 10:30

Pre-Gamus: Reducing Complexity of Scientific Literature as a Support against Misinformation

Nico Colic, Jin-Dong Kim and Fabio Rinaldi

10:30 - 11:00

Coffee break

11:00 - 13:00

Session 2 (long papers: 15 mins presentation + 5 mins questions)

11:00 - 11:20

Reproduction of German Text Simplification Systems

Regina Stodden

11:20 - 11:40

Simplification Strategies in French Spontaneous Speech

Lucía Ormaechea, Nikos Tsourakis, Didier Schwab, Pierrette Bouillon and Benjamin Lecouteux

11:40 - 12:00

Towards Automatic Finnish Text Simplification

Anna Dmitrieva and Jörg Tiedemann

12:00 - 12:20

Complexity-Aware Scientific Literature Search: Searching for Relevant and Accessible Scientific Text

Liana Ermakova and Jaap Kamps

12:20 - 12:40

DARES: Dataset for Arabic Readability Estimation of School Materials

Mo El-Haj, Sultan Almujaiwel, Damith Premasiri, Tharindu Ranasinghe and Ruslan Mitkov

12:40 - 13:00

Legal Text Reader Profiling: Evidences from Eye Tracking and Surprisal Based Analysis

Calogero J. Scozzaro, Davide Colla, Matteo Delsanto, Antonio Mastropaolo, Enrico Mensa, Luisa Revelli and Daniele P. Radicioni

13:00 - 14:00

Lunch break

14:00 - 16:00

Session 3 (long papers: 15 mins presentation + 5 mins questions)

14:00 - 14:20

Beyond Sentence-level Text Simplification: Reproducibility Study of Context-Aware Document Simplification

Jan Bakker and Jaap Kamps

14:20 - 14:40

A Multilingual Survey of Recent Lexical Complexity Prediction Resources through the Recommendations of the Complex 2.0 Framework

Matthew Shardlow, Kai North and Marcos Zampieri

14:40 - 15:00

LARGEMED: A Resource for Identifying and Generating Paraphrases for French Medical Terms

Ioana Buhnila and Amalia Todirascu

15:00 - 15:20

Enhancing Lexical Complexity Prediction through Few-shot Learning with Gpt-3

Jenny Alexandra Ortiz-Zambrano, César Humberto Espín-Riofrío and Arturo Montejo-Ráez

15:20 - 15:40

Clearer Governmental Communication: Text Simplification with ChatGPT Evaluated by Quantitative and Qualitative Research

Nadine Beks van Raaij, Daan Kolkman and Ksenia Podoynitsyna

15:40 - 16:00

Simpler Becomes Harder: Do LLMs Exhibit a Coherent Behavior on Simplified Corpora?

Miriam Anschütz, Edoardo Mosca and Georg Groh

16:00 - 16:30

Coffee break

16:30 - 18:00

Session 4 (short papers: 12 mins presentation + 3 mins questions)

16:30 - 16:45

An Approach towards Unsupervised Text Simplification on Paragraph-Level for German Texts

Leon Fruth, Robin Jegan and Andreas Henrich

16:45 - 17:00

Legal Science and Computer Science: A Preliminary Discussions on How to Represent the "Penumbra" Cone with AI

Angela Condello and Giorgio Maria Di Nunzio

17:00 - 17:15

The Simplification of the Language of Public Administration: The Case of Ombudsman Institutions

Gabriel Gonzalez-Delgado and Borja Navarro-Colorado

17:15 - 17:30

Term Variation in Institutional Languages: Degrees of Specialization in Municipal Waste Management Terminology

Nicola Cirillo and Daniela Vellutino

17:30 - 18:00

Closing

General Chairs

DeTermIt!

DeTermIt! Workshop
Evaluating Text Difficulty in a Multilingual Context

21 May 2024, Turin, Italy

* Official ACL Anthology Workshop Proceedings *

