BioCreative9@IJCAI-2025 2025 : BioCreative IX Challenge and Workshop@IJCAI-2025: Large Language Models for Clinical and Biomedical NLP

Link: https://www.ncbi.nlm.nih.gov/research/bionlp/biocreative9

When	Aug 21, 2025 - Aug 22, 2025
Where	Montreal, CA
Submission Deadline	May 5, 2025
Notification Due	Jun 6, 2025

Categories NLP biomedical LLM AI

Call For Papers

Large Language Models for Clinical and Biomedical NLP:
BioCreative IX Challenge and Workshop CFP
at IJCAI, 2025
Where, When:
The BioCreative IX workshop will run alongside IJCAI 2025, August 16-22, 2025, in Montreal, Canada.

BioCreative IX:
The 9th BioCreative workshop seeks to attract researchers interested in developing and evaluating automatic methods for extracting medically relevant information from clinical data, and aims to bring together the medical NLP community with healthcare researchers and practitioners. The challenge comprises three tracks: MedHopQA, a dataset for benchmarking LLM-based reasoning systems on disease-centered question answering; ToxHabits, a task on extracting information about substance use and abuse from Spanish clinical content; and sentence segmentation of real clinical notes from MIMIC-III. We will also feature paper submissions on relevant topics and poster/tool demonstrations.

Workshop Proceedings and Special Issue:
The BioCreative IX Proceedings will include all accepted paper submissions as well as submissions from participating teams, and will be freely available by the time of the workshop.
In addition, selected papers will be invited to a BioCreative IX journal special issue, subject to the journal's peer-review process. More details and submission information will be posted in June.

Participation:
We welcome both general participation and shared task participation. Shared task teams can participate in one or more tracks. Registration will remain open until April 30th, when final commitment is requested.
To register a team, go to the Registration Form. If you have restrictions accessing Google Forms, please send an e-mail to BiocreativeChallenge@gmail.com.

Call for Papers
In addition to the shared tasks, we welcome short paper submissions for oral/poster presentations and tool demonstrations at the workshop. Topics of interest include but are not limited to:
Novel methods development for biomedical literature mining
Development of benchmarking datasets for clinical NLP/AI
Generative AI for synthetic data generation in health informatics
Explainable AI for medical/biomedical data
Creating and evaluating synthetic data using LLMs and its impact on downstream tasks
Creative use of data augmentation for increasing tool accuracy and trustworthiness
Use of LLMs to streamline annotation tasks
NLP/AI-systems capable of identifying entities in multilingual corpora
NLP/AI-systems capable of semantic interoperability across different terminologies/ontologies for efficient data curation
Integrating ontologies and knowledge bases for factual LLM production
Annotated corpora and other resources for health care and biomedical data modelling
Predictive methods for NLP/AI systems capable of identifying biomedical information
Intelligent agent frameworks for health data analysis
Important Dates for Call for Papers Participants
March - May: Registration
May 05, 2025: Submission of papers deadline
Jun 06, 2025: Notification of accepted papers (no extensions due to the IJCAI publication deadline)
Aug 16 - Aug 22, 2025: IJCAI 2025

BioCreative IX Tracks:
Track 1: MedHopQA
Large language models (LLMs) are commonly evaluated on their ability to answer questions in various domains, and it has become clear that robust QA datasets are critical for properly evaluating LLMs before they are deployed in real-world biomedical or healthcare applications. This track aims to advance the development of LLM-based systems capable of answering questions that involve multi-step reasoning. We have created a resource of 1,000 question-answer pairs (focusing on diseases, genes, and chemicals, mostly pertaining to rare diseases) based on public information in Wikipedia. Participants are encouraged to use any training data they wish to design and develop NLP systems or agents that understand asserted information about genes, diseases, chemicals, etc., and are able to answer multi-step reasoning questions involving such information. This track builds on previous successes in biomedical QA benchmarking (e.g., PubMedQA, BioASQ, and MedQA) but differs from them in that MedHopQA requires a multi-step reasoning process to find the correct answer.
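To illustrate what a multi-step (here, two-hop) question looks like, the following is a minimal sketch only. The facts, dictionary names, and the function `answer_two_hop` are hypothetical and chosen for illustration; they are not drawn from the MedHopQA dataset, and real systems would not rely on hand-written lookup tables.

```python
from typing import Optional

# Hypothetical toy knowledge for illustration; not taken from MedHopQA.
DISEASE_TO_GENE = {"cystic fibrosis": "CFTR"}
GENE_TO_CHROMOSOME = {"CFTR": "7"}


def answer_two_hop(disease: str) -> Optional[str]:
    """Answer 'On which chromosome is the gene associated with <disease>?'.

    Hop 1 maps the disease to a gene; hop 2 maps that gene to a chromosome.
    Returns None when either hop cannot be resolved.
    """
    gene = DISEASE_TO_GENE.get(disease.strip().lower())
    if gene is None:
        return None
    return GENE_TO_CHROMOSOME.get(gene)


if __name__ == "__main__":
    print(answer_two_hop("Cystic fibrosis"))
```

Neither fact alone answers the question; the answer only emerges by chaining the two, which is the property that distinguishes multi-hop questions from single-fact lookups.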
Track 2: Sentence segmentation of real-life clinical notes
Sentence segmentation is a fundamental linguistic task and is widely used as a pre-processing step in many NLP tasks. Although the development of LLMs and sparse attention mechanisms in transformer networks has reduced the need for sentence-level inputs in some NLP tasks, many models are designed and tested only for shorter sequences. The need for sentence segmentation is particularly pronounced in clinical notes, as most clinical NLP tasks depend on this information for annotation and model training. In this shared task, we challenge participants to detect sentence boundaries (spans) in MIMIC-III clinical notes, where fragmented and incomplete sentences, complex graphemic devices (e.g., abbreviations and acronyms), and markup are common. To encourage generalizability to multi-domain texts, participants will receive annotated texts from newswire articles and biomedical literature, in addition to clinical notes, for model development and evaluation.
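As a rough illustration of what detecting sentence boundaries as spans can mean, the sketch below represents each sentence as a (start, end) character offset pair and scores predictions by exact span match. Both the rule-based splitter and the F1 scoring are assumptions made for illustration; the track's official annotation format and evaluation metric are not specified here, and the example note is invented rather than taken from MIMIC-III.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def naive_sentence_spans(text: str) -> List[Span]:
    """Split after ., ! or ? followed by whitespace, or at line breaks."""
    spans: List[Span] = []
    start = 0
    for match in re.finditer(r"(?<=[.!?])\s+|\n+", text):
        end = match.start()
        if text[start:end].strip():
            spans.append((start, end))
        start = match.end()
    if text[start:].strip():
        spans.append((start, len(text)))
    return spans


def span_f1(predicted: List[Span], gold: List[Span]) -> float:
    """F1 over exactly matching spans (an assumed metric, for illustration)."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    note = "Pt denies chest pain. Hx: HTN\nStable."
    for begin, end in naive_sentence_spans(note):
        print((begin, end), repr(note[begin:end]))
```

A rule-based splitter like this one would break on abbreviations such as "y.o." and on headers without terminal punctuation, which is the kind of difficulty the track description points to.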
Track 3: ToxHabits
There is a pressing need to extract information on substance use and abuse from clinical content more systematically, covering not only smoking and alcohol abuse but also other harmful drugs and substances. These toxic habits have a considerable health impact on a variety of medical conditions and also affect the action of prescribed medications. To make such information actionable, it is critical not only to detect instances of consumption but also to characterize aspects of it, such as duration or mode of administration. Some initial efforts have been made to automatically detect social determinants of health, including smoking status, in English content, but very limited efforts have been made for content in other languages. Therefore, we propose the ToxHabits track to address the automatic extraction of substance use and abuse information from clinical cases in Spanish. This task will consist of three subtasks: (a) toxic habit mention recognition, (b) detection of relevant clinical modifiers related to substance abuse, and (c) a toxic habit condition QA challenge.
Important Dates for BioCreative IX Tracks Participants
March - April: Team Registration
May 16, 2025: (All Tracks) Testing predictions/evaluation results
May 23, 2025: Deadline for submission of papers by BioCreative IX Tracks participants
Jun 06, 2025: Notification of accepted papers (no extensions due to the IJCAI publication deadline)
Aug 16 - Aug 22, 2025: IJCAI 2025
Organizing Committee
Dr. Rezarta Islamaj, National Library of Medicine
Dr. Graciela Gonzalez-Hernandez, Cedars-Sinai Medical Center
Dr. Martin Krallinger, Barcelona Supercomputing Center
Dr. Zhiyong Lu, National Library of Medicine