Abstract
Large language models (LLMs) have exhibited impressive reasoning capabilities and proficiency in answering complex questions. However, they are prone to generating inaccurate or fabricated responses, a phenomenon commonly referred to as hallucination. This issue is particularly critical in high-stakes fields such as molecular chemistry, where errors can have significant consequences. Robust uncertainty quantification methods are therefore essential for evaluating the reliability of LLM outputs. In this work, we present a novel Question Rephrasing technique to assess the input uncertainty of LLMs, i.e., the uncertainty arising from semantically equivalent variations of the input provided to the model. We integrate this technique with sampling-based methods that measure the output uncertainty of LLMs, yielding a more comprehensive uncertainty assessment. We validated our approach on property prediction and reaction prediction tasks in molecular chemistry.
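To make the idea concrete, the following is a minimal sketch of how Question Rephrasing can be combined with output sampling to produce a single uncertainty score. The `llm` callable, the `rephrase` helper, the paraphrase templates, and the majority-vote agreement score are all illustrative assumptions, not the paper's exact implementation.

```python
# A minimal sketch of combining input uncertainty (question rephrasing)
# with output uncertainty (repeated sampling). Assumptions: `llm` is any
# callable mapping a prompt string to an answer string; agreement is
# scored by the frequency of the majority answer (hypothetical choice).
from collections import Counter
from typing import Callable, List


def rephrase(question: str, n_variants: int) -> List[str]:
    """Produce semantically equivalent rephrasings of the question.

    Hypothetical helper: in practice a paraphrase model or an LLM would
    generate the variants; here we only apply trivial templates.
    """
    templates = [
        "{q}",
        "Please answer the following: {q}",
        "Question: {q} Provide the answer.",
    ]
    return [templates[i % len(templates)].format(q=question) for i in range(n_variants)]


def combined_uncertainty(
    llm: Callable[[str], str],
    question: str,
    n_variants: int = 3,
    n_samples: int = 5,
) -> float:
    """Estimate uncertainty from both input variation and output sampling.

    Input uncertainty: query the model with several rephrasings.
    Output uncertainty: sample several answers per rephrasing.
    Returns 1 minus the relative frequency of the majority answer across
    all (variant, sample) pairs; 0.0 indicates full agreement.
    """
    answers = [
        llm(variant)
        for variant in rephrase(question, n_variants)
        for _ in range(n_samples)
    ]
    majority_count = Counter(answers).most_common(1)[0][1]
    return 1.0 - majority_count / len(answers)


if __name__ == "__main__":
    # Toy stand-in for an LLM call so the sketch runs end to end.
    import random

    def toy_llm(prompt: str) -> str:
        return random.choice(["soluble", "soluble", "insoluble"])

    print(f"uncertainty: {combined_uncertainty(toy_llm, 'Is aspirin water-soluble?'):.2f}")
```

In this sketch, perfect agreement across all rephrasings and samples yields an uncertainty of 0, while disagreement, whether triggered by rewording the input or by stochastic decoding, raises the score.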