Abstract
Large language models (LLMs) have exhibited impressive reasoning capabilities and proficiency in answering complex questions. However, they are prone to generating inaccurate or fabricated responses, a phenomenon commonly referred to as hallucination. This issue is particularly critical in high-stakes fields such as molecular chemistry, where errors can have significant consequences. Robust uncertainty quantification methods are therefore essential for evaluating the reliability of LLM outputs. In this work, we present a novel Question Rephrasing technique to assess the input uncertainty of LLMs, i.e., the uncertainty arising from semantically equivalent variations of the input provided to the model. We integrate this technique with sampling-based methods that measure the output uncertainty of LLMs, yielding a more comprehensive uncertainty assessment. We validated our approach on property prediction and reaction prediction tasks in molecular chemistry.
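To make the idea concrete, the following is a minimal sketch of how Question Rephrasing can be combined with output sampling to produce a single uncertainty score. The `llm` callable, the `rephrase` helper, the paraphrase templates, and the majority-vote agreement score are all illustrative assumptions, not the paper's exact implementation.

```python
# A minimal sketch of combining input uncertainty (question rephrasing)
# with output uncertainty (repeated sampling). Assumptions: `llm` is any
# callable mapping a prompt string to an answer string; agreement is
# scored by the frequency of the majority answer (hypothetical choice).
from collections import Counter
from typing import Callable, List


def rephrase(question: str, n_variants: int) -> List[str]:
    """Produce semantically equivalent rephrasings of the question.

    Hypothetical helper: in practice a paraphrase model or an LLM would
    generate the variants; here we only apply trivial templates.
    """
    templates = [
        "{q}",
        "Please answer the following: {q}",
        "Question: {q} Provide the answer.",
    ]
    return [templates[i % len(templates)].format(q=question) for i in range(n_variants)]


def combined_uncertainty(
    llm: Callable[[str], str],
    question: str,
    n_variants: int = 3,
    n_samples: int = 5,
) -> float:
    """Estimate uncertainty from both input variation and output sampling.

    Input uncertainty: query the model with several rephrasings.
    Output uncertainty: sample several answers per rephrasing.
    Returns 1 minus the relative frequency of the majority answer across
    all (variant, sample) pairs; 0.0 indicates full agreement.
    """
    answers = [
        llm(variant)
        for variant in rephrase(question, n_variants)
        for _ in range(n_samples)
    ]
    majority_count = Counter(answers).most_common(1)[0][1]
    return 1.0 - majority_count / len(answers)


if __name__ == "__main__":
    # Toy stand-in for an LLM call so the sketch runs end to end.
    import random

    def toy_llm(prompt: str) -> str:
        return random.choice(["soluble", "soluble", "insoluble"])

    print(f"uncertainty: {combined_uncertainty(toy_llm, 'Is aspirin water-soluble?'):.2f}")
```

In this sketch, perfect agreement across all rephrasings and samples yields an uncertainty of 0, while disagreement, whether triggered by rewording the input or by stochastic decoding, raises the score.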