Abstract
Large Language models (LLMs) possess the capability to engage In-context
Learning (ICL) by leveraging a few demonstrations pertaining to a new
downstream task as conditions. However, this particular learning paradigm
suffers from high instability stemming from substantial variances induced by
factors such as the input distribution of selected examples, their ordering,
and prompt formats. In this work, we demonstrate that even when all these
factors are held constant, the random selection of examples still results in
high variance. Consequently, we aim to explore the informative ability of data
examples by quantifying the Information Gain (IG) obtained in prediction after
observing a given example candidate. Then we propose to sample those with
maximum IG. Additionally, we identify the presence of template bias, which can
lead to unfair evaluations of IG during the sampling process. To mitigate this
bias, we introduce Calibration Before Sampling strategy. The experimental
results illustrate that our proposed method can yield an average relative
improvement of 14.3% across six classification tasks using three LLMs.