From tests to effect sizes: Quantifying uncertainty and statistical variability in multilingual and multitask NLP evaluation benchmarks
Preprint

Jonne Sälevä, Duygu Ataman and Constantine Lignos
09/26/2025

Abstract

Computer Science - Computation and Language


