Outlier Gradient Analysis: Efficiently Improving Deep Learning Model Performance via Hessian-Free Influence Functions

Anshuman Chhabra; Bo Li; Jian Chen; Prasant Mohapatra; Hongfu Liu

doi:10.48550/arxiv.2405.03869

Preprint

05/06/2024

DOI: https://doi.org/10.48550/arxiv.2405.03869

Abstract

Subjects: Computer Science - Artificial Intelligence; Computer Science - Learning

Influence functions offer a robust framework for assessing the impact of each training data sample on model predictions, serving as a prominent tool in data-centric learning. Despite their widespread use in various tasks, the strong convexity assumption on the model and the computational cost associated with calculating the inverse of the Hessian matrix pose constraints, particularly when analyzing large deep models. This paper focuses on a classical data-centric scenario, trimming detrimental samples, and addresses both challenges within a unified framework. Specifically, we establish an equivalence transformation between identifying detrimental training samples via influence functions and outlier gradient detection. This transformation not only presents a straightforward and Hessian-free formulation but also provides profound insights into the role of the gradient in sample impact. Moreover, it relaxes the convexity assumption of influence functions, extending their applicability to non-convex deep models. Through systematic empirical evaluations, we first validate the correctness of our proposed outlier gradient analysis on synthetic datasets and then demonstrate its effectiveness in detecting mislabeled samples in vision models, selecting data samples for improving performance of transformer models for natural language processing, and identifying influential samples for fine-tuned Large Language Models.
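
The idea summarized in the abstract, treating detrimental training samples as outliers in gradient space rather than scoring them with an inverse Hessian, can be illustrated with a small sketch. This is not the authors' implementation: the logistic-regression model, the synthetic data, the function names (`per_sample_gradients`, `fit_logistic`, `outlier_gradient_mask`, `make_noisy_data`) and the choice of ranking by gradient L2 norm as the outlier detector are all assumptions made for illustration. The record only states that outlier gradient detection is used, not which detector.

```python
import numpy as np


def per_sample_gradients(w, X, y):
    """Gradient of the logistic loss w.r.t. w, one row per training sample."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return (p - y)[:, None] * X


def fit_logistic(X, y, lr=0.5, steps=500):
    """Plain full-batch gradient descent on the mean logistic loss."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * per_sample_gradients(w, X, y).mean(axis=0)
    return w


def outlier_gradient_mask(grads, trim_fraction=0.1):
    """Flag the trim_fraction of samples with the largest gradient L2 norm.

    Assumption: norm ranking stands in for a generic outlier detector.
    No Hessian (or Hessian inverse) is computed anywhere.
    """
    norms = np.linalg.norm(grads, axis=1)
    k = int(round(trim_fraction * len(norms)))
    flagged = np.zeros(len(norms), dtype=bool)
    if k > 0:
        flagged[np.argsort(norms)[-k:]] = True
    return flagged


def make_noisy_data(n=200, flip=20, seed=0):
    """Two Gaussian blobs with a bias column; `flip` labels are inverted."""
    rng = np.random.default_rng(seed)
    half = n // 2
    X = np.vstack([rng.normal(-2.0, 1.0, (half, 2)),
                   rng.normal(2.0, 1.0, (half, 2))])
    X = np.hstack([X, np.ones((n, 1))])
    y = np.concatenate([np.zeros(half), np.ones(half)])
    flipped = rng.choice(n, size=flip, replace=False)
    y_noisy = y.copy()
    y_noisy[flipped] = 1.0 - y_noisy[flipped]
    return X, y_noisy, flipped


# Train on noisy labels, flag gradient outliers, then retrain without them.
X, y_noisy, flipped = make_noisy_data()
w = fit_logistic(X, y_noisy)
grads = per_sample_gradients(w, X, y_noisy)
mask = outlier_gradient_mask(grads, trim_fraction=0.1)
w_trimmed = fit_logistic(X[~mask], y_noisy[~mask])
print("flagged samples:", int(mask.sum()))
print("flagged samples that were label-flipped:",
      int(np.isin(np.flatnonzero(mask), flipped).sum()))
```

In this toy setting, mislabeled points tend to produce large residuals and therefore large gradients, which is why a norm-based ranking is a plausible stand-in. For deep models the per-sample gradients would typically be taken over a subset of parameters, and a different detector could be substituted without changing the overall trim-and-retrain loop.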

Details

Title: Outlier Gradient Analysis: Efficiently Improving Deep Learning Model Performance via Hessian-Free Influence Functions
Creators: Anshuman Chhabra; Bo Li; Jian Chen; Prasant Mohapatra; Hongfu Liu
Identifiers: 9924352047701921
Academic Unit: Michtom School of Computer Science; Benjamin and Mae Volen National Center for Complex Systems
Language: English
Resource Type: Preprint
