Scholarship list
Journal article
Can Machine Learning Target Health Care Fraud? Evidence From Medicare Hospitalizations
Published 2026
Journal of policy analysis and management, 45, 1
The United States spends more than $4 trillion per year on health care, largely conducted by private providers and reimbursed by insurers. A major concern in this system is overbilling and fraud by hospitals, who face incentives to misreport their claims to receive higher payments. In this work, we develop novel machine learning tools to identify hospitals that overbill insurers, which can be used to guide investigations and auditing of suspicious hospitals for both public and private health insurance systems. Using large‐scale claims data from Medicare, the US federal health insurance program for the elderly and disabled, we identify patterns consistent with fraud among inpatient hospitalizations. Our proposed approach for fraud detection is fully unsupervised, not relying on any labeled training data, and is explainable to end users, providing interpretations for which diagnosis, procedure, and billing codes lead to hospitals being labeled suspicious. Using newly collected data from the Department of Justice on hospitals facing anti‐fraud lawsuits, and case studies of suspicious hospitals, we validate our approach and findings. Our method provides a nearly fivefold lift over random targeting of hospitals. We also perform a postanalysis to understand which hospital characteristics, not used for detection, are associated with suspiciousness.
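The "lift over random targeting" reported above can be illustrated with a small sketch. It assumes lift is measured as the precision among the k highest-scored hospitals divided by the overall base rate of positives; the function name `lift_at_k` and the toy scores and labels are hypothetical, not taken from the paper.

```python
def lift_at_k(scores, labels, k):
    """Precision among the k highest-scored items divided by the base rate.

    Assumed definition for illustration; the paper may measure lift differently.
    """
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of items")
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positives")
    # Rank item indices from most to least suspicious.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    precision = sum(labels[i] for i in ranked[:k]) / k
    return precision / base_rate


# Toy data (invented): 10 hospitals, 2 of them labeled positive.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
print(lift_at_k(scores, labels, k=2))
```

A lift of 1 means the ranking does no better than picking hospitals at random; larger values mean audits guided by the scores would reach positives more often.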
Report
Unsupervised Machine Learning for Explainable Health Care Fraud Detection
Published 2023
The US spends more than 4 trillion dollars per year on health care, largely conducted by private providers and reimbursed by insurers. A major concern in this system is overbilling, waste and fraud by providers, who face incentives to misreport on their claims in order to receive higher payments. In this work, we develop novel machine learning tools to identify providers that overbill insurers. Using large-scale claims data from Medicare, the US federal health insurance program for elderly adults and the disabled, we identify patterns consistent with fraud or overbilling among inpatient hospitalizations. Our proposed approach for fraud detection is fully unsupervised, not relying on any labeled training data, and is explainable to end users, providing reasoning and interpretable insights into the potentially suspicious behavior of the flagged providers. Data from the Department of Justice on providers facing anti-fraud lawsuits and case studies of suspicious providers validate our approach and findings. We also perform a post-analysis to understand which hospital characteristics, not used for detection, are associated with a high suspiciousness score. Our method provides an 8-fold lift over random targeting, and can be used to guide investigations and auditing of suspicious providers for both public and private health insurance systems.
Conference proceeding
Less is more: SlimG for accurate, robust, and interpretable graph mining
Published 2023
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 08/2023
How can we solve semi-supervised node classification in various graphs possibly with noisy features and structures? Graph neural networks (GNNs) have succeeded in many graph mining tasks, but their generalizability to various graph scenarios is limited due to the difficulty of training, hyperparameter tuning, and the selection of a model itself. Einstein said that we should "make everything as simple as possible, but not simpler." We rephrase it into the careful simplicity principle: a carefully-designed simple model can surpass sophisticated ones in real-world graphs. Based on the principle, we propose SlimG for semi-supervised node classification, which exhibits four desirable properties: It is (a) accurate, winning or tying on 10 out of 13 real-world datasets; (b) robust, being the only one that handles all scenarios of graph data (homophily, heterophily, random structure, noisy features, etc.); (c) fast and scalable, showing up to 18 times faster training in million-scale graphs; and (d) interpretable, thanks to the linearity and sparsity. We explain the success of SlimG through a systematic study of the designs of existing GNNs, sanity checks, and comprehensive ablation studies.
Journal article
Benefit-aware early prediction of health outcomes on multivariate EEG time series
Published 2023
Journal of biomedical informatics, 139, 104296
Given a cardiac-arrest patient being monitored in the ICU (intensive care unit) for brain activity, how can we predict their health outcomes as early as possible? Early decision-making is critical in many applications, e.g. monitoring patients may assist in early intervention and improved care. On the other hand, early prediction on EEG data poses several challenges: (i) earliness-accuracy trade-off; observing more data often increases accuracy but sacrifices earliness, (ii) large-scale (for training) and streaming (online decision-making) data processing, and (iii) multi-variate (due to multiple electrodes) and multi-length (due to varying length of stay of patients) time series. Motivated by this real-world application, we present BENEFITTER, which infuses the incurred savings from an early prediction as well as the cost from misclassification into a unified domain-specific target called benefit. Unifying these two quantities allows us to directly estimate a single target (i.e. benefit). Importantly, BENEFITTER (a) is efficient and fast, with training time linear in the number of input sequences, and can operate in real-time for decision-making, (b) can handle multi-variate and variable-length time-series, suitable for patient data, and (c) is effective, providing up to 2× time-savings with equal or better accuracy as compared to competitors.
Conference proceeding
FairOD: Fairness-aware outlier detection
Published 2021
Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society
AAAI/ACM Conference on AI, Ethics, and Society, 07/2021
Fairness and Outlier Detection (OD) are closely related, as it is exactly the goal of OD to spot rare, minority samples in a given population. However, when being a minority (as defined by protected variables, such as race/ethnicity/sex/age) does not reflect positive-class membership (such as criminal/fraud), OD produces unjust outcomes. Surprisingly, fairness-aware OD has been almost untouched in prior work, as fair machine learning literature mainly focuses on supervised settings. Our work aims to bridge this gap. Specifically, we develop desiderata capturing well-motivated fairness criteria for OD, and systematically formalize the fair OD problem. Further, guided by our desiderata, we propose FairOD, a fairness-aware outlier detector that has the following desirable properties: FairOD (1) exhibits treatment parity at test time, (2) aims to flag equal proportions of samples from all groups (i.e. obtain group fairness, via statistical parity), and (3) strives to flag truly high-risk samples within each group. Extensive experiments on a diverse set of synthetic and real-world datasets show that FairOD produces outcomes that are fair with respect to protected variables, while performing comparably to (and in some cases, even better than) fairness-agnostic detectors in terms of detection performance.
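The statistical parity criterion mentioned above (flagging equal proportions of samples from all groups) can be checked with a short sketch. This only measures parity on a detector's output; it is not the FairOD training procedure, and the helper names `flag_rates` and `parity_ratio` as well as the toy data are hypothetical.

```python
from collections import defaultdict


def flag_rates(flags, groups):
    """Fraction of samples flagged as outliers within each group."""
    counts = defaultdict(int)
    flagged = defaultdict(int)
    for flag, group in zip(flags, groups):
        counts[group] += 1
        flagged[group] += int(flag)
    return {group: flagged[group] / counts[group] for group in counts}


def parity_ratio(flags, groups):
    """Lowest group flag rate divided by the highest (1.0 means exact parity)."""
    rates = flag_rates(flags, groups)
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # Nothing flagged in any group: trivially equal rates.
    return min(rates.values()) / highest


# Toy data (invented): group "b" is flagged twice as often as group "a".
flags = [1, 0, 0, 0, 1, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(flag_rates(flags, groups))
print(parity_ratio(flags, groups))
```

A ratio well below 1 indicates that one group is flagged disproportionately, which is the kind of outcome a fairness-aware detector aims to avoid.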
Conference proceeding
Gen2Out: Detecting and ranking generalized anomalies
Published 2021
2021 IEEE International Conference on Big Data (Big Data)
IEEE International Conference on Big Data, 2021
In a cloud of m-dimensional data points, how would we spot, as well as rank, both single-point- as well as group-anomalies? We are the first to generalize anomaly detection in two dimensions: The first dimension is that we handle both point-anomalies, as well as group-anomalies, under a unified view - we shall refer to them as generalized anomalies. The second dimension is that Gen2Out not only detects, but also ranks, anomalies in suspiciousness order. Detection, and ranking, of anomalies has numerous applications: For example, in EEG recordings of an epileptic patient, an anomaly may indicate a seizure; in computer network traffic data, it may signify a power failure, or a DoS/DDoS attack. We start by setting some reasonable axioms; surprisingly, none of the earlier methods pass all the axioms. Our main contribution is the Gen2Out algorithm, which has the following desirable properties: (a) Principled and Sound anomaly scoring that obeys the axioms for detectors; (b) Doubly-general, in that it detects, as well as ranks, generalized anomalies, both point- and group-anomalies; (c) Scalable, being fast and linear on input size; and (d) Effective, as experiments on real-world epileptic recordings (200GB) demonstrate the effectiveness of Gen2Out as confirmed by clinicians. Experiments on 27 real-world benchmark datasets show that Gen2Out detects ground truth groups, matches or outperforms point-anomaly baseline algorithms on accuracy, with no competition for group-anomalies, and requires about 2 minutes for 1 million data points on a stock machine.
Conference proceeding
Entity resolution in dynamic heterogeneous networks
Published 2020
Companion Proceedings of the Web Conference 2020
WWW '20
Networks evolve continuously over time, not only with the addition and deletion of links and nodes but also with changes in the importance of edges. Even though many networks contain this type of temporal weighting, the vast majority of research in network representation learning and classification has focused on static snapshots of the graph, while largely ignoring the temporal dynamics. In this work, we describe two approaches for incorporating weighted temporal information into network embedding methods such as Graph Convolutional Networks (GCNs). While the first approach aggregates time-weighted edges and nodes, the second approach uses temporal random walks to find relevant convolution nodes. With experiments on public and proprietary datasets, we demonstrate the effectiveness of the proposed TimeSage for link prediction tasks. By applying these predictions, we show improvements in our task of identifying fraudulent actors on a large e-commerce website selling software as subscriptions.
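The temporal random walks mentioned above can be sketched as follows. This assumes one common definition, in which each step must follow an edge whose timestamp is no earlier than that of the previous step; the abstract does not specify the exact rule used by TimeSage, so the function `temporal_random_walk`, its parameters, and the toy edge list are all hypothetical.

```python
import random


def temporal_random_walk(edges, start, length, rng=None):
    """Walk over (u, v, timestamp) edges with non-decreasing timestamps.

    Assumed rule for illustration: edges are treated as undirected, and the
    next edge is chosen uniformly among those not older than the last one.
    """
    rng = rng or random.Random()
    adjacency = {}
    for u, v, t in edges:
        adjacency.setdefault(u, []).append((v, t))
        adjacency.setdefault(v, []).append((u, t))

    walk, times = [start], []
    current, last_time = start, float("-inf")
    for _ in range(length):
        candidates = [(v, t) for v, t in adjacency.get(current, []) if t >= last_time]
        if not candidates:
            break  # No time-respecting edge is left; stop early.
        current, last_time = rng.choice(candidates)
        walk.append(current)
        times.append(last_time)
    return walk, times


# Toy timestamped edges (invented).
edges = [("a", "b", 1), ("b", "c", 2), ("c", "d", 3), ("b", "d", 0)]
walk, times = temporal_random_walk(edges, "a", 3, random.Random(0))
print(walk, times)
```

Nodes visited by such walks could then serve as the neighborhood over which a graph convolution aggregates, which is one plausible reading of "relevant convolution nodes".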
Conference proceeding
Incorporating privileged information to unsupervised anomaly detection
Published 2018
Machine Learning and Knowledge Discovery in Databases: European Conference Part I
Machine Learning and Knowledge Discovery in Databases European Conference, 10/10/2018–10/14/2018, Dublin, Ireland
We introduce a new unsupervised anomaly detection ensemble called SPI which can harness privileged information - data available only for training examples but not for (future) test examples. Our ideas build on the Learning Using Privileged Information (LUPI) paradigm pioneered by Vapnik et al. [19,17], which we extend to unsupervised learning and in particular to anomaly detection. SPI (for Spotting anomalies with Privileged Information) constructs a number of frames/fragments of knowledge (i.e., density estimates) in the privileged space and transfers them to the anomaly scoring space through "imitation" functions that use only the partial information available for test examples. Our generalization of the LUPI paradigm to unsupervised anomaly detection shepherds the field in several key directions, including (i) domain knowledge-augmented detection using expert annotations as PI, (ii) fast detection using computationally-demanding data as PI, and (iii) early detection using "historical future" data as PI. Through extensive experiments on simulated and real datasets, we show that augmenting anomaly detection with privileged information significantly improves detection performance. We also demonstrate the promise of SPI under all three settings (i-iii), with PI capturing expert knowledge, computationally expensive features, and future data on three real-world detection tasks.
Conference proceeding
Spreading Activation Way of Knowledge Integration
Published 2015
Mining Intelligence and Knowledge Exploration: Third International Conference, MIKE 2015, Hyderabad, India, December 9-11, 2015, Proceedings 3
MIKE 2015: Mining Intelligence and Knowledge Exploration, 2015
Search and recommender systems benefit from effective integration of two different kinds of knowledge. The first is introspective knowledge, typically available in feature-theoretic representations of objects. The second is external knowledge, which could be obtained from how users rate (or annotate) items, or collaborate over a social network. This paper presents a spreading activation model that is aimed at a principled integration of these two sources of knowledge. In order to empirically evaluate our approach, we restrict the scope to text classification tasks, where we use the category knowledge of the labeled set of examples as an external knowledge source. Our experiments show a significantly improved classification effectiveness on hard datasets, where feature value representations, on their own, are inadequate in discriminating between classes.
Conference proceeding
Linking cases up: An extension to the case retrieval network
Published 2014
Case-Based Reasoning Research and Development: 22nd International Conference, ICCBR 2014, Cork, Ireland, September 29, 2014-October 1, 2014. Proceedings 22
ICCBR 2014: Case-Based Reasoning Research and Development, 2014
In many domains, cases are associated with each other, though this is not easily explained by the set of features they share. It is hard, for example, to explicitly enumerate features that make a movie romantic. We present an extension to the Case Retrieval Network architecture, a spreading activation model initially proposed by Burkhard and Lenz, by allowing cases to influence each other independently of the features. We show that the architecture holds promise in improving the effectiveness of retrieval in two distinct experimental domains.
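The idea of letting cases influence each other independently of features can be sketched with a toy spreading activation retrieval. This is an illustrative simplification, not the paper's formulation: the function `retrieve`, the single round of case-to-case spreading, the damping factor `alpha`, and all weights are assumptions.

```python
def retrieve(query, relevance, case_links=None, alpha=0.5):
    """Rank cases by activation spread from query features, then between cases.

    query:      {feature: activation}
    relevance:  {case: {feature: weight}}
    case_links: {case: {other_case: weight}}, optional direct case-to-case links
    alpha:      assumed damping factor for case-to-case spreading
    """
    # Step 1: activation flows from query features to cases.
    base = {
        case: sum(weight * query.get(feature, 0.0) for feature, weight in feats.items())
        for case, feats in relevance.items()
    }
    # Step 2: one round of spreading along case-to-case links.
    final = dict(base)
    for case, neighbours in (case_links or {}).items():
        for other, weight in neighbours.items():
            if other in final:
                final[other] += alpha * weight * base.get(case, 0.0)
    return sorted(final.items(), key=lambda item: item[1], reverse=True)


# Toy movie cases (invented): m2 shares no feature with the query,
# but it is linked to m1, which matches the query directly.
relevance = {"m1": {"romance": 1.0}, "m2": {"comedy": 1.0}, "m3": {"war": 1.0}}
query = {"romance": 1.0}
links = {"m1": {"m2": 1.0}}
print(retrieve(query, relevance, links))
```

In this toy run, m2 receives activation only through its link to m1, which mirrors the motivating example of movies that are related without sharing explicit features.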