Abstract
Even when pruned by state-of-the-art network compression methods, training Graph Neural
Networks (GNNs) on non-Euclidean graph data often incurs relatively high time costs,
due to its irregular structure and density, compared with data in the regular
Euclidean space. Another property inherent to graph data is class imbalance, which
cannot be alleviated even by massive amounts of graph data and which hinders GNNs'
generalization. To tackle these undesirable properties, (i) theoretically, we introduce
a hypothesis about to what extent a subset of the training data can approximate the
learning effectiveness of the full dataset. This effectiveness is further guaranteed
and proved via the distance between the gradients computed on the subset and on the
full set; (ii) empirically,
we discover that during the learning process of a GNN, some samples in the
training dataset are informative in providing gradients for updating model
parameters. Moreover, this informative subset is not fixed throughout training:
samples that are informative in the current training epoch may not be so in the
next one. We also notice that sparse subnetworks pruned from a well-trained GNN
sometimes forget the information provided by the informative subset, as reflected
in their poor performance on this subset. Based on these
findings, we develop a unified data-model dynamic sparsity framework named
Graph Decantation (GraphDec) to address the challenges of training on massive,
class-imbalanced graph data. The key idea of GraphDec is to dynamically identify
the informative subset during training by adopting
sparse graph contrastive learning. Extensive experiments on benchmark datasets
demonstrate that GraphDec outperforms baselines on both graph-level and node-level
classification tasks, in terms of classification accuracy and data usage efficiency.