Hongfu Liu

Assistant Professor of Computer Science

Data Mining

Machine Learning

Data analytics in terms of cluster analysis

Outlier detection

Transfer learning

Feature selection and fair machine learning

Conference paper

Characterizing the Influence of Graph Elements

by Zizhang Chen, Hongfu Liu and Pengyu Hong

Date presented 05/03/2023

The International Conference on Learning Representations, 05/01/2023–05/05/2023, Kigali, Rwanda

The influence function, a method from robust statistics, measures the change in model parameters, or in functions of those parameters, when training instances are removed or modified. It is an efficient and useful post-hoc method for studying the interpretability of machine learning models without the need for expensive model re-training. Recently, graph convolution networks (GCNs), which operate on graph data, have attracted a great deal of attention. However, there is no preceding research on the influence functions of GCNs to shed light on the effects of removing training nodes/edges from an input graph. Since the nodes/edges in a graph are interdependent in GCNs, it is challenging to derive influence functions for GCNs. To fill this gap, we started with the simple graph convolution (SGC) model that operates on an attributed graph and formulated an influence function to approximate the changes in model parameters when a node or an edge is removed from an attributed graph. Moreover, we theoretically analyzed the error bound of the estimated influence of removing an edge. We experimentally validated the accuracy and effectiveness of our influence estimation function. In addition, we showed that the influence function of an SGC model could be used to estimate the impact of removing training nodes/edges on the test performance of the SGC without re-training the model. Finally, we demonstrated how to use influence functions to guide adversarial attacks on GCNs effectively.

Conference poster

Exploiting Temporal Relations on Radar Perception for Autonomous Driving

by Peizhao Li, Pu Wang, Karl Berntorp and Hongfu Liu

Date presented 06/24/2022

IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 06/21/2022–06/24/2022, New Orleans, Louisiana

We consider the object recognition problem in autonomous driving using automotive radar sensors. Compared to lidar sensors, radar is cost-effective and robust in all-weather conditions for perception in autonomous driving. However, radar signals suffer from low angular resolution and precision in recognizing surrounding objects. To enhance the capacity of automotive radar, in this work, we exploit the temporal information from successive ego-centric bird's-eye-view radar image frames for radar object recognition. We leverage the consistency of an object's existence and attributes (size, orientation, etc.), and propose a temporal relational layer to explicitly model the relations between objects within successive radar images. In both object detection and multiple object tracking, we show the superiority of our method compared to several baseline approaches.

Conference presentation

Spatially Constrained GAN for Face and Fashion Synthesis

by Songyao Jiang, Hongfu Liu, Yue Wu and Yun Fu

Date presented 12/18/2021

IEEE International Conference on Automatic Face and Gesture Recognition 2021, 12/15/2021–12/18/2021, Online

Image synthesis has attracted tremendous attention in both academic and industrial areas, especially for conditional and target-oriented image synthesis, such as criminal portrait and fashion design. Current studies have achieved encouraging results along this direction, but they mostly focus on class labels where spatial contents are randomly generated from latent vectors. Some recent studies have explored spatial constraints for generative models guided by semantic segmentation, but most of them are designed for scene generation and lack random variation. Such methods are not suitable for face or fashion image synthesis, where different images may share the same semantics. Different from all the current methods, we decouple the image synthesis task into three independent dimensions and propose a novel Spatially Constrained Generative Adversarial Network (SCGAN) to model it. SCGAN uses a simple yet effective way to decouple spatial constraints and attribute conditions from latent vectors, and treats them as additional controllable signals via a segmentor and a specially designed generator. Other unregulated contents are left to be generated from latent vectors. Experimentally, we provide both qualitative and quantitative results on CelebA and DeepFashion datasets to demonstrate that the proposed SCGAN is very effective in synthesizing spatially controllable and attribute-specific images with high visual quality and large variations. Our code is provided at https://github.com/jackyjsy/SCGAN.

Conference poster

Implicit Semantic Response Alignment for Partial Domain Adaptation

by Wenxiao Xiao, Zhengming Ding and Hongfu Liu

Date presented 12/07/2021

Conference on Neural Information Processing Systems (NeurIPS), 12/07/2021–12/10/2021, Online

Partial Domain Adaptation (PDA) addresses the unsupervised domain adaptation problem where the target label space is a subset of the source label space. Most state-of-the-art PDA methods tackle the inconsistent label space by assigning weights to classes or individual samples, in an attempt to discard the source data that belongs to the irrelevant classes. However, we believe samples from those extra categories would still contain valuable information to promote positive transfer. In this paper, we propose the Implicit Semantic Response Alignment to explore the intrinsic relationships among different categories by applying a weighted schema on the feature level. Specifically, we design a class2vec module to extract the implicit semantic topics from the visual features. With an attention layer, we calculate the semantic response according to each implicit semantic topic. Then semantic responses of source and target data are aligned to retain the relevant information contained in multiple categories by weighting the features, instead of samples. Experiments on several cross-domain benchmark datasets demonstrate the effectiveness of our method over the state-of-the-art PDA methods. Moreover, we provide in-depth analyses to further explore implicit semantic alignment.

Conference paper

Fairness-Aware Unsupervised Feature Selection

by Xiaoying Xing, Hongfu Liu, Chen Chen and Jundong Li

Date presented 11/2021

ACM International Conference on Information and Knowledge Management, 11/01/2021–11/05/2021, Online

Feature selection is a prevalent data preprocessing paradigm for various learning tasks. Due to the high cost of acquiring supervision information, unsupervised feature selection has attracted great interest recently. However, existing unsupervised feature selection algorithms do not have fairness considerations and suffer from a high risk of amplifying discrimination by selecting features that are overly associated with protected attributes such as gender, race, and ethnicity. In this paper, we make an initial investigation of the fairness-aware unsupervised feature selection problem and develop a principled framework, which leverages kernel alignment to find a subset of high-quality features that can best preserve the information in the original feature space while being minimally correlated with protected attributes. Specifically, different from the mainstream in-processing debiasing methods, our proposed framework can be regarded as a model-agnostic debiasing strategy that eliminates biases and discrimination before downstream learning algorithms are involved. Experimental results on multiple real-world datasets demonstrate that our framework achieves a good trade-off between utility maximization and fairness promotion.

Conference paper

Towards Novel Target Discovery Through Open-Set Domain Adaptation

by Taotao Jing, Hongfu Liu and Zhengming Ding

Date presented 10/2021

IEEE/CVF International Conference on Computer Vision (ICCV), 10/11/2021–10/17/2021, Online

Open-set domain adaptation (OSDA) considers that the target domain contains samples from novel categories unobserved in the external source domain. Unfortunately, existing OSDA methods always ignore the demand for the information of unseen categories and simply recognize them as an "unknown" set without further explanation. This motivates us to understand the unknown categories more specifically by exploring the underlying structures and recovering their interpretable semantic attributes. In this paper, we propose a novel framework to accurately identify the seen categories in the target domain, and effectively recover the semantic attributes for unseen categories. Specifically, structure preserving partial alignment is developed to recognize the seen categories through domain-invariant feature learning. Attribute propagation over a visual graph is designed to smoothly transfer attributes from seen to unseen categories via visual-semantic mapping. Moreover, two new cross-domain benchmarks are constructed to evaluate the proposed framework in this novel and practical challenge. Experimental results on open-set recognition and semantic recovery demonstrate the superiority of the proposed method over other compared baselines.

Conference paper

Deep Clustering-based Fair Outlier Detection

by Hanyu Song, Peizhao Li and Hongfu Liu

Date presented 08/2021

ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 08/14/2021–08/18/2021, Online

In this paper, we focus on the fairness issues regarding unsupervised outlier detection. Traditional algorithms, without a specific design for algorithmic fairness, could implicitly encode and propagate statistical bias in data and raise societal concerns. To correct such unfairness and deliver a fair set of potential outlier candidates, we propose Deep Clustering based Fair Outlier Detection (DCFOD) that learns a good representation for utility maximization while enforcing the learnable representation to be subgroup-invariant on the sensitive attribute. Considering the coupled and reciprocal nature between clustering and outlier detection, we leverage deep clustering to discover the intrinsic cluster structure and out-of-structure instances. Meanwhile, adversarial training erases the sensitive pattern of instances for fairness adaptation. Technically, we propose an instance-level weighted representation learning strategy to enhance the joint deep clustering and outlier detection, where the dynamic weight module re-emphasizes contributions of likely-inliers while mitigating the negative impact from outliers. As demonstrated by experiments on eight datasets comparing against 17 outlier detection algorithms, our DCFOD method consistently achieves superior performance on both the outlier detection validity and two types of fairness notions in outlier detection.

Conference paper

SelfDoc: Self-Supervised Document Representation Learning

by Hongfu Liu, Peizhao Li, Jiuxiang Gu, Jason Kuen, Vlad Morariu, Handong Zhao, Rajiv Jain and Varun Manjunatha

Date presented 06/2021

IEEE Conference on Computer Vision and Pattern Recognition, 06/20/2021–06/25/2021, virtual / Nashville, TN

Conference paper

On Dyadic Fairness: Exploring and Mitigating Bias in Graph Connections

by Hongfu Liu, Peizhao Li, Yifei Wang and Pengyu Hong

Date presented 05/2021

International Conference on Learning Representations, 05/03/2021–05/07/2021, virtual

Conference paper

Tweet Sentiment Analysis of the 2020 U.S. Presidential Election

by Hongfu Liu, Han Yue and Ethan Xia

WWW '21: The Web Conference 2021, 04/19/2021–04/23/2021, Ljubljana, Slovenia

In this paper, we conducted a tweet sentiment analysis of the 2020 U.S. Presidential Election between Donald Trump and Joe Biden. Specifically, we identified the Multi-Layer Perceptron classifier as the methodology with the best performance on the Sanders Twitter benchmark dataset. We collected a sample of over 260,000 tweets related to the 2020 U.S. Presidential Election from the Twitter website via the Twitter API, performed feature extraction, and applied the Multi-Layer Perceptron to classify these tweets as having a positive or negative sentiment. From the results, we concluded that (1) contrary to popular poll results, the candidates had a very close negative to positive sentiment ratio, (2) negative sentiment is more common and prominent than positive sentiment within the social media domain, (3) some key events can be detected by the trends of sentiment on social media, and (4) sentiment analysis can be used as a low-cost and easy alternative to gather political opinion.

Brandeis University