Toward Unified Scientific Representation Learning: from Pure-Data Driven Forecasting to Multimodality Fusion

Zhengyang Zhou

doi:10.48617/etd.1421

Dissertation

Open access

Doctor of Philosophy (PhD), Brandeis University, Graduate School of Arts & Sciences

2025

DOI: https://doi.org/10.48617/etd.1421

Abstract

As scientific challenges become increasingly complex and data sources more diverse, there is a growing need for learning systems that move beyond purely data-driven methods. In domains such as chemistry and materials science, where data can be scarce, heterogeneous, or deeply tied to domain expertise, conventional models often struggle to produce accurate and interpretable results. This dissertation addresses these challenges by investigating unified scientific representation learning through the integration of multimodal data and relational similarity learning. It explores three interrelated directions that reflect a progression from unimodal analysis to full multimodal fusion. The first part focuses on predicting spatial orientation using image modalities, demonstrating how pixel information can guide models in learning directional patterns. The second part centers on multimodal alignment in chemistry, aligning molecular graphs with spectral data using contrastive learning to ensure consistent and enriched representations across data types. The third part introduces a framework for multimodal fusion in molecular property prediction, showing how integrating multiple modalities within a graph-based architecture captures both local and global relationships critical for generalization. Collectively, these contributions advance the development of flexible, modality-aware representation learning systems that improve robustness, interpretability, and predictive performance in scientific machine learning.

Details

Title: Toward Unified Scientific Representation Learning: from Pure-Data Driven Forecasting to Multimodality Fusion
Creators: Zhengyang Zhou
Contributors: Pengyu Hong (Advisor); Li Zhou (Committee Member); Hongfu Liu (Committee Member); Jinfeng Zhang (Committee Member)
Awarding Institution: Brandeis University, Graduate School of Arts & Sciences; Doctor of Philosophy (PhD)
Theses and Dissertations: Doctor of Philosophy (PhD), Brandeis University, Graduate School of Arts & Sciences
Number of pages: 124
Identifiers: 9924505531201921
Academic Unit: Michtom School of Computer Science
Language: English
Resource Type: Dissertation
