Abstract
The recent success of artificial intelligence can be attributed in part to advances in data-driven learning approaches, increased data availability, and enhanced computational power. However, in scenarios where data are hard to collect and the learning task demands complex domain-specific knowledge (for example, in biomedical applications), purely data-driven approaches may yield inappropriate and uninterpretable results. These outcomes sometimes contradict natural or human rules and may even raise ethical concerns. Incorporating human or expert knowledge, such as extra annotations, human-defined rules, and domain-specific engineering, can significantly enhance representation learning on small datasets. A major challenge in this context is that much human knowledge cannot be directly represented as numerical values, which makes it difficult for models to use this information effectively. In light of these issues, this dissertation contributes to interdisciplinary research through iterative development and collaboration with domain experts. It explores methods for improving learning on small datasets by integrating nonquantitative human knowledge from three perspectives: (1) integrating human logic to learn general relational knowledge, (2) leveraging domain-specific knowledge to enhance molecular representation learning, and (3) incorporating clinical and ethical considerations to refine assessment in medical predictions.