Abstract
Decision making and reward learning involve representing the available choices, anticipating their outcomes, assigning value to choices according to their attributes, incorporating additional constraints, making value-based decisions, and, depending on the outcome, reevaluating and adjusting both the representation and the valuation. Most of these steps require valuation, and updates to valuation, that are conditional on the context and characteristics of a reward, which grants flexibility and specificity in choice. Yet the field of decision making has frequently overlooked this need for specificity in pursuit of a pure valuation of “how good it is”. To address this issue, the current study applied multivariate pattern analysis to fMRI data collected during an olfactory trans-reinforcer reversal learning task to elucidate the role of the orbitofrontal cortex (OFC) in anticipating reward identity, and its relationship with the midbrain identity prediction error (iPE). I found significantly above-chance decoding accuracy for expected reward identity in the left lateral OFC during reward anticipation. Moreover, identity decoding accuracy in lateral OFC (lOFC) on the first trial after a reversal trial was significantly correlated with the univariate midbrain iPE on the reversal trial. These findings suggest that lOFC encodes value-independent reward information, and that the flow of identity information between lOFC and midbrain may support learning new reward identities when that feature of expectations is violated on reversals. This study confirms the role of OFC in constructing a cognitive map and helps build a more comprehensive picture of midbrain-cortical function in reward learning.