Abstract
To achieve its overarching goal of surviving and thriving in a complex and ever-changing environment, the brain must be able to take in sensory stimuli and transform this information into behavioral outputs appropriate for its current situation. This transformation requires solutions to many distinct sub-problems. Prominent among these are 1) the ability to flexibly alter behavioral responses to stimuli given the current context and 2) the ability to efficiently allocate the resources of time and energy to gain additional resources. The first problem requires a neural network whose activity pattern is capable of representing the entire history of stimuli over a behaviorally relevant timescale (the context). In Chapter 1 of this dissertation, we propose and evaluate the abilities of such a network, which uses transitions between point attractor states in high-dimensional space to encode the recent history of stimuli. We find that this type of network is capable of excellent sequence discrimination even under assumptions of realistic noise levels. Using a simple readout method to map network activity states to decisions, we then compare behavior produced by the network to that of mice in a working-memory task as well as to human free-recall data. We conclude that such a network provides a general and biologically plausible mechanism for the encoding of a history of stimuli.
The second problem is the domain of value-based decision-making, which spans the fields of foraging, neuroeconomics, and reinforcement learning. Every choice we make involves the weighing of potential rewards and costs. In Chapters 2 through 4, we analyze and model the behavior of rats and mice in foraging-like tasks in order to shed light on the algorithms and neural circuits that underlie these cost-benefit analyses. In the process, we identify likely sources of suboptimality in their behavior. In Chapter 2, we carry out a foraging-like preference test that mimics the naturalistic stay-switch decision-making required of animals as they search for food and other resources. Through an analysis of the sampling durations at each option as well as the sequence of sampling choices, we find evidence for a competitive impact of the relative palatabilities of the two options on sampling durations. We additionally compare the behavior of rats to that produced by a point-attractor network with two bistable groups and find that this network displays both the competitive impact between stimuli and the exponential distribution of sampling durations produced by rats. In Chapters 3 and 4, we analyze and model the behavior of mice in a foraging-like task where the mice chose between two levers with differing reward amounts and changing effort costs. Additionally, the rewards and costs of the levers were independently varied across sessions to assess the impact on choice optimality. In Chapter 3, we analyze the optimality of the mice across task parameter regimes and find that effort costs have a larger impact than reward size on mouse optimality. We additionally identify a susceptibility to the sunk cost fallacy as a source of suboptimal behavior. In Chapter 4, we make use of reinforcement learning methods to compare generative models of mouse behavior in this task. We find that the inclusion of a well-known perceptual limitation, the Weber-Fechner law, is indispensable in describing the behavior. We additionally find, together with observations from Chapter 2, that rodents display a strong innate drive for ‘exploratory’ actions even when they are not compatible with optimal behavior. Taken together, in this dissertation I present a body of work that highlights likely algorithms and neural circuit implementations that underlie flexible value-based decision-making.