The pop-neuroscience story is that dopamine is the “reward” chemical. Click a link on Facebook? That’s a hit of dopamine.
And there’s obviously an element of truth to that. It’s no accident that popular recreational drugs are usually dopaminergic. But the reality is a little more complicated. Dopamine’s role in the brain — including its role in reinforcement learning — isn’t limited to “pleasure” or “reward” in the sense we’d usually understand it.
The basal ganglia, located at the base of the forebrain, below the cerebral cortex and close to the limbic system, have a large concentration of dopaminergic neurons. This area of the brain deals with motor planning, procedural learning, habit formation, and motivation. Damage causes movement disorders (Parkinson’s, Huntington’s, tardive dyskinesia, etc.) or mental illnesses that have something to do with “habits” (OCD and Tourette’s). Dopaminergic neurons are relatively rare in the brain, and confined to a small number of locations: the striatal area (basal ganglia and ventral tegmental area), projections to the prefrontal cortex, and a few other areas where dopamine’s function is primarily neuroendocrine.
Dopamine, in other words, is not an all-purpose neurotransmitter like, say, glutamate (which is what the majority of neurons use). Dopamine does a specific thing, or a handful of things.
The important thing about the dopamine response to stimuli is that it is very fast. A stimulus associated with a reward causes a “phasic” (spiky) dopamine release within 70-100 ms. This is faster than the gaze shift (mammals instinctively focus their eyes on an unexpected stimulus). It’s even faster than the visual cortex can distinguish different images. Dopamine response happens faster than you can feel an emotion. It’s prior to emotion; it’s prior even to the more complicated parts of perception. This means that it’s wrong to interpret dopamine release as a “feel-good” response — it happens faster than you can feel at all.
What’s more, dopamine release is also associated with things besides rewards, such as an unexpected sound or an unpredictably flashing light. And dopamine is not released in response to a stimulus associated with an expected reward, only in response to an unexpected one. This suggests that dopamine has something to do with learning, not just “pleasure.”
Redgrave’s hypothesis is that dopamine release is an agency detector or a timestamp. It’s fast because it’s there to assign a cause to a novel event. “I get juice when I pull the lever”, emphasis on when. There’s a minimum of sensory processing; just a little basic “is this positive or negative?” Dopamine release determines what you perceive. It creates the world around you. What you notice and feel and think is determined by a very fast, pre-conscious process that selects for the surprising, the pleasurable, and the painful.
Striatal dopamine responses are important to perception. Parkinson’s patients and schizophrenics treated with neuroleptics (both of whom have lowered dopamine levels) have abnormalities in visual contrast sensitivity. Damage to dopaminergic neurons in rats causes sensory inattention and inability to orient towards new stimuli.
A related theory is that dopamine responds to reward prediction errors — not just rewards, but surprising rewards (or surprising punishments, or a surprisingly absent reward or punishment). These prediction errors can depend on models of what the individual expects to happen — for example, if the stimulus regularly reverses on alternate trials, the dopamine spikes stop coming because the pattern is no longer surprising.
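The reward-prediction-error idea can be sketched with a minimal Rescorla-Wagner-style simulation (a simplification of the models in the reinforcement-learning literature; the function and variable names here are mine). The “dopamine spike” plays the role of the error delta = r - V: large when a reward is surprising, and vanishing once the learned prediction V matches the actual reward.

```python
# Minimal reward-prediction-error sketch (Rescorla-Wagner style).
# delta = r - V stands in for the phasic dopamine signal:
# big when a reward is surprising, near zero once it's predicted.

def simulate(rewards, alpha=0.3):
    """Return the prediction error on each trial."""
    V = 0.0                    # learned value of the cue
    errors = []
    for r in rewards:
        delta = r - V          # "dopamine spike": surprise
        V += alpha * delta     # learning: update the prediction
        errors.append(delta)
    return errors

errors = simulate([1.0] * 20)  # the same reward, 20 trials in a row
print(round(errors[0], 3))     # first trial: fully surprising -> 1.0
print(round(errors[-1], 3))    # last trial: fully expected -> 0.001
```

The same mechanism reproduces the observations above: the first juice delivery produces a large spike, a fully predicted one produces almost none, and an unexpectedly *omitted* reward (r = 0 when V is near 1) produces a negative error, matching the dip seen when an expected reward fails to arrive.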
In other words, what you perceive and prioritize depends on what you have learned. Potentially, even, your explicit and conscious models of the world.
The incentive salience hypothesis is an attempt to account for the fact that damage to dopamine neurons does not prevent the individual from experiencing pleasure, but damages his motivation to work for desired outcomes. Dopamine must have something to do with initiating purposeful actions. The hypothesis is that dopamine assigns an “incentive value” to stimuli; it prioritizes the signals, and then passes them off to some other system (perceptual, motor, etc.). Dopamine seems to be involved in attention, and tonic dopamine deficiency tends to be associated with inattentive behavior in humans and rats. (Note that the drugs used to treat ADHD are dopamine reuptake inhibitors.) A phasic dopamine response says “Hey, this is important!” If the baseline is too low, you wind up thinking everything is important — hence, deficits in attention.
One way of looking at this is in the context of “objective” versus “subjective” reality. An agent with bounded computation necessarily has to approximate reality. There’s always a distorting filter somewhere. What we “see” is always mediated; there is no place in the brain that maps to a “photograph” of the visual field. (That doesn’t mean that there’s no such thing as reality — “objective” probably refers to invariants and relationships between observers and time-slices, ways in which we can infer something about the territory from looking at the overlap between maps.)
And there’s a sort of isomorphism between your filter and your “values.” What you record and pay attention to, is what’s important to you. Things are “salient”, worth acting on, worth paying attention to, to the extent that they help you gain “good” stuff and avoid “bad” stuff. In other words, things that spike your dopamine.
Values aren’t really the same as a “utility function” — there’s no reason to suppose that the brain is wired to obey the von Neumann-Morgenstern axioms, and in fact, there’s lots of evidence suggesting that it’s not. Phasic dopamine release actually corresponds very closely to “values” in the Ayn Rand sense. They’re pre-conscious; they shape perceptions; they are responses to pleasure and pain; values are “what one acts to gain and keep”, which sounds a whole lot like “incentive salience.”
Values are fundamental, in the sense that an initial evaluation of something’s salience is the lowest level of information processing. You are not motivated by your emotions, for instance; you are motivated by things deeper and quicker than emotions.
Values change in response to learning new things about one’s environment. Once you figure out a pattern, repetition of that pattern no longer surprises you. Conscious learning and intellectual thought might even affect your values, but I’d guess that it only works if it’s internalized; if you learn something new but still alieve in your old model, it’s not going to shift things on a fundamental level.
The idea of identifying with your values is potentially very powerful. Your striatum is not genteel. It doesn’t know that sugar is bad for you or that adultery is wrong. It’s common for people to disavow their “bestial” or “instinctive” or “System I” self. But your values are also involved in all your “higher” functions. You could not speak, or understand language, or conceive of a philosophical idea, if you didn’t have reinforcement learning to direct your attention towards discriminating specific perceptions, motions, and concepts. Your striatum encodes what you actually care about — all of it, “base” and “noble.” You can’t separate from it. You might be able to rewire it. But in a sense nothing can be real to you unless it’s grounded in your values.
Glimcher, Paul W. “Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis.” Proceedings of the National Academy of Sciences 108.Supplement 3 (2011): 15647-15654.
Ljungberg, T., and U. Ungerstedt. “Sensory inattention produced by 6-hydroxydopamine-induced degeneration of ascending dopamine neurons in the brain.” Experimental neurology 53.3 (1976): 585-600.
Marshall, John F., Norberto Berrios, and Steven Sawyer. “Neostriatal dopamine and sensory inattention.” Journal of comparative and physiological psychology 94.5 (1980): 833.
Masson, G., D. Mestre, and O. Blin. “Dopaminergic modulation of visual sensitivity in man.” Fundamental & clinical pharmacology 7.8 (1993): 449-463.
Nieoullon, André. “Dopamine and the regulation of cognition and attention.” Progress in neurobiology 67.1 (2002): 53-83.
Redgrave, Peter, Kevin Gurney, and John Reynolds. “What is reinforced by phasic dopamine signals?.” Brain research reviews 58.2 (2008): 322-339.
Schultz, Wolfram. “Updating dopamine reward signals.” Current opinion in neurobiology 23.2 (2013): 229-238.