Term
4 things used to describe "rewards" (functional definitions) |
|
Definition
Generates approach/consummatory behaviors (reinforces an action)
Produces learning of this behavior
May engage positive emotions
Represents the positive outcomes of economic decisions |
|
|
Term
In general, what do we use rewards for? |
|
Definition
Use rewards to make predictions about the environment to engage in goal-directed behavior |
|
|
Term
Examples of primary rewards... |
|
Definition
Biologically ingrained into us; needed for survival
Water, oxygen, food, sex, temperature |
|
|
Term
Examples of secondary (social) rewards... |
|
Definition
Derive value from primary rewards; help us feel good
Shelter, praise, fame, power |
|
|
Term
What is the behavior of schadenfreude? |
|
Definition
Closely related to spite/vengefulness (strictly, schadenfreude is the pleasure felt at someone else's misfortune)
When we engage in this behavior, we are willing to face negative consequences (inflict some form of pain on ourselves) in order for someone else to feel more pain |
|
|
Term
What does valence refer to when describing rewards? |
|
Definition
The positive or negative value associated with the reward
Indicates that rewards are subjective; if a person likes something, it has a positive valence; if a person dislikes something, it has a negative valence |
|
|
Term
What is the main assumption underlying Decision Theory? |
|
Definition
Main assumption is that the choice a person makes in a situation should be the option they perceive as having the most value (what they value the most); assumes that people will choose rationally |
|
|
Term
What are the important decision variables? |
|
Definition
Subjective Value of Reward (UTILITY)
Weighted Probability of Reward (LIKELIHOOD)
Temporal Discounting (want more now) |
|
|
Term
In terms of decision making and decision theory, where is there typically a discrepancy? |
|
Definition
Often see discrepancy between what people should choose (NORMATIVE representation) and what they actually choose (DESCRIPTIVE representation) |
|
|
Term
Normative vs. Descriptive Representation |
|
Definition
Normative = what people should choose
Descriptive = what people actually choose |
|
|
Term
What is "temporal discounting"? |
|
Definition
Tend to value rewards MORE as they move towards the present (want more now); e.g. would rather choose $500 now, as opposed to $700 in 5 years |
|
|
Term
Expected Value (EV) |
Definition
EV = reward x probability
Should always make decisions (according to decision theory) based on the HIGHEST expected value; value of reward times how likely it is to occur |
|
|
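Worked example (not part of the original cards): a minimal Python sketch of the EV rule, assuming the simple form EV = reward x probability given above. The function name expected_value is made up for illustration, and the two options reuse the 500k vs. 1M amounts from the class scenario.

```python
def expected_value(reward: float, probability: float) -> float:
    """EV = reward x probability, the simple form used on the card."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return reward * probability


# Two illustrative options: a sure 500k vs. an 80% chance of 1M.
sure_thing = expected_value(500_000, 1.0)
gamble = expected_value(1_000_000, 0.8)

# Decision theory (normative): pick the option with the highest EV.
options = {"sure thing": sure_thing, "gamble": gamble}
normative_choice = max(options, key=options.get)
print(normative_choice, options[normative_choice])
```

Under this rule the gamble wins on EV alone, which is the normative answer; what people actually pick (the descriptive answer) often differs.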
Term
What is true of people when it comes to GAINS vs. LOSSES? |
|
Definition
People are risk AVERSE when it comes to GAINS
People are risk SEEKING when it comes to LOSSES. |
|
|
Term
Why are people risk averse when it comes to gains? |
|
Definition
In the money example from class, the most important variable is the magnitude of the reward (subjective value/utility); we value the first 500k more than the second 500k (principle of diminishing returns as the reward magnitude increases) |
|
|
Term
Example of diminishing return.. |
|
Definition
In the scenario where you have a 100% chance of winning 500k vs. an 80% chance of winning 1M, most people will stick with the 100% scenario because they are risk averse. We value the first 500k more than the second 500k, so we think the subjective value of the 100% scenario is higher |
|
|
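Worked example (not part of the original cards): a rough Python sketch of how diminishing returns can flip the choice. The logarithmic utility function is an assumption picked only because it is concave; the cards do not say which utility curve was used in class, and the helper names are hypothetical.

```python
import math


def utility(amount: float) -> float:
    """Assumed concave utility (log of 1 + amount): each extra dollar adds less."""
    return math.log1p(amount)


def expected_utility(amount: float, probability: float) -> float:
    """Subjective value of the reward weighted by its probability."""
    return probability * utility(amount)


# Diminishing returns: the first 500k adds more utility than the second 500k.
first_500k = utility(500_000) - utility(0)
second_500k = utility(1_000_000) - utility(500_000)

# Class scenario: 100% chance of 500k vs. 80% chance of 1M.
sure_thing = expected_utility(500_000, 1.0)
gamble = expected_utility(1_000_000, 0.8)

print("first vs. second 500k:", first_500k, second_500k)
print("sure thing vs. gamble:", sure_thing, gamble)
```

With this assumed curve the sure 500k has the higher subjective value even though the gamble has the higher EV (800k vs. 500k), which is the risk-averse pattern described above. A less concave curve could favor the gamble, so the result depends on the utility function chosen.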
Term
Why are people risk "seeking" when it comes to losses? |
|
Definition
Because losses "hurt" more than gains; notice the steeper slope on the graph of subjective value vs. gains/losses |
|
|
Term
Compare subjective utility for loss of 500k compared to a gain of 500k |
|
Definition
The change in subjective value is MUCH larger for the loss of 500k than for the gain of 500k because of the steeper slope on the loss side of the graph; this causes us to engage in risk "seeking" behavior when it comes to losses |
|
|
Term
Explain S-shaped curve around probability and how it leads us to act given its subjective weighting... |
|
Definition
We tend to overweight improbable events; perceive low probability events to be more likely (convex curve); POTENTIAL minded
We tend to underweight higher probability events (concave curve); SECURITY minded |
|
|
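Worked example (not part of the original cards): a small Python sketch of a subjective probability weighting curve. It assumes the one-parameter form from Tversky and Kahneman's cumulative prospect theory, w(p) = p^g / (p^g + (1 - p)^g)^(1/g), with g = 0.61; the cards do not say which curve was shown in class, so treat both the formula and the parameter as stand-ins.

```python
def weighted_probability(p: float, gamma: float = 0.61) -> float:
    """Assumed weighting function: w(p) = p^g / (p^g + (1 - p)^g)^(1/g)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    numerator = p ** gamma
    denominator = (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)
    return numerator / denominator


# Compare the objective probability with its subjective weight.
for p in (0.01, 0.10, 0.50, 0.90, 0.99):
    print(f"p = {p:.2f} -> w(p) = {weighted_probability(p):.3f}")
```

Under these assumptions, low probabilities come out with a weight above their true value (overweighted) and high probabilities come out below it (underweighted), matching the two tendencies listed on the card.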
Term
What does the shape of the subjective weighting of probability curve describe our behavior to be? |
|
Definition
S-shaped curve describes us to be "cautiously hopeful/optimistic" |
|
|
Term
Given similar rewards, what do humans prefer? What is this called? |
|
Definition
Given similar rewards, humans prefer those that arrive sooner (tend to overvalue rewards that will occur closer to the present)
This is known as "temporal discounting" |
|
|
Term
How do perceived values of rewards diminish as they occur farther into the future? |
|
Definition
Value diminishes by a hyperbolic function (value of rewards discounted hyperbolically) |
|
|
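Worked example (not part of the original cards): a minimal Python sketch of hyperbolic discounting, assuming the common form V = A / (1 + k x D), where A is the reward amount, D is the delay and k is a discount rate. The value of k is an arbitrary illustration, not a number from class; the $500 now vs. $700 in 5 years choice is the one from the temporal discounting card.

```python
def discounted_value(amount: float, delay_years: float, k: float) -> float:
    """Assumed hyperbolic form: V = A / (1 + k * D)."""
    if delay_years < 0 or k < 0:
        raise ValueError("delay and k must be non-negative")
    return amount / (1.0 + k * delay_years)


K_STEEP = 0.10    # hypothetical discount rate per year (steep discounter)
K_SHALLOW = 0.05  # hypothetical discount rate per year (shallow discounter)

now = discounted_value(500, 0, K_STEEP)            # immediate reward, undiscounted
later_steep = discounted_value(700, 5, K_STEEP)    # 700 / 1.5
later_shallow = discounted_value(700, 5, K_SHALLOW)  # 700 / 1.25

print("steep discounter prefers:", "now" if now > later_steep else "later")
print("shallow discounter prefers:", "now" if now > later_shallow else "later")
```

With the steeper k the delayed $700 is worth less than $500 today, so the immediate reward is chosen; with the shallower k the delayed reward wins. A smaller k corresponds to a less steep discounting function.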
Term
How does our ability to delay gratification change with time? |
|
Definition
Rewards occurring farther in the future become relatively less discounted (the discounting function becomes less steep)
We learn to temporally discount rewards less over time |
|
|
Term
Game Theory (& two examples of where it can be applied) |
|
Definition
Looks at decision making in social interactions
Looks at decision making in situations where a person's success is dependent on their actions AND those of others
Examples - ultimatum game & trust game |
|
|
Term
Evidence DA is central NT in reward systems... |
|
Definition
1. Effects of DBS
2. Drugs of abuse act on DA pathways - all increase DA release
3. Lesions to DA pathways cause lack of goal-motivated behaviors & motivation
4. Recording of DA neurons - activity scales with reward magnitude |
|
|
Term
What happens to the baseline reading of a DA neuron (in regards to DBS) when a DA agonist vs. antagonist is added? |
|
Definition
With AGONIST - curve shifts to the left and stays the same size; because there is an increase in baseline DA activity, a lower frequency is needed to reach the same response rate
With ANTAGONIST - curve shifts to the right and shrinks; shifts to the right because of a decrease in tonic DA activity and shrinks because fewer DA receptors are available |
|
|
Term
What happens to DBS if it is not tied to an action or event (i.e. cannot be controlled or predicted)? |
|
Definition
Becomes unpleasurable or even aversive; need something to tie to the DA release (event/stimulus) so we can LEARN to predict/control it
We need to have control over our environment to enjoy things, so we need an event tied to the DA release to be able to predict its occurrence |
|
|
Term
In what two scenarios does DA release occur? |
|
Definition
When something pleasurable occurs
When a salient stimulus is presented (something surprising) |
|
|
Term
What happens to animals who have lesions to DA neurons? |
|
Definition
Seem to lack motivation and cannot engage in goal directed behaviors |
|
|
Term
What is true of the activity of DA neurons? |
|
Definition
They are all TONICALLY ACTIVE; all DA neurons fire at some baseline rate (constant release of DA across synapse) |
|
|
Term
Do DA neurons discriminate between the type of reward received? |
|
Definition
No - see phasic burst after reward presentation regardless of what type of reward was received |
|
|
Term
What does the amount of DA firing vary with? |
|
Definition
Varies with the PREDICTED REWARD (varies with how reward presented matches up with what you thought you were going to get) |
|
|
Term
If there is more vs. less reward presented than is expected what happens to DA neuron firing? |
|
Definition
If reward > expected, get increase in DA neuron firing after reward is presented
If reward is < expected or omitted, get cessation or decrease in DA neuron firing |
|
|
Term
What does DA neuron firing scale with? (what does it vary according to?) |
|
Definition
DA neuron firing scales with the MAGNITUDE OF THE REWARD (when compared to its expected value); if you get more than you thought you would, get more DA firing, for example |
|
|
Term
What is the benefit of pairing a stimulus with a reward? |
|
Definition
They are paired for the benefit of PREDICTION - want to be able to use the stimulus to predict the occurrence of the reward |
|
|
Term
What 3 factors govern learning according to Learning Theory? |
|
Definition
Contiguity - need sequential presentation of stimulus followed by reward
Contingency - need the (conditioned) stimulus to be able to accurately predict reward; learning is contingent on this prediction
RPE - to learn we need "discrepancy" between predicted reinforcer and the actual reinforcer (reward) received |
|
|
Term
When referring to Learning Theory, what does "contiguity" refer to? |
|
Definition
Refers to the need for the stimulus to be presented first, followed by the reward/reinforcer, in that specific order. This needs to be repeated over continuous trials, with only a short time delay between the two presentations. |
|
|
Term
In referring to contiguity, how can learning be accomplished faster/to a stronger degree? |
|
Definition
By shortening the time interval between presentation of the stimulus and the reward
Need the two to be temporally proximal |
|
|
Term
Differences between operant & classical conditioning? |
|
Definition
Operant = trial and error learning; don't have any environmental cues to help
Classical = use environmental cues (stimuli) to predict what behaviors elicit rewards |
|
|
Term
What occurs to DA activity during learning "acquisition"? |
|
Definition
In initial trials, see DA phasic burst occur after reward is presented
However, as learning occurs, DA burst occurs after presentation of CS, instead of after reward is presented; use DA to PREDICT OCCURRENCE of reward |
|
|
Term
In terms of learning acquisition, what is the function of DA? Evidence? |
|
Definition
As learning is occurring/has occurred, DA is used to PREDICT the occurrence of the reward
This is evidenced by the fact that, once learning has occurred, the phasic burst of DA firing comes after the CS is presented instead of after the reward is presented |
|
|
Term
In terms of learning theory & contingency, what is learning proportional to? |
|
Definition
How good of a predictor the CS is for the occurrence of the reward
Strongest learning occurs if the CS predicts reward 100% of the time. |
|
|
Term
What happens to DA firing if... a) CS 100% predicts reward b) CS is neutral to reward (0%) c) CS fully predicts reward's absence |
|
Definition
a) Get increase in DA firing after CS presented
b) Get tonic DA firing after CS
c) Cessation of DA firing (no firing) after CS |
|
|
Term
Reward Prediction Error (RPE) |
Definition
RPE = difference between the predicted reward (what you think you'll get) and the actual reward received (what you got) |
|
|
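Worked example (not part of the original cards): a short Python sketch of the RPE and the DA response the cards pair with it. The sign convention (actual minus predicted, so "better than expected" is positive) is an assumption chosen to match the "reward > expected" card; the function names and reward numbers are made up.

```python
def reward_prediction_error(actual: float, predicted: float) -> float:
    """RPE as actual minus predicted reward (assumed sign convention)."""
    return actual - predicted


def da_response(rpe: float) -> str:
    """Qualitative DA firing pattern described on the cards for each RPE sign."""
    if rpe > 0:
        return "increase in firing (phasic burst)"
    if rpe < 0:
        return "decrease or cessation of firing"
    return "tonic (baseline) firing"


# Hypothetical trials: predicted 2 units of reward each time.
for actual in (3.0, 2.0, 0.0):
    rpe = reward_prediction_error(actual, predicted=2.0)
    print(f"actual = {actual}, RPE = {rpe:+.1f}: {da_response(rpe)}")
```

The three trials cover more reward than expected, exactly the expected reward, and an omitted reward.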
Term
In operant conditioning, what does DA neuron firing respond to? |
|
Definition
In operant conditioning, DA firing corresponds to TEMPORAL PREDICTION of the rewards - have an "expected time" (after action is completed) that the reward is expected within |
|
|
Term
In operant conditioning with a learned task, what happens if: a) Reward presented at expected time b) Reward not presented c) Reward presented 1/2 second early or late |
|
Definition
a) Get tonic DA activity because reward is expected and presented at the same time
b) Get cessation/decrease in DA activity because no reward is presented
c) Get increase in activity in DA firing because there is a temporal discrepancy and the reward is presented at a different time |
|
|
Term
In terms of the reinforcement learning algorithm, what do the DA neurons encode for/calculate? |
|
Definition
Encode for the difference/change in associated value from trial to trial (this is the reward prediction error) |
|
|
Term
Why is it beneficial that we use the reinforcement algorithm as a recursive function in everyday life, as opposed to trying to remember everything? |
|
Definition
Use it so that we can weight more recent events more heavily; only remember Vt from trial to trial, so by weighting more recent things more heavily we can adapt to our current environments |
|
|
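Worked example (not part of the original cards): a Python sketch of the recursive update, assuming the standard Rescorla-Wagner style rule V(t+1) = V(t) + alpha x (R - V(t)), where alpha is the learning rate and R - V(t) is the RPE. The cards do not spell out the exact equation, so this form, the reward sequence and the helper names are assumptions. The sketch also checks that keeping only V from trial to trial is equivalent to a weighted sum of all past rewards in which recent trials count more.

```python
def update_value(v: float, reward: float, alpha: float) -> float:
    """One recursive step: V(t+1) = V(t) + alpha * (R - V(t))."""
    rpe = reward - v  # reward prediction error on this trial
    return v + alpha * rpe


def learn(rewards, alpha: float, v0: float = 0.0) -> float:
    """Run the update over a sequence of rewards, remembering only V."""
    v = v0
    for r in rewards:
        v = update_value(v, r, alpha)
    return v


def recency_weights(n: int, alpha: float):
    """Weight each past reward ends up with (oldest first) when V starts at 0."""
    return [alpha * (1.0 - alpha) ** (n - 1 - i) for i in range(n)]


rewards = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0]  # hypothetical trial outcomes
alpha = 0.3                               # hypothetical learning rate

v_recursive = learn(rewards, alpha)
weights = recency_weights(len(rewards), alpha)
v_weighted = sum(w * r for w, r in zip(weights, rewards))

print("recursive V:", v_recursive)
print("weights, oldest to newest:", [round(w, 3) for w in weights])
print("weighted-sum V:", v_weighted)
```

The weights grow toward the most recent trial, so the single stored value tracks the current environment without a memory of every past reward.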
Term
What does a high vs. low learning rate favor? |
|
Definition
High learning rate -> favors exploration
Low learning rate -> favors exploitation |
|
|
Term
Type of curve seen in variable vs. constant reward schedule? |
|
Definition
Variable - jagged curve which averages out onto expected value
Constant - straight curve with plateau at expected value |
|
|
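Worked example (not part of the original cards): a rough Python simulation of the two curves, assuming the same update rule V(t+1) = V(t) + alpha x (R - V(t)). The reward amounts, the 50% payout probability, the learning rate and the trial count are all invented for illustration.

```python
import random


def value_curve(rewards, alpha: float = 0.1, v0: float = 0.0):
    """Return the learned value V after each trial."""
    v = v0
    curve = []
    for r in rewards:
        v += alpha * (r - v)  # move V toward the reward by a fraction alpha
        curve.append(v)
    return curve


rng = random.Random(0)  # fixed seed so the run is repeatable
n_trials = 5000

# Constant schedule: 5 units every trial (expected value 5).
constant_rewards = [5.0] * n_trials
# Variable schedule: 10 units half the time, else nothing (expected value 5).
variable_rewards = [10.0 if rng.random() < 0.5 else 0.0 for _ in range(n_trials)]

constant_curve = value_curve(constant_rewards)
variable_curve = value_curve(variable_rewards)

late = variable_curve[500:]  # skip the initial climb
late_mean = sum(late) / len(late)

print("constant schedule, final V:", round(constant_curve[-1], 3))
print("variable schedule, long-run mean of V:", round(late_mean, 3))
```

The constant schedule rises smoothly and flattens at the expected value, while the variable schedule keeps jumping up and down around roughly the same expected value.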
Term
For a slot machine type game with "variable" reward schedules, what type of curve would be seen? |
|
Definition
Would see a jagged curve, with the average being located at the expected value of V in the long run |
|
|
Term
Mesocortical vs. Mesolimbic (comparison)... |
|
Definition
Mesolimbic - associative, can work in parallel, fast, intuitive, automatic, emotionally influenced
Mesocortical - works in series (one at a time), slow, effortful, not emotionally influenced, rule governed |
|
|
Term
Alternate names for mesolimbic vs. mesocortical pathways? |
|
Definition
Mesolimbic - INTUITIVE
Mesocortical - REASONING |
|
|
Term
In the mesolimbic vs. mesocortical pathways where do the modulatory DA neurons stem from and where do they modulate the pathway? |
|
Definition
In mesolimbic - come from VTA; modulate at the level of the striatum
In mesocortical - come from VTA; modulate at the level of the PFC (cortical) |
|
|
Term
Name the pathway of the mesocortical system (anatomical connections)... |
|
Definition
Runs from VTA to the frontal/prefrontal cortices
Input from dlPFC to striatum (VTA modulates input at PFC level); output from GPi/SNpr, through thalamus, back up to the dlPFC |
|
|
Term
Name the pathway of the mesolimbic system (anatomical connections)... |
|
Definition
From VTA to nucleus accumbens in STRIATUM
Input comes from mOFC, ACC, amygdala, hippocampus, insula (output eventually goes here too)
BG output comes from GPi/SNpr through thalamic relay nuclei |
|
|
Term
Functions of the mesolimbic vs. mesocortical pathways... |
|
Definition
Mesolimbic - involved in depression, schizophrenia, addiction; "reward circuit" of the brain
Mesocortical - involved in executive function (attention, working memory), and most importantly INHIBITION |
|
|
Term
How would humans behave without over-developed PFCs? |
|
Definition
Would behave as very reflexive creatures - would be extremely impulsive with our actions towards our surroundings and would always strive for immediate gratification, as opposed to long-term goal directed behavior |
|
|
Term
Name deficits/symptoms of someone with lesions to their PFC: |
|
Definition
Cognitive - short attention, impaired working memory, lack of motivation
Behavioral - overly aggressive/sexual behavior, perseveration (repeated behavior)
Emotional - angry, depressed |
|
|
Term
What are the 3 main types of deficits seen in those with PFC lesions? |
|
Definition
Cognitive, Behavioral, Emotional |
|
|
Term
With application of DA antagonists to the dlPFC, what was seen? |
|
Definition
Impaired ability to perform contralateral memory-guided saccades - suggests a role for the dlPFC in working memory |
|
|
Term
What is the role of the dlPFC in anti-saccades? |
|
Definition
With dlPFC lesions (DA antagonists), there were problems with contralateral anti-saccades - subjects could not inhibit the eye movement toward the cue before moving in the other direction; the dlPFC is needed to suppress contralateral reflexive responses |
|
|
Term
Role of dlPFC in saccadic eye movements (2 points): |
|
Definition
1) Needed for working memory (problem with contralateral memory-guided saccades)
2) Needed for inhibition of reflexive saccadic eye movements (problem with contralateral anti-saccades) |
|
|
Term
Function of the mOFC |
Definition
Encodes and compares the relative value of REALIZED (R) and POTENTIAL (Vt+1) rewards |
|
|
Term
What are the relative firing rates of mOFC neurons when the decision is... a) a no-brainer (something vs. nothing) b) A and B have equal value c) there is a deviation from equal value |
|
Definition
a) low firing rates - because the decision is so basic, don't really need to compare relative values
b) low firing rates - because they are of equal value, either choice will suffice
c) high firing rates - this becomes subjective decision making so firing rates increase |
|
|
Term
When do you see the highest activity from mOFC neurons? |
|
Definition
After the offer is presented (encoding Vt+1) and after the reward is presented (encoding R) |
|
|
Term
In the human food auction experiment, what is the function of the mOFC neurons? |
|
Definition
These mOFC neurons encode how much the subject is "willing to pay" for the food item being auctioned; assign relative value to different foods to make decisions |
|
|
Term
What are the 3 main functions of the ACC in humans? |
|
Definition
Error Detection - e.g. w/ typing task
Conflict Monitoring - Stroop Task
Reward Based Learning (Task Switching) |
|
|
Term
Error Detection & ACC |
Definition
Shown with a typing task; when a typing error is made, see a spike in error-related negativity in the ACC (hyperpolarization); the ACC monitors errors in tasks & remembers them so you can perform better in the future |
|
|
Term
Conflict Monitoring & ACC |
|
Definition
Shown in Stroop Task - when incongruent color/word pairings were shown (conflict present) saw highest amount of activity in ACC (conflict between 2 different mental processes used to perceive situation) |
|
|
Term
In see-saw task, what are the 2 different signals for task switching to occur? |
|
Definition
Auditory feedback - hears a beep
Reward feedback - reward magnitude decreases by 1/2 |
|
|
Term
What activates neurons in the ACC during the see-saw task in reward based learning? |
|
Definition
The actual switching from one task to another activates neurons; neurons serve to predict when to switch tasks |
|
|
Term
Most important function of the ACC in relation to reward based learning? |
|
Definition
TASK SWITCHING - involved in changing task based on reward feedback |
|
|
Term
How can we prove that the ACC is involved in task switching in reward based learning? |
|
Definition
Use GABA agonist in ACC and see what happens - only see deficits in reward feedback task switching (behavior perseveration), but not in auditory feedback task switching |
|
|
Term
Amygdalae & fear based learning |
|
Definition
Amygdalae are important for forming long-term memory associations between neutral stimuli and painful/fearful events
Ablation of the amygdalae can lead to the inability to form these associations; inability to encode aversive/punishing outcomes |
|
|
Term
In the ultimatum game, what happens in the mesocortical and mesolimbic systems? |
|
Definition
See more activation in the amygdala, insula & dlPFC for unfair proposals than for fair offers
Increased insular activity is associated with people rejecting unfair offers (as well as with disgusting/fearful stimuli)
Increased dlPFC, amygdala & insula activity is predictive & scales with how unfair an offer is |
|
|
Term
What factors influence the speed vs. accuracy tradeoff... |
|
Definition
Quality of Evidence (e.g. % coherence of dots)
Pre-Existing Decision Variables (expected value/probability of rewards)
Urgency until Choice (how long you have to decide) |
|
|
Term
Does any learning occur if there is no RPE? Where on the graph is RPE shown? |
|
Definition
RPE occurs on the steep/sloped part of the graph; at the plateau RPE = 0 and NO learning is occurring
Need RPE (difference between expected and actual reward) for learning to occur |
|
|
Term
Size of PFC in normal person vs. ADHD? |
|
Definition
Children with ADHD have under-developed PFCs - this causes hyperactive behaviors with attention deficits because of the lack of inhibition from the under-developed PFC |
|
|
Term
What area of the brain do drugs of abuse (nicotine, meth, cocaine) influence and what is their effect? |
|
Definition
Affect the mesolimbic circuit of the brain (VTA to NA) and can lead to addiction; all involve increasing the amount of dopamine activity in this pathway |
|
|
Term
What area of the brain does schizophrenia affect? What treatments exist for it? |
|
Definition
Affects the mesolimbic circuits of the brain; may be caused by excess DA present in the mesolimbic circuit
Drugs to treat it function by decreasing the amount of DA present to alleviate symptoms |
|
|
Term
What pathway is affected in ADHD? What occurs here? |
|
Definition
In ADHD the mesocortical pathway is affected; see a decrease in DA activity in the under-developed PFC in those with ADHD
Ritalin is used to treat it by raising DA levels to increase the function of the PFC (increase inhibition) - Ritalin blocks DA reuptake to increase activity |
|
|