Image from The Blue Diamond Gallery / Creative commons

Edited by Matthew A. McIntosh / 02.26.2018
Historian
Brewminate Editor-in-Chief

1 – Introduction to Learning

1.1 – Defining Learning

Learning involves a change in behavior or knowledge that results from experience.

1.1.1 – What is Learning?

Learning is an adaptive function by which our nervous system changes in relation to stimuli in the environment, thus changing our behavioral responses and permitting us to function in our environment. The process occurs initially in our nervous system in response to environmental stimuli. Neural pathways can be strengthened, pruned, activated, or rerouted, all of which cause changes in our behavioral responses.

Instincts and reflexes are innate behaviors—they occur naturally and do not involve learning. In contrast, learning is a change in behavior or knowledge that results from experience. The field of behavioral psychology focuses largely on measurable behaviors that are learned, rather than trying to understand internal states such as emotions and attitudes.

1.1.2 – Types of Learning

There are three main types of learning: classical conditioning, operant conditioning, and observational learning. Both classical and operant conditioning are forms of associative learning, in which associations are made between events that occur together. Observational learning is just as it sounds: learning by observing others.

1.1.3 – Classical Conditioning

Classical conditioning is a process by which we learn to associate events, or stimuli, that frequently happen together; as a result, we learn to anticipate events. Ivan Pavlov conducted a famous study in which he trained (or conditioned) dogs to associate the sound of a bell with the presence of a piece of meat. Conditioning is achieved when the sound of the bell on its own makes the dog salivate in anticipation of the meat.

1.1.4 – Operant Conditioning

Operant conditioning is the learning process by which behaviors are reinforced or punished, thus strengthening or weakening a response. Edward Thorndike formulated the “law of effect,” which holds that behaviors followed by consequences that are satisfying to the organism are more likely to be repeated, and behaviors followed by unpleasant consequences are less likely to be repeated. B. F. Skinner researched operant conditioning by conducting experiments with rats in what came to be called a “Skinner box.” Over time, the rats learned that pressing the lever directly caused the release of food, demonstrating that behavior can be influenced by rewards or punishments. He differentiated between positive and negative reinforcement, and also explored the concept of extinction.

1.1.5 – Observational Learning

Observational learning occurs through observing the behaviors of others and imitating those behaviors—even if there is no reinforcement at the time. Albert Bandura noticed that children often learn through imitating adults, and he tested his theory using his famous Bobo-doll experiment. Through this experiment, Bandura learned that children would attack the Bobo doll after viewing adults hitting the doll.

2 – Classical Conditioning

2.1 – Basic Principles of Classical Conditioning: Pavlov

2.1.1 – Overview

Ivan Pavlov (1849–1936) was a Russian scientist whose work with dogs has been influential in understanding how learning occurs. Through his research, he established the theory of classical conditioning.

Ivan Pavlov’s research on classical conditioning profoundly informed the psychology of learning and the field of behaviorism.

2.1.2 – The Basic Principles

Ivan Pavlov: Pavlov is known for his studies in classical conditioning, which have been influential in understanding learning.

Classical conditioning is a form of learning whereby a conditioned stimulus (CS) becomes associated with an unrelated unconditioned stimulus (US) in order to produce a behavioral response known as a conditioned response (CR). The conditioned response is the learned response to the previously neutral stimulus. The unconditioned stimulus is usually a biologically significant stimulus such as food or pain that elicits an unconditioned response (UR) from the start. The conditioned stimulus is usually neutral and produces no particular response at first, but after conditioning it elicits the conditioned response.

Extinction is the decrease in the conditioned response when the unconditioned stimulus is no longer presented with the conditioned stimulus. When presented with the conditioned stimulus alone, the individual would show a weaker and weaker response, and finally no response. In classical-conditioning terms, there is a gradual weakening and disappearance of the conditioned response. Related to this, spontaneous recovery refers to the return of a previously extinguished conditioned response following a rest period. Research has found that with repeated extinction/recovery cycles, the conditioned response tends to be less intense with each period of recovery.
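The rise of a conditioned response during pairing and its gradual weakening during extinction can be sketched numerically. The example below is only an illustration: it borrows a simple Rescorla-Wagner-style update rule, which the text above does not mention, and the names `update_strength`, `learning_rate`, and `v_max` are assumptions chosen for this sketch. It does not model spontaneous recovery.

```python
def update_strength(v, cs_present, us_present, learning_rate=0.3, v_max=1.0):
    """One trial of a Rescorla-Wagner-style update (an illustrative assumption).

    v is the current associative strength of the conditioned stimulus (CS).
    """
    if not cs_present:
        return v  # the CS was not presented, so nothing is learned about it
    # With the US present, strength moves toward v_max; without it, toward 0.
    target = v_max if us_present else 0.0
    return v + learning_rate * (target - v)


def run_trials(v, n_trials, us_present):
    """Present the CS on n_trials consecutive trials, with or without the US."""
    history = []
    for _ in range(n_trials):
        v = update_strength(v, cs_present=True, us_present=us_present)
        history.append(round(v, 3))
    return v, history


# Acquisition: the CS (buzzer) is paired with the US (food).
v, acquisition = run_trials(0.0, n_trials=10, us_present=True)
# Extinction: the CS is presented alone.
v, extinction = run_trials(v, n_trials=10, us_present=False)

print("acquisition:", acquisition)
print("extinction: ", extinction)
```

Under these assumptions, the strength climbs toward its maximum while the buzzer and food are paired, then decays toward zero once the buzzer is presented alone, which mirrors the "weaker and weaker response" described above.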

2.1.3 – Pavlov’s Famous Study

Classical conditioning: Before conditioning, an unconditioned stimulus (food) produces an unconditioned response (salivation), and a neutral stimulus (bell) does not have an effect. During conditioning, the unconditioned stimulus (food) is presented repeatedly just after the presentation of the neutral stimulus (bell). After conditioning, the neutral stimulus alone produces a conditioned response (salivation), thus becoming a conditioned stimulus.

The best-known of Pavlov’s experiments involves the study of the salivation of dogs. Pavlov was originally studying the saliva of dogs as it related to digestion, but as he conducted his research, he noticed that the dogs would begin to salivate every time he entered the room, even if he had no food. The dogs were associating his entrance into the room with being fed. This led Pavlov to design a series of experiments in which he used various sound-producing objects, such as a buzzer, to condition the salivation response in dogs.

He started by sounding a buzzer each time food was given to the dogs and found that the dogs would start salivating immediately after hearing the buzzer—even before seeing the food. After a period of time, Pavlov began sounding the buzzer without giving any food at all and found that the dogs continued to salivate at the sound of the buzzer even in the absence of food. They had learned to associate the sound of the buzzer with being fed.

If we look at Pavlov’s experiment, we can identify the four factors of classical conditioning at work:

The unconditioned response was the dogs’ natural salivation in response to seeing or smelling their food.
The unconditioned stimulus was the sight or smell of the food itself.
The conditioned stimulus was the sound of the buzzer, which previously had no association with food.
The conditioned response, therefore, was the salivation of the dogs in response to the sound of the buzzer, even when no food was present.

Pavlov had successfully associated an unconditioned response (natural salivation in response to food) with a conditioned stimulus (a buzzer), eventually creating a conditioned response (salivation in response to a buzzer). With these results, Pavlov established his theory of classical conditioning.

2.1.4 – Neurological Response to Conditioning

Consider how the conditioned response occurs in the brain. When a dog sees food, the visual and olfactory stimuli send information to the brain through their respective neural pathways, ultimately activating the salivary glands to secrete saliva. This reaction is a natural biological process, as saliva aids in the digestion of food. When a dog hears a buzzer and at the same time sees food, the auditory stimulus activates its own neural pathways. Because these pathways are activated at the same time as the pathways that respond to food, weak synaptic connections form between the auditory stimulus and the behavioral response. Over time, these synapses are strengthened so that it only takes the sound of a buzzer (or a bell) to activate the pathway leading to salivation.

2.1.5 – Behaviorism and Other Research

Pavlov’s research contributed to other studies and theories in behaviorism, which is an approach to psychology interested in observable behaviors rather than the inner workings of the mind. The philosopher Bertrand Russell argued that Pavlov’s work was an important contribution to a philosophy of mind. Pavlov’s research also contributed to Hans Eysenck’s personality theory of introversion and extroversion. Eysenck built upon Pavlov’s research on dogs, hypothesizing that the differences in arousal that the dogs displayed were due to inborn genetic differences. Eysenck then extended the research to human personality traits.

Pavlov’s research further led to the development of important behavior-therapy techniques, such as flooding and desensitizing, for individuals who struggle with fear and anxiety. Desensitizing is a kind of reverse conditioning in which an individual is repeatedly exposed to the thing that is causing the anxiety. Flooding is similar in that it exposes an individual to the thing causing the anxiety, but it does so in a more intense and prolonged way.

2.2 – Applications of Classical Conditioning to Human Behavior

Research has demonstrated the effectiveness of classical conditioning in altering human behavior.

Since Ivan Pavlov’s original experiments, many studies have examined the application of classical conditioning to human behavior.

2.2.1 – Watson’s “Little Albert” Experiment

The Little Albert experiment: Through stimulus generalization, Little Albert came to fear furry things, including Watson in a Santa Claus mask.

In the early 1900s, John B. Watson carried out a controversial classical conditioning experiment on an infant boy called “Little Albert.” Watson was interested in examining the effects of conditioning on the fear response in humans, and he introduced Little Albert to a number of items such as a white rat, a bunny, and a dog. Albert was originally not fearful of any of the items. Watson then allowed Albert to play with the rat, but as Albert played, Watson suddenly banged a hammer on a metal bar. The sound startled Albert and caused him to cry. Each time Albert touched the rat, Watson again banged the hammer on the bar. Watson was able to successfully condition Albert to fear the rat because of its association with the loud noise. Eventually, Albert was conditioned to fear other similar furry items such as a rabbit and even a Santa Claus mask. While Watson’s research provided new insight into conditioning, it would be considered unethical by the current ethical standards set forth by the American Psychological Association.

2.2.2 – Classical Conditioning in Humans

The influence of classical conditioning can be seen in responses such as phobias, disgust, nausea, anger, and sexual arousal. A familiar example is conditioned nausea, in which the sight or smell of a particular food causes nausea because it caused stomach upset in the past. Similarly, when the sight of a dog has been associated with a memory of being bitten, the result may be a conditioned fear of dogs.

As an adaptive mechanism, conditioning helps shield an individual from harm or prepare them for important biological events, such as sexual activity. Thus, a stimulus that has occurred before sexual interaction comes to cause sexual arousal, which prepares the individual for sexual contact. For example, sexual arousal has been conditioned in human subjects by pairing a stimulus like a picture of a jar of pennies with views of an erotic film clip. Similar experiments involving blue gourami fish and domesticated quail have shown that such conditioning can increase the number of offspring. These results suggest that conditioning techniques might help to increase fertility rates in infertile individuals and endangered species.

2.2.3 – Behavioral Therapies

Classical conditioning has been used as a successful form of treatment in changing or modifying behaviors, such as substance abuse and smoking. Some therapies associated with classical conditioning include aversion therapy, systematic desensitization, and flooding. Aversion therapy is a type of behavior therapy designed to encourage individuals to give up undesirable habits by causing them to associate the habit with an unpleasant effect. Systematic desensitization is a treatment for phobias in which the individual is trained to relax while being exposed to progressively more anxiety-provoking stimuli. Flooding is a form of desensitization that uses repeated exposure to highly distressing stimuli until the lack of reinforcement of the anxiety response causes its extinction.

2.2.4 – Classical Conditioning in Everyday Life

Classical conditioning is used not only in therapeutic interventions, but in everyday life as well. Advertising executives, for example, are adept at applying the principles of associative learning. Think about the car commercials you have seen on television: many of them feature an attractive model. Because the commercial associates the model with the car being advertised, you come to see the car as desirable (Cialdini, 2008). You may be asking yourself, does this advertising technique actually work? According to Cialdini (2008), men who viewed a car commercial that included an attractive model later rated the car as being faster, more appealing, and better designed than did men who viewed an advertisement for the same car without the model.

3 – Operant Conditioning

3.1 – Basic Principles of Operant Conditioning: Thorndike’s Law of Effect

3.1.1 – Overview

Thorndike’s law of effect states that behaviors are modified by their positive or negative consequences.

Operant conditioning is a theory of learning that focuses on changes in an individual’s observable behaviors. In operant conditioning, new or continued behaviors are impacted by new or continued consequences. Research regarding this principle of learning first began in the late 19th century with Edward L. Thorndike, who established the law of effect.

3.1.2 – Thorndike’s Experiments

Thorndike’s puzzle box: This image shows an example of Thorndike’s puzzle box alongside a graph demonstrating the learning of a cat within the box. As the number of trials increased, the cats were able to escape more quickly by learning.

Thorndike’s most famous work involved cats trying to navigate through various puzzle boxes. In this experiment, he placed hungry cats into homemade boxes and recorded the time it took for them to perform the necessary actions to escape and receive their food reward. Thorndike discovered that with successive trials, cats would learn from previous behavior, limit ineffective actions, and escape from the box more quickly. He observed that the cats seemed to learn, from an intricate trial and error process, which actions should be continued and which actions should be abandoned; a well-practiced cat could quickly remember and reuse actions that were successful in escaping to the food reward.

3.1.3 – The Law of Effect

Thorndike realized not only that stimuli and responses were associated, but also that behavior could be modified by consequences. He used these findings to publish his now famous “law of effect” theory. According to the law of effect, behaviors that are followed by consequences that are satisfying to the organism are more likely to be repeated, and behaviors that are followed by unpleasant consequences are less likely to be repeated. Essentially, if an organism does something that brings about a desired result, the organism is more likely to do it again. If an organism does something that does not bring about a desired result, the organism is less likely to do it again.
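The law of effect can be illustrated with a toy calculation. The sketch below is not Thorndike's own formulation: the numeric "tendency" scores, the `step` size, and the function names are all assumptions made for illustration. Each action starts with the same tendency; a satisfying consequence raises it and an unpleasant one lowers it.

```python
def apply_consequence(tendencies, action, satisfying, step=0.2):
    """Return new tendencies after one action is followed by a consequence.

    Satisfying consequences strengthen the action; unpleasant ones weaken it.
    The step size is a hypothetical value chosen for illustration.
    """
    updated = dict(tendencies)
    change = step if satisfying else -step
    updated[action] = max(0.0, updated[action] + change)
    return updated


def choice_probabilities(tendencies):
    """Convert tendencies into the relative likelihood of each action."""
    total = sum(tendencies.values())
    if total == 0:
        # No action has any remaining tendency: treat all as equally likely.
        return {action: 1.0 / len(tendencies) for action in tendencies}
    return {action: t / total for action, t in tendencies.items()}


# A cat in a puzzle box: only pressing the latch leads to escape and food.
tendencies = {"scratch walls": 1.0, "meow": 1.0, "press latch": 1.0}
for _ in range(3):
    for action in list(tendencies):
        tendencies = apply_consequence(
            tendencies, action, satisfying=(action == "press latch")
        )

probabilities = choice_probabilities(tendencies)
print(probabilities)
```

After a few rounds, the action that was followed by the food reward dominates the others, which is the pattern the law of effect describes.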

Law of effect: Initially, cats displayed a variety of behaviors inside the box. Over successive trials, actions that were helpful in escaping the box and receiving the food reward were replicated and repeated at a higher rate.

Thorndike’s law of effect now informs much of what we know about operant conditioning and behaviorism. According to this law, behaviors are modified by their consequences, and this basic stimulus-response relationship can be learned by the person or animal performing the behavior. Once the association between behavior and consequences is established, the response is reinforced, and the association holds the sole responsibility for the occurrence of that behavior. Thorndike posited that learning was merely a change in behavior as a result of a consequence, and that if an action brought a reward, it was stamped into the mind and available for recall later.

From a young age, we learn which actions are beneficial and which are detrimental through a trial and error process. For example, a young child is playing with her friend on the playground and playfully pushes her friend off the swingset. Her friend falls to the ground and begins to cry, and then refuses to play with her for the rest of the day. The child’s actions (pushing her friend) are informed by their consequences (her friend refusing to play with her), and she learns not to repeat that action if she wants to continue playing with her friend.

The law of effect has been extended to various forms of behavior modification. Because the law of effect is a key component of behaviorism, it does not include any reference to unobservable or internal states; instead, it relies solely on what can be observed in behavior. While this theory does not account for the entirety of human behavior, it has been applied to nearly every sector of human life, particularly education and psychology.

3.2 – Basic Principles of Operant Conditioning: Skinner

3.2.1 – Overview

B. F. Skinner was a behavioral psychologist who expanded the field by defining and elaborating on operant conditioning.

Operant conditioning is a theory of behaviorism that focuses on changes in an individual’s observable behaviors. In operant conditioning, new or continued behaviors are impacted by new or continued consequences. Research regarding this principle of learning was first conducted by Edward L. Thorndike in the late 1800s, then brought to popularity by B. F. Skinner in the mid-1900s. Much of this research informs current practices in human behavior and interaction.

3.2.2 – Skinner’s Theories of Operant Conditioning

Almost half a century after Thorndike’s first publication of the principles of operant conditioning and the law of effect, Skinner attempted to prove an extension to this theory: that all behaviors are in some way a result of operant conditioning. Skinner theorized that if a behavior is followed by reinforcement, that behavior is more likely to be repeated, but if it is followed by an aversive stimulus or punishment, it is less likely to be repeated. He also believed that this learned association could end, or become extinct, if the reinforcement or punishment was removed.

3.2.3 – Skinner’s Experiments

B. F. Skinner: Skinner was responsible for defining the segment of behaviorism known as operant conditioning—a process by which an organism learns from its physical environment.

Skinner’s most famous research studies were simple reinforcement experiments conducted on lab rats and domestic pigeons, which demonstrated the most basic principles of operant conditioning. He conducted most of his research in a special operant conditioning chamber, now referred to as a “Skinner box,” which was used to analyze the behavioral responses of his test subjects. In these boxes he would present his subjects with positive reinforcement, negative reinforcement, or aversive stimuli at various timing intervals (or “schedules”) that were designed to produce or inhibit specific target behaviors.

In his first work with rats, Skinner would place the rats in a Skinner box with a lever attached to a feeding tube. Whenever a rat pressed the lever, food would be released. After the experience of multiple trials, the rats learned the association between the lever and food and began to spend more of their time in the box procuring food than performing any other action. It was through this early work that Skinner started to understand the effects of behavioral contingencies on actions. He discovered that the rate of response—as well as changes in response features—depended on what occurred after the behavior was performed, not before. Skinner named these actions operant behaviors because they operated on the environment to produce an outcome. The process by which one could arrange the contingencies of reinforcement responsible for producing a certain behavior then came to be called operant conditioning.

To support his idea that conditioning could account for all behavior, he later created a “superstitious pigeon.” He fed the pigeon at fixed intervals (every 15 seconds) and observed the pigeon’s behavior. He found that the pigeon’s actions would change depending on what it had been doing in the moments before the food was dispensed, even though those actions had nothing to do with the dispensing of food. In this way, he discerned that the pigeon had fabricated a causal relationship between its actions and the presentation of reward. It was this development of “superstition” that led Skinner to believe all behavior could be explained as a learned reaction to specific consequences.

In his operant conditioning experiments, Skinner often used an approach called shaping. Instead of rewarding only the target, or desired, behavior, the process of shaping involves the reinforcement of successive approximations of the target behavior. Behavioral approximations are behaviors that, over time, grow increasingly closer to the actual desired response.

Skinner believed that all behavior is predetermined by past and present events in the objective world. He did not include room in his research for ideas such as free will or individual choice; instead, he posited that all behavior could be explained using learned, physical aspects of the world, including life history and evolution. His work remains extremely influential in the fields of psychology, behaviorism, and education.

3.3 – Shaping

3.3.1 – Introduction

Shaping is a method of operant conditioning by which successive approximations of a target behavior are reinforced.

Dog show: Dog training often uses the shaping method of operant conditioning.

In his operant-conditioning experiments, Skinner often used an approach called shaping. Instead of rewarding only the target, or desired, behavior, the process of shaping involves the reinforcement of successive approximations of the target behavior. The method requires that the subject perform behaviors that at first merely resemble the target behavior; through reinforcement, these behaviors are gradually changed, or shaped, to encourage the performance of the target behavior itself. Shaping is useful because it is often unlikely that an organism will display anything but the simplest of behaviors spontaneously. It is a very useful tool for training animals, such as dogs, to perform difficult tasks.

3.3.2 – How Shaping Works

In shaping, behaviors are broken down into many small, achievable steps. To test this method, B. F. Skinner performed shaping experiments on rats, which he placed in an apparatus (known as a Skinner box) that monitored their behaviors. The target behavior for the rat was to press a lever that would release food. Initially, rewards are given for even crude approximations of the target behavior—in other words, even taking a step in the right direction. Then, the trainer rewards a behavior that is one step closer, or one successive approximation nearer, to the target behavior. For example, Skinner would reward the rat for taking a step toward the lever, for standing on its hind legs, and for touching the lever—all of which were successive approximations toward the target behavior of pressing the lever.

As the subject moves through each behavior trial, rewards for old, less approximate behaviors are discontinued in order to encourage progress toward the desired behavior. For example, once the rat had touched the lever, Skinner might stop rewarding it for simply taking a step toward the lever. In Skinner’s experiment, each reward led the rat closer to the target behavior, finally culminating in the rat pressing the lever and receiving food. In this way, shaping uses operant-conditioning principles to train a subject by rewarding proper behavior and discouraging improper behavior.

In summary, the process of shaping includes the following steps:

Reinforce any response that resembles the target behavior.
Then reinforce the response that more closely resembles the target behavior. You will no longer reinforce the previously reinforced response.
Next, begin to reinforce the response that even more closely resembles the target behavior. Continue to reinforce closer and closer approximations of the target behavior.
Finally, only reinforce the target behavior.
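The steps above can be sketched as a small procedure. This is a simplified illustration, not Skinner's actual protocol: it assumes each response can be scored by how closely it resembles the target behavior (from 0 to 1), and it tightens the criterion after a single reinforced response. The function name `shape` and the scores are hypothetical.

```python
def shape(responses, criteria):
    """Reinforce responses against a criterion that tightens after each success.

    responses: closeness scores in [0, 1], where 1.0 is the target behavior.
    criteria: increasing thresholds; the last one is the target behavior itself.
    Returns a list of (response, criterion_in_force, reinforced) tuples.
    """
    stage = 0
    log = []
    for response in responses:
        criterion = criteria[stage]
        reinforced = response >= criterion
        log.append((response, criterion, reinforced))
        # Once a closer approximation is reinforced, raise the bar so that
        # earlier, cruder approximations no longer earn a reward.
        if reinforced and stage < len(criteria) - 1:
            stage += 1
    return log


# Hypothetical scores: 0.25 = steps toward lever, 0.5 = stands on hind legs,
# 0.75 = touches lever, 1.0 = presses lever (the target behavior).
criteria = [0.25, 0.5, 0.75, 1.0]
responses = [0.25, 0.25, 0.5, 0.75, 0.5, 1.0]

log = shape(responses, criteria)
for response, criterion, reinforced in log:
    print(f"response={response:<5} criterion={criterion:<5} reinforced={reinforced}")
```

Note that the second response (another step toward the lever) goes unrewarded because the criterion has already moved on, matching the rule that previously reinforced responses are no longer reinforced.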

3.3.3 – Applications of Shaping

This process has been replicated with other animals—including humans—and is now common practice in many training and teaching methods. It is commonly used to train dogs to follow verbal commands or become house-broken: while puppies can rarely perform the target behavior automatically, they can be shaped toward this behavior by successively rewarding behaviors that come close.

Shaping is also a useful technique in human learning. For example, if a father wants his daughter to learn to clean her room, he can use shaping to help her master steps toward the goal. First, she cleans up one toy and is rewarded. Second, she cleans up five toys; then chooses whether to pick up ten toys or put her books and clothes away; then cleans up everything except two toys. Through a series of rewards, she finally learns to clean her entire room.

3.4 – Reinforcement and Punishment

3.4.1 – Introduction

Reinforcement and punishment are principles of operant conditioning that increase or decrease the likelihood of a behavior.

Reinforcement and punishment are principles that are used in operant conditioning. Reinforcement means you are increasing a behavior: it is any consequence or outcome that increases the likelihood of a particular behavioral response (and that therefore reinforces the behavior). The strengthening effect on the behavior can manifest in multiple ways, including higher frequency, longer duration, greater magnitude, and shorter latency of response. Punishment means you are decreasing a behavior: it is any consequence or outcome that decreases the likelihood of a behavioral response.

Extinction, in operant conditioning, refers to when a reinforced behavior is extinguished entirely. This occurs at some point after reinforcement stops; the speed at which this happens depends on the reinforcement schedule, which is discussed in more detail in another section.

3.4.2 – Positive and Negative Reinforcement and Punishment

Both reinforcement and punishment can be positive or negative. In operant conditioning, positive and negative do not mean good and bad. Instead, positive means you are adding something and negative means you are taking something away. All of these methods can manipulate the behavior of a subject, but each works in a unique fashion.

Operant conditioning: In the context of operant conditioning, whether you are reinforcing or punishing a behavior, “positive” always means you are adding a stimulus (not necessarily a good one), and “negative” always means you are removing a stimulus (not necessarily a bad one). See the blue text and yellow text above, which represent positive and negative, respectively. Similarly, reinforcement always means you are increasing (or maintaining) the level of a behavior, and punishment always means you are decreasing the level of a behavior. See the green and red backgrounds above, which represent reinforcement and punishment, respectively.

Positive reinforcers add a wanted or pleasant stimulus to increase or maintain the frequency of a behavior. For example, a child cleans her room and is rewarded with a cookie.
Negative reinforcers remove an aversive or unpleasant stimulus to increase or maintain the frequency of a behavior. For example, a child cleans her room and is rewarded by not having to wash the dishes that night.
Positive punishments add an aversive stimulus to decrease a behavior or response. For example, a child refuses to clean her room and so her parents make her wash the dishes for a week.
Negative punishments remove a pleasant stimulus to decrease a behavior or response. For example, a child refuses to clean her room and so her parents refuse to let her play with her friend that afternoon.
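The four cases above follow from two independent questions: is a stimulus added or removed, and does the behavior increase or decrease? A minimal sketch of that two-by-two classification follows; the function name and the string labels are assumptions made for this example.

```python
def classify_consequence(stimulus_change, behavior_change):
    """Name the operant-conditioning procedure from its two defining features.

    stimulus_change: "added" or "removed" (positive vs. negative).
    behavior_change: "increases" or "decreases" (reinforcement vs. punishment).
    """
    if stimulus_change not in ("added", "removed"):
        raise ValueError("stimulus_change must be 'added' or 'removed'")
    if behavior_change not in ("increases", "decreases"):
        raise ValueError("behavior_change must be 'increases' or 'decreases'")

    sign = "positive" if stimulus_change == "added" else "negative"
    kind = "reinforcement" if behavior_change == "increases" else "punishment"
    return f"{sign} {kind}"


# The four examples from the text:
print(classify_consequence("added", "increases"))    # cookie for cleaning
print(classify_consequence("removed", "increases"))  # excused from the dishes
print(classify_consequence("added", "decreases"))    # a week of dishes
print(classify_consequence("removed", "decreases"))  # no playtime with a friend
```

Keeping the two questions separate makes it clear that "negative" describes the removal of a stimulus, not whether the outcome is pleasant.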

3.4.3 – Primary and Secondary Reinforcers

The stimulus used to reinforce a certain behavior can be either primary or secondary. A primary reinforcer, also called an unconditioned reinforcer, is a stimulus that has innate reinforcing qualities. These kinds of reinforcers are not learned. Water, food, sleep, shelter, sex, touch, and pleasure are all examples of primary reinforcers: organisms do not lose their drive for these things. Some primary reinforcers, such as drugs and alcohol, merely mimic the effects of other reinforcers. For most people, jumping into a cool lake on a very hot day would be reinforcing and the cool lake would be innately reinforcing—the water would cool the person off (a physical need), as well as provide pleasure.

A secondary reinforcer, also called a conditioned reinforcer, has no inherent value and only has reinforcing qualities when linked or paired with a primary reinforcer. Before pairing, the secondary reinforcer has no meaningful effect on a subject. Money is one of the best examples of a secondary reinforcer: it is only worth something because you can use it to buy other things—either things that satisfy basic needs (food, water, shelter—all primary reinforcers) or other secondary reinforcers.

3.5 – Schedules of Reinforcement

3.5.1 – Introduction

Reinforcement schedules determine how and when a behavior will be followed by a reinforcer.

A schedule of reinforcement is a tactic used in operant conditioning that influences how an operant response is learned and maintained. Each type of schedule imposes a rule or program that attempts to determine how and when a desired behavior occurs. Behaviors are encouraged through the use of reinforcers, discouraged through the use of punishments, and rendered extinct by the complete removal of a stimulus. Schedules vary from simple ratio- and interval-based schedules to more complicated compound schedules that combine two or more simple schedules to manipulate behavior.

3.5.2 – Continuous vs. Intermittent Schedules

Continuous schedules reward a behavior after every performance of the desired behavior. This reinforcement schedule is the quickest way to teach someone a behavior, and it is especially effective in teaching a new behavior. Simple intermittent (sometimes referred to as partial) schedules, on the other hand, only reward the behavior after certain ratios or intervals of responses.

3.5.3 – Types of Intermittent Schedules

There are several different types of intermittent reinforcement schedules. These schedules are described as either fixed or variable and as either interval or ratio.

3.5.3.1 – Fixed vs. Variable, Ratio vs. Interval

Fixed refers to when the number of responses between reinforcements, or the amount of time between reinforcements, is set and unchanging. Variable refers to when the number of responses or amount of time between reinforcements varies or changes. Interval means the schedule is based on the time between reinforcements, and ratio means the schedule is based on the number of responses between reinforcements. Simple intermittent schedules are a combination of these terms, creating the following four types of schedules:

A fixed-interval schedule is when behavior is rewarded after a set amount of time. This type of schedule exists in payment systems when someone is paid hourly: no matter how much work that person does in one hour (behavior), they will be paid the same amount (reinforcement).
With a variable-interval schedule, the subject gets the reinforcement based on varying and unpredictable amounts of time. People who like to fish experience this type of reinforcement schedule: on average, in the same location, you are likely to catch about the same number of fish in a given time period. However, you do not know exactly when those catches will occur (reinforcement) within the time period spent fishing (behavior).
With a fixed-ratio schedule, a set number of responses must occur before the behavior is rewarded. This can be seen in payment for work such as fruit picking: pickers are paid a certain amount (reinforcement) based on the amount they pick (behavior), which encourages them to pick faster in order to make more money. In another example, Carla earns a commission for every pair of glasses she sells at an eyeglass store. The quality of what Carla sells does not matter because her commission is not based on quality; it’s only based on the number of pairs sold. This difference in what is rewarded can help determine which reinforcement method is most appropriate for a particular situation: fixed ratios are better suited to optimize the quantity of output, whereas a fixed interval can lead to a higher quality of output.
In a variable-ratio schedule, the number of responses needed for a reward varies. This is the most powerful type of intermittent reinforcement schedule. In humans, this type of schedule is used by casinos to attract gamblers: a slot machine pays out an average win ratio—say five to one—but does not guarantee that every fifth bet (behavior) will be rewarded (reinforcement) with a win.
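
The four simple schedules described above can be sketched as small rules that decide, for each response, whether a reinforcer is delivered. This is only an illustrative sketch, not a standard implementation: the function names, the use of seconds as the time unit, and the uniform random draws used for the variable schedules are all assumptions made for the example.

```python
import random

def fixed_ratio(n):
    """Reinforce every n-th response (hypothetical helper)."""
    count = 0
    def respond(now):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False
    return respond

def variable_ratio(mean_n, rng):
    """Reinforce after a random number of responses that averages mean_n.
    Assumption: the requirement is drawn uniformly from 1 to 2*mean_n - 1."""
    def draw():
        return rng.randint(1, 2 * mean_n - 1)
    count, target = 0, draw()
    def respond(now):
        nonlocal count, target
        count += 1
        if count >= target:
            count, target = 0, draw()
            return True
        return False
    return respond

def fixed_interval(seconds):
    """Reinforce the first response made at least `seconds` after the last reinforcer."""
    last = 0.0
    def respond(now):
        nonlocal last
        if now - last >= seconds:
            last = now
            return True
        return False
    return respond

def variable_interval(mean_seconds, rng):
    """Like fixed_interval, but the required wait changes after each reinforcer.
    Assumption: the wait is drawn uniformly from 0 to 2*mean_seconds."""
    def draw():
        return rng.uniform(0, 2 * mean_seconds)
    last, wait = 0.0, draw()
    def respond(now):
        nonlocal last, wait
        if now - last >= wait:
            last, wait = now, draw()
            return True
        return False
    return respond

# One response per second for ten seconds on a fixed-ratio 5 schedule:
# the 5th and 10th responses are reinforced.
schedule = fixed_ratio(5)
reinforced = [schedule(t) for t in range(10)]
```

Each rule takes the time of the response (`now`) so that ratio and interval schedules share one interface; ratio schedules simply ignore it, which mirrors the distinction that ratio schedules count responses while interval schedules track elapsed time.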

All of these schedules have different advantages. In general, ratio schedules elicit higher response rates than interval schedules because reinforcement depends directly on how many responses are made. For example, if you are a factory worker who gets paid per item that you manufacture, you will be motivated to manufacture these items quickly and consistently. Variable schedules are less predictable, so they tend to resist extinction and encourage continued behavior. Gamblers and fishermen alike can understand the feeling that one more pull on the slot-machine lever, or one more hour on the lake, will change their luck and bring their respective rewards. Thus, they continue to gamble and fish, regardless of previous unsuccessful attempts.

Extinction of a reinforced behavior occurs at some point after reinforcement stops, and the speed at which this happens depends on the reinforcement schedule. Among the reinforcement schedules, variable-ratio is the most resistant to extinction, while fixed-interval is the easiest to extinguish.

3.5.4 – Simple vs. Compound Schedules

Simple reinforcement-schedule responses: The four reinforcement schedules yield different response patterns. The variable-ratio schedule is unpredictable and yields high and steady response rates, with little if any pause after reinforcement (e.g., gambling). A fixed-ratio schedule is predictable and produces a high response rate, with a short pause after reinforcement (e.g., eyeglass sales). The variable-interval schedule is unpredictable and produces a moderate, steady response rate (e.g., fishing). The fixed-interval schedule yields a scallop-shaped response pattern, reflecting a significant pause after reinforcement (e.g., hourly employment).

All of the examples described above are referred to as simple schedules. Compound schedules combine at least two simple schedules and use the same reinforcer for the same behavior. Compound schedules are often seen in the workplace: for example, if you are paid at an hourly rate (fixed-interval) but also have an incentive to receive a small commission for certain sales (fixed-ratio), you are being reinforced by a compound schedule. Additionally, if there is an end-of-year bonus given to only three employees based on a lottery system, you’d be motivated by a variable schedule.

There are many possibilities for compound schedules: for example, superimposed schedules use at least two simple schedules simultaneously. Concurrent schedules, on the other hand, provide two possible simple schedules simultaneously, but allow the participant to respond on either schedule at will. All combinations and kinds of reinforcement schedules are intended to elicit a specific target behavior.

4 – Cognitive Approaches to Learning

4.1 – Latent Learning

4.1.1 – Introduction

Latent learning occurs without any obvious conditioning or reinforcement of a behavior, illustrating a cognitive component to learning.

Latent learning is a form of learning that is not immediately expressed in an overt response. It occurs without any obvious reinforcement of the behavior or associations that are learned. Interest in this type of learning, spearheaded by Edward C. Tolman, arose largely because the phenomenon seemed to conflict with the widely held view that reinforcement was necessary for learning to occur. Latent learning is not readily apparent to the researcher because it is not shown behaviorally until there is sufficient motivation. This type of learning broke the constraints of behaviorism, which stated that processes must be directly observable and that learning was the direct consequence of conditioning to stimuli.

Latent learning implies that learning can take place without any behavioral changes being immediately present. This means that learning can be completely cognitive and not instilled through behavioral modification alone. This cognitive emphasis on learning was important in the development of cognitive psychology. Latent learning can be a form of observational learning (i.e., learning derived from the observation of other people or events), though it can also occur independently of any observation.

4.1.2 – Early Work with Latent Learning

Edward Tolman: Edward Tolman was a behavioral psychologist who first demonstrated latent learning in rats. While he was a behaviorist in method, his work with latent learning disproved the behaviorist idea that learning was solely a product of conditioning.

Edward Tolman (1886–1959) first documented this type of learning in a study on rats in 1930. Tolman designed a study with three groups of rats placed in a maze. The first group received no reward for finishing, the second received a reward, and the third received no reward for the first 10 days but then received a reward for the final eight.

The first group consistently made errors in running the maze and showed little improvement over the 18-day study. The second group showed constant improvement in the number of errors made. The third group showed little to no improvement over the first ten days, then dramatically improved once a food reward was presented. Interestingly, the third group’s improvement was more pronounced than that of the second “constant reward” group.

Tolman theorized that the rats in the third group had indeed been learning a “cognitive map” of the maze over the first ten days; however, they’d had no incentive to run the maze without any errors. Once a reward was presented, the learning that had remained latent became useful, and the rats ran the maze more efficiently.

4.1.3 – Latent Learning in Humans

While most early studies of latent learning were done with rats, later studies began to involve children. One such experiment required children to explore a series of objects to find a key. After finding the key, the children were asked to find “non-key” objects. The children found these objects faster if they had previously been exposed to them in the first part of the experiment. Their ability to learn this way increased as they became older (Stevenson, 1954).

Children may also learn by watching the actions of their parents but only demonstrate it at a later date, when the learned material is needed. For example, suppose that Ravi’s dad drives him to school every day. In this way, Ravi learns the route from his house to his school, but he’s never driven there himself, so he has not had a chance to demonstrate that he’s learned the way. One morning Ravi’s dad has to leave early for a meeting, so he can’t drive Ravi to school. Instead, Ravi follows the same route on his bike that his dad would have taken in the car. This demonstrates latent learning: Ravi had learned the route to school but had no need to demonstrate this knowledge earlier.

In another example, perhaps you’ve walked around a neighborhood regularly and noticed—but never used—a particular shortcut. One day you receive a text telling you there is free pizza at a restaurant in the neighborhood, but only for the next 15 minutes. You use the shortcut that you’d noticed because you want to get there quickly. While you had developed a cognitive map of the area through latent learning, you’d never demonstrated a behavior that indicated you had done so until you were required to.

4.2 – Bandura and Observational Learning

4.2.1 – Introduction

Observational learning occurs from watching, retaining, and replicating a behavior observed from a model.

Observational learning, also referred to as modeling or social learning, occurs by observing, retaining, and replicating behavior seen in others. The individuals performing the imitated behavior are called models. While this type of learning can take place at any stage in life, it is thought to be particularly important during childhood, when authority is important. Stemming from Albert Bandura’s social learning theory, observational learning allows for learning without any direct change to behavior; because of this, it has been used as an argument against strict behaviorism, which argues that behavior must occur for learning to have taken place.

Observational learning can teach completely new behaviors or can affect the frequency of previously learned behaviors. This type of learning can also encourage previously forbidden behaviors. In some cases, observational learning can have an impact on behaviors that are similar to, but not identical to, the ones being modeled. For example, seeing a model excel at playing the piano may motivate an observer to play the saxophone. The observational theory of learning implies that behavior is not simply shaped by immediate consequences, but rather by considering the implications of an action.

4.2.2 – Albert Bandura and the Bobo-Doll Experiment

Bobo-doll experiment (Bandura): The Bobo-doll experiment was conducted by Albert Bandura in 1961 and studied patterns of behavior associated with aggression. Bandura hoped that the experiment would prove that aggression can be explained, at least in part, by social learning theory. The theory of social learning states that behavior such as aggression is learned through observing and imitating others.

One of the first recorded instances of observational learning in research was the 1961 study performed by Albert Bandura. This experiment demonstrated that children can learn merely by observing the behavior of a social model, and that observing reinforcement of the model’s behavior could affect whether or not a behavior was emulated. Bandura believed that humans are cognitive beings who, unlike animals, are (1) likely to think about the links between their behavior and its consequences, and (2) more likely to be influenced by what they believe will happen than by actual experience.

In his experiment, Bandura studied the responses of nursery-school-aged children to the actions of adults. The children were presented with a short film in which an adult model directed aggression towards an inflatable Bobo doll. Three main conditions were included: a) the model-reward condition, in which the children saw a second adult give the aggressive model candy for a “championship performance”; b) the model-punished condition, in which the children saw a second adult scold the model for their aggression; and c) the no-consequence condition, in which the children simply saw the model behave aggressively.

Results indicated that after viewing the film, when children were left alone in a room with the Bobo doll and props used by the adult aggressor, they imitated the actions they had witnessed. Those in the model-reward and no-consequence conditions were more willing to imitate the aggressive acts than those in the model-punished condition. Further testing indicated that children in each condition had learned equal amounts, and that only the motivation factor kept behaviors from being similar in each condition.

4.2.3 – Four Conditions for Observational Learning

According to Bandura’s social learning theory, four conditions, or steps, must be met in order for observational or social learning to occur:

4.2.3.1 – Attention

Observers cannot learn unless they pay attention to what is happening around them. This process is influenced by characteristics of the model, as well as how much the observer likes or identifies with the model. It is also influenced by characteristics of the observer, such as the observer’s expectations or level of emotional arousal.

4.2.3.2 – Retention or Memory

Observers have to not only recognize the observed behavior, but also remember it. This process depends on the observer’s ability to code or structure the information so that it is easily remembered.

4.2.3.3 – Initiation or Reproduction

Observers must be physically and intellectually capable of producing the act. In many cases the observer possesses the necessary responses, but sometimes reproducing the observed actions may involve skills the observer has not yet acquired. You will not be able to become a champion juggler, for example, just by watching someone else do it.

4.2.3.4 – Motivation

An observer must be motivated to reproduce the actions they have seen. You need to want to copy the behavior, and whether or not you are motivated depends on what happened to the model. If you saw that the model was reinforced for her behavior, you will be more motivated to copy her; this is known as vicarious reinforcement. On the other hand, if you observed the model being punished, you would be less motivated to copy her; this is called vicarious punishment. In addition, the more an observer likes or respects the model, the more likely they are to replicate the model’s behavior. Motivation can also come from external reinforcement, such as rewards promised by an experimenter.

4.3 – Köhler and Insight Learning

4.3.1 – Introduction

Insight learning occurs when a new behavior is learned through cognitive processes rather than through interactions with the outside world.

Insight learning was first researched by Wolfgang Köhler (1887–1967). This theory of learning differs from the trial-and-error ideas that were proposed before it. The key aspect of insight learning is that it is achieved through cognitive processes, rather than interactions with the outside world. There is no gradual shaping or trial and error involved; instead, internal organizational processes cause new behavior.

4.3.2 – Sultan the Chimpanzee and Insight Learning

Chimpanzees solving problems: Watch this video to see an experiment much like those conducted by Wolfgang Köhler.

Köhler’s most famous study on insight learning involved Sultan the chimpanzee. Sultan was in a cage and was presented with a stick, which he could use to pull a piece of fruit close enough to the cage so that he could pick it up. After Sultan had learned to use the stick to reach the fruit, Köhler moved the fruit out of range of the short stick. He then placed a longer stick within reach of the short stick. Initially, Sultan tried to reach the fruit with the short stick and failed. Eventually, however, Sultan learned to use the short stick to reach the long stick, and then use the long stick to reach the fruit. Sultan was never conditioned to use one stick to reach another; instead, it seemed as if Sultan had an epiphany. The internal process that led Sultan to use the sticks in this way is a basic example of insight.

4.3.3 – Insight Learning versus Other Learning Theories

A basic assumption of strict behaviorism is that only behavior that can be seen may be studied, and that human behavior is determined by conditioning. Insight learning suggests that we learn not only by conditioning, but also by cognitive processes that cannot be directly observed. Insight learning is a form of learning because, like other forms, it involves a change in behavior; however, it differs from other forms because the process is not observable. It can be hard to define because it is not behavioral, a characteristic that distinguishes it from most theories of learning throughout the history of psychology.

Initially, it was thought that learning was the result of reproductive thinking. This means that an organism reproduces a response to a given problem from past experience. Insight learning, however, does not directly involve using past experiences to solve a problem. While past experiences may help the process, an insight or novel idea is necessary to solve the problem. Prior knowledge is of limited help in these situations.

Crows learning through insight: In another experiment, a crow creatively learns to bend a wire to get food out of a jar.

In humans, insight learning occurs whenever we suddenly see a problem in a new way, connect the problem to another relevant problem/solution, release past experiences that are blocking the solution, or see the problem in a larger, more coherent context. When we solve a problem through insight, we often have a so-called aha or eureka moment. The solution suddenly appears, even if previously no progress was being made. Famous examples of this type of learning include Archimedes’s discovery of a method to determine the density of an object (“Eureka!”) and Isaac Newton’s realization that a falling apple and the orbiting moon are both pulled by the same force.

4.3.4 – Insight versus Heuristics

Insight should not be confused with heuristics. A heuristic is a mental shortcut that allows us to filter out overwhelming information and stimuli in order to make a judgement or decision. Heuristics help us to reduce the cognitive burden of the decision-making process by examining a smaller percentage of the information. While both insight and heuristics can be used for problem solving and information processing, a heuristic is a simplistic rule of thumb; it is habitual automatic thinking that frees us from complete and systematic processing of information.

Insight is not a mental shortcut, but instead is a way to arrive at a novel idea through cognitive means. Rather than being habitual or automatic, insight involves coming up with a new idea that does not result from past experience to solve a problem. While heuristics are gradually shaped by experience, insight is not. Instead, internal processes lead to new behavior.

5 – Biological Basis of Learning

5.1 – Habituation, Sensitization, and Potentiation

Learning occurs when stimuli in the environment produce changes in the nervous system. Three ways in which this occurs include long-term potentiation, habituation, and sensitization.

5.1.1 – Long-Term Potentiation

One way that the nervous system changes is through potentiation, or the strengthening of the nerve synapses (the gaps between neurons). Long-term potentiation (LTP) is the persistent strengthening of synapses based on recent patterns of activity: it occurs when a neuron shows an increased excitability over time due to a repeated pattern, behavior, or response. The opposite of LTP is long-term depression (LTD), which produces a long-lasting decrease in synaptic strength.

The structure of a neuron: Communication between neurons occurs when the neurotransmitter is released from the axon on one neuron, travels across the synapse, and is taken in by the dendrite on an adjacent neuron.

Because memories are thought to be encoded by modification of synaptic strength, LTP is widely considered one of the major cellular mechanisms that underlie learning and memory. The role of LTP in learning is still being researched, but studies on the hippocampus have found LTP to occur during associative learning (such as classical conditioning). LTP is based on the Hebbian principle: “cells that fire together, wire together.” This principle attempts to explain associative learning, in which simultaneous activation of cells leads to pronounced increases in synaptic strength between those cells, and provides a biological basis for the pairing of stimulus and response in classical conditioning.
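
The Hebbian principle is often written as a simple update rule in which the strength of a connection grows in proportion to the joint activity of the two cells it links. The sketch below illustrates that idea only; it is a textbook simplification, not a model of the biological mechanism of LTP, and the function name, the learning rate of 0.1, and the 0/1 activity coding are assumptions chosen for the example.

```python
def hebbian_update(weight, pre, post, learning_rate=0.1):
    """Return the connection strength after one time step.

    pre and post are activity levels of the sending and receiving cells
    (assumed here to be 0 for silent and 1 for firing). The weight only
    grows when both cells are active at the same time.
    """
    return weight + learning_rate * pre * post

# Ten paired activations strengthen the connection step by step.
w = 0.0
for _ in range(10):
    w = hebbian_update(w, pre=1, post=1)

# If only one of the two cells fires, the connection is unchanged.
unchanged = hebbian_update(0.5, pre=1, post=0)
```

In this toy form the weight can grow without limit; more realistic rules add a bound or a decay term, which is loosely analogous to the role the text assigns to long-term depression.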

5.1.2 – Habituation

Habituation to stress: Habituation involves responding to stimuli and stress less over time—after our body’s initial natural resistance to the stimuli.

Recall that sensory adaptation involves the gradual decrease in neurological sensory response caused by the repeated application of a particular stimulus over time. Habituation is the “behavioral version” of sensory adaptation, with decreased behavioral responses over time to a repeated stimulus. In other words, habituation is when we learn not to respond to a stimulus that is presented repeatedly without change. As the stimulus occurs over and over (and as long as it is not associated with any reward or punishment), we learn not to focus our attention on it. It is a form of non-associative learning that does not require conscious motivation or awareness.

Habituation helps us to distinguish meaningful information from the background. For example, an animal may be startled when it hears a loud noise, but if it is repeatedly exposed to loud noises and experiences no associated consequence, such as pain, it will eventually stop being startled.

5.1.3 – Sensitization

Sensitization is the strengthening of a neurological response to a stimulus due to the response to a secondary stimulus. For example, if a loud sound is suddenly heard, an individual may startle at that sound. If a shock is given following the sound, then the next time the sound occurs, the individual will subsequently react even more strongly to the sound. It is essentially an exaggerated startle response, and is often seen in trauma survivors. For example, the sound of a car backfiring might sound like a gunshot to a war veteran, and the veteran may drop to the ground in response, even if there is no threat present.

5.1.4 – Neurological Differences

Neural communication: This image shows the way two neurons communicate by the release of the neurotransmitter from the axon, across the synapse, and into the dendrite of another neuron.

Habituation and sensitization work in different ways neurologically. In neural communication, a neurotransmitter is released from the axon of one neuron, crosses a synapse, and is then picked up by the dendrites of an adjacent neuron. During habituation, fewer neurotransmitters are released at the synapse. In sensitization, however, there are more pre-synaptic neurotransmitters, and the neuron itself is more excitable.
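
As a rough illustration of the contrast just described, the following toy sketch treats the amount of neurotransmitter released as a single number that shrinks with each repetition of a harmless stimulus and grows after a strong aversive one. The function names and the decay and gain factors are arbitrary assumptions for the example, not measured values.

```python
def habituate(release, decay=0.8):
    """Repeated harmless stimulus: less transmitter is released each time."""
    return release * decay

def sensitize(release, gain=1.5):
    """Strong aversive stimulus: more transmitter is released next time."""
    return release * gain

# Five repetitions of the same harmless stimulus.
release = 1.0
habituated = []
for _ in range(5):
    release = habituate(release)
    habituated.append(round(release, 3))
# habituated is [0.8, 0.64, 0.512, 0.41, 0.328]
```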

6 – Psychology in Education

6.1 – Application of Psychological Theories to the Life of a Student

6.1.1 – Introduction

How we learn and incorporate information is directly influenced by psychology and is a key subject of interest for educational psychologists.

Psychology in the life of a student: How we learn and incorporate information is directly influenced by psychology.

Psychology plays an important role in what we do on a day-to-day basis, and this is especially true for students. How we learn and incorporate information is directly influenced by psychology, whether we know it or not. Educational psychology is the study of how humans learn in educational settings, the effectiveness of educational interventions, the psychology of teaching, and the social psychology of schools as organizations. It is concerned with how students learn and develop, often focusing on subgroups such as gifted children and those subject to specific disabilities. Understanding the various theories of learning as well as your personal learning style can help you better understand information and develop positive study habits.

6.1.2 – Education and Theories of Learning

Within the realm of psychology, there are several theories that help explain the ways in which people learn. By understanding these concepts, students are better able to understand and capitalize on how they acquire knowledge in school. Behaviorism is based on both classical conditioning (in which a stimulus is conditioned to create a response) and operant conditioning (in which behavior is reinforced through a particular reward or punishment). For example, if you study for your psychology test and receive a grade of A, you are rewarded; in theory, this makes it more likely that you will study in the future for your next test.

Cognitivism is the idea that people develop knowledge and meaning through the sequential development of several cognitive processes, including recognition, reflection, application, and evaluation. For example, you read your psychology textbook (recognition), you ponder what the ideas mean (reflection), you use the ideas in your everyday life (application) and then you are tested on your knowledge (evaluation). All of these processes work together to help you develop prior knowledge and integrate new concepts.

Constructivism is the concept of constructing new ideas based on previous knowledge. For example, our prior experiences with a situation help us to understand new experiences and information. Piaget is most famous for his work in constructivism, and many Montessori schools are based on the constructivist school of thought.

6.1.3 – Types of Learners

People also learn in a variety of ways. Styles of learning are generally grouped into three primary categories: visual, auditory, and kinesthetic. Although most people are a combination of these three types, we tend to have a particular strength in one area. Knowing your strongest learning type can help you learn in the most effective way; depending on your learning style, you’ll want to tweak your study skills to get the most out of your education.

Visual learners usually use objects such as flashcards or take and reread lecture notes. Visual learners will highlight important passages in books or draw pictures/diagrams of ideas to help better understand the concepts.
Auditory learners understand concepts best by listening; many will record a lecture and play it back to further understand the lesson. Many auditory learners will read aloud and tend to do well on oral, rather than written, exams.
Kinesthetic learners (related to kinesthesia) do best when they act out or repeat something several times. Role-plays, experiments, and hands-on activities are great ways for kinesthetic learners to understand and remember concepts.

6.2 – Learning Disabilities and Special Education

6.2.1 – Introduction

Special-education programs are designed to help children with disabilities obtain an education equivalent to their non-disabled peers.

There are a variety of learning disabilities that require special assistance in order to help children learn effectively. Special education is the practice of educating students with disabilities or special needs in an effective way that addresses their individual differences and needs. Ideally, this process involves the individually planned and systematically monitored arrangement of teaching procedures, adapted equipment and materials, and accessible settings. Some forms of support include specialized classrooms; teacher’s aides; and speech, occupational, or physical therapists.

Special-education interventions are designed to help learners with special needs achieve a higher level of personal self-sufficiency and success in school and their community than may be available if they were only given access to a typical classroom education. Certain laws and policies are designed to help children with learning disabilities obtain an education equivalent to their non-disabled peers.

6.2.2 – Types of Learning Disabilities

6.2.2.1 – Intellectual Disabilities

An intellectual disability, or general learning disability, is a generalized disorder appearing before adulthood, characterized by significantly impaired cognitive functioning and deficits in two or more adaptive behaviors (such as self-help, communication, or interpersonal skills). Intellectual disabilities were previously referred to as mental retardation (MR)—though this older term is being used less frequently—which was historically defined as an intelligence quotient (IQ) score under 70. There are different levels of intellectual disability, from mild to moderate to severe.

6.2.2.2 – ADHD

Attention-deficit hyperactivity disorder (ADHD) is considered a type of learning disability. This disability is characterized by difficulty with focusing, paying attention, and controlling impulses. Children with ADHD may have trouble sitting in their seat and focusing on the material presented, or their distractions may keep them from fully learning and understanding the lessons. To be diagnosed according to the Diagnostic and Statistical Manual of Mental Disorders, 5th edition (DSM-5), symptoms must be observed in multiple settings for six months or more and to a degree that is much greater than in others of the same age. They must also cause problems in the person’s social, academic, or work life.

6.2.2.3 – Autism Spectrum Disorder

A child with autism stacking cans: Although many children with ASD display normal intelligence, they often require special support due to other symptoms of the disorder.

Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by limitations in language and social skills. While previously divided into different disorders, the DSM-5 now uses the term ASD to include autism, Asperger syndrome, childhood disintegrative disorder, and pervasive developmental disorder not otherwise specified (PDD-NOS). Language difficulties related to ASD will sometimes make it hard for the child to interact with teachers and peers or to express themselves in the classroom. Deficits in social skills can interfere with the development of appropriate peer relationships, and repetitive behaviors can be obsessive and interfere with a child’s daily activities. Although many children with ASD display normal intelligence, they may require special support due to other symptoms of the disorder.

6.2.2.4 – Dyslexia

Dyslexia is characterized by difficulty with learning to read or write fluently and with accurate comprehension, despite normal intelligence. This includes difficulty with phonological awareness, phonological decoding, processing speed, auditory short-term memory, and/or language skills or verbal comprehension. Dyslexia is the most widely recognized reading disorder; however, not all reading disorders are linked to dyslexia.

6.2.3 – Laws for Children with Disabilities

6.2.3.1 – Overview

Two laws exist to help ensure that children with learning disabilities receive the same level of education as children without disabilities: IDEA and Section 504.

6.2.3.2 – The Individuals with Disabilities Education Act (IDEA)

The Individuals with Disabilities Education Act (IDEA) provides federal funding to states to be put toward the educational needs of children with disabilities. IDEA, which covers 13 categories of disability, has two main components: Free Appropriate Public Education (FAPE) and an Individualized Education Program (IEP). In addition to the disabilities listed above, IDEA covers deaf-blindness, deafness, developmental delays, hearing impairments, emotional disturbance, orthopedic or other health impairment, speech or language impairment, traumatic brain injury, and visual impairment (including blindness).

The Free Appropriate Public Education (FAPE) component of IDEA makes it mandatory for schools to provide free and appropriate education to all students, regardless of intellectual level and disability. FAPE is defined as an educational program that is individualized for a specific child, designed to meet that child’s unique needs, and from which the child receives educational benefit. An Individualized Education Program (IEP) is developed for each child who receives special education; each plan consists of individualized goals for the child to work toward, and these plans are re-evaluated annually.

IDEA also advocates for the Least Restrictive Environment (LRE), which means that—to the greatest extent possible—a student who has a disability should have the opportunity to be educated with non-disabled peers, have access to the general-education curriculum, and be provided with supplementary aids and services necessary to achieve educational goals if placed in a setting with non-disabled peers.

6.2.3.3 – Section 504

Section 504 is a civil-rights law that protects people with disabilities from discrimination. All students with disabilities are protected by Section 504, even if they are not provided for by IDEA. Section 504 states that schools must ensure that a student with a disability is educated among peers without disabilities. A re-evaluation is required prior to any significant changes in a child’s placement, and a grievance procedure is in place for parents who may not agree with their child’s educational placement.

Originally published by Lumen Learning – Boundless Psychology under a Creative Commons Attribution-ShareAlike 3.0 Unported license.