Understanding the Four Quadrants of Operant Conditioning
Operant conditioning is the use of consequences to increase or decrease the frequency of a particular behavior. Frequently, when trainers speak to clients about their dog's behavior, they refer to one or more of the four quadrants of operant conditioning. These terms can be confusing, so today we will clarify what each of them means.
The consequences we'll be discussing are punishment and reinforcement. As they relate to training, these words have very specific definitions which differ from their colloquial use. Punishment is any consequence which reduces behavior. Reinforcement is any consequence which increases behavior.
There are two types of punishment and two types of reinforcement. We use the terms positive and negative to describe these varieties, and again, the words positive and negative have very particular meanings: they are used in the mathematical sense, not to mean "good" and "bad." Positive (+) means you are adding a stimulus, and negative (-) means you are removing one. Let's examine each of the four quadrants.
Positive punishment (P+) - we are adding an [aversive] stimulus which will reduce the frequency of behavior. Spanking, shouting, or cutting off air supply through a choke chain can be examples of positive punishment.
Negative punishment (P-) - we are removing a [desirable] stimulus to reduce the frequency of behavior. If a dog jumps on a person to greet them, and the person walks away when the dog jumps, negative punishment has been employed - that person is removing their attention to reduce the frequency of jumping in the future.
Positive reinforcement (R+) - we are adding a [desirable] stimulus to increase the frequency of behavior. A dog sits and gets a click and a treat. You go to work all week, and are reinforced with a paycheck.
Negative reinforcement (R-) - we are removing an [aversive] stimulus to increase the frequency of behavior. Your alarm clock goes off continually until you get up to turn it off - the behavior of getting up to turn off the alarm clock has been negatively reinforced. A dog runs away from the handler and an electric shock is administered until the dog begins to return to the handler (removing the shock to increase the frequency of the dog checking in).
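For readers who like to see the logic laid out, the four quadrants above boil down to two yes-or-no questions: was a stimulus added or removed, and did the behavior increase or decrease? Here is a minimal Python sketch of that two-by-two grid. The names `Operation`, `Effect`, and `quadrant` are made up for this illustration and are not standard training terminology.

```python
from enum import Enum


class Operation(Enum):
    """What the trainer does with the stimulus (hypothetical helper type)."""
    ADD = "+"      # a stimulus is added
    REMOVE = "-"   # a stimulus is removed


class Effect(Enum):
    """What happens to the frequency of the behavior (hypothetical helper type)."""
    INCREASES = "reinforcement"
    DECREASES = "punishment"


def quadrant(operation: Operation, effect: Effect) -> str:
    """Name the quadrant for one operation/effect pair."""
    sign = "positive" if operation is Operation.ADD else "negative"
    letter = "R" if effect is Effect.INCREASES else "P"
    return f"{sign} {effect.value} ({letter}{operation.value})"


# Walking away from a jumping dog: attention is removed, jumping decreases.
print(quadrant(Operation.REMOVE, Effect.DECREASES))  # negative punishment (P-)

# Click and treat for a sit: a treat is added, sitting increases.
print(quadrant(Operation.ADD, Effect.INCREASES))     # positive reinforcement (R+)
```

Notice that nothing in the sketch asks whether the stimulus is pleasant or unpleasant. The quadrant name comes only from the operation and the effect on behavior.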
Often, when using negative reinforcement, there are in fact two quadrants at play - the aversive is applied during or immediately following the undesired behavior (positive punishment) and is continually applied until the desired behavior is offered, at which time the aversive is removed (negative reinforcement). The instant the owner begins shocking the dog, she is positively punishing the behavior of going away. The instant she stops shocking the dog, she is negatively reinforcing the behavior of checking in or returning to the handler.
In all operant conditioning applications, the learner gets to decide what is punishing or reinforcing. Punishment and reinforcement are defined by their effect on the relative frequency of behavior. If a person yells at a dog to stop barking and the barking does not decrease in frequency, yelling at the dog has not, by definition, functioned as a punishment. If the person keeps yelling at the dog and the behavior does not reduce in frequency, the person is nagging, rather than training. If you give your dog a piece of kibble each time he sits and he does not offer the "sit" behavior more frequently, the kibble has not functioned as a reinforcement.
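The same idea can be written as a tiny check: compare how often the behavior happened before the consequence was introduced with how often it happens afterward. This is only a rough sketch. The function name `classify_consequence` and the rates used below are hypothetical, and in real life you would want to observe the behavior over many repetitions before drawing a conclusion.

```python
def classify_consequence(rate_before: float, rate_after: float) -> str:
    """Label a consequence by its effect on how often the behavior occurs.

    Both rates are assumed to use the same unit, for example barks per hour.
    """
    if rate_after < rate_before:
        return "punishment"      # the behavior decreased
    if rate_after > rate_before:
        return "reinforcement"   # the behavior increased
    return "neither"             # no change in frequency


# Hypothetical numbers: the dog barked 12 times an hour before the yelling
# started and still barks 12 times an hour now, so the yelling did not
# function as punishment.
print(classify_consequence(rate_before=12, rate_after=12))  # neither
```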
Tomorrow we'll talk more about aversives, punishment, and the difference between the two. Until then, happy training!