Clicker training is currently a popular training method among dog owners, and increasingly among cat and horse trainers. Many people consider clicker training a fad or gimmick. For others, because the method often relies heavily on food rewards, it equates to bribing animals to do "tricks" and therefore does not constitute "real" training. This presentation is intended to outline the scientific basis for clicker training, dispel some myths, and give examples of the advantages of the method for many training situations.
There is nothing magical or mystical about a clicker. Initially, it means nothing to an animal. The sound only becomes meaningful to the animal after it is paired with something the animal finds significant (e.g. food). The clicker is a tool that is used during a training process that is based on scientific learning theory involving operant and classical conditioning. Operant conditioning (OC) and classical conditioning (CC) are two forms of associative learning – learning that occurs when an animal (or person) makes an association between two stimuli or between a behavior and its consequence.
Classical conditioning, also called Pavlovian conditioning or respondent learning, was discovered by Ivan Pavlov during experiments involving salivation in dogs. The classic example is dogs that "learned" to salivate upon hearing a bell because the bell predicted the presentation of food. Classical conditioning involves stimulus-response processes: involuntary responses, often the conditioning of reflexes and emotions. It also involves stimulus-stimulus associations that are independent of the animal's behavior.
For example, cats learn that the sound of a can opener predicts food no matter what the cats are doing at the time. Similarly, dogs learn that doorbells predict visitors. Classical conditioning is the process by which dogs learn the significance of a clicker – the sound of the click predicts the presentation of a reinforcer, usually food, but also toys, attention, or other primary reinforcements. With consistent pairing, the click takes on reinforcing properties on its own; it becomes a conditioned, or secondary, reinforcer.
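The way repeated click-food pairings build up the click's value can be illustrated with a small simulation. The sketch below uses the Rescorla-Wagner model, a standard textbook model of classical conditioning that this presentation does not itself rely on. The function name, the learning rate of 0.3, and the maximum strength of 1.0 are all assumptions chosen for illustration, not measured values.

```python
def rescorla_wagner(trials: int, learning_rate: float = 0.3,
                    max_strength: float = 1.0) -> list[float]:
    """Return the click's associative strength after each click-food pairing.

    On every trial the strength moves a fixed fraction (learning_rate) of the
    remaining distance toward max_strength, so early pairings produce the
    largest gains and later pairings produce smaller ones.
    """
    strength = 0.0  # before any pairing, the click means nothing to the animal
    history = []
    for _ in range(trials):
        strength += learning_rate * (max_strength - strength)
        history.append(strength)
    return history


history = rescorla_wagner(10)
for trial, strength in enumerate(history, start=1):
    print(f"pairing {trial:2d}: associative strength = {strength:.3f}")
```

Under these assumed parameters the curve rises quickly and then levels off, which is one way to picture why a clicker is usually "charged" with a run of click-then-treat pairings before it is used to mark behavior.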
Operant conditioning refers to a process by which animals learn associations between their behavior and its consequences. These are voluntary responses such as sit, shake, come, heel, speak, etc. Operant conditioning involves the use of reinforcements and punishments. Reinforcement refers to any stimulus that results in an increased probability of the target behavior being repeated in the future, while punishment refers to any stimulus that results in a decreased probability of the behavior being repeated in the future. There are four quadrants, or possible consequences, in OC:
1. Positive reinforcement: the application of a stimulus that increases the frequency of the target behavior, for example, giving a dog a treat if it sits.
2. Negative reinforcement: an aversive stimulus is applied and then removed when the animal performs the desired behavior, and that removal increases the frequency of the target behavior. Using an ear pinch to teach a forced retrieve is an example of this method.
3. Positive punishment: the application of an aversive stimulus with the intention of reducing the frequency of the target behavior. Yelling at a dog or kneeing a dog in the chest for jumping up is an example of this.
4. Negative punishment: the removal of a reward or the opportunity for a reward in order to reduce the frequency of the target behavior. Turning away from a jumping dog or using a time-out are examples.
All four of the quadrants of operant conditioning work successfully to modify behavior if applied correctly.
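The four quadrants reduce to two yes/no questions: was a stimulus added or taken away, and did the behavior become more or less likely? A minimal sketch of that two-by-two classification follows. The `Quadrant` enum and `classify` function are hypothetical names introduced here for illustration, and the example descriptions paraphrase the ones in the list above.

```python
from enum import Enum


class Quadrant(Enum):
    POSITIVE_REINFORCEMENT = "positive reinforcement"
    NEGATIVE_REINFORCEMENT = "negative reinforcement"
    POSITIVE_PUNISHMENT = "positive punishment"
    NEGATIVE_PUNISHMENT = "negative punishment"


def classify(stimulus_added: bool, behavior_increases: bool) -> Quadrant:
    """Map a consequence onto one of the four operant conditioning quadrants.

    "Positive"/"negative" says whether something is added or taken away;
    "reinforcement"/"punishment" says whether the behavior becomes more or
    less likely in the future.
    """
    if behavior_increases:
        if stimulus_added:
            return Quadrant.POSITIVE_REINFORCEMENT
        return Quadrant.NEGATIVE_REINFORCEMENT
    if stimulus_added:
        return Quadrant.POSITIVE_PUNISHMENT
    return Quadrant.NEGATIVE_PUNISHMENT


examples = {
    "treat given when the dog sits": classify(True, True),
    "ear pinch released when the dog retrieves": classify(False, True),
    "knee to the chest when the dog jumps up": classify(True, False),
    "owner turns away when the dog jumps up": classify(False, False),
}
for description, quadrant in examples.items():
    print(f"{description}: {quadrant.value}")
```

Note that the classification depends on the observed effect on behavior, not on the trainer's intent: a consequence only counts as reinforcement if the behavior actually becomes more frequent.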
Clicker training encompasses two concepts: (1) training by using almost exclusively the positive reinforcement quadrant of OC, and (2) using a secondary reinforcer to bridge the time between the target behavior and the delivery of a primary reinforcement. (This latter effect is why the clicker is also called a "bridge".) What makes this superior to many other training methods?
1. Using a bridge allows for precise timing of reinforcement, particularly for short-duration behaviors or for animals that show many behaviors at one time or in rapid succession. A person's eye-hand reaction time is much faster than their eye-to-speech reaction time. This allows the trainer to more accurately "pick out" the desired behavior from other background behaviors.
2. Because the clicker is a distinct and unique sound, it has high salience against background noise. Dogs are accustomed to humans talking, often far too much, so verbal reinforcements can lose salience to the dog over time.
3. The clicker allows for reinforcement even if the animal refuses the subsequent primary reinforcer. For example, suppose a dog is nervous and distracted. The owner asks the dog to sit and the dog responds. The dog is offered a treat as a reward but is too nervous to eat it. If the dog is clicked for sitting, reinforcement occurs even if the dog refuses the treat, because the brain still processes the sound.
4. Positive reinforcement training conditions desirable emotional associations with the training process, the trainer, the cues (commands), and the act of performing the behavior itself, as well as all other contextual stimuli associated with the training process. This last effect is the most important concept, especially when dealing with animals with behavioral problems such as anxiety disorders and aggression issues.
Operant and respondent learning occur together during every training session. While a trainer can focus more on one paradigm than the other, it is impossible to train exclusively with OC or CC – they always occur together.
Positive reinforcement is the only intervention that does not purposely involve the use of an aversive stimulus in the training process. This has major implications when you take classically conditioned effects into account and recall that CC involves conditioning involuntary responses – emotions and reflexes. By definition, punishments and negative reinforcement involve doing something unpleasant to the animal. Punishments need not, and should not, be abusive. Punishments do not need to be painful either, although many of them are.
Let's look at a simple training exercise from two perspectives. First, food is a primary reinforcement. Food is associated with comfort, "pleasure", and activation of the parasympathetic nervous system (whereas the fight-flight response activates the sympathetic nervous system). For simplicity's sake, we will say that food makes dogs "happy." (Pull out a dog biscuit and look at your dog's response.) This is an automatic response, not a learned one.
Scenario 1: A dog is trained to sit by using a piece of food as lure and reward – when the dog sits it is given the food. The trainer then begins to say "sit" just before the dog performs the behavior. If the dog does not sit in response to the cue, the dog is simply denied reinforcement. Over time the dog learns to sit in response to the command because it knows it will get a food reward when it does so. This is the operantly trained behavior. But what effect has CC had here? Classical conditioning has made the following associations:
1. Because sitting predicts food, and food makes the dog "happy", sitting makes the dog "happy".
2. The command "sit" predicts the opportunity for reinforcement (food), therefore the word "sit" makes the dog "happy".
3. Since the trainer delivers the food, the presence of the trainer makes the dog "happy".
4. All other stimuli associated with the training environment are now associated with the food and also take on "happy" associations. This is called contextual conditioning.
5. Because sitting makes the dog "happy", sitting in and of itself is now rewarding to the dog.
6. The command "sit" now can act as a secondary reinforcement because the word "sit" makes the dog "happy" and the word "sit" is now rewarding to the dog.
Scenario 2: A dog is trained to sit by pulling up on its collar and pushing on its rump until it does so (negative reinforcement). The command is introduced just before the aversive stimulus (pulling on the collar and pushing on the rump) is applied. If the dog does not sit on command, the dog is corrected with the training collar. Remember, by definition, negative reinforcement involves applying some stimulus that is aversive or unpleasant to the dog and then removing it when the dog performs the target behavior; the dog is working to avoid an aversive stimulus. Soon the dog learns to sit because it knows that an aversive stimulus will be applied if it does not sit. This is the operantly trained behavior, but what has Pavlov done?
1. Pulling up on the dog's training collar is unpleasant and makes the dog "unhappy" (and possibly fearful). Because the trainer performs this action to the dog, the presence of the trainer makes the dog "unhappy" (and possibly fearful).
2. The command "sit" predicts pulling up on the dog's collar, therefore, the word "sit" makes the dog unhappy/fearful.
3. The act of sitting makes the dog unhappy/fearful.
4. Because the word "sit" now makes the dog unhappy/fearful, it becomes a conditioned punisher.
5. Because the command, the behavior, the trainer, and the training process make the dog unhappy/fearful, all of the other stimuli associated with the training process and context are also associated with unpleasantness.
Now take a dog with storm phobia that paces, digs, pants, and shows other destructive behavior. The owner wants to help the dog calm down. If the dog was trained with positive reinforcement, what happens to the dog's emotional state if the owner asks the dog to sit during a storm? If the dog was trained using negative reinforcement and punishment, what happens to the dog's emotional state if the owner asks the dog to sit during a storm? What if this was a dog that was aggressive toward people on walks because it was fearful of them?
Clicker training is a powerful way to build durable behavior in animals as well as alter the animal's emotional state in a desirable direction. Clicker training and positive reinforcement also empower the animal by giving the animal choices. The more choice an animal has, the better the animal's behavioral health.