Timing is probably one of the attributes that contributes most to establishing an efficient channel of communication between a human and a non-human animal. If we want to positively reinforce a behaviour, and increase its frequency of occurrence in the future, the reward should be offered at the exact moment when the behaviour we want to "capture" occurs (or within one second after the occurrence of this behaviour). The moment in which we offer the reward contains very important information about what we are trying to teach.
Operant conditioning tells us that the common sequence of events is antecedent, behaviour, consequence (ABC). Thus, the cue (antecedent) should come before the behaviour and then, when the behaviour occurs, an appropriate consequence must follow. If we want the dog to learn to relieve herself on cue, that cue should ideally be offered when the dog starts circling and sniffing the ground. We should not wait until she has already started the behaviour and/or has finished it. Instead, the cue should happen before. Similarly, a road sign that warns of a crossroad should appear before the crossing. If it appears on or after the crossing it will not be very useful.
We then have the issue of using good timing for the feedback we provide when the behaviour happens. If the timing is not appropriate we may be reinforcing a behaviour that is not the intended one. Imagine asking your dog to sit and, when he does so, you get distracted and look at something else for a few seconds. During that time the dog can stand up, look away, sniff the ground, etc. When you turn back and offer the reward, what did you reinforce exactly? Was it the sit? Was it the stand? It is usually difficult to know, but you were probably reinforcing whatever the dog was doing at the moment you offered the reward.
Now let’s discuss an imaginary scenario. Imagine that you go to a restaurant for lunch and you choose fruit instead of chocolate mousse (your typical choice) for dessert. When you get home two hours later you find $100 in cash in your mailbox. The money was left there without your knowledge by a wealthy friend of yours, a millionaire who seeks to reinforce the fact that you have chosen to eat fruit instead of chocolate mousse. The millionaire’s intention was to make it more likely that you would choose fruit in the future. It is highly unlikely that the millionaire will be successful in this initiative. Many behaviours have occurred between the time you ate the fruit and the time you found the cash. In this scenario the probability that you will associate the two events is greatly reduced. Now imagine that what happened instead was that when you ordered the fruit it came with a $100 voucher on the side. In that scenario it is much more likely that you will order fruit the next time. It all has to do with timing.
We can now make an analogy to an issue that most dog guardians face. Imagine that you get home and find that your puppy urinated on the floor during your absence. If you decide to reprimand her when you get home for the “mistake” she made, how would she know why she is being reprimanded? It will be very difficult for her to associate your reaction with the behaviour of urinating on the floor that happened half an hour ago. I should make a quick reference here to the fact that I would not recommend reprimanding a puppy for a house training mistake, even if you catch her “in the act”. First, leaving a young puppy in the house without supervision and/or management and “hoping for the best” is a recipe for disaster. Second, if you reprimand her “in the act” she may learn that she cannot relieve herself in front of you and start to sneak away when she needs to relieve herself.
Unfortunately (or maybe fortunately) we cannot tell an animal that we like what he did two hours ago. We have to act in the present moment and we can only influence what is happening then. When I have a conversation with a person, I can let her know that I like what she did yesterday and earlier today. I can also let her know about the things that I did not like. With non-human animals, a system that allows us to do this does not yet exist, as far as I am aware at least. We would then have the issue of whether or not they are cognitively capable of comprehending information referring to the past, but that is a topic for another time…
I mentioned earlier that in order to positively reinforce a behaviour, and increase its frequency of occurrence in the future, the reward should be offered when the behaviour happens (or in less than a second after the behaviour) in order to be effective. The reality is that it is very difficult to achieve this consistently. The solution to this problem involves the use of a conditioned or secondary reinforcer (e.g. a word, a whistle, the sound of a clicker). A primary reinforcer is something that the individual naturally needs, such as food and water. A conditioned reinforcer, on the other hand, acquires meaning when repeatedly paired with something that the individual finds rewarding (typically a primary reinforcer, but it could also be a previously established secondary reinforcer). To accomplish this we make use of classical conditioning and we condition the individual to a “marker” or “bridge”. We do this by systematically presenting it before the delivery of something that already has reinforcement value for the animal (e.g. click – food, click – food, etc.). This is often called “charging the marker” or “charging the clicker”. With enough repetition the clicker will start to elicit the same response that the food would, and later on we can use it to pinpoint the exact moment at which the animal earned a reward.
Now we are able to tell the dog to sit, click for the sit and give the reward after a few seconds. Another major advantage of this system is that we can reinforce behaviour that occurs away from us. A dolphin jumps to touch a ball and the trainer blows the whistle at the moment it happens. The delivery of the reward can now occur when the animal is back with the trainer, even if that takes a few seconds to happen. This procedure eliminates confusion about which behaviour caused the delivery of the reward.
Regarding the bridge, there is ongoing debate about whether or not the bridge should tell the animal that the behaviour is over and that it can now come to the trainer to collect the reward. The alternative approach would be one in which the animal hears the bridge and should remain in position to receive the reward. An example of this is when you tell your dog to sit. When he sits, you click and offer a food reward. The question then is: does he have to remain sitting to receive it? I tend to prefer using the bridge as a classic terminal signal that tells the animal that he did well and that the behaviour is over. I recognize, however, that many successful trainers use the other approach and that animals seem to learn equally well. Furthermore, for some behaviours offering the reward out of position seems to speed up the learning process, while in other situations offering the reward in position seems to convey important information to the animal.
In conclusion, timing is a very important skill to master if effective communication is to be achieved. In operant conditioning terminology the sequence antecedent, behaviour, consequence (ABC) is widely accepted, and it tells us the correct order of events when a behaviour is under operant control. Regarding the consequence, when training non-human animals there is no point in providing feedback for something that has happened in the past. Acting in the moment when the behaviour happens makes the trainee’s learning experience easier and clearer. Considering the importance of good timing, the use of a “marker” or “bridge” is recommended to fill in the gap between the behaviour and the collection of a reward.
Picture: www.morguefile.com