It is the nature of behaviour to change
Kay Laurence March 2009. From the newly published book: Teaching with Reinforcement
And nature does not make it easy for us.
I have taught my puppy to sit for a piece of cooked chicken: a cube about half the size of my thumbnail. Once my puppy has repeated this behaviour 400-500 times I will reduce the value of the reinforcer. By then the sit has become much easier to achieve, it is more of a reflex than a considered response, and the puppy, now 7 months old, will get reaction, affection or attention as a reinforcer for a plain sit, but it will always get some reinforcer. If I continue to feed the cube of chicken, I will be devaluing it, since the behaviour that once required that value has become easier to do. The sit will still get good quality reinforcement, but not of the same high value.
Imagine a cube of cooked chicken has a value of 1 unit. To earn this the pup has to focus, respond with self control to the lure, or remember what they have to do to get that treat. This is quite an achievement for a 10-week-old pup, who is high energy and easily aroused. I set the value of this combined effort as 1 unit of effort.
As time and repetition increase the pup's competency at the sit, the pup will need less skill, concentration and memory to earn the same value reinforcer: 1 unit. I may be using these treats for shaping exercises, teaching a new behaviour or recognition of self control, but by overpaying for the competent sit, I risk devaluing the 1 unit reinforcer, and may affect the amount of effort put into more complex behaviours that only earn the same reinforcement.
But in more difficult situations, perhaps a sit before the dog leaves the car, I will revert to 1 unit of reinforcement, since the amount of effort required to achieve that behaviour will be far higher than for the sit in my kitchen.
It is highly unlikely that you will have the time to measure the reinforcer on every occasion. But remember to recognise when the effort is greatly in excess of the standard 1 unit reinforcer and use sufficient reinforcers to match the effort required to perform the behaviour. My Gordons will respond to recall for food when in the garden, but not when in the field. The food value has completely changed and I would need to compete with the highest reinforcer: bird hunting. I have a secret weapon for high value reinforcement with the Gordons and that is intense focus and loud adoration, drama and body rubs for at least 60 seconds. This is quite exhausting, but definitely high value for their temperament. My collies would be appalled at such a display of adoration. A simple "cool" is all they require. Job done.
One technique that worries me is the strategy of using too much of "yourself" as part of a standard unit of reinforcement. When clicker training a new behaviour, the dog learns to memorise what they are doing at the time of the mark and seek their reward. This is a highly cognitive process and needs plenty of relaxed concentration. If, when you deliver the food treat, you simultaneously add several murmurs, or even dramas of amazement and approval, you can disrupt the concentration. The dog needs to be able to learn without trying to seek approval or an emotional response from you at the same time. In this situation let the clicker and food do the talking and reserve your valuable approval for those times when there is a genuine increase in effort or you are without external reinforcement. You may need to plan to associate sounds such as "good boy" with the delivery of food, but beware of overusing them. The amount you need is the amount that is sufficient to strengthen, or increase the frequency of, the behaviour.
You can vary the value of the reinforcer in an unpredictable way, with variable reinforcement. Note: this is NOT a variable schedule of reinforcement, which is when the same reinforcer is delivered on a variable schedule, i.e. on the 1st, 3rd, 8th, 13th repetition etc. With variable reinforcement, for this sit I will give you attention, for the next sit I will put the lead on to go for a walk, for the next sit I will give a food treat, but there is always reinforcement. This variability will only work if at regular intervals the dog has received an exceedingly high value, important reinforcer, such as going for a walk. But if the dog only ever receives a pat on the head or a smile for a sit, then you can expect to get the same quality of behaviour in return. The predictability of a high reinforcer elicits strong behaviour; the predictability of a poor reinforcer will elicit a weak behaviour.
Those important, must-have behaviours, such as a recall, should always predict the best quality reinforcer to maintain the quality of the behaviour. It is too important for you to get slack on.