Topic #5 - All Positive, Balanced, or ? Poisoning the cue?

From Kim:

Okay, I will admit I am an idiot. But 1-my timing sucks on the closed hand leave it (I know, PRACTICE!) and 2-I don't understand how the dogs generalize the exercise from food in your hand to the Tylenol that has spilled on the floor (leave it in real life is rarely used for things in your hand, in my opinion). So you practice the closed hand leave it. Dog does great. Now you are out and about on the walk or even doing the leave it course and the dog dives for an item. What do you do? Do you simply let them get it and work thru a "give"? That's fine if it is an item from the leave it box, but it is not fine if it is the 1/2 of an onion that just went flying off the cutting board. So, do you give a leash pop at that time? If so, then aren't we poisoning the cue, thus negating the "positive training" work we did, because the emotional impact of the "negative" action (leash pop) will override the "positive"??

And if teaching a leave-it using the leash pop (on a flat or premier collar) is the "wrong" way to train it, then isn't using a prong on a pulling dog, simply stopping a walk and letting a dog pull against a flat collar, long line work in TOR all wrong too?? Or is it that the leash pop technique for leave it is just a more obvious correction to the viewer?

Are we harming the relationship between human and dog if we pop the lead, again on a flat collar? Again, what is the goal of leave it? Avoid the object or look at me? Look at me doesn't work if you're in a different room, so logically it would seem to me that leave it means avoid. I don't know...

And from Elin:

My take on leave it is this: If you are going to punish a dog for choosing wrong when you ask him to "leave it" then you should be sure the dog actually knows what you are asking him to do when you say "leave it." Otherwise you are punishing the dog for not KNOWING what "leave it" means, which doesn't seem fair. I think I read someplace that positive punishment (or is it negative? I am totally mixed up now!) should only be used if the dog absolutely knows what the command means and chooses to ignore it. So my preference is to teach my dogs & foster dogs "leave it" using positive methods and save the leash pop for when the dog knows the command but chooses not to do it. elin (from the peanut gallery)


OH I love this debate!

Is there such a thing as "Purely Positive" training?

Does the dog "know better"? Is the dog making a conscious choice - and why would he make the wrong choice if he knew better?

If the behavior is self reinforcing - would building in some fluent avoidance behavior be prudent to interrupt a behavior so you could redirect it? Is teaching an incompatible behavior enough - when and when might it not be?

Teaching the dog to monitor his own behavior, exercise discretion in absence of a cue or when not under 100% supervision - not get on the couch even when the owner isn't home, not take cookies off the counter even if the owner is in the shower or be accountable/responsible for keeping track of us even when we aren't actively in "training mode" (as in the long line exercise - the owner changes direction and the dog runs out of line if he isn't paying attention) requires that the dog resist temptation, follow our lead and make choices without our direction.

Here is the Karen Pryor article: Poisoning the Cue

(Gail Fisher addressed this in her presentation at the Canadian APDT conference a few years back, although hers was mostly about how luring messes up clicker training.) Karen's main point in the article is addressing cross-over trainers who have added the use of the clicker to their traditional choke chain method of training. 'Clicker users' as opposed to true clicker trainers. (A clicker in one hand and a jerk in the other - or a dog on one end of the leash and a jerk on the other?) The article explains how mixing methods can undermine the fact that in clicker training the cue actually becomes a reinforcer that can even be used as a reward when chaining behaviors together - great for those dance routines! In traditional training the cue carries the added negative connotation of threat of impending correction, as that is how the behavior was acquired. Adding the probability of correction for failure to comply "poisons the cue" - i.e. destroys the positive conditioning and reinforcing quality of the cue.

Are we Poisoning the Cue? Is a leash pop "wrong" when teaching "Leave-it"?
In the JQ Public real world, the words "leave it" are used more as a "no reward marker" than as a positive "cue" in the same sense as "sit" or "come" - so perhaps, given that, the leash pop doesn't "poison the cue" so much as emphasize that attempting to take an off-limits item is futile and unwise. Understand that learning not to touch "free stuff," while attainable, is a totally foreign concept to a canine and one that must be systematically taught.

The ideal goal is to set up a situation where you can repeatedly reward the dog for "giving up" and "avoiding" and hence, teach them how to self-monitor when the owner isn't physically or mentally present and give you an emergency "get back" safety net at those moments when the bottle of pills or glass hits the floor and breaks in the kitchen.

As for the administering of the leash-pop correction - we are more "letting them repeatedly run into the end of their leash" as a circumstance of lunging than we are inflicting a pop/jerk/physical socially administered growl/snap "correction." It gets the owner away from becoming scary and yelling "NO" and instead, teaches them to pro-actively interrupt and redirect the dog to an appropriate behavior: "ah-ah" "YES!".

Emphasize to the student that we are:
1. preventing the dog from getting to a "planted" item while
2. watching carefully for opportunities to mark ("yes!") and reward the behavior of coming away from the item and then
3. naming the behavior "Leave it" meaning "giving up trying to grab or avoiding an off-limits item."
4. letting the item itself become the cue for the dog to leave it.

If we do it correctly, we are layering one level of learning foundation on another and building a clear understanding of what we expect from the dog when it hears the words "leave it" or sees an unattended non-toy or food item.

Yes, running out of leash is punishing - our goal is to decrease the behavior of grabbing off-limits items, installing a level of restraint and impulse control - and even a certain level of healthy caution and avoidance - for their future safety.

In Karen's article she states: "Even though successful response to a given discriminative stimulus is still followed by reward, if failure is now followed by punishment, you have made that discriminative stimulus ambiguous in terms of predictable outcome. It is no longer 'safe.' "

SO, if we compare the always-must-be-safe cue "Come" to the command "Leave-it": Leave-it is an "unsafe" cue in its application - a warning that "that thing could be dangerous." But if we teach it as a neutral conditioned no-reward marker (NRM) that means "come away from there, it doesn't belong to you," and reward so that the dog associates the cue with "the handler is the rewarding place AND the way to get what you want," it doesn't have to carry the unspoken threat of "or I'll rip your neck off with this choke chain if you don't!"

When we teach "Leave it" we are teaching the dog "that item is off-limits and there is no point in trying to grab it" AND that turning away from the item, moving toward the handler, and making eye contact with the handler will be rewarded - in fact, access to the item itself could be the reward.

Here is how I approach teaching "Leave-it":
Remember our mantra is "BUILD ON A FIRM FOUNDATION!"
The stages of Learning are ACQUIRE, PERFECT, GENERALIZE.

Doggy Zen in hand - (See page 49 in handbook) Teaches leadership, control of resource, patience, impulse control, eye contact, timing of verbal marker "yes", criterion shift, setting the dog up to succeed, and the concept of an unemotional no-reward marker as opposed to a correction. Bonus- improved bite inhibition, behavior around food. Mine, not yours. Work to earn.

Step one (acquire):
Leave-it - Doggy Zen (biscuit in closed hand) - this simple exercise is INVALUABLE for teaching students the timing of "yes" and the possibility of teaching the dog to STOP doing something by rewarding the appropriate behavior and ignoring the bad. It gives them practice in the timing of the verbal reward marker and shows them how ignoring inappropriate behavior and simply rewarding what you DO want works - and works incredibly well. It also explains how the phrase "leave it" is information - a neutral no reward marker, not a punitive correction. If you can get the handler to focus on the critical timing of the "yes!" for the action of simply removing his nose from the hand and not be consumed by growling "leave it" you've made HUGE strides in preparing them for Prep class.

Step two (perfect):
The LLW to the plate game. I start with the basic premise of the leave-it in the hand - the goal is for the dog to sit in front of a plate full of tasty food without lunging for it. Reward the dog for loosening the leash - mark "yes!" for settling back or making eye contact, just like Doggy Zen. If the dog continues to strain against the leash to try to get to the food, I say nothing, hang on tight and wait. If the leash slackens, "yes!" and feed. If the dog sits instead of lunging, "yes!" and feed. When the dog is able to sit and resist the urge to "mug the plate," "yes!" and jackpot.

Each time the dog hears "yes!" the food comes from above, from the handler's face, to encourage the dog to look up to the handler in anticipation of the next reward. Before long they hear "yes" and look up - away from the plate. There is a point where the dog is resisting (has been rewarded multiple times for slackening the leash) but then sees an opportunity and tries to grab the food again. As he lunges and is about to run out of leash, I add "leave it." The dog learned from the Zen exercise that when you hear "leave it" the opportunity for reward was just removed - it is a no reward marker. Information, not correction. Continuing to try will not be successful. It is identical to the Doggy Zen in that what I am rewarding is "giving up." Eye contact earns a jackpot. End goal - dog considers its options, resists temptation and looks at handler instead. "YES!" Jackpot!

Bonus: dog is learning how to slacken the leash, owner is learning when to give verbal marker for a slack leash. Then we approach from 1 step away, then 2 and at this point it becomes a LLW (loose leash walking) exercise with the plate as the goal line.

Step three (generalize):
Random stuff on the ground. You've laid a foundation that mugging a hand full of food gets you nowhere and trying to grab food from a plate gets you nowhere. Giving up pays off, being patient and looking to the handler pays off. The handler has gained the skills to give well-timed rewards. Now you've raised the criteria to "real stuff." You leave the opportunity open, wandering near an item on a loose leash. Dog lunges for an item but runs out of leash (pop), can't get to it, can't get to it, can't get to it - there's an invisible force field around the object - dog gives up "Yes!" Looks at handler JACKPOT. Yes, you get avoidance behavior because hitting the end of the leash is punishing. The dog begins skirting the items and offering upward looks to the handler who will (if you really stressed it on the Zen exercise and the plate exercise) verbal mark "yes!" for making the right *choice* and being attentive to the handler in the face of some tough distractions. Key point: the sight of an object on the ground becomes the cue to "leave it" and look to the handler.

Why not just go straight to step three? Because if you've laid a good foundation at steps one and two, the dog has an understanding of the concept. He has the tools he needs to solve the problem with a minimum of P+ necessary. It is a criterion shift, not the whole training session.

So what pieces of learning theory are happening during this process?

There are four quadrants (plus extinction) in operant conditioning:

R+ - positive reinforcement - adding something to increase a behavior
R- - negative reinforcement - removing something to increase a behavior
P+ - positive punishment - adding something to decrease a behavior
P- - negative punishment - removing something to decrease a behavior
Extinction - nothing happens - behavior weakens from lack of reinforcement

The bottom line when determining which quadrant is in play is to ask was something added or removed (+ or -) and did behavior increase (reinforced) or decrease (punished).
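If it helps to see that two-question test written out, here is a minimal sketch in Python. The function name and the word choices ("added", "removed", "none", "increased", "decreased") are my own labels for illustration, not standard terminology, and treating "nothing happens" as a third option for extinction is an assumption based on the list above.

```python
def classify_quadrant(stimulus_change, behavior_change):
    """Name the quadrant from the two questions in the text.

    stimulus_change: "added", "removed", or "none" (hypothetical labels)
    behavior_change: "increased" or "decreased" (hypothetical labels)
    """
    if stimulus_change == "none":
        # Assumption: nothing added or removed and the behavior fades = extinction.
        if behavior_change == "decreased":
            return "Extinction"
        raise ValueError("nothing happened, yet behavior increased: look for a hidden reinforcer")

    # Was something added (+) or removed (-)?
    sign = {"added": "+", "removed": "-"}[stimulus_change]
    # Did the behavior increase (reinforced, R) or decrease (punished, P)?
    letter = {"increased": "R", "decreased": "P"}[behavior_change]
    return letter + sign


# Penny can crashes down (added), counter surfing decreases.
print(classify_quadrant("added", "decreased"))
# Leash tension goes away (removed), slack-leash behavior increases.
print(classify_quadrant("removed", "increased"))
```

The point of the sketch is only that the quadrant is decided by what actually happened to the behavior, not by what the handler intended.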

Positive trainers try their best to use predominantly R+ (adding something the dog wants to strengthen a desired behavior) - but remember every time you withhold a reward to reduce an undesired behavior you are using negative punishment.

I've tried to come up with examples of how the quadrants work within the "Leave-it" exercise below:

R- - negative reinforcement - removing something the dog doesn't want to increase a behavior.

Closed hand containing tasty food is presented and remains tightly closed as long as the dog continues to try to pry the food out. When the dog gives up, handler opens the hand so the dog can get to the food. Frustration at not getting into the closed hand is removed (-); non-mugging withdrawal behavior increases - is reinforced (R).

The tension on the leash is maintained until the desired response is achieved, i.e. dog strains against leash, owner is a tree, dog gives up pulling, uncomfortable tension on neck is removed (-), behavior of releasing tension on the leash increases (R).

Dog wanders near an off-limits item and handler begins a steady AH-AH-AH-AH-AH-AH which stops the instant the dog moves away to an allowable distance. The annoying noise is removed (-) and behavior of staying away from the off-limits item increases (R). Your seat belt reminder works the same way: it is an R-, making you hurry to put your belt on so the noise stops.

The ear pinch to teach a retrieve is the most commonly thought of R- ... the pain of the pinch ends when the dog takes hold of the dumbbell.

P- - negative punishment - withholding something the dog wants to decrease a behavior.

Hand snatches closed as dog reaches for food - food disappears (-). The behavior of mugging for the food decreases - is punished (P).

Dog lunges for food dish as owner is placing dish on floor, handler puts dish back on counter (-). Lunging for food dish decreases (P).

R+ - positive reinforcement - adding something the dog wants to increase a behavior.

The dog pauses as hand opens, handler marks with a "yes" and quickly delivers food reward (+). Dog learns to wait patiently in front of an open palm full of food, the owner reinforces the dog on a variable schedule. The behavior of not taking the food when presented increases (R).

Closed hand containing tasty food is presented and remains tightly closed - the dog continues to try to pry the food out and finally claws the fingers open. The handler drops the food (+) and the dog eats it. Mugging behavior increases - is reinforced (R).

P+ - positive punishment - adding something the dog doesn't want to decrease a behavior.

As dog reaches for food on open palm, handler startles dog with a verbal reprimand, penny can, water bottle, stomping foot, or leash pop, causing it to withdraw from the hand. (A "soft" dog might need a mere whisper of "ah ah!") Addition (+) of something the dog finds disagreeable causes avoidance of food-taking behavior. Behavior of grabbing for the food decreases - is punished (P).

Dog repeatedly slams against a leash that the handler has kept too short for the dog to reach the off-limits item (or that the handler pops). Addition (+) of something the dog finds disagreeable (hitting end of leash) causes reduction of behavior. Behavior of lunging for off-limits items decreases - is punished (P).

Dog grabs something from kitchen counter, penny can crashes to the ground (+), dog drops food and runs out of kitchen. Behavior of counter surfing decreases (P).

 

You are using one or more of the quadrants every time you teach an animal anything. You should be aware of which quadrant you are choosing to effectively diminish an unwanted behavior and install a new one, and you should understand the benefits and fallout of those choices.

Revisit Building Behaviors Topic 4 - what quadrants are at work?