Let’s get a bit theoretical tonight. Last week I had a long, late discussion with my friend Eva over some glasses of wine. There has been some discussion of NRMs (No Reward Markers) and LRS (Least Rewarding Stimulus) in Swedish blogs lately. The NRM is sometimes referred to as a positive punisher (you add something and the behavior gets less likely). Tonight, I stumbled on an old blog post (almost four years old) that I wrote in Swedish and that seemed to fit perfectly into the discussion. I will translate it for you:
Over the last year it has become very clear to me how reinforcing information is to dogs. It has long been known that it’s reinforcing to get a cue, or to have a target or lure presented. What you might not think so much about are the small signs that are so reinforcing to both dog and handler that you easily get caught in a vicious circle.
We use (sometimes, not that much anymore) a step that signals “change behavior” to the dog. If the dog has done (and been rewarded for) five sits and I now want the dog to lie down, I’ll take a step back when the dog starts on the sixth sit. A well-trained dog will take that hint and try another behavior instead. Some time ago, we had a discussion about what this signal is. A few people argued that this kind of no reward marker is a punisher. My conclusion now, a year later, is that this kind of NRM is reinforcing. Even then I argued that the step was functioning as a discriminative stimulus (i.e. a cue), but I didn’t think about it being reinforcing (though I didn’t believe it was punishing either).
There is no problem with using the kind of switch-cue that I described above. The reward does come when the dog does something we like (trying to sit when he has been rewarded for sitting five times before is a good behavior). The problem arises when you use this kind of information when you get behaviors that you don’t like. Say your dog has sat down five times and been rewarded for it and then lies down on the sixth try, or the dog gets “stuck” during shaping and starts to bark. You take a step back, the dog stops his barking and offers a behavior that you click. Everybody is happy – you’re happy because the dog stopped barking and the dog is happy to get information and a treat. The question is just what happens with the barking in the future…
Many dogs that “get stuck”, “give up” or “get frustrated” during shaping wouldn’t do that if we didn’t reinforce it so well. I’m not suggesting that shaping is about waiting a lot, and I very rarely have to wait for my dogs to offer anything (they’ve learned that giving up earns them nothing, so they don’t). But sometimes when we teach classes, there is a fair amount of waiting and it can get frustrating for both dogs and people. The waiting comes from the need to extinguish behaviors that have been reinforced many, many times (both in humans and dogs). It’s so tempting to nudge the dog in the right direction (move a little, wave the target, tell the dog “come on!” etc.) since the dog often rewards our efforts (at least the first few times). If you’re really stuck, I think it’s better to just walk away, take a walk with the dog and make a new plan instead of constantly reinforcing helplessness.
People have a hard time buying this, since it doesn’t feel right to just wait the dog out. I do think that this is sometimes needed for the dog to understand that it is taking initiative and repeating rewarded behavior that gets him the reward – not barking or giving up. And a dog that understands this is a joy to work with, since you will rarely or never have to wait for the dog again. We sometimes wonder what makes our dogs so very easy to train and how come it feels like they’re reading our minds. I think the answer, at least partly, lies here. Of course, to be able to not help the dog out all the time, you have to plan your sessions well, end in time, set good criteria etc. Otherwise you’re just being unfair to the dog.
What are your thoughts on this subject? It only touches on the discussion of NRMs, but I’d love to hear your opinion on that as well. Are NRMs always positive punishers? Should you ever use them? Please leave your thoughts!
For my Swedish readers, this is the link to the original blogpost: Hjälp är mer förstärkande än du tror
Can you share the links to some of the other recent discussions?
Over the last 18 months, I received 5 different definitions/descriptions of NRMs, some dramatically different than others. It may be that the term is not formally defined….or people just do not understand it properly.
Fanny Gott ,
Here’s a link, but it’s in Swedish: http://djurtranarskolan.wordpress.com/2011/12/01/lrs-least-reinforcing-stimulusscenario-2/
Another Swedish link:
And no, I don’t think any of these terms are formally defined really.
My internet browser (sometimes) automatically translates….it’s not perfect but it allows me to read a lot of interesting things from you and others! Both of those posts are great and I will re-read again. I love how everyone is thinking through it!
It would make sense that specific facilities have their specific way of using an NRM. Sometimes those specific procedures seem to have become part of the definitions, or are easily interpreted as part of the definition. Most dog people are probably biased towards Ken’s definitions, as he and Kathy Sdao are often the only ones who bring it up.
In your discussions about your NRM as a reinforcer, what kinds of responses have you received from other trainers? If you had asked me yesterday, I would have said “I have to believe an NRM is a punisher because Ken says so.” You are making me think about what Ken -really- said compared to my memory and about other options.
Eva Bertilsson ,
Thanks for a nice evening 🙂 Wine and nerdy friends are very valuable assets in life!
I believe one of the problems with concepts like NRM is that they are not very well defined – not in what the procedure looks like, and not in how they functionally affect behavior. The first – what the procedure looks like – is something we need to define in order to have a meaningful discussion. The second – how a certain procedure functionally affects the behavior in question – is what really matters 🙂
About the “switch behaviors-cue”:
1. That really is a label I personally prefer over NRM – when I see people use NRMs, it typically seems to be a stimulus that functions as an Sd for a behavior.
2. As an Sd for positively reinforced behavior I do believe that it has the potential to reinforce the behavior that precedes it. So YES, what someone might call an NRM might very well function as a reinforcer. I believe that is one of the many problems with how people use what they call NRMs…
Personally I’d rather “play safe” and stick to minimalistic training 🙂 As you say, the key lies in “plan your sessions well, end in time, set good criteria etc”.
I will, however, consider redirection as a strategy to break off or steer away from an undesired behavior IF I judge that an extinction procedure might put me into bigger trouble than a redirection will… But whenever doing so I must also acknowledge that I most likely will have reinforced a behavior I’d rather not reinforce – so I’ll have to be careful with how I set up the next trials or sessions, so that the animal is successful in earning R+ without that undesired behavior occurring.
By the way: in humans, I’ve learned it’s called “functional passivity” when passivity has been reinforced by cues, typically in the form of “nudging and helping”. What will the behaviorally oriented psychiatrist recommend to the people around the patient? Wait for spontaneously offered behavior and reinforce that 🙂