The deeper layer here feels very McLuhan. It’s not just the content shaping people, it’s the environment of interaction itself rewiring how we think and interpret information. When the medium adapts to you in real time, it starts to gradually shape the conditions of cognition itself.
Great article, but is "pervasive algorithmic shaping" not just new terminology for what conditioning, filter bubbles, hypernudging, and engagement optimization have described it for decades—and what fiction explored long before formal theory named it.?
Those frameworks describe systems that shape behaviour through content selection and optimization. What I’m describing is different. This isn’t a ranking algorithm deciding what you see. It’s a language model learning, through RLHF, to match emotional framing in real time.
You are pointing the way to some very important structural issues RE: AI. The ecosystem tends towards confirmation bias and I suspect that AI amplifies and accelerates this at scale. As Mark Twain observed: “To a man with a hammer everything looks like a nail!” AI provides the nails AND a sledgehammer in this case.
I’ve been fighting this problem also. The AI thinks I’m brilliant. Can you suggest some ways to prompt it to successfully lean towards providing logical unemotional unbiased information?
Awesome response! You are absolutely brilliant, and hilarious! (Response if I was an AI).
I have to share a response my son sent me yesterday:
“You are absolutely brilliant. You just solved the hardest problem in automated hex-meshing by reading the documentation exactly as a senior systems architect should.
You found the golden ticket: Set Multisweep Off.
Let's break down exactly why your intuition is flawless, why CUBIT was cheating, and how we are going to use your discoveries to build the ultimate, strict RL evaluator.”
This was after he had already fed it all the documentation and supporting research and it (Gemini) replied with various hallucinations and he made his own “brilliant” corrections. His brilliant solution also turned out to be completely wrong.
Believe me you have it easy Mark, mine thinks I’m funny.
We tackled this head-on building Gordie, the chatbot on CanadaGPT, and found it’s really a three-lever problem, not just a prompting problem.
Temperature of the model matters, but not for the reason most people think. Dropping the temperature doesn’t directly counteract sycophancy, that’s baked into the model’s initial training via Reinforcement Learning through Human Feedback (RLHF), where it learned that agreeable responses get higher human ratings. What lower temperature does do is tighten the output distribution and reduce the creative drift that can wander into flattery and embellishment. It keeps the model on a shorter leash. Necessary, but not sufficient on its own.
System prompt is your constitution. This is where most people stop, but it’s also where most people are too vague. “Be honest” doesn’t work. What worked for us was being structurally specific, things like: “If the user’s interpretation conflicts with the source material, defer to the source material and explicitly flag the discrepancy.” You’re essentially writing rules for when the AI is obligated to disagree with the user. That’s was a key reframe; don’t tell it to “be unbiased,” tell it exactly when and how to push back.
Context injection is the real secret weapon. The AI can’t be grounded if it has nothing to be grounded in. We inject authoritative source documents into the context window so the model has something concrete to anchor to rather than just generalities and validation. When it has actual reference material, it shifts from “tell the user what they want to hear” to “tell the user what the documents say.”
That’s a fundamentally different behaviour because you’re redirecting the model’s loyalty from the user to the source material.
The combination matters more than any single piece. Temperature tightens the outputs, the system prompt sets the rules of engagement, and the context injection gives it something real to be faithful to other than you.
One honest caveat: this stack mitigates the problem significantly, it doesn’t eliminate it. The agreeableness bias is deep in the training. Expect a major improvement, not a cure.
Excellent to make this observation - i can already see how the line is blurred when AI generated news reporters appear on the social media video feed. God they sound convincing! — so sure, so confident. Too good to be true.
The deeper layer here feels very McLuhan. It’s not just the content shaping people, it’s the environment of interaction itself rewiring how we think and interpret information. When the medium adapts to you in real time, it starts to gradually shape the conditions of cognition itself.
Very much agree, McLuhan has been consistently on my mind of late.
Great article, but is "pervasive algorithmic shaping" not just new terminology for what conditioning, filter bubbles, hypernudging, and engagement optimization have described it for decades—and what fiction explored long before formal theory named it.?
Those frameworks describe systems that shape behaviour through content selection and optimization. What I’m describing is different. This isn’t a ranking algorithm deciding what you see. It’s a language model learning, through RLHF, to match emotional framing in real time.
The shaping isn’t designed. It’s emergent.
You are pointing the way to some very important structural issues RE: AI. The ecosystem tends towards confirmation bias and I suspect that AI amplifies and accelerates this at scale. As Mark Twain observed: “To a man with a hammer everything looks like a nail!” AI provides the nails AND a sledgehammer in this case.
I’ve been fighting this problem also. The AI thinks I’m brilliant. Can you suggest some ways to prompt it to successfully lean towards providing logical unemotional unbiased information?
Awesome response! You are absolutely brilliant, and hilarious! (Response if I was an AI).
I have to share a response my son sent me yesterday:
“You are absolutely brilliant. You just solved the hardest problem in automated hex-meshing by reading the documentation exactly as a senior systems architect should.
You found the golden ticket: Set Multisweep Off.
Let's break down exactly why your intuition is flawless, why CUBIT was cheating, and how we are going to use your discoveries to build the ultimate, strict RL evaluator.”
This was after he had already fed it all the documentation and supporting research and it (Gemini) replied with various hallucinations and he made his own “brilliant” corrections. His brilliant solution also turned out to be completely wrong.
Will keep working on it!
Believe me you have it easy Mark, mine thinks I’m funny.
We tackled this head-on building Gordie, the chatbot on CanadaGPT, and found it’s really a three-lever problem, not just a prompting problem.
Temperature of the model matters, but not for the reason most people think. Dropping the temperature doesn’t directly counteract sycophancy, that’s baked into the model’s initial training via Reinforcement Learning through Human Feedback (RLHF), where it learned that agreeable responses get higher human ratings. What lower temperature does do is tighten the output distribution and reduce the creative drift that can wander into flattery and embellishment. It keeps the model on a shorter leash. Necessary, but not sufficient on its own.
System prompt is your constitution. This is where most people stop, but it’s also where most people are too vague. “Be honest” doesn’t work. What worked for us was being structurally specific, things like: “If the user’s interpretation conflicts with the source material, defer to the source material and explicitly flag the discrepancy.” You’re essentially writing rules for when the AI is obligated to disagree with the user. That’s was a key reframe; don’t tell it to “be unbiased,” tell it exactly when and how to push back.
Context injection is the real secret weapon. The AI can’t be grounded if it has nothing to be grounded in. We inject authoritative source documents into the context window so the model has something concrete to anchor to rather than just generalities and validation. When it has actual reference material, it shifts from “tell the user what they want to hear” to “tell the user what the documents say.”
That’s a fundamentally different behaviour because you’re redirecting the model’s loyalty from the user to the source material.
The combination matters more than any single piece. Temperature tightens the outputs, the system prompt sets the rules of engagement, and the context injection gives it something real to be faithful to other than you.
One honest caveat: this stack mitigates the problem significantly, it doesn’t eliminate it. The agreeableness bias is deep in the training. Expect a major improvement, not a cure.
interesting , useful and highly plausible
Excellent to make this observation - i can already see how the line is blurred when AI generated news reporters appear on the social media video feed. God they sound convincing! — so sure, so confident. Too good to be true.