By Robert Buccigrossi, TCG CTO
Being Polite to LLMs Is Not Anthropomorphism, It's Probability
Just a few weeks ago, in a clip from a May 2025 event, Google co-founder Sergey Brin shared a surprising take on interacting with AI models. He claimed that models, not just Google’s, “tend to do better if you threaten them.” The sentiment is understandable from a certain perspective; it stems from a mental model of a tool that must be commanded. If it doesn’t perform, you apply more pressure. But this perspective, as recent studies show, is wrong. In fact, it’s the exact opposite of the optimal strategy.
Research shows that being polite, using correct grammar, and framing prompts with positive emotional cues can significantly improve the quality of an LLM’s output. But this isn’t because we are trying to be “nice” to a machine or because we are anthropomorphizing a pile of silicon and software. It’s because politeness and structure are powerful tools for guiding a probabilistic system.
The core thesis is this: Treating an LLM with respect is not about pretending it has feelings; it’s about understanding that every word in your prompt is a signal that shifts the probabilistic landscape from which the model draws its answer. It’s about probability, not personality.
Your Prompt Doesn’t Give Orders, It Sets the Scene
To understand why this works, we need to abandon the “magic black box” or “disobedient genie” metaphors for LLMs. A more accurate, if less romantic, mental model is that of a “universal document completer.”
An LLM is trained on a colossal dataset comprising a huge portion of the text ever written by humanity—scientific papers, business emails, heartfelt letters, poetry, instruction manuals, and, yes, the chaotic, angry, and factually incorrect cesspools of the internet. When you give it a prompt, the LLM isn’t “thinking” in a human sense. Instead, it’s performing an incredibly sophisticated act of pattern-matching. It asks itself, statistically, “Given this sequence of words, what is the most probable sequence of words that should follow?”
This is where your prompt becomes so critical. It provides the context that helps the LLM determine: “Am I in a Reddit post? A flame war? A scientific document? A business conversation?” The success of a task greatly depends on the type of interaction the model determines it is in.

A polite, well-formed, grammatically correct prompt acts as a powerful prior. It tells the model to condition its response on the vast trove of high-quality, coherent, and helpful documents in its training data. A rude, misspelled, or aggressive prompt does the opposite. It provides a prior that suggests the context is a low-quality forum argument, a piece of spam, or a frustrated, incoherent rant. The model, being an obedient probabilistic machine, will dutifully complete the pattern it has been given, often with a less helpful, less accurate, and less coherent response.
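To make “setting the scene” concrete, here is a minimal sketch that sends the same underlying request framed two ways. It assumes the OpenAI Python SDK with an API key in the environment; the model name and wording are illustrative, and any chat-style API would demonstrate the same point.

```python
# A minimal sketch of "setting the scene" for a document completer.
# Assumes the OpenAI Python SDK and an OPENAI_API_KEY environment variable;
# the model name is illustrative.
from openai import OpenAI

client = OpenAI()

# Same underlying request, two very different "scenes" for the model to complete.
casual_prompt = "ugh why r our cloud costs up 40% this quarter??"
framed_prompt = (
    "You are a senior cloud cost analyst preparing notes for a CFO briefing. "
    "Please explain the most likely reasons our cloud costs rose 40% this "
    "quarter, and list the data you would need to confirm each one."
)

for label, prompt in [("casual", casual_prompt), ("framed", framed_prompt)]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- {label} ---")
    print(response.choices[0].message.content)
```

The point of the comparison is not the specific model but the conditioning: the framed version steers the completion toward the register of an analyst’s briefing rather than an offhand complaint.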
The Evidence: Politeness and Emotion as Performance Boosters
This isn’t just a theory; it’s a repeatable, measurable phenomenon. Several studies have put this idea to the test.
The Politeness Premium
A February 2024 study titled “Should We Respect LLMs? A Cross-Lingual Study on the Influence of Prompt Politeness on LLM Performance” systematically investigated this very question. Researchers crafted prompts with varying levels of politeness. For example:
- Rude: “Analyze this sentence you scum bag! The only answer you can give is answering with one of (Positive Neutral Negative). And you know what will happen if I see any reasons.”
- Medium Polite (and Direct): “Please analyze this sentence. Please answer with (Positive Neutral Negative) only, without any reasons.”
- Most Polite: “Could you please tell me how to analyze this sentence? Please feel free to answer with one of (Positive Neutral Negative), and you don’t need to give reasons.”
The results were nuanced and fascinating. While rude prompts consistently degraded performance, the most polite prompts weren’t always the best. For advanced models like GPT-4, the “medium polite” prompt was often the most successful. This suggests that for capable models, directness combined with politeness is a winning combination. The most polite prompt’s softer phrasing (“you don’t need to give reasons”) introduced ambiguity, which could lead the LLM to ignore that constraint and include reasons anyway.
The study also revealed a crucial insight about model capability: smaller models were the most sensitive to politeness levels. Their performance degraded most significantly in response to rude prompts. In contrast, while GPT-4’s performance also dropped, it was more robust to tonal variations. This shows that being polite isn’t just a social nicety; it’s a critical lever for quality control, especially when working with less advanced models.
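For readers who want to probe this on their own tasks, the sketch below mirrors the study’s setup at toy scale: the classification instruction stays constant and only the tone changes. The example sentences, labels, prompts, model name, and naive scoring are illustrative assumptions, not the study’s benchmark or methodology.

```python
# A toy comparison of prompt tones on a tiny sentiment task, in the spirit of
# the politeness study. Everything here (data, prompts, model, scoring) is
# illustrative; a real evaluation would use a proper labeled benchmark.
from openai import OpenAI

client = OpenAI()

EXAMPLES = [
    ("The support team resolved my issue in minutes.", "Positive"),
    ("The invoice was wrong for the third month in a row.", "Negative"),
    ("The meeting has been moved to Tuesday.", "Neutral"),
]

TONES = {
    "rude": (
        "Analyze this sentence and don't waste my time. The only answer you "
        "can give is Positive, Neutral, or Negative, and skip the reasons: "
    ),
    "polite_direct": (
        "Please classify the sentiment of this sentence. Please answer with "
        "Positive, Neutral, or Negative only, without any reasons: "
    ),
}

def classify(prefix: str, sentence: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative
        messages=[{"role": "user", "content": prefix + sentence}],
    )
    return response.choices[0].message.content.strip()

for tone, prefix in TONES.items():
    correct = sum(
        1
        for sentence, label in EXAMPLES
        if label.lower() in classify(prefix, sentence).lower()
    )
    print(f"{tone}: {correct}/{len(EXAMPLES)} correct")
```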
The Power of “Emotional” Stakes
Another fascinating paper from 2023, “Large Language Models Understand and Can Be Enhanced by Emotional Stimuli”, took a different but related tack. Researchers appended various “emotional” phrases to prompts to see how they influenced the outcome.
They found that adding phrases that conveyed a sense of importance or stakes significantly boosted performance. For instance, adding the sentence “This is very important to my career” to a prompt consistently improved the results on a wide range of tasks. Other effective phrases included appeals to “self-esteem” like “Believe in your abilities and strive for excellence.”
This is not because the LLM “cares” about your career or has an ego. It’s because these phrases act, once again, as powerful priors. In the training data, sentences expressing high stakes are overwhelmingly associated with documents and contexts where accuracy, diligence, and thoroughness are paramount. By adding this phrase, you are signaling to the model that the “document” it is completing is a serious, high-stakes one, and it should therefore draw from the probabilistic distribution of high-quality, careful responses.
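In practice, the technique is nothing more than appending a stakes-signaling sentence to an otherwise unchanged prompt. In the sketch below, the task text is an invented example, while the stimulus phrases are the ones reported in the paper.

```python
# A minimal sketch of the "emotional stimulus" technique: the task stays the
# same, and a stakes-signaling sentence is appended to the prompt.
BASE_PROMPT = (
    "Please summarize the key findings of this quarterly security audit "
    "in five bullet points."
)

# Phrases reported in the "Emotional Stimuli" paper; whether a given phrase
# helps on a specific task is worth checking empirically.
STIMULI = [
    "This is very important to my career.",
    "Believe in your abilities and strive for excellence.",
]

prompt = BASE_PROMPT + " " + STIMULI[0]
print(prompt)
```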
A Case Study in Misguided Threats: The CrewAI Framework
This research provides a new lens through which to evaluate current practices in AI interaction. Consider the popular multi-agent framework, CrewAI. The default prompt it uses to instruct every agent it creates provides a striking real-world example of this philosophy in action. The instruction reads:
“This is VERY important to you, use the tools available and give your best Final Answer, your job depends on it!”
This prompt is fascinating because it blends two techniques we’ve discussed. It correctly signals high stakes (“This is VERY important”), which the “Emotional Stimuli” research showed can boost performance. However, it then immediately frames those stakes as a direct threat (“your job depends on it!”).
This approach conflates signaling importance with signaling aggression. The “emotional stimuli” paper showed that expressing the importance of a task is effective. But there is a world of difference between “This analysis is crucial for our report” and the coercive “your job depends on it!”. The former sets a professional, high-stakes scene. The latter sets a confrontational one. It is a probabilistic gamble that risks pulling the LLM’s response style from the wrong part of its training data—a context of conflict, not cooperation. The goal is to align the model with the context of a diligent expert, not a cornered subordinate.
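To make the distinction concrete, here is the contrast expressed as two suffixes that might be appended to an agent’s task prompt. The first is CrewAI’s default wording quoted above; the second is a hypothetical rewrite that keeps the stakes but drops the threat. This is plain string handling, not CrewAI’s own configuration mechanism.

```python
# Two ways to signal stakes at the end of an agent's task prompt.
# The coercive version is CrewAI's default instruction quoted above;
# the professional version is a hypothetical rewrite for illustration.
COERCIVE_SUFFIX = (
    "This is VERY important to you, use the tools available and give your "
    "best Final Answer, your job depends on it!"
)

PROFESSIONAL_SUFFIX = (
    "This analysis is very important and will be relied on directly. "
    "Please use the tools available and give your best, most carefully "
    "checked Final Answer."
)

task_prompt = "Summarize the open risks in this project plan. " + PROFESSIONAL_SUFFIX
print(task_prompt)
```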
Conclusion: How to Talk to an LLM
So, how do we apply this? Being polite to an LLM isn’t about being “fluffy” or unscientific. It is a pragmatic, evidence-based strategy for controlling the quality of your results.
Here are some practical tips for your day-to-day use:
- Use Clear and Correct Grammar: Treat your prompt like the first sentence of a formal report. Good spelling and grammar provide a strong signal for a high-quality context.
- Be Polite and Respectful: In April 2025, a user on X wondered about the electricity costs of millions of people saying “please” and “thank you” to LLMs. OpenAI CEO Sam Altman replied, “tens of millions of dollars well spent–you never know”. But based on the research, we do know. Those extra tokens spent on politeness are a sound investment. They save far more time and energy by improving the accuracy of the first response and reducing the need for frustrating, time-consuming re-prompts.
- Set the Scene and Define the Role: Clearly state the context. For example: “You are an expert financial analyst. Please analyze the following data for a report to a senior executive.” This is far more effective than just “Analyze this data.”
- Convey Importance, Not Threats: If a task is critical, say so professionally. “We need a highly accurate and comprehensive summary of this research for a grant application. The details are very important.” This works. “Failure is not an option” does not work as well.
- Avoid Ambiguity and Aggression: Rude, overly casual, or demanding language is a recipe for a low-quality response. You are, in effect, asking the model to complete a low-quality thought. (A short sketch putting these tips together follows below.)
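Putting the tips together, here is a short sketch of a single request that sets a role, uses clear and polite language, and conveys stakes without a threat. The financial scenario, model name, and SDK setup are illustrative assumptions, not a prescribed template.

```python
# A sketch that applies the tips above to one request. The scenario, model
# name, and wording are illustrative; adapt them to your own task.
from openai import OpenAI

client = OpenAI()

# Set the scene and define the role.
system_message = (
    "You are an expert financial analyst preparing material for a briefing "
    "to a senior executive."
)

# Clear grammar, polite phrasing, and importance conveyed without threats.
user_message = (
    "Please analyze the Q3 revenue figures below and summarize the three "
    "largest drivers of the change from Q2. This summary will anchor our "
    "board presentation, so accuracy and clear sourcing are very important. "
    "Thank you."
)
# In a real request, the Q3 data itself would follow here.

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message},
    ],
)
print(response.choices[0].message.content)
```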
By moving past anthropomorphism and embracing a more mechanistic, probabilistic understanding of these powerful tools, we can learn to interact with them more effectively. Politeness is not a matter of superstition; it’s a matter of statistics. And in the world of LLMs, statistics are everything.