
The three prompt patterns I actually use in production

2025.07.14 · 3 min read
Tags: Prompting, Production, Patterns
Revised 1 time — last on 2025.09.30
  • 2025.09.30: Removed the section about chain-of-thought prompting — by late 2025, the frontier models reason without being asked and the old "let's think step by step" incantation became noise.

After shipping AI in four companies where wrong answers cost real money, I've stopped reaching for clever prompt engineering and started reaching for three patterns. That's it. Three. Everything else is decoration.

These patterns are boring. They're also the ones that survive contact with production.

1. Structured output as the interface

Every time I'm tempted to "just parse the response," I remind myself that the model will produce JSON-adjacent-but-not-quite-JSON at least 0.5% of the time, and that's enough to break a pipeline in the middle of the night. So: a response that conforms to a strict schema (JSON with types), never free-form prose you have to parse heuristically. Always. Use the provider's native schema support — OpenAI's response_format, Anthropic's tool use, or whatever else.

The mental model shift is this: your prompt is not a conversation with the model. Your prompt is the body of an API call that happens to be written in English. Treat the response the same way you'd treat a REST response — as a typed payload, validated at the boundary, never parsed heuristically.
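Validating at the boundary can be sketched in a few lines. The schema, field names, and `parse_reply` helper below are illustrative, not from any provider SDK — the point is that malformed output fails loudly at the edge, not three stages downstream.

```python
import json

# Illustrative schema: field name -> expected Python type.
SCHEMA = {"category": str, "confidence": float}

def parse_reply(raw: str) -> dict:
    """Treat a model reply as a typed payload; raise on anything malformed."""
    payload = json.loads(raw)  # fails loudly on JSON-adjacent output
    for field, ftype in SCHEMA.items():
        if field not in payload:
            raise ValueError(f"missing field: {field}")
        if not isinstance(payload[field], ftype):
            raise ValueError(f"wrong type for {field}")
    return payload

# Well-formed replies pass through; "almost JSON" is rejected immediately.
ok = parse_reply('{"category": "billing", "confidence": 0.92}')
```

Pair this with the provider's schema-enforcement feature: the schema constrains what the model emits, and the boundary check catches the residual failures.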

2. Role priming with concrete negatives

"You are a helpful assistant" is not a system prompt. It's a wish.

What actually works: naming what the assistant is not supposed to do, with concrete examples. "You help developers write PostgreSQL queries. You do not suggest ORMs. You do not recommend migrating to MongoDB. If the user asks about NoSQL, redirect them to PostgreSQL's JSONB features." The negatives do more work than the positives.

I learned this the hard way at Rabobank, where a polite "helpful assistant" prompt kept confidently inventing API endpoints that didn't exist. Adding three lines of "do not hallucinate API paths; if you're not sure, say so" cut the hallucination rate to near zero.
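Assembled as code, a negatives-first system prompt looks like the sketch below. The wording and the `build_messages` helper are illustrative; the shape (system turn carrying the "do not" lines, user turn carrying the input) mirrors common chat-completion APIs.

```python
# The "do not" lines carry most of the spec.
SYSTEM_PROMPT = "\n".join([
    "You help developers write PostgreSQL queries.",
    "You do not suggest ORMs.",
    "You do not recommend migrating to MongoDB.",
    "You do not invent API paths; if you are not sure one exists, say so.",
    "If the user asks about NoSQL, redirect them to PostgreSQL's JSONB features.",
])

def build_messages(user_input: str) -> list[dict]:
    """Place the negatives in the system turn, the question in the user turn."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_input},
    ]
```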

3. Few-shot examples as the spec

The cheapest way to specify a task is to write two examples of input and expected output and paste them directly into the prompt. Not instructions. Two to five demonstrations beat a paragraph of rules every time. The model is a pattern-completion engine — feed it patterns, not specifications.

This is particularly powerful for classification and extraction tasks where the rubric is hard to articulate but easy to demonstrate. I can write "classify this customer email as 'billing', 'technical', or 'escalation'" in a prompt, but the model will drift. If I show it two examples of each category, the drift disappears.
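Assembling the demonstrations as prior turns can be sketched as below. The example emails, the labels, and the `few_shot_messages` helper are illustrative — the pattern is demonstrations first, then the real input in the same shape.

```python
# Two illustrative demonstrations per category.
EXAMPLES = [
    ("My invoice shows a double charge for May.", "billing"),
    ("Can I switch to annual payment?", "billing"),
    ("The API returns 500 on every POST since this morning.", "technical"),
    ("How do I rotate my API key?", "technical"),
    ("I've emailed twice with no reply. I want a manager.", "escalation"),
    ("This outage cost us a client. Who do I talk to?", "escalation"),
]

def few_shot_messages(email: str) -> list[dict]:
    """Demonstrations as user/assistant turns, then the real input last."""
    messages = [{
        "role": "system",
        "content": ("Classify the customer email as 'billing', 'technical', "
                    "or 'escalation'. Reply with the label only."),
    }]
    for text, label in EXAMPLES:
        messages.append({"role": "user", "content": text})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": email})
    return messages
```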

What I don't do anymore

I don't use temperature 0. I use temperature 0.3. Temperature is the model's randomness knob: 0 picks the single highest-probability token; higher values widen the distribution. Determinism is a lie at scale; what you want is consistent shape (hence structured output) with reasonable variance in word choice. 0.3 is a sane middle — enough variance to avoid robotic phrasing, not enough to hallucinate.
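Put together, the request shape I reach for looks roughly like this. The model name is a placeholder and the payload fields mirror common chat-completion APIs rather than any one SDK.

```python
def request_payload(messages: list[dict]) -> dict:
    """Illustrative request body: schema-constrained output at temperature 0.3."""
    return {
        "model": "example-model",                     # placeholder, not a real model name
        "messages": messages,
        "temperature": 0.3,                           # consistent shape, varied wording
        "response_format": {"type": "json_object"},   # structured output, pattern 1
    }
```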

I don't pre-write the entire response structure in the prompt. Overfit prompts break when the input shifts slightly. Write the minimum that gets you reliable output, then stop.

These three patterns cover maybe 80% of production work. The other 20% is boring engineering — retries, rate limits, caching, evals — which are not prompt problems.