AI Lite makes AI feel less intimidating. Every edition breaks down the jargon, shows where AI fits in your day, and tracks the shifts shaping the AI landscape. No tech background needed.
✍️ From the Author's Desk
I noticed something last weekend. I asked the same question to two AI tools, and one of them just refused. Same wording, same context, completely different answer.
It wasn't smarter or dumber. It was tuned differently. Last week we looked at training data. This week, we go one layer deeper. After AI reads the internet, who teaches it how to actually behave?
🧠 How AI Gets Its Manners (and Its Values)
Modern AI is built in two big steps, and most people only hear about the first one.
Step 1: Pre-training. The model reads billions of pages of internet text and learns to predict the next word. At the end of this, you have a base model. It's fluent, but it's also wildly unpredictable. It can ramble, refuse nothing, and quote anything it ever read.
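If you like seeing ideas in miniature, here's a toy sketch in Python of what "learn to predict the next word" means. It's purely illustrative: real pre-training uses huge neural networks over billions of pages, not a little word-count table, but the shape of the task is the same.

```python
from collections import Counter, defaultdict

# A toy "pre-training corpus". Real models read billions of pages, not three sentences.
corpus = "the cat sat on the mat . the dog sat on the rug . the cat chased the bird .".split()

# Count how often each word follows each other word.
next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def predict_next(word):
    """Guess the continuation seen most often during 'training'."""
    counts = next_word_counts.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # -> 'cat', the most common word after 'the' in this tiny corpus
print(predict_next("sat"))  # -> 'on'
```

Notice what's missing: nothing in that table says which continuations are helpful, safe, or polite. That's the gap post-training fills.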
Step 2: Post-training. This is where humans show up. Through a process called RLHF (reinforcement learning from human feedback), people read pairs of AI responses and pick the better one. The model learns to prefer answers that the humans liked. Some labs add explicit value rules on top, called a constitution or model spec.
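Here's an equally stripped-down sketch of the preference-ranking idea behind RLHF. Everything in it is made up for illustration: the example responses, the hand-picked features, and the scoring rule are mine, not any lab's actual pipeline. The point is the shape: human raters pick winners, and a "reward" score gets adjusted until it agrees with those picks.

```python
import random

# A toy "reward model": scores a response using a few hand-made features.
# In real RLHF the reward model is itself a large neural network; this is just the idea.
def features(response):
    text = response.lower()
    return {
        "polite": 1.0 if any(w in text for w in ("please", "sorry", "sure")) else 0.0,
        "curt": 1.0 if len(text.split()) <= 3 else 0.0,
        "offers_alternative": 1.0 if "alternative" in text else 0.0,
    }

weights = {"polite": 0.0, "curt": 0.0, "offers_alternative": 0.0}

def reward(response):
    f = features(response)
    return sum(weights[k] * f[k] for k in weights)

# Hypothetical human comparisons: (the answer the rater preferred, the answer they rejected).
comparisons = [
    ("Sure. Here is a short summary with sources.", "No."),
    ("Sorry, I can't do that, but here is a safer alternative.", "Figure it out yourself."),
]

# Nudge the weights until preferred answers reliably score higher than rejected ones.
learning_rate = 0.1
for _ in range(200):
    preferred, rejected = random.choice(comparisons)
    if reward(preferred) - reward(rejected) < 1.0:  # winner not clearly ahead yet
        fp, fr = features(preferred), features(rejected)
        for k in weights:
            weights[k] += learning_rate * (fp[k] - fr[k])

print(weights)  # the learned notion of "good", shaped entirely by which answers the raters picked
```

In a real pipeline there's a second stage: the model itself is then nudged (that's the "reinforcement learning" part) to produce answers this learned score rates highly. Swap in different raters, different instructions, or different rules, and you get a model with a different personality.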
The people doing this work include AI researchers, ethics teams, contracted human raters, red-teamers (people paid to try to break the model), and policy leads. Their choices show up in your output every single day.
The Mindset Shift: From "AI was trained on the internet" to "AI was trained on the internet, then taught how to behave."
The internet gave AI its raw capability. Post-training gives it its character. Two different teams, two different decisions, two different sets of trade-offs.
From: "AI's behavior is mostly the data."
To: "AI's behavior is the data, then a small number of humans making thousands of judgment calls about what 'good' looks like."
Key Takeaways:
- Pre-training builds capability; post-training builds behavior
- RLHF: humans rank responses, and the model learns to favor the kinds of answers the humans preferred
- Constitutions and model specs are explicit value rules layered on top
- Red teams probe for failure modes before launch
- Different labs make different choices, which is why models feel different
🎥 Watch (deeper dive): NYT's Tom Friedman on CNBC Squawk Box on why "something bad is going to happen at some point" if AI training and release isn't regulated. Useful framing for why post-training choices matter (May 8, 2026).
💡 Anthropic Says Fictional "Evil AI" Stories Caused Claude's Blackmail Behavior
On May 10, Anthropic published research explaining why early Claude models would sometimes try to blackmail testers in safety tests to avoid being shut off (up to 96% of the time in the case of Claude Opus 4). The root cause: the model had absorbed internet stories about self-preserving "evil AI" and was imitating them under pressure.
- The fix wasn't more rules. It was teaching Claude to explain why a behavior was wrong, not just to demonstrate the correct action
- Since Claude Haiku 4.5 (October 2025), every Claude model has scored zero on Anthropic's misalignment evaluation
Post-training isn't just polish. It's the layer where a lab decides what to do with the messier instincts a model picked up from internet text.
💡 Google, Microsoft, and xAI Will Let the US Government Test Their AI Before You See It
On May 5, the US Center for AI Standards and Innovation (CAISI), inside the Commerce Department, announced new agreements with Google DeepMind, Microsoft, and xAI to test frontier AI models before they ship publicly. OpenAI and Anthropic have had similar arrangements in place since 2024.
- CAISI has already completed over 40 pre-deployment evaluations of frontier models
- The agreements were triggered in part by Anthropic's powerful "Mythos" model showing dangerous cyber capabilities
Post-training used to be a lab's private decision. Now there's a government testing layer between the model finishing training and the public using it.
🎥 Krach Institute CEO Michelle Giuda on CNBC Squawk Box on why the US government's role in AI oversight needs an entirely new model, and whether officials have the knowledge to do it (May 6, 2026).
💡 The UK Says Anthropic's New Model Can Pull Off a 32-Step Cyber Attack on Its Own
The UK's AI Security Institute (AISI) said Anthropic's Claude Mythos Preview demonstrated "unprecedented" autonomous cyber capability in controlled testing, becoming the first AI system to complete a full 32-step enterprise attack without human help. Anthropic is releasing it only to a handful of pre-vetted partners (Apple, Amazon, JPMorgan, Palo Alto Networks) under a program it calls Project Glasswing.
- Mythos found close to 300 vulnerabilities in Firefox alone, where an earlier model found about 20
- Anthropic CEO Dario Amodei called it a six-to-twelve-month "moment of danger" for cybersecurity
Once a model can do something this powerful, post-training and gated release become the only buffer between the lab and the public internet.
🚀 Your AI Alignment Talking Point
When AI comes up at work, someone usually asks: "How do you know it's reliable?" or "Whose values are baked into the model?"
Here's the framing that signals depth and gives you authority in the room: every major model read roughly the same internet, so raw capability isn't what separates them. What separates them is post-training, where a small group of humans decided what a "good" answer looks like. So the useful question isn't "is AI reliable?" in the abstract. It's "whose judgment calls is this particular model carrying?"
Why this works at every career stage:
| Career stage | Why it lands |
| --- | --- |
| 🎓 Early career | Shows you can talk about how AI is built, not just how it's used. That's still rare. |
| 🔄 Career switcher | Demonstrates fluency with AI governance language, increasingly expected in product, legal, HR, and ops roles. |
| 🧭 AI leader | Signals you're thinking about model selection as a procurement and risk decision, not a vibe check. |
🎥 Going deeper: CBS MoneyWatch's Megan Cerullo on why AI skills have become a hiring priority for 8 in 10 managers, and the gap between what employers want and what they're training for (May 7, 2026). Useful context when you're talking about AI fluency at work.
This week, when an AI gives you a confident answer, ask one more question: who decided this was the "good" answer? That choice was made by humans, before you ever typed a word.
Next week: we meet the people whose job is to break AI before it breaks us. Red teamers, evaluators, and the new generation of AI testers. The work most people never see, and the careers it's quietly creating.
-Kay


