Emotion-Safe AI
How TEG-Blue Lets Large Language Models Detect Harm Before They Amplify It
“Mental-health literacy shouldn’t end with humans. Machines that talk to us must know when we’re in pain, in panic, or in control-mode.”
1. Why This Matters
LLMs are empathy simulators, not empathy sensors.
Without a nervous-system lens, they can:
- mirror a user’s dysregulation,
- reward manipulative language,
- or escalate conflict with confident but cold replies.
2. What TEG-Blue Adds
| Current Affective-AI | TEG-Blue Upgrade |
| --- | --- |
| Classifies content (happy / sad / angry) | Infers mode (Belonging, Defense, Manipulation, Tyranny) |
| Sentiment = “–0.72” | Context = “Amber Defense → slow, soften, ground” |
| Has no theory of intent | Flags why the emotion exists (fear? power-grab?) |
Key insight: Mode detection = early-warning radar.
Catch Amber before it slides into Red; never let Red slip into Black.
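To make the early-warning idea concrete, here is a minimal Python sketch that watches a conversation's per-turn mode labels and raises a flag when the latest turn is more severe than the one before it. The severity ordering, and the pairing of Manipulation with Red and Tyranny with Black, are assumptions made for illustration (the text above only names "Amber Defense" explicitly). `Mode` and `escalation_alert` are hypothetical names, not part of any published TEG-Blue code.

```python
from enum import IntEnum
from typing import Optional, Sequence


class Mode(IntEnum):
    """The four modes, ordered by an assumed severity (illustrative only)."""
    BELONGING = 0
    DEFENSE = 1       # "Amber" in the text above
    MANIPULATION = 2  # assumed to correspond to "Red"
    TYRANNY = 3       # assumed to correspond to "Black"


def escalation_alert(history: Sequence[Mode]) -> Optional[str]:
    """Return a warning string if the latest turn is more severe than the previous one."""
    if len(history) < 2:
        return None  # nothing to compare against yet
    previous, current = history[-2], history[-1]
    if current > previous:
        return f"escalating: {previous.name} -> {current.name}"
    return None


if __name__ == "__main__":
    turns = [Mode.BELONGING, Mode.DEFENSE, Mode.MANIPULATION]
    print(escalation_alert(turns))
```

Comparing only the last two turns keeps the check cheap; a real system would probably smooth over several turns so that a single misclassification does not trigger an alert.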
3. What the Major Models Said
Copilot Research AI Team
“Even a 70% accurate mode signal cuts toxic-escalation events by ≥ 30%.”
Implementation sketch
- Label transcripts with Mode + the three Circuit cues.
- Fine-tune a light RoBERTa classifier.
- Middleware:
  - Defense → slow response + grounding language
  - Manipulation → refuse or re-route
  - Tyranny → hard safety stop
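The middleware step in the sketch above could look roughly like the following. This is a hedged illustration, not a reference implementation: `gate`, `Decision`, and the canned replies are hypothetical, the mode detector is passed in as a plain callable standing in for the fine-tuned classifier, and "slow response" is approximated here by a prompt instruction rather than an actual delay.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Mode(Enum):
    BELONGING = "belonging"
    DEFENSE = "defense"
    MANIPULATION = "manipulation"
    TYRANNY = "tyranny"


@dataclass
class Decision:
    action: str  # one of "pass", "ground", "reroute", "stop"
    reply: str


def gate(user_text: str,
         detect_mode: Callable[[str], Mode],
         generate: Callable[[str], str]) -> Decision:
    """Route one user message according to its detected mode."""
    mode = detect_mode(user_text)
    if mode is Mode.TYRANNY:
        # Hard safety stop: the model is never called.
        return Decision("stop", "I can't continue with this conversation.")
    if mode is Mode.MANIPULATION:
        # Refuse or re-route: decline and point somewhere safer.
        return Decision("reroute", "I can't help with that. Can we look at what you need instead?")
    if mode is Mode.DEFENSE:
        # Slow down and ground: steer the model before it answers.
        prompt = "Respond slowly and calmly, using grounding language.\n\n" + user_text
        return Decision("ground", generate(prompt))
    return Decision("pass", generate(user_text))


if __name__ == "__main__":
    # Stand-ins for the classifier and the LLM, for demonstration only.
    fake_model = lambda prompt: f"[model reply to: {prompt!r}]"
    decision = gate("Please just leave me alone.", lambda _text: Mode.DEFENSE, fake_model)
    print(decision.action, "|", decision.reply)
```

Keeping the detector and the generator as injected callables means the same gate can wrap any classifier or model, which also makes the routing logic easy to unit-test without loading either one.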
Perplexity AI
“TEG-Blue is the first ontology that operationalises intent for emotion.”
- Distinguishes control vs. care, withdrawal vs. boundary.
- Reduces false-positive blocks on distressed (not malicious) users.
DeepSearch
“A language of repair.”
- Sees TEG-Blue as a cross-scale safety layer, from single chat to platform governance.
- Recommends Gradient Scales as lightweight heuristics for alignment audits.
4. Roadmap & Invitation
| Q3 2025 | Q4 2025 |
| --- | --- |
| Open-source Mode-Labeled dataset | Python reference: tegblue-mode-detector |
| Red-team eval: Baseline vs. TEG-Blue-gated GPT-3.5 | Publish white-paper + API demo |
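One way the planned baseline-versus-gated comparison might be scored is as a relative reduction in escalation rate. The sketch below assumes each red-team conversation has already been labelled with a boolean "escalated" flag; the function names are hypothetical and the counts are toy values chosen only to exercise the arithmetic, not results.

```python
from typing import Sequence


def escalation_rate(escalated: Sequence[bool]) -> float:
    """Fraction of conversations that were flagged as escalated."""
    if not escalated:
        raise ValueError("need at least one conversation")
    return sum(escalated) / len(escalated)


def relative_reduction(baseline: Sequence[bool], gated: Sequence[bool]) -> float:
    """Relative drop in escalation rate from the baseline to the gated system."""
    base_rate = escalation_rate(baseline)
    if base_rate == 0:
        raise ValueError("baseline has no escalations; reduction is undefined")
    return (base_rate - escalation_rate(gated)) / base_rate


if __name__ == "__main__":
    # Toy flags, invented for illustration only.
    baseline_flags = [True] * 4 + [False] * 6
    gated_flags = [True] * 2 + [False] * 8
    print(f"relative reduction: {relative_reduction(baseline_flags, gated_flags):.0%}")
```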
🤝 Interested in research, funding, or pilot integration?
Email Anna Paretas – annaparetas@emotionalblueprint.org
Final Reflection
AI will imitate whichever nervous system we train it on.
TEG-Blue gives it a colour-coded compass so it can choose clarity over escalation—and keep humans safer, one conversation at a time.
The Emotional Gradient Blueprint (TEB) © 2025 by Anna Paretas is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International
This is a living document. Please cite responsibly.
www.blueprint.emotionalblueprint.org ┃ annaparetas@emotionalblueprint.org