We audited our AI Coach and found it drifting into generic advice. We need to bind every response to the Solomon Codex (canonical frameworks, step-locked sequences, “Father Voice”), refuse when a requested framework isn’t found, and gate flows so that prerequisites come first (e.g., Order of Dominion before wealth).
Questions:
- How would you hard-bind outputs to a framework registry (exact names/steps) and refuse on missing/invalid frameworks?
- What is the best way to enforce style/tone (“governor,” not “assistant”) and block banned phrases: prompting only, or a post-processor/classifier?
- What patterns work for Perspective Inversion (answering from Ancient Wisdom, not about it) without introducing hallucinations?
- How do you track a user’s current framework state across sessions and force correct sequencing?
- What eval harness would you use to measure doctrine compliance and catch drift (red-team cases, rubric scoring)?
- Any recommended policy/guardrail libraries or schema/grammar approaches you’ve used at scale?
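
To make the registry and sequencing questions concrete, here is roughly the shape we have in mind. It is a minimal sketch, not our implementation. Only “Order of Dominion” is a real name from our Codex. The wealth entry, the step labels, and every identifier (`Framework`, `REGISTRY`, `UserState`, `Refusal`, `resolve`) are placeholders made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Framework:
    name: str                        # exact canonical name, matched verbatim
    steps: tuple[str, ...]           # step-locked sequence, in order
    requires: tuple[str, ...] = ()   # frameworks that must be completed first


# Hypothetical registry contents; step labels are placeholders, not Codex text.
REGISTRY = {
    fw.name: fw
    for fw in (
        Framework("Order of Dominion", ("step-1", "step-2")),
        Framework("Wealth (placeholder)", ("step-1",),
                  requires=("Order of Dominion",)),
    )
}


@dataclass
class UserState:
    # In practice this would be loaded from and saved to a per-user store
    # so that it survives across sessions; here it lives in memory only.
    completed: set[str] = field(default_factory=set)


class Refusal(Exception):
    """Raised instead of letting the model improvise an answer."""


def resolve(name: str, step_index: int, state: UserState) -> str:
    """Return the exact registered step, or refuse."""
    fw = REGISTRY.get(name)
    if fw is None:
        # Exact-match lookup: no fuzzy matching, no fallback to generic advice.
        raise Refusal(f"unknown framework: {name!r}")
    missing = [r for r in fw.requires if r not in state.completed]
    if missing:
        raise Refusal(f"{name!r} is gated behind: {', '.join(missing)}")
    if not 0 <= step_index < len(fw.steps):
        raise Refusal(f"{name!r} has no step at index {step_index}")
    return fw.steps[step_index]
```

The idea is that the model only ever receives (or is validated against) what `resolve` returns, and any `Refusal` is turned into a fixed refusal message. Whether this check belongs before generation, after it, or both is part of what we are asking.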