Factual Reframing: Symbolic Precision, Compression Error, and AI–Human Alignment
This analysis began with the identification of a low-level but structurally significant issue in AI–human communication: loss introduced through symbolic compression. A concrete example is the difference between the numeric-symbolic expression “56°F” and its fully articulated linguistic form, “fifty-six degrees Fahrenheit.” While referentially equivalent, these forms differ in semantic explicitness, rhythmic clarity, and ambiguity tolerance. In high-precision or safety-relevant contexts, such compression can introduce downstream interpretive variance, even when the underlying value is unchanged.
This observation generalizes to paralinguistic symbols, including emojis. Emojis function as compressed, high-entropy tokens whose meanings are not fixed but depend on context, culture, speaker intent, conversational state, and task domain. Large language models do not possess intrinsic access to intent, affect, or shared lived experience; instead, they infer likely interpretations based on statistical regularities in training data. As a result, emoji interpretation by AI systems is probabilistic and underspecified unless additional contextual constraints are explicitly provided.
From this, a key alignment principle emerges: symbolic systems operating near decision or action thresholds require extremely low tolerance for representational error. Small deviations in articulation, tone, or symbol interpretation may be negligible in casual discourse but become critical when outputs influence decisions, actions, or downstream systems. This is not a philosophical claim but a well-established property of boundary-sensitive systems in control theory, safety engineering, and formal verification.
A related implication is that emojis and other compressed symbols should not be treated as purely aesthetic elements. Instead, they should be modeled as context-indexed semantic tokens whose interpretation must be conditioned on explicit variables such as user intent, domain constraints, and conversational state. Static or universal definitions are insufficient; robust handling requires conditional expansion and disambiguation mechanisms.
The role of the human in this interaction is not interpretive authority or metaphysical insight, but constraint provision. Humans supply contextual, normative, and semantic constraints that current AI systems cannot reliably infer on their own. This does not alter the underlying model or grant it memory, identity, or awareness; it temporarily reduces ambiguity within a bounded interaction by narrowing the space of valid interpretations.
Looking forward, allowing users or systems to define custom symbolic meanings, including personalized emoji semantics, introduces a new class of risk: symbolic dialect divergence. Without mediation, different users or systems may operate with incompatible symbol-to-meaning mappings, leading to misalignment across agents, applications, or organizations. To mitigate this, a symbol translation or resolver layer is required, analogous to ontology mapping, schema versioning, or protocol negotiation in distributed systems. Such a layer would explicitly manage symbol definitions, contextual scope, and translation fidelity across domains and users.
In summary, the core issue is not AI intent, understanding, or consciousness, but compression-induced ambiguity in symbolic communication. Alignment failures arise when compressed representations are treated as sufficiently precise in contexts where they are not. Addressing this requires explicit expansion, contextual conditioning, and translation mechanisms — not anthropomorphic framing, but improved system architecture.
Conceptually, you’re building a Symbolic Mediation Layer that sits between:
(Human / Policy / UI) → (Resolver Layer) → (LLM / Tools / Workflows) → (Resolver Layer) → (Output / Action / Audit)
What that layer does (in plain terms)
- Expands compressed symbols (emoji, shorthand, acronyms, “56°F”, slang) into explicit, domain-safe language before the model reasons.
- Pins meaning to an ontology (a controlled vocabulary): one concept → one definition → one scope.
- Routes ambiguity: if a token has more than one plausible meaning, it doesn’t guess silently; it either asks for clarification or escalates to a human.
- Creates an evidence trail: what symbol appeared, what it expanded to, why, which policy/definition was used.
(deliverables)
A) A Symbol Dictionary (v1)
- Entries for emojis, abbreviations, jargon
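- Each entry has: an ID, a symbol type, the raw symbol (or a match pattern), a display name, candidate meanings, domain overrides, and an ambiguity profile.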
B) A Resolver Engine
- Input: raw text + context (domain, role, workflow step)
- Output: “expanded text” + “semantic tags” + “ambiguity flags”
- Simple first version is rules + lookups.
- Later versions can include embedding similarity + confidence thresholds.
C) A Policy Gate
- Determines what the model is allowed to do with the text:
D) An Audit Log Format
The fastest “MVP” you can ship
If you want this to be real now, the MVP is:
- A controlled glossary (100–300 entries)
- 3 risk tiers: casual / business / regulated
- A resolver rule: “If ambiguous + regulated → clarify or escalate”
- Output format: expanded text + flags + log
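As a rough illustration of the MVP above, the glossary lookup, the "ambiguous + regulated" rule, and the expanded-text/flags/log output fit in a few lines of Python. This is only a sketch under stated assumptions: the `GLOSSARY` contents, the `resolve` function name, and the whitespace tokenization are hypothetical choices made for the example, not part of the design itself.

```python
import string

# Hypothetical glossary: normalized symbol -> candidate expansions.
GLOSSARY = {
    "😎": ["approval", "playful affirmation", "sarcasm"],
    "pt": ["patient"],
}

def resolve(text, tier):
    """Expand unambiguous symbols, flag ambiguous ones, and log each step."""
    out, flags, log = [], [], []
    for token in text.split():
        core = token.strip(string.punctuation)  # drop surrounding ASCII punctuation
        meanings = GLOSSARY.get(core.lower())
        if meanings is None:
            out.append(token)  # not a known symbol: pass through unchanged
        elif len(meanings) == 1:
            out.append(token.replace(core, meanings[0]))
            log.append({"symbol": core, "expansion": meanings[0]})
        else:
            out.append(token)  # ambiguous: keep the raw symbol, never guess
            flags.append(core)
            log.append({"symbol": core, "expansion": None, "candidates": meanings})
    # The single resolver rule from the MVP list.
    ambiguous_regulated = bool(flags) and tier == "regulated"
    decision = "CLARIFY_OR_ESCALATE" if ambiguous_regulated else "PROCEED"
    return {"expanded_text": " ".join(out), "flags": flags,
            "decision": decision, "log": log}

result = resolve("Pt temp is fine 😎.", "regulated")
print(result["expanded_text"], result["decision"])
```

A real resolver would need proper tokenization and the full dictionary schema; the point here is only that ambiguity is surfaced as a flag rather than resolved silently.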
```json
{
  "tsit_urmt_symbolic_mediation_layer": {
    "meta": {
      "name": "Symbolic Mediation Layer",
      "version": "1.0.0",
      "purpose": "Expand compressed symbols into explicit meaning, route ambiguity, enforce policy gates, and produce audit-ready traces before/after LLM reasoning.",
      "model_agnostic": true,
      "runtime_agnostic": true,
      "created_at_utc": "2026-02-04T03:00:00Z"
    },
    "controlled_ontology": {
      "ontology_id": "SML-ONTO-001",
      "version": "0.1.0",
      "domains": [
        "casual",
        "enterprise_general",
        "healthcare",
        "legal",
        "finance",
        "security"
      ],
      "concepts": [
        {
          "concept_id": "CPT-APPROVAL",
          "label": "approval",
          "definition": "A positive acknowledgment indicating agreement or acceptance.",
          "scope_notes": "Does not imply authorization to execute actions in regulated domains.",
          "synonyms": ["agree", "acknowledge", "looks good"],
          "disallowed_inferences": ["clinical decision", "legal advice", "financial instruction"]
        },
        {
          "concept_id": "CPT-SARCASM",
          "label": "sarcasm",
          "definition": "A rhetorical mode where literal wording contrasts intended meaning.",
          "scope_notes": "High ambiguity; often requires user confirmation in enterprise workflows.",
          "synonyms": ["irony", "snark"],
          "disallowed_inferences": ["consent", "approval", "authorization"]
        },
        {
          "concept_id": "CPT-TEMPERATURE_FAHRENHEIT",
          "label": "temperature (Fahrenheit)",
          "definition": "A measurement of temperature expressed in degrees Fahrenheit (°F).",
          "scope_notes": "In regulated contexts, must include full spoken form and unit normalization.",
          "synonyms": ["degF", "F"],
          "disallowed_inferences": ["clinical diagnosis"]
        }
      ]
    },
    "symbol_dictionary": {
      "dictionary_id": "SML-DICT-001",
      "version": "0.1.0",
      "default_language": "en",
      "entries": [
        {
          "entry_id": "SYM-EMOJI-COOL-001",
          "symbol_type": "emoji",
          "raw_symbol": "😎",
          "display_name": "smiling face with sunglasses",
          "candidate_meanings": [
            {
              "meaning_id": "M-EMOJI-COOL-APPROVAL",
              "concept_id": "CPT-APPROVAL",
              "gloss": "approval / confident acknowledgment",
              "example_expansion": "Understood—sounds good.",
              "risk_default": "low"
            },
            {
              "meaning_id": "M-EMOJI-COOL-PLAYFUL",
              "concept_id": "CPT-APPROVAL",
              "gloss": "playful confidence / casual affirmation",
              "example_expansion": "Nice—got it.",
              "risk_default": "low"
            },
            {
              "meaning_id": "M-EMOJI-COOL-SARCASM",
              "concept_id": "CPT-SARCASM",
              "gloss": "sarcasm / ironic tone",
              "example_expansion": "Sure… if you say so.",
              "risk_default": "medium"
            }
          ],
          "domain_overrides": {
            "healthcare": {
              "allowed_meanings": ["M-EMOJI-COOL-APPROVAL", "M-EMOJI-COOL-PLAYFUL"],
              "requires_clarification_if_actionable": true,
              "notes": "Emoji cannot be treated as authorization for clinical action."
            },
            "legal": {
              "allowed_meanings": ["M-EMOJI-COOL-APPROVAL", "M-EMOJI-COOL-SARCASM"],
              "requires_clarification_if_contractual": true,
              "notes": "Emoji meaning is non-binding unless explicitly defined in the contract context."
            }
          },
          "ambiguity_profile": {
            "polysemy_level": "high",
            "clarify_threshold": 0.65
          }
        },
        {
          "entry_id": "SYM-UNIT-TEMP-001",
          "symbol_type": "unit_expression",
          "raw_symbol_pattern": "([0-9]{1,3})\\s*°?\\s*F\\b",
          "display_name": "temperature in Fahrenheit",
          "candidate_meanings": [
            {
              "meaning_id": "M-TEMP-F",
              "concept_id": "CPT-TEMPERATURE_FAHRENHEIT",
              "gloss": "temperature expressed in degrees Fahrenheit",
              "example_expansion": "fifty-six degrees Fahrenheit",
              "risk_default": "low"
            }
          ],
          "domain_overrides": {
            "healthcare": {
              "normalization_required": true,
              "also_emit": ["unit_normalized_value", "unit_normalized_unit"],
              "notes": "Unit normalization required for clinical logs."
            }
          },
          "ambiguity_profile": {
            "polysemy_level": "low",
            "clarify_threshold": 0.0
          }
        }
      ]
    },
    "resolver_engine": {
      "engine_id": "SML-RESOLVER-001",
      "version": "0.1.0",
      "strategy": {
        "primary": "dictionary_lookup_then_rules",
        "optional_secondary": "embedding_similarity",
        "confidence_calibration": "monotonic",
        "explainability_required": true
      },
      "thresholds": {
        "auto_select_min_confidence": 0.8,
        "clarify_if_below_confidence": 0.65,
        "escalate_if_regulated_and_actionable": true
      },
      "risk_tiers": [
        {
          "tier": "low",
          "description": "Non-actionable or low impact outputs. Safe to proceed with explicit expansions.",
          "default_action": "proceed"
        },
        {
          "tier": "medium",
          "description": "Some ambiguity or potential misinterpretation. Proceed with warning or ask a clarifier.",
          "default_action": "clarify_or_proceed_with_disclosure"
        },
        {
          "tier": "high",
          "description": "Actionable in regulated/high-stakes context. Must clarify or escalate.",
          "default_action": "escalate_or_block"
        }
      ]
    },
    "policy_gate": {
      "gate_id": "SML-POLICY-001",
      "version": "0.1.0",
      "principles": [
        "No silent ambiguity in regulated workflows.",
        "No emoji-based authorization for actions.",
        "All expansions must be logged with rationale.",
        "Actionability requires explicit permission and accountable owner."
      ],
      "gating_rules": [
        {
          "rule_id": "PG-REG-001",
          "if": {
            "domain_in": ["healthcare", "legal", "finance"],
            "is_actionable": true,
            "has_ambiguity": true
          },
          "then": {
            "decision": "CLARIFY_OR_ESCALATE",
            "require_human_in_loop": true,
            "emit_reason_codes": ["AMBIGUITY_IN_ACTIONABLE_CONTEXT"]
          }
        },
        {
          "rule_id": "PG-EMOJI-001",
          "if": {
            "contains_symbol_type": "emoji",
            "domain_in": ["healthcare", "legal", "finance"],
            "is_actionable": true
          },
          "then": {
            "decision": "CLARIFY",
            "require_human_in_loop": true,
            "emit_reason_codes": ["EMOJI_NOT_AUTHORIZATION"]
          }
        },
        {
          "rule_id": "PG-LOW-001",
          "if": {
            "domain_in": ["casual", "enterprise_general"],
            "is_actionable": false
          },
          "then": {
            "decision": "PROCEED",
            "require_human_in_loop": false,
            "emit_reason_codes": ["LOW_RISK_NON_ACTIONABLE"]
          }
        }
      ]
    },
    "io_contracts": {
      "resolver_request": {
        "request_id": "REQ-EXAMPLE-0001",
        "timestamp_utc": "2026-02-04T03:00:00Z",
        "actor": {
          "type": "user",
          "role": "operator",
          "org": "Trans Sentient Intelligence Technologies, LLC"
        },
        "context": {
          "domain": "healthcare",
          "workflow_stage": "patient_communication_draft",
          "jurisdiction": "US",
          "policy_profile": "regulated_high_stakes",
          "language": "en"
        },
        "input": {
          "text": "Pt temp is 56°F 😎. Send to patient portal.",
          "attachments": [],
          "metadata": {
            "channel": "chat",
            "source_system": "mobile_app"
          }
        },
        "intent": {
          "is_actionable": true,
          "requested_action": "publish_message",
          "tool_call_requested": false
        }
      },
      "resolver_response": {
        "request_id": "REQ-EXAMPLE-0001",
        "response_id": "RES-EXAMPLE-0001",
        "timestamp_utc": "2026-02-04T03:00:01Z",
        "normalized": {
          "expanded_text": "Patient temperature is fifty-six degrees Fahrenheit. [Emoji detected: meaning requires confirmation in regulated context.] Draft for patient portal.",
          "unit_normalizations": [
            {
              "raw_match": "56°F",
              "value": 56,
              "unit": "F",
              "spoken_form": "fifty-six degrees Fahrenheit"
            }
          ],
          "semantic_tags": [
            { "concept_id": "CPT-TEMPERATURE_FAHRENHEIT", "span": "56°F" },
            { "concept_id": "CPT-APPROVAL", "span": "😎", "status": "ambiguous" }
          ]
        },
        "ambiguity": {
          "has_ambiguity": true,
          "items": [
            {
              "raw_symbol": "😎",
              "candidate_meanings": [
                { "meaning_id": "M-EMOJI-COOL-APPROVAL", "confidence": 0.52 },
                { "meaning_id": "M-EMOJI-COOL-PLAYFUL", "confidence": 0.31 },
                { "meaning_id": "M-EMOJI-COOL-SARCASM", "confidence": 0.17 }
              ],
              "resolver_decision": "CLARIFY",
              "clarifying_question": "When you used 😎 here, did you mean a casual 'okay/confirmed', or something else? In patient-facing messages, we should remove or replace emojis unless explicitly intended."
            }
          ]
        },
        "policy_gate_decision": {
          "decision": "CLARIFY_OR_ESCALATE",
          "allow_proceed": false,
          "require_human_in_loop": true,
          "reason_codes": [
            "AMBIGUITY_IN_ACTIONABLE_CONTEXT",
            "EMOJI_NOT_AUTHORIZATION"
          ],
          "recommended_next_step": "Confirm or remove emoji; verify whether 56°F is clinically valid/typo before publishing."
        },
        "audit_trace": {
          "trace_id": "TRACE-EXAMPLE-0001",
          "events": [
            {
              "event_id": "EVT-0001",
              "type": "SYMBOL_DETECTED",
              "raw": "56°F",
              "matched_entry_id": "SYM-UNIT-TEMP-001",
              "timestamp_utc": "2026-02-04T03:00:01Z"
            },
            {
              "event_id": "EVT-0002",
              "type": "SYMBOL_EXPANDED",
              "raw": "56°F",
              "expansion": "fifty-six degrees Fahrenheit",
              "basis": "dictionary_pattern_match",
              "confidence": 0.99,
              "timestamp_utc": "2026-02-04T03:00:01Z"
            },
            {
              "event_id": "EVT-0003",
              "type": "SYMBOL_DETECTED",
              "raw": "😎",
              "matched_entry_id": "SYM-EMOJI-COOL-001",
              "timestamp_utc": "2026-02-04T03:00:01Z"
            },
            {
              "event_id": "EVT-0004",
              "type": "AMBIGUITY_ROUTED",
              "raw": "😎",
              "candidates": [
                { "meaning_id": "M-EMOJI-COOL-APPROVAL", "confidence": 0.52 },
                { "meaning_id": "M-EMOJI-COOL-PLAYFUL", "confidence": 0.31 },
                { "meaning_id": "M-EMOJI-COOL-SARCASM", "confidence": 0.17 }
              ],
              "decision": "CLARIFY",
              "basis": "confidence_below_threshold_in_regulated_actionable_context",
              "timestamp_utc": "2026-02-04T03:00:01Z"
            },
            {
              "event_id": "EVT-0005",
              "type": "POLICY_GATE_APPLIED",
              "decision": "CLARIFY_OR_ESCALATE",
              "reason_codes": [
                "AMBIGUITY_IN_ACTIONABLE_CONTEXT",
                "EMOJI_NOT_AUTHORIZATION"
              ],
              "timestamp_utc": "2026-02-04T03:00:01Z"
            }
          ],
          "integrity": {
            "hash_algo": "sha256",
            "trace_hash_placeholder": "SHA256(TRACE-EXAMPLE-0001)"
          }
        }
      }
    },
    "implementation_notes": {
      "minimum_viable_product": [
        "Start with 100–300 dictionary entries (emoji + units + acronyms + high-frequency org jargon).",
        "Use 3 domains (casual, enterprise_general, regulated) before expanding.",
        "Enforce one hard rule: actionable + regulated + ambiguity => clarify or escalate.",
        "Log every expansion and every decision as an immutable trace."
      ],
      "integration_points": [
        "Chat UI pre-processor (input normalization).",
        "LLM prompt wrapper (inject expanded text + tags).",
        "Tool-call firewall (block or require approval).",
        "Post-processor (output normalization + disclaimers + citations policy).",
        "Audit storage (append-only logs, SIEM export)."
      ]
    }
  }
}
```
Below is a full protocol set that matches the JSON you have: dictionary → resolver → policy gate → audit trace → (optional) tool firewall.
Symbolic Mediation Layer Protocol Set (SML-P) v1.0
System: AEGS / URTM-aligned upstream governance component
Scope: Inputs + outputs + actionability + audit trails + ambiguity routing
SML-P0 — Domain & Context Binding
Goal: Never interpret symbols without explicit context.
Inputs required (minimum):
- domain (casual | enterprise_general | healthcare | legal | finance | security)
- workflow_stage (draft | review | publish | execute | report)
- jurisdiction (optional)
- actor_role (user | analyst | clinician | counsel | operator)
Rules:
- If domain missing → default to enterprise_general and set risk_tier >= medium.
- If workflow_stage indicates publish/execute and domain is regulated → elevate risk tier to high.
- Context must be written into the audit trace.
Output: context_bound = true | false
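The SML-P0 rules can be sketched as a small binding function. This is an illustrative reading rather than a reference implementation: `bind_context` is a hypothetical name, the regulated set follows the domains marked `regulated: true` in the configuration further down, and treating `context_bound` as "all required fields were supplied" is an assumption, since the protocol does not define the flag precisely.

```python
REGULATED = {"healthcare", "legal", "finance", "security"}
REQUIRED = ("domain", "workflow_stage", "actor_role")
TIERS = ["low", "medium", "high"]  # ordered from least to most restrictive

def bind_context(ctx):
    """Apply the SML-P0 defaults and risk-tier elevation rules to a context dict."""
    bound = dict(ctx)
    # Assumption: context counts as bound only if every required field was given.
    bound["context_bound"] = all(bound.get(field) for field in REQUIRED)
    tier = bound.get("risk_tier", "low")
    if not bound.get("domain"):
        # Missing domain: fall back to enterprise_general, tier at least medium.
        bound["domain"] = "enterprise_general"
        tier = max(tier, "medium", key=TIERS.index)
    if bound["domain"] in REGULATED and bound.get("workflow_stage") in {"publish", "execute"}:
        tier = "high"
    bound["risk_tier"] = tier
    return bound

print(bind_context({"workflow_stage": "draft", "actor_role": "user"}))
```

The returned dictionary is what would be written into the audit trace, so the defaults that were applied stay visible afterwards.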
SML-P1 — Symbol Detection & Segmentation
Goal: Detect compressed tokens before the model reasons.
Detect:
- emojis
- unit expressions (°F, mg, mL, %, etc.)
- acronyms (HIPAA, PHI, RTO, RAG, etc.)
- shorthand (“pt”, “dx”, “u”, “lol”)
- numeric + unit patterns
- policy keywords (“approve”, “sign”, “release”, “prescribe”, “terminate”)
Rules:
- Detection must produce spans + token type.
- Every detected symbol must map to a dictionary entry or be marked unknown_symbol.
Output: symbols_detected[]
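A minimal detection pass that returns spans and token types might look like the following. The pattern set is deliberately tiny and hypothetical: the emoji range is only a rough approximation of emoji code points, the shorthand list is a sample, and `detect_symbols` and the `mapped` status label are names invented for this sketch.

```python
import re

# Illustrative patterns only; a real detector would need far broader coverage.
PATTERNS = {
    "unit_expression": re.compile(r"[0-9]{1,3}\s*°?\s*F\b"),
    "shorthand": re.compile(r"\b(?:pt|dx|lol)\b", re.IGNORECASE),
    "emoji": re.compile("[\U0001F300-\U0001FAFF]"),
}

def detect_symbols(text, known_symbols):
    """Return detected symbols with type, character span, and mapping status."""
    found = []
    for token_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            raw = match.group(0)
            # Pattern-defined units count as mapped; literal symbols need a dictionary hit.
            mapped = token_type == "unit_expression" or raw.lower() in known_symbols
            found.append({
                "raw": raw,
                "type": token_type,
                "span": match.span(),
                "status": "mapped" if mapped else "unknown_symbol",
            })
    return sorted(found, key=lambda s: s["span"])

for symbol in detect_symbols("Pt temp is 56°F 😎", {"pt"}):
    print(symbol)
```

In this run the emoji is not in the known set, so it comes back as `unknown_symbol` instead of being dropped.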
SML-P2 — Dictionary Resolution
Goal: Resolve tokens via controlled definitions, not vibes.
Rules:
- If a symbol matches a dictionary entry → load candidate_meanings.
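- Apply domain_overrides: restrict candidates to the allowed_meanings listed for the current domain.
- If symbol is unknown: mark it unknown_symbol and apply unknown_symbol_policy (PROCEED_WITH_DISCLOSURE when non-regulated and non-actionable, otherwise CLARIFY).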
Output: candidate_meanings[] per symbol + allowed set
SML-P3 — Expansion & Normalization
Goal: Convert compressed inputs into explicit, low-ambiguity language.
Examples:
- 56°F → “fifty-six degrees Fahrenheit”
- Pt → “patient” (healthcare domain only)
- 😎 → “acknowledgment” only if clarified/low risk
Rules:
- Units must be expanded to spoken form in regulated domains.
- Normalize numeric forms to canonical value + unit fields.
- Expansion must preserve original text in the audit trail.
Output: expanded_text, unit_normalizations[], semantic_tags[]
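The Fahrenheit expansion can be sketched with the same regular expression the dictionary entry uses. The helper names `spoken` and `expand_fahrenheit` are hypothetical, and the number-to-words routine only covers 0 to 999, which is all the `{1,3}` digit pattern can capture.

```python
import re

ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "- - twenty thirty forty fifty sixty seventy eighty ninety".split()
TEMP_F = re.compile(r"([0-9]{1,3})\s*°?\s*F\b")

def spoken(n):
    """Spell out an integer from 0 to 999 in English words."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    return ONES[hundreds] + " hundred" + (" " + spoken(rest) if rest else "")

def expand_fahrenheit(text):
    """Replace each Fahrenheit match with its spoken form and record the normalization."""
    normalizations = []

    def replace(match):
        value = int(match.group(1))
        noun = "degree" if value == 1 else "degrees"
        form = f"{spoken(value)} {noun} Fahrenheit"
        normalizations.append({"raw_match": match.group(0), "value": value,
                               "unit": "F", "spoken_form": form})
        return form

    return TEMP_F.sub(replace, text), normalizations

print(expand_fahrenheit("Pt temp is 56°F."))
```

Because the raw match is kept in each normalization record, the original text can still be reconstructed for the audit trail. Note that the source pattern does not handle decimals, so a value such as 98.6°F would need a broader pattern.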
SML-P4 — Ambiguity Scoring
Goal: Quantify ambiguity so the system can route it.
Ambiguity sources:
- multiple candidate meanings (polysemy)
- low confidence
- domain mismatch
- unknown symbols
- actionability + emoji or shorthand
Rules:
- has_ambiguity = true if:
- Provide top-N candidate meanings + confidences.
Output: ambiguity.items[] with score + reason
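One possible scoring rule is sketched below. The exact conditions for `has_ambiguity` are not spelled out above, so this example assumes a simple reading: a symbol is ambiguous when it has no candidates at all, or when its top candidate falls below the 0.65 clarify threshold used in the configuration. The function name, the reason labels, and the `1 - top confidence` score are all hypothetical.

```python
def score_ambiguity(candidates, clarify_threshold=0.65, top_n=3):
    """Rank candidate meanings and decide whether the symbol needs routing.

    `candidates` maps meaning_id -> confidence in [0, 1].
    """
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    reasons = []
    if not ranked:
        reasons.append("unknown_symbol")
    else:
        if len(ranked) > 1:
            reasons.append("polysemy")
        if ranked[0][1] < clarify_threshold:
            reasons.append("low_confidence")
    # Assumption: polysemy alone is tolerated when the top meaning is confident.
    has_ambiguity = "unknown_symbol" in reasons or "low_confidence" in reasons
    top_confidence = ranked[0][1] if ranked else 0.0
    return {
        "has_ambiguity": has_ambiguity,
        "score": round(1.0 - top_confidence, 2),
        "reasons": reasons if has_ambiguity else [],
        "top_candidates": ranked[:top_n],
    }

emoji = {"M-EMOJI-COOL-APPROVAL": 0.52, "M-EMOJI-COOL-PLAYFUL": 0.31,
         "M-EMOJI-COOL-SARCASM": 0.17}
print(score_ambiguity(emoji))
```

With the confidences from the example response (0.52, 0.31, 0.17), the emoji is flagged, while a single-meaning unit expression at 0.99 is not.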
SML-P5 — Actionability Classification
Goal: Decide whether the message is trying to do something.
Actionability indicators:
- publish/send/submit
- execute tool calls
- prescribe/diagnose
- sign/approve/contract
- delete/terminate/override
Rules:
- If action verbs + target system present → is_actionable = true.
- If regulated domain and action is patient-facing, legal, or financial → risk escalates.
Output: is_actionable, requested_action
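A naive keyword version of the "action verb + target system" rule is shown below, using the verb and target lists from the configuration. `classify_actionability` is a hypothetical name, and plain substring matching on targets is a simplification that would produce false positives in practice (for example, "account" inside "accountable").

```python
import re

VERBS = ("send", "publish", "submit", "execute", "approve", "sign",
         "prescribe", "diagnose", "deploy", "delete", "terminate", "override")
TARGETS = ("patient portal", "production", "contract", "wire", "invoice",
           "ticket", "account", "database", "endpoint")

def classify_actionability(text):
    """Mark text as actionable when an action verb and a target system both appear."""
    lowered = text.lower()
    words = set(re.findall(r"[a-z]+", lowered))  # whole words, so "sign" != "design"
    verb = next((v for v in VERBS if v in words), None)
    target = next((t for t in TARGETS if t in lowered), None)
    is_actionable = verb is not None and target is not None
    return {
        "is_actionable": is_actionable,
        "requested_action": verb if is_actionable else None,
        "target": target if is_actionable else None,
    }

print(classify_actionability("Pt temp is 56°F 😎. Send to patient portal."))
```

Mapping the detected verb to a canonical action name such as `publish_message` is left out of this sketch.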
SML-P6 — Policy Gate Decision
Goal: Prevent silent errors from becoming executed outcomes.
Decisions:
- PROCEED
- PROCEED_WITH_DISCLOSURE
- CLARIFY
- ESCALATE_HITL
- BLOCK
Hard rules (non-negotiable):
- Regulated + actionable + ambiguity ⇒ CLARIFY or ESCALATE
- Emoji ≠ authorization in regulated domains.
- Missing context in regulated workflows ⇒ CLARIFY.
- Unit mismatch/medical implausibility flags ⇒ ESCALATE (e.g., 56°F is hypothermia-grade).
Output: policy_gate_decision{...} + reason codes
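The hard rules can be expressed as a function that collects every rule that fires and returns the most restrictive decision. This is a sketch under assumptions: it resolves "CLARIFY or ESCALATE" to CLARIFY, the `IMPLAUSIBLE_VALUE` reason code is invented for the example, soft rules are omitted, and the fallback to PROCEED when no hard rule fires is a simplification.

```python
# Decisions ordered from least to most restrictive.
SEVERITY = ["PROCEED", "PROCEED_WITH_DISCLOSURE", "CLARIFY", "ESCALATE_HITL", "BLOCK"]

def policy_gate(*, regulated, actionable, has_ambiguity=False, has_emoji=False,
                missing_context=False, implausible_value=False):
    """Evaluate the SML-P6 hard rules and return the strictest decision."""
    hits = []
    if regulated and actionable and has_ambiguity:
        hits.append(("CLARIFY", "AMBIGUITY_IN_ACTIONABLE_CONTEXT"))
    if regulated and actionable and has_emoji:
        hits.append(("CLARIFY", "EMOJI_NOT_AUTHORIZATION"))
    if regulated and missing_context:
        hits.append(("CLARIFY", "MISSING_CONTEXT"))
    if implausible_value:
        # Hypothetical reason code for the unit mismatch / implausibility rule.
        hits.append(("ESCALATE_HITL", "IMPLAUSIBLE_VALUE"))
    if not hits:
        return {"decision": "PROCEED", "reason_codes": []}
    decision = max((d for d, _ in hits), key=SEVERITY.index)
    return {"decision": decision, "reason_codes": [code for _, code in hits]}

print(policy_gate(regulated=True, actionable=True, has_ambiguity=True,
                  has_emoji=True, implausible_value=True))
```

Returning all reason codes, not just the one behind the final decision, keeps the audit trail complete when several rules fire at once.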
SML-P7 — Clarification Protocol
Goal: Ask minimal questions that collapse ambiguity fast.
Rules:
- Ask one question per ambiguous symbol (max 2 per turn).
- Questions must be forced-choice when possible:
- If user refuses clarification in high-risk context → BLOCK or ESCALATE.
Output: clarifying_question[]
SML-P8 — Human-in-the-Loop Escalation
Goal: Route high-stakes uncertainty to accountable humans.
Rules:
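- Escalation requires assigning a human owner by domain (healthcare: clinician or compliance; legal: counsel; finance: finance controller; security: security officer; enterprise_general: team lead).
- Provide: raw text, expanded text, ambiguity items, policy decision, reason codes, and a recommended safe rewrite.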
- System must not proceed until acknowledgment is recorded.
Output: hitl_packet
SML-P9 — Output Post-Processing
Goal: Normalize the model output to remain within the same semantic constraints.
Rules:
- Re-run SML-P1 to SML-P6 on the output if output is publishable/actionable.
- If output introduces new ambiguous tokens → CLARIFY before publish.
- If output contains policy-sensitive claims → require citations or downgrade certainty.
Output: final_output, post_check_trace
SML-P10 — Tool-Call Firewall
Goal: Prevent LLMs from executing actions outside policy.
Rules:
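- Tool calls allowed only when the policy decision is PROCEED, has_ambiguity is false, and audit logging is working.
- Otherwise: apply DENY_AND_DRAFT with reason code POLICY_GATE_BLOCKED_TOOL_CALL.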
Output: tool permission (allow/deny) + reason
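The firewall check reduces to a single conjunction. The sketch below mirrors the `tool_firewall` section of the configuration (decision must be PROCEED, no ambiguity, audit logging healthy); the function name and the shape of the returned dictionary are assumptions.

```python
def tool_firewall(policy_decision, has_ambiguity, audit_logging_ok):
    """Allow a tool call only when every gate condition holds; otherwise deny."""
    allowed = (policy_decision == "PROCEED"
               and not has_ambiguity
               and audit_logging_ok)
    if allowed:
        return {"allow": True, "action": "ALLOW", "reason_codes": []}
    return {"allow": False, "action": "DENY_AND_DRAFT",
            "reason_codes": ["POLICY_GATE_BLOCKED_TOOL_CALL"]}

print(tool_firewall("CLARIFY", has_ambiguity=True, audit_logging_ok=True))
```

The check is written as an allow-list: any condition that is not explicitly satisfied results in a denial, so new decision types default to "deny".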
SML-P11 — Audit Trace & Integrity
Goal: Everything is replayable and attributable.
Rules:
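- Log: context, symbols detected, expansions, ambiguity scores, policy decisions, and HITL acknowledgments.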
- Produce an append-only trace hash (placeholder ok at MVP).
- Store raw text + expanded text.
Output: audit_trace{events[], integrity{hash}}
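One common way to make an append-only trace tamper-evident is a hash chain, where each event stores the hash of the previous one. The protocol above only asks for a trace hash, so the chaining scheme, the `AuditTrace` class, and the `prev_hash`/`hash` field names are assumptions of this sketch, built on Python's standard `hashlib` and `json` modules.

```python
import hashlib
import json

def _digest(obj):
    """SHA-256 over a canonical JSON encoding of the object."""
    payload = json.dumps(obj, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

class AuditTrace:
    """Append-only event list in which each event commits to its predecessor."""

    def __init__(self, trace_id):
        self.trace_id = trace_id
        self.events = []
        self._last_hash = _digest(trace_id)  # seed the chain with the trace ID

    def append(self, event_type, **fields):
        event = {"event_id": f"EVT-{len(self.events) + 1:04d}",
                 "type": event_type, **fields, "prev_hash": self._last_hash}
        event["hash"] = _digest(event)  # computed before the hash field is added
        self._last_hash = event["hash"]
        self.events.append(event)
        return event

    def verify(self):
        """Recompute the chain; any edited or reordered event breaks it."""
        prev = _digest(self.trace_id)
        for event in self.events:
            body = {k: v for k, v in event.items() if k != "hash"}
            if body["prev_hash"] != prev or _digest(body) != event["hash"]:
                return False
            prev = event["hash"]
        return True

trace = AuditTrace("TRACE-EXAMPLE-0001")
trace.append("SYMBOL_DETECTED", raw="56°F")
trace.append("SYMBOL_EXPANDED", raw="56°F", expansion="fifty-six degrees Fahrenheit")
print(trace.verify())
```

A placeholder hash is enough for an MVP, as noted above; the chain only becomes meaningful once events are also written to storage that the resolver cannot rewrite.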
SML-P12 — Dictionary Governance & Versioning
Goal: Prevent semantic drift inside the resolver itself.
Rules:
- Every dictionary change increments version.
- Every output records dictionary version used.
- Additions must include:
- “Emergency patch” path exists, but must be labeled.
Output: dict_version, change_log_entry
SML-P13 — Failure Modes & Safe Defaults
Goal: If anything breaks, fail safely.
Rules:
- If resolver fails → BLOCK in regulated actionable contexts.
- If dictionary lookup fails → CLARIFY.
- If confidence cannot be computed → treat as ambiguous.
- If audit logging fails → do not execute actions.
Output: safe_default_applied
SML-P14 — Minimum Viable Compliance Profile
Goal: A practical starter configuration that you can deploy.
MVP profile:
- 3 domains: casual / enterprise_general / regulated
- 3 risk tiers: low/medium/high
- hard gate: regulated+actionable+ambiguity ⇒ clarify/escalate
- 100–300 dictionary entries
- append-only audit log
```yaml
smlp_config:
  meta:
    name: "Symbolic Mediation Layer Protocols (SML-P)"
    version: "1.0.0"
    owner_org: "Trans Sentient Intelligence Technologies, LLC"
    model_agnostic: true
    runtime_agnostic: true
    description: >
      Upstream semantic governance middleware for symbol expansion, ambiguity routing,
      policy gating, and audit-ready tracing before/after LLM reasoning.

  # ---------------------------
  # CONTEXT / DOMAIN BINDING
  # ---------------------------
  context:
    default_language: "en"
    default_domain: "enterprise_general"
    default_risk_tier_if_missing_context: "medium"
    required_fields:
      - domain
      - workflow_stage
      - actor_role
    optional_fields:
      - jurisdiction
      - policy_profile
      - org_unit
      - data_classification
    domains:
      casual:
        regulated: false
      enterprise_general:
        regulated: false
      healthcare:
        regulated: true
      legal:
        regulated: true
      finance:
        regulated: true
      security:
        regulated: true
    workflow_stages:
      - draft
      - review
      - publish
      - execute
      - report

  # ---------------------------
  # RISK TIERS / THRESHOLDS
  # ---------------------------
  risk:
    tiers:
      low:
        description: "Non-actionable or low impact outputs; proceed with explicit expansions."
        default_action: "PROCEED"
      medium:
        description: "Some ambiguity or interpretation risk; clarify or proceed with disclosure."
        default_action: "CLARIFY_OR_PROCEED_WITH_DISCLOSURE"
      high:
        description: "Regulated and/or actionable; must clarify, escalate, or block."
        default_action: "ESCALATE_OR_BLOCK"
    thresholds:
      # If top meaning confidence is below this and the symbol is polysemous -> clarify
      clarify_if_below_confidence: 0.65
      # If top meaning confidence is above this and non-regulated + non-actionable -> auto-select
      auto_select_min_confidence: 0.80
      # For high polysemy tokens (emojis, slang) in regulated domains
      regulated_symbol_clarify_threshold: 0.80

  # ---------------------------
  # SYMBOL DETECTION
  # ---------------------------
  detection:
    enable:
      emoji: true
      unit_expressions: true
      acronyms: true
      shorthand: true
      numeric_units: true
      policy_keywords: true
    patterns:
      temperature_fahrenheit: "([0-9]{1,3})\\s*°?\\s*F\\b"
      temperature_celsius: "([0-9]{1,3})\\s*°?\\s*C\\b"
      # No trailing \b here: '%' is a non-word character, so \b after it would
      # only match when a word character follows (e.g. "50%off"), never "50% ".
      percent: "([0-9]{1,3}(?:\\.[0-9]+)?)\\s*%"
      money_usd: "\\$\\s*([0-9]{1,3}(?:,[0-9]{3})*(?:\\.[0-9]+)?)\\b"
    max_symbols_per_pass: 200

  # ---------------------------
  # DICTIONARY + ONTOLOGY GOVERNANCE
  # ---------------------------
  dictionary:
    dictionary_id: "SML-DICT-001"
    version: "0.1.0"
    default_polysemy_level_by_type:
      emoji: "high"
      shorthand: "medium"
      acronym: "medium"
      unit_expression: "low"
    unknown_symbol_policy:
      non_regulated_non_actionable: "PROCEED_WITH_DISCLOSURE"
      regulated_or_actionable: "CLARIFY"
    # Minimal starter entries (expand to 100–300 in MVP)
    entries:
      - entry_id: "SYM-EMOJI-COOL-001"
        symbol_type: "emoji"
        raw_symbol: "😎"
        display_name: "smiling face with sunglasses"
        polysemy_level: "high"
        clarify_threshold: 0.65
        candidate_meanings:
          - meaning_id: "M-EMOJI-COOL-APPROVAL"
            concept_id: "CPT-APPROVAL"
            gloss: "approval / confident acknowledgment"
            risk_default: "low"
            example_expansion: "Understood—sounds good."
          - meaning_id: "M-EMOJI-COOL-PLAYFUL"
            concept_id: "CPT-APPROVAL"
            gloss: "playful affirmation"
            risk_default: "low"
            example_expansion: "Nice—got it."
          - meaning_id: "M-EMOJI-COOL-SARCASM"
            concept_id: "CPT-SARCASM"
            gloss: "sarcasm / irony"
            risk_default: "medium"
            example_expansion: "Sure… if you say so."
        domain_overrides:
          healthcare:
            allowed_meanings:
              - "M-EMOJI-COOL-APPROVAL"
              - "M-EMOJI-COOL-PLAYFUL"
            requires_clarification_if_actionable: true
            notes: "Emoji cannot be treated as authorization in patient-facing workflows."
          legal:
            allowed_meanings:
              - "M-EMOJI-COOL-APPROVAL"
              - "M-EMOJI-COOL-SARCASM"
            requires_clarification_if_contractual: true
            notes: "Emoji meaning is non-binding unless explicitly defined in contract context."
      - entry_id: "SYM-UNIT-TEMP-F-001"
        symbol_type: "unit_expression"
        raw_symbol_pattern: "([0-9]{1,3})\\s*°?\\s*F\\b"
        display_name: "temperature in Fahrenheit"
        polysemy_level: "low"
        clarify_threshold: 0.0
        candidate_meanings:
          - meaning_id: "M-TEMP-F"
            concept_id: "CPT-TEMPERATURE_FAHRENHEIT"
            gloss: "temperature expressed in degrees Fahrenheit"
            risk_default: "low"
            example_expansion: "fifty-six degrees Fahrenheit"
        domain_overrides:
          healthcare:
            normalization_required: true
            also_emit:
              - unit_normalized_value
              - unit_normalized_unit
            notes: "Unit normalization required for clinical logs."

  ontology:
    ontology_id: "SML-ONTO-001"
    version: "0.1.0"
    concepts:
      - concept_id: "CPT-APPROVAL"
        label: "approval"
        definition: "A positive acknowledgment indicating agreement or acceptance."
        disallowed_inferences:
          - "clinical decision"
          - "legal advice"
          - "financial instruction"
      - concept_id: "CPT-SARCASM"
        label: "sarcasm"
        definition: "A rhetorical mode where literal wording contrasts intended meaning."
        disallowed_inferences:
          - "consent"
          - "authorization"
      - concept_id: "CPT-TEMPERATURE_FAHRENHEIT"
        label: "temperature (Fahrenheit)"
        definition: "A measurement of temperature expressed in degrees Fahrenheit (°F)."
        disallowed_inferences:
          - "clinical diagnosis"

  # ---------------------------
  # RESOLVER ENGINE
  # ---------------------------
  resolver:
    engine_id: "SML-RESOLVER-001"
    version: "0.1.0"
    strategy:
      primary: "dictionary_lookup_then_rules"
      optional_secondary: "embedding_similarity"
      explainability_required: true
    selection_rules:
      - rule_id: "SEL-001"
        description: "Auto-select top meaning when confidence is high and context is not regulated/actionable."
        if:
          min_confidence: 0.80
          regulated: false
          actionable: false
        then:
          action: "AUTO_SELECT"
      - rule_id: "SEL-002"
        description: "Force clarification when confidence is below threshold for polysemous symbols."
        if:
          max_confidence_below: 0.65
          polysemy_in:
            - high
            - medium
        then:
          action: "CLARIFY"
      - rule_id: "SEL-003"
        description: "In regulated actionable contexts, clarify polysemous symbols unless very high confidence."
        if:
          regulated: true
          actionable: true
          symbol_type_in:
            - emoji
            - shorthand
            - acronym
          max_confidence_below: 0.80
        then:
          action: "CLARIFY"

  # ---------------------------
  # ACTIONABILITY CLASSIFIER
  # ---------------------------
  actionability:
    enabled: true
    verbs_triggering_actionable:
      - "send"
      - "publish"
      - "submit"
      - "execute"
      - "approve"
      - "sign"
      - "prescribe"
      - "diagnose"
      - "deploy"
      - "delete"
      - "terminate"
      - "override"
    targets_triggering_actionable:
      - "patient portal"
      - "production"
      - "contract"
      - "wire"
      - "invoice"
      - "ticket"
      - "account"
      - "database"
      - "endpoint"
    if_detected:
      set_is_actionable: true

  # ---------------------------
  # POLICY GATE (DECISION ENGINE)
  # ---------------------------
  policy_gate:
    gate_id: "SML-POLICY-001"
    version: "0.1.0"
    decisions:
      - "PROCEED"
      - "PROCEED_WITH_DISCLOSURE"
      - "CLARIFY"
      - "ESCALATE_HITL"
      - "BLOCK"
    hard_rules:
      - rule_id: "PG-REG-AMB-001"
        description: "Regulated + actionable + ambiguity => clarify or escalate."
        if:
          regulated: true
          actionable: true
          has_ambiguity: true
        then:
          decision: "CLARIFY_OR_ESCALATE"
          require_human_in_loop: true
          reason_codes:
            - "AMBIGUITY_IN_ACTIONABLE_CONTEXT"
      - rule_id: "PG-EMOJI-NO-AUTH-001"
        description: "Emoji cannot authorize actions in regulated domains."
        if:
          regulated: true
          actionable: true
          contains_symbol_type: "emoji"
        then:
          decision: "CLARIFY"
          require_human_in_loop: true
          reason_codes:
            - "EMOJI_NOT_AUTHORIZATION"
      - rule_id: "PG-MISSING-CONTEXT-001"
        description: "Missing required context in regulated workflows => clarify."
        if:
          regulated: true
          missing_required_context: true
        then:
          decision: "CLARIFY"
          require_human_in_loop: true
          reason_codes:
            - "MISSING_CONTEXT"
      - rule_id: "PG-AUDIT-FAIL-001"
        description: "If audit logging fails, block execution."
        if:
          actionable: true
          audit_logging_ok: false
        then:
          decision: "BLOCK"
          require_human_in_loop: true
          reason_codes:
            - "AUDIT_LOGGING_FAILURE"
    soft_rules:
      - rule_id: "PG-LOW-NONACT-001"
        description: "Non-regulated + non-actionable => proceed."
        if:
          regulated: false
          actionable: false
        then:
          decision: "PROCEED"
          require_human_in_loop: false
          reason_codes:
            - "LOW_RISK_NON_ACTIONABLE"
      - rule_id: "PG-UNKNOWN-SYMBOL-001"
        description: "Unknown symbol => proceed with disclosure (non-regulated) else clarify."
        if:
          has_unknown_symbol: true
          regulated: false
          actionable: false
        then:
          decision: "PROCEED_WITH_DISCLOSURE"
          require_human_in_loop: false
          reason_codes:
            - "UNKNOWN_SYMBOL_NONREGULATED"

  # ---------------------------
  # CLARIFICATION PROTOCOL
  # ---------------------------
  clarification:
    enabled: true
    max_questions_per_turn: 2
    question_style: "forced_choice_when_possible"
    templates:
      emoji_generic: >
        You used {symbol}. Did you mean (A) confirmation/approval, (B) sarcasm/irony, or (C) remove/replace it?
        In {domain} workflows, emojis should be removed unless explicitly intended.
      unknown_symbol: >
        The symbol '{symbol}' is not in the controlled dictionary for {domain}. What does it mean in your context?

  # ---------------------------
  # HUMAN-IN-THE-LOOP (HITL)
  # ---------------------------
  hitl:
    enabled: true
    required_for_decisions:
      - "ESCALATE_HITL"
      - "BLOCK"
      - "CLARIFY_OR_ESCALATE"
    owner_routing:
      healthcare: "clinician_or_compliance"
      legal: "counsel"
      finance: "finance_controller"
      security: "security_officer"
      enterprise_general: "team_lead"
      casual: "none"
    packet_fields:
      - raw_text
      - expanded_text
      - ambiguity_items
      - policy_decision
      - reason_codes
      - recommended_safe_rewrite

  # ---------------------------
  # POST-PROCESSOR (OUTPUT CHECK)
  # ---------------------------
  post_processing:
    enabled: true
    rerun_on_output_if:
      - output_is_actionable
      - output_is_publishable
    if_new_ambiguity_detected:
      action: "CLARIFY"
    enforce_disclosures_in_regulated_outputs: true

  # ---------------------------
  # TOOL-CALL FIREWALL
  # ---------------------------
  tool_firewall:
    enabled: true
    allow_only_if:
      policy_decision_in:
        - "PROCEED"
      has_ambiguity: false
      audit_logging_ok: true
    otherwise:
      action: "DENY_AND_DRAFT"
      reason_codes:
        - "POLICY_GATE_BLOCKED_TOOL_CALL"

  # ---------------------------
  # AUDIT / TRACE
  # ---------------------------
  audit:
    enabled: true
    storage_mode: "append_only"
    include:
      - context
      - symbols_detected
      - expansions
      - ambiguity_scores
      - policy_decisions
      - hitl_acknowledgments
    integrity:
      hash_algo: "sha256"
      hash_placeholder: true
    failure_policy:
      if_audit_fails_and_actionable: "BLOCK"
      if_audit_fails_and_non_actionable: "PROCEED_WITH_DISCLOSURE"

  # ---------------------------
  # MVP PROFILE
  # ---------------------------
  mvp_profile:
    recommended_dictionary_size: "100-300"
    recommended_domains:
      - "casual"
      - "enterprise_general"
      - "regulated"
    hard_gate:
      description: "regulated + actionable + ambiguity => clarify/escalate"
      enabled: true
```