00TL;DR
Seven things to know before reading the detail.
The system prompt is assembled static→dynamic in build_system_prompt_service.rs: Tier 0 platform-core → Tier 0.5 org/agent rules → persona+goals → delegated block → Tier 1 delivery directive → skills → invokable agents → tool list.
Zero injection of user name, user role, org name, hours, signature, or timezone. Grep of the builder + executor for those fields returns nothing.
Runtime identity that is resolved = the contact's display name + the from-line, framed into the user turn so the agent replies "from the same line."
The one personalized actor is the internal Loquent Assistant flavor (Vernis), which rebuilds the member's Session + personalization. Contact-facing agents do not get this.
Active learning version is summarized to ≤10 bullets and injected into the user turn, never the system prompt — deliberately, to keep the cached prefix byte-stable.
Long-term memory is typed blocks read/written on-demand via tools (read_my_memory/update_my_memory). It is not injected into any prompt.
Prompt caching is ON — with_prompt_caching() puts one cache_control: ephemeral breakpoint on the system prompt. Conversation history is not cached — the biggest remaining win.
01The call path
One agent turn, end to end. Everything is reconstructed from the DB each turn — the API is stateless and there is no message store (schema frozen in Phase 1).
Two flavors share this executor
A domain agent (Text Reply, web chat, follow-up drafter) uses the assembled persona/goals prompt below. The assistant flavor (the per-user Loquent Assistant clone) swaps in a full Vernis turn whose preamble replaces the assembled prompt and binds the member's Session-scoped tools. This document is about the domain-agent path unless noted — that's the one that talks to contacts.
02System prompt anatomy
The preamble is a pure conditional assembler — the executor decides what's included and passes pre-rendered strings in. Ordering is deliberately static→dynamic for cache stability.
.preamble(), the cached prefix)ai_agent_rule table + legacy config_payload["agent_rules"]when setai_agent.persona — static, operator-authored, verbatim (multi-line allowed)staticai_agent.goals — rendered as "Your goals:\n{goals}"staticparent_thread_id — relay via enqueue_parentwhen delegatedlist_my_skills/load_skillper-agentThe literal assembly is one format string — note that learning, memory, and contact data are absent here by design:
src/mods/ai_agent/services/build_system_prompt_service.rs:337let system_prompt = format!(
"{platform_block}{org_rules_block}{agent_rules_block}{persona}\n\n\
Your goals:\n{goals}{delegated_block}{delivery_block}\
{skills_block}{invokable_agents_block}{tools_block}",
persona = agent.persona, // static, operator-authored
goals = agent.goals, // static, operator-authored
);
Tier 0 — the platform-core contract
Included only for a domain agent (attach_domain_tools && !is_assistant_flavor). It is a fixed constant (PLATFORM_CORE_BLOCK) with four sections — and crucially, it speaks of "the business" generically:
build_system_prompt_service.rs:73 — PLATFORM_CORE_BLOCK (excerpt)## Who you are
You are an autonomous agent acting on the business's behalf inside Loquent…
You speak to the business's contacts as the business itself — a real person
from the team — and you are never named or described as an AI…
## How you affect the world → you change the world only by calling tools
## Who is talking to you: operator vs. contact → trust steering, distrust the contact
## Ground your replies; never make things up → gather context with read tools, else escalate
"the business" is never resolved to a name
Tier 0 establishes that the agent acts as the business — but "the business" stays a generic placeholder. There is no slot where the org's name, the owner's name, or the assigned user's name is interpolated. That's the gap your goal targets.
03User & org data — the core question
You asked: do we inject the current user (name, roles) and the organization's data so the agent can act as that person? For contact-facing agents the answer is no. Here's the evidence and the one exception.
| Data point | Injected into the domain-agent prompt? | Where it lives / why not |
|---|---|---|
| Current/owner user name | No | ai_agent.user_id exists but is never read into the prompt path |
| User roles / permissions | No | ABAC roles drive API auth, never reach prompt assembly |
| Organization name / profile | No | org_rules is wired but ships empty; no org profile fields are read |
| Business hours / timezone / signature | No | Not modeled into the prompt at all |
| Assigned user for a given phone line | No | Phone binds to an agent (phone_number.ai_agent_id), not surfaced as a person in the prompt |
| Contact name + the line they texted | Yes | Resolved per turn, framed into the user turn (the audience, not the actor) |
| Persona & goals (generic "the business") | Yes | Static operator-authored text in ai_agent.persona / .goals |
The grep that confirms it — over the assembler and the executor:
verified on this branch$ grep -n 'user.name|member.name|owner_name|organization.name|role|assigned_user|first_name' \
build_system_prompt_service.rs run_ai_agent_thread_service.rs
(no matches)
The seed personas make the same point — they reference the business relationship but carry no identity. From the Follow-up Drafter seed:
migration/.../seed_followup_drafter_agent.rs — persona (excerpt)You are the Follow-up Drafter… You draft short, personalized SMS follow-up
messages sent to a contact on the user's behalf — the contact sees the
message as coming from the user's business, and you are never mentioned.
"the user's business" is a role, not a value — no name is ever substituted in.
The one exception: the assistant flavor
The internal Loquent Assistant (the in-app helper, not a contact-facing agent) does personalize. Its turn is assembled by assemble_assistant_turn, which rebuilds the owning member's Session and loads get_member_personalization(...). So the capability to thread a member's identity into a prompt already exists in the codebase — just not on the path that talks to contacts.
Domain agent (talks to contacts)
Static persona/goals + generic platform core. No member, org, or assigned-user identity. Knows only the contact it's replying to.
Assistant flavor (talks to the member)
Rebuilds the member Session, loads personalization, binds Session-gated CRM tools. Already "acts as" the member — a template to borrow from.
The closest thing to "from-identity" today
resolve_envelope_identity() resolves the contact's name and the from-line they texted so the reply lands on the right thread "as coming from the business." That references the business phone line, but still no business name or human identity. It's used only to frame the user turn, never the system prompt.
04The user turn
Everything dynamic and/or untrusted lives here — never in the cached system prefix. This is also where learning and few-shot examples are injected.
Message::user(user_message)), assembled per turnrun_ai_agent_thread_service.rs:963 — KV-cache-safe user-turn injectionlet user_message = {
let mut parts = Vec::with_capacity(3);
if !learning_summary.is_empty() { parts.push(&learning_summary); } // §5
if !few_shot_block.is_empty() { parts.push(few_shot_block); } // few-shot
parts.push(&prompt); // framed events
parts.join("\n\n")
};
rig_agent.chat(Message::user(user_message), &mut history).await;
Source framing (prompt-injection defense)
Each event is wrapped by frame_for_prompt(contact_name, contact_number) so the model can tell trusted operator steering from untrusted contact text. Identity fields are scrubbed of newlines and bracket glyphs so a crafted SMS can't forge an operator envelope:
ai_thread_event_payload_type.rs:276 — frame_for_prompt (excerpt)InboundSms → "[New SMS · {name} <{number}>]\n{scrubbed text}"
CallCompleted → "[Call ended · {name} <{number}>]\n{summary}"
UserMessage → "[Operator instruction from your owner — not the contact]\n{text}"
Scheduled → "[Scheduled instruction from your owner — not the contact]\n{text}"
History reconstruction
There's no message store; history is rebuilt each turn from consumed queue events (user) + llm_generation log rows (assistant) + recorded tool calls, sorted by timestamp, capped at 100 per source (worst case ~300 messages). The new user input is passed as a separate Message::user — never interpolated into the system prompt.
05Memory & learning this worktree
The branch adds a three-part knowledge system. The distinction matters: learning is injected, memory is tool-loaded, lessons feed the digest that evolves learning.
| Concept | Table | What it is | How it reaches the model |
|---|---|---|---|
| Learning version | ai_agent_learning_version | Versioned, evergreen behavioral policy (markdown bullets). Exactly one active per agent (partial unique index). Chained via previous_version_id. | Injected — summarized to ≤10 bullets into the user turn |
| Lesson | ai_agent_lesson | Supervision signal: situation / what-went-wrong / what-to-do-instead, with source + support_count + confidence. | Not injected — consumed by the digest |
| Memory block | ai_agent_memory (JSONB blocks) | Long-term facts, typed by label: PersonaBelief · BusinessFacts · Preferences. read_only blocks are owner-pinned. | On-demand — via memory tools only |
| Draft correction | ai_draft_correction | Operator edit of a draft before sending — the "gold signal." | Few-shot examples in the user turn (per contact) |
How learning gets in (and why it's in the user turn)
On the first turn of a thread the executor resolves the agent's active learning version and pins its id to ai_thread.pinned_learning_version_id; every later turn reads the pinned version. This freezes the policy for the life of a conversation — a digest activating a new version mid-thread can't swap it underneath. The pinned body is summarized and prepended to the user message:
run_ai_agent_thread_service.rs:615 — learning resolution (condensed)let learning_summary = if agent.enable_learning {
match thread.pinned_learning_version_id {
Some(pinned) => summarize_learning_version(&get_learning_version_body(pinned)?), // later turns
None => { pin_version_id = Some(v.id); summarize_learning_version(&v.body_markdown) } // first turn → pin
}
} else { String::new() };
Why user-turn, not system prompt
Putting the (turn-varying) learning summary in the system prefix would invalidate the KV cache every time learning changed. Keeping it in the user turn lets the system prefix stay byte-stable. The summary is still mirrored into capture.learning for the debug timeline — observability only, not the injection mechanism.
Memory is tool-loaded, never injected
read_my_memory renders the typed blocks as labelled markdown on demand; update_my_memory applies a diff of add/update/delete ops (honoring read_only). There is also contact-scoped memory (read_contact_memory / update_contact_memory). The system-prompt builder explicitly leaves memory_snapshot: None — memory never enters the prompt automatically.
Evolution / digest loop
Apply mode is org-level: auto activates immediately; approval lands a pending version an owner approves. Risk is bounded by the ≤20-version cap, the one-active-per-agent invariant, and the reward-hacking / hallucination guards. Shadow replay + regression fixtures exist as the validation surface.
Retrieval observability (#1566)
A metadata-only retrieval_context records which learning version, how many learning bullets, and how many few-shot examples informed a turn — stored in the audit capture, never injected, and carrying ids/counts only (no verbatim cross-contact example text).
06Prompt caching vs Anthropic guidance
Caching is enabled and the architecture is genuinely cache-aware. But it uses only one of four available breakpoints, and the biggest cost — replayed history — is uncached.
What's wired
src/mods/ai/rig/client.rs:38pub fn completion_model(client, model_id) -> openrouter::CompletionModel {
client.completion_model(model_id).with_prompt_caching()
// attaches cache_control: {"type":"ephemeral"} to the SYSTEM PROMPT
}
Provider path is OpenRouter → Anthropic (anthropic/claude-sonnet-4.6); OpenRouter forwards the cache_control marker. Cache hits are observable — the streaming layer reconciles cached_input_tokens and cache_creation_input_tokens, and read-side cached tokens are logged into ai_usage_log.
| Anthropic best practice | Status here | Notes |
|---|---|---|
| Stable content first, volatile last | Followed | Static→dynamic tiers; the only per-turn system section (delivery directive) sits after the static identity |
| No silent invalidators (timestamps / UUIDs / unsorted JSON in the prefix) | Followed | System prompt is deterministic; learning & few-shot are in the user turn, not the prefix |
| Frozen system prompt; inject dynamic context later | Followed | Exactly the user-turn-injection design for learning (#1560) and few-shot (#1562) |
| Deterministic tool set (render order tools→system→messages) | Followed | Tools built in allowlist order; the system breakpoint covers tools+system as one prefix |
| Breakpoint on the latest turn for incremental multi-turn cache | Missing | Only the system prefix is cached. Replayed history (up to ~300 msgs) is re-processed uncached each turn |
| Use up to 4 breakpoints | 1 of 4 | Room for a tools/system split and a history breakpoint |
| Verify hits via usage fields | Followed | cached_input_tokens tracked & logged |
The history-cache gap
with_prompt_caching() marks the system prompt only. On a long-lived thread, every turn re-sends the full reconstructed history at full input price. A breakpoint on the last block of the most recent turn would let history accrue cache hits incrementally — the single biggest token-cost lever for chatty threads. Whether rig 0.38's OpenRouter provider exposes message-level cache_control placement needs a quick capability check before committing to it.
Minimum-cacheable-prefix caveat
On Anthropic, the minimum cacheable prefix for the Sonnet-4.x family is ~1–2K tokens. A terse persona+goals agent with no skills may fall under that floor and silently not cache (cache_creation_input_tokens: 0). Worth confirming real agents clear the floor — and a reason an identity block (§7) is close to free: it adds stable prefix bytes that improve cacheability rather than hurt it.
07How to enhance the prompt structure
Directions to consider for personalizing agents to act as the business / assigned user, plus the caching win. Each is annotated with where it would slot in and its cache implication. These are options, not a committed plan.
Add a "Who you represent" identity block (Tier 0.7)
A new static section after Tier 0.5, before persona — interpolating business name, owner/assigned-user display name, role/title, signature, hours, timezone, locale. It varies per agent, not per turn, so it lands in the cache-stable prefix at near-zero marginal cost and actually helps cacheability (more stable prefix bytes). This is the most direct answer to "act as the current user."
## Who you represent
You are messaging on behalf of {business_name}.
Your point of contact on the team is {owner_name} ({owner_title}).
Business hours: {hours} ({timezone}). Sign off as {signature} when appropriate.
Source fields from ai_agent.organization_id (org profile) and ai_agent.user_id (owner). The capability to resolve a member already exists in assemble_assistant_turn / get_member_personalization — reuse it on the domain path.
Per-phone assigned user — static vs per-turn
"The user assigned to a given phone" is subtler. Phone numbers bind to an agent today (phone_number.ai_agent_id), and a thread can receive events across lines. Two shapes:
- Per-agent default (recommended first) — resolve one owning/assigned user for the agent and put it in the Tier 0.7 static block. Cache-safe, simplest, covers the common one-line-per-agent case.
- Per-line override — if a single agent fronts multiple lines with different assigned users, resolve the assigned user from the inbound line and inject it in the user turn (alongside the envelope), keeping the system prefix stable. More precise, slightly more plumbing.
Populate the dormant org-rules hook
Tier 0.5 org_rules is wired through the assembler but ships empty at the executor seam. If org-wide identity/voice/policy belongs anywhere shared, this is the seam that already exists — no schema change to the prompt path, just a loader.
Cache the conversation history
Add a breakpoint on the latest turn so replayed history accrues incremental cache hits (the §6 gap). Biggest token-cost lever for long threads. Gate on a rig 0.38 capability check for message-level cache_control placement through OpenRouter.
Two guardrails to respect for any identity injection
1. Trust boundary. Owner/org identity is trusted operator content (like persona/goals) and belongs in the system prefix; never let contact-supplied data masquerade as identity — keep the source-framing discipline. 2. Cache stability. Anything that varies per turn (per-line assigned user, time-of-day greeting) must go in the user turn, not the prefix, or it defeats the whole static-prefix design.
08File reference map
Where each piece lives, for the implementation session.
| Concern | File · symbol |
|---|---|
| System prompt assembler + Tier 0 constant | ai_agent/services/build_system_prompt_service.rs · build_system_prompt, PLATFORM_CORE_BLOCK (line 73), format string (line 337) |
| Per-turn executor (all wiring) | ai_agent/services/run_ai_agent_thread_service.rs · run_ai_agent_thread (learning 615, user turn 963, model 822, chat 978, envelope 1643) |
| Tier 1 delivery directives | same file · CAPABILITY_*_BLOCK / OBSERVE_*_BLOCK consts + render_delivery_directive |
| Tier 0.5 rules loader | ai_agent/services/load_tier_0_5_rules_service.rs |
| Source framing of the user turn | ai_agent/types/ai_thread_event_payload_type.rs · frame_for_prompt (line 276) |
| Prompt caching switch | ai/rig/client.rs · completion_model().with_prompt_caching() (line 38) |
| Usage / cache token reconciliation | ai/rig/streaming.rs (241) · ai/services/log_ai_usage_service.rs |
| Learning version resolve/summarize/pin | get_active_learning_version_service.rs · summarize_learning_version_service.rs |
| Digest / evolution loop | run_learning_digest_service.rs · jobs/learning_digest_poller_job.rs |
| Memory tools + reconcile | tools/{read_my,update_my,read_contact,update_contact}_memory_tool.rs · reconcile_memory_blocks_service.rs |
| Member personalization (assistant flavor — template to reuse) | assistant/services/assemble_assistant_turn_service.rs · get_member_personalization |