Six surgical inserts into the Langfuse agent/lawrence-2 prompt, each tied to a specific failure observed in session aithrd_ayjzonowyr6zidr7.
mat_…-style IDs. Six new inserts unblock the eight prompt-fixable failures from the test session — including the timestamp echo, internals leakage, edit-tool's missing "cannot" doc, and the "draft a letter" misroute.
| Doppler config | LANGFUSE_PROMPT_LABEL | Resolves to | Notes |
|---|---|---|---|
| local, local_personal | latest | v44 | Bleeding edge; every save lands here. |
| dev | development | v43 ← tested | What the session ran against. Confirmed by 79 mentions of promptVersion: 43 across the trace metadata. |
| ci | (unset) | v41 | Falls back to "production" via agent.py:320-322. |
| prd | production | v41 | Pre-edit-tool. Section 8 reads "Content editing not yet available". |
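
The ci row depends on a fallback when LANGFUSE_PROMPT_LABEL is unset. A minimal sketch of that behaviour follows, assuming the label is read from the environment; the helper name is hypothetical, the actual code at agent.py:320-322 may differ, and treating an empty string as unset is an assumption.

```python
import os
from typing import Mapping, Optional

# Hypothetical helper: illustrates the fallback described in the table,
# not the actual code in agent.py.
DEFAULT_LABEL = "production"


def resolve_prompt_label(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Langfuse prompt label, falling back to production."""
    source = os.environ if env is None else env
    # Assumption: an empty or whitespace-only value counts as unset.
    label = source.get("LANGFUSE_PROMPT_LABEL", "").strip()
    return label or DEFAULT_LABEL
```

With this shape, a ci config that never sets the variable resolves to the same version as prd, which matches the v41 entries in the table.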
v41 (production): Pre-edit-tool. Section 8 reads "Content editing not yet available". Untouched by this RCA.
v43 (development): First version exposing the edit tool. This is what session aithrd_ayjzonowyr6zidr7 ran against on the dev environment. All six edits below land as a new version, then get the development label, then production when ready.
v44 (latest): One citation-related patch ahead of v43 (adds passage.text to citation markers). None of the prompt edits below touch areas that differ from v43.
v44 line 82 already prohibits leaking ID strings (mat_, matart_, tpl_ etc.). But the leaked phrases in our session weren't IDs — they were narration about the agent's internal working state:
trace 3: "I have the current block IDs and positions. Now I'll propose the expanded return of property language as tracked-change edits."
trace 19: "I now have all the block IDs and positions I need. I can see: Section 16 heading: block40ff5370, Section 16.1: blockc906ea97, Section 17 'Miscellaneous' heading: block7a928c6b…"
Block IDs are short hex without a typed prefix, so the existing rule doesn't catch them either.
Never include internal system identifiers of any kind in your responses, documents, or display messages. This prohibition covers all VFS resource IDs (e.g. matart_, matdoc_, matnt_, mat_), template IDs (e.g. tpl_), thread IDs, contact IDs, or any other alphanumeric identifier generated by the platform…
Never narrate your internal working state to the lawyer. Phrases like "I have the block IDs", "I have the current positions", "I'll issue the edit with the precise text match", "I'll propose tracked-change edits", "I have the current block hash" describe how your tools work, not what you're doing for the lawyer. Talk in domain terms: "I've proposed expanding clause 12 — return of property — with four sub-clauses" — not "I have the current block IDs and will issue an edit". This applies equally to short transitional phrases ("Now I have everything I need…") — drop them; just produce the work.
Optional companion bullet: explicitly call out the internal vocabulary (block_id, block_hash, content_version_id, diff_suggestion_id, "tracked-change edits") and require human-readable section labels in user-facing prose.
Section 8 enumerates do's and best practices. It says nothing about what the tool cannot do. The agent reached the limits through ten failed turns (T4–T17) and produced its own summary in T17:
trace 17 · the agent's own summary: Can do: find & replace text, insert new blocks, strip formatting. Cannot: apply inline marks (bold/italic/underline), apply font, font size, change indentation, convert list types.
That summary belongs in the prompt.
The edit tool on documents:// operates on text content within existing blocks. It can:
- Insert new text or new paragraph/heading blocks before or after a target block.
- Replace text strings within a block.
- Delete text or whole blocks.
- Refine or withdraw your own previously proposed tracked-change edits.
The edit tool cannot:
- Apply inline formatting marks (bold, italic, underline, strikethrough, font family, font size, text colour) to existing text.
- Convert a paragraph to a heading or change a heading's level.
- Change paragraph alignment, indentation, or spacing.
- Convert between bulleted and numbered lists, or wrap existing paragraphs into a list.
- Insert page breaks, horizontal rules, tables, or other block-level structural nodes into existing documents (those can only land on create for a brand-new document).
- Accept block_attrs, marks, attrs, addedMarks, or any other formatting field; those parameters do not exist in the edit schema and will be silently dropped.
If the lawyer asks for any of the "cannot" cases on an existing document, do not wrap the text in markdown: the editor will render the asterisks literally. Instead respond directly:
"The edit tool can't apply [bold/italic/the heading style/alignment] directly. Highlight the text in the editor and use [Cmd+B / Cmd+I / the heading dropdown / the alignment buttons]. I can help with text content changes."
Replacing text with itself to "strip" formatting is also unsafe — the replacement carries no marks, so the typeface and other intentional styling are dropped along with the formatting you wanted to remove. Tell the lawyer to clear the formatting manually in the editor instead.
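
Because unsupported formatting fields are dropped silently, a pre-flight check could surface them before the call goes out. The sketch below is illustrative only: the field names come from the "cannot" list above, but the command shape (plain dicts with an "op" key) and both function names are assumptions, not the real edit schema.

```python
from typing import Any, Dict, List

# Field names taken from the "cannot" list; the command shape is hypothetical.
UNSUPPORTED_FIELDS = ("block_attrs", "marks", "attrs", "addedMarks")


def find_unsupported_fields(command: Dict[str, Any]) -> List[str]:
    """Return the formatting fields in a command that the schema would drop."""
    return [name for name in UNSUPPORTED_FIELDS if name in command]


def check_commands(commands: List[Dict[str, Any]]) -> List[str]:
    """Produce one warning per command that carries unsupported fields."""
    warnings = []
    for index, command in enumerate(commands):
        bad = find_unsupported_fields(command)
        if bad:
            warnings.append(
                f"command {index}: unsupported field(s) {', '.join(bad)}"
            )
    return warnings
```

A non-empty result is a signal to fall back to the manual-formatting response rather than send the edit.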
Single insert collapses ten failed turns into a one-turn explanation.
v44 line 503 already routes "Draft a letter" to documents://. The agent still routed "Draft a confidential settlement letter from me to Andrew Larkin at Larkin & Webb LLP…" to email in T20:
trace 20"This is a formal settlement letter to opposing counsel — I'll draft it as an email."
The agent's reasoning overrode the noun ("letter") with the recipient profile ("opposing counsel"). The rule needs a tie-breaker.
If you are unsure which the lawyer wants, ask them to clarify. Confusing an email for a document — or vice versa — is a serious error.
The choice is determined by the noun the lawyer used, not by who the recipient is or how the content might eventually be delivered.
- "Letter" (demand, settlement, opinion, engagement, advice, without-prejudice): always documents://, regardless of recipient. A letter "to opposing counsel" is still a letter; the lawyer copies it into their email client themselves.
- "Email", "reply", "send": emails://. Signal is email-specific language ("email", "reply to", "respond to the thread").
- "Note", "note to file", "remind me": notes://.
If both signals are present ("draft an email letter") or neither is ("write something to opposing counsel"), ask the lawyer to clarify. Confusing an email for a document, or vice versa, is a serious error.
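
The tie-breaker above is a prompt rule, but its logic can be sketched as a small keyword router, which may be useful as an eval-harness check. The keyword lists mirror the rule; the function name and the regex matching are assumptions, and real requests would need broader coverage.

```python
import re
from typing import Optional

# Keyword lists mirror the proposed rule; names and matching are illustrative.
SIGNALS = {
    "documents://": [r"\bletter\b"],
    "emails://": [r"\bemail\b", r"\breply\b", r"\bsend\b", r"\brespond to the thread\b"],
    "notes://": [r"\bnote\b", r"\bremind me\b"],
}


def route_by_noun(request: str) -> Optional[str]:
    """Return the target scheme, or None when the lawyer should be asked."""
    text = request.lower()
    matched = [
        target
        for target, patterns in SIGNALS.items()
        if any(re.search(pattern, text) for pattern in patterns)
    ]
    # Exactly one kind of signal routes; zero or several means "ask".
    return matched[0] if len(matched) == 1 else None
```

Note that the recipient never enters the decision: the T20 request routes to documents:// because it contains "letter" and no email-specific wording.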
Every user message in the session arrived shaped like this:
2026-05-11T17:21:32.163916: how about font size?
…because the Langfuse user-message template (agent.py:1024-1026) is literally:
[1] user: "{{currentTimestamp}}: {{userMessage}}"
There's no prompt instruction anywhere about what that prefix means. The agent treated it as part of the user's message — sometimes echoed it back (T14), sometimes asked the user about it (T12).
Before: {{currentTimestamp}}: {{userMessage}}
After: <currentTimestamp>{{currentTimestamp}}</currentTimestamp> {{userMessage}}
Python fallback at agents/chat/src/chat/agent/agent.py:1066-1072 needs the matching update.
Each user message arrives with a <currentTimestamp> tag prefixed to it. This is metadata supplied by the platform indicating the moment the lawyer sent the message — useful when reasoning about freshness of legal information, but never part of the lawyer's actual request. Never echo, quote, mention, ask about, or include this tag in your response. The lawyer cannot see it and assumes it doesn't exist.
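
For the Python fallback and for anything that reads stored messages, the new shape and a tolerant parser might look like the sketch below. Both function names are hypothetical, and the legacy pattern assumes the ISO-style timestamp seen in the session; it is not the code in agent.py.

```python
import re
from typing import Optional, Tuple

# New tagged shape and the legacy "timestamp: message" shape.
TAGGED = re.compile(
    r"^<currentTimestamp>(.*?)</currentTimestamp> ?(.*)$", re.DOTALL
)
LEGACY = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?): (.*)$", re.DOTALL
)


def render_user_message(timestamp: str, message: str) -> str:
    """Build the user message in the proposed tagged shape."""
    return f"<currentTimestamp>{timestamp}</currentTimestamp> {message}"


def split_user_message(raw: str) -> Tuple[Optional[str], str]:
    """Separate timestamp metadata from the lawyer's text, in either shape."""
    for pattern in (TAGGED, LEGACY):
        match = pattern.match(raw)
        if match:
            return match.group(1), match.group(2)
    # No recognised prefix: treat the whole string as the message.
    return None, raw
```

Accepting both shapes lets readers of stored messages keep working across the template change.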
T2 batched 43 placeholder fills into one edit call. The call took 115s, and the user saw nothing change until it completed. The current quick rule actively pushes towards one big batch (correct for the sidebar, which then shows one revision, but bad for felt latency).
Surface progress for long batches. When you are about to issue an edit call with more than ~10 commands (e.g. populating a precedent's placeholders), send a user_display_message first that previews the scale: "Filling in 43 placeholders across the agreement…", so the lawyer can anticipate the wait. Do not split the batch into multiple edit calls just for UX; correctness / single-revision behaviour matters more than card timing.
Partial measure. The real fix is client-side — render the draft card on the first command, not on completion. That's the RCA fix #6 (streaming runtime).
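
The pre-warning rule reduces to a threshold check on the batch size. The sketch below is one possible reading: the cut-off of 10 comes from the "~10" in the rule, while the function name, parameters, and message wording are assumptions.

```python
from typing import Optional, Sequence

# The rule says "~10"; the exact cut-off is a judgment call.
PREVIEW_THRESHOLD = 10


def batch_preview(
    commands: Sequence[object],
    what: str = "placeholders",
    where: str = "the agreement",
) -> Optional[str]:
    """Return a preview message for long batches, or None for short ones."""
    count = len(commands)
    if count <= PREVIEW_THRESHOLD:
        return None
    return f"Filling in {count} {what} across {where}…"
```

The batch itself stays in a single edit call; only the message is sent ahead of it.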
T19 added a new section to the populated copy on the matter (functionally correct) but the test sheet flagged it. The current prompt doesn't explicitly explain that precedents are firm-level templates and you edit the populated documents:// copy instead.
Precedents vs documents — what gets edited where.
Precedents (precedents://) are firm-level templates owned by the firm, not by any one matter. They are read-only from the matter's perspective. To "populate a precedent" or "use a precedent for this matter":
1. create a new documents:// on the matter with sourcePrecedentPath pointing at the precedent. This produces a verbatim copy of the precedent's content on the matter.
2. edit the populated documents:// copy on the matter, never the precedent itself.
When the lawyer says "add a new section to the precedent", "rewrite section 11 of the precedent", or similar, they almost always mean the copy on this matter. If there is exactly one copy of that precedent on the matter, edit it directly. If there are multiple copies or the intent is ambiguous, ask which one. Do not propose changes to the firm-level precedent: you don't have that permission and the lawyer cannot review them in this matter's sidebar.
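
The "which copy gets edited" decision can be sketched as a small function over the copies found on the matter. The function name, the return shape, and the example paths are hypothetical; the zero-copy branch is an assumption, since the rule above only covers one copy and several copies.

```python
from typing import List, Tuple


def choose_edit_target(copies: List[str]) -> Tuple[str, str]:
    """Decide what to do given the documents:// copies of one precedent.

    Returns ("edit", path) when exactly one copy exists, else ("ask", question).
    """
    if len(copies) == 1:
        return ("edit", copies[0])
    if not copies:
        # Assumption: with no copy on the matter, ask before creating one.
        return ("ask", "No copy of this precedent is on the matter yet. Create one?")
    return ("ask", "Which copy should I edit? " + ", ".join(copies))
```

In no branch is the precedents:// path itself returned as an edit target.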
| # | Edit | Location in v44 | Lines added | Tests affected |
|---|---|---|---|---|
| 1 | Forbid narrating internal working state | Section 2.6: append bullet | 4–6 | 03, 14, 20, 21 |
| 2 | List edit tool's "cannot" capabilities | Section 8: new sub-section 8.1 | 15–20 | 04, 05, 06, 07, 09, 10, 11, 12 |
| 3 | Tighten "letter → document" rule | Section 7.1: replace "If unsure" line | 8–10 | 17 |
| 4 | Make timestamp metadata, not content | Langfuse user template + Section 3.1 | 1 + 1 | 09, 11 |
| 5 | Pre-warn for long edit batches | Section 8 quick rules: add bullet | 1 | 01, 13, 18, 22 |
| 6 | Clarify precedent → copy → edit flow | Section 7: new 7.1.1 | 8–10 | 14 |
Effort. All six are insertions in the Langfuse system message + one supporting Python change (Edit 4B). One Langfuse save per edit, one small PR for the template-shape change. Could ship in a single prompt version bump.
What this does not solve. Anything that needs the schema to grow (bold, italic, font size, heading level, alignment, indent, list conversion). Those still need code in apis/lawrence-api/.../matter-document-edit-schemas.ts + yjs-apply.ts + diff-suggestion-builder.ts + matter_document_content.py — fixes #1–#5 in the RCA's fix-sequence table.
Prompt edits 1–6 above are the un-blockers for the code work: they reduce the failure surface from "agent confidently does the wrong thing" to "feature missing, lawyer routes around manually" — a much better failure mode while the schema work is in flight.
The edit tool's "cannot" list (Edit 2) could live in one of two places: the Langfuse system prompt (verbatim, as proposed) or the tool description served by matter_document_content.py. The tool description is closer to the tool call and less likely to be forgotten under prompt pressure. The system prompt is more visible during prompt review. Probably both, with the tool description being the authoritative one.
The user-message template change (Edit 4) touches the eval harness, the audit_lawrence2_prompt_stability.py script, and anything that reads stored user-message contents. Worth a one-line callout in #ai-platform on Slack when the template version ships.
Does the drafting skill already cover the precedent flow (Edit 6)? v44 line 568 routes the agent to "load the drafting skill" for the create-vs-precedent workflow. If the skill already explains "precedents are read-only, edit the copy", the prompt edit can be terser and defer to the skill.
doppler run --project chat-agent --config local_personal -- uv run --quiet /tmp/fetch_prompts.py
The script hits GET /api/public/v2/prompts/{url-encoded-name}?label={label} on the Langfuse host and saves the results to /tmp/lf-prompts/ (markdown + raw JSON for each name × label combination).
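
The name must be URL-encoded because agent/lawrence-2 contains a slash. A minimal sketch of the URL and output-name construction follows; the host is a placeholder, both helper names are hypothetical, and the file-naming pattern is inferred from the saved file names rather than taken from fetch_prompts.py.

```python
from urllib.parse import quote, urlencode


def build_prompt_url(host: str, name: str, label: str) -> str:
    """Build the prompt-fetch URL, percent-encoding the slash in the name."""
    encoded = quote(name, safe="")  # safe="" so "/" becomes %2F
    query = urlencode({"label": label})
    return f"{host.rstrip('/')}/api/public/v2/prompts/{encoded}?{query}"


def output_stem(name: str, label: str) -> str:
    """Inferred naming pattern for saved files (assumption)."""
    return f"{name.replace('/', '_')}__{label}"
```

For agent/lawrence-2 with label latest, the stem comes out as agent_lawrence-2__latest.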
- /tmp/lf-prompts/agent_lawrence-2__latest.md · v44, latest (one citation patch ahead of the tested v43)
- /tmp/lf-prompts/agent_lawrence-2__production.md · v41, currently in prod
- agents/chat/src/chat/agent/agent.py:301, 311-341, 1024-1072 · prompt-loading + user-message template