Prompt review · companion to the RCA · sibling: edit-DSL architecture

Lawrence-2 prompt — concrete edits

Six surgical inserts into the Langfuse agent/lawrence-2 prompt, each tied to a specific failure observed in session aithrd_ayjzonowyr6zidr7.

Date · 2026-05-11
Prompt · agent/lawrence-2
Tested · v43 (development)
Production · v41
Latest · v44
Author · Adolfo

1 · Stop narrating internal state
2 · Document edit-tool limits
3 · "Letter" → document
4 · Timestamp metadata
5 · Preview long batches
6 · Precedent → copy

Summary

v44 (latest, one citation patch ahead of the tested v43) already has strong rules against leaking mat_…-style IDs. Six new inserts unblock the eight prompt-fixable failures from the test session, including the timestamp echo, internals leakage, the edit tool's undocumented limits, and the "draft a letter" misroute.

Environment → label → version

Doppler config	`LANGFUSE_PROMPT_LABEL`	Resolves to	Notes
`local`, `local_personal`	`latest`	v44	Bleeding edge; every save lands here.
`dev`	`development`	v43 ← tested	What the session ran against. Confirmed by 79 mentions of `promptVersion: 43` across the trace metadata.
`ci`	(unset)	v41	Falls back to `"production"` via `agent.py:320-322`.
`prd`	`production`	v41	Pre-edit-tool. Section 8 reads "Content editing not yet available".

production

v41 · 2026-05-07

Pre-edit-tool. Section 8 reads "Content editing not yet available". Untouched by this RCA.

development · ← tested

v43 · 2026-05-11

First version exposing the edit tool. This is what session aithrd_ayjzonowyr6zidr7 ran against on the dev environment. All six edits below land as a new version, then get the development label, then production when ready.

latest

v44 · 2026-05-11

One citation-related patch ahead of v43 (adds passage.text to citation markers). None of the prompt edits below touch areas that differ from v43.
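The `ci` row in the table above depends on a fallback when `LANGFUSE_PROMPT_LABEL` is unset. The code at `agent.py:320-322` is not reproduced in this review, so the following is only a minimal sketch of that behaviour; the function name `resolve_prompt_label` and the treatment of blank values are assumptions.

```python
import os
from typing import Mapping, Optional

DEFAULT_LABEL = "production"  # fallback label named in the table above


def resolve_prompt_label(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Langfuse label to fetch (hypothetical helper, not the real agent.py code)."""
    env = os.environ if env is None else env
    # Assumption: an empty or whitespace-only value is treated the same as unset.
    label = (env.get("LANGFUSE_PROMPT_LABEL") or "").strip()
    return label or DEFAULT_LABEL
```

Under this sketch, an environment without the variable resolves to `production` (v41 today), which is why `ci` and `prd` see the same prompt.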

Forbid narrating internal working state

Tests 03, 14, 20, 21

v44 line 82 already prohibits leaking ID strings (mat_, matart_, tpl_ etc.). But the leaked phrases in our session weren't IDs — they were narration about the agent's internal working state:

trace 3
"I have the current block IDs and positions. Now I'll propose the expanded return of property language as tracked-change edits."

trace 19
"I now have all the block IDs and positions I need. I can see: Section 16 heading: block 40ff5370, Section 16.1: block c906ea97, Section 17 'Miscellaneous' heading: block 7a928c6b…"

Block IDs are short hex without a typed prefix, so the existing rule doesn't catch them either.

v44 · Section 2.6 (line 82)

Never include internal system identifiers of any kind in your responses, documents, or display messages. This prohibition covers all VFS resource IDs (e.g. matart_, matdoc_, matnt_, mat_), template IDs (e.g. tpl_), thread IDs, contact IDs, or any other alphanumeric identifier generated by the platform…

Proposed · append new bullet to Section 2.6

Never narrate your internal working state to the lawyer. Phrases like "I have the block IDs", "I have the current positions", "I'll issue the edit with the precise text match", "I'll propose tracked-change edits", "I have the current block hash" describe how your tools work, not what you're doing for the lawyer. Talk in domain terms: "I've proposed expanding clause 12 — return of property — with four sub-clauses" — not "I have the current block IDs and will issue an edit". This applies equally to short transitional phrases ("Now I have everything I need…") — drop them; just produce the work.

Optional companion bullet: explicitly call out the internal vocabulary (block_id, block_hash, content_version_id, diff_suggestion_id, "tracked-change edits") and require human-readable section labels in user-facing prose.
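A prompt rule like this is easier to keep honest with a regression check over agent responses. The sketch below is one possible eval-side detector, not part of the existing harness: the function name and the phrase list are assumptions drawn from the traces quoted above, and the bare-hex pattern assumes block IDs are eight lowercase hex characters, as in the examples.

```python
import re

# Typed platform prefixes already covered by the existing Section 2.6 rule.
PREFIXED_ID = re.compile(r"\b(?:matart|matdoc|matnt|mat|tpl)_\w+")
# Bare 8-character hex tokens containing at least one digit, e.g. block 40ff5370.
BARE_HEX_ID = re.compile(r"\b(?=[0-9a-f]*\d)[0-9a-f]{8}\b")
# Internal vocabulary and narration phrases quoted in the proposed bullets.
NARRATION = re.compile(
    r"block ids?|block[_ ]hash|content_version_id|diff_suggestion_id|"
    r"tracked-change edits|precise text match|current positions",
    re.IGNORECASE,
)


def find_internal_leaks(text: str) -> list[str]:
    """Return every substring that looks like an internal ID or tool narration."""
    hits: list[str] = []
    for pattern in (PREFIXED_ID, BARE_HEX_ID, NARRATION):
        hits.extend(match.group(0) for match in pattern.finditer(text))
    return hits
```

The phrase list will miss paraphrases, so a check like this complements the prompt rule rather than replacing it.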

Tell the agent what the edit tool cannot do

Tests 04, 05, 06, 07, 09, 10, 11, 12

Section 8 enumerates do's and best practices. It says nothing about what the tool cannot do. The agent reached the limits through ten failed turns (T4–T17) and produced its own summary in T17:

trace 17 · the agent's own summary
Can do: find & replace text, insert new blocks, strip formatting. Cannot: apply inline marks (bold/italic/underline), apply font, font size, change indentation, convert list types.

That summary belongs in the prompt.

Proposed · insert as Section 8.1 after the existing "Quick rules" list

The edit tool on documents:// operates on text content within existing blocks. It can:

Insert new text or new paragraph/heading blocks before or after a target block.

Replace text strings within a block.

Delete text or whole blocks.

Refine or withdraw your own previously proposed tracked-change edits.

The edit tool cannot:

Apply inline formatting marks (bold, italic, underline, strikethrough, font family, font size, text colour) to existing text.

Convert a paragraph to a heading or change a heading's level.

Change paragraph alignment, indentation, or spacing.

Convert between bulleted and numbered lists, or wrap existing paragraphs into a list.

Insert page breaks, horizontal rules, tables, or other block-level structural nodes into existing documents (those can only land on create for a brand-new document).

Accept block_attrs, marks, attrs, addedMarks, or any other formatting field — those parameters do not exist in the edit schema and will be silently dropped.

If the lawyer asks for any of the "cannot" cases on an existing document, do not wrap the text in markdown — the editor will render the asterisks literally. Instead respond directly:

"The edit tool can't apply [bold/italic/the heading style/alignment] directly. Highlight the text in the editor and use [Cmd+B / Cmd+I / the heading dropdown / the alignment buttons]. I can help with text content changes."

Replacing text with itself to "strip" formatting is also unsafe — the replacement carries no marks, so the typeface and other intentional styling are dropped along with the formatting you wanted to remove. Tell the lawyer to clear the formatting manually in the editor instead.

This single insert collapses the ten failed turns into a one-turn explanation.
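Because the unsupported fields are silently dropped, a pre-flight check on the caller side could surface them before the edit call goes out. The sketch below assumes an edit command is a plain nested dict; that shape and the helper name are hypothetical, and only the four field names come from the list above.

```python
# Field names the proposed Section 8.1 says do not exist in the edit schema.
UNSUPPORTED_FIELDS = {"block_attrs", "marks", "attrs", "addedMarks"}


def unsupported_fields(command: dict) -> list[str]:
    """Return the formatting keys found anywhere in a (hypothetical) edit command."""
    found = set()
    stack = [command]
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            for key, value in node.items():
                if key in UNSUPPORTED_FIELDS:
                    found.add(key)
                stack.append(value)  # keep walking nested values
        elif isinstance(node, list):
            stack.extend(node)
    return sorted(found)
```

A non-empty result would be a signal to reply with the "can't apply" message instead of issuing the edit.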

Tighten the "letter → document" rule

v44 line 503 already routes "Draft a letter" to documents://. The agent still routed "Draft a confidential settlement letter from me to Andrew Larkin at Larkin & Webb LLP…" to email in T20:

trace 20
"This is a formal settlement letter to opposing counsel — I'll draft it as an email."

The agent's reasoning overrode the noun ("letter") with the recipient profile ("opposing counsel"). The rule needs a tie-breaker.

v44 · Section 7.1 (line 507)

If you are unsure which the lawyer wants, ask them to clarify. Confusing an email for a document — or vice versa — is a serious error.

Proposed · replace with a tie-breaker block

The choice is determined by the noun the lawyer used, not by who the recipient is or how the content might eventually be delivered.

"Letter" — demand, settlement, opinion, engagement, advice, without-prejudice — always documents://, regardless of recipient. A letter "to opposing counsel" is still a letter; the lawyer copies it into their email client themselves.

"Email", "reply", "send" — emails://. Signal is email-specific language ("email", "reply to", "respond to the thread").

"Note", "note to file", "remind me" — notes://.

If both signals are present ("draft an email letter") or neither is ("write something to opposing counsel"), ask the lawyer to clarify. Confusing an email for a document — or vice versa — is a serious error.
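The tie-breaker can be read as a small decision procedure: exactly one signal family decides the protocol, and anything else means asking. The sketch below restates that logic with naive keyword matching. It is an illustration of the rule for eval or review purposes, not how the agent routes requests, and the keyword patterns and function name are assumptions.

```python
import re

# Signal words per protocol, taken from the proposed tie-breaker block.
SIGNALS = {
    "documents://": [r"\bletter\b"],
    "emails://": [r"\bemail\b", r"\breply\b", r"\bsend\b", r"\brespond to the thread\b"],
    "notes://": [r"\bnote\b", r"\bremind me\b"],
}


def route_by_noun(request: str) -> str:
    """Return the protocol implied by the lawyer's wording, or "ask" if unclear."""
    text = request.lower()
    matched = [
        scheme
        for scheme, patterns in SIGNALS.items()
        if any(re.search(pattern, text) for pattern in patterns)
    ]
    # Exactly one signal family decides; zero or several means asking the lawyer.
    return matched[0] if len(matched) == 1 else "ask"
```

Note that the recipient ("opposing counsel") never appears in the signal table, which is the point of the rule.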

Make the timestamp prefix non-content

Tests 09, 11 + all turns (wasted context)

Every user message in the session arrived shaped like this:

2026-05-11T17:21:32.163916: how about font size?

…because the Langfuse user-message template (agent.py:1024-1026) is literally:

[1] user: "{{currentTimestamp}}: {{userMessage}}"

There's no prompt instruction anywhere about what that prefix means. The agent treated it as part of the user's message — sometimes echoed it back (T14), sometimes asked the user about it (T12).

Change A · the Langfuse user-message template

{{currentTimestamp}}: {{userMessage}}
// becomes →
<currentTimestamp>{{currentTimestamp}}</currentTimestamp>
{{userMessage}}

Python fallback at agents/chat/src/chat/agent/agent.py:1066-1072 needs the matching update.
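The fallback code at those lines is not shown in this review, so the following is only a sketch of what the matching update might look like; `render_user_message` is a hypothetical name.

```python
def render_user_message(current_timestamp: str, user_message: str) -> str:
    """Render the user turn with the timestamp as a tagged metadata line.

    Hypothetical stand-in for the Python fallback; mirrors the proposed
    Langfuse template so both code paths produce the same shape.
    """
    return f"<currentTimestamp>{current_timestamp}</currentTimestamp>\n{user_message}"
```

Keeping the tag on its own line means the lawyer's text starts at the beginning of a line, with no prefix fused onto it.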

Change B · system-prompt rule (Section 3.1, after the em-dash rule)

Each user message arrives with a <currentTimestamp> tag prefixed to it. This is metadata supplied by the platform indicating the moment the lawyer sent the message — useful when reasoning about freshness of legal information, but never part of the lawyer's actual request. Never echo, quote, mention, ask about, or include this tag in your response. The lawyer cannot see it and assumes it doesn't exist.

Pre-warn for long edit batches

Tests 01, 13, 18, 22 · perceived latency

T2 batched 43 placeholder fills in one edit call. The call took 115 s, and the user saw nothing change until it completed. The current quick rule actively pushes towards one big batch, which is correct for the sidebar (one revision) but bad for perceived latency.

Proposed · add a bullet under Section 8 quick rules

Surface progress for long batches. When you are about to issue an edit call with more than ~10 commands (e.g. populating a precedent's placeholders), send a user_display_message first that previews the scale: "Filling in 43 placeholders across the agreement…" — the lawyer can then anticipate the wait. Do not split the batch into multiple edit calls just for UX; correctness / single-revision behaviour matters more than card timing.

Partial measure. The real fix is client-side: render the draft card on the first command, not on completion. That is fix #6 in the RCA (streaming runtime).
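Whether the agent follows the preview rule can be checked from a trace. The sketch below flags long edit calls that were not announced first. The event shape (a list of dicts with a `tool` key and an optional `commands` list) is an assumption about how a trace might be flattened, not the real trace schema, and the threshold of 10 mirrors the "~10 commands" guidance.

```python
def unannounced_long_batches(events: list[dict], threshold: int = 10) -> list[int]:
    """Return indexes of long edit calls with no preceding user_display_message."""
    flagged = []
    announced = False
    for index, event in enumerate(events):
        if event["tool"] == "user_display_message":
            announced = True
        elif event["tool"] == "edit":
            if len(event.get("commands", [])) > threshold and not announced:
                flagged.append(index)
            announced = False  # an announcement covers only the next edit call
    return flagged
```

For a session like T2, a single 43-command edit with no preview would be flagged at its index.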

Clarify precedent → copy → edit flow

T19 added a new section to the populated copy on the matter (functionally correct) but the test sheet flagged it. The current prompt doesn't explicitly explain that precedents are firm-level templates and you edit the populated documents:// copy instead.

Proposed · insert as Section 7.1.1 after the protocol table

Precedents vs documents — what gets edited where.

Precedents (precedents://) are firm-level templates owned by the firm, not by any one matter. They are read-only from the matter's perspective. To "populate a precedent" or "use a precedent for this matter":

create a new documents:// on the matter with sourcePrecedentPath pointing at the precedent. This produces a verbatim copy of the precedent's content on the matter.

edit the populated documents:// copy on the matter — never the precedent itself.

When the lawyer says "add a new section to the precedent", "rewrite section 11 of the precedent", or similar, they almost always mean the copy on this matter. If there is exactly one copy of that precedent on the matter, edit it directly. If there are multiple copies or the intent is ambiguous, ask which one. Do not propose changes to the firm-level precedent — you don't have that permission and the lawyer cannot review them in this matter's sidebar.
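The "exactly one copy" rule is simple enough to state as code. The sketch below assumes each matter document is a dict carrying a `path` and the `sourcePrecedentPath` it was created from; the `path` key, the example paths, and the helper name are hypothetical.

```python
from typing import Optional


def resolve_edit_target(precedent_path: str, matter_documents: list[dict]) -> Optional[str]:
    """Return the single populated copy to edit, or None when the agent should ask.

    None covers both "no copy exists yet" and "several copies exist"; the
    precedent path itself is never returned, since precedents are read-only.
    """
    copies = [
        doc["path"]
        for doc in matter_documents
        if doc.get("sourcePrecedentPath") == precedent_path
    ]
    return copies[0] if len(copies) == 1 else None
```

In this reading, "rewrite section 11 of the precedent" resolves to the one documents:// copy when it is unambiguous, and to a clarifying question otherwise.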

Summary

#	Edit	Location in v44	Lines added	Tests affected
1	Forbid narrating internal working state	Section 2.6 — append bullet	4–6	03, 14, 20, 21
2	List edit tool's "cannot" capabilities	Section 8 — new sub-section 8.1	15–20	04, 05, 06, 07, 09, 10, 11, 12
3	Tighten "letter → document" rule	Section 7.1 — replace "If unsure" line	8–10	17
4	Make timestamp metadata, not content	Langfuse user template + Section 3.1	1 + 1	09, 11
5	Pre-warn for long edit batches	Section 8 quick rules — add bullet	1	01, 13, 18, 22
6	Clarify precedent → copy → edit flow	Section 7 — new 7.1.1	8–10	14

Effort. All six are edits to the Langfuse system message, plus the user-message template change and its supporting Python fallback update (Edit 4, Change A). One Langfuse save per edit, one small PR for the template-shape change. Could ship in a single prompt version bump.

What this does not solve. Anything that needs the schema to grow (bold, italic, font size, heading level, alignment, indent, list conversion). Those still need code in apis/lawrence-api/.../matter-document-edit-schemas.ts + yjs-apply.ts + diff-suggestion-builder.ts + matter_document_content.py — fixes #1–#5 in the RCA's fix-sequence table.

Prompt edits 1–6 above are the un-blockers for the code work: they reduce the failure surface from "agent confidently does the wrong thing" to "feature missing, lawyer routes around manually" — a much better failure mode while the schema work is in flight.

Open questions

Where should the "cannot" list (Edit 2) live?

Two options: the Langfuse system prompt (verbatim, as proposed) or the tool description served by matter_document_content.py. The tool description is closer to the tool call and less likely to be forgotten under prompt pressure. The system prompt is more visible during prompt review. Probably both, with the tool description being the authoritative one.

Edit 4 changes the user-message template — what's downstream?

Eval harness, the audit_lawrence2_prompt_stability.py script, and anything that reads stored user-message contents. Worth a one-line callout in #ai-platform on Slack when the template version ships.
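Anything that reads stored user messages will see both shapes once the template ships: older rows with the `timestamp: message` prefix and newer rows with the tag. The sketch below is one way a downstream reader could tolerate both; the function name is hypothetical, and the legacy pattern assumes the ISO-style timestamp seen in the session.

```python
import re
from typing import Optional, Tuple

# New shape: tag on its own line, then the lawyer's text.
TAGGED = re.compile(r"\A<currentTimestamp>(.*?)</currentTimestamp>\n(.*)\Z", re.DOTALL)
# Legacy shape: "2026-05-11T17:21:32.163916: message".
LEGACY = re.compile(
    r"\A(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?): (.*)\Z", re.DOTALL
)


def split_user_message(stored: str) -> Tuple[Optional[str], str]:
    """Return (timestamp, message) for either template shape, or (None, stored)."""
    for pattern in (TAGGED, LEGACY):
        match = pattern.match(stored)
        if match:
            return match.group(1), match.group(2)
    return None, stored
```

Falling through to `(None, stored)` keeps the reader safe on messages that never carried a timestamp.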

Does the drafting skill already cover the precedent flow (Edit 6)?

v44 line 568 routes the agent to "load the drafting skill" for the create-vs-precedent workflow. If the skill already explains "precedents are read-only, edit the copy", the prompt edit can be terser and defer to the skill.

Appendix

How the prompts were pulled

doppler run --project chat-agent --config local_personal -- \
  uv run --quiet /tmp/fetch_prompts.py

Hits GET /api/public/v2/prompts/{url-encoded-name}?label={label} on the Langfuse host. Saved to /tmp/lf-prompts/ (markdown + raw JSON for each name × label combination).
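The contents of `/tmp/fetch_prompts.py` are not included in this review. As a rough sketch of the request it describes, the code below builds the URL and issues the GET with the standard library. The host value, the key arguments, and both function names are assumptions; HTTP Basic auth with the public key as username and the secret key as password is how the Langfuse public API is normally called.

```python
import base64
import json
import urllib.request
from urllib.parse import quote, urlencode


def build_prompt_url(host: str, name: str, label: str) -> str:
    """Build the prompt endpoint URL, percent-encoding the slash in the name."""
    encoded_name = quote(name, safe="")  # "agent/lawrence-2" -> "agent%2Flawrence-2"
    return f"{host.rstrip('/')}/api/public/v2/prompts/{encoded_name}?{urlencode({'label': label})}"


def fetch_prompt(host: str, name: str, label: str, public_key: str, secret_key: str) -> dict:
    """Fetch one name x label combination as parsed JSON (hypothetical helper)."""
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    request = urllib.request.Request(
        build_prompt_url(host, name, label),
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `fetch_prompt` once per label (`latest`, `development`, `production`) would yield the name × label combinations mentioned above.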

Files referenced

The RCA
/tmp/lf-prompts/agent_lawrence-2__latest.md · v44, latest (one citation patch ahead of the tested v43)
/tmp/lf-prompts/agent_lawrence-2__production.md · v41, currently in prod
agents/chat/src/chat/agent/agent.py:301, 311-341, 1024-1072 · prompt-loading + user-message template