May 19, 2026

Context Before Action — Why an Agent Must Read the Chat Before It Speaks Into It

A studio runs an autonomous hiring pipeline through an AI agent. The pipeline scans a Telegram channel for candidates, qualifies them, sends initial outreach, tracks responses, and replies again. Each stage is a script; state is held in a JSON file with timestamps. One Tuesday a script reads the JSON, concludes from the last_msg_ts field that 18 candidates have responded, and sends each of them the same templated "thanks" reply. None of them has actually responded to the previous outreach. The agent has thanked strangers for something they never said. To each candidate, the conversation thread now looks exactly like what it is: a script that doesn't read.

What follows is the pattern I call Context Before Action. The rule is: before sending any message into a chat, read the last 20–30 messages of that chat. The agent's internal state files (timestamps, stage counters, funnel positions) are an index, not ground truth. The chat is the ground truth. State files can lie: when another process updates a timestamp, when a stage transition is applied wrongly, when a well-meaning script sets a flag the chat doesn't reflect. Reading the chat itself before acting catches the lie before a send amplifies it.

The mechanism is one paragraph. The agent's heartbeat script, before generating any reply, calls the chat-read function with a context window of 20–30 messages. The model receives those messages as part of the prompt and generates the next reply with full awareness of what was actually said. If the messages don't support the stage the state file claims, the agent does nothing and notifies the operator. The rule is unconditional: no exceptions, no fast-path for cases where the state file seems reliable, no special handling for "we already know what to send".
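
The heartbeat step described above could be sketched roughly as follows. This is a minimal illustration, not the studio's actual code: read_chat, generate_reply, send and notify_operator are hypothetical callables supplied by the caller, the "replied" stage name and the sender labels are assumptions, and messages are assumed to arrive oldest first.

```python
from dataclasses import dataclass
from typing import Callable, List

CONTEXT_WINDOW = 30  # upper end of the 20-30 message window from the text


@dataclass
class Message:
    sender: str  # "agent" or "candidate" (hypothetical labels)
    text: str


def chat_shows_reply(messages: List[Message]) -> bool:
    """True only if the candidate wrote after the agent's most recent message."""
    agent_positions = [i for i, m in enumerate(messages) if m.sender == "agent"]
    if not agent_positions:
        return False  # no outreach visible in the window, so nothing to reply to
    after_outreach = messages[agent_positions[-1] + 1:]
    return any(m.sender == "candidate" for m in after_outreach)


def heartbeat_step(
    chat_id: str,
    claimed_stage: str,
    read_chat: Callable[[str, int], List[Message]],
    generate_reply: Callable[[List[Message]], str],
    send: Callable[[str, str], None],
    notify_operator: Callable[[str], None],
) -> str:
    # Always read the chat first; there is deliberately no fast path.
    messages = read_chat(chat_id, CONTEXT_WINDOW)

    # If the chat does not support what the state file claims, do nothing
    # and hand the discrepancy to a human.
    if claimed_stage == "replied" and not chat_shows_reply(messages):
        notify_operator(f"{chat_id}: state says 'replied', chat shows no reply")
        return "held"

    # The model sees the real messages, not just the stage label.
    send(chat_id, generate_reply(messages))
    return "sent"
```

Injecting the four callables keeps the sketch independent of any particular messaging client, and makes the "held" path easy to exercise with fake chats.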

The hiring incident as the canonical failure

The case study's hiring_reply_scan.py script trusted the JSON state file. Its job was to detect candidates who had replied to the agent's first outreach and pass them to the next stage. The script compared the candidate's last_msg_ts to the agent's last_sent_ts. If last_msg_ts > last_sent_ts, the script concluded the candidate had responded.

The bug: last_msg_ts was being updated by an upstream scanning script that wasn't actually checking who sent the message. A candidate's older message (from before the outreach) had its timestamp re-stamped during a re-scan. The comparison said "responded"; the chat said no such thing.
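
To make the drift concrete, here is a small sketch contrasting the timestamp comparison with a check against the chat itself. The field names last_msg_ts and last_sent_ts come from the incident; the message dictionaries, the from_candidate key and all the numbers are hypothetical, chosen only to reproduce the re-stamping.

```python
from typing import Dict, List


def responded_by_state(state: Dict[str, float]) -> bool:
    # The comparison described above: two fields from the JSON state file.
    return state["last_msg_ts"] > state["last_sent_ts"]


def responded_by_chat(messages: List[dict], last_sent_ts: float) -> bool:
    # Grounded version: a reply must come from the candidate AND be newer
    # than the agent's outreach, judged by the message's own timestamp.
    return any(
        m["from_candidate"] and m["ts"] > last_sent_ts for m in messages
    )


# Hypothetical data reproducing the drift: the candidate's only message
# predates the outreach, but a re-scan re-stamped last_msg_ts.
state = {"last_sent_ts": 1_000.0, "last_msg_ts": 1_500.0}
chat = [
    {"from_candidate": True, "ts": 400.0, "text": "older message"},
    {"from_candidate": False, "ts": 1_000.0, "text": "initial outreach"},
]

print(responded_by_state(state))                       # True
print(responded_by_chat(chat, state["last_sent_ts"]))  # False
```

The two functions disagree on exactly the kind of record that produced the 18 sends: the index says "responded", the messages say otherwise.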

The next stage was a script that sends a thanks reply. The send went out to 18 candidates. The state file had lied; the chat would have shown the truth; the script never looked at the chat.

The recovery is documented in the case study: a hiring_audit.py script (built reactively); a hiring_delete_extra.py script to retract the 18 messages "for both sides" using the user-account capability (the Bot API can't do that fully); a state rollback in the JSON; a hiring_context_dump.py script to assemble the actual chat history per candidate; and then 82 hand-written personalised replies to the candidates who really had responded. The whole recovery took four hours including the cleanup, and the rule was promoted to HEARTBEAT.md immediately afterwards.

Why state files lie

State files are an index over the chat — a summary of what the chat shows, updated by code, optimised for fast reads. Like any index, they can drift. Reasons they drift:

An upstream script updates a field the downstream script consumes.
A concurrent run of the same script overwrites state mid-cycle.
A field is renamed or repurposed; old code reads it with the old meaning.
A timestamp is updated by a re-scan when no new message exists.
A stage transition is applied to the wrong record because of a mistaken ID match.
The operator hand-edits the file and the change conflicts with the agent's assumptions.

All of these are real. None of them are unusual. State files are useful — they make the pipeline cheap, they let the agent reason quickly about counts and stages — but they cannot be the source of truth for the act of sending a message. The chat is.

The 20–30 message window

The window is calibrated, not arbitrary. Twenty messages is enough to cover a normal back-and-forth between agent and candidate (initial outreach, candidate response, agent acknowledgement, candidate clarification, agent follow-up). Thirty is enough to cover the case where someone tagged the agent into a long thread and the relevant context is several turns back. Beyond thirty, the marginal context starts to dilute the model's attention on what just happened.

The window is also bounded so the read is cheap. Fetching messages from a chat is rate-limited; fetching the last 30 is fast, fetching the last 300 starts to feel expensive on a heartbeat-per-minute cron. The window is right-sized for the actual decision the agent is about to make.
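
A back-of-the-envelope calculation shows why the bound matters on a per-minute heartbeat. It assumes one chat read per heartbeat per chat, which is an assumption for illustration rather than a measurement from the pipeline.

```python
HEARTBEATS_PER_DAY = 24 * 60  # one heartbeat per minute, as in the text


def messages_fetched_per_day(window: int, chats: int = 1) -> int:
    # Assumes one chat read per heartbeat per chat (an assumption, not a measurement).
    return HEARTBEATS_PER_DAY * window * chats


print(messages_fetched_per_day(30))   # 43200
print(messages_fetched_per_day(300))  # 432000
```

The tenfold difference is per chat, so it multiplies again by the number of chats the agent watches.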

The rule generalises beyond messaging

The pattern fits any agent action where:

The agent's internal state is an index over an authoritative external store.
Reading the external store is cheap.
The action is hard to reverse.

A code agent that wants to commit should run git status and git diff against the actual files before committing, not trust a state file describing what it thinks the working tree looks like. An ops agent that wants to apply a config change should read the live config from the cluster before patching, not trust a cached snapshot. A research agent that wants to cite a source should fetch the source's current text before quoting, not trust the version it embedded last week. The shape is the same: the index can drift, the authoritative source is cheap to query, and the action is expensive to reverse, so the rule is to re-read before acting.
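
For the code-agent case, a minimal sketch of "re-read before committing" might look like this. It compares what the agent believes it changed against what git status --porcelain actually reports. The function names are hypothetical, and the parsing is deliberately simplified: it assumes the version 1 porcelain format and ignores renames and quoted paths.

```python
import subprocess
from typing import Callable, List


def git_output(args: List[str]) -> str:
    # Runs git in the current directory and returns its stdout.
    result = subprocess.run(
        ["git", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def changed_paths(run: Callable[[List[str]], str] = git_output) -> List[str]:
    """Paths that git itself reports as changed or untracked."""
    paths = []
    for line in run(["status", "--porcelain"]).splitlines():
        if line:
            # Porcelain v1: two status characters, a space, then the path.
            paths.append(line[3:])
    return paths


def safe_to_commit(
    expected: List[str], run: Callable[[List[str]], str] = git_output
) -> bool:
    # Commit only if the working tree matches the agent's own belief
    # about what it changed; otherwise stop and ask.
    return sorted(changed_paths(run)) == sorted(expected)
```

The run parameter exists so the check can be exercised without a repository, for example by passing a function that returns canned status output.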

Two false economies

Two arguments come up against the rule, both wrong:

"The state file is the source of truth — that's what state files are for." No. State files are a fast index. The source of truth in messaging is the actual sent/received bytes in the chat. The state file's job is to summarise, not to authorise.
"Reading the chat every time is wasteful — most of the time nothing has changed." Yes, most of the time the read is redundant. The cost is small; the cost of being wrong about the one case that did change is large. The rule pays a small recurring tax to eliminate a low-probability high-cost failure mode. The 18-message incident is exactly the case where the rule pays for itself.

When the pattern fits

Any agent that sends messages to humans in chats, as in the hiring case study.
Operations agents that act on shared resources (databases, configs, deployments).
Coding agents that commit, merge, or otherwise modify shared code.
Hiring, sales, support pipelines with stages and templated replies — exactly where the temptation to trust the state file is strongest.

It is weakest where reading the authoritative source is genuinely expensive — a research agent that would have to re-download a multi-megabyte document each time isn't doing context-before-action; it's doing a different cost calculation.

Two failure modes

Read but don't use. The script calls the chat-read function and discards the result. The model never sees the messages. The rule is performed at the API level, not at the reasoning level. Fix: pass the chat history into the model's context as part of the prompt; verify the model's output reflects the read history.
Wrong window size. The agent reads 5 messages and misses the relevant turn that was 12 messages back, or reads 200 and dilutes attention. Fix: calibrate to the dynamics of the chat — for back-and-forth DMs, 20–30; for long-thread forums, 30–50.
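
Both fixes can be sketched together: choose the window by chat type, then put the messages that were read into the prompt itself. The window values are the upper ends of the ranges above; the chat-kind labels, the message dictionary shape and the prompt layout are assumptions made for illustration.

```python
from typing import Dict, List

# Upper ends of the ranges given in the text; the keys are hypothetical labels.
WINDOW_BY_CHAT_KIND = {"dm": 30, "forum": 50}


def build_prompt(
    messages: List[Dict[str, str]], chat_kind: str, instruction: str
) -> str:
    window = WINDOW_BY_CHAT_KIND[chat_kind]
    recent = messages[-window:]  # newest `window` messages, oldest first
    transcript = "\n".join(f"{m['sender']}: {m['text']}" for m in recent)
    return f"{instruction}\n\nRecent chat, oldest first:\n{transcript}\n\nReply:"


def history_reached_prompt(prompt: str, messages: List[Dict[str, str]]) -> bool:
    # Cheap guard against "read but don't use": the newest message that was
    # read must literally appear in what the model is given.
    return bool(messages) and messages[-1]["text"] in prompt
```

This guard only covers the prompt side. Checking that the model's output actually reflects the history still needs a separate review step.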