Deliberate Omission
When Silence in a Prompt Is the Feature
Author: Alex Nix
Status: Working draft, for public release
Abstract
The intuitive design of a prompt is to say more. A longer, more detailed, more exhaustive prompt feels like it gives the model more to work with. For single-channel LLM tasks this intuition often holds. For multi-channel generation tasks — where vision, structured references, or composite anchors carry information alongside text — the intuition inverts: saying more in text actively degrades output when the text and the other channels describe the same dimensions.
This paper names the pattern Deliberate Omission: the system prompt explicitly forbids specific dimensions in the text because those dimensions arrive reliably through a different channel. The text describes only what the other channels cannot provide; silence on everything else is the feature.
The pattern is unusual in the public prompt-engineering literature, which tends toward maximalism. This paper describes when it applies, why it works, the two commitments it requires, and the failure modes it prevents.
1. Motivation
A generator receiving both text and vision references on the same dimension makes a hidden trade-off. A text prompt saying "moody directional lighting" plus a moodboard reference showing bright ambient daylight produces output that is neither — the generator averages, weights semi-arbitrarily, and produces drift that none of the inputs individually caused.
This is not a model-quality problem. Stronger generators do not fix it because the trade-off is structural: two instructions on the same dimension must be reconciled, and the reconciliation is a loss of specificity from each.
The standard answer in the literature is to write better text (more precise, more aligned with the reference) so that text and reference agree. This is good advice but insufficient. In production, references are user-supplied and heterogeneous; pre-aligning text with them at prompt-construction time is brittle.
A stronger answer: stop writing text about the dimensions the reference channel already carries. Silence the text on those dimensions. Let the reference channel own them.
2. The pattern
Text channel → describes ONLY (subject, location, camera)
Vision channel → carries (style, colour, lighting, atmosphere)
Each channel covers one half of the output's decisions. Neither is allowed to overlap with the other. The system prompt for the text-generating LLM explicitly forbids naming the dimensions assigned to vision:
"Describe only subject, location, camera. Never mention style, colour, lighting, or rendering quality. Those dimensions are carried by the attached references and must not be named in this prompt."
The system prompt often includes anti-examples — sentences that read naturally but violate the omission rule — to pull the LLM away from its default writerly instincts:
"Do not write sentences like: warm cinematic tones, moody dramatic lighting, golden-hour glow, rich saturated colour. These dimensions come from the attached composite moodboard image. Describe only what the moodboard cannot provide: subject action, location type, camera framing."
3. Two commitments the pattern requires
3.1 A reliable alternative channel
The pattern only works when a non-text channel carries the omitted dimensions reliably. Typically this is:
- A composite reference image that condenses many inputs into one coherent anchor (see the sequential-consistency companion paper on reference condensation).
- A prior output in a sequence (see reference-inheritance, in the same companion).
- A structured typed reference where the refType declares exactly what the reference contributes.
Without a reliable alternative channel, deliberate omission leaves the model to guess — and the result is not clean output but washed-out, inconsistent output.
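One way to make this commitment checkable is to derive the omission list from the references that are actually attached, instead of hard-coding it. The sketch below assumes a hypothetical `REF_TYPE_DIMENSIONS` table and a `Reference` type whose `ref_type` field stands in for the refType mentioned above; the names and the table contents are illustrative, not part of any real pipeline.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical table: which dimensions each declared reference type carries.
REF_TYPE_DIMENSIONS = {
    "moodboard": frozenset({"style", "colour", "lighting", "atmosphere"}),
}


@dataclass(frozen=True)
class Reference:
    ref_type: str  # illustrative stand-in for the refType field
    uri: str


def split_omissions(wanted: Iterable[str], attached: Iterable[Reference]):
    """Split the dimensions we want to omit into (safe, unsupported).

    A dimension is safe to omit only if some attached reference carries it.
    Unknown reference types are treated as carrying nothing.
    """
    carried = set()
    for ref in attached:
        carried |= REF_TYPE_DIMENSIONS.get(ref.ref_type, frozenset())
    wanted = list(wanted)
    safe = sorted(d for d in wanted if d in carried)
    unsupported = sorted(d for d in wanted if d not in carried)
    return safe, unsupported


safe, unsupported = split_omissions(
    ["lighting", "colour"],
    [Reference("moodboard", "moodboard.png")],
)
print(safe, unsupported)
```

Anything that lands in `unsupported` has no alternative carrier, so under this sketch the text would have to describe it rather than stay silent.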
3.2 Explicit, named omission
"Be concise" is not omission. "Don't over-describe lighting" is not omission. These are hints, and LLMs route around hints in favour of their default writing style.
Omission must be named: "Never mention style, colour, lighting, or rendering quality." The specific dimensions must be listed. The instruction must be negative-form (not "focus on X" but "do not mention Y"). LLMs comply with specific prohibitions more reliably than with vague emphases.
Anti-examples in the system prompt reinforce the rule: naming the specific sentences the LLM is prone to emitting and explicitly rejecting them.
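The rule can be assembled mechanically so that the named dimensions and the anti-examples never drift apart. The following is a minimal sketch: `build_omission_system_prompt` and its parameters are hypothetical names introduced for illustration, and the wording it emits is adapted from the quoted prompts earlier in this paper.

```python
from typing import Sequence


def build_omission_system_prompt(
    allowed: Sequence[str],
    forbidden: Sequence[str],
    anti_examples: Sequence[str],
    carrier: str,
) -> str:
    """Assemble a system prompt with a named, negative-form omission rule."""
    if not forbidden:
        raise ValueError("name at least one forbidden dimension")
    both = set(allowed) & set(forbidden)
    if both:
        raise ValueError(f"dimensions both allowed and forbidden: {sorted(both)}")
    lines = [
        f"Describe only {', '.join(allowed)}.",
        # Negative form with every dimension listed, not a vague hint.
        f"Never mention {', '.join(forbidden)}.",
        f"Those dimensions are carried by {carrier} and must not be named in this prompt.",
    ]
    if anti_examples:
        # Anti-examples name the sentences the LLM is prone to emitting.
        lines.append("Do not write sentences like: " + ", ".join(anti_examples) + ".")
    return "\n".join(lines)


prompt = build_omission_system_prompt(
    allowed=["subject", "location", "camera"],
    forbidden=["style", "colour", "lighting", "rendering quality"],
    anti_examples=["warm cinematic tones", "moody dramatic lighting"],
    carrier="the attached composite moodboard image",
)
print(prompt)
```

Rejecting an empty `forbidden` list and any overlap between `allowed` and `forbidden` is a design choice of this sketch: it keeps a soft or self-contradictory rule from being emitted silently.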
4. Why it works
When text and vision describe the same dimension, three things happen:
- The model's attention is split between reconciling them and producing output.
- The reconciliation is non-deterministic — which weighting "wins" varies shot to shot.
- The specificity of each channel is lost in the averaging.
Removing text's voice on the contested dimensions does three things:
- Attention flows to the channel that still has a voice — the vision reference.
- The vision reference's specificity is preserved because no text is competing with it.
- Text attention flows to the dimensions that remain — subject, location, camera — where text is the only carrier and its full specificity lands.
The pattern is about attention allocation, not instruction volume. Omission adds specificity to the remaining dimensions by not distributing attention across unnecessary ones.
5. Where it shows up in production (image / video pipelines)
| Stage | Text describes | Omitted (carried by another channel) |
|---|---|---|
| Multi-reference image prompts | Subject, location, camera | Style, colour, lighting (via composite moodboard image as vision) |
| Motion prompts for rendered stills | Physical motion, camera movement | Scene content, aesthetics (carried by the still being animated) |
| Refinement prompts (single-shot edits) | The requested change only | Everything else (preserved from baseline) |
The common structure: whichever channel has the densest reliable signal on a dimension owns that dimension; the other channels go silent.
6. Failure modes
- Omission without a reliable alternative. Text forbids describing lighting; no reference carries lighting information. Generator produces washed-out, default-lit outputs. Counter: omit only dimensions that are reliably carried elsewhere; verify the alternative channel before committing to the omission.
- Soft omission. "Try not to over-describe lighting" — LLMs ignore soft guidance in favour of writerly defaults. Counter: explicit prohibition, named dimensions, anti-examples.
- Leakage of forbidden vocabulary. LLM slips "warm cinematic tones" into a subject-only prompt. Counter: include specific banned phrases in the system prompt, not just dimension names.
- Over-omission. Forbidding so many dimensions that the remaining text carries insufficient information. Counter: omission is about avoiding overlap, not about minimising text. If text is the sole carrier of a dimension, it must describe it.
- Channel drift. The pipeline changes — a moodboard composite is no longer attached — but the omission rule persists. Result: degraded output because neither channel now carries the dimension. Counter: treat omission rules as coupled to the channels that support them; invalidate the rule when the channel is gone.
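The leakage failure mode in particular lends itself to a cheap automated check on the text the LLM produces. The sketch below is a literal-match filter only: `find_leaks`, `BANNED_PHRASES`, and `BANNED_DIMENSIONS` are hypothetical names, the lists are taken from the examples in this paper, and paraphrases or inflected forms (for example "colours") would slip through.

```python
import re
from typing import List, Sequence

# Illustrative lists, taken from the examples in this paper.
BANNED_PHRASES = [
    "warm cinematic tones",
    "moody dramatic lighting",
    "golden-hour glow",
    "rich saturated colour",
]
BANNED_DIMENSIONS = ["style", "colour", "lighting", "rendering quality"]


def find_leaks(
    prompt_text: str,
    banned_phrases: Sequence[str],
    banned_dimensions: Sequence[str],
) -> List[str]:
    """Return every banned phrase or dimension name found in prompt_text.

    Matching is case-insensitive and anchored on word boundaries.
    """
    hits = []
    for term in list(banned_phrases) + list(banned_dimensions):
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, prompt_text, flags=re.IGNORECASE):
            hits.append(term)
    return hits


leaky = "A cyclist crosses a stone bridge, warm cinematic tones, wide shot."
clean = "A cyclist crosses a stone bridge, wide shot from the riverbank."
print(find_leaks(leaky, BANNED_PHRASES, BANNED_DIMENSIONS))
print(find_leaks(clean, BANNED_PHRASES, BANNED_DIMENSIONS))
```

A non-empty result could trigger a regeneration or a rewrite pass; which of those is appropriate is left open here.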
7. Contrast with maximalist prompt engineering
The public-facing prompt-engineering discourse trends toward maximalism — longer prompts, more detailed style keywords, richer mood descriptions. This is good default advice for single-channel tasks (text-only generation, first-pass image generation with no references).
Deliberate omission is not a rejection of maximalism; it is a claim that maximalism is channel-specific. For single-channel tasks, describe everything. For multi-channel tasks, describe only what your channel uniquely carries; let other channels own their dimensions.
The inversion is unusual because multi-channel tasks are a relatively recent production concern. Image generators only recently acquired reliable reference-conditioning. The pattern emerges from experience with multi-channel production, not from first-principles prompt design.
8. Generalisations beyond image generation
The pattern is about channel attention allocation, and it generalises:
- Code generation with existing file context. When a function signature, existing imports, and surrounding code are attached, the generation prompt should describe only the body logic, not restate the signature or imports. Restating them invites the LLM to propose minor rewrites of signatures that were meant to stay unchanged.
- Document editing with a template attached. When the document template is attached, the prompt should describe only the content, not the structure. Describing structure causes the LLM to propose restructuring.
- Agentic tool use with tool schemas attached. When tool signatures are in the system prompt as structured schema, the user-turn prompt should describe only the user intent, not restate the tools. Restating tools causes tool-selection drift.
- RAG-augmented generation. When retrieved passages are attached as context, the user-turn prompt should describe only what the retrieval missed, not restate what it provided. Restating retrieved content causes the LLM to weight it twice.
- Multi-agent system messages. When one agent's output is attached as context for another, the second agent's prompt should describe only what it adds, not restate what the first agent established.
Common shape: when non-text context reliably carries dimension X, text that describes X is noise at best, conflict at worst.
9. Design heuristics
When designing a multi-channel prompt, ask three questions:
- What do the non-text channels reliably carry? List them. Style from moodboard, structure from template, tools from schema, retrieved facts from RAG.
- What must the text uniquely carry? The dimensions not covered by the non-text channels. This is what the text prompt should describe exhaustively.
- What do the text and non-text channels both try to describe? This is the overlap zone. The text must go silent here. Name the banned dimensions explicitly in the system prompt.
A well-designed multi-channel prompt answers all three. A badly-designed one lets text and references compete in the overlap zone.
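The three questions reduce to set arithmetic over dimension names, which makes them easy to run as a design-time audit. This is a sketch under stated assumptions: `audit_channels` and its result keys are hypothetical, and it presumes the designer can enumerate the dimensions the output requires.

```python
from typing import Dict, Iterable, List


def audit_channels(
    required: Iterable[str],
    text_dims: Iterable[str],
    channel_dims: Iterable[str],
) -> Dict[str, List[str]]:
    """Answer the three design questions for one prompt design.

    required     -- dimensions the output needs decided
    text_dims    -- dimensions the text prompt currently describes
    channel_dims -- dimensions the non-text channels reliably carry
    """
    required, text_dims, channel_dims = set(required), set(text_dims), set(channel_dims)
    text_must_carry = required - channel_dims
    return {
        # Question 1: what the non-text channels reliably carry.
        "channel_carries": sorted(channel_dims),
        # Question 2: what the text must uniquely carry.
        "text_must_carry": sorted(text_must_carry),
        # Question 3: the overlap zone, where the text must go silent.
        "overlap": sorted(text_dims & channel_dims),
        # Extra check: dimensions the text owns but does not yet describe.
        "missing_from_text": sorted(text_must_carry - text_dims),
    }


report = audit_channels(
    required=["subject", "location", "camera", "style", "colour", "lighting"],
    text_dims=["subject", "location", "lighting"],
    channel_dims=["style", "colour", "lighting"],
)
print(report)
```

In this example the text names lighting, which the reference channel already carries, and says nothing about camera, which only the text can carry; the audit surfaces both problems at once.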
10. Open research questions
- Empirical effect size. How much does deliberate omission actually improve output consistency compared with non-omitted prompts of similar length? A controlled study across image-generation and code-generation tasks would be useful.
- Omission taxonomy. A catalogue of which dimensions benefit most from omission when paired with which kinds of reference channels. Likely domain-specific; useful as a practitioner cheat-sheet.
- Cross-model variation. Does omission work equally well across open and closed models? Early observations suggest yes, but there has been no systematic measurement.
- LLM self-moderation of omissions. Can a separate "omission checker" LLM reliably detect and rewrite violations of an omission rule? Could automate enforcement.
- Negative prompting vs. omission. In image generation, negative prompts are a related but different mechanism (telling the generator what to exclude). Omission operates at the text-construction stage, not at the conditioning stage. The two likely compose, but the combination is unstudied.
11. Conclusion
Deliberate omission inverts a default of prompt engineering — more detail is better — by noting that detail costs nothing only when the detail is unique to the channel that carries it. When multiple channels describe the same dimension, detail in one channel costs specificity in another.
The pattern is simple to state and specific to implement: identify the reliable non-text channels, identify the dimensions they carry, forbid those dimensions explicitly in the text prompt with named anti-examples.
It is underutilised in the public literature, heavily used in production multi-channel pipelines, and generalises across generation tasks wherever multiple channels feed one output. The invitation is to name the omissions in your own pipelines explicitly — they're often present implicitly already, and making them explicit makes them durable.
Citation
Nix, A. (2026). Deliberate Omission — When Silence in a Prompt Is the Feature. Working paper.