Two-Stage Architect Pattern
You're writing one mega-prompt: "Be a creative director. Generate five cinematic concepts for this product, output as a JSON array with these exact fields." The concepts come back safe and homogeneous — the constraint of JSON-mode pulls the model toward compliance, and the same temperature that would produce interesting prose breaks the schema. Bump the temperature up, you get more interesting concepts and malformed JSON. Bump it down, the JSON is clean and the concepts are dull. The same prompt is asking the LLM to be exploratory and compliant at the same time.
Split the work. Let one LLM call be the creative director — high temperature, prose output, no schema. Let a second LLM call be the technical architect — lower temperature, structured output, no creative load.
This is the two-stage architect pattern: creative intent and technical generation split across two separate LLM calls with different roles, temperatures, and output formats. A "Creative Director" LLM emits prose concepts. A "Technical Architect" LLM converts those concepts into structured generator-ready prompts.
The shape
Input (descriptors, references, user intent)
↓
Stage A — Creative Director (LLM, high temp, prose)
│ Role: editorial / creative director
│ Output: N prose concepts in ONE shared environment
↓
Stage B — Technical Architect (LLM, lower temp, structured)
  Role: technical prompt engineer
  Output: N generator-ready prompts with explicit structure
The two stages communicate through prose, not through a shared data structure. The Director's output is narrative; the Architect's input is that narrative plus structured metadata (image positions, descriptor rules, constraints).
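The handoff described above can be sketched in a few lines of Python. This is a minimal sketch, not a definitive implementation: `LLMCall`, `Stage`, and `run_two_stage` are hypothetical names introduced for illustration, and `llm` stands in for whatever chat-completion client the pipeline actually uses. The `CONCEPTS:` and `METADATA:` labels are likewise assumptions.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical signature for any chat-completion client:
# (system_prompt, user_message, temperature) -> response text.
LLMCall = Callable[[str, str, float], str]


@dataclass
class Stage:
    role: str           # system prompt that establishes the stage's role
    temperature: float  # sampling temperature for this stage only


def run_two_stage(brief: str, metadata: str, llm: LLMCall,
                  director: Stage, architect: Stage) -> str:
    """Run the Creative Director, then hand its prose to the Architect."""
    # Stage A: prose concepts, no schema imposed on the output.
    concepts = llm(director.role, brief, director.temperature)

    # Stage B: the Director's narrative is passed through verbatim,
    # with the structured metadata appended as plain text.
    architect_input = f"CONCEPTS:\n{concepts}\n\nMETADATA:\n{metadata}"
    return llm(architect.role, architect_input, architect.temperature)
```

Each stage carries its own role and temperature, so tuning one never touches the other. The only thing crossing the boundary is text.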
Why the split matters
- Different tasks want different temperatures. Creative generation wants 0.6–0.8; technical formatting wants 0.3–0.5. A single LLM call at one temperature compromises both.
- Different roles want different system prompts. "Be a creative director" and "Be a technical prompt architect" pull toward opposite behaviors: one wants to explore, the other wants to comply.
- Different failure modes are isolated. If concepts come out repetitive, it's a Director problem. If prompts come out malformed, it's an Architect problem. You know which prompt to fix before you start debugging, which a monolithic prompt can't tell you.
- Shared environment is enforced by the shape. The Director emits one environment description; the Architect receives it verbatim. Continuity across the fanout is structural, not a suggestion buried in a long system prompt.
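The failure-isolation point can be made concrete with a small check that attributes each problem to a stage. This is a rough sketch under stated assumptions: `jaccard` and `diagnose` are hypothetical helpers, word-overlap similarity is only a crude stand-in for "repetitive", and the 0.8 threshold is an arbitrary illustrative value.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two texts, from 0.0 to 1.0."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)


def diagnose(concepts: List[str], prompts: List[str],
             max_overlap: float = 0.8) -> List[str]:
    """Return a list of problems, each prefixed with the stage to blame."""
    problems = []

    # Director check: near-duplicate concepts mean the creative stage is flat.
    for i in range(len(concepts)):
        for j in range(i + 1, len(concepts)):
            if jaccard(concepts[i], concepts[j]) > max_overlap:
                problems.append(
                    f"director: concepts {i} and {j} are near-duplicates")

    # Architect checks: one non-empty prompt per concept.
    if len(prompts) != len(concepts):
        problems.append(
            f"architect: expected {len(concepts)} prompts, got {len(prompts)}")
    for k, prompt in enumerate(prompts):
        if not prompt.strip():
            problems.append(f"architect: prompt {k} is empty")

    return problems
```

An empty list means both stages passed these particular checks; anything else names the stage whose prompt or temperature needs attention.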
A concrete example
A campaign pipeline that fans out five shots from one product brief:
- Stage A (Creative Director, temp 0.6): "You are an elite creative director for product photography. Generate 5 editorial concepts for this product, all set in one continuous environment. Concepts differ in subject action and camera emphasis; environment stays the same."
- Stage B (Architect, temp 0.5): "You are the prompt Architect. Convert these 5 concepts into 5 structured prompts. Use the IMAGE ORDER positions exactly. Blocks separated by ^ on its own line. No preamble."
Stage A's output carries the load-bearing piece: the shared environment. Stage B's job is format, reference numbering, and technical discipline.
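Because Stage B is told to emit delimiter-separated blocks with no preamble, the caller can split and count them before anything reaches the generator. A minimal sketch, assuming the delimiter appears alone on its line: `split_blocks` is a hypothetical helper, and the delimiter is passed in as a parameter rather than hard-coded.

```python
from typing import List


def split_blocks(raw: str, delimiter: str, expected: int) -> List[str]:
    """Split Architect output into prompt blocks and verify the count."""
    blocks: List[str] = []
    current: List[str] = []

    for line in raw.splitlines():
        if line.strip() == delimiter:
            # A delimiter line closes the block collected so far.
            blocks.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    blocks.append("\n".join(current).strip())

    # Drop empty blocks caused by leading or trailing delimiters.
    blocks = [b for b in blocks if b]

    if len(blocks) != expected:
        raise ValueError(f"expected {expected} blocks, got {len(blocks)}")
    return blocks
```

For the five-shot example, a call such as `split_blocks(architect_output, "^", 5)` either returns five prompts or raises, which flags an Architect-side failure right at the stage boundary.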
Composition with other architectures
- The Director stage often produces the shared environment that enables continuity rules in downstream prompts.
- The Architect stage often emits prompts governed by typed-reference composition — each generator image-input declared inline.
- The final prompts typically get wrapped in a structural-fidelity Lock before reaching the generator.
When not to use it
- Single-shot generation without fanout. The pattern's benefit is spreading one creative intent across many technical outputs. One shot doesn't need two LLM calls.
- Tightly-coupled creative and technical dimensions — e.g., compact image prompts where there's nothing to split.
The pattern earns its keep where the same creative direction has to fan out into many technically-disciplined outputs. Otherwise it's overhead.