The Lock Layer
Identity & Brand Preservation as a Separable Prompt Architecture
Author: Alex Nix
Status: Working draft, for public release
Abstract
Generative models trained on broad data will happily reshape, idealise, or blend the identities of the subjects they generate: a face becomes subtly more symmetric, a product's proportions drift, a brand's logo stops being legible. Prompts that try to prevent this by writing "preserve identity" in free prose fail, because the preservation instruction sits inside the same prompt as the creative instruction and the model trades one off against the other.
The Lock Layer Pattern separates preservation from generation as two distinct layers of the prompt. The generative prompt describes this shot. The Lock Layer wraps the generative prompt with an immutable structured constraint — replayed verbatim across every generation in a batch — that names what must survive unchanged. The Lock is computed deterministically from a pre-extracted descriptor; the generative prompt varies; the subject's identity does not.
This paper describes the pattern, its failure modes, and the open research questions around it.
1. Motivation
Three real production failure modes:
- Face drift. Across a five-shot campaign, the same person's facial structure shifts subtly — eye spacing, jawline, chin sharpness. Individually each shot looks fine; as a set, the viewer registers "these are different people."
- Product form drift. A bottle in shot 1 has a slightly different cap than the bottle in shot 3. A logo is mirrored. Label text becomes unreadable in shot 4 and drifts into pseudo-text in shot 5.
- Brand-voice drift. A persona character's "voice" shifts across chapters of generated narrative — tone, sentence-length tendency, vocabulary register.
All three are versions of the same architectural failure: the preservation concern and the creative concern share one prompt, and the model trades one off against the other, shot by shot.
Writing "preserve the person's exact facial structure" in every prompt helps marginally — until the creative instruction for a particular shot (dramatic angle, soft focus, artistic interpretation) pulls the output toward drift. The preservation clause is just words competing with other words.
The Lock Layer reframes the problem. Preservation is not a clause inside the generative prompt; it is a layer wrapping the generative prompt.
2. The pattern
Final generator input (one per generation in the batch)
┌─────────────────────────────────────────────────┐
│ REFERENCE MAP                                   │ ← where each reference lives
├─────────────────────────────────────────────────┤
│ LOCK LAYER                                      │ ← immutable structured constraints
│ (IDENTITY LOCK, PRODUCT LOCK, BRAND LOCK, …)    │
├─────────────────────────────────────────────────┤
│                                                 │
│ Generative prompt for this output               │ ← varies per output
│                                                 │
└─────────────────────────────────────────────────┘
The Lock Layer and the Reference Map are replayed verbatim across every generation in the batch. The generative prompt varies per generation.
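The layering above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed wire format: the `=== TITLE ===` delimiters, the `BatchFrame` class, and `build_generator_input` are all hypothetical names introduced here.

```python
from dataclasses import dataclass

# Hypothetical section delimiter; the pattern does not prescribe one.
SECTION_TEMPLATE = "=== {title} ===\n{body}"


@dataclass(frozen=True)
class BatchFrame:
    """The parts replayed verbatim for every generation in a batch."""
    reference_map: str
    lock_layer: str


def build_generator_input(frame: BatchFrame, generative_prompt: str) -> str:
    """Stack Reference Map, Lock Layer, and the per-output prompt, in that order."""
    sections = [
        SECTION_TEMPLATE.format(title="REFERENCE MAP", body=frame.reference_map),
        SECTION_TEMPLATE.format(title="LOCK LAYER", body=frame.lock_layer),
        SECTION_TEMPLATE.format(title="GENERATIVE PROMPT", body=generative_prompt),
    ]
    return "\n\n".join(sections)


frame = BatchFrame(
    reference_map="Image 1: persona reference\nImage 2: product reference",
    lock_layer="IDENTITY LOCK: ONLY Image 1 defines the person's identity.",
)
shots = ["Low-angle shot at dusk, warm rim light.", "Close-up, overcast daylight."]
inputs = [build_generator_input(frame, shot) for shot in shots]
```

Because `BatchFrame` is frozen and built once, every element of `inputs` shares the same leading text and differs only in its final section.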
3. What a Lock contains
A Lock is a structured replay of a pre-extracted descriptor. The descriptor is produced upstream — typically by an LLM analysis stage — and encodes the dimensions that must survive across generations.
Identity Lock (persona) — drawn from a photo analysis + structured params:
- Authority statement: "ONLY Image 1 defines the person's identity."
- Override instruction: "If other reference images contain people, COMPLETELY IGNORE those people."
- Fidelity requirements:
  - Exact facial structure, proportions, skin tone, distinguishing features (100% fidelity)
  - Exact body type, build, proportions
  - Exact hair colour, texture, length, style
- Explicit rejections: "No smoothing, idealising, reshaping, or blending with ANY other face from ANY other image."
Product Lock — drawn from a product analysis:
- Rigid-unit rule: which parts move together vs. independently
- Orientation anchor: which side faces the camera
- Text and markings: on-pack text must remain legible
- Material state: glass stays glass, metal stays metal
- Scale context: no size inflation or deflation
Brand Lock (generalisation beyond images) — drawn from a brand style guide:
- Tone-of-voice descriptor
- Vocabulary constraints (preferred terms, banned terms)
- Sentence-rhythm notes
- Explicit rejections (phrases, clichés, competitor-aligned language)
The specific dimensions vary by domain. The shape is constant: a structured replay of what must survive.
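One way to make "structured" concrete is to give each descriptor a typed shape. The sketch below is an assumption about how the dimensions listed above might be encoded; the class names, field names, and the `empty_dimensions` helper are hypothetical and are not taken from any existing library.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class IdentityDescriptor:
    authority_image: str                      # which reference defines identity
    facial_structure: str
    skin_tone: str
    distinguishing_features: tuple[str, ...]
    build: str
    hair: str


@dataclass(frozen=True)
class ProductDescriptor:
    rigid_units: tuple[str, ...]              # parts that must move together
    camera_facing_side: str                   # orientation anchor
    on_pack_text: tuple[str, ...]             # text that must stay legible
    materials: tuple[str, ...]
    scale_note: str


@dataclass(frozen=True)
class BrandDescriptor:
    tone_of_voice: str
    preferred_terms: tuple[str, ...]
    banned_terms: tuple[str, ...]
    rhythm_notes: str
    rejected_phrases: tuple[str, ...]


def empty_dimensions(descriptor) -> list[str]:
    """Names of fields left empty, i.e. dimensions the Lock would not cover."""
    return [f.name for f in fields(descriptor) if not getattr(descriptor, f.name)]


# Illustrative values only.
persona = IdentityDescriptor(
    authority_image="Image 1",
    facial_structure="oval face, wide-set eyes, defined jawline",
    skin_tone="light olive",
    distinguishing_features=(),
    build="slim, average height",
    hair="",
)
gaps = empty_dimensions(persona)
```

Frozen dataclasses keep a descriptor from being mutated after extraction, and a check like `empty_dimensions` surfaces vague or incomplete descriptors before a Lock is built from them.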
4. Why separation is load-bearing
Merging preservation and generation in one prompt creates four specific problems:
- Axis pollution. A prompt describing lighting and composition becomes unreadable when it also carries a 150-word identity clause. Neither the creative nor the preservation intent reads clearly.
- Per-shot drift in preservation language. LLMs rewrite preservation clauses slightly differently per shot ("keep her face recognisable" / "preserve the subject's features" / "maintain identity"). Variance in preservation language produces variance in preservation outcome.
- Debug opacity. When a face drifts in shot 4, was the drift caused by the creative instruction, the preservation instruction, their interaction, or the underlying model? Merged prompts make this unanswerable.
- Composability loss. Preservation is a concern that applies to many generation tasks; if it's embedded in each generation's prompt, the pattern can't be lifted and reused.
Separating the Lock Layer fixes all four:
- Axes stay clean — the generative prompt describes the shot, the Lock describes identity.
- Preservation language is identical across the batch by construction.
- Debugging starts by asking "did the Lock reach the model unchanged?" If yes, look for the drift in the generative prompt or the underlying model; if no, the fault lies in the pipeline infrastructure.
- The Lock is a standalone artifact — the same Lock can wrap A.O.C. prompts, text-to-image prompts, or other formats.
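The first debugging question, whether the Lock reached the model unchanged, can be checked mechanically if the pipeline logs the final generator inputs. The following is a minimal sketch under that assumption; `audit_lock_replay` is a hypothetical helper, and the sample batch is invented for illustration.

```python
def audit_lock_replay(lock_text: str, generator_inputs: list[str]) -> list[int]:
    """Return indices of generator inputs that do not contain the Lock verbatim."""
    return [i for i, text in enumerate(generator_inputs) if lock_text not in text]


lock = "IDENTITY LOCK\n- ONLY Image 1 defines the person's identity."
batch = [
    lock + "\n\nShot 1: low-angle portrait at dusk.",
    lock + "\n\nShot 2: close-up in overcast daylight.",
    # Simulated infrastructure fault: a paraphrased Lock slipped in.
    "IDENTITY LOCK\n- Keep the person recognisable.\n\nShot 3: profile view.",
]
tampered = audit_lock_replay(lock, batch)
```

An empty result points the investigation at the generative prompt or the generator; a non-empty result points at the pipeline that assembled the inputs.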
5. Where the Lock comes from
A Lock is a deterministic function of a pre-extracted descriptor, not an LLM call. The flow is:
- Upstream descriptor extraction (LLM, typically one-time or per-subject). An LLM at low temperature analyses a reference image and emits a structured descriptor. This is the expensive, variable stage.
- Lock computation (pure function, no LLM). The descriptor is replayed into a Lock template. No LLM is involved; the output is deterministic given the descriptor.
- Lock replay (wrapping step, per generation). Every generation in the batch prepends the same Lock verbatim.
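The pattern's reliability comes from the no-LLM replay: the Lock is byte-identical across generations, so variance in the preservation instruction is impossible by construction. Any remaining drift must come from the generative prompt or from the generator itself.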
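The Lock computation and replay steps can be sketched as a pure template fill. This is an illustration under stated assumptions: the template wording is adapted from the Identity Lock items in Section 3, and `render_identity_lock`, `wrap`, and the dictionary keys are hypothetical names.

```python
IDENTITY_LOCK_TEMPLATE = (
    "IDENTITY LOCK\n"
    "- ONLY {authority_image} defines the person's identity.\n"
    "- If other reference images contain people, COMPLETELY IGNORE those people.\n"
    "- Preserve exactly: {dimensions}.\n"
    "- No smoothing, idealising, reshaping, or blending with ANY other face "
    "from ANY other image."
)


def render_identity_lock(descriptor: dict[str, str]) -> str:
    """Pure function: same descriptor in, same Lock text out. No model call."""
    dimensions = "; ".join(
        f"{name}: {value}"
        for name, value in sorted(descriptor.items())  # order-independent output
        if name != "authority_image"
    )
    return IDENTITY_LOCK_TEMPLATE.format(
        authority_image=descriptor["authority_image"], dimensions=dimensions
    )


def wrap(lock: str, generative_prompt: str) -> str:
    """Replay step: prepend the Lock verbatim to one generative prompt."""
    return f"{lock}\n\n{generative_prompt}"


# Illustrative descriptor; in practice this comes from the upstream extraction stage.
descriptor = {
    "authority_image": "Image 1",
    "hair": "dark brown, shoulder-length, wavy",
    "face": "oval, wide-set eyes, defined jawline",
}
lock = render_identity_lock(descriptor)
batch = [wrap(lock, p) for p in ("Shot 1: studio portrait.", "Shot 2: street scene.")]
```

Sorting the descriptor keys is a design choice in this sketch: it keeps the rendered Lock independent of dictionary insertion order, so two runs over the same descriptor cannot differ.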
6. Failure modes
| Failure | Symptom | Fix |
|---|---|---|
| Descriptor too vague | "preserve facial features" without specifying which | Descriptor extraction must itemise concrete dimensions |
| Lock fights with generative prompt | Generative prompt says "soft-focus dreamy portrait"; Lock says "sharp facial fidelity" | Generative prompt must defer to Lock on preservation dimensions; style words that imply fidelity violation are rejected |
| Lock drift across batch | Someone regenerates the Lock mid-batch | Lock is computed once per project and replayed; regeneration is versioned |
| Lock embedded instead of wrapped | Lock placed inside the generative prompt, not as a layer | Downstream parser must enforce the layer boundary |
| Descriptor dimensions miss the actual failure mode | Identity Lock carefully describes face; wardrobe drifts instead | Descriptor must cover the full "what must survive" set; extend iteratively |
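The fix for a Lock that fights with the generative prompt, rejecting style words that imply a fidelity violation, could be approximated with a simple pre-flight check. The sketch below is an assumption about how such a check might look; the `CONFLICT_TERMS` mapping and its entries are invented for illustration and would need to be curated per domain.

```python
# Hypothetical mapping from a locked dimension to style terms that tend to violate it.
CONFLICT_TERMS = {
    "facial_fidelity": ("soft-focus", "dreamy", "stylised", "caricature"),
    "on_pack_text": ("motion blur", "heavy bokeh"),
}


def find_conflicts(generative_prompt: str, locked_dimensions: set[str]) -> dict[str, list[str]]:
    """Return, per locked dimension, the style terms in the prompt that conflict with it."""
    lowered = generative_prompt.lower()
    conflicts: dict[str, list[str]] = {}
    for dimension in sorted(locked_dimensions):
        hits = [term for term in CONFLICT_TERMS.get(dimension, ()) if term in lowered]
        if hits:
            conflicts[dimension] = hits
    return conflicts


flagged = find_conflicts("Soft-focus dreamy portrait at golden hour", {"facial_fidelity"})
clean = find_conflicts("Sharp studio portrait, neutral backdrop", {"facial_fidelity"})
```

Plain substring matching is crude and will miss paraphrases; it is used here only to show where in the pipeline the rejection would sit, before the prompt is wrapped and sent.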
7. Generalisation beyond images
The pattern — wrap a generative task with an immutable structured constraint drawn from a pre-computed descriptor — applies wherever a concern must survive a generation unchanged while the generative instruction varies.
- Brand-voice preservation in long-form content. A Brand Lock wraps chapter-level content generation. Tone, vocabulary, banned phrases survive unchanged across 50 chapters of a content-marketing series.
- Character consistency in narrative generation. A Character Lock (personality, voice, relationship stances, physical appearance if described) wraps per-scene narrative generation.
- Spec fidelity in demo-video scripts. A Spec Lock (product capabilities, messaging positioning, technical accuracy) wraps per-scene script generation.
- Safety and policy boundaries in tool-using agents. A Policy Lock (allowed actions, forbidden actions, escalation triggers) wraps per-task action-plan generation. This is particularly important in agentic systems where the generative instruction varies wildly but the policy must not.
- Schema preservation in structured code generation. A Schema Lock (data contracts, API signatures, invariants) wraps per-function code generation.
The common architectural question the pattern answers: what must survive this call unchanged? If the answer is a stable, describable set, the pattern fits.
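As a non-image illustration, the same wrap-and-replay shape can carry a Policy Lock around per-task agent instructions. This is a sketch only: `render_lock`, the section headings, and the example policy entries are hypothetical.

```python
def render_lock(title: str, sections: dict[str, list[str]]) -> str:
    """Render any Lock as a titled, deterministic block of itemised constraints."""
    lines = [title.upper()]
    for heading in sorted(sections):  # sorted so the output is order-independent
        lines.append(f"{heading}:")
        lines.extend(f"- {item}" for item in sections[heading])
    return "\n".join(lines)


# Illustrative policy; a real one would come from a reviewed policy descriptor.
policy_lock = render_lock(
    "Policy Lock",
    {
        "Allowed actions": ["read files in the workspace", "run unit tests"],
        "Forbidden actions": ["delete files", "send network requests"],
        "Escalation triggers": ["any request for credentials"],
    },
)

tasks = [
    "Plan a fix for the failing date parser.",
    "Plan a refactor of the logging module.",
]
agent_inputs = [f"{policy_lock}\n\n{task}" for task in tasks]
```

The task text varies freely while the policy block stays identical across calls, which is the same property the Identity Lock provides for images.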
8. Relationship to adjacent patterns
- A.O.C. framework. A.O.C. describes the shot; the Lock guarantees the subject survives the shot. They compose at different layers: A.O.C. is the generative prompt, the Lock wraps it.
- Typed-reference composition. The Reference Map above the Lock in the layered prompt is typed-reference discipline. Both patterns routinely appear together.
- System-prompt policies. A policy expressed in the system prompt is a weak analogue of a Lock — it's described once but the LLM re-interprets it per call. Locks are stronger because they're replayed verbatim; they're not re-interpreted.
- Constitutional AI / principle-based constraints. Conceptually aligned — fixed principles that constrain per-task generation. Locks are more concrete and per-artifact; constitutional principles are more abstract and cross-task.
- Two-Stage Architect Pattern. The Lock fits cleanly into the Architect stage's output: the Architect emits the generative prompt; the pipeline wraps it with the Lock before sending to the generator.
9. Open research questions
- Empirical validation of fidelity gain. How much does the Lock Layer actually reduce identity drift vs. a merged prompt of identical length and content? Controlled study needed.
- Descriptor completeness. What dimensions should an Identity Lock cover? A Brand Lock? Early work suggests face structure + hair + build is sufficient for persona work, but edge cases (distinctive tattoos, scars, eyewear) surface over time. A taxonomy of "what can drift" per domain would be useful.
- Lock versioning. When a project's descriptor changes mid-production, how should older generations relate to newer ones? Are they "different versions of the same subject" or "different subjects"? Practical question for production pipelines.
- Cross-modal Locks. Does a Brand Lock written for text generation transfer to image generation for the same brand? Probably partially; unstudied.
- Lock-level adversarial robustness. If a generative prompt actively tries to work around a Lock (creative instructions that imply drift — "a stylised reinterpretation of the subject"), how does the Lock hold? Needs adversarial testing.
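The versioning question is easier to study if every generation records which descriptor it was produced under. One possible approach, offered as an assumption rather than a recommendation from this paper, is to derive a version identifier from a canonical serialisation of the descriptor; `lock_version` is a hypothetical helper.

```python
import hashlib
import json


def lock_version(descriptor: dict) -> str:
    """Short content hash of a descriptor, stable under key reordering."""
    canonical = json.dumps(
        descriptor, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


# Illustrative descriptors before and after a mid-production change.
v1 = lock_version({"hair": "dark brown, wavy", "eyewear": "none"})
v1_reordered = lock_version({"eyewear": "none", "hair": "dark brown, wavy"})
v2 = lock_version({"hair": "dark brown, wavy", "eyewear": "round wire frames"})
```

Tagging each output with its identifier does not settle whether two versions count as the same subject, but it makes the groups explicit so the question can be examined per project.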
10. Conclusion
The Lock Layer is a minor architectural move with major reliability consequences. Separating preservation from generation — the Lock wrapping the generative prompt rather than embedded in it — eliminates four categorical failure modes (axis pollution, drift in preservation language, debug opacity, composability loss) that merged-prompt designs persistently produce.
The pattern is cheap — a Lock is a pure function of a pre-extracted descriptor, replayed verbatim. It is composable — the same Lock can wrap many different generative formats. And it generalises — beyond image generation, to brand voice, character consistency, spec fidelity, safety policies, and schema preservation.
The invitation is to use it explicitly. Preservation concerns tend to accumulate implicitly in system prompts and drift; naming them as a layer makes them durable.
Citation
Nix, A. (2026). The Lock Layer — Identity & Brand Preservation as a Separable Prompt Architecture. Working paper.