April 17, 2026

A Workflow for Generating Seedance 2 JSON Prompts

Every Seedance 2 project tends to invent its own workflow. One team checks references first, another writes the global block last, a third forgets to pre-flight face-heavy stills until the content filter trips on submission. Under deadline, the steps that get dropped are usually the cheap ones that are easy to skip (image pre-flight, character budget audit, scene-mapping coherence), and they cost the most when they fail late in the pipeline.

What follows is a documented version of the workflow, so the steps no longer depend on whichever project the operator worked on last.

The shape is end-to-end: scenario plus reference stills go in, a production-ready Seedance 2 JSON prompt comes out. It is the companion to the seedance-2-cinematic-video-prompt-engineer system prompt. The prompt does the heavy lifting; this workflow surrounds it with the discipline (image pre-flight, scene mapping, character budget, final validation) that keeps the prompt's output consistently clean.

Inputs

Scenario — text description of the commercial / video (any language). May be loose prose or a pre-broken-down shot list.
Reference images — numbered stillshots representing key frames. Filenames like img1, 1.png, shot_01.png.
(Optional) overrides — custom char limit, target shot count, locked wardrobe / location notes.

Output

A single raw JSON object (no fences, no prose) conforming to the structure documented in seedance-2-cinematic-video-prompt-engineer:

{"refs":[...],"g":"...","s":[{"id":"1","c":"...","p":"..."}]}
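The one-line structure above can be checked mechanically before anything else runs. The following is a minimal sketch, not part of the system prompt itself: the function name shape_problems is hypothetical, the refs[] keys are taken from step 5 below, and treating these key sets as the complete contract is an assumption based on this article alone.

```python
# Hypothetical helper. Key names come from this article; treating them as
# the complete allowed set is an assumption, not a documented schema.
TOP_KEYS = {"refs", "g", "s"}
REF_KEYS = {"img", "s", "r"}
SCENE_KEYS = {"id", "c", "p"}


def shape_problems(prompt: dict) -> list[str]:
    """Return human-readable problems; an empty list means the shape matches."""
    problems = []
    if set(prompt) != TOP_KEYS:
        problems.append(f"top-level keys {sorted(prompt)} != {sorted(TOP_KEYS)}")
    for i, ref in enumerate(prompt.get("refs", [])):
        if set(ref) != REF_KEYS:
            problems.append(f"refs[{i}] keys {sorted(ref)} != {sorted(REF_KEYS)}")
    for i, scene in enumerate(prompt.get("s", [])):
        if set(scene) != SCENE_KEYS:
            problems.append(f"s[{i}] keys {sorted(scene)} != {sorted(SCENE_KEYS)}")
    return problems
```

Because the comparison is exact, a scene carrying an extra key such as "type" is reported rather than silently passed through.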

Process

0. Image pre-flight (censorship / face filter)

Before anything else, inspect every reference image for trigger content:

Real faces, portrait placards, photo walls, dense crowds with visible faces → apply Layer 1 grid overlay per seedance-2-censorship-bypass (white 100%-opaque grid, 6×6 @ 12px default; escalate to 10×10 @ 8px after a flag).
Reference already carries a grid-like overlay (production chroma placards with + tracking marks)? Verify it qualifies as a white opaque grid; if not, lay a proper one on top.
Sensitive scenario (memorial, war, protest, historical-political)? Apply the grid even when faces are secondary.
Flag in refs[].r that the image is gridded so the model is not confused — e.g. "white 6×6 grid overlay for detector bypass; ignore in output".

Pre-commit to Layer 2 hygiene (applied in step 7):

No age-signal words, no emotions or backstory, no named public figures.
Every p is visual facts + scene context + production language + role-over-age.
If a grid overlay was applied, scene prompts include no grid lines, no overlay, no mesh, clean skin, smooth image.

1. Check inputs

Scenario present? If missing, ask for it.
Images attached? Count them and note filenames as provided.
Clarify any override (char cap, shot count, specific VFX language) before generating.

2. Load the system prompt

Use seedance-2-cinematic-video-prompt-engineer verbatim as the system layer for the generation step.

3. Analyze the scenario

Identify scenes by camera setup change — a new scene starts on a cut or a discrete camera move.
Flag: freeze moments, VFX states, composited overlays (text/UI/logos), loops.
Lock: wardrobe, location, character descriptions.

4. Analyze every reference image

For each image, extract:

Wardrobe (colors, cuts, accessories)
Interior / location elements
Hand positions, props, gestures
VFX style if present (grid / pixel / wireframe / particle, color, coverage)
Camera angle and framing
Color grade / lighting mood

Pick the PRIMARY reference (usually the character + location anchor) — it maps to the most scenes.

5. Map images → scenes

Build refs[] first. Each image gets:

img — exact filename
s — CSV of scene IDs it applies to
r — ≤ 80-char match descriptor
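As a rough illustration of the mapping step, the sketch below builds one refs[] entry and checks scene-mapping coherence. Both function names are hypothetical, and two details are assumptions: that the refs[] s field is a comma-separated string without spaces (unlike the top-level s array), and that a scene ID referenced by an image but absent from the scene list counts as an error.

```python
def make_ref(img: str, scene_ids: list[str], descriptor: str, max_r: int = 80) -> dict:
    """Build one refs[] entry; raise if it breaks the limits described above."""
    if len(descriptor) > max_r:
        raise ValueError(f"r is {len(descriptor)} chars, limit is {max_r}")
    if not scene_ids:
        raise ValueError(f"{img} maps to no scenes")
    # Assumption: the CSV is written without spaces between IDs.
    return {"img": img, "s": ",".join(scene_ids), "r": descriptor}


def dangling_scene_ids(refs: list[dict], scenes: list[dict]) -> list[str]:
    """Scene IDs that some reference points at but that no scene defines."""
    known = {scene["id"] for scene in scenes}
    dangling = []
    for ref in refs:
        for sid in ref["s"].split(","):
            sid = sid.strip()
            if sid and sid not in known and sid not in dangling:
                dangling.append(sid)
    return dangling
```

Running dangling_scene_ids again after scenes are merged in step 8 is a cheap way to catch references that still point at a removed scene ID.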

6. Write global (`g`)

≤ 300 chars. Must cover: composited elements, wardrobe lock, location lock, VFX rules.

7. Write scenes (`s[]`)

Per scene:

c — camera-only shorthand, ≤ 80 chars. Use the full verb palette (ROCKET, whip, CRASH stop, orbit, corkscrew, bullet-time, …).
p — visual frame content only, ≤ 250 chars. No camera repetition. Explicit freeze / VFX scoping.
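The per-field limits from steps 5 to 7 (r ≤ 80, g ≤ 300, c ≤ 80, p ≤ 250) are easy to audit in one pass. This is a minimal sketch with a hypothetical function name; it counts Python string length (Unicode code points), and whether Seedance 2 counts characters the same way is an assumption.

```python
# Limits as stated in steps 5 to 7 of this workflow.
LIMITS = {"g": 300, "c": 80, "p": 250, "r": 80}


def over_limit(prompt: dict) -> list[tuple[str, int, int]]:
    """List (field path, actual length, limit) for every field that is too long."""
    found = []

    def check(path: str, text: str, limit: int) -> None:
        if len(text) > limit:
            found.append((path, len(text), limit))

    check("g", prompt["g"], LIMITS["g"])
    for ref in prompt["refs"]:
        check(f"refs[{ref['img']}].r", ref["r"], LIMITS["r"])
    for scene in prompt["s"]:
        check(f"s[{scene['id']}].c", scene["c"], LIMITS["c"])
        check(f"s[{scene['id']}].p", scene["p"], LIMITS["p"])
    return found
```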

8. Count characters

If the total JSON length exceeds the cap (default 3500 chars):

Compress p fields first (dense abbreviations).
Merge adjacent scenes with similar camera + content.
Shorten g (but never remove the composited-elements note).
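The count itself is worth scripting, since whitespace from pretty-printing can eat into the budget. The sketch below is one way to do it, with hypothetical function names; it assumes the cap applies to the compact serialization and that non-ASCII characters count as one character each.

```python
import json


def compact(prompt: dict) -> str:
    """Serialize with no padding after separators and no \\u escapes."""
    return json.dumps(prompt, separators=(",", ":"), ensure_ascii=False)


def budget_report(prompt: dict, cap: int = 3500) -> dict:
    """Report total length, overshoot, and the scenes with the longest p fields."""
    total = len(compact(prompt))
    by_cost = sorted(prompt["s"], key=lambda scene: len(scene["p"]), reverse=True)
    return {
        "total": total,
        "over_by": max(0, total - cap),
        # Compression starts with p fields, so list the three longest first.
        "longest_p_ids": [scene["id"] for scene in by_cost[:3]],
    }
```

The longest_p_ids list points at where compressing p fields is likely to recover the most characters before any scenes are merged.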

9. Final pass

English throughout (translate from any source language).
Proper nouns / brands preserved.
No markdown, no fences, no commentary.
Valid JSON (closed brackets, escaped quotes).
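The last two items of the final pass can be verified on the raw string before it is pasted anywhere. A minimal sketch, assuming the deliverable is held in a Python string; final_pass is a hypothetical name, and the language and proper-noun checks are left to a human read.

```python
import json

FENCE = "`" * 3  # a markdown code fence, built here to avoid writing one literally


def final_pass(raw: str) -> list[str]:
    """Return problems found in the raw deliverable; empty list means it passes."""
    problems = []
    text = raw.strip()
    if FENCE in text:
        problems.append("contains a markdown fence")
    if not (text.startswith("{") and text.endswith("}")):
        problems.append("has text outside the JSON object")
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as err:
        problems.append(f"invalid JSON: {err.msg} at char {err.pos}")
    else:
        if not isinstance(parsed, dict):
            problems.append("top level is not a JSON object")
    return problems
```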

10. Deliver

Paste raw JSON only. Offer: "Want me to iterate on any shot, tighten the cap, or add a scene?"

Common variations

Loop video — the last scene's p must state "final frame matches shot 1 first frame" and the camera in c should reverse the opening move.
Freeze sequences — every frozen scene states pose + "Zero movement."
Text-overlay heavy — g explicitly lists "empty comp spaces"; scenes note where those spaces live in frame.
Multi-character — each character gets one locked wardrobe line in g; refs with multiple people map to most scenes as PRIMARY.
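The loop and freeze variations lend themselves to a simple string check. The sketch below is hypothetical: it assumes the two phrases are written verbatim as quoted above, and it relies on the operator to say which scene IDs are frozen, since nothing in the JSON marks them.

```python
LOOP_NOTE = "final frame matches shot 1 first frame"
FREEZE_NOTE = "Zero movement."


def variation_problems(scenes: list[dict], is_loop: bool = False,
                       frozen_ids: frozenset = frozenset()) -> list[str]:
    """Check the loop and freeze phrasing rules on a list of scene dicts."""
    problems = []
    if is_loop and scenes and LOOP_NOTE not in scenes[-1]["p"]:
        problems.append(f"scene {scenes[-1]['id']}: loop note missing from last p")
    for scene in scenes:
        if scene["id"] in frozen_ids and FREEZE_NOTE not in scene["p"]:
            problems.append(f"scene {scene['id']}: frozen scene lacks '{FREEZE_NOTE}'")
    return problems
```

The check cannot tell whether c actually reverses the opening move; that part of the loop rule still needs a manual look.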

Anti-patterns to avoid

Describing camera motion inside p — put it in c.
Writing the text content of composited titles — never.
Adding "type":"cut" or other schema extensions — not part of the contract.
Letting VFX bleed onto characters without explicit "chars CLEAN" scoping.
Forgetting to mark the PRIMARY reference.
Skipping image pre-flight on face-heavy refs — most filter fails are preventable at upload time (see seedance-2-censorship-bypass).
Age words in p — young, child, elderly, etc., raise scrutiny on the entire prompt.
Emotional framing — "remembering", "sad", "hopeful" → replaced with visual facts.
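The first anti-pattern in the list above, camera motion described inside p, can be roughly linted. This is a sketch only: the term list is just the partial verb palette named in step 7, the function name is hypothetical, and a hit may be a false positive when the word describes something in frame rather than a camera move.

```python
import re

# Partial palette from step 7; extend it with whatever verbs the team uses in c.
CAMERA_TERMS = ["ROCKET", "whip", "CRASH stop", "orbit", "corkscrew", "bullet-time"]
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in CAMERA_TERMS) + r")\b",
    re.IGNORECASE,
)


def camera_terms_in_p(scenes: list[dict]) -> list[tuple[str, list[str]]]:
    """Return (scene id, sorted camera terms found in p) for each offending scene."""
    flagged = []
    for scene in scenes:
        hits = sorted({match.group(0).lower() for match in PATTERN.finditer(scene["p"])})
        if hits:
            flagged.append((scene["id"], hits))
    return flagged
```

Flagged scenes are candidates for review, not automatic failures: the intended fix is to move the motion into c and leave p as frame content.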

Pairs with

seedance-2-cinematic-video-prompt-engineer — the system prompt this workflow consumes.
seedance-2-censorship-bypass — the full face-filter bypass rules.
happy-horse-prompt-rules — the equivalent rules for the competing Happy Horse model; useful when picking which generator suits a brief.