April 17, 20264 min readЧитать на русском

Seedance 2 — Censorship & Face-Filter Bypass Rules

Your prompt for a historical-themed commercial says "a small figure in grey coat runs down a cobblestone road past a uniformed silhouette." Generation request fires; Seedance 2 returns a content-moderation error. You strip "small" and try again — flagged. You rewrite to avoid "soldier" — flagged. You swap to an entirely innocuous prompt with the same reference image — still flagged. Eventually it turns out the filter wasn't reading your prose at all: it was triggering on face detection in the reference image. You needed to fix the picture, not the words.

Once you know both halves of the moderation surface exist, the fix is symmetrical: a tweak to the reference image before upload, and a discipline you apply to every prompt's phrasing. Either half alone leaves the other failing silently.

The pattern is a two-layer defense for Seedance 2 content moderation: image-side (a white grid overlay on the reference) plus prompt-side (phrasing hygiene in every p field). Apply on any job involving real faces, portraits, crowd photography, politically sensitive imagery, or references that contain grids or pattern overlays already.

When to apply

Mandatory when reference images include:

Real human faces (close-up or mid-shot)
Portrait placards, memorial boards, photo walls, ID photos
Dense crowds with visible faces
Anime / illustrated faces (false positives are common)
AI-generated portraits at high resolution

Recommended for politically or emotionally charged scenarios (war, memorial, protest, historical). Skip for full environmental shots with no visible faces.

Layer 1 — Image-side: grid overlay

Break up the pixel patterns the detector uses without damaging what the generator learns from the reference.

Specs

| Setting | Value | |---|---| | Color | White | | Opacity | 100% solid — semi-transparent does NOT work | | Line width | 8–15px (tied to density) | | Coverage | Full frame — not face-only |

Density presets

| Preset | Grid | Line width | Use when | |---|---|---|---| | Light | 4×4 | 15px | Low-risk, mostly environmental, faces small/blurred | | Standard | 6×6 | 12px | Default — balanced, works for most | | Dense | 10×10 | 8px | Clear front-facing portraits, high-res faces, repeat failures |

Escalation

If Standard still trips the filter:

Increase to Dense (10×10, 8px)
Downscale the reference to 1024px width before applying the grid
Combine both

Note on existing reference grids

If the reference already carries a grid-like pattern (production chroma placards with + corner markers, tracking grids, checker overlays), that often already helps — but verify the grid is opaque white, not the magenta/green chroma variant, and dense enough. If not, lay a proper white grid on top.

Layer 2 — Prompt-side: phrasing hygiene

Do

Visual facts only — what the camera sees, not character psychology, backstory, relationships, or emotions.
Full scene context — shot type + location + era + lighting + atmosphere in every prompt.
Production language — 2–3 terms: shot type (wide / close-up), camera movement (tracking / dolly / locked-off), lens/grain (35mm grain, anamorphic), lighting (rim light, volumetric rays, overcast flat).
Role over age — "a figure in grey coat" rather than "a young man"; "a marcher" rather than "an elderly woman".
Explicit reference tagging — @Image 1 / @img1 when you need the model to lock to a specific reference ("environment based on @img1", "first frame = @img2").
Specific negative prompt — target actual artifacts: no jitter, no warping, no flickering, no text morphing, no garbled logos.
Grid-artifact suppression when using the grid overlay: add no grid lines, no overlay, no mesh, clean skin, smooth image to the prompt.

Don't

Age-signal words — child, kid, young, boy, girl, teen, minor, baby, infant, toddler, elderly. These raise scrutiny on the entire prompt even when the image is safe.
Emotional or narrative framing — motivations, relationships, backstory, "feels sad", "remembering".
Sparse single-action prompts without setting — "a soldier shoots someone" fails; full contextualized version works.
Named public figures or IP names — use descriptive roles instead.
Generic negative-prompt dumps — keep negatives short and artifact-specific.

Flagged → fixed examples

| ❌ Flagged | ✅ Works | |---|---| | "a young boy running away from a soldier" | "wide shot, 1940s Eastern European street, a small figure in grey coat runs down cobblestone road past a uniformed silhouette, overcast flat light, 35mm grain, documentary handheld" | | "a girl remembering her grandfather" | "medium shot, marcher in blue jacket holds a portrait placard among a dense crowd, soft overcast daylight, shallow DOF, documentary framing" | | "soldiers fighting" | "wide establishing shot, trench line at dawn, figures in field-grey silhouettes move between sandbags, volumetric smoke, cold blue hour light, anamorphic lens" |

How the two layers compose

Layer 1 is upload-time, applied to the reference image. Layer 2 is generation-time, applied to every prompt that consumes that reference. Skipping either layer fails the same way the wrong layer would — silently and stochastically. Most filter failures I've seen are preventable at upload time, before a single token is written.

These rules are operationalized by the seedance-2-cinematic-video-prompt-engineer system prompt (which includes a Safety / Censorship section derived from this page) and by the seedance-2-prompt-generation workflow (where step 0 is image pre-flight).