
How to Write Better Seedance 2.0 Video Prompts

Seedance 2.0 is one of the most capable multimodal video models available today: you can combine text, images, reference videos, and audio in a single generation. The catch? Results depend heavily on how you write the prompt — especially how you tell the model what each uploaded file is for.

This guide walks through the syntax and patterns that consistently produce strong output on HiArt, whether you are making a 6-second product loop or a 15-second narrative clip.

What you can upload

Before writing prompts, know the limits — planning your assets upfront saves rework.

Input limits

Input	Limit	Format	Max size / duration
Images	≤ 9	jpeg, png, webp, bmp, tiff, gif	30 MB each
Videos	≤ 3	mp4, mov	50 MB each, 2–15s total
Audio	≤ 3	mp3, wav	15 MB each, ≤ 15s total
Text	Natural language prompt	—	—
Total	≤ 12 files	—	—
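
The limits in the table can be checked before you upload anything. Below is a minimal Python sketch, not part of any HiArt or Seedance API: `LIMITS`, `classify`, and `check_uploads` are hypothetical names, files are classified by extension only, 1 MB is assumed to be 1,048,576 bytes, and `.jpg` is assumed to count as jpeg. Duration limits are not checked, since that would require a media library.

```python
from collections import Counter
from pathlib import Path

MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes

# Limits copied from the table above. The structure and names are illustrative.
LIMITS = {
    "image": {"max_count": 9, "max_mb": 30,
              # ".jpg" is an assumption; the table lists only "jpeg".
              "exts": {".jpeg", ".jpg", ".png", ".webp", ".bmp", ".tiff", ".gif"}},
    "video": {"max_count": 3, "max_mb": 50, "exts": {".mp4", ".mov"}},
    "audio": {"max_count": 3, "max_mb": 15, "exts": {".mp3", ".wav"}},
}
MAX_TOTAL_FILES = 12


def classify(filename):
    """Return "image", "video", or "audio" based on the extension, else None."""
    ext = Path(filename).suffix.lower()
    for kind, rule in LIMITS.items():
        if ext in rule["exts"]:
            return kind
    return None


def check_uploads(files):
    """files: list of (filename, size_in_bytes). Returns a list of problems."""
    problems = []
    counts = Counter()
    for name, size in files:
        kind = classify(name)
        if kind is None:
            problems.append(f"{name}: unsupported format")
            continue
        counts[kind] += 1
        if size > LIMITS[kind]["max_mb"] * MB:
            problems.append(f"{name}: larger than {LIMITS[kind]['max_mb']} MB")
    for kind, n in counts.items():
        if n > LIMITS[kind]["max_count"]:
            problems.append(f"too many {kind} files: {n} > {LIMITS[kind]['max_count']}")
    if len(files) > MAX_TOTAL_FILES:
        problems.append(f"too many files in total: {len(files)} > {MAX_TOTAL_FILES}")
    return problems
```

Note that the per-type caps add up to 15 files, so the 12-file total is a separate check: nine images plus three videos already fills the budget and leaves no room for audio.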

Output specs

Duration: 4–15 seconds (you choose)
Auto-generated sound effects and background music
Resolution: 480p–720p

Good to know

Realistic human faces in uploads are blocked for compliance — use stylized or non-realistic references instead.
Generations that use reference videos cost slightly more credits than image-only generations.
Upload the files that matter most for look and rhythm first.

The @ reference system

Every uploaded asset gets a slot — @Image1, @Video2, @Audio1, and so on. The model does not infer what each file is for. You must say it explicitly in the prompt.

@Image1    @Image2    @Image3
@Video1    @Video2
@Audio1

Common reference roles

Purpose	Example
First frame	`@Image1 as the first frame`
Last frame	`@Image2 as the last frame`
Character look	`@Image1's character as the subject`
Scene / background	`scene references @Image3`
Camera movement	`reference @Video1's camera movement`
Action / choreography	`reference @Video1's action choreography`
VFX / transitions	`completely reference @Video1's effects and transitions`
Rhythm / tempo	`video rhythm references @Video1`
Background music	`BGM references @Audio1`
Sound effects	`sound effects reference @Video3's audio`

You can stack multiple roles in one prompt:

@Image1's character as the subject, reference @Video1's camera movement
and action choreography, BGM references @Audio1, scene references @Image2
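
If you assemble prompts programmatically, the role phrases from the table can be kept as templates and joined. This is a rough Python sketch: the `ROLE_TEMPLATES` keys and the `stack_roles` helper are hypothetical names chosen for illustration, and only the phrase wording comes from the table above.

```python
# Phrase templates taken from the "Common reference roles" table.
# The dictionary keys are illustrative labels, not Seedance keywords.
ROLE_TEMPLATES = {
    "first_frame": "{ref} as the first frame",
    "last_frame": "{ref} as the last frame",
    "character": "{ref}'s character as the subject",
    "scene": "scene references {ref}",
    "camera": "reference {ref}'s camera movement",
    "action": "reference {ref}'s action choreography",
    "effects": "completely reference {ref}'s effects and transitions",
    "rhythm": "video rhythm references {ref}",
    "bgm": "BGM references {ref}",
    "sfx": "sound effects reference {ref}'s audio",
}


def stack_roles(assignments):
    """assignments: list of (role, ref) pairs such as ("camera", "@Video1").

    Returns the role phrases joined with commas, in the order given.
    Raises KeyError for a role that is not in ROLE_TEMPLATES.
    """
    return ", ".join(ROLE_TEMPLATES[role].format(ref=ref) for role, ref in assignments)


print(stack_roles([
    ("character", "@Image1"),
    ("camera", "@Video1"),
    ("bgm", "@Audio1"),
    ("scene", "@Image2"),
]))
```

Keeping one template per role makes it harder to write a bare reference that never says what the file is for.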

How to structure a prompt

Strong prompts usually follow this order:

[Subject] + [Scene] + [Action/Motion] + [Camera] +
[Timing] + [Effects/Transitions] + [Audio] + [Style/Mood]
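
One way to keep that order consistent across many prompts is to fill in named fields and let a small helper arrange them. The sketch below is an illustration only: `FIELD_ORDER` and `build_prompt` are hypothetical names, and joining the parts as separate sentences is a stylistic assumption rather than a Seedance requirement.

```python
# Field order taken from the structure above; the key names are illustrative.
FIELD_ORDER = ["subject", "scene", "action", "camera",
               "timing", "effects", "audio", "style"]


def build_prompt(parts):
    """parts: dict mapping a field name to its text. Missing fields are skipped.

    Returns one string with the supplied fields in FIELD_ORDER,
    each ending in a single period.
    """
    unknown = set(parts) - set(FIELD_ORDER)
    if unknown:
        raise ValueError(f"unknown prompt fields: {sorted(unknown)}")
    sentences = []
    for field in FIELD_ORDER:
        text = parts.get(field, "").strip()
        if text:
            sentences.append(text.rstrip(".") + ".")
    return " ".join(sentences)


print(build_prompt({
    "camera": "Slow push in, close-up",
    "subject": "The man in @Image1",
    "style": "Cinematic quality, film grain",
}))
```

The fields can be supplied in any order; the helper always emits subject first and style last.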

For clips longer than 10 seconds, break the action into timed beats — the model follows these much more reliably:

0–3s: opening scene, camera, action
3–6s: mid-section development
6–10s: climax or key action
10–15s: ending shot, final text or branding
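
Timed beats are easy to get wrong by a second or two, so it can help to generate them from data. The following Python sketch assumes beats are contiguous, start at 0, and end exactly at the chosen output duration, as in the example above. `format_beats` is a hypothetical helper, and the contiguity rule is this sketch's own choice: the templates later in this guide use gapped ranges such as 0–3s and 4–8s, so loosen the check if you prefer that style.

```python
def format_beats(beats, duration):
    """beats: list of (start, end, description) in whole seconds.

    Returns the beats as "start–end s: description" lines.
    Raises ValueError if the beats leave a gap, overlap, or do not
    cover exactly 0..duration.
    """
    if not beats:
        raise ValueError("at least one beat is required")
    expected_start = 0
    lines = []
    for start, end, text in beats:
        if start != expected_start:
            raise ValueError(f"beat starts at {start}s, expected {expected_start}s")
        if end <= start:
            raise ValueError(f"beat {start}-{end}s has no length")
        lines.append(f"{start}\u2013{end}s: {text}")  # \u2013 is the en dash used above
        expected_start = end
    if expected_start != duration:
        raise ValueError(f"beats end at {expected_start}s, output is {duration}s")
    return "\n".join(lines)


print(format_beats(
    [(0, 3, "opening scene, camera, action"),
     (3, 6, "mid-section development"),
     (6, 10, "climax or key action"),
     (10, 15, "ending shot, final text or branding")],
    duration=15,
))
```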

Camera language cheat sheet

Seedance responds well to standard film terminology. Combine a movement with a shot size for finer control, for example "slow push in to a close-up".

Movements

Term	What it does
Push in / Slow push	Camera moves toward subject
Pull back	Camera moves away from subject
Pan left/right	Horizontal rotation
Tilt up/down	Vertical rotation
Track / Follow shot	Camera follows subject
Orbit / Revolve	Camera circles subject
One-take / Oner	Single continuous shot, no cuts

Shot sizes

Term	Frame
Extreme close-up	Eyes, mouth, or small detail
Close-up	Face fills frame
Medium shot	Waist up
Full shot	Entire body
Wide / Establishing	Full environment

Examples that work

Keep a character consistent

Anchor the subject to a reference image, then describe the scene beat by beat:

The man in @Image1 walks tiredly down the hallway, slowing his steps,
finally stopping at his front door. Close-up on his face — he takes a
deep breath, adjusts his emotions, replaces the weariness with a relaxed
expression. After entering, his daughter and a pet dog run to greet him.
The interior is warm and cozy. Natural dialogue throughout.

Copy camera work from a reference video

Reference @Image1's male character. He is in @Image2's elevator.
Completely reference @Video1's camera movements and facial expressions.
Hitchcock zoom during the fear moment, then orbit shots of the elevator
interior. Elevator doors open, follow shot walking out. Exterior scene
references @Image3.

Extend an existing clip

Match the generation duration to the extension length (extend 5s → select 5s output):

Extend @Video1 by 15 seconds.
1–5s: Light and shadow slide across a wooden table through venetian blinds.
6–10s: A coffee bean drifts down from the top of frame. Camera pushes in
until the screen goes black.
11–15s: Text appears — "Lucky Coffee", "Breakfast", "AM 7:00-10:00".
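
The duration rule for extensions can be checked mechanically. This is a small Python sketch under stated assumptions: it looks for the literal wording "Extend @VideoN by N seconds" used in the example above, and `EXTEND_PATTERN`, `extension_seconds`, and `check_extension` are hypothetical names. The 4–15 second range comes from the output specs earlier in this guide.

```python
import re

# Matches phrases like "Extend @Video1 by 15 seconds" or "extend @Video2 by 5s".
EXTEND_PATTERN = re.compile(
    r"extend\s+@Video\d+\s+by\s+(\d+)\s*(?:seconds?|secs?|s)\b",
    re.IGNORECASE,
)


def extension_seconds(prompt):
    """Return the requested extension length in seconds, or None if absent."""
    match = EXTEND_PATTERN.search(prompt)
    return int(match.group(1)) if match else None


def check_extension(prompt, selected_seconds):
    """Return a list of problems with the selected output duration."""
    problems = []
    if not 4 <= selected_seconds <= 15:
        problems.append("output duration must be 4-15 seconds")
    requested = extension_seconds(prompt)
    if requested is not None and requested != selected_seconds:
        problems.append(
            f"prompt extends by {requested}s but output is set to {selected_seconds}s"
        )
    return problems
```

A prompt that does not ask for an extension passes the second check untouched, so the helper can run on every prompt.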

Product showcase

Deconstruct the hamburger in @Image1. Static camera. Hamburger suspended and
rotating mid-air. Ingredients gently separate while maintaining shape.
Bun top, lettuce, tomato, beef patties, melting cheese, bun base — all
descend and reassemble into a complete burger. Cheese melts and drips,
lettuce and tomato dewdrops glisten.

Beat-synced editing

@Image1 @Image2 @Image3 @Image4 @Image5 @Image6 @Image7 — match the
keyframe positions and overall rhythm of @Video1 for beat-synced cuts.
More dynamic character movement. Dreamlike visual style with strong tension.
Adjust shot sizes and lighting to follow the music.

Style and audio modifiers

Append short phrases at the end of your prompt to steer look and sound:

Visual: Cinematic quality, film grain, shallow depth of field · 2.35:1 widescreen, 24fps · Anime style / Photorealistic
Mood: Tense and suspenseful · Warm and healing · Epic and grand
Audio: Background music: grand and majestic · Beat-synced transitions matching music rhythm · Voice tone references @Video1

A quick workflow

Decide the video type — ad, drama, MV, product, educational.
Gather images, reference videos, and audio; note which slot each will use.
Assign a role to every @ reference — first frame, camera, BGM, etc.
Write the scene in order: subject → environment → action → camera → timing.
Add style and audio modifiers; check file count and duration limits.
Generate, review, and tighten ambiguous lines in the next iteration.

Mistakes to avoid

Vague references — "reference @Video1" without saying camera, action, or rhythm.
Conflicting camera notes — static camera and orbit shot in the same segment.
Overpacked short clips — three scene changes in 4 seconds rarely land well.
Unassigned uploads — five images uploaded but only two mentioned in the prompt.
No audio direction — sound design is one of the fastest quality wins.
Duration mismatch — a script with 15 seconds of timed beats but a 5-second output selected.
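
The "unassigned uploads" mistake is the easiest one to catch automatically. Below is a minimal Python sketch that compares the slots you uploaded with the slots the prompt mentions. `SLOT_PATTERN` and `check_references` are hypothetical names, and the check only confirms that a slot appears in the prompt, not that it was given a clear role.

```python
import re

# Slot names as used in this guide: @Image1, @Video2, @Audio1, and so on.
SLOT_PATTERN = re.compile(r"@(?:Image|Video|Audio)\d+")


def check_references(prompt, uploaded_slots):
    """Compare uploaded slots with the slots mentioned in the prompt.

    Returns a dict with:
      "unassigned": uploaded slots the prompt never mentions
      "unknown":    slots the prompt mentions that were not uploaded
    """
    mentioned = set(SLOT_PATTERN.findall(prompt))
    uploaded = set(uploaded_slots)
    return {
        "unassigned": sorted(uploaded - mentioned),
        "unknown": sorted(mentioned - uploaded),
    }


report = check_references(
    "@Image1's character as the subject, BGM references @Audio1",
    ["@Image1", "@Image2", "@Audio1"],
)
print(report)
```

Here `@Image2` would be reported as unassigned, which is the cue to either give it a role in the prompt or drop the upload.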

Copy-paste templates

15s product ad

Reference @Video1's editing style and camera transitions. Replace @Video1's
product with @Image1 as the hero product.
0–3s: Product enters with dynamic rotation, close-up on texture and logo.
4–8s: Angle transitions — front, side, back — with highlight scanning light.
9–12s: Product in lifestyle usage context.
13–15s: Hero shot, brand tagline appears, music builds to resolution.
Sound: Reference @Video1's background music. Add product interaction SFX.

15s short drama

Scene (0–5s): Close-up on reddened eyes, finger pointing accusingly, tears
streaming. Dialogue (A, choking): "What exactly are you trying to take from me?"
Scene (6–10s): Other character trembles, holds up evidence, steps forward.
Dialogue (B): "I'm not deceiving you! This is what he entrusted to me!"
Scene (11–15s): Evidence revealed. A freezes — anger shifts to shock.
Sound: Urgent piano, static interference, sobbing, button click.

13s dance clip

Have the character in @Image1 replicate the dance moves and beat-synced
music from @Video1. Generate a 13-second video. Movements smooth,
no stuttering or freezing.

Ready to try it? Open a video node on the HiArt canvas, select Seedance 2.0, upload your references, and paste a prompt from this guide. Start simple — one image, one clear action, one camera move — then layer complexity as you see what the model handles well.