
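How to Write Effective Video Prompts for Seedance 2.0

Seedance 2.0 is one of the most capable multimodal video models currently available. It can combine text, images, reference videos, and audio in a single generation. Output quality, however, depends heavily on how the prompt is written. The key is whether you tell the model explicitly what each uploaded file is for.

This article walks through the syntax and patterns that produce consistently good output on HiArt. They apply equally to a 6-second product loop and a 15-second narrative clip.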

What You Can Upload

Know the limits before you write a prompt. Planning your assets up front reduces rework.

Input limits

Input	Limit	Formats	Max size
Images	≤ 9	jpeg, png, webp, bmp, tiff, gif	30 MB each
Video	≤ 3	mp4, mov	50 MB each, 2–15 s combined
Audio	≤ 3	mp3, wav	15 MB each, ≤ 15 s combined
Text	Natural-language prompt	-	-
Total	≤ 12 files	-	-
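
The limits in the table can be checked before uploading. The following is a minimal sketch, not part of any HiArt or Seedance API: the `Asset` class, the `check_uploads` function, and the idea of passing in the format name and duration yourself are all assumptions made for illustration. Only the numbers come from the table.

```python
from dataclasses import dataclass

# Numbers copied from the limits table; the data model is hypothetical.
LIMITS = {
    "image": {"max_files": 9, "max_mb": 30,
              "formats": {"jpeg", "png", "webp", "bmp", "tiff", "gif"}},
    "video": {"max_files": 3, "max_mb": 50, "formats": {"mp4", "mov"}},
    "audio": {"max_files": 3, "max_mb": 15, "formats": {"mp3", "wav"}},
}
MAX_TOTAL_FILES = 12
VIDEO_SECONDS_RANGE = (2, 15)   # combined length of all reference videos
AUDIO_MAX_SECONDS = 15          # combined length of all audio files


@dataclass
class Asset:
    kind: str             # "image", "video", or "audio"
    fmt: str              # format name as listed in the table, e.g. "png"
    size_mb: float
    seconds: float = 0.0  # only meaningful for video and audio


def check_uploads(assets):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(assets) > MAX_TOTAL_FILES:
        problems.append(f"{len(assets)} files exceeds the total of {MAX_TOTAL_FILES}")
    for asset in assets:
        if asset.kind not in LIMITS:
            problems.append(f"unknown asset kind: {asset.kind}")
    for kind, rule in LIMITS.items():
        group = [a for a in assets if a.kind == kind]
        if len(group) > rule["max_files"]:
            problems.append(f"{len(group)} {kind} files exceeds {rule['max_files']}")
        for a in group:
            if a.fmt.lower() not in rule["formats"]:
                problems.append(f"{kind} format '{a.fmt}' is not in the supported list")
            if a.size_mb > rule["max_mb"]:
                problems.append(f"{kind} of {a.size_mb} MB exceeds {rule['max_mb']} MB")
    videos = [a for a in assets if a.kind == "video"]
    if videos:
        total = sum(a.seconds for a in videos)
        low, high = VIDEO_SECONDS_RANGE
        if not low <= total <= high:
            problems.append(f"combined video length {total}s is outside {low}-{high}s")
    audio_total = sum(a.seconds for a in assets if a.kind == "audio")
    if audio_total > AUDIO_MAX_SECONDS:
        problems.append(f"combined audio length {audio_total}s exceeds {AUDIO_MAX_SECONDS}s")
    return problems
```

Note that the per-type caps add up to 15 files, so the 12-file total is the limit you are more likely to hit when mixing many images with video and audio.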

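Output specs

Length: 4–15 seconds (you choose)
Sound effects and background music are generated automatically
Resolution: 480p–720p

Things to keep in mind

Uploaded photos of real human faces are blocked for compliance reasons. Use stylized or non-photorealistic references instead.
Generations that use a reference video cost slightly more credits than image-only generations.
Upload the files that most affect look and pacing first.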

The @ Reference System

Each uploaded asset is assigned a slot, such as @Image1, @Video2, or @Audio1. The model does not guess what each file is for, so you must state it in the prompt.

@Image1    @Image2    @Image3
@Video1    @Video2
@Audio1
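
To keep track of which file sits in which slot, a small helper can generate the labels. This sketch assumes slots are numbered per asset type in upload order; the article does not state the numbering rule, so verify it against what the HiArt interface actually shows. The function name `assign_slots` is hypothetical.

```python
from collections import defaultdict


def assign_slots(uploads):
    """Map slot labels to filenames.

    uploads: list of (kind, filename) tuples in upload order, where kind is
    "image", "video", or "audio". Numbering per kind is an assumption.
    """
    counters = defaultdict(int)
    slots = {}
    for kind, filename in uploads:
        counters[kind] += 1
        slots[f"@{kind.capitalize()}{counters[kind]}"] = filename
    return slots


slots = assign_slots([
    ("image", "hero.png"),
    ("video", "camera_ref.mp4"),
    ("image", "background.png"),
    ("audio", "beat.mp3"),
])
for label, filename in slots.items():
    print(label, filename)
```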

Common reference roles

Purpose	Example
First frame	`@Image1 as the first frame`
Last frame	`@Image2 as the last frame`
Character appearance	`@Image1's character as the subject`
Scene / background	`scene references @Image3`
Camera work	`reference @Video1's camera movement`
Action / choreography	`reference @Video1's action choreography`
VFX / transitions	`completely reference @Video1's effects and transitions`
Rhythm / tempo	`video rhythm references @Video1`
BGM	`BGM references @Audio1`
Sound effects	`sound effects reference @Video3's audio`

You can stack several roles in a single prompt:

@Image1's character as the subject, reference @Video1's camera movement
and action choreography, BGM references @Audio1, scene references @Image2
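
Stacked role clauses like the one above can be assembled from a lookup table, which makes it harder to leave a reference without a role. This is a sketch only: the phrasings are taken from the examples in this article, while the role keys and the `role_clause` function are names invented for illustration.

```python
# Phrases copied from the role examples in this article; keys are hypothetical.
ROLE_TEMPLATES = {
    "first_frame": "{ref} as the first frame",
    "last_frame": "{ref} as the last frame",
    "character": "{ref}'s character as the subject",
    "scene": "scene references {ref}",
    "camera": "reference {ref}'s camera movement",
    "choreography": "reference {ref}'s action choreography",
    "effects": "completely reference {ref}'s effects and transitions",
    "rhythm": "video rhythm references {ref}",
    "bgm": "BGM references {ref}",
    "sfx": "sound effects reference {ref}'s audio",
}


def role_clause(assignments):
    """Join (role, slot) pairs into one comma-separated clause.

    Raises KeyError for a role that has no template, so typos surface early.
    """
    return ", ".join(ROLE_TEMPLATES[role].format(ref=ref) for role, ref in assignments)


print(role_clause([
    ("character", "@Image1"),
    ("camera", "@Video1"),
    ("bgm", "@Audio1"),
    ("scene", "@Image2"),
]))
```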

Prompt Structure

Effective prompts tend to be most reliable when written in roughly this order:

[Subject] + [Scene] + [Action] + [Camera] +
[Timing] + [Effects/Transitions] + [Audio] + [Style/Mood]

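For clips longer than 10 seconds, splitting the action into time segments helps the model follow it quite faithfully:

0–3s: Opening, camera, action
3–6s: Mid-section development
6–10s: Climax or key action
10–15s: Ending, final text or brand display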
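
A time-segmented block like this can be generated and sanity-checked in code. The sketch below assumes segments should be contiguous, start at 0, and end exactly at the chosen output length, which mirrors the example above but is not a documented Seedance requirement. The `timeline` function is hypothetical; the 4–15 second range comes from the output specs.

```python
EN_DASH = "\u2013"  # matches the range style used in this article


def timeline(beats, total_seconds):
    """Format (start, end, description) beats as prompt lines.

    Raises ValueError if the beats leave gaps, overlap, or do not add up
    to total_seconds, or if total_seconds is outside the 4-15 s range.
    """
    if not 4 <= total_seconds <= 15:
        raise ValueError("output length must be between 4 and 15 seconds")
    lines = []
    previous_end = 0
    for start, end, text in beats:
        if start != previous_end:
            raise ValueError(f"segment starting at {start}s leaves a gap or overlap")
        if end <= start:
            raise ValueError(f"segment {start}-{end}s has no duration")
        lines.append(f"{start}{EN_DASH}{end}s: {text}")
        previous_end = end
    if previous_end != total_seconds:
        raise ValueError(f"beats end at {previous_end}s, not {total_seconds}s")
    return "\n".join(lines)


print(timeline(
    [
        (0, 3, "Opening, camera, action"),
        (3, 6, "Mid-section development"),
        (6, 10, "Climax or key action"),
        (10, 15, "Ending, final text or brand display"),
    ],
    total_seconds=15,
))
```

Checking the end time against the selected output length also catches the case where the script runs longer than the clip you asked for.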

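Camera Terminology Cheat Sheet

Seedance responds well to standard film terminology. Combining a camera move with a shot size gives you finer control.

Camera moves

Term	Meaning
Push in / Slow push	Camera moves toward the subject
Pull back	Camera moves away from the subject
Pan left/right	Horizontal rotation
Tilt up/down	Vertical rotation
Track / Follow shot	Camera follows the subject
Orbit / Revolve	Camera circles the subject
One-take / Oner	A continuous take with no cuts

Shot sizes

Term	Framing
Extreme close-up	Tight detail of the eyes, mouth, or another small feature
Close-up	Face fills the frame
Medium shot	Waist up
Full shot	Whole body
Wide / Establishing	The entire environment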

Examples That Work

Keeping a character consistent

Lock the subject with a reference image, then describe the scene beat by beat:

The man in @Image1 walks tiredly down the hallway, slowing his steps,
finally stopping at his front door. Close-up on his face — he takes a
deep breath, adjusts his emotions, replaces the weariness with a relaxed
expression. After entering, his daughter and a pet dog run to greet him.
The interior is warm and cozy. Natural dialogue throughout.

Reproducing a reference video's camera work

Reference @Image1's male character. He is in @Image2's elevator.
Completely reference @Video1's camera movements and facial expressions.
Hitchcock zoom during the fear moment, then orbit shots of the elevator
interior. Elevator doors open, follow shot walking out. Exterior scene
references @Image3.

Extending an existing clip

Set the generation length to match the extension (to extend by 5 seconds, select a 5-second output):

Extend @Video1 by 15 seconds.
1–5s: Light and shadow slide across a wooden table through venetian blinds.
6–10s: A coffee bean drifts down from the top of frame. Camera pushes in
until the screen goes black.
11–15s: Text appears — "Lucky Coffee", "Breakfast", "AM 7:00-10:00".

Product showcase

Deconstruct the reference image. Static camera. Hamburger suspended and
rotating mid-air. Ingredients gently separate while maintaining shape.
Bun top, lettuce, tomato, beef patties, melting cheese, bun base — all
descend and reassemble into a complete burger. Cheese melts and drips,
lettuce and tomato dewdrops glisten.

Beat-synced editing

@Image1 @Image2 @Image3 @Image4 @Image5 @Image6 @Image7 — match the
keyframe positions and overall rhythm of @Video1 for beat-synced cuts.
More dynamic character movement. Dreamlike visual style with strong tension.
Adjust shot sizes and lighting to follow the music.

Style and Audio Modifiers

Appending short phrases to the end of a prompt steers the look and sound:

Visual: Cinematic quality, film grain, shallow depth of field · 2.35:1 widescreen, 24fps · Anime style / Photorealistic
Mood: Tense and suspenseful · Warm and healing · Epic and grand
Audio: Background music: grand and majestic · Beat-synced transitions matching music rhythm · Voice tone reference @Video1

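A Quick Workflow

Decide on the type of video: ad, drama, music video, product, educational, and so on.
Gather your images, reference videos, and audio, and note which slot each one will occupy.
Assign a role to every @ reference: first frame, camera, BGM, and so on.
Write the scene in order: subject → environment → action → camera → timing.
Add style and audio modifiers, then check the file-count and duration limits.
Generate, review, and tighten any vague lines in the next iteration.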

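Mistakes to Avoid

Vague references: writing only "reference @Video1" without saying whether you mean its camera work, action, or rhythm.
Conflicting camera instructions: asking for a static camera and an orbit in the same segment.
Overloading a short clip: three scene changes in 4 seconds almost never works.
Unassigned uploads: uploading five images but mentioning only two in the prompt.
No audio direction: sound design is one of the quickest ways to improve quality.
Duration mismatch: a script with 15 beats but a 5-second output selected.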
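
Two of these mistakes, unassigned uploads and vague video references, are mechanical enough to catch with a quick script before generating. The following is a rough sketch under stated assumptions: `lint_prompt` and both regular expressions are made up for this article, and the bare-reference check only recognizes the literal pattern "reference @VideoN" followed by punctuation or the end of the prompt.

```python
import re

# Matches slot labels such as @Image1, @Video2, @Audio1.
SLOT_RE = re.compile(r"@(?:Image|Video|Audio)\d+")

# Heuristic: "reference @Video1" followed directly by punctuation or the end
# of the text, i.e. with no "'s camera movement" or similar after it.
BARE_VIDEO_RE = re.compile(r"\breference\s+@Video\d+\s*(?=[.,;]|$)", re.IGNORECASE)


def lint_prompt(prompt, uploaded_slots):
    """Return warnings for unassigned uploads and vague video references."""
    mentioned = set(SLOT_RE.findall(prompt))
    uploaded = set(uploaded_slots)
    warnings = []
    for slot in sorted(uploaded - mentioned):
        warnings.append(f"{slot} is uploaded but never mentioned in the prompt")
    for slot in sorted(mentioned - uploaded):
        warnings.append(f"{slot} is mentioned but was not uploaded")
    for match in BARE_VIDEO_RE.finditer(prompt):
        phrase = match.group(0).strip()
        warnings.append(f"'{phrase}' does not say what to borrow (camera, action, rhythm)")
    return warnings


draft = "@Image1's character as the subject, reference @Video1."
for warning in lint_prompt(draft, ["@Image1", "@Image2", "@Video1"]):
    print(warning)
```

The remaining items on the list, such as conflicting camera instructions or too many scene changes, depend on meaning rather than syntax and still need a human read-through.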

Copy-and-Paste Templates

15-second product ad

Reference @Video1's editing style and camera transitions. Replace @Video1's
product with @Image1 as the hero product.
0–3s: Product enters with dynamic rotation, close-up on texture and logo.
4–8s: Angle transitions — front, side, back — with highlight scanning light.
9–12s: Product in lifestyle usage context.
13–15s: Hero shot, brand tagline appears, music builds to resolution.
Sound: Reference @Video1's background music. Add product interaction SFX.

15-second short drama

Scene (0–5s): Close-up on reddened eyes, finger pointing accusingly, tears
streaming. Dialogue (A, choking): "What exactly are you trying to take from me?"
Scene (6–10s): Other character trembles, holds up evidence, steps forward.
Dialogue (B): "I'm not deceiving you! This is what he entrusted to me!"
Scene (11–15s): Evidence revealed. A freezes — anger shifts to shock.
Sound: Urgent piano, static interference, sobbing, button click.

13-second dance clip

Have the character in @Image1 replicate the dance moves and beat-synced
music from @Video1. Generate a 13-second video. Movements smooth,
no stuttering or freezing.

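Ready to try it? Open a video node on the HiArt canvas, select Seedance 2.0, upload your references, and paste in a prompt from this guide. Start simple, with one image, one clear action, and one camera move, and add complexity once you know what the model handles well.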