MiniMax (Hailuo) Seedance 2.0: Standard vs Fast vs Mini — real faces with video + audio omni references

JUN 26, 2026

Seedance 2.0 omni on MiniMax: real faces, driven by a reference video and audio

Seedance 2.0 keeps a real human face photorealistic where many video models distort it, and MiniMax’s omni references now combine an image, a video, and an audio clip in one request. POST videos/create accepts up to 9 reference images (@Image1…@Image9), 3 reference videos (@Video1…@Video3), and 3 reference audio clips (@Audio1…@Audio3), each addressed by an @-tag in the prompt.
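
Those limits can be checked on the client before a request is sent. The sketch below is only an illustration: `check_ref_tags` is a hypothetical helper, not part of the API, and it assumes tags are matched case-insensitively, since this article writes both @Image1 and @image1.

```shell
# Hypothetical client-side check: every @image/@video/@audio tag in a prompt
# must fall within the limits stated above (9 images, 3 videos, 3 audio clips).
# Assumption: tag names are treated as case-insensitive.
check_ref_tags() {
  prompt=$1
  for tag in $(printf '%s\n' "$prompt" | grep -oiE '@(image|video|audio)[0-9]+'); do
    # Keep only the digits of the tag, e.g. "@image1" -> "1".
    n=$(printf '%s' "$tag" | tr -cd '0-9')
    case $tag in
      @[Ii]*) max=9 ;;   # image tags
      *)      max=3 ;;   # video and audio tags
    esac
    if [ "$n" -lt 1 ] || [ "$n" -gt "$max" ]; then
      echo "reference tag out of range: $tag" >&2
      return 1
    fi
  done
}

check_ref_tags 'Replace the woman in @video1 with @image1' && echo "tags ok"
```

The check only looks at tag numbers; it does not verify that a matching file ID is actually supplied in the request.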

Here we drop one person into another’s footage: take a studio portrait (@image1), replace the woman in a rope-skipping clip (@video1) with her, and have her speak in the voice from an audio clip (@audio1). The same request runs on all three Seedance 2.0 tiers: Seedance-2.0 (Standard), Seedance-2.0-Fast, and Seedance-2.0-Mini. That makes the results directly comparable.

The three references

A photo, a video, and an audio clip go in together:

@image1 — the face and identity to drop in. A real face in an image reference is fine.

@video1 — the scene, rope, and motion to keep. Its face is blurred on purpose: Hailuo’s Seedance declines a video reference that shows a clear real face (the job comes back moderated, “Face Detected”), and the identity you keep comes from @image1 anyway. A quick blur over the face in the reference clip is all it takes.

@audio1 — the speech to drive (about 2 seconds).
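
Any video editor can apply the face blur on @video1. As one possible approach, the sketch below builds an ffmpeg filter graph from the standard `crop`, `boxblur`, and `overlay` filters to blur a fixed rectangle. The coordinates, the file names, and the `blur_region_filter` helper are illustrative assumptions, and a fixed box is only enough if the face stays inside that rectangle for the whole clip.

```shell
# Hypothetical helper: print an ffmpeg filter graph that blurs the rectangle
# at (x, y) with size w x h and overlays it back onto the original frame.
blur_region_filter() {
  x=$1 y=$2 w=$3 h=$4
  printf '[0:v]crop=%s:%s:%s:%s,boxblur=20[face];[0:v][face]overlay=%s:%s' \
    "$w" "$h" "$x" "$y" "$x" "$y"
}

# Example invocation (assumed file names and coordinates; audio is copied unchanged):
# ffmpeg -i rope-skipping.mp4 \
#   -filter_complex "$(blur_region_filter 400 120 300 300)" \
#   -c:a copy rope-skipping-blurred.mp4
```

The blurred file is then the one to upload as the video reference.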

The prompt

Replace the woman in @video1 with @image1, keeping the same outdoor scene, rope, and jumping movements. She speaks out loud, clearly saying Count me jumping in the voice from @audio1, audible throughout the clip.

First upload each file with POST /files to get its fileID, then pass the image, video, and audio IDs as fileID, videoFileID1, and audioFileID1 respectively. Only the model and resolution change between the three runs below.

One tip for omni audio: spell out the spoken line and state that it is heard. MiniMax renders the @audio1 voice when the prompt says something like “clearly saying … audible throughout”; a bare “make her say @audio1” tends to return a silent track even though the request is accepted.
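
To keep that wording consistent across runs, the prompt can be assembled from the spoken line. This is a minimal sketch: `build_prompt` is a hypothetical helper, and its template simply mirrors the prompt used in this article.

```shell
# Hypothetical helper: wrap a spoken line in the phrasing that worked here,
# naming the line explicitly and stating that it is audible.
build_prompt() {
  line=$1
  printf 'Replace the woman in @video1 with @image1, keeping the same outdoor scene, rope, and jumping movements. She speaks out loud, clearly saying %s in the voice from @audio1, audible throughout the clip.' "$line"
}

PROMPT=$(build_prompt 'Count me jumping')
echo "$PROMPT"
```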

Seedance-2.0 (Standard) — 1080p

Standard is the only tier that offers 1080p and 4K. This run sends the same omni request at 1080p, the highest resolution in this comparison.

curl — Seedance-2.0 (Standard), 1080p, omni image + video + audio

curl --location 'https://api.useapi.net/v1/minimax/videos/create' \
--header 'Authorization: Bearer user:12345-…' \
--form 'model="Seedance-2.0"' \
--form 'prompt="Replace the woman in @video1 with @image1, keeping the same outdoor scene, rope, and jumping movements. She speaks out loud, clearly saying Count me jumping in the voice from @audio1, audible throughout the clip."' \
--form 'fileID="user:12345-minimax:1234567890-file:your-image"' \
--form 'videoFileID1="user:12345-minimax:1234567890-file:your-video"' \
--form 'audioFileID1="user:12345-minimax:1234567890-file:your-audio"' \
--form 'resolution="1080"' \
--form 'duration="5"' \
--form 'aspectRatio="9:16"'

Seedance-2.0-Fast — 720p

curl — Seedance-2.0-Fast, 720p, omni image + video + audio

curl --location 'https://api.useapi.net/v1/minimax/videos/create' \
--header 'Authorization: Bearer user:12345-…' \
--form 'model="Seedance-2.0-Fast"' \
--form 'prompt="Replace the woman in @video1 with @image1, keeping the same outdoor scene, rope, and jumping movements. She speaks out loud, clearly saying Count me jumping in the voice from @audio1, audible throughout the clip."' \
--form 'fileID="user:12345-minimax:1234567890-file:your-image"' \
--form 'videoFileID1="user:12345-minimax:1234567890-file:your-video"' \
--form 'audioFileID1="user:12345-minimax:1234567890-file:your-audio"' \
--form 'resolution="720"' \
--form 'duration="5"' \
--form 'aspectRatio="9:16"'

Seedance-2.0-Mini — 720p

Mini is the cheapest tier and has the same omni capability: a real face from the photo, plus video and audio references.

curl — Seedance-2.0-Mini, 720p, omni image + video + audio

curl --location 'https://api.useapi.net/v1/minimax/videos/create' \
--header 'Authorization: Bearer user:12345-…' \
--form 'model="Seedance-2.0-Mini"' \
--form 'prompt="Replace the woman in @video1 with @image1, keeping the same outdoor scene, rope, and jumping movements. She speaks out loud, clearly saying Count me jumping in the voice from @audio1, audible throughout the clip."' \
--form 'fileID="user:12345-minimax:1234567890-file:your-image"' \
--form 'videoFileID1="user:12345-minimax:1234567890-file:your-video"' \
--form 'audioFileID1="user:12345-minimax:1234567890-file:your-audio"' \
--form 'resolution="720"' \
--form 'duration="5"' \
--form 'aspectRatio="9:16"'
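
Since only `model` and `resolution` differ, the three runs can be driven from one loop. The sketch below reuses the form fields from the requests above. The `TOKEN` variable, the `DRY_RUN` switch, and the `create_video` and `run_all_tiers` helpers are assumptions added for illustration, and the file IDs are the same placeholders as before.

```shell
API_URL='https://api.useapi.net/v1/minimax/videos/create'
PROMPT='Replace the woman in @video1 with @image1, keeping the same outdoor scene, rope, and jumping movements. She speaks out loud, clearly saying Count me jumping in the voice from @audio1, audible throughout the clip.'

# Hypothetical helper: submit one request, or only print it while DRY_RUN is 1
# (the default here, so nothing is sent by accident).
create_video() {
  model=$1 resolution=$2
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "would POST model=$model resolution=$resolution"
    return 0
  fi
  # TOKEN is assumed to hold your API token.
  curl --location "$API_URL" \
    --header "Authorization: Bearer $TOKEN" \
    --form "model=\"$model\"" \
    --form "prompt=\"$PROMPT\"" \
    --form 'fileID="user:12345-minimax:1234567890-file:your-image"' \
    --form 'videoFileID1="user:12345-minimax:1234567890-file:your-video"' \
    --form 'audioFileID1="user:12345-minimax:1234567890-file:your-audio"' \
    --form "resolution=\"$resolution\"" \
    --form 'duration="5"' \
    --form 'aspectRatio="9:16"'
}

# Model and resolution pairs used in this article, written as model:resolution.
run_all_tiers() {
  for pair in Seedance-2.0:1080 Seedance-2.0-Fast:720 Seedance-2.0-Mini:720; do
    create_video "${pair%%:*}" "${pair##*:}"
  done
}
```

Calling `run_all_tiers` as is prints the three planned requests; setting `DRY_RUN=0` and `TOKEN` sends them.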

What it costs

Seedance models are billed per second — multiply the rate by your duration. For this 5-second omni clip:

Model	Resolution	Credits/sec	Credits (5s)
`Seedance-2.0` (Standard)	1080p	50	250
`Seedance-2.0-Fast`	720p	17	85
`Seedance-2.0-Mini`	720p	11	55
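
The per-second billing reduces to a single multiplication. Here is a small sketch using the rates from the table; `rate_for` and `credits_for` are hypothetical helpers, and the rates apply only to the model and resolution pairs listed here.

```shell
# Credits per second from the table above
# (Standard at 1080p, Fast and Mini at 720p).
rate_for() {
  case $1 in
    Seedance-2.0)      echo 50 ;;
    Seedance-2.0-Fast) echo 17 ;;
    Seedance-2.0-Mini) echo 11 ;;
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

# Total credits = rate per second * duration in seconds.
credits_for() {
  rate=$(rate_for "$1") || return 1
  echo $(( rate * $2 ))
}

credits_for Seedance-2.0 5        # 250
credits_for Seedance-2.0-Fast 5   # 85
credits_for Seedance-2.0-Mini 5   # 55
```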

Mini is the natural default for iterating — the same omni capability at the lowest cost. Move up to Standard when you need 1080p or 4K. See the POST videos/create credits estimator for the full per-resolution breakdown.

See the POST videos/create reference for every @Image / @Video / @Audio tag, the reference-media limits, and a live Try-It console.