JUN 9, 2026
Table of contents
- Demo 1: noir clip with three inline @-mentions
- Demo 2: V2V edit — round-trip the noir clip and add two more characters
- Why inline @-mentions matter
- Get started

Inline @-mentions on Google Flow API
Google Flow API v1 now supports inline @-mention markers in the prompt on POST /videos and POST /images. Place your character / image / voice references inside the prompt — @character_1, @referenceImage_2, @referenceAudio_1 — and the API wires them to the corresponding body slot at exactly that position in the sentence.
| Endpoint | Supported markers |
|---|---|
| POST /videos | @character_1..7, @referenceImage_1..7, @referenceAudio_1..5 |
| POST /images | @character_1..7, @reference_1..10 |
Markers are case-insensitive and opt-in: plain-text prompts without markers continue to work exactly as before. Each marker used in the prompt must have its matching body slot supplied.
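The rules above can be checked client-side before a request is sent. The following is a minimal sketch, not part of the API: `check_markers`, `MARKER_LIMITS`, and `SLOT_NAMES` are hypothetical names, the ranges are copied from the table above, and the body slot names follow the request examples in this post.

```python
import re

# Marker ranges copied from the table above; keys are lowercased because
# markers are case-insensitive.
MARKER_LIMITS = {
    "/videos": {"character": 7, "referenceimage": 7, "referenceaudio": 5},
    "/images": {"character": 7, "reference": 10},
}

# Body slot spelling as used in the request examples in this post.
SLOT_NAMES = {
    "character": "character",
    "referenceimage": "referenceImage",
    "referenceaudio": "referenceAudio",
    "reference": "reference",
}

MARKER_RE = re.compile(
    r"@(character|referenceImage|referenceAudio|reference)_(\d+)", re.IGNORECASE
)


def check_markers(endpoint, body):
    """Return a list of problems found in body["prompt"] (empty list means OK)."""
    limits = MARKER_LIMITS[endpoint]
    problems = []
    seen = set()
    for match in MARKER_RE.finditer(body.get("prompt", "")):
        kind, n = match.group(1).lower(), int(match.group(2))
        if (kind, n) in seen:
            continue  # repeated mentions of one marker are checked once
        seen.add((kind, n))
        marker = match.group(0)
        if kind not in limits:
            problems.append(f"{marker}: not listed for {endpoint}")
        elif not 1 <= n <= limits[kind]:
            problems.append(f"{marker}: index outside 1..{limits[kind]}")
        elif f"{SLOT_NAMES[kind]}_{n}" not in body:
            problems.append(f"{marker}: body slot {SLOT_NAMES[kind]}_{n} is missing")
    return problems
```

A prompt with no markers yields an empty list, which matches the opt-in behaviour described above. How the API itself reports a missing slot is not covered here.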
This post walks through two demos end to end:
- omni-flash R2V with three inline @-refs: a moonlit San Francisco noir clip.
- omni-flash V2V edit that round-trips the noir clip back to the API as a source video, then composites two new characters into it via inline @character_1 / @character_2.
All assets, references, voices, characters, and the two generated videos below were produced through the public Google Flow API v1.
Demo 1: noir clip with three inline @-mentions
The shot we want: a self-assured biker character sits on a vintage BMW motorcycle in an empty parking lot at night, Golden Gate Bridge in the far distance, San Francisco neo-noir atmosphere, slowly revs the engine, then delivers a single line in a deep raspy voice.
Source images
Three images uploaded via POST /assets/email — each used in a different role below.
- Character image (face reference passed to POST /characters)
- referenceImage_1 (@referenceImage_1 in the prompt)
- referenceImage_2 (@referenceImage_2 in the prompt)

Create the voice
A custom voice for the biker — Laomedeia base preset with a smoke-roughened performance direction.
curl — POST /voices
curl --location 'https://api.useapi.net/v1/google-flow/voices' \
--header 'Authorization: Bearer user:1234-…' \
--header 'Content-Type: application/json' \
--data '{
"email": "[email protected]",
"voice": "Laomedeia",
"displayName": "Smoker Lady",
"dialog": "Honey, you have no idea what I'\''ve seen in this town.",
"voicePerformance": "Deep raspy smoke-roughened, low gravelly pitch, slow drawl with audible exhales, world-weary"
}'
Create the character
Bundle the character image and the custom voice into a single named character.
curl — POST /characters
curl --location 'https://api.useapi.net/v1/google-flow/characters' \
--header 'Authorization: Bearer user:1234-…' \
--header 'Content-Type: application/json' \
--data '{
"displayName": "Smoker Biker",
"imageReference_1": "user:1234-email:…-image:129960d4-…",
"voice": "user:1234-email:…-voice:e25b42db-…-mid:fec00c57-…"
}'
Generate the noir clip
Three inline markers do the work. Notice the prompt does not describe what @referenceImage_1 or @referenceImage_2 are — no “motorcycle”, no “bridge”, no “Golden Gate” mentioned in the text. Identity comes purely from the references; the model uses the inline position to anchor each one in the noun phrase.
curl — POST /videos
curl --location 'https://api.useapi.net/v1/google-flow/videos' \
--header 'Authorization: Bearer user:1234-…' \
--header 'Content-Type: application/json' \
--data '{
"model": "omni-flash",
"aspectRatio": "landscape",
"duration": 8,
"character_1": "user:1234-email:…-character:7bf2bc7f-…-imgs:1-voice:e25b42db-…",
"referenceImage_1": "user:1234-email:…-image:6139e134-…",
"referenceImage_2": "user:1234-email:…-image:bb5032e0-…",
"prompt": "OPEN ON ambient neo-noir synthwave score already playing — low pulsing bass and distant saxophone. @character_1 sitting on @referenceImage_1 in an empty parking lot at night, hands on the controls. Clear sky, sharp moonlight, deep crisp shadows under tall amber street lamps, dry asphalt with faint road markings. @referenceImage_2 visible in the far distance. Iconic neo-noir atmosphere. Static locked-off camera, no zoom, no pan, no movement. @character_1 twists throttle once and revs @referenceImage_1 once — smoke briefly rises from @referenceImage_1 — then eases off, looks at the camera with a knowing half-smile and says slowly \"What a night for a ride.\""
}'
The 8-second result:
Every @-mention is grounded to its exact reference: the same vintage BMW motorcycle, the same Golden Gate Bridge in the distance, the same character identity from the source image, plus the smoke-roughened custom voice on the spoken line. Music kicks in from frame one.
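For readers who prefer to script the call, the curl request above could be assembled in Python roughly as follows. This is a sketch using only the standard library: `build_videos_request` is a hypothetical helper, the URL, headers, and field names are taken from the curl example, and the function only builds the request without sending it.

```python
import json
import urllib.request

# Endpoint URL as shown in the curl example above.
VIDEOS_URL = "https://api.useapi.net/v1/google-flow/videos"


def build_videos_request(token, prompt, slots, model="omni-flash",
                         aspect_ratio="landscape", duration=8):
    """Build (but do not send) a POST /videos request mirroring the curl call."""
    body = {
        "model": model,
        "aspectRatio": aspect_ratio,
        "duration": duration,
        **slots,  # e.g. {"character_1": "...", "referenceImage_1": "..."}
        "prompt": prompt,
    }
    return urllib.request.Request(
        VIDEOS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit it. The shape of the response is not shown in this post, so parsing it is left out.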
Demo 2: V2V edit — round-trip the noir clip and add two more characters
For the next shot, we want to take the noir clip above and add two more characters flanking the biker — one on the left dancing, one on the right jumping with arms raised. This exercises omni-flash’s V2V edit path via referenceVideo_1 combined with inline @character_* markers.
Two more character sources
Two more images uploaded via POST /assets/email, then turned into characters via POST /characters (no voice attached this time).
- @character_1 in the prompt (dances)
- @character_2 in the prompt (jumps with arms raised)

Get a referenceVideo_1 for the V2V edit
Two ways to get one:
- Reuse the generated video’s mediaGenerationId directly from the Demo 1 response (media[0].mediaGenerationId). No download or re-upload is needed.
- Upload an MP4 from disk via POST /assets/email with Content-Type: video/mp4, for video sources that aren’t already in your Flow history.
We’ll use the first.
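The first option might look like this in code. It is a sketch under one assumption taken from the text above: the Demo 1 response exposes the id at `media[0].mediaGenerationId`. The helper name `v2v_body_from_response` is hypothetical, and nothing else about the response shape is assumed.

```python
def v2v_body_from_response(video_response, prompt, characters):
    """Build a V2V edit body that reuses the first generated video as its source."""
    # Path taken from the text above: media[0].mediaGenerationId
    source_id = video_response["media"][0]["mediaGenerationId"]
    body = {
        "model": "omni-flash",
        "aspectRatio": "landscape",
        "referenceVideo_1": source_id,
        "prompt": prompt,
    }
    # characters is an ordered list of ids: character_1, character_2, ...
    for n, character_id in enumerate(characters, start=1):
        body[f"character_{n}"] = character_id
    return body
```

The resulting dictionary has the same fields as the V2V edit request shown later in this post and can be serialized with `json.dumps`.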
V2V edit constraints — heads up
V2V edit has tighter slot caps than R2V (5 image refs vs 7, 3 audio vs 5), accepts only omni-flash, and uses endFrameIndex_1 instead of duration to size the output. Full constraints in POST /videos → Model Capabilities.
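A pre-flight check for these constraints could be sketched as below. It covers only the limits stated in the paragraph above; `v2v_problems` is a hypothetical helper, and whether the API rejects or simply ignores `duration` on a V2V edit is not stated here, so that check is an assumption and should be read as a warning.

```python
import re

V2V_MAX_IMAGE_REFS = 5  # R2V allows 7
V2V_MAX_AUDIO_REFS = 3  # R2V allows 5


def v2v_problems(body):
    """Check a request body against the V2V edit constraints listed above."""
    if "referenceVideo_1" not in body:
        return []  # not a V2V edit request, nothing to check
    problems = []
    if body.get("model") != "omni-flash":
        problems.append("V2V edit accepts only omni-flash")
    image_refs = [k for k in body if re.fullmatch(r"referenceImage_\d+", k)]
    audio_refs = [k for k in body if re.fullmatch(r"referenceAudio_\d+", k)]
    if len(image_refs) > V2V_MAX_IMAGE_REFS:
        problems.append(f"more than {V2V_MAX_IMAGE_REFS} image refs")
    if len(audio_refs) > V2V_MAX_AUDIO_REFS:
        problems.append(f"more than {V2V_MAX_AUDIO_REFS} audio refs")
    if "duration" in body:
        # Assumption: treated as a warning, since V2V sizes output via endFrameIndex_1.
        problems.append("V2V edit uses endFrameIndex_1, not duration")
    return problems
```

The authoritative list remains the Model Capabilities section of POST /videos.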
One thing worth knowing from this demo: Google’s V2V moderation is non-deterministic. The exact same body can land on a content-safety reject one attempt and a clean success the next. A simple resubmit usually clears it — no input changes needed.
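A simple resubmit loop is enough to handle this. The sketch below is deliberately generic: `send` and `is_moderation_reject` are caller-supplied callables (hypothetical here), because this post does not show what a content-safety reject looks like in the response.

```python
import time


def submit_with_retry(send, body, is_moderation_reject, max_attempts=3, delay_s=0.0):
    """Resubmit the same, unchanged body while the response looks like a moderation reject.

    Returns (last_response, attempts_used).
    """
    response = None
    for attempt in range(1, max_attempts + 1):
        response = send(body)
        if not is_moderation_reject(response):
            return response, attempt
        if attempt < max_attempts:
            time.sleep(delay_s)  # optional pause between attempts
    # Every attempt was rejected; hand back the last response for inspection.
    return response, max_attempts
```

Capping the attempts keeps a genuinely disallowed prompt from looping forever, and returning the attempt count makes it easy to log how often a resubmit was needed.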
V2V edit with inline @character_1 + @character_2
referenceVideo_1 switches the request into V2V edit mode. The inline @character_* markers place each new character at the specified position in the scene without ever describing what they look like in the text — identity, again, comes entirely from the references.
curl — POST /videos (V2V edit)
curl --location 'https://api.useapi.net/v1/google-flow/videos' \
--header 'Authorization: Bearer user:1234-…' \
--header 'Content-Type: application/json' \
--data '{
"model": "omni-flash",
"aspectRatio": "landscape",
"character_1": "user:1234-email:…-character:5116516f-…-imgs:1",
"character_2": "user:1234-email:…-character:1555335b-…-imgs:1",
"referenceVideo_1": "user:1234-email:…-video:19ea81d0-…",
"prompt": "@character_1 on the LEFT side of the frame busts some cool noir-style dance moves — slow confident shoulder rolls, smooth hip sway, runs a hand through her hair, fitting the moody parking-lot vibe. @character_2 on the RIGHT side of the frame jumps up and down enthusiastically with both arms raised, beaming smile. The center character stays where she is. Same noir parking lot at night, same Golden Gate behind, same moonlit atmosphere from the source video. Static locked-off camera, no zoom, no pan, no movement."
}'
The result:
Three distinct characters in one frame, each anchored to a specific reference: the original biker preserved in place from the source video, plus the two new flanking characters in their assigned positions and actions. Moon, bridge, parking lot, and amber street lamps all carried through unchanged from the source.
Why inline @-mentions matter
Without inline markers, a video generation request with three references says “use these for context” but doesn’t tell the model where each one belongs, so the model has to guess from word order. With markers:
- Positional grounding. The model knows @character_1 goes in the noun phrase the marker sits in, not somewhere else.
- No descriptive cheating needed. You don’t have to describe what the reference is in the prompt ("the chrome motorcycle", "the Golden Gate Bridge"); identity comes from the reference. The prompt stays neutral and the model has to use the ref to render correctly.
- Multi-mention reuse. Mention the same @character_1 or @referenceImage_1 multiple times across one prompt. The top-level reference array de-duplicates by id; only the positional information stacks.
- Same wire shape across endpoints. Both /videos (R2V, V2V edit) and /images accept the same markers; pick the right slot name for the endpoint (@reference_N on /images, @referenceImage_N on /videos).
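The multi-mention point can be illustrated with a few lines of Python. This is only an illustration of the idea and makes no claim about how the API implements it: `mention_offsets` is a hypothetical helper that records each distinct marker once and stacks the positions where it appears.

```python
import re
from collections import defaultdict

MARKER_RE = re.compile(
    r"@(character|referenceImage|referenceAudio|reference)_(\d+)", re.IGNORECASE
)


def mention_offsets(prompt):
    """Map each distinct marker (lowercased) to every offset where it appears."""
    offsets = defaultdict(list)
    for match in MARKER_RE.finditer(prompt):
        # Lowercasing merges case variants, since markers are case-insensitive.
        offsets[match.group(0).lower()].append(match.start())
    return dict(offsets)
```

For the Demo 1 prompt this yields three keys (one character and two image references) even though the prompt contains seven mentions. That is the sense in which references de-duplicate while positions accumulate.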
Get started
Google Flow API v1 docs · Setup Google Flow · POST /videos · POST /images · POST /characters · POST /voices