Video Image to Video fal.ai

Veo 3.1 Reference-to-Video

Veo 3.1 Reference-to-Video costs $2.40/clip on FairStack. It is an image-to-video model for character-consistent videos, product demos, and brand content. No subscription is required: you pay per generation and get full REST API access. FairStack applies a transparent 20% margin on infrastructure cost, so you always see the real price.

FairStack price
$2.40/clip

What is Veo 3.1 Reference-to-Video?

Veo 3.1 Reference-to-Video is Google DeepMind's reference-guided video generation model. It creates video while maintaining visual consistency with the reference images you provide, preserving subject identity, visual style, and compositional characteristics throughout the clip.

Billing is per second at a base rate of $0.20 per second. The model supports both 1080p and 4K output with optional audio generation; 4K and audio each double the base cost. Each request can generate up to 8 seconds of video.

Compared to standard text-to-video models, which rely solely on prompts for visual direction, reference-guided generation gives significantly tighter control over the output's visual identity. Against Kling's reference-to-video options, Veo 3.1 brings Google's premium video quality and flexible resolution options. It is best suited for character-consistent video from reference images, brand content, and product demonstrations, where maintaining visual consistency with reference material is essential. It is available on FairStack at infrastructure cost plus a 20% platform fee.
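The pricing rules above can be worked through in a few lines. This is a minimal sketch, not FairStack's billing logic: `estimate_base_cost` is a hypothetical helper, and it assumes that the 4K and audio doublings compound (so both together cost four times the base) and that the quoted $0.20 is the rate before any platform fee. The page confirms neither assumption.

```python
from decimal import Decimal

BASE_RATE_PER_SECOND = Decimal("0.20")  # base rate quoted in the description
MAX_SECONDS = 8  # maximum clip length quoted in the description


def estimate_base_cost(seconds: int, use_4k: bool = False, audio: bool = False) -> Decimal:
    """Estimate the base cost of one clip in dollars.

    Assumption: the 4K and audio doublings multiply together.
    """
    if not 1 <= seconds <= MAX_SECONDS:
        raise ValueError(f"seconds must be between 1 and {MAX_SECONDS}")
    cost = BASE_RATE_PER_SECOND * seconds
    if use_4k:
        cost *= 2
    if audio:
        cost *= 2
    return cost


print(estimate_base_cost(8))                           # 1.60
print(estimate_base_cost(8, audio=True))               # 3.20
print(estimate_base_cost(8, use_4k=True, audio=True))  # 6.40
```

Using `Decimal` instead of floats keeps the dollar amounts exact, which matters when small per-second rates are multiplied and summed.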

Key Features

Reference image guidance
1080p and 4K output
Optional audio generation
Visual consistency

What are Veo 3.1 Reference-to-Video's strengths?

Maintains subject consistency
Premium quality
Flexible resolution

What are Veo 3.1 Reference-to-Video's limitations?

Expensive at $0.20/s base
Max 8 seconds
Audio doubles cost

What is Veo 3.1 Reference-to-Video best for?

Character-consistent videos
Product demos
Brand content

How much does Veo 3.1 Reference-to-Video cost?

Metric | FairStack | Details
Price per generation | $2.40 | Includes 20% margin
Per-second rate | $0.4000/sec | Billed per second of output
Subscription | None | Pay per generation only
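As a quick check on the table, the per-clip price can be compared with the listed per-second rate. The sketch below assumes a clip's price is simply the listed rate multiplied by its duration, which "billed per second of output" implies but the page does not spell out; `clip_price` is a hypothetical helper. Under that assumption, the $2.40 headline price corresponds to a 6-second clip.

```python
from decimal import Decimal

LISTED_RATE_PER_SECOND = Decimal("0.40")  # per-second rate from the pricing table


def clip_price(seconds: int) -> Decimal:
    """Price of one clip, assuming price = listed rate x duration."""
    return LISTED_RATE_PER_SECOND * seconds


for seconds in (4, 6, 8):  # the duration options listed for the model
    print(f"{seconds}s -> ${clip_price(seconds)}")
# 4s -> $1.60
# 6s -> $2.40
# 8s -> $3.20
```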

How does Veo 3.1 Reference-to-Video perform across capabilities?

photoRealism: 88%
textRendering: 30%
promptAdherence: 85%
styleRange: 75%
detailFidelity: 88%
compositionControl: 85%
colorAccuracy: 87%
edgeArtifacts: 88%
speedEfficiency: 50%
costEfficiency: 35%

How do I use the Veo 3.1 Reference-to-Video API?

curl
curl -X POST https://api.fairstack.ai/v1/generations/video \
  -H "Authorization: Bearer $FAIRSTACK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veo-3-1-ref2v",
    "prompt": "Your prompt here"
  }'
Python
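import os

import requests

# Read the API key from the environment rather than hard-coding it.
FAIRSTACK_API_KEY = os.environ["FAIRSTACK_API_KEY"]

response = requests.post(
    "https://api.fairstack.ai/v1/generations/video",
    headers={
        "Authorization": f"Bearer {FAIRSTACK_API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "model": "veo-3-1-ref2v",
        "prompt": "Your prompt here",
    },
)
# Fail loudly on HTTP errors instead of parsing an error body as a result.
response.raise_for_status()

result = response.json()
print(result["url"])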
Node.js
const response = await fetch(
  "https://api.fairstack.ai/v1/generations/video",
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.FAIRSTACK_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "veo-3-1-ref2v",
      prompt: "Your prompt here",
    }),
  }
);

const result = await response.json();
console.log(result.url);

What parameters does Veo 3.1 Reference-to-Video support?

Parameter | Type | Default | Details
aspect_ratio | enum | 16:9 | Options: 16:9, 9:16
duration | enum | 8s | Options: 4s, 6s, 8s
resolution | enum | 720p | Options: 720p, 1080p
generate_audio | boolean | true | -
negative_prompt | string (optional) | - | -
seed | integer (optional) | - | -
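A small helper can catch invalid option values before a request is sent. This is a sketch under assumptions: `build_payload` is a hypothetical helper, and placing these parameters at the top level of the JSON body alongside `model` and `prompt` is a guess that the page does not confirm. The allowed values and defaults are taken from the parameter list above.

```python
from typing import Any, Optional

# Allowed values, taken from the parameter list above.
ALLOWED = {
    "aspect_ratio": {"16:9", "9:16"},
    "duration": {"4s", "6s", "8s"},
    "resolution": {"720p", "1080p"},
}


def build_payload(
    prompt: str,
    aspect_ratio: str = "16:9",
    duration: str = "8s",
    resolution: str = "720p",
    generate_audio: bool = True,
    negative_prompt: Optional[str] = None,
    seed: Optional[int] = None,
) -> dict[str, Any]:
    """Build a request body, rejecting values outside the documented options."""
    chosen = {"aspect_ratio": aspect_ratio, "duration": duration, "resolution": resolution}
    for name, value in chosen.items():
        if value not in ALLOWED[name]:
            raise ValueError(f"{name} must be one of {sorted(ALLOWED[name])}, got {value!r}")
    payload: dict[str, Any] = {
        "model": "veo-3-1-ref2v",
        "prompt": prompt,
        **chosen,
        "generate_audio": generate_audio,
    }
    # Optional fields are only included when they are set.
    if negative_prompt is not None:
        payload["negative_prompt"] = negative_prompt
    if seed is not None:
        payload["seed"] = seed
    return payload


payload = build_payload("A product rotating on a turntable", duration="6s", seed=42)
print(payload["duration"], payload["resolution"], payload["seed"])
# 6s 720p 42
```

If the top-level placement assumption holds, the resulting dictionary could be passed as the `json=` argument of `requests.post` in the Python example.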

Frequently Asked Questions

How much does Veo 3.1 Reference-to-Video cost?

Veo 3.1 Reference-to-Video costs $2.40/clip on FairStack as of 2026-05-13. This price includes FairStack's transparent 20% margin on infrastructure cost. No subscription or monthly fee — you pay per generation only. Minimum deposit is $1.

What is Veo 3.1 Reference-to-Video and what is it best for?

Veo 3.1 Reference-to-Video is Google DeepMind's reference-guided video generation model. It creates video while preserving the subject identity, visual style, and compositional characteristics of the reference images you provide. It is best for character-consistent videos, product demos, and brand content, where maintaining visual consistency with reference material is essential. It is available through FairStack's REST API, with examples for curl, Python, and Node.js.

Does Veo 3.1 Reference-to-Video have an API?

Yes. Veo 3.1 Reference-to-Video is available via FairStack's REST API at api.fairstack.ai. Send a POST request to /v1/generations/video with your API key and prompt. Works with curl, Python requests, Node.js fetch, and any HTTP client. No SDK installation required.

How does Veo 3.1 Reference-to-Video compare to other video models?

Veo 3.1 Reference-to-Video excels at character-consistent videos, product demos, and brand content. It is an image-to-video model priced at $2.40/clip on FairStack. Key strengths: maintains subject consistency, premium quality. Compare all video models at fairstack.ai/models.

What makes Veo 3.1 Reference-to-Video stand out from other video models?

Veo 3.1 Reference-to-Video is distinguished by its subject consistency and premium quality. Generation typically takes 15-60 seconds due to its higher-quality processing.

What are the known limitations of Veo 3.1 Reference-to-Video?

Key limitations include: expensive at $0.20/s base; max 8 seconds; audio doubles cost. FairStack documents these transparently so you can choose the right model for your workflow.

How fast is Veo 3.1 Reference-to-Video?

Veo 3.1 Reference-to-Video typically takes 15-60 seconds due to its higher-quality processing. The longer processing time reflects its advanced architecture, which produces higher-quality results than faster alternatives.

What video capabilities does Veo 3.1 Reference-to-Video offer?

Veo 3.1 Reference-to-Video offers: reference image guidance; 1080p and 4K output; optional audio generation; visual consistency. All capabilities are accessible through both the FairStack web interface and REST API.

Start using Veo 3.1 Reference-to-Video today

$2.40/clip. Full API access. No subscription.
