Talking Head audio-driven fal.ai

OmniHuman v1.5 (ByteDance)

OmniHuman v1.5 (ByteDance) costs $0.960/clip on FairStack. It is an audio-driven model for premium talking head content, emotional storytelling, and virtual presenters. That is 92% more than the $0.50/clip listed for Replicate. No subscription is required: you pay per generation with full REST API access. FairStack applies a transparent 20% margin on infrastructure cost so you always see the real price.

FairStack price: $0.960/clip
Replicate: $0.50/clip
Difference: FairStack is 92% higher

What is OmniHuman v1.5 (ByteDance)?

OmniHuman v1.5 is ByteDance's premier audio-driven avatar model, generating lifelike talking head videos from a single portrait photo and audio input. It achieves the best emotional synchronization of any talking head model, with a facial expression score of 0.88 and lip sync accuracy of 0.90. Movements, expressions, and gestures correlate with audio emotion rather than merely matching mouth shapes.

The model generates full upper-body motion including head movements, shoulder gestures, hand positions, and torso adjustments. Micro-expressions such as blinks, eyebrow raises, and subtle facial movements add a level of realism that distinguishes it from simpler lip-sync models. A single clear frontal portrait is sufficient as input.

Compared to Creatify Aurora, which targets commercial ad production with studio-grade consistency, OmniHuman v1.5 delivers superior emotional depth and expressiveness at a similar price point of $0.16 per second. Against Pixverse Lipsync at $0.04 per second, OmniHuman commands a premium but produces dramatically more realistic and emotionally nuanced output. Generation takes approximately 3 minutes for 10 seconds of video.

OmniHuman v1.5 is best suited for premium spokesperson content, emotional storytelling, virtual presenters, and any project where facial expression quality and emotional authenticity are critical. It is available on FairStack at infrastructure cost plus a 20% platform fee.

Key Features

Best emotional synchronization (0.88 facial expression score)
Single portrait input — works with any clear frontal photo
Full upper-body motion — head, shoulders, hands, and torso
Micro-expressions — blinks, eyebrow raises, subtle movements
Best lip sync accuracy (0.90)

What are OmniHuman v1.5 (ByteDance)'s strengths?

Best emotional expression synchronization
Best lip sync accuracy (0.90)
Full upper body animation
Micro-expression detail

What are OmniHuman v1.5 (ByteDance)'s limitations?

Most expensive talking head ($0.16/second)
Slower generation (~3 minutes for 10s)
Best with frontal portrait photos
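
To get a rough sense of the wait implied by the "~3 minutes for 10s" figure, the sketch below scales it to other clip lengths. Linear scaling is an assumption made here for illustration, not something this page states, and the function name is hypothetical.

```python
# Rough wait-time estimate. ASSUMPTION: generation time scales linearly with
# output length. The page gives only one data point (~3 minutes per 10 s).
REFERENCE_OUTPUT_SECONDS = 10
REFERENCE_GENERATION_SECONDS = 3 * 60


def estimate_generation_seconds(output_seconds: float) -> float:
    """Return an approximate generation time for a clip of the given length."""
    if output_seconds <= 0:
        raise ValueError("output_seconds must be positive")
    per_second = REFERENCE_GENERATION_SECONDS / REFERENCE_OUTPUT_SECONDS
    return output_seconds * per_second


if __name__ == "__main__":
    for length in (6, 10, 30):
        minutes = estimate_generation_seconds(length) / 60
        print(f"{length:>2} s of video: about {minutes:.1f} min")
```

Treat the result as a planning figure only; actual times may vary with load and input.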

What is OmniHuman v1.5 (ByteDance) best for?

Premium talking head content
Emotional storytelling
Virtual presenters
When expression matters most

How much does OmniHuman v1.5 (ByteDance) cost?

Metric | FairStack | Details
Price per generation | $0.960 | Includes 20% margin
Per-second rate | $0.1600/sec | Billed per second of output
Replicate price | $0.50 | FairStack is 92% higher
Subscription | None | Pay per generation only
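
The pricing figures above can be checked with a little arithmetic. The sketch below assumes the $0.960 per-generation price corresponds to a 6-second clip, since 6 × $0.16 = $0.96. That clip length is inferred from the numbers, not stated on this page, and the function names are illustrative.

```python
from decimal import Decimal

PER_SECOND_RATE = Decimal("0.16")     # FairStack per-second rate listed above
REPLICATE_PER_CLIP = Decimal("0.50")  # Replicate price listed above


def clip_cost(output_seconds: int) -> Decimal:
    """Cost of one clip when billed per second of output."""
    if output_seconds <= 0:
        raise ValueError("output_seconds must be positive")
    return PER_SECOND_RATE * output_seconds


def percent_difference(price: Decimal, baseline: Decimal) -> Decimal:
    """Signed percentage by which price differs from baseline."""
    return (price - baseline) / baseline * 100


if __name__ == "__main__":
    cost = clip_cost(6)  # ASSUMPTION: a 6-second clip
    diff = percent_difference(cost, REPLICATE_PER_CLIP)
    print(f"6 s clip: ${cost} ({diff:+.0f}% vs Replicate)")
```

A positive percentage means the FairStack clip costs more than the Replicate price it is compared against.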

How does OmniHuman v1.5 (ByteDance) perform across capabilities?

Lip sync: 90%
Facial expression: 88%
Head movement: 82%
Visual quality: 85%

How do I use the OmniHuman v1.5 (ByteDance) API?

curl
curl -X POST https://api.fairstack.ai/v1/generations/talkingHead \
  -H "Authorization: Bearer $FAIRSTACK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "omnihuman-v1.5",
    "image_url": "https://example.com/portrait.jpg",
    "audio_url": "https://example.com/speech.mp3"
  }'
Python
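import os

import requests

# Read the API key from the environment rather than hard-coding it.
FAIRSTACK_API_KEY = os.environ["FAIRSTACK_API_KEY"]

response = requests.post(
    "https://api.fairstack.ai/v1/generations/talkingHead",
    headers={
        "Authorization": f"Bearer {FAIRSTACK_API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "model": "omnihuman-v1.5",
        # Placeholder URLs: a frontal portrait and the driving audio track.
        "image_url": "https://example.com/portrait.jpg",
        "audio_url": "https://example.com/speech.mp3",
    },
)

# Raise on HTTP errors instead of trying to parse an error body.
response.raise_for_status()
result = response.json()
print(result["url"])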
Node.js
const response = await fetch(
  "https://api.fairstack.ai/v1/generations/talkingHead",
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.FAIRSTACK_API_KEY}`,
      "Content-Type": "application/json",
    },
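    body: JSON.stringify({
      model: "omnihuman-v1.5",
      // Placeholder URLs: a frontal portrait and the driving audio track.
      image_url: "https://example.com/portrait.jpg",
      audio_url: "https://example.com/speech.mp3",
    }),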
  }
);

const result = await response.json();
console.log(result.url);

What parameters does OmniHuman v1.5 (ByteDance) support?

Parameter | Type | Default | Details
image_url | string | |
audio_url | string | |
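
As a minimal sketch of how the two parameters might be combined into a request, the helper below builds (but does not send) a POST using only the Python standard library. It assumes `image_url` and `audio_url` sit at the top level of the JSON body next to `model`, and that both must be http(s) URLs. Neither point is confirmed by this page, and `build_request` is a hypothetical helper name.

```python
import json
import urllib.request

ENDPOINT = "https://api.fairstack.ai/v1/generations/talkingHead"


def build_request(api_key: str, image_url: str, audio_url: str) -> urllib.request.Request:
    """Build, without sending, a talking head generation request."""
    # ASSUMPTION: both inputs must be publicly reachable http(s) URLs.
    for name, value in (("image_url", image_url), ("audio_url", audio_url)):
        if not value.startswith(("http://", "https://")):
            raise ValueError(f"{name} must be an http(s) URL, got {value!r}")

    # ASSUMPTION: parameters are top-level JSON fields alongside "model".
    payload = {
        "model": "omnihuman-v1.5",
        "image_url": image_url,
        "audio_url": audio_url,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_request(
        "test-key",
        "https://example.com/portrait.jpg",
        "https://example.com/speech.mp3",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Validating inputs before sending avoids paying for a generation that would fail on a malformed URL. Passing the returned object to `urllib.request.urlopen` would perform the actual call.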

Frequently Asked Questions

How much does OmniHuman v1.5 (ByteDance) cost?

OmniHuman v1.5 (ByteDance) costs $0.960/clip on FairStack as of 2026-05-13. This price includes FairStack's transparent 20% margin on infrastructure cost. No subscription or monthly fee — you pay per generation only. Minimum deposit is $1.

What is OmniHuman v1.5 (ByteDance) and what is it best for?

OmniHuman v1.5 is ByteDance's premier audio-driven avatar model, generating lifelike talking head videos from a single portrait photo and audio input. It reports a facial expression score of 0.88 and lip sync accuracy of 0.90, and it generates full upper-body motion and micro-expressions that follow the emotion in the audio. It is best for premium talking head content, emotional storytelling, and virtual presenters. It is available via FairStack's REST API, with examples for curl, Python, and Node.js.

Does OmniHuman v1.5 (ByteDance) have an API?

Yes. OmniHuman v1.5 (ByteDance) is available via FairStack's REST API at api.fairstack.ai. Send a POST request to /v1/generations/talkingHead with your API key and the image_url and audio_url parameters. It works with curl, Python requests, Node.js fetch, and any HTTP client. No SDK installation is required.

Is OmniHuman v1.5 (ByteDance) cheaper on FairStack than Replicate?

No, not at the listed prices. Replicate charges $0.50 per generation for OmniHuman v1.5 (ByteDance), while FairStack charges $0.960/clip, which is 92% more. FairStack uses a transparent 20% margin model with no subscription fees.

How does OmniHuman v1.5 (ByteDance) compare to other talking head models?

OmniHuman v1.5 (ByteDance) excels at premium talking head content, emotional storytelling, and virtual presenters. It is an audio-driven model priced at $0.960/clip on FairStack. Key strengths: best emotional expression synchronization and best lip sync accuracy (0.90). Compare all talking head models at fairstack.ai/models.

What makes OmniHuman v1.5 (ByteDance) different from other talking head models?

OmniHuman v1.5 (ByteDance) is distinguished by best emotional expression synchronization and best lip sync accuracy (0.90). Generation takes approximately 3 minutes for 10 seconds of video due to its higher-quality processing.

What are the limitations of OmniHuman v1.5 (ByteDance) for talking head videos?

Key limitations include: most expensive talking head ($0.16/second); slower generation (~3 minutes for 10s); best with frontal portrait photos. FairStack documents these transparently so you can choose the right model for your workflow.

How fast is OmniHuman v1.5 (ByteDance)?

OmniHuman v1.5 (ByteDance) takes approximately 3 minutes to generate 10 seconds of video. The longer processing time reflects its advanced architecture, which produces higher-quality results than faster alternatives.

How does using OmniHuman v1.5 (ByteDance) on FairStack compare to Replicate?

Using OmniHuman v1.5 (ByteDance) through FairStack gives you the same underlying model with transparent per-use pricing and full REST API access. OmniHuman v1.5 (ByteDance) offers best emotional expression synchronization. No subscription or commitment required — pay only for what you generate.


Start using OmniHuman v1.5 (ByteDance) today

$0.960/clip. Full API access. No subscription.
