Overview

AI Talking Photo brings static photos to life by animating faces to speak with realistic lip-sync and natural facial movements. The API analyzes facial features and synchronizes mouth movements, head poses, and expressions with provided audio or generated speech.

API Spec

See API details

Product Page

Learn more about AI Talking Photo

How It Works

  1. Provide a photo - Upload an image with a clear face
  2. Add audio - Upload audio or provide text for speech generation
  3. API animates - AI creates realistic lip-sync and facial movements
  4. Download video - Retrieve your animated talking photo

Use Cases

  • Marketing videos - Create spokesperson videos from headshots
  • Educational content - Animate historical figures or characters
  • Personalized messages - Send video messages from static photos
  • Social media - Create engaging content from profile pictures
  • Presentations - Add dynamic talking heads to slides

Best Practices

Photo Selection

Use clear, front-facing photos - Best results come from high-quality headshots with visible facial features.
  • Good lighting - Well-lit faces produce better animations
  • Front-facing angles - Avoid extreme profile shots
  • Clear features - Eyes, nose, and mouth should be unobstructed
  • High resolution - At least 512x512 pixels recommended
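
The resolution recommendation above can be checked locally before uploading. The following is a minimal sketch, not part of the Magic Hour SDK: the helper names `png_dimensions` and `meets_min_resolution` are hypothetical, and it only handles PNG files by reading the width and height stored in the PNG header.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_SIDE = 512  # recommended minimum from the photo guidelines


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of a PNG file.

    Hypothetical helper for illustration; not a Magic Hour API.
    """
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type
    # ("IHDR"), then width and height as big-endian 32-bit integers.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def meets_min_resolution(data: bytes, min_side: int = MIN_SIDE) -> bool:
    """True when both sides are at least `min_side` pixels."""
    width, height = png_dimensions(data)
    return width >= min_side and height >= min_side
```

For a local file, read the first bytes with `open(path, "rb").read(24)` and pass them to `meets_min_resolution`. Resolution is only one of the criteria; lighting, angle, and unobstructed features still need a visual check.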

Audio Guidelines

Audio Type       | Best Practice
Voice recording  | Clear speech without background noise
Generated speech | Use natural-sounding text prompts
Music/songs      | Works best with clear vocals
Length           | Keep under 30 seconds for best results
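
The length guideline can be verified before submitting a job. This is a minimal sketch that assumes the audio is an uncompressed WAV file, since Python's standard `wave` module cannot read MP3. The function names are hypothetical and not part of the Magic Hour SDK.

```python
import wave

MAX_SECONDS = 30.0  # recommended upper bound from the audio guidelines


def wav_duration_seconds(source) -> float:
    """Return the duration of a WAV file in seconds.

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def within_recommended_length(source, max_seconds: float = MAX_SECONDS) -> bool:
    """True when the clip is shorter than `max_seconds`."""
    return wav_duration_seconds(source) < max_seconds
```

For other formats such as MP3, a third-party audio library would be needed to read the duration.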

Code Examples

Basic Talking Photo with Trimmed Audio

from magic_hour import Client
from os import getenv

# Read the API token from the API_TOKEN environment variable
client = Client(token=getenv("API_TOKEN"))

result = client.v1.ai_talking_photo.generate(
    assets={
        "image_file_path": "https://raw.githubusercontent.com/runshouse/Sample_Assets/main/tomcruise.png",
        "audio_file_path": "https://raw.githubusercontent.com/runshouse/Sample_Assets/main/you-are-just-a-line-of-code.mp3"
    },
    name="Talking Photo",
    # Use only the 0-2 second window
    start_seconds=0,
    end_seconds=2,
    # Block until the job finishes, then save the output to the current directory
    wait_for_completion=True,
    download_outputs=True,
    download_directory="."
)

if result.status == "complete":
    print("✅ Talking photo complete!")
    print(f"Downloaded to: {result.downloaded_paths}")
    print(f"Credits charged: {result.credits_charged}")
else:
    print(f"❌ Job failed with status: {result.status}")
    if hasattr(result, 'error_message'):
        print(f"Error: {result.error_message}")

With Audio File

# Reuses the `client` created in the previous example.
# No start_seconds/end_seconds are passed here.
result = client.v1.ai_talking_photo.generate(
    assets={
        "image_file_path": "https://raw.githubusercontent.com/runshouse/Sample_Assets/main/tomcruise.png",
        "audio_file_path": "https://raw.githubusercontent.com/runshouse/Sample_Assets/main/you-are-just-a-line-of-code.mp3"
    },
    name="Talking Photo",
    wait_for_completion=True,
    download_outputs=True,
    download_directory="."
)

if result.status == "complete":
    print("✅ Talking photo complete!")
    print(f"Downloaded to: {result.downloaded_paths}")
    print(f"Credits charged: {result.credits_charged}")
else:
    print(f"❌ Job failed with status: {result.status}")
    if hasattr(result, 'error_message'):
        print(f"Error: {result.error_message}")

Pricing

Talking Photo pricing varies by video length and resolution:
Configuration     | Credits per Second
720p or lower     | ~10-15 credits/sec
Higher resolution | ~20-30 credits/sec
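
To budget a job ahead of time, the approximate rates above can be turned into a rough range. This is a sketch only: the `RATES` keys and the `estimate_credits` helper are hypothetical names, and the figures are the approximate values from the table rather than an exact billing formula. The actual charge is reported in `result.credits_charged` after the job completes.

```python
# Approximate per-second rates copied from the pricing table (low, high).
RATES = {
    "720p_or_lower": (10, 15),
    "higher_resolution": (20, 30),
}


def estimate_credits(duration_seconds: float,
                     configuration: str = "720p_or_lower") -> tuple[float, float]:
    """Return a rough (low, high) credit estimate for a video of the given length."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    low_rate, high_rate = RATES[configuration]
    return duration_seconds * low_rate, duration_seconds * high_rate


low, high = estimate_credits(2)
print(f"A 2-second clip costs roughly {low}-{high} credits")
```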

Resolution Limits

AI Talking Photo has a maximum resolution of 720p across all subscription tiers due to computational requirements.

Try this in our Google Colab Cookbook: Run this API with sample code. Just add your API key.

API Reference

AI Talking Photo API Reference

View full API specification

Lip Sync

Sync audio with existing video lip movements

AI Voice Generator

Generate speech audio for your talking photos