A REST API for generating AI videos from text prompts or images — with effects, transitions, lip-sync, and sound — from the PixVerse platform.
All endpoints are under https://app-api.pixverse.ai/openapi/v2/. Every request uses two headers: API-KEY (your key) and Ai-trace-id (a fresh UUID per request — reusing the same value returns the previous result instead of triggering a new generation). All responses use the shape { ErrCode, ErrMsg, Resp } — ErrCode=0 means success and your payload is in Resp. Video generation is asynchronous: submit a job, receive a video_id, then poll the get-status endpoint every 3–5s until status=1 (success). Shared parameters (model, duration, quality) — including valid enum values — are documented inline on each endpoint page. API credits are billed separately from the PixVerse web app and purchased on the Billing page.
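The header and response-envelope conventions above can be sketched as a pair of small helpers. This is a minimal sketch; the helper names (`make_headers`, `unwrap`) are illustrative, not part of the API:

```python
import uuid

API_BASE = "https://app-api.pixverse.ai/openapi/v2"

def make_headers(api_key: str) -> dict:
    # A fresh Ai-trace-id per request; reusing one replays the
    # previous result instead of triggering a new generation.
    return {
        "API-KEY": api_key,
        "Ai-trace-id": str(uuid.uuid4()),
    }

def unwrap(body: dict) -> dict:
    # Every response has the shape { ErrCode, ErrMsg, Resp };
    # ErrCode == 0 means success and the payload is in Resp.
    if body.get("ErrCode") != 0:
        raise RuntimeError(f'PixVerse error {body.get("ErrCode")}: {body.get("ErrMsg")}')
    return body["Resp"]
```

Build headers once per request, never per session, so each call gets its own trace id.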
Introduction to PixVerse Platform: What the platform generates — text-to-video, image-to-video, transitions, effects, lip-sync, sound — and how the API exposes it.
Model & Pricing: Credit cost per operation by model and feature, so you can estimate cost before calling anything.
Rate limit: Per-account request and concurrency limits — check this before sizing your integration.
Quick Start: Shortest path from zero to first successful video — quick overview before the full walkthrough below.
How does the API work?: The full async pattern in one page — API-KEY + Ai-trace-id headers, submit a generation job, receive a video_id, poll the status endpoint until status=1.
How to get API key?: Step 1 — create an account and generate your API key from the dashboard (shown once, save it immediately).
How to subscribe API plans?: Step 2 — buy credits on the Billing page; note API credits are separate from the PixVerse web app membership.
Upload Image: Step 3 — POST multipart form-data with an image field (JPG/PNG/WebP, max 20MB and max 10000px; URL-based uploads are not supported). Returns Resp.img_id (integer) to reference in the next call.
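The multipart upload can be sketched with stdlib-only encoding. Since the upload path isn't spelled out here, this builds only the headers and body and leaves sending to any HTTP client; the helper name and the 20MB pre-check are illustrative:

```python
import uuid

def build_upload_request(api_key: str, filename: str, image_bytes: bytes):
    """Build (headers, body) for the multipart image upload.

    The API accepts JPG/PNG/WebP up to 20MB; URL-based uploads are
    not supported, so the bytes must be read from a local file.
    """
    if len(image_bytes) > 20 * 1024 * 1024:
        raise ValueError("image exceeds the 20MB limit")
    boundary = uuid.uuid4().hex
    body = (
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
         f"Content-Type: application/octet-stream\r\n\r\n").encode()
        + image_bytes
        + f"\r\n--{boundary}--\r\n".encode()
    )
    headers = {
        "API-KEY": api_key,
        "Ai-trace-id": str(uuid.uuid4()),
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return headers, body
```

POST the result to the upload endpoint, then read `Resp.img_id` from the response envelope for the next call.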
Image-to-Video Generation: Step 4 — POST JSON body with fields img_id, prompt, model, duration, quality, aspect_ratio, motion_mode, negative_prompt. Verified sample values: model="v6", duration=5, quality="540p", aspect_ratio="16:9", motion_mode="normal". Returns Resp.video_id.
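The generation body can be sketched as a payload builder whose defaults mirror the verified sample values above; the builder itself (name, override mechanism) is illustrative:

```python
def build_i2v_payload(img_id: int, prompt: str, **overrides) -> dict:
    """JSON body for image-to-video; defaults match the verified sample values."""
    payload = {
        "img_id": img_id,
        "prompt": prompt,
        "model": "v6",
        "duration": 5,
        "quality": "540p",
        "aspect_ratio": "16:9",
        "motion_mode": "normal",
        "negative_prompt": "",
    }
    payload.update(overrides)  # e.g. quality="1080p"
    return payload
```

POST this as JSON with the two standard headers; the response envelope carries `Resp.video_id`.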
Get Video Generation Status: Step 5 — GET with the video_id every 3–5s until Resp.status=1. Other statuses: 5 = generating, 6 = deleted, 7 = moderation failed, 8 = generation failed. On success Resp.url contains the final video URL.
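The polling step can be sketched as a loop over the documented status codes. The fetcher is injected so the loop is independent of any HTTP client; names and the timeout are illustrative:

```python
import time

STATUS = {1: "success", 5: "generating", 6: "deleted",
          7: "moderation failed", 8: "generation failed"}

def poll_until_done(fetch_status, interval: float = 4.0, timeout: float = 600.0) -> str:
    """fetch_status() performs one GET and returns the Resp dict.

    Returns the final video URL on status=1; raises on terminal failures.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        resp = fetch_status()
        status = resp["status"]
        if status == 1:
            return resp["url"]
        if status in (6, 7, 8):
            raise RuntimeError(f"generation ended: {STATUS.get(status, status)}")
        time.sleep(interval)  # the docs suggest polling every 3-5s
    raise TimeoutError("video generation did not finish in time")
```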
Multi-transition Video Generation: Generate a longer video that transitions through multiple keyframe images in sequence (2–7 keyframes, 1–30s output).
Extend Generation: Extend an existing generated video with a continuation, producing a longer clip from a previous video_id.
Fusion (Reference to Video) Generation: POST /openapi/v2/video/fusion/generate with an image_references array (each: type: "subject"|"background", img_id, ref_name) plus a prompt that uses @ref_name to place subjects/backgrounds in the scene. Requires model v4.5+.
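The `image_references` structure can be sketched as a payload builder; the triple-based input format and validation are my framing, while the field names (`type`, `img_id`, `ref_name`) come from the endpoint description above:

```python
def build_fusion_payload(prompt: str, references, model: str = "v4.5") -> dict:
    """references: iterable of (type, img_id, ref_name) triples.

    The prompt places each reference with @ref_name,
    e.g. "@hero walks through @city at dusk".
    """
    refs = []
    for ref_type, img_id, ref_name in references:
        if ref_type not in ("subject", "background"):
            raise ValueError(f"unknown reference type: {ref_type}")
        refs.append({"type": ref_type, "img_id": img_id, "ref_name": ref_name})
    return {"model": model, "prompt": prompt, "image_references": refs}
```

POST the result to /openapi/v2/video/fusion/generate; the endpoint requires model v4.5 or later.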
Swap Video Generation: POST /openapi/v2/video/swap/generate — replace a selected object in an existing video with a target image. Needs video_media_id, keyframe_id, mask_id, img_id; pair with the mask/selection API to specify the replacement region.
Restyle Video Generation: Re-render an existing video in a different visual style (3D, anime, cinematic, enhanced realism, etc.).
Restyle effect list: Catalog of valid style presets to pass to the Restyle endpoint.
Video Effects Generation: PixVerse's signature template-based effects (character transforms, cinematic motion, etc.) — driven by passing template_id + prompt to text-to-video or image-to-video. Template catalog lives in the web Effect Center; multi-image effects take img_ids instead of img_id.
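The template-effect body can be sketched as a builder that switches between `img_id` and `img_ids` depending on how many images the effect takes; the single/multi switch is my reading of the note above and should be checked against the endpoint pages:

```python
def build_effect_payload(template_id: int, prompt: str, img_ids=None, **extra) -> dict:
    """template_id rides on the regular text/image-to-video body.

    Multi-image effects take img_ids (a list); single-image effects
    take the usual img_id; text-to-video effects take neither.
    """
    payload = {"template_id": template_id, "prompt": prompt}
    if img_ids:
        if len(img_ids) == 1:
            payload["img_id"] = img_ids[0]
        else:
            payload["img_ids"] = list(img_ids)
    payload.update(extra)  # model, duration, quality, etc.
    return payload
```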
Mimic Generation: Transfer motion from a reference video onto a character image — inputs are a character image + a reference video (mp4/mov, max 1920p, 100MB, 30s). For choreography and animation reuse.
Modify Generation: Edit any part of an existing video, including adding, replacing, removing, or modifying elements, or transforming the overall style.
Image Template Generation: Image-centric template workflow — upload photos, pick a template from the Template Center, get AI-composed image outputs.