| Capability | C1 | V6 | Legacy models |
|---|---|---|---|
| Text-to-video | Supported | Supported | Varies |
| Image-to-video | Supported | Supported | Varies |
| Transition (first and last frame) | Supported | Supported | Varies |
| Reference-to-video / Fusion | Supported | Not supported | Varies |
| Video extension | Not supported | Supported | Varies |
| Inline multi-clip generation | Supported via prompt | Supported via parameters | v5.5 and above |
| Inline audio generation | Supported | Supported | v5.5 and above |
| Max duration | 15s | 15s | Varies |
| Max resolution | 1080p | 1080p | 1080p |
| Pricing | Per second | Per second | Varies |

| Capability | Endpoint type | What it does |
|---|---|---|
| Restyle | Video editing | Change the visual style of an existing video |
| Swap | Video editing | Replace a subject or region in a video |
| Mimic | Motion control | Replicate motion from a reference source |
| Modify | Video editing | Edit video content using a text prompt |
| Sound Effects | Audio | Generate synchronized audio for a video |
| Lip Sync | Audio | Align speech to mouth movement in a video |
| Image Template | Image generation | Generate images from predefined templates |
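
When choosing between C1 and V6 programmatically, the model comparison in the first table can be encoded as a small lookup. The following is a minimal sketch, not part of any official SDK: the dictionary keys (`text_to_video`, `reference_to_video`, and so on) and the helper `models_supporting` are hypothetical names chosen for illustration, and legacy models are left out because their support varies.

```python
# Hypothetical encoding of the C1 / V6 columns from the comparison table.
# Key names are illustrative only; they are not API parameter names.
CAPABILITIES = {
    "C1": {
        "text_to_video": True,
        "image_to_video": True,
        "transition": True,
        "reference_to_video": True,
        "video_extension": False,
        "multi_clip": True,
        "inline_audio": True,
    },
    "V6": {
        "text_to_video": True,
        "image_to_video": True,
        "transition": True,
        "reference_to_video": False,
        "video_extension": True,
        "multi_clip": True,
        "inline_audio": True,
    },
}


def models_supporting(*required):
    """Return the models that support every capability named in `required`."""
    return [
        model
        for model, caps in CAPABILITIES.items()
        if all(caps.get(name, False) for name in required)
    ]


if __name__ == "__main__":
    print(models_supporting("reference_to_video"))
    print(models_supporting("video_extension", "inline_audio"))
```

A request that needs both reference-to-video and video extension returns an empty list, which reflects the table: neither model supports both.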