Veo 3

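Per request: $0.4
Google DeepMind's Veo 3 represents the cutting edge of text-to-video generation, and marks the first time a large generative AI model can seamlessly synchronize high-fidelity video with accompanying audio, including dialogue, sound effects, and ambient soundscapes.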
New
Commercial use

Core Features and Capabilities

  • 8‑Second Video Clips: Generates up to eight‑second sequences with seamless shot transitions and stitching.
  • Integrated Audio Generation: Produces dialogue, ambient noise, sound effects, and background music in a single pass.
  • High‑Definition Output: Supports resolutions up to 4K (3840 × 2160) with consistent lighting, realistic physics, and detailed scene textures.
  • Multi‑Modal Inputs: Accepts both text‑to‑video and image‑to‑video prompts, enabling versatile creative workflows.

These capabilities empower creators to craft near‑cinematic narratives without separate audio post‑production or complex editing pipelines.

Technical Details

Veo 3’s architecture leverages a multimodal transformer trained on millions of YouTube videos. Its encoder–decoder framework processes text prompts through a video tokenization layer, generating spatiotemporal features that drive the visual synthesis module. Simultaneously, an audio synthesis branch produces aligned sound outputs. A cross-modal attention mechanism keeps the visual and audio modalities tightly coupled, reducing desynchronization artifacts. Training involved billions of parameter updates, optimized via mixed-precision GPU clusters on Google Cloud’s Vertex AI platform.

Benchmark Performance

In internal benchmarks, Veo 3 demonstrates:

  • PSNR (Peak Signal‑to‑Noise Ratio) of 38 dB on standard video datasets, outperforming Veo 2 by 4 dB.
  • SSIM (Structural Similarity Index) scores of 0.92, indicating high visual fidelity.
  • Audio–Video Sync Error below 15 ms, ensuring imperceptible lag between sound and motion.
  • Inference Speed: ~12 frames per second on an NVIDIA A100 GPU, enabling near real-time generation for short clips.

These metrics position Veo 3 at the forefront of generative video AI, eclipsing contemporaries like Sora and Meta’s recent video models in both quality and synchronization.
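
For readers unfamiliar with the first metric, PSNR is derived from the mean squared error between a reference frame and a generated frame. The sketch below is a minimal pure-Python illustration of the standard formula for 8-bit pixel values. It is not the evaluation code behind the figures above; the function name `psnr` and the sample pixel lists are made up for illustration.

```python
import math

def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(generated) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixel positions.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs have no noise
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical flattened 8-bit pixel values for a tiny frame.
reference = [52, 55, 61, 59, 79, 61, 76, 61]
generated = [54, 55, 60, 59, 77, 63, 76, 60]
print(f"PSNR: {psnr(reference, generated):.2f} dB")
```

Because the scale is logarithmic, a gain of a few dB corresponds to a substantial drop in mean squared error.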

How to access Veo 3 API

Step 1: Sign Up for API Key

Log in to cometapi.com; if you do not have an account yet, register first. In the CometAPI console, open the API token section of the personal center and click “Add Token” to create an access credential. Copy the generated token key (of the form sk-xxxxx), then submit.

Step 2: Send Requests to Veo 3 API

Select the “Veo 3” endpoint, then set the request method and request body as described in the API doc on our website. Our website also provides an Apifox test for your convenience. Replace <YOUR_API_KEY> with your actual CometAPI key from your account. The base URL for Veo3 Async Generation is https://api.cometapi.com/v1/videos.

Insert your prompt into the content field; this is what the model will respond to.
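
As a rough sketch of Step 2, the request could be assembled with Python's standard library as shown below. Only the URL comes from the text above. The Bearer-token header, the `model` key, the use of `content` as the prompt key, the `COMETAPI_KEY` environment variable, and the helper name `build_veo3_request` are all assumptions for illustration; the API doc is the authority on the actual request schema.

```python
import json
import os
import urllib.request

VIDEOS_URL = "https://api.cometapi.com/v1/videos"  # base URL quoted in Step 2

def build_veo3_request(api_key, prompt, model="veo3", prompt_field="content"):
    """Build (but do not send) a POST request for an async video generation task."""
    # Assumption: the body carries the model name and the prompt under these keys.
    payload = {"model": model, prompt_field: prompt}
    return urllib.request.Request(
        VIDEOS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: the token key is sent as a Bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

if __name__ == "__main__":
    # Hypothetical environment variable; falls back to the placeholder from Step 2.
    api_key = os.environ.get("COMETAPI_KEY", "<YOUR_API_KEY>")
    request = build_veo3_request(api_key, "A paper boat drifting down a rainy street")
    print(request.get_method(), request.full_url)
    # To actually submit the task: urllib.request.urlopen(request, timeout=60)
```

Keeping construction separate from sending makes it easy to inspect the headers and body before spending a paid request.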

Step 3: Retrieve and Verify Results

After processing, the API responds with the task status and output data. Process this response to get the generated result.
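
Since generation is asynchronous, a client typically checks the task status repeatedly until the output is ready. The sketch below shows one way to structure that loop. The status strings (`queued`, `processing`, `completed`, `failed`), the `status` and `output` keys, and the helper name `wait_for_task` are hypothetical; substitute the values that the API doc specifies. The status lookup is passed in as a function so the loop can be tried without network access.

```python
import time

def wait_for_task(fetch_status, interval_s=5.0, max_attempts=60,
                  done_states=("completed", "succeeded"),
                  failed_states=("failed", "error")):
    """Call fetch_status() until the task finishes, fails, or attempts run out."""
    for attempt in range(max_attempts):
        task = fetch_status()  # expected to return a dict parsed from the API response
        status = str(task.get("status", "")).lower()
        if status in done_states:
            return task
        if status in failed_states:
            raise RuntimeError(f"task failed: {task}")
        if attempt < max_attempts - 1:
            time.sleep(interval_s)  # wait before asking again
    raise TimeoutError("task did not finish within the polling budget")

# Simulated responses stand in for real API calls here.
responses = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "completed", "output": "<video URL from the response>"},
])
result = wait_for_task(lambda: next(responses), interval_s=0.0)
print(result["status"])
```

In real use, `fetch_status` would issue a status query with your API key and return the parsed JSON body.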

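Features for Veo 3

Explore the key features of Veo 3, designed to enhance performance and usability. Discover how these capabilities can benefit your projects and improve user experience.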

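Pricing for Veo 3

Explore competitive pricing for Veo 3, designed to fit a range of budgets and usage needs. Our flexible plans ensure you pay only for what you use, making it easy to scale as your requirements grow. Discover how Veo 3 can improve your project outcomes while keeping costs manageable.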
Comet Price (USD per request)    Official Price (USD per request)    Discount
$0.4                             $0.5                                -20%
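
The discount figure follows directly from the two per-request prices listed above. The short sketch below reproduces that arithmetic and estimates the cost of a batch of requests; the helper names and the batch size of 250 are illustrative only, and `Decimal` is used to avoid floating-point rounding on currency values.

```python
from decimal import Decimal

COMET_PRICE = Decimal("0.4")     # USD per request, as listed above
OFFICIAL_PRICE = Decimal("0.5")  # USD per request, as listed above

def discount_percent(price, reference):
    """Relative difference between price and reference, in percent."""
    return (price - reference) / reference * 100

def total_cost(requests, price_per_request=COMET_PRICE):
    """Total cost in USD for a number of requests at a flat per-request price."""
    return price_per_request * requests

print("Discount (%):", discount_percent(COMET_PRICE, OFFICIAL_PRICE))
print("Cost of 250 requests (USD):", total_cost(250))
```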

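Sample code and API for Veo 3

Access complete sample code and API resources to streamline your Veo 3 integration. Our detailed documentation provides step-by-step guidance to help you get the most out of Veo 3 in your projects.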

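Versions of Veo 3

Veo 3 may have multiple snapshots for reasons such as the following: outputs can differ after an update, so older snapshots are kept for consistency; developers are given a transition period to adapt and migrate; and different snapshots may correspond to global or regional endpoints to optimize the user experience. For the specific differences between versions, refer to the official documentation.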
veo3
veo3-frames: The veo3-frames model is specifically optimized for frame sequence generation, and includes image-based generation that supports the first and last frames.
