Vertex AI Model Pricing
Prices used for cost estimation in this dashboard. All prices are per 1 million tokens unless noted otherwise.
Text Models (Rerouter)
| Model | Input (text/image/video) | Output (text) | Input (audio) | Context Window |
| gemini-2.5-flash | $0.30 / 1M tok | $2.50 / 1M tok | $1.00 / 1M tok | 1M tokens |
| gemini-2.5-flash-lite | $0.10 / 1M tok | $0.40 / 1M tok | $0.30 / 1M tok | 1M tokens |
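The per-token prices above translate into a per-call estimate with simple arithmetic. The following is a minimal sketch, not the dashboard's actual implementation: the names `TextModelPrice`, `PRICES`, and `estimate_text_call_cost` are hypothetical, and only the price values come from the table.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class TextModelPrice:
    """USD per 1M tokens, copied from the table above."""
    input_text: float   # text/image/video input
    output_text: float  # text output
    input_audio: float  # audio input


# Hypothetical lookup table keyed by the model names listed in the table.
PRICES = {
    "gemini-2.5-flash": TextModelPrice(0.30, 2.50, 1.00),
    "gemini-2.5-flash-lite": TextModelPrice(0.10, 0.40, 0.30),
}


def estimate_text_call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one call whose input is billed at the text rate."""
    price = PRICES[model]
    return (
        input_tokens * price.input_text + output_tokens * price.output_text
    ) / TOKENS_PER_MILLION


if __name__ == "__main__":
    # One call with 2,500 input tokens and 15,000 output tokens on each model.
    for name in PRICES:
        cost = estimate_text_call_cost(name, 2_500, 15_000)
        print(f"{name}: ${cost:.5f}")
```

Because output tokens cost several times more than input tokens on both models, the output side dominates the estimate for generation-heavy calls.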
Voice Transcription
| Model | How it works | Input Cost | Output Cost |
| gemini-2.5-flash | Audio sent as multimodal input to Gemini, transcription returned as text output | $0.30 / 1M tok (text) · $1.00 / 1M tok (audio) | $2.50 / 1M tok |
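A transcription request mixes two input rates: the audio itself is billed at the audio rate, while any accompanying text prompt is billed at the text rate. The sketch below shows one way to combine them, assuming the token counts are already known (for example from the usage figures returned with a response). The function name and the sample token counts are illustrative assumptions, not values from this dashboard.

```python
TOKENS_PER_MILLION = 1_000_000

# gemini-2.5-flash prices in USD per 1M tokens, from the table above.
AUDIO_INPUT_PRICE = 1.00
TEXT_INPUT_PRICE = 0.30
TEXT_OUTPUT_PRICE = 2.50


def estimate_transcription_cost(
    audio_tokens: int, output_tokens: int, prompt_tokens: int = 0
) -> float:
    """Estimated USD cost of one transcription call.

    audio_tokens  -- tokens consumed by the audio input (audio rate)
    output_tokens -- tokens in the returned transcript (output rate)
    prompt_tokens -- tokens in any text instruction sent with the audio (text rate)
    """
    total = (
        audio_tokens * AUDIO_INPUT_PRICE
        + prompt_tokens * TEXT_INPUT_PRICE
        + output_tokens * TEXT_OUTPUT_PRICE
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Illustrative counts only: 10,000 audio tokens, a 200-token prompt,
    # and a 2,000-token transcript.
    cost = estimate_transcription_cost(10_000, 2_000, prompt_tokens=200)
    print(f"${cost:.5f}")
```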
Image Generation & Description
| Model | Operation | Price |
| imagen-3.0-generate-001 | Image generation from text prompt | $0.04 / image |
| gemini-2.5-flash | Image description (multimodal input) | $0.30 / 1M input tok · $2.50 / 1M output tok |
Cost Examples
| Scenario | Tokens | Est. Cost |
| 1 flash call (2.5K in, 15K out) | 2,500 + 15,000 | $0.0383 |
| 1 flash-lite call (2.5K in, 15K out) | 2,500 + 15,000 | $0.0063 |
| 100 flash calls | ~250K in, ~1.5M out | $3.83 |
| 1000 flash-lite calls | ~2.5M in, ~15M out | $6.25 |
| 1 image generation | - | $0.04 |
| 1 voice transcription (5 min audio) | ~7.5K audio in, ~1.2K out | $0.0105 |
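The batch rows in the table are the per-call estimate multiplied by the number of calls, and image generation is a flat per-image charge. A minimal sketch that recomputes those rows from the listed prices follows; the helper names `batch_text_cost` and `image_generation_cost` are hypothetical and chosen only for illustration.

```python
TOKENS_PER_MILLION = 1_000_000

# (input, output) prices in USD per 1M tokens, from the pricing tables.
TEXT_PRICES = {
    "gemini-2.5-flash": (0.30, 2.50),
    "gemini-2.5-flash-lite": (0.10, 0.40),
}
IMAGE_PRICE_USD = 0.04  # imagen-3.0-generate-001, per generated image


def batch_text_cost(
    model: str, calls: int, input_tokens_per_call: int, output_tokens_per_call: int
) -> float:
    """Estimated USD cost of `calls` identical text calls on `model`."""
    in_price, out_price = TEXT_PRICES[model]
    per_call = (
        input_tokens_per_call * in_price + output_tokens_per_call * out_price
    ) / TOKENS_PER_MILLION
    return calls * per_call


def image_generation_cost(images: int) -> float:
    """Estimated USD cost of generating `images` images."""
    return images * IMAGE_PRICE_USD


if __name__ == "__main__":
    scenarios = {
        "100 flash calls": batch_text_cost("gemini-2.5-flash", 100, 2_500, 15_000),
        "1000 flash-lite calls": batch_text_cost(
            "gemini-2.5-flash-lite", 1000, 2_500, 15_000
        ),
        "1 image generation": image_generation_cost(1),
    }
    for name, cost in scenarios.items():
        print(f"{name}: ${cost:.4f}")
    print(f"total: ${sum(scenarios.values()):.4f}")
```

At the same token volume per call, flash-lite is roughly six times cheaper than flash, which is why 1000 flash-lite calls cost less than twice as much as 100 flash calls.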