How AI Gateway runs on Fluid compute

How we build Vercel with Vercel

Under the hood: Global delivery network

AI Gateway routes requests across Vercel’s global delivery network for faster responses and low-latency in-cloud routing.
import { generateText } from 'ai';

// The 'provider/model' string names the model to request through AI Gateway.
const { text } = await generateText({
  model: 'anthropic/claude-sonnet-4',
  prompt: 'Explain how request routing works',
});

Example using AI Gateway with the AI SDK

Under the hood: Powered by Fluid compute

AI Gateway requests run on Fluid compute, combining the scalability of serverless with the concurrency of a server to reduce network overhead across invocations.
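One way to picture the reduced network overhead is a resource held in module scope and shared by every invocation that lands on the same instance. The following is a minimal sketch under that assumption, not AI Gateway's actual implementation; `UpstreamClient`, `openClient`, `getClient`, and `handleRequest` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical stand-in for an expensive resource, such as a pooled
// HTTPS connection to an upstream model provider.
interface UpstreamClient {
  id: number;
  send(prompt: string): Promise<string>;
}

// Counts how many times the costly setup has run on this instance.
let connectionsOpened = 0;

// Simulates the setup cost (DNS, TCP, TLS) that should be paid once.
async function openClient(): Promise<UpstreamClient> {
  connectionsOpened += 1;
  const id = connectionsOpened;
  return {
    id,
    send: async (prompt: string) => `client ${id} handled: ${prompt}`,
  };
}

// Module scope outlives a single invocation, so this promise is shared by
// every request the instance handles, including concurrent ones.
let clientPromise: Promise<UpstreamClient> | undefined;

function getClient(): Promise<UpstreamClient> {
  if (clientPromise === undefined) {
    clientPromise = openClient();
  }
  return clientPromise;
}

// Hypothetical request handler: each invocation reuses the shared client.
async function handleRequest(prompt: string): Promise<string> {
  const client = await getClient();
  return client.send(prompt);
}
```

Because the promise itself is cached rather than the resolved client, two requests that arrive before setup finishes still trigger only one `openClient` call. A production version would also clear the cached promise if setup fails, so that a later request can retry.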

State, caching, and global coordination

Monitoring from the inside out

Continuous checks are performed by both in-memory services and a global system, relaying feedback on provider and model performance to the entire network.
import { streamText } from 'ai';

const result = streamText({
  model: 'anthropic/claude-sonnet-4',
  prompt: 'Write a technical explanation',
  providerOptions: {
    gateway: {
      // Preferred order in which providers are tried for this model.
      order: ['vertex', 'bedrock', 'anthropic'],
    },
  },
});
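
To illustrate how health feedback and a preferred provider order might interact, here is a minimal sketch. It assumes the `order` list is treated as a preference and that an in-memory record of recent outcomes can demote a failing provider. The text does not describe the actual checks, so `HealthTracker`, `planAttempts`, the window size, and the failure threshold are all hypothetical.

```typescript
// Hypothetical in-memory health record. The window size and failure
// threshold are assumptions made for this sketch.
class HealthTracker {
  private readonly outcomes = new Map<string, boolean[]>();
  private readonly windowSize: number;
  private readonly maxFailureRate: number;

  constructor(windowSize = 5, maxFailureRate = 0.5) {
    this.windowSize = windowSize;
    this.maxFailureRate = maxFailureRate;
  }

  // Record the result of one call, keeping only the most recent window.
  record(provider: string, ok: boolean): void {
    const recent = this.outcomes.get(provider) ?? [];
    recent.push(ok);
    if (recent.length > this.windowSize) recent.shift();
    this.outcomes.set(provider, recent);
  }

  // A provider with no recorded history is assumed healthy.
  isHealthy(provider: string): boolean {
    const recent = this.outcomes.get(provider) ?? [];
    if (recent.length === 0) return true;
    const failures = recent.filter((ok) => !ok).length;
    return failures / recent.length <= this.maxFailureRate;
  }
}

// Keep the caller's preferred order, but try healthy providers first and
// leave unhealthy ones as a last resort.
function planAttempts(order: string[], health: HealthTracker): string[] {
  const healthy = order.filter((p) => health.isHealthy(p));
  const unhealthy = order.filter((p) => !health.isHealthy(p));
  return [...healthy, ...unhealthy];
}

const tracker = new HealthTracker();
tracker.record('vertex', false);
tracker.record('vertex', false);
tracker.record('bedrock', true);

const attempts = planAttempts(['vertex', 'bedrock', 'anthropic'], tracker);
console.log(attempts);
```

With two recent failures recorded for `vertex`, the planned attempt order becomes `bedrock`, `anthropic`, `vertex`. Demoting rather than dropping the unhealthy provider keeps it available if every other option fails.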

Vercel Observability provides native visibility into every model call, including overall request volume, spend, and performance.

Why Fluid compute is the right fit

Building with Vercel