Powered by Cloudflare AI

Workers AI

Run AI inference at the edge with zero cold starts. Access 100+ models including LLaMA, Stable Diffusion, and Whisper.

100+ AI Models
0ms Cold Start
10K Free Neurons/day
Global Edge Inference

Available Models

Text Generation

  • LLaMA 3.1 70B
  • Mistral 7B
  • Gemma 2 27B
  • Qwen 2.5

Image Generation

  • Stable Diffusion XL
  • FLUX.1 Schnell
  • DreamShaper

Speech

  • Whisper (transcription)
  • Text-to-Speech
  • Voice cloning

Embeddings

  • BGE Base
  • BGE Large
  • Multilingual
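
As a rough illustration of what embedding models like the ones listed above are typically used for, the sketch below ranks candidates by cosine similarity to a query vector. The helper names `cosineSimilarity` and `rankBySimilarity` and the toy three-dimensional vectors are assumptions made for this example, not part of Workers AI. Real embeddings would come from one of the models and have many more dimensions.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Returns 0 when either vector has zero length (magnitude).
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: score every candidate against the query vector
// and return them sorted from most to least similar.
function rankBySimilarity(query, candidates) {
  return candidates
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.vector) }))
    .sort((x, y) => y.score - x.score);
}

// Toy vectors standing in for real embedding output.
const docs = [
  { id: 'finance', vector: [0.0, 0.2, 0.9] },
  { id: 'cats', vector: [0.9, 0.1, 0.0] }
];
console.log(rankBySimilarity([1, 0, 0], docs)[0].id); // "cats"
```

Cosine similarity ignores vector magnitude, which is why it is a common default for comparing embeddings. Other distance measures may suit some models better.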

Simple API

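Text Generation

// `ai` is assumed to be the Workers AI binding (for example, env.AI in a Worker)
const response = await ai.run('@cf/meta/llama-3.1-70b', {
  prompt: 'Explain quantum computing in simple terms',
  stream: true,
  max_tokens: 500
});

// Stream the response: chunks are assumed to arrive as bytes, so decode them to text
const decoder = new TextDecoder();
for await (const chunk of response) {
  console.log(decoder.decode(chunk, { stream: true }));
}

Image Generation

// `ai` is assumed to be the Workers AI binding (for example, env.AI in a Worker)
const response = await ai.run('@cf/stabilityai/stable-diffusion-xl', {
  prompt: 'A futuristic city at sunset, cyberpunk style',
  width: 1024,
  height: 1024
});

// Return the generated PNG image bytes to the caller
return new Response(response, {
  headers: { 'Content-Type': 'image/png' }
});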
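
The snippets above use `await` and `return` outside of any function, so they need a surrounding request handler to run. The following is a minimal sketch of one possible wrapper, assuming the AI binding is exposed as `env.AI` and the prompt arrives in a `prompt` query parameter. Both are assumptions for illustration, as is the `worker` object name. The model ID is copied from the Text Generation snippet.

```javascript
// Sketch of a Worker-style handler. In a deployed Worker this object
// would be the module's default export; it is a plain constant here so
// the example can also be exercised outside the Workers runtime.
const worker = {
  async fetch(request, env) {
    // Read the prompt from the query string (assumed parameter name).
    const { searchParams } = new URL(request.url);
    const prompt = searchParams.get('prompt');
    if (!prompt) {
      return new Response('Missing "prompt" query parameter', { status: 400 });
    }

    // `env.AI` is the assumed name of the Workers AI binding.
    const stream = await env.AI.run('@cf/meta/llama-3.1-70b', {
      prompt,
      stream: true,
      max_tokens: 500
    });

    // Pass the streamed output straight through to the client.
    return new Response(stream, {
      headers: { 'Content-Type': 'text/event-stream' }
    });
  }
};
```

Forwarding the stream as it arrives, instead of buffering it, lets the client start rendering tokens immediately. A non-streaming variant would drop `stream: true` and return the result as JSON.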

Pricing

Workers AI

Serverless AI inference at the edge

Free
  • 10,000 neurons/day
  • Edge inference
  • 100+ models
  • REST API
  • Streaming
Start Free

Workers AI Pro

Production AI at scale

$50/mo
  • Unlimited neurons
  • Unlimited inference
  • Fine-tuned models
  • Vectorize integration
  • Priority support
Get Started

AI Without the Infrastructure

10,000 free neurons daily. No GPU management required.

Start Building with AI