GLM 4.7 Flash

GLM 4.7 Flash is the speed-optimized variant in Z.ai's GLM-4.7 generation, released N/A. It delivers faster inference for high-throughput workloads while retaining the coding, tool usage, and conversational improvements introduced in GLM-4.7.

ReasoningTool Use

index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'zai/glm-4.7-flash',
  prompt: 'Why is the sky blue?'
})

Overview Playground About Providers Throughput Latency Uptime Status Similar FAQ

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider

Context	Latency	Throughput	Input	Output	Cache	Web Search	Per Query	Capabilities	ZDR	No Training	Release Date

Z.ai

200K

1.3s

160tps

$0.07/M

$0.40/M

Read:$0.01/M

Write:—

—

Amazon Bedrock

200K

0.4s

102tps

$0.07/M

$0.40/M

—

Agent Stack

Core Platform

Tools

Learn

Build

Explore

GLM 4.7 Flash

Providers