GLM-5V-Turbo

zai-org/glm-5v-turbo

GLM-5V-Turbo is Z.AI’s first multimodal coding foundation model, built for vision-based coding tasks. It can natively process multimodal inputs such as images, video, and text, while also excelling at long-horizon planning, complex coding, and action execution. Deeply optimized for agent workflows, it works seamlessly with agents such as Claude Code and OpenClaw to complete the full loop of “understand the environment → plan actions → execute tasks”.

Price

Input	$1.2 per million tokens
Cached reads	$0.24 per million tokens
Output	$4 per million tokens
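
As a rough illustration of how these rates combine, the sketch below estimates the cost of a single request. The function name `estimate_cost` is hypothetical. It assumes that cached tokens are the part of the input served from cache and are billed at the cached-read rate instead of the input rate. The page does not state this, so check it against your invoice.

```python
# Prices in USD per million tokens, copied from the price list above.
INPUT_PRICE = 1.2
CACHED_READ_PRICE = 0.24
OUTPUT_PRICE = 4.0


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the portion of input_tokens read from cache,
    billed at the cached-read rate rather than the regular input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_PRICE
        + cached_tokens * CACHED_READ_PRICE
        + output_tokens * OUTPUT_PRICE
    )
    return total / 1_000_000
```

For example, `estimate_cost(50_000, 2_000, cached_tokens=40_000)` prices 10,000 tokens at the input rate, 40,000 at the cached-read rate, and 2,000 at the output rate.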

Use the following code example to integrate our API:

from openai import OpenAI

# Point the OpenAI SDK at the OpenAI-compatible endpoint.
client = OpenAI(
    api_key="<Your API Key>",
    base_url="https://api.highwayapi.ai/openai",
)

response = client.chat.completions.create(
    model="zai-org/glm-5v-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, how are you?"},
    ],
    max_tokens=131072,  # the model's maximum output length
    temperature=0.7,
)

# Print the text of the first completion.
print(response.choices[0].message.content)

Information

Provider

ZhipuAI

Quantization

fp8

Supported Features

Context length

204800

Maximum output

131072

Function calling

Supported

Structured output

Supported

Reasoning

Supported

Serverless

Supported
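
Since function calling is listed as supported, a request with a tool definition might look like the sketch below. It assumes the endpoint follows the OpenAI Chat Completions `tools` format, which this page does not confirm. The `read_file` tool and the helper names `build_tool_request`, `parse_tool_call`, and `main` are hypothetical and exist only for illustration.

```python
import json

# Hypothetical tool definition, used only for illustration.
READ_FILE_TOOL = {
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Return the contents of a text file in the workspace.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {
                    "type": "string",
                    "description": "Path relative to the workspace root.",
                }
            },
            "required": ["path"],
        },
    },
}


def build_tool_request(question: str) -> dict:
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        "model": "zai-org/glm-5v-turbo",
        "messages": [{"role": "user", "content": question}],
        "tools": [READ_FILE_TOOL],
        "max_tokens": 1024,
    }


def parse_tool_call(name: str, arguments: str) -> tuple[str, dict]:
    """Decode one tool call; the arguments arrive as a JSON-encoded string."""
    return name, json.loads(arguments)


def main() -> None:
    # Imported here so the helpers above can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI(
        api_key="<Your API Key>",
        base_url="https://api.highwayapi.ai/openai",
    )
    response = client.chat.completions.create(
        **build_tool_request("What does setup.py install?")
    )
    # tool_calls is None when the model answers directly without calling a tool.
    for call in response.choices[0].message.tool_calls or []:
        print(parse_tool_call(call.function.name, call.function.arguments))
```

Calling `main()` sends the request. Your application is responsible for executing the requested tool and returning its result to the model in a follow-up message.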

Input Capabilities

text, image, video

Output Capabilities

text
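
Because the model accepts image input, a vision-based coding request might be sketched as follows. It assumes the endpoint accepts OpenAI-style `image_url` content parts carrying a base64 data URL, which this page does not confirm. The helper names `build_image_messages` and `describe_screenshot` are hypothetical.

```python
import base64


def build_image_messages(image_bytes: bytes, prompt: str, mime_type: str = "image/png") -> list:
    """Build a chat `messages` list that pairs one image with a text prompt."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                # Assumption: the endpoint accepts base64 data URLs in image_url parts.
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
                },
            ],
        }
    ]


def describe_screenshot(path: str) -> str:
    """Send a local PNG screenshot and return the model's text reply."""
    # Imported here so build_image_messages can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI(
        api_key="<Your API Key>",
        base_url="https://api.highwayapi.ai/openai",
    )
    with open(path, "rb") as f:
        messages = build_image_messages(
            f.read(), "Write the HTML and CSS that reproduce this screenshot."
        )
    response = client.chat.completions.create(
        model="zai-org/glm-5v-turbo",
        messages=messages,
        max_tokens=4096,
    )
    return response.choices[0].message.content
```

Keeping the payload construction in its own function makes it possible to inspect or unit-test the message structure without making a network call.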