Z AI (Zhipu AI)
Learn how to configure and use Z AI's GLM-4.5 models with Sypha, including hybrid reasoning, agentic capabilities, open-source licensing, and regional endpoints.
Z AI (formerly Zhipu AI) develops the GLM-4.5 series, which combines hybrid reasoning with an agent-native architecture. Released in July 2025, these models perform strongly on reasoning, coding, and agent benchmarks while remaining open source under the MIT license.
Website: https://z.ai/model-api (International) | https://open.bigmodel.cn/ (China)
Getting an API Key
International Users
- Sign Up/Sign In: Visit https://z.ai/model-api and create an account or log in.
- Navigate to API Keys: Open your account dashboard and find the API keys section.
- Create a Key: Generate a new API key for your application.
- Copy the Key: Copy the key immediately and store it securely.
China Mainland Users
- Sign Up/Sign In: Visit https://open.bigmodel.cn/ and create an account or log in.
- Navigate to API Keys: Open your account dashboard and find the API keys section.
- Create a Key: Generate a new API key for your application.
- Copy the Key: Copy the key immediately and store it securely.
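Wherever you use the key outside of Sypha's settings UI, keep it in an environment variable rather than hard-coding it. A minimal sketch (the `ZAI_API_KEY` variable name is a common convention chosen here for illustration, not a Sypha requirement):

```python
import os

def load_zai_key(env_var: str = "ZAI_API_KEY") -> str:
    """Read the Z AI API key from the environment, failing loudly if absent."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set {env_var} to your Z AI API key before running.")
    return key
```

Failing at startup when the variable is missing is friendlier than a confusing 401 error later in the request path.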
Supported Models
Z AI offers distinct model catalogs depending on your chosen region:
GLM-4.5 Series
- GLM-4.5 - Flagship model with 355B total parameters (32B active)
- GLM-4.5-Air - Lighter model with 106B total parameters (12B active)
GLM-4.5 Hybrid Reasoning Models
- GLM-4.5 (Thinking Mode) - Deeper reasoning with step-by-step analysis
- GLM-4.5-Air (Thinking Mode) - Efficient reasoning suitable for more modest hardware
All models include:
- 128,000 token context window enabling comprehensive document processing
- Mixture of Experts (MoE) architecture delivering optimal performance
- Agent-native design combining reasoning, coding, and tool utilization
- Open-source availability through MIT license
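The specifications above can be restated programmatically, which is handy when choosing a model at runtime. A sketch that simply encodes the published figures (the lowercase model IDs are illustrative, not confirmed API identifiers):

```python
# Published GLM-4.5 series specs: total vs. active parameters (billions)
# and context window size, as stated in Z AI's documentation.
MODEL_SPECS = {
    "glm-4.5":     {"total_b": 355, "active_b": 32, "context_tokens": 128_000},
    "glm-4.5-air": {"total_b": 106, "active_b": 12, "context_tokens": 128_000},
}

def active_fraction(model: str) -> float:
    """Fraction of parameters active per token under the MoE architecture."""
    spec = MODEL_SPECS[model]
    return spec["active_b"] / spec["total_b"]
```

The MoE design is visible in the numbers: GLM-4.5 activates only about 9% of its parameters per token, which is why a 355B-parameter model can respond with the latency of a much smaller dense model.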
Configuration in Sypha
- Open Sypha Settings: Select the settings icon (⚙️) within the Sypha panel.
- Select Provider: Pick "Z AI" from the "API Provider" dropdown menu.
- Select Region: Pick your region:
- "International" for worldwide access
- "China" for mainland China access
- Enter API Key: Insert your Z AI API key into the "Z AI API Key" field.
- Select Model: Pick your preferred model from the "Model" dropdown menu.
GLM Coding Plans
Z AI offers subscription plans aimed specifically at coding workloads. These plans provide low-cost access to GLM-4.5 models on a prompt-quota basis rather than per-token API billing.
Plan Options
GLM Coding Lite - $3/month
- 120 prompts per 5-hour cycle
- Access to the GLM-4.5 model
- Works only through coding tools such as Sypha
GLM Coding Pro - $15/month
- 600 prompts per 5-hour cycle
- Access to the GLM-4.5 model
- Works only through coding tools such as Sypha
Both plans are discounted for the first month: Lite from $6 to $3, and Pro from $30 to $15.
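The quota arithmetic works out as follows. With 5-hour cycles, a 30-day month contains 144 cycles, so the plans cap out well above typical usage:

```python
def max_monthly_prompts(prompts_per_cycle: int, cycle_hours: int = 5,
                        days: int = 30) -> int:
    """Upper bound on prompts per month if every quota cycle is fully used."""
    cycles = (days * 24) // cycle_hours   # 144 cycles in a 30-day month
    return cycles * prompts_per_cycle

lite_max = max_monthly_prompts(120)   # 17,280 prompts/month ceiling
pro_max = max_monthly_prompts(600)    # 86,400 prompts/month ceiling
```

At full utilization that is well under a tenth of a cent per prompt on either plan; in practice the per-cycle limit, not the monthly total, is what you will notice during a long coding session.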

Setting up GLM Coding Plans
To use a GLM Coding Plan with Sypha:
- Subscribe: Go to https://z.ai/subscribe and choose your plan.
- Create API Key: After subscribing, open your Z AI dashboard and generate an API key for your coding plan.
- Configure in Sypha: Open Sypha settings, select "Z AI" as your provider, and paste the key into the "Z AI API Key" field.

This links your subscription directly to Sypha, giving you access to GLM-4.5's tool-calling features tuned for coding workflows.
Z AI's Hybrid Intelligence
Z AI's GLM-4.5 series presents groundbreaking capabilities that distinguish it from traditional language models:
Hybrid Reasoning Architecture
GLM-4.5 functions in two separate modes:
- Thinking Mode: Designed for complex reasoning and tool use; the model works through an explicit chain of analysis before answering
- Non-Thinking Mode: Returns immediate responses to simple queries, prioritizing speed
This dual-mode design is part of an "agent-native" approach that scales processing effort to query complexity.
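In an OpenAI-compatible chat API, mode selection is typically a per-request parameter. The sketch below assumes a `thinking` field of the shape `{"type": "enabled"|"disabled"}`; verify the exact parameter name and shape against Z AI's current API reference before relying on it:

```python
def build_chat_payload(prompt: str, thinking: bool, model: str = "glm-4.5") -> dict:
    """Assemble a chat-completions request body.

    The `thinking` field's name and shape are an assumption about Z AI's
    API, included for illustration -- check the official docs."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "thinking": {"type": "enabled" if thinking else "disabled"},
    }
```

A coding agent would enable thinking mode for multi-step refactors and disable it for quick lookups, matching the efficiency trade-off described above.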
Exceptional Performance
GLM-4.5 achieves an average score of 63.2 across 12 benchmarks spanning agentic tasks, reasoning, and coding, ranking 3rd among the proprietary and open-source models evaluated. GLM-4.5-Air remains competitive at 59.8 while being substantially more efficient.
Mixture of Experts Excellence
The advanced MoE architecture maximizes performance while preserving computational efficiency:
- GLM-4.5: 355B total parameters featuring 32B active parameters
- GLM-4.5-Air: 106B total parameters featuring 12B active parameters
Extended Context Capabilities
The 128,000-token context window supports analysis of long documents and codebases; in practical testing, codebases of roughly 2,000 lines were processed effectively with no loss of quality.
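A quick back-of-the-envelope check of whether a codebase fits in context can use a per-line token heuristic. The 10-tokens-per-line figure below is a coarse assumption, not a tokenizer; real counts vary by language and line length:

```python
def fits_in_context(line_count: int, tokens_per_line: float = 10.0,
                    context_tokens: int = 128_000) -> bool:
    """Rough estimate: does a codebase of `line_count` lines fit in the
    context window? Uses a crude tokens-per-line heuristic, and leaves
    no headroom for the prompt or the model's response."""
    return line_count * tokens_per_line <= context_tokens
```

By this estimate a 2,000-line codebase uses only around 20,000 tokens, leaving ample room for instructions and multi-turn conversation.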
Open-Source Leadership
Published under MIT license, GLM-4.5 grants researchers and developers access to cutting-edge capabilities without proprietary limitations, including base models, hybrid reasoning editions, and optimized FP8 variants.
Regional Optimization
API Endpoints
- International: https://api.z.ai/api/paas/v4
- China: https://open.bigmodel.cn/api/paas/v4
Model Availability
The region configuration dictates both API endpoint and accessible models, with automatic filtering guaranteeing compatibility with your chosen region.
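If you call the API directly rather than through Sypha, selecting the base URL by region is a one-line lookup. A sketch using the endpoints listed above:

```python
# OpenAI-compatible base URLs per region, as documented by Z AI.
ZAI_ENDPOINTS = {
    "international": "https://api.z.ai/api/paas/v4",
    "china": "https://open.bigmodel.cn/api/paas/v4",
}

def base_url(region: str) -> str:
    """Return the API base URL for the configured region."""
    try:
        return ZAI_ENDPOINTS[region.lower()]
    except KeyError:
        raise ValueError(
            f"Unknown region {region!r}; use 'international' or 'china'.")
```

Note that keys are region-specific: a key issued at z.ai will not authenticate against the bigmodel.cn endpoint, so the region setting and the key must come from the same platform.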
Special Features
Agentic Capabilities
GLM-4.5's integrated architecture renders it especially appropriate for sophisticated intelligent agent applications demanding unified reasoning, coding, and tool utilization abilities.
Comprehensive Benchmarking
Performance assessment includes:
- 3 agentic task benchmarks
- 7 reasoning benchmarks
- 2 coding benchmarks
This thorough evaluation exhibits versatility throughout various AI applications.
Developer Integration
Models enable integration via multiple frameworks:
- transformers
- vLLM
- SGLang
Packaged with specialized model code, tool parser, and reasoning parser implementations.
Performance Comparisons
vs Claude Sonnet 4
GLM-4.5 is competitive on agentic coding and reasoning tasks, though Claude Sonnet 4 retains an edge in coding success rates and autonomous multi-feature application development.
vs GPT-4.5
GLM-4.5 performs competitively on reasoning and agent benchmarks, while GPT-4.5 typically leads in direct task accuracy on professional benchmarks such as MMLU and AIME.
Tips and Notes
- Region Selection: Select the suitable region for peak performance and adherence to local regulations.
- Model Selection: GLM-4.5 for highest performance, GLM-4.5-Air for efficiency and standard hardware compatibility.
- Context Advantage: Expansive 128K context window permits processing of considerable codebases and documents.
- Open Source Benefits: MIT license permits both commercial usage and secondary development.
- Agentic Applications: Especially robust for applications demanding reasoning, coding, and tool usage integration.
- Hybrid Reasoning: Apply Thinking Mode for intricate problems, Non-Thinking Mode for straightforward queries.
- API Compatibility: OpenAI-compatible API delivers streaming responses and usage reporting.
- Framework Support: Multiple integration alternatives accessible for various deployment scenarios.
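Because the API is OpenAI-compatible, streaming responses arrive as server-sent events: each chunk is a line of the form `data: {...}`, and the stream ends with `data: [DONE]`. A minimal parser for those lines (assumes the standard OpenAI SSE framing; verify against Z AI's streaming docs):

```python
import json

def parse_sse_line(line: str):
    """Parse one SSE line from an OpenAI-compatible stream.

    Returns the decoded JSON chunk, None for blank or non-data lines,
    or the sentinel string "DONE" when the stream is finished."""
    line = line.strip()
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return "DONE"
    return json.loads(payload)
```

A client loop would feed each received line through this function, append `chunk["choices"][0]["delta"].get("content", "")` to the running output, and stop on the `"DONE"` sentinel.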