Baseten

Learn how to configure and use Baseten's Model APIs with Sypha. Access frontier open-source models with enterprise-grade performance, reliability, and competitive pricing.

Baseten offers on-demand frontier model APIs built for production environments rather than mere experimentation. Powered by the Baseten Inference Stack, these APIs provide enterprise-level performance and dependability with optimized inference capabilities for premier open-source models from OpenAI, DeepSeek, Meta, Moonshot AI, and Alibaba Cloud.

Website: https://www.baseten.co/products/model-apis/

Getting an API Key

  1. Sign Up/Sign In: Visit Baseten and establish an account or authenticate if you already have one.
  2. Navigate to API Keys: Open your dashboard and locate the API Keys section.
  3. Create a Key: Produce a new API key. Assign it a meaningful name (e.g., "Sypha").
  4. Copy the Key: Copy the API key right away and store it in a secure location, such as an environment variable (see the sketch below).
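
Keep the key out of source control. Below is a minimal Python sketch that loads it from an environment variable; the variable name BASETEN_API_KEY is an assumption for illustration, not something Baseten or Sypha mandates.

  import os

  # Assumed variable name; export it in your shell first, for example:
  #   export BASETEN_API_KEY="paste-your-key-here"
  api_key = os.environ.get("BASETEN_API_KEY")
  if not api_key:
      raise RuntimeError("BASETEN_API_KEY is not set; create a key in the Baseten dashboard first.")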

Supported Models

Sypha is compatible with all models available through Baseten Model APIs, including those listed below. For current pricing information, refer to https://www.baseten.co/products/model-apis/

Note: The Kimi K2 0711, Llama 4 Maverick, and Llama 4 Scout Model APIs were deprecated at 5pm PT on October 8th. See https://www.baseten.co/resources/changelog/model-api-deprecation-notice-kimi-k2-0711-scout-maverick/

  • zai-org/GLM-4.6 (Z AI) - Advanced open frontier model with sophisticated agentic, reasoning, and coding abilities (200K context) - $0.60/$2.20 per 1M tokens
  • moonshotai/Kimi-K2-Instruct-0905 (Moonshot AI) - September release with improved features (262K context) - $0.60/$2.50 per 1M tokens
  • openai/gpt-oss-120b (OpenAI) - 120B MoE featuring robust reasoning capabilities (128K context) - $0.10/$0.50 per 1M tokens
  • Qwen/Qwen3-Coder-480B-A35B-Instruct - Sophisticated coding and reasoning (262K context) - $0.38/$1.53 per 1M tokens
  • Qwen/Qwen3-235B-A22B-Instruct-2507 - Mathematics and reasoning specialist (262K context) - $0.22/$0.80 per 1M tokens
  • deepseek-ai/DeepSeek-R1 - DeepSeek's initial-generation reasoning model (163K context) - $2.55/$5.95 per 1M tokens
  • deepseek-ai/DeepSeek-R1-0528 - Most recent iteration of DeepSeek's reasoning model (163K context) - $2.55/$5.95 per 1M tokens
  • deepseek-ai/DeepSeek-V3.1 - Combined reasoning with sophisticated tool calling (163K context) - $0.50/$1.50 per 1M tokens
  • deepseek-ai/DeepSeek-V3-0324 - Rapid general-purpose with improved reasoning (163K context) - $0.77/$0.77 per 1M tokens
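
To confirm programmatically which model slugs your key can access, you can try the standard OpenAI-style models listing against Baseten's endpoint. This is a minimal Python sketch; it assumes Baseten exposes the OpenAI-compatible /v1/models route and that your key is stored in a BASETEN_API_KEY environment variable (both are assumptions, not guarantees).

  import os
  from openai import OpenAI

  # Assumes an OpenAI-compatible /v1/models listing is available on Baseten.
  client = OpenAI(
      api_key=os.environ["BASETEN_API_KEY"],
      base_url="https://inference.baseten.co/v1",
  )

  for model in client.models.list():
      print(model.id)  # e.g. "openai/gpt-oss-120b", "deepseek-ai/DeepSeek-V3.1"

If the listing endpoint is not exposed, fall back to the pricing page linked above for the current lineup.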

Configuration in Sypha

  1. Open Sypha Settings: Select the settings icon (⚙️) within the Sypha panel.
  2. Select Provider: Pick "Baseten" from the "API Provider" dropdown menu.
  3. Enter API Key: Insert your Baseten API key into the "Baseten API Key" field.
  4. Select Model: Pick your preferred model from the "Model" dropdown menu.

Production-First Architecture

Baseten's Model APIs are engineered for production settings with multiple critical advantages:

Enterprise-Grade Reliability

  • 99.99% uptime achieved via active-active redundancy
  • Cloud-agnostic, multi-cluster autoscaling ensuring consistent availability
  • SOC 2 Type II certification and HIPAA compliance meeting security standards

Optimized Performance

  • Pre-optimized models delivered through the Baseten Inference Stack
  • Latest-generation GPUs supported by multi-cloud infrastructure
  • Ultra-fast inference, optimized end to end for production workloads

Cost Efficiency

  • 5-10x more affordable than closed-model alternatives
  • Optimized multi-cloud infrastructure enabling efficient resource usage
  • Transparent pricing eliminating hidden fees or unexpected rate limit costs

Developer Experience

  • OpenAI-compatible API - migrate by changing a single URL (see the sketch after this list)
  • Drop-in replacement for closed models, with comprehensive observability
  • Effortless scaling from Model APIs to dedicated deployments
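
As a concrete illustration of the single-URL switch, here is a minimal Python sketch using the official openai SDK pointed at Baseten's endpoint. The model slug comes from the Supported Models list above; the BASETEN_API_KEY environment variable name is an assumption.

  import os
  from openai import OpenAI

  client = OpenAI(
      api_key=os.environ["BASETEN_API_KEY"],        # assumed env var name
      base_url="https://inference.baseten.co/v1",   # Baseten's OpenAI-compatible endpoint
  )

  response = client.chat.completions.create(
      model="openai/gpt-oss-120b",                  # any slug from the Supported Models list
      messages=[{"role": "user", "content": "Summarize what Baseten Model APIs offer."}],
  )
  print(response.choices[0].message.content)

The same client object works unchanged with any other model slug from the list above.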

Special Features

Function Calling & Tool Use

Every Baseten model enables structured outputs, function calling, and tool usage through the Baseten Inference Stack, making them excellent for agentic applications.
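
For example, tool use follows the standard OpenAI function-calling schema. The sketch below defines a made-up get_weather tool and sends it to a DeepSeek model (chosen because the list above notes its tool-calling support); exact tool-calling behavior may vary by model.

  import json
  import os

  from openai import OpenAI

  client = OpenAI(
      api_key=os.environ["BASETEN_API_KEY"],
      base_url="https://inference.baseten.co/v1",
  )

  # Hypothetical tool definition in the standard OpenAI function-calling schema.
  tools = [{
      "type": "function",
      "function": {
          "name": "get_weather",
          "description": "Look up the current weather for a city.",
          "parameters": {
              "type": "object",
              "properties": {"city": {"type": "string"}},
              "required": ["city"],
          },
      },
  }]

  response = client.chat.completions.create(
      model="deepseek-ai/DeepSeek-V3.1",   # listed above with sophisticated tool calling
      messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
      tools=tools,
  )

  # If the model chose to call the tool, its arguments arrive as a JSON string.
  for call in response.choices[0].message.tool_calls or []:
      print(call.function.name, json.loads(call.function.arguments))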

Reasoning Capabilities

DeepSeek models provide enhanced reasoning with step-by-step analytical output while sustaining production-ready performance.

Long Context Support

  • 262K tokens available for Qwen3 and Kimi K2 models
  • 200K tokens available for GLM-4.6
  • 163K tokens available for DeepSeek models
  • Ideal for code repositories and intricate multi-turn conversations

Quantization Optimizations

Models are deployed utilizing advanced quantization methods (fp4, fp8, fp16) for peak performance while preserving quality.

Migration from Other Providers

Baseten's OpenAI compatibility simplifies migration:

From OpenAI:

  • Replace api.openai.com with inference.baseten.co/v1 (see the sketch after this list)
  • Preserve existing request/response structures
  • Gain significant cost reductions
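
In code, the switch is typically a single constructor argument. A minimal before/after sketch with the OpenAI Python SDK (the model slug and placeholder key are illustrative):

  from openai import OpenAI

  # Before: the default client targets api.openai.com
  # client = OpenAI(api_key="sk-...")

  # After: same SDK and request shape; only base_url (and the model slug) change.
  client = OpenAI(
      api_key="YOUR_BASETEN_API_KEY",
      base_url="https://inference.baseten.co/v1",
  )
  response = client.chat.completions.create(
      model="deepseek-ai/DeepSeek-V3.1",
      messages=[{"role": "user", "content": "Hello from Baseten"}],
  )
  print(response.choices[0].message.content)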

From Other Providers:

  • Employ standard OpenAI SDK format
  • Keep current prompting approaches
  • Obtain access to newer open-source models

Tips and Notes

  • Model Selection: Select models according to your particular use case - reasoning models for intricate tasks, coding models for development activities, and flagship models for general purposes.
  • Cost Optimization: Baseten provides some of the most attractive pricing available, particularly for open-source models.
  • Context Windows: Leverage expansive context windows (up to 262K tokens) for incorporating extensive codebases and documentation.
  • Enterprise Ready: Baseten is architected for production deployment with enterprise-level security, compliance, and reliability.
  • Dynamic Model Updates: Sypha automatically retrieves the current model list from Baseten, guaranteeing access to new models upon release.
  • Multi-Cloud Capacity Management (MCM): Baseten's multi-cloud infrastructure guarantees high availability and minimal latency worldwide.
  • Support: Baseten delivers dedicated support for production deployments and can collaborate with you on dedicated resources during scaling.

Pricing Information

Current pricing is exceptionally competitive and clear. For the latest pricing details, visit the Baseten Model APIs page. Prices generally range from $0.10-$6.00 per million tokens, making Baseten substantially more economical than numerous closed-model alternatives while granting access to cutting-edge open-source models.
