
LM Studio

A comprehensive walkthrough for configuring LM Studio to run AI models locally alongside Sypha.

Configuring LM Studio for Sypha

Run AI models on your local machine by connecting LM Studio to Sypha.

Requirements

  • A Mac with Apple Silicon, or a Windows/Linux x86-64 system with AVX2 support
  • VS Code with Sypha extension installed

Configuration Process

1. Install LM Studio

  • Navigate to lmstudio.ai
  • Download and install the version for your operating system
LM Studio download page

2. Open LM Studio

  • Launch the application you just installed
  • The left sidebar contains four tabs:
      • Chat
      • Developer (where you start the server)
      • My Models (your downloaded models are stored here)
      • Discover (browse and download new models)
LM Studio interface overview

3. Download a Model

  • Open the "Discover" tab
  • Select a model and start the download
  • Wait for the download to complete
Downloading a model in LM Studio

4. Start the Server

  • Switch to the "Developer" tab
  • Flip the server toggle to the "Running" position
  • Important: The server operates at http://localhost:1234
Starting the LM Studio server
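Before moving on, you can confirm the server is reachable. The sketch below uses only Python's standard library to query the OpenAI-compatible model-listing endpoint that LM Studio serves at the address above (the helper names are illustrative):

```python
# Quick connectivity check for the LM Studio local server.
# Assumes the default address http://localhost:1234 from the step above.
import json
import urllib.request

BASE_URL = "http://localhost:1234"

def models_endpoint(base_url: str = BASE_URL) -> str:
    """Build the OpenAI-compatible model-listing endpoint URL."""
    return base_url.rstrip("/") + "/v1/models"

def list_local_models(base_url: str = BASE_URL) -> list:
    """Return the IDs of models currently available in LM Studio."""
    with urllib.request.urlopen(models_endpoint(base_url), timeout=5) as resp:
        payload = json.load(resp)
    return [m["id"] for m in payload.get("data", [])]

if __name__ == "__main__":
    try:
        print(list_local_models())
    except OSError:
        print("Server not reachable - is the toggle set to Running?")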

5. Set Up Sypha

  1. Launch VS Code
  2. Click the Sypha settings icon
  3. Choose "LM Studio" as your API provider
  4. Pick your model from the dropdown list
Configuring Sypha with LM Studio
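Under the hood, requests to the local server follow the OpenAI-compatible chat-completion format that LM Studio exposes. A minimal sketch of such a request (the model name and function names are examples, not Sypha's internals):

```python
# Build and send an OpenAI-compatible chat request to the local server.
# The model name "qwen3-coder-30b" is an illustrative example.
import json
import urllib.request

def build_chat_request(model: str, prompt: str, temperature: float = 0.2) -> dict:
    """Build an OpenAI-compatible chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def send_chat_request(body: dict, base_url: str = "http://localhost:1234") -> dict:
    """POST the request to the local LM Studio server and return the reply."""
    req = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```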

Optimal Model Selection and Configuration

For the best results with Sypha, we recommend Qwen3 Coder 30B A3B Instruct. It offers strong coding ability and reliable tool calling.

Essential Configuration Options

Once you've loaded your model within the Developer tab, adjust these parameters:

  1. Context Length: Configure to 262,144 (this is the model's upper limit)
  2. KV Cache Quantization: Keep this disabled (essential for maintaining stable performance)
  3. Flash Attention: Turn this on if your hardware supports it (enhances speed)
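The three settings above can be captured as a small checklist. The key names below are illustrative, not LM Studio's internal identifiers:

```python
# Recommended model settings, captured as a checkable dict.
RECOMMENDED_SETTINGS = {
    "context_length": 262_144,       # the model's upper limit
    "kv_cache_quantization": False,  # keep disabled for stable performance
    "flash_attention": True,         # enable only if your hardware supports it
}

def validate_settings(settings: dict, max_context: int = 262_144) -> bool:
    """Return True if the settings stay within the model's context limit."""
    return 0 < settings["context_length"] <= max_context
```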

Selecting Quantization Level

Select quantization according to your available RAM:

  • 32GB RAM: Opt for 4-bit quantization (~17GB download)
  • 64GB RAM: Choose 8-bit quantization (~32GB download) for enhanced quality
  • 128GB+ RAM: Explore full precision or more substantial models
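The RAM guidance above maps directly to a lookup; a sketch (the function name and return strings are illustrative):

```python
def recommended_quantization(ram_gb: int) -> str:
    """Map system RAM (in GB) to the quantization level suggested above."""
    if ram_gb >= 128:
        return "full precision (or a larger model)"
    if ram_gb >= 64:
        return "8-bit (~32GB download)"
    if ram_gb >= 32:
        return "4-bit (~17GB download)"
    return "below the recommended minimum for this model"
```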

Choosing Model Format

  • Mac (Apple Silicon): Select MLX format for enhanced performance
  • Windows/Linux: Choose GGUF format
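The same rule as a platform check, useful if you script your setup (a sketch; MLX is specific to Apple Silicon Macs, GGUF works everywhere):

```python
import platform

def recommended_format(system: str = None, machine: str = None) -> str:
    """Pick MLX on Apple Silicon Macs, GGUF on everything else."""
    system = system or platform.system()
    machine = machine or platform.machine()
    if system == "Darwin" and machine == "arm64":
        return "MLX"
    return "GGUF"
```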

Activating Compact Prompts

To maximize performance when using local models, activate compact prompts through Sypha's settings. This feature decreases prompt size by 90% while preserving essential functionality.

Go to Sypha Settings → Features → Use Compact Prompt and enable it.

Key Points to Remember

  • Launch LM Studio prior to connecting it with Sypha
  • Maintain LM Studio as a background process
  • Initial model downloads can require several minutes based on file size
  • Downloaded models persist on your local system

Resolving Common Issues

If Sypha cannot connect to LM Studio:

  1. Confirm the LM Studio server is running (check the Developer tab)
  2. Make sure a model is loaded
  3. Verify that your system meets the hardware requirements
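As a first diagnostic step, you can check whether anything is listening on the server's port at all. A minimal sketch using only the standard library:

```python
import socket

def server_port_open(host: str = "localhost", port: int = 1234,
                     timeout: float = 2.0) -> bool:
    """Return True if something is listening on the LM Studio port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    if server_port_open():
        print("Port 1234 is open - the server appears to be running.")
    else:
        print("Nothing on port 1234 - start the server in the Developer tab.")
```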
