
Automatic Context Summarization

When your conversation nears the model's context window limit, Sypha automatically condenses it to free up space and continue working.

[Screenshot: the auto-compact feature condensing conversation context]

How It Works

Sypha tracks token usage throughout your conversation. As you approach the limit, it:

  1. Generates a comprehensive summary of everything that has occurred
  2. Retains all technical details, code changes, and decisions
  3. Replaces the conversation history with the summary
  4. Continues precisely where it left off

You'll see a summarization tool call when this happens, with its total cost displayed like any other API call in the chat view.
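
As a rough sketch of this loop (the function names and the 80% trigger point below are assumptions, not Sypha's actual code):

```typescript
// Illustrative sketch only: the names and the 80% trigger point are
// assumptions, not Sypha's actual implementation.
interface Message {
  role: "user" | "assistant";
  content: string;
}

async function maybeCompact(
  history: Message[],
  contextWindow: number,
  usedTokens: number,
  summarize: (history: Message[]) => Promise<Message[]>,
): Promise<Message[]> {
  // Nothing to do while usage is comfortably below the limit.
  const TRIGGER = 0.8; // assumed; the real threshold varies by model
  if (usedTokens < contextWindow * TRIGGER) return history;

  // Near the limit: generate a comprehensive summary and swap it in
  // for the full history, so the task continues where it left off.
  return summarize(history);
}
```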

Why This Matters

Previously, Sypha would truncate older messages when it hit the context limit, losing important context from earlier in the conversation.

With summarization in place:

  • All technical decisions and code patterns get preserved
  • File changes and project context stay intact
  • Sypha retains memory of everything it's done
  • You can work on substantially larger projects without interruption

Context Summarization pairs beautifully with Focus Chain. When Focus Chain is enabled, todo lists persist across summarizations. This allows Sypha to work on long-horizon tasks spanning multiple context windows while staying on track with the todo list guiding it through each reset.

Technical Details

Summarization runs through your configured API provider with the same model you're already using, and it relies on prompt caching to minimize costs.

  1. Sypha uses a summarization prompt to request a conversation summary.

  2. After the summary is generated, Sypha replaces the conversation history with a continuation prompt that instructs the model to continue working and supplies the summary as context.
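
A minimal sketch of that two-step flow, assuming hypothetical prompt wording and message shapes (the real prompts live in Sypha's source and are worded differently):

```typescript
// Hypothetical prompt shapes; not Sypha's actual prompts.
type Role = "user" | "assistant";
interface ChatMessage {
  role: Role;
  content: string;
}

// Step 1: append a summarization request to the existing (cached) history.
function buildSummaryRequest(history: ChatMessage[]): ChatMessage[] {
  return [
    ...history,
    {
      role: "user",
      content:
        "Summarize this conversation, retaining all technical details, " +
        "code changes, and decisions made so far.",
    },
  ];
}

// Step 2: the summary replaces the history entirely, wrapped in a
// continuation prompt that tells the model to keep working.
function buildContinuation(summary: string): ChatMessage[] {
  return [
    {
      role: "user",
      content: `Context from the previous conversation:\n\n${summary}\n\nContinue the task where you left off.`,
    },
  ];
}
```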

The threshold at which auto-summarization activates varies by model and context window size. You can see how thresholds are determined in context-window-utils.ts.
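
To illustrate the kind of logic involved, here is a sketch with made-up numbers; the real per-model values live in context-window-utils.ts:

```typescript
// Made-up thresholds for illustration only; consult
// context-window-utils.ts for the actual per-model logic.
function getAutoCompactThreshold(contextWindow: number): number {
  // Keep a buffer below the hard limit so the summarization request
  // itself still fits in the window.
  if (contextWindow >= 200_000) return contextWindow - 40_000;
  if (contextWindow >= 128_000) return contextWindow - 30_000;
  return Math.floor(contextWindow * 0.75);
}
```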

Cost Considerations

Summarization reuses the conversation's existing prompt cache, so it costs about the same as any other tool call.

Since most of the input tokens are already cached, you're primarily paying for the summary generation (output tokens), which makes it very cost-effective.
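
As a back-of-the-envelope example, with entirely hypothetical prices (check your provider's actual rates):

```typescript
// Hypothetical pricing, per million tokens.
const CACHE_READ_PER_MTOK = 0.3; // cached input tokens
const FRESH_INPUT_PER_MTOK = 3.0; // uncached input tokens
const OUTPUT_PER_MTOK = 15.0; // generated output tokens

// Summarizing a ~150k-token conversation that is almost fully cached,
// producing a ~2k-token summary:
const cost =
  (145_000 / 1e6) * CACHE_READ_PER_MTOK + // 0.0435
  (5_000 / 1e6) * FRESH_INPUT_PER_MTOK + // 0.0150
  (2_000 / 1e6) * OUTPUT_PER_MTOK; // 0.0300

console.log(`$${cost.toFixed(2)}`); // ≈ $0.09, versus ~$0.45 to
// resend the same 150k tokens uncached at these rates.
```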

Restoring Context with Checkpoints

You can use checkpoints to restore your task state from before a summarization occurred. This means you never truly lose context: you can always revert to previous versions of your conversation.

Editing a message that precedes a summarization tool call works like a checkpoint restore, letting you return the conversation to that point.

Next Generation Model Support

Auto Compact employs advanced LLM-based summarization which we've found performs significantly better for next-generation models. We currently support this feature for the following models:

  • Claude 4 series
  • Gemini 2.5 series
  • GPT-5
  • Grok 4

When using other models, Sypha automatically defaults to the standard rule-based context truncation method, even if Auto Compact is enabled in settings.
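
A minimal sketch of that gating, assuming a hypothetical supportsAutoCompact helper and illustrative model identifiers:

```typescript
// Hypothetical helper; the model identifiers and function names are
// illustrative, not Sypha's actual API.
const AUTO_COMPACT_FAMILIES = ["claude-4", "gemini-2.5", "gpt-5", "grok-4"];

function supportsAutoCompact(modelId: string): boolean {
  return AUTO_COMPACT_FAMILIES.some((f) => modelId.startsWith(f));
}

function pickContextStrategy(modelId: string, autoCompactEnabled: boolean) {
  return autoCompactEnabled && supportsAutoCompact(modelId)
    ? "llm-summarization" // next-generation models
    : "rule-based-truncation"; // fallback for everything else
}
```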
