Dictation

Communicate with Sypha using your voice for faster, more natural AI collaboration

Dictation lets you talk to Sypha instead of typing. Rather than writing out lengthy explanations, you can speak naturally and convey your full intent. The advantage goes beyond speed, although speaking is indeed faster: voice supports a conversational back-and-forth that is hard to sustain through a keyboard.

Why Voice Changes Everything

Typing encourages self-censorship. You abbreviate intricate concepts, omit important details, and sacrifice nuance. Speaking lets you express the whole thought: the entire problem, the constraints you're facing, and the potential issues you're concerned about.

Use Dictation often in Plan mode for quick conversational exchanges. Rather than composing meticulous, well-organized prompts, simply describe your problem aloud. When Sypha asks follow-up questions, answer right away, and iterate until you have a solid plan.

Keyboard friction has been limiting genuine collaboration. Voice eliminates that barrier.

Getting Started

Enable Dictation:

Navigate to Settings → Features → Dictation
Activate the "Enable Dictation" toggle
Sign in to your Sypha account when prompted
Install FFmpeg if it is not already present (Sypha will guide you)

Once Dictation is enabled, a microphone button appears in the chat input area.

Using Dictation:

Press the microphone button to start recording
Speak naturally
Press the button again to stop recording
Wait for the transcription to appear in the chat

Dictation works with every AI model you have configured. Transcription is handled by Sypha's service, while the conversation itself continues with your selected model.

System Requirements

Dictation is not currently available on Windows. Windows support is planned for an upcoming release.

Dictation relies on FFmpeg for voice capture on all supported platforms:

macOS: FFmpeg (via Homebrew: brew install ffmpeg)
Linux: FFmpeg (via apt: sudo apt-get install ffmpeg)

If FFmpeg is missing from your system, Sypha detects this automatically and offers a one-click installation prompt.
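To check for FFmpeg yourself before enabling Dictation, a minimal Python sketch such as the following can help. It only looks for an executable named ffmpeg on your PATH and prints its version line. This is an illustration, not Sypha's own detection logic, and the function names find_executable and version_line are hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def find_executable(name: str = "ffmpeg") -> Optional[str]:
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


def version_line(path: str) -> str:
    """Run `<path> -version` and return the first line of its output."""
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    ffmpeg_path = find_executable()
    if ffmpeg_path is None:
        # Install commands are the ones listed in the requirements above.
        print("FFmpeg not found on PATH. Install it with brew or apt-get.")
    else:
        print(f"Found FFmpeg at {ffmpeg_path}")
        print(version_line(ffmpeg_path))
```

If the script reports that FFmpeg is missing, install it with the command for your platform listed above, or accept Sypha's installation prompt.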

Where Dictation Shines

Plan Mode Conversations

Dictation excels during Plan mode exchanges. Rather than meticulously constructing prompts, you can:

Describe your complete problem scenario all at once
Answer Sypha's questions in real time
Refine ideas without keyboard constraints
Talk through your reasoning as Sypha processes it

Start a planning conversation by talking for 2-3 uninterrupted minutes, describing the full context of what you want to build, the constraints on your work, and the specific obstacles you're running into.

Complex Problem Explanation

Some problems are hard to describe in writing, for example:

Multi-step processes with edge cases
Integration problems spanning several systems
Performance issues requiring precise reproduction steps
UI/UX concerns that need extensive background

Speaking lets you describe the complete scenario in a natural flow, capturing all those "oh, and also..." details that turn out to be essential.

Code Review and Debugging

During code reviews or bug explanations, voice lets you talk through your reasoning:

"This function appears correct, but my concern is what occurs when..."
"The problem could be located in this area, or perhaps in this alternative section..."
"I attempted X and Y, but both failed due to..."

You can share your entire debugging journey rather than just asking the final question.

Technical Requirements

System Requirements:

FFmpeg installed on your system
An active internet connection
A Sypha account with transcription credits

Audio Quality:

Records in WebM format using the Opus codec
Mono audio at a 16 kHz sampling rate
Configured for speech recognition
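To produce a test file with the same characteristics (WebM container, Opus codec, one channel, 16 kHz), the FFmpeg options can be sketched as below. This is not the command Sypha runs internally, which is not documented here. The function name build_transcode_args and the file names are hypothetical, and the sketch assumes your FFmpeg build includes the libopus encoder.

```python
from typing import List


def build_transcode_args(source: str, destination: str) -> List[str]:
    """Build an ffmpeg command that re-encodes `source` as mono 16 kHz Opus in WebM."""
    return [
        "ffmpeg",
        "-i", source,        # input audio file
        "-vn",               # ignore any video stream
        "-c:a", "libopus",   # encode audio with Opus
        "-ac", "1",          # single channel (mono)
        "-ar", "16000",      # 16 kHz sampling rate
        "-f", "webm",        # WebM container
        destination,
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so nothing is overwritten.
    print(" ".join(build_transcode_args("input.wav", "speech.webm")))
```

Pass the resulting list to subprocess.run to run the conversion once you have checked the file names.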

Privacy:

Audio is captured locally on your device
Only audio files are sent for transcription
No audio is retained after transcription
Temporary files are removed automatically

Cost and Credits

Voice transcription costs $0.006 per minute, billed through your Sypha account. For most users, this amounts to a few cents per session.

A standard 5-minute planning discussion costs approximately 3 cents. Even frequent voice users seldom exceed a few dollars monthly.

Pricing is experimental and may change as we improve the service.
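To estimate your own usage, the arithmetic can be sketched in Python. The rate is the $0.006 per minute quoted above and may change. The function name transcription_cost is hypothetical, and the sketch assumes simple per-minute billing with no rounding or minimum charge.

```python
from decimal import Decimal

# USD per minute of transcribed audio, as quoted in this section.
RATE_PER_MINUTE = Decimal("0.006")


def transcription_cost(minutes) -> Decimal:
    """Return the estimated cost in USD for `minutes` of dictation."""
    duration = Decimal(str(minutes))
    if duration < 0:
        raise ValueError("minutes must not be negative")
    return RATE_PER_MINUTE * duration


if __name__ == "__main__":
    # One 5-minute planning session.
    print(transcription_cost(5))
    # Twenty such sessions in a month (100 minutes in total).
    print(transcription_cost(20 * 5))
```

A 5-minute session works out to $0.03, and twenty such sessions in a month to $0.60, in line with the figures above.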

Best Practices

Speak Naturally

Don't try to speak the way you type. Use your normal conversational style and don't worry about perfect grammar.

Give Context First

Begin with the big picture, then get into the details. "I'm developing a React app requiring real-time data handling, and I'm experiencing performance complications with the WebSocket connection..."

Use Voice for Exploration

Dictation is ideal for exploratory conversations where your requirements aren't entirely clear. Start describing the problem and let the discussion develop from there.

Combine with Text

You don't have to use voice for everything. Use voice for intricate explanations and background, then switch to text for quick replies or code snippets.

Troubleshooting

Microphone Not Working

Verify that your IDE has microphone access permission
Confirm that FFmpeg is installed correctly
Try reloading VS Code or your editor

Poor Transcription Quality

Speak clearly at a normal volume
Minimize background noise where possible
Check your microphone settings

Connection Issues

Confirm that you are connected to the internet
Check whether a firewall is blocking Sypha's servers
Try signing out and signing back in to your Sypha account

Authentication Issues

If you encounter authentication errors, sign out and sign back in to your Sypha account
Verify that your account has enough transcription credits
Confirm that your internet connection is stable

Audio Recording Issues

Confirm that FFmpeg is correctly installed and accessible
Verify that your browser or IDE has microphone permission
Restart your editor if audio capture fails

The Future of AI Collaboration

When you can speak at the pace of your thinking, self-censorship disappears. You convey the complete context, the edge cases, and the "what if" scenarios that matter. This leads to better solutions and fewer rounds of clarification.
