Ollama
Sypha supports locally hosted models through Ollama. Running models locally offers privacy, offline use, and potential cost savings, but it requires some initial setup and sufficiently powerful hardware. Given the limits of current consumer hardware, using Ollama with Sypha is not recommended: performance on typical configurations is usually poor.
Website: https://ollama.com/
Configuring Ollama
1. Install Ollama: Download the installer for your operating system from the Ollama website and run it. Verify that Ollama is running (a quick health check appears after this list); you can usually start it with:

   ```bash
   ollama serve
   ```
2. Download a Model: Ollama supports many models; the full catalog is available in the Ollama model library. Some models well suited to coding tasks include:

   - `codellama:7b-code` (an excellent, compact starting point)
   - `codellama:13b-code` (better quality, larger size)
   - `codellama:34b-code` (best quality, very large)
   - `qwen2.5-coder:32b`
   - `mistralai/Mistral-7B-Instruct-v0.1` (a reliable general-purpose model)
   - `deepseek-coder:6.7b-base` (strong at coding tasks)
   - `llama3:8b-instruct-q5_1` (good for general use)

   To download a model, open your terminal and run:

   ```bash
   ollama pull <model_name>
   ```

   For example:

   ```bash
   ollama pull qwen2.5-coder:32b
   ```
3. Adjust the Model's Context Window: Ollama models default to a context window of 2048 tokens, which is too small for many Sypha operations. At least 12,000 tokens is recommended for good results, and 32,000 is ideal. To change this, you adjust the model's parameters and save the result as a new model (a scripted alternative appears after this list).

   First, load the model (using `qwen2.5-coder:32b` as an example):

   ```bash
   ollama run qwen2.5-coder:32b
   ```

   Once the model has loaded in the Ollama interactive shell, set the context size parameter:

   ```
   /set parameter num_ctx 32768
   ```

   Then save the customized model under a new name:

   ```
   /save your_custom_model_name
   ```

   (Replace `your_custom_model_name` with a name of your choice.)
4. Configure Sypha:

   - Open the Sypha sidebar (usually indicated by the Sypha icon).
   - Click the settings gear icon (⚙️).
   - Select "ollama" as the API Provider.
   - Enter the Model name you saved in the previous step (e.g., `your_custom_model_name`).
   - (Optional) Change the base URL if Ollama is running on a different machine or port; the default is `http://localhost:11434`. A quick test request against this endpoint appears after this list.
   - (Optional) Set the Model context size in Sypha's Advanced settings so Sypha can manage its context window efficiently with your customized Ollama model.
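To confirm that the server and model from steps 1 and 2 are in place, two quick checks can help; this assumes Ollama is running locally on its default port:

```bash
# The Ollama server answers HTTP requests on port 11434 by default.
curl http://localhost:11434/api/version

# List locally available models; the model you pulled should appear.
ollama list
```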
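If you would rather script step 3 than use the interactive shell, the same customization can be expressed as a Modelfile. This is a minimal sketch using the example names from above:

```bash
# Build a copy of the base model with a larger context window.
cat > Modelfile <<'EOF'
FROM qwen2.5-coder:32b
PARAMETER num_ctx 32768
EOF
ollama create your_custom_model_name -f Modelfile

# Inspect the new model; num_ctx should be listed under its parameters.
ollama show your_custom_model_name
```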
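Before pointing Sypha at the server in step 4, you can also send a one-off request to the same endpoint Sypha will use, to verify that the customized model responds. A minimal test, assuming the default base URL and the placeholder model name from step 3:

```bash
# Non-streaming generation request against Ollama's REST API.
curl http://localhost:11434/api/generate -d '{
  "model": "your_custom_model_name",
  "prompt": "Write a one-line hello world in Python.",
  "stream": false
}'
```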
Additional Information and Recommendations
- Hardware Requirements: Running large language models locally can be resource-intensive. Make sure your computer meets the requirements for your chosen model.
- Model Selection: Experiment with different models to find the one that works best for your tasks and preferences.
- Offline Use: Once a model is downloaded, you can use Sypha with it without an internet connection.
- Token Tracking: Sypha tracks token usage for models accessed through Ollama, so you can monitor consumption.
- Ollama's Official Documentation: For more detailed information, refer to the official Ollama documentation.