Ollama
Ollama is a powerful tool for running large language models locally on your machine. It provides a simple way to download, run, and manage open-source models like Llama, Mistral, and many others directly on your local hardware.
Provider Slug: ollama
Get Started
Step 1: Install Ollama
- Download and install Ollama for your operating system
- Follow the installation instructions for your platform (macOS, Linux, or Windows) from the Ollama Docs
Step 2: Pull a Model
- Open your terminal or command prompt
- Pull a model using the Ollama CLI:
ollama pull <model-name>
For example, you can pull models such as:
ollama pull mistral
ollama pull codellama
ollama pull llama2:7b
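Once a model is downloaded, you can confirm it is available and try a quick prompt directly from the CLI (llama2:7b below is just the example model pulled above):
# list locally installed models
ollama list
# run a one-off prompt against a local model
ollama run llama2:7b "Explain what a local LLM is in one sentence."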
Step 3: Start Ollama Service
- Start the Ollama service (it runs on port 11434 by default):
ollama serve
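You can confirm the service is reachable by querying the local API; /api/tags returns the models Ollama currently has installed:
# expect a JSON response listing your local models
curl http://localhost:11434/api/tags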
Step 4: Expose Ollama API
If you're not self-hosting the Lamatic Gateway, you'll need to expose your local Ollama API:
- Install a tunneling service like ngrok
- Expose your Ollama API:
ngrok http 11434 --host-header="localhost:11434"
For using Ollama with ngrok, see this useful guide.
- Copy the generated ngrok URL (e.g., https://abc123.ngrok.io)
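Before moving on, it is worth confirming that the tunnel actually reaches Ollama. Replace the example URL below with the URL ngrok generated for you:
# should return the same model list as the localhost endpoint
curl https://abc123.ngrok.io/api/tags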
Setup Instructions
Configure in Lamatic
- Open your Lamatic.ai Studio
- Navigate to Models section
- Select Ollama from the provider list
- Enter your Ollama API URL (HTTPS is required unless you are self-hosting):
- If self-hosting: http://localhost:11434
- If using ngrok: Use the ngrok URL from Step 4 above
- Save your changes
Key Features
- Local Model Execution: Run models directly on your hardware without cloud dependencies
- Wide Model Support: Access hundreds of open-source models from Hugging Face
- Easy Model Management: Simple CLI commands to pull, run, and manage models
- Cost Effective: No API costs - only your local computing resources
- Privacy Focused: All processing happens locally on your machine
Available Models
Ollama supports a wide variety of models. For the full list, check out the Ollama Library.
Configuration Options
- API URL: The endpoint where Ollama is running (default: http://localhost:11434)
- Model Selection: Choose from your locally installed models
- Custom Parameters: Configure temperature, top_p, and other generation parameters
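For reference, here is a sketch of how these parameters map onto Ollama's /api/generate endpoint when called directly; the model name, prompt, and values are only examples:
curl http://localhost:11434/api/generate -d '{
  "model": "llama2:7b",
  "prompt": "Explain retrieval-augmented generation in two sentences.",
  "stream": false,
  "options": {
    "temperature": 0.7,
    "top_p": 0.9
  }
}'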
Best Practices
- Ensure you have sufficient RAM and storage for your chosen models
- Use GPU acceleration when available for better performance
- Keep your Ollama installation updated
- Monitor system resources when running large models (see the command after this list)
- Consider using smaller models for faster inference if speed is critical
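In recent Ollama versions, the ollama ps command is a convenient way to monitor loaded models, showing their memory footprint and whether they are running on CPU or GPU:
ollama ps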
Troubleshooting
Ollama service not starting:
- Check if port 11434 is available (see the check below)
- Ensure you have sufficient system resources
- Verify installation was successful
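On macOS or Linux, you can check whether another process is already bound to the default port:
# lists any process currently listening on port 11434
lsof -i :11434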
Model not found:
- Pull the model first using ollama pull <model-name>
- Check available models with ollama list
Connection issues:
- Verify Ollama is running with ollama serve
- Check firewall settings if using ngrok
- Ensure the API URL is correct in Lamatic
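A simple end-to-end check is to curl the exact API URL you configured in Lamatic (localhost if self-hosting, otherwise your ngrok URL). If this request fails, the issue is with Ollama or the tunnel rather than with Lamatic:
curl http://localhost:11434/api/tags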