Ollama
Ollama is a powerful tool for running large language models locally on your machine. It provides a simple way to download, run, and manage open-source models like Llama, Mistral, and many others directly on your local hardware.
Provider Slug: ollama
Get Started
Step 1: Install Ollama
- Download and install Ollama for your operating system
- Follow the installation instructions for your platform (macOS, Linux, or Windows) from the Ollama Docs
Step 2: Pull a Model
- Open your terminal or command prompt
- Pull a model using the Ollama CLI:
ollama pull <model-name>
For example, you can pull models such as:
ollama pull mistral
ollama pull codellama
ollama pull llama2:7b
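Once a model is downloaded, you can confirm it is available and try a quick prompt directly from the CLI (llama2:7b below is just the example model pulled above):
# list locally installed models
ollama list
# run a one-off prompt against a local model
ollama run llama2:7b "Explain what a local LLM is in one sentence."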
Step 3: Start Ollama Service
- Start the Ollama service (it runs on port 11434 by default):
ollama serve
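You can confirm the service is reachable by querying the local API; /api/tags returns the models Ollama currently has installed:
# expect a JSON response listing your local models
curl http://localhost:11434/api/tags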
Step 4: Expose Ollama API
If you're not self-hosting the Lamatic Gateway, you'll need to expose your local Ollama API:
- Install a tunneling service like ngrok
- Expose your Ollama API:
ngrok http 11434 --host-header="localhost:11434"
For using Ollama with ngrok, see this useful guide.
- Copy the generated ngrok URL (e.g., https://abc123.ngrok.io)
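Before moving on, it is worth confirming that the tunnel actually reaches Ollama. Replace the example URL below with the URL ngrok generated for you:
# should return the same model list as the localhost endpoint
curl https://abc123.ngrok.io/api/tags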
Setup Instructions
Configure in Lamatic
- Open your Lamatic.ai Studio
- Navigate to Models section
- Select Ollama from the provider list
- Enter your Ollama API URL (HTTPS is required unless you are self-hosting):
- If self-hosting: http://localhost:11434
- If using ngrok: Use the ngrok URL from Step 4 above
- Save your changes
Key Features
- Local Model Execution: Run models directly on your hardware without cloud dependencies
- Wide Model Support: Access hundreds of open-source models from Hugging Face
- Easy Model Management: Simple CLI commands to pull, run, and manage models
- Cost Effective: No API costs - only your local computing resources
- Privacy Focused: All processing happens locally on your machine
Available Models
Ollama supports a wide variety of models. For the full list, check out the Ollama Library.
Configuration Options
- API URL: The endpoint where Ollama is running (default: http://localhost:11434)
- Model Selection: Choose from your locally installed models
- Custom Parameters: Configure temperature, top_p, and other generation parameters
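For reference, here is a sketch of how these parameters map onto Ollama's /api/generate endpoint when called directly; the model name, prompt, and values are only examples:
curl http://localhost:11434/api/generate -d '{
  "model": "llama2:7b",
  "prompt": "Explain retrieval-augmented generation in two sentences.",
  "stream": false,
  "options": {
    "temperature": 0.7,
    "top_p": 0.9
  }
}'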
Best Practices
- Ensure you have sufficient RAM and storage for your chosen models
- Use GPU acceleration when available for better performance
- Keep your Ollama installation updated
- Monitor system resources when running large models (see the command after this list)
- Consider using smaller models for faster inference if speed is critical
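In recent Ollama versions, the ollama ps command is a convenient way to monitor loaded models, showing their memory footprint and whether they are running on CPU or GPU:
ollama ps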
Troubleshooting
Ollama service not starting:
- Check if port 11434 is available (see the check below)
- Ensure you have sufficient system resources
- Verify installation was successful
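On macOS or Linux, you can check whether another process is already bound to the default port:
# lists any process currently listening on port 11434
lsof -i :11434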
Model not found:
- Pull the model first using ollama pull <model-name>
- Check available models with ollama list
Connection issues:
- Verify Ollama is running with ollama serve
- Check firewall settings if using ngrok
- Ensure the API URL is correct in Lamatic
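A simple end-to-end check is to curl the exact API URL you configured in Lamatic (localhost if self-hosting, otherwise your ngrok URL). If this request fails, the issue is with Ollama or the tunnel rather than with Lamatic:
curl http://localhost:11434/api/tags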