Thursday, April 17, 2025

Unleashing Local LLMs with Python and Ollama

The landscape of Large Language Models (LLMs) is evolving rapidly, and the ability to run them locally opens up exciting possibilities for development, experimentation, and privacy-conscious applications. Ollama simplifies this process, letting you download, run, and manage LLMs on your local machine with minimal effort. This post walks through calling a locally running Ollama instance from Python.

Prerequisites

Before diving in, ensure you have the following set up:

  1. Ollama Installed: Download and install Ollama from the official website (https://ollama.ai/download). Follow the installation instructions specific to your operating system.
  2. Ollama Running: Once installed, make sure the Ollama server is running in the background (a quick Python connectivity check is sketched just after this list).
  3. Desired LLM Installed: Use the ollama run command in your terminal to download and run the LLM you intend to interact with. For example, to download and run the qwen2.5-coder:1.5b model, execute:

Bash

ollama run qwen2.5-coder:1.5b

Ollama will automatically download the model if it’s not already present. You can verify the installed models using the ollama list command:

Bash

C:\Users\Administrator>ollama list
NAME                  ID              SIZE      MODIFIED
qwen2.5-coder:1.5b    6d3abb8d2d53    986 MB    6 days ago
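
With the prerequisites in place, you can optionally confirm from Python that the Ollama server is reachable before writing any model code. Below is a minimal sketch using the third-party requests library and the default address http://localhost:11434; the server’s root endpoint normally answers with a short plain-text status when it is up.

Python

import requests  # install with: pip install requests

OLLAMA_URL = 'http://localhost:11434'  # adjust if your server uses a different host or port

try:
    reply = requests.get(OLLAMA_URL, timeout=5)
    # The root endpoint typically responds with a short status line such as "Ollama is running".
    print(reply.status_code, reply.text)
except requests.exceptions.ConnectionError:
    print('Could not reach the Ollama server - is it running?')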

Python Interaction with Ollama

Now, let’s write a simple Python script to interact with our locally running LLM. We’ll leverage the ollama Python library, which provides a convenient interface for communicating with the Ollama API.

Python

from ollama import Client
 
client = Client(host='http://localhost:11434')  # Replace with your Ollama host and port if different
 
response = client.chat(
    model='qwen2.5-coder:1.5b',
    messages=[{'role': 'user', 'content': 'What is the meaning of life?'}],
)
print(response['message']['content'])

Explanation:

  1. from ollama import Client: This line imports the Client class from the ollama library.
  2. client = Client(host='http://localhost:11434'): This creates an instance of the Client, specifying the host and port where your Ollama server is running. The default address is http://localhost:11434. Adjust this if you have configured Ollama to run on a different address or port.
  3. response = client.chat(...): This is the core of the interaction. The client.chat() method sends a chat request to the specified model.
    • model='qwen2.5-coder:1.5b': This parameter specifies the name of the LLM you want to use. Ensure this matches the name listed in your ollama list output.
    • messages=[{'role': 'user', 'content': 'What is the meaning of life?'}]: This is a list of message dictionaries, where each dictionary represents one turn in the conversation. Here, we send a single user message asking “What is the meaning of life?”. The 'role' is typically 'user' or 'assistant' (with 'system' available for instructions that steer the model); a multi-turn sketch follows this list.
  4. print(response['message']['content']): The client.chat() method returns a response object. The actual content generated by the LLM is typically found within the 'message' dictionary under the 'content' key. This line prints the LLM’s response to the console.
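
Because messages is just a list of turns, carrying on a conversation only requires appending each reply to the list before the next call. The sketch below reuses the same client.chat() call described above; the prompts are placeholder examples.

Python

from ollama import Client

client = Client(host='http://localhost:11434')

# Start the conversation with a single user turn.
history = [{'role': 'user', 'content': 'Write a Python function that reverses a string.'}]
first = client.chat(model='qwen2.5-coder:1.5b', messages=history)

# Record the model's answer so it becomes part of the context.
history.append({'role': 'assistant', 'content': first['message']['content']})

# The follow-up question only works because the earlier turns are included.
history.append({'role': 'user', 'content': 'Now add type hints to that function.'})
second = client.chat(model='qwen2.5-coder:1.5b', messages=history)
print(second['message']['content'])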

Running the Python Script

Save the Python code to a file (e.g., ollama_interaction.py) and execute it from your terminal. Make sure your Ollama server is running in a separate terminal window.

Bash

(env) C:\vscode-python-workspace>python ollama_interaction.py

You should see output similar to the following:

As an AI language model, I cannot provide personal opinions or beliefs. However, from a philosophical perspective, there are different interpretations of what the meaning of life may be for individuals. Some people view it as a journey that has no end, while others see it as a purposeful pursuit of knowledge and understanding. Ultimately, the meaning of life is an individual choice that can vary greatly from person to person.
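
For longer generations you may prefer to print the response as it is produced rather than waiting for the complete reply. The ollama library supports this through a stream=True argument to client.chat(), which returns an iterator of partial chunks; the sketch below assumes each chunk carries its text under ['message']['content'] (check the library documentation for the exact format).

Python

from ollama import Client

client = Client(host='http://localhost:11434')

# stream=True yields partial responses instead of a single final dictionary.
stream = client.chat(
    model='qwen2.5-coder:1.5b',
    messages=[{'role': 'user', 'content': 'Explain Python list comprehensions in one paragraph.'}],
    stream=True,
)

for chunk in stream:
    # Each chunk contains the next piece of generated text.
    print(chunk['message']['content'], end='', flush=True)
print()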

Conclusion

This simple example demonstrates how easily you can integrate local LLMs powered by Ollama into your Python projects. The ollama library provides a straightforward way to send requests and receive responses, opening the door to a wide range of applications, from local chatbots and code assistants to text generation and analysis tools, all running directly on your machine. Explore the ollama library documentation for more advanced features and functionalities.
