Recently, AIM reviewed the best tools to run large language models (LLMs) locally on a computer, and Ollama stood out as the most efficient solution, offering unmatched flexibility. Ollama, an open-source tool developed by Jeffrey Morgan, is revolutionising how enthusiasts run LLMs on their local terminals.
With its user-friendly interface and compatibility with popular models like Llama 2 and Mistral, Ollama is an easy choice for anyone who wants to experiment with LLMs securely, cost-effectively, and efficiently. It lets users harness the power of advanced AI models without relying on cloud services or expensive hardware.
AIM tested Ollama by running multiple LLMs across multiple operating systems, including Linux (Pop!_OS), macOS, and Windows, to give readers a comprehensive overview of this utility.
Ollama: A Blazing-fast Tool to Run LLMs Locally
While it stood out as the fastest way to run LLMs locally from a terminal, those who are not comfortable with the command line can also use a GUI, though that requires some additional steps that most casual users would rather avoid.
Ollama offers a wide range of LLMs directly from its library, each of which can be downloaded with a single command and then run with another. This is quite helpful for users whose workload revolves around a terminal window: if they get stuck somewhere, they can get an answer without switching to a browser window.
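For context, the whole cycle boils down to something like this (llama2 below is just an example tag from the Ollama library; any model it lists works the same way):

# Download a model from the Ollama library
ollama pull llama2
# Start an interactive chat session with the downloaded model
ollama run llama2
# Or pass a one-off prompt straight from the terminal
ollama run llama2 "Summarise what a context window is."

Interactive sessions can be exited with /bye.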
While using Ollama on Linux, we found a rather surprising approach taken by its developers. When you execute the installation script, it takes care of all the GPU drivers by itself, with no extra steps required.
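At the time of testing, that meant the entire Linux setup came down to the single command below, fetched from Ollama's website (check the official site for the current URL before piping anything into a shell):

# Official one-line installer for Linux; it detects the GPU and sets up the drivers on its own
curl -fsSL https://ollama.com/install.sh | sh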
On the other hand, we found macOS's installer to be the most refined of the three platforms, as it was easy to navigate and use. Sure, you eventually have to switch to the terminal, but the way it hands you over to the terminal is smooth.
However, if you wish to use Ollama with a GUI, you can pair it with Open WebUI. The catch is that it has to be installed through a Docker container, which can be troublesome for users unfamiliar with containerisation.
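For those who do go down the Docker route, the Open WebUI project documents a container launch roughly like the one below (the image tag and flags come from its README at the time of writing and may change):

# Run Open WebUI in Docker and point it at the Ollama instance on the host machine
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
# The web interface should then be reachable at http://localhost:3000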
AIM noted that Ollama's only downside was that it did not provide official documentation on how to use LLMs that had already been downloaded.
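That said, the CLI itself does include a few subcommands for working with models that are already on disk; the ones below are those we are aware of, and ollama --help lists whatever your installed version supports:

# List the models already downloaded to the local machine
ollama list
# Re-run a previously downloaded model without fetching it again
ollama run llama2
# Remove a downloaded model to free up disk space
ollama rm llama2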
Model Matters the Most
Using a minimal utility like Ollama to run LLMs locally can certainly give you a slight edge over other tools, but in the end, the model you use matters the most.
For example, we ran Llama 2 and Llama 3 side by side on Ollama, and we were surprised by the results.
In our tests, Llama 2 started answering our question within a few seconds, whereas Llama 3 took longer but gave compelling, detailed answers that matched the given criteria. We tested Llama 2 on multiple utilities, and Ollama in the terminal gave the fastest results, which is why we concluded that Ollama is at its fastest when used from the terminal.
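A rough way to reproduce this kind of side-by-side comparison is to time the same prompt against both models from the terminal (the model tags and the prompt here are just placeholders):

# Time how long each model takes to answer the same one-off prompt
time ollama run llama2 "Explain retrieval-augmented generation in three sentences."
time ollama run llama3 "Explain retrieval-augmented generation in three sentences."

Keep in mind that the first run of each model also includes the time taken to load it into memory.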
Also, if you try to load a model larger than your hardware can handle, Ollama, like any other tool, will not be able to load it. The sad part is that it does not even inform you that it has stopped loading the model.
The only way to tackle this problem is to keep an eye on system resources. A sudden drop in resource consumption indicates that Ollama has failed to load the model, at which point you can stop the process yourself (otherwise, it will show the loading animation endlessly).
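On Linux, a simple way to watch for that drop is to keep a resource monitor running in a second terminal while the model loads (the second command assumes an NVIDIA GPU; skip it otherwise):

# Refresh RAM usage every second
watch -n 1 free -h
# Refresh GPU memory usage every second (NVIDIA GPUs only)
watch -n 1 nvidia-smi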
All in all, we found Ollama flexible and fast. It lets you use a GUI if you want one and gives you blazing-fast responses when used in a terminal, making it a win-win. However, if you want something simpler, we'll soon introduce you to Jan, a tool that lets you run LLMs locally with the least effort possible.