I was looking for a way to run LLMs locally because GPT API costs were adding up, and I found Ollama. Here's my experience setting it up and using it.
An LLM (large language model), simply put, is a very large AI model that can understand and respond to natural language as if you were having a conversation. In this post I'd like to introduce GPT-OSS, an open-source LLM released by OpenAI, and Ollama, an app that makes it very easy to run.

1. What is Ollama?
Ollama is an application that lets you run open source LLMs with just a few button clicks on Mac, Windows, and Linux. Previously, running an LLM directly required tedious installation processes and complex command inputs, but Ollama makes it as easy as installing an app.
2. What is GPT-OSS?
GPT-OSS is an open source large language model released by OpenAI. There are two main versions:
- gpt-oss-120b → A very large model with 117 billion parameters
- gpt-oss-20b → A relatively lightweight model with 21 billion parameters
What does "parameter" mean? 👉 Simply put, parameters are like the number of brain cells the AI has learned with. Generally, the more there are, the more capable the model is, but it also requires more resources (memory/GPU) to run.
These models use an MoE (Mixture of Experts) architecture, which saves resources by activating only the experts needed for each input, and use MXFP4, a 4-bit floating-point quantization format, to optimize speed and memory efficiency.
- The large model can run with just 1 H100 GPU
- The small model needs only 16GB of memory, so it can run on ordinary PCs or personal servers
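To make the "only call the needed experts" idea concrete, here is a toy Python sketch of top-k gating. This is purely illustrative, not GPT-OSS's actual implementation; the expert functions and gate scores below are made up:

```python
import math

def moe_forward(x, experts, gate_scores, top_k=2):
    """Run only the top_k highest-scoring experts on x and combine
    their outputs, weighted by a softmax over the selected scores."""
    # Pick the indices of the top_k experts by gate score.
    top = sorted(range(len(experts)), key=lambda i: gate_scores[i])[-top_k:]
    # Softmax over just the selected experts' scores.
    exps = [math.exp(gate_scores[i]) for i in top]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Only the selected experts actually run; the rest are skipped,
    # which is where the resource savings come from.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy example: 4 "experts", each a simple function; only 2 run per input.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x / 2]
gate_scores = [0.1, 2.0, 0.5, 1.5]  # experts 1 and 3 score highest
print(moe_forward(5.0, experts, gate_scores, top_k=2))
```

In a real MoE model the "experts" are large feed-forward networks and the gate scores are computed per token, but the routing principle is the same.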
It's also distributed under the Apache 2.0 license, so you can use it freely.
3. Running GPT-OSS with Ollama
The method is simpler than you'd think.
- Install Ollama
- After launching it, select the model → gpt-oss:20b
- Start chatting by typing messages
The first time takes about 20 minutes to download the model, but after that you can use it right away.
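Beyond the app's chat window, Ollama also exposes a local REST API (default port 11434), so you can script the same model from code. Here's a minimal sketch using only Python's standard library; it assumes Ollama is running locally and the model has already been downloaded:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint.
    stream=False asks for the whole answer in a single JSON response."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def ask(model: str, prompt: str) -> str:
    """Send the prompt to the locally running model and return its reply."""
    req = build_request(model, prompt)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Requires Ollama to be running with the model pulled, e.g.:
# print(ask("gpt-oss:20b", "Explain Mixture of Experts in one sentence."))
```

Everything stays on your machine: no API key, no data sent to a remote server.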
My test environment was a MacBook Air M2 (24GB memory), and a response took about 40 seconds. If you want faster speeds, you'll need a high-performance GPU, or you may want to use commercial LLM services alongside it.
4. Pros and Cons of Open Source LLMs
✅ Pros
- No pay-per-use API billing → Cost savings
- Can do additional training and tuning as you want
- Runs in local environment → Lower risk of personal data leakage
- Active community → Continuous improvement and optimization
⚠️ Cons
- Requires some level of technical knowledge for installation and setup
- Need high-performance GPU for fast speeds
- Performance may be lacking compared to commercial LLMs in some situations
In other words, it's important to choose between commercial LLMs and open source LLMs appropriately based on your purpose and environment.
5. Open Source LLMs in Emergency Situations
Here's the most important point!
In situations where the internet is cut off—for example, natural disasters, war zones, remote exploration—even online searches are impossible. In such cases, locally-runnable open source LLMs can be a substitute for Google search.
- Medical emergencies
- Emergency evacuation guidelines during disasters
- Technical problem-solving in locations with no communications
All become AI assistants that can help immediately without the internet.
Conclusion
The combination of GPT-OSS and Ollama is not just a technical toy, but an open source innovation that can be directly connected to real life and survival. We are now entering an era of freely running AI in our own hands, beyond the age of using AI only as a 'subscription service'.
