I tested local AI on my M1 Mac, expecting magic – and got a reality check instead

As a tech journalist who’s covered artificial intelligence for over a decade, I thought I had a solid grasp on the computational demands of large language models. But nothing prepared me for the reality check I got when I tried running an LLM locally on my trusty three-year-old MacBook Pro.

The Experiment: Can an Old MacBook Handle Modern AI?

My setup was far from cutting-edge: a 2021 MacBook Pro with an M1 chip, 16GB of RAM, and a nearly full terabyte drive running macOS Sonoma. When I bought this machine in early 2023 during a Best Buy closeout sale, it was already becoming yesterday’s model. Yet it had served me admirably for everyday information worker tasks—email, web browsing, video editing, podcast recording—without a single complaint.

But could this venerable machine handle the computational demands of running a large language model locally?

Why Bother with Local LLMs Anyway?

Before diving into my experiment, it’s worth considering why anyone would want to run LLMs locally rather than just using ChatGPT or Perplexity online:

Professional development: In today’s job market, knowing how to download and run models locally makes you more valuable than someone whose AI skills stop at typing prompts into a free ChatGPT window.

Data privacy: With local instances, your sensitive data never leaves your machine—crucial for any information worker handling confidential information.

Cost savings: As AI costs continue to rise in 2026, running models locally means avoiding the constant meter running with OpenAI, Google, and Anthropic services.

Greater control: Local models allow for fine-tuning, integration with tools like LangChain and Codex, and more focused results tailored to your specific needs.

The Setup: Ollama to the Rescue

Inspired by my colleague Jack Wallen’s coverage of Ollama, I downloaded the macOS binary as my gateway to local AI. Ollama has done impressive work integrating with LangChain, Codex, and other AI tools, making it an increasingly central hub for bringing together various aspects of AI.

The start-up screen looks familiar, like ChatGPT with its friendly prompt and model-selection dropdown. But there’s a crucial difference: some models in the list aren’t local at all; they run in the cloud on Ollama’s own infrastructure.
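
You don’t have to stay in the app’s chat window, either. Ollama runs a local server behind the scenes, and there’s an official Python client for it. Here’s a minimal sketch of how a prompt could be sent to a locally downloaded model, assuming the ollama Python package is installed (pip install ollama); the model tag is whatever you’ve already pulled:

    import ollama  # official Python client for the local Ollama server (pip install ollama)

    # Ask a locally hosted model a question. The model must already be
    # downloaded, either through the Ollama app or `ollama pull <name>`.
    response = ollama.chat(
        model="gpt-oss:20b",  # substitute any model you've pulled locally
        messages=[{"role": "user", "content": "What kind of large language model are you?"}],
    )

    # The reply text lives in the message content of the response.
    print(response["message"]["content"])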

First Attempt: GLM-4.7-Flash

I started with glm-4.7-flash, a 30-billion-parameter model from Chinese AI startup Z.ai. At 19GB of disk usage, it’s considered “small” by today’s standards, though still substantial.

The download was reasonably quick on my gigabit cable modem, but the real test came when I asked a simple question: “What kind of large language model are you?”

What followed was excruciating. After 45 minutes, the model was still “thinking” about how to structure its explanation. After an hour and 26 minutes (5,197.3 seconds of processing time), I finally got a response. The answer? Not particularly insightful, and certainly not worth the wait.
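
If you want to put a number on that pain yourself, the wait is easy to clock from a script. A rough sketch, using Python’s standard timer rather than Ollama’s own reported stats (the model tag here is the one I pulled; yours may differ):

    import time
    import ollama  # pip install ollama

    prompt = "What kind of large language model are you?"

    start = time.perf_counter()
    response = ollama.chat(
        model="glm-4.7-flash",  # the 30-billion-parameter model from my first attempt
        messages=[{"role": "user", "content": prompt}],
    )
    elapsed = time.perf_counter() - start

    # On a 16GB M1, a call like this can take thousands of seconds.
    print(f"Answered in {elapsed:,.1f} seconds")
    print(response["message"]["content"])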

Everything on my Mac had become noticeably sluggish during this process. The model seemed trapped in what I can only describe as “prompt creep”—the longer I waited, the more it seemed to be contemplating its own contemplation.

Second Attempt: GPT-OSS 20B

Not ready to give up, I tried OpenAI’s gpt-oss:20b, a 20-billion-parameter model that my colleague had recommended as faster than others he’d tried.

This time, after about six minutes, I got a response: “I am ChatGPT, powered by OpenAI’s GPT-4 family.” The model even provided a nice table of details, though oddly claimed to have “roughly 175 billion parameters” when it should have known it was the 20B variant.

While this was acceptable for a simple prompt, I could already tell that anything more ambitious would be problematic. The waiting time, while not snail-like, was slow enough that I didn’t dare upload my entire trove of articles for analysis.
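
One small mercy if you do experiment with slower models: the Ollama client can stream tokens as they’re generated, so at least you watch the answer appear word by word instead of staring at a blank screen. Another sketch, again assuming the ollama Python package:

    import ollama  # pip install ollama

    # stream=True yields response chunks as the model produces them,
    # so progress is visible even when generation is slow.
    stream = ollama.chat(
        model="gpt-oss:20b",
        messages=[{"role": "user", "content": "Summarize this article in two sentences."}],
        stream=True,
    )

    for chunk in stream:
        print(chunk["message"]["content"], end="", flush=True)
    print()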

The Harsh Reality: We Need More Power

When I checked with ChatGPT itself about minimum requirements for running gpt-oss:20b, the answer was clear: 32GB of RAM is really the minimum configuration needed.

My M1’s Apple silicon includes an integrated GPU, and Ollama runs gpt-oss:20b on that GPU through its llama.cpp backend. Technically, everything should work; practically, 16GB of unified memory (which the CPU and GPU share on Apple silicon) just isn’t enough.
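
The back-of-the-envelope arithmetic shows why. A model’s weights alone take roughly its parameter count times the bytes per parameter at whatever quantization it ships with, and the KV cache, the runtime, and macOS itself all need room on top of that. Here’s a rough sketch of that math; the 4-bit quantization and overhead figures are my own ballpark assumptions, not published requirements:

    # Ballpark memory estimate for running a quantized model locally.
    # The quantization and overhead numbers are assumptions, not official figures.

    def rough_memory_gb(params_billion: float, bits_per_param: float = 4.0,
                        overhead_gb: float = 4.0) -> float:
        """Weights at the given quantization, plus a flat allowance for
        the KV cache, runtime buffers, and the operating system."""
        weights_gb = params_billion * bits_per_param / 8  # 1B params at 8 bits ~= 1GB
        return weights_gb + overhead_gb

    for name, params in [("gpt-oss:20b", 20), ("glm-4.7-flash (30B)", 30)]:
        print(f"{name}: ~{rough_memory_gb(params):.0f}GB needed vs. my 16GB of unified memory")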

This realization hit hard. After three decades of writing about computers, I’m facing the reality that for modern AI workloads, 32GB is becoming the minimum reasonable configuration for an information worker. I’m essentially competing for the same DRAM that cloud vendors are buying up in massive quantities for their data centers.

The Verdict: Time for an Upgrade

My fledgling local Ollama effort didn’t yield success, but it gave me a newfound appreciation for just how memory-intensive AI truly is. I always knew this from years of reporting on AI, but now I feel it in my bones—that sinking feeling when the response to your prompt takes forever to scroll across the screen.

The math is clear: I’ll probably be dipping into the credit card to trade up to a new computer. Apple will give me about $599 for my M1 MacBook as a trade-in, but that barely puts a dent in the cost of a new M4 or M5 MacBook Pro with 32GB of RAM.

This experiment taught me that while local LLMs offer compelling advantages—privacy, cost savings, control—they also demand serious hardware investment. For now, my trusty three-year-old MacBook Pro will have to stick to its traditional information worker tasks, while I save up for the AI-powered future.


Tags: local LLM, MacBook Pro, Ollama, AI hardware requirements, machine learning, macOS, RAM requirements, AI costs, data privacy, professional development, tech upgrade, M1 chip, M4 chip, M5 chip, DRAM memory, large language models, open source AI, tech journalism
