Scaling LLMs Won't Get Us to AGI. Here's Why.


Been thinking about whether more training/compute will get us to AGI, or if we need a fundamentally different architecture. I'm convinced it's the latter.

The current transformer architecture is a glorified pattern matcher. It was literally created for language translation. We've scaled it up, added RLHF, made it chat — but at its core, it's still doing statistical pattern matching over sequences.

When Ramanujan came up with his formulas, when Gödel proved incompleteness, when Cantor invented set theory — these weren't in any training distribution. There was no historical precedent to pattern-match against. These required *seeing structure that didn't exist yet*.

LLMs can interpolate brilliantly within their training data. They cannot extrapolate to genuinely novel structures. That's the difference between pattern matching and understanding.

If I ask an LLM for business ideas, it'll suggest things that match my statistical profile — I'm a tech professional, so it'll say SaaS, consulting, AI tools. Plumbing? Probably not on the list.

But I'm a general-purpose agent. I can decide tomorrow to learn plumbing and start a plumbing business. The LLM sees the shadow of who I've been. I have access to the space of who I could become.

LLMs reason over P(outcome | observable profile). Humans reason over possibility space, not probability space. Completely different.
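To make the contrast concrete, here's a toy sketch (my framing, with made-up numbers — not any real model): a profile-conditioned recommender ranks ideas by P(idea | profile), while a general-purpose agent's option set is just everything it could learn to do, independent of its history.

```python
# Hypothetical learned statistics: P(idea | observable profile).
# The probabilities are invented for illustration.
p_idea_given_profile = {
    "tech professional": {
        "SaaS": 0.45,
        "consulting": 0.30,
        "AI tools": 0.20,
        "plumbing": 0.05,
    },
}

def llm_style_suggestions(profile, k=3):
    """Probability space: rank ideas by likelihood under the observed profile."""
    scores = p_idea_given_profile[profile]
    return sorted(scores, key=scores.get, reverse=True)[:k]

def agent_style_options(all_ideas):
    """Possibility space: every idea the agent could choose to pursue."""
    return set(all_ideas)

print(llm_style_suggestions("tech professional"))
# ['SaaS', 'consulting', 'AI tools'] -- plumbing never makes the cut
print("plumbing" in agent_style_options(p_idea_given_profile["tech professional"]))
# True -- it's in the possibility space regardless of its probability
```

The point of the toy: no amount of sharpening the conditional distribution puts low-probability options back on the table, because the ranking step throws them away.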

We need architectures that can:

- Build causal models of the world (not just statistical associations)

- Learn from minimal examples (a kid learns "dog" from 3 examples, not millions)

- Reason about novel structures that don't exist in training data

- Model agency — the ability of entities to change themselves
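The few-shot point can be sketched in a few lines — here as a nearest-prototype classifier (one classic few-shot approach, not a claim about how kids actually do it), with feature vectors I invented for illustration:

```python
# Few-shot learning sketch: three labeled examples per class, then classify
# new inputs by distance to each class's mean ("prototype").

def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(xs) / len(xs) for xs in zip(*vectors)]

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Three examples each, as hypothetical (size, bark-iness) features.
dogs = [[0.6, 0.9], [0.7, 0.8], [0.5, 1.0]]
cats = [[0.3, 0.1], [0.4, 0.0], [0.2, 0.2]]

prototypes = {"dog": mean(dogs), "cat": mean(cats)}

def classify(x):
    """Label a new example by its nearest class prototype."""
    return min(prototypes, key=lambda label: dist2(x, prototypes[label]))

print(classify([0.65, 0.85]))  # dog
```

Three examples per class are enough here because the representation already carries the right structure — which is arguably where the hard part of the problem actually lives.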

Scaling transformers won't get us there. It's like building a really good horse and hoping it becomes a car.

Curious what others think. Am I missing something, or is the current hype around scaling fundamentally misguided?

submitted by /u/objective_think3r