Tinybox: Run 120B AI Models Offline for $12K
George Hotz's tiny corp just started shipping the Tinybox — a dedicated AI computer that runs massive models locally. No cloud fees, no rate limits, no one else's servers. Just you and your AI.
This is the hardware indie AI builders have been waiting for.
What Is Tinybox?
Tinybox is a purpose-built machine for deep learning. It's designed to do one thing really well: run neural networks fast. The specs are impressive:
| Spec | Red ($12K) | Green ($65K) |
|---|---|---|
| GPU | 4x AMD 9070XT | 4x RTX PRO 6000 Blackwell |
| GPU RAM | 64 GB | 384 GB |
| FP16 Performance | 778 TFLOPS | 3,086 TFLOPS |
| System RAM | 128 GB | 192 GB |
| Storage | 2 TB NVMe | 4 TB RAID + 1 TB boot |
The red version at $12,000 is the entry point. For comparison, a similarly specced custom build with 4x high-end GPUs would cost significantly more, and you'd have to assemble, cool, and debug it yourself.
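A quick sanity check on the headline claim: a 120B-parameter model only fits in 64 GB of GPU RAM if you quantize it. Back-of-envelope math (weights only; the KV cache and activations add more on top):

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate weight memory in GB: parameter count x bytes per parameter.

    Ignores KV cache, activations, and framework overhead.
    """
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

for bits in (16, 8, 4):
    print(f"120B model @ {bits}-bit: ~{weight_memory_gb(120, bits):.0f} GB")
# 16-bit: ~240 GB, 8-bit: ~120 GB, 4-bit: ~60 GB
```

So at 4-bit quantization the weights squeeze into the red box's 64 GB, while FP16 inference on a model that size is green-box territory.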
Why This Matters for Indie Builders
If you're building AI products, you've probably hit the walls of API-based development:
- Costs spiral — every token, every inference, every training run costs money
- Rate limits — sudden success means sudden API throttling
- Privacy concerns — customer data flows through someone else's servers
- Dependency risk — your product lives or dies by someone else's API
Tinybox flips this model. You own the hardware. You own the models. You control your destiny.
Real sovereignty: Run Llama-3-70B, DeepSeek, or any open model locally. Train, fine-tune, and experiment without watching a meter tick. Your AI stack becomes truly yours.
The tinygrad Advantage
Tinybox runs tinygrad, a neural network framework built by tiny corp. It's absurdly simple — networks break down into just three operation types:
- ElementwiseOps — unary, binary, ternary (SQRT, ADD, MUL, etc.)
- ReduceOps — shrink data (SUM, MAX)
- MovementOps — reshape, permute, expand (copy-free via ShapeTracker)
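Even matrix multiplication decomposes into exactly these three categories. Here's a numpy sketch of the idea (illustrative stand-in, not tinygrad code):

```python
import numpy as np

def matmul_decomposed(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """C[m,n] = sum_k A[m,k] * B[k,n], built from the three op categories."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2
    a3 = a.reshape(m, k, 1)   # MovementOp: reshape (no data copied)
    b3 = b.reshape(1, k, n)   # MovementOp: reshape; expand happens via broadcast
    prod = a3 * b3            # ElementwiseOp: MUL over the (m, k, n) broadcast
    return prod.sum(axis=1)   # ReduceOp: SUM over the shared k axis

a = np.arange(6, dtype=np.float32).reshape(2, 3)
b = np.ones((3, 4), dtype=np.float32)
assert np.allclose(matmul_decomposed(a, b), a @ b)
```

With so few primitives, the compiler has far fewer cases to reason about, which is what makes the optimization story tractable.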
This simplicity enables aggressive optimization. Because all tensors are lazy, tinygrad sees whole chains of operations before executing anything, so it can fuse them and compile a custom kernel for each fused chain instead of writing intermediate results to memory.
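tinygrad's real scheduler is far more sophisticated, but the core idea that laziness enables fusion fits in a few lines of plain Python (a toy sketch, not tinygrad's API):

```python
class LazyArray:
    """Toy lazy tensor: records elementwise ops instead of running them."""

    def __init__(self, data, ops=()):
        self.data, self.ops = data, tuple(ops)  # pending ops, applied in order

    def map(self, fn):
        # No computation happens here: just extend the pending op chain.
        return LazyArray(self.data, self.ops + (fn,))

    def realize(self):
        # One fused pass: each element flows through the whole op chain,
        # so no intermediate arrays are ever materialized.
        out = []
        for x in self.data:
            for fn in self.ops:
                x = fn(x)
            out.append(x)
        return out

x = LazyArray([1.0, 2.0, 3.0])
y = x.map(lambda v: v * 2).map(lambda v: v + 1)  # still nothing computed
print(y.realize())  # [3.0, 5.0, 7.0]
```

An eager framework would allocate a full intermediate array for `v * 2`; the lazy version sees both ops first and runs them as one pass.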
It's already proven in production: comma.ai's openpilot uses tinygrad to run its driving model on the Snapdragon 845 GPU.
What Can You Actually Build?
With 64 to 384 GB of GPU RAM depending on the box, the possibilities expand dramatically:
- Private AI assistants — customer support bots that never see a third party
- Fine-tuned models — customize open-source LLMs on your own data
- Embedding engines — semantic search, RAG systems, recommendation engines
- Image/video generation — Stable Diffusion, video models, all local
- Agent workflows — autonomous coding, research, automation
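The embedding-engine case is the simplest to sketch. Once a locally hosted model turns text into vectors, semantic search is just cosine similarity; the vectors below are toy stand-ins for real embeddings:

```python
import numpy as np

def top_k(query: np.ndarray, docs: np.ndarray, k: int = 2) -> list[int]:
    """Return indices of the k documents most similar to the query vector."""
    q = query / np.linalg.norm(query)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    scores = d @ q                         # cosine similarity per document
    return np.argsort(scores)[::-1][:k].tolist()

docs = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
print(top_k(np.array([1.0, 0.1]), docs))  # [0, 2]: doc 0 closest, then doc 2
```

Swap the toy vectors for embeddings from a model running on the box and the same ranking loop becomes the retrieval half of a fully local RAG system.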
Shipping Now
The factory is up and running. Orders ship within a week of payment. Pickup available in San Diego, plus worldwide shipping. Wire transfer only — they're keeping operations lean.
For indie hackers tired of renting their AI infrastructure, Tinybox offers something rare: ownership. The question isn't whether local AI makes sense. It's whether you're ready to stop paying rent.
Check out tinygrad.org for full specs and ordering. The future of AI isn't in the cloud — it's on your desk.
Build Your AI Stack
I build AI automation tools and agents for indie developers. Check out my products to level up your AI workflow.