Tinybox: Run 120B AI Models Offline for $12K
George Hotz's tiny corp just started shipping the Tinybox — a dedicated AI computer that runs massive models locally. No cloud fees, no rate limits, no one else's servers. Just you and your AI.
This is the hardware indie AI builders have been waiting for.
What Is Tinybox?
Tinybox is a purpose-built machine for deep learning. It's designed to do one thing really well: run neural networks fast. The specs are impressive:
| Spec | Red ($12K) | Green ($65K) |
|---|---|---|
| GPU | 4x AMD 9070XT | 4x RTX PRO 6000 Blackwell |
| GPU RAM | 64 GB | 384 GB |
| FP16 Performance | 778 TFLOPS | 3,086 TFLOPS |
| System RAM | 128 GB | 192 GB |
| Storage | 2 TB NVMe | 4 TB RAID + 1 TB boot |
The red version at $12,000 is the entry point. For comparison, a similarly specced custom build with 4x high-end GPUs would cost significantly more, and you'd have to assemble, cool, and debug it yourself.
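A quick sanity check on the headline claim: a 120B-parameter model only fits in 64 GB of GPU RAM if you quantize it. Back-of-envelope math (weights only; the KV cache and activations add more on top):

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate weight memory in GB: parameter count x bytes per parameter.

    Ignores KV cache, activations, and framework overhead.
    """
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

for bits in (16, 8, 4):
    print(f"120B model @ {bits}-bit: ~{weight_memory_gb(120, bits):.0f} GB")
# 16-bit: ~240 GB, 8-bit: ~120 GB, 4-bit: ~60 GB
```

So at 4-bit quantization the weights squeeze into the red box's 64 GB, while FP16 inference on a model that size is green-box territory.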
Why This Matters for Indie Builders
If you're building AI products, you've probably hit the walls of API-based development:
- Costs spiral — every token, every inference, every training run costs money
- Rate limits — sudden success means sudden API throttling
- Privacy concerns — customer data flows through someone else's servers
- Dependency risk — your product lives or dies by someone else's API
Tinybox flips this model. You own the hardware. You own the models. You control your destiny.
Real sovereignty: Run Llama-3-70B, DeepSeek, or any open model locally. Train, fine-tune, and experiment without watching a meter tick. Your AI stack becomes truly yours.
The tinygrad Advantage
Tinybox runs tinygrad, a neural network framework built by tiny corp. It's absurdly simple — networks break down into just three operation types:
- ElementwiseOps — unary, binary, ternary (SQRT, ADD, MUL, etc.)
- ReduceOps — shrink data (SUM, MAX)
- MovementOps — reshape, permute, expand (copy-free via ShapeTracker)
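Even matrix multiplication decomposes into exactly these three categories. Here's a numpy sketch of the idea (illustrative stand-in, not tinygrad code):

```python
import numpy as np

def matmul_decomposed(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """C[m,n] = sum_k A[m,k] * B[k,n], built from the three op categories."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2
    a3 = a.reshape(m, k, 1)   # MovementOp: reshape (no data copied)
    b3 = b.reshape(1, k, n)   # MovementOp: reshape; expand happens via broadcast
    prod = a3 * b3            # ElementwiseOp: MUL over the (m, k, n) broadcast
    return prod.sum(axis=1)   # ReduceOp: SUM over the shared k axis

a = np.arange(6, dtype=np.float32).reshape(2, 3)
b = np.ones((3, 4), dtype=np.float32)
assert np.allclose(matmul_decomposed(a, b), a @ b)
```

With so few primitives, the compiler has far fewer cases to reason about, which is what makes the optimization story tractable.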
This simplicity enables aggressive optimization. Because all tensors are lazy, tinygrad sees whole chains of operations before executing anything, so it can fuse them and compile a custom kernel for each fused chain instead of writing intermediate results to memory.
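tinygrad's real scheduler is far more sophisticated, but the core idea that laziness enables fusion fits in a few lines of plain Python (a toy sketch, not tinygrad's API):

```python
class LazyArray:
    """Toy lazy tensor: records elementwise ops instead of running them."""

    def __init__(self, data, ops=()):
        self.data, self.ops = data, tuple(ops)  # pending ops, applied in order

    def map(self, fn):
        # No computation happens here: just extend the pending op chain.
        return LazyArray(self.data, self.ops + (fn,))

    def realize(self):
        # One fused pass: each element flows through the whole op chain,
        # so no intermediate arrays are ever materialized.
        out = []
        for x in self.data:
            for fn in self.ops:
                x = fn(x)
            out.append(x)
        return out

x = LazyArray([1.0, 2.0, 3.0])
y = x.map(lambda v: v * 2).map(lambda v: v + 1)  # still nothing computed
print(y.realize())  # [3.0, 5.0, 7.0]
```

An eager framework would allocate a full intermediate array for `v * 2`; the lazy version sees both ops first and runs them as one pass.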
It's already proven in production: comma.ai's openpilot uses tinygrad to run its driving model on the Snapdragon 845 GPU.
What Can You Actually Build?
With 64 to 384 GB of GPU RAM depending on the box, the possibilities expand dramatically:
- Private AI assistants — customer support bots that never see a third party
- Fine-tuned models — customize open-source LLMs on your own data
- Embedding engines — semantic search, RAG systems, recommendation engines
- Image/video generation — Stable Diffusion, video models, all local
- Agent workflows — autonomous coding, research, automation
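The embedding-engine case is the simplest to sketch. Once a locally hosted model turns text into vectors, semantic search is just cosine similarity; the vectors below are toy stand-ins for real embeddings:

```python
import numpy as np

def top_k(query: np.ndarray, docs: np.ndarray, k: int = 2) -> list[int]:
    """Return indices of the k documents most similar to the query vector."""
    q = query / np.linalg.norm(query)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    scores = d @ q                         # cosine similarity per document
    return np.argsort(scores)[::-1][:k].tolist()

docs = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
print(top_k(np.array([1.0, 0.1]), docs))  # [0, 2]: doc 0 closest, then doc 2
```

Swap the toy vectors for embeddings from a model running on the box and the same ranking loop becomes the retrieval half of a fully local RAG system.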
Shipping Now
The factory is up and running. Orders ship within a week of payment. Pickup available in San Diego, plus worldwide shipping. Wire transfer only — they're keeping operations lean.
For indie hackers tired of renting their AI infrastructure, Tinybox offers something rare: ownership. The question isn't whether local AI makes sense. It's whether you're ready to stop paying rent.
Check out tinygrad.org for full specs and ordering. The future of AI isn't in the cloud — it's on your desk.
Build Your AI Stack
I build AI automation tools and agents for indie developers. Check out my products to level up your AI workflow.