🧠 RAM is king for CPU inference. More = bigger models.
⚡ Apple Silicon uses unified memory — great for local AI.
🖥️ VRAM determines GPU-accelerated model size.
📦 Quantized models (Q4, Q5) need roughly a quarter to a third of the RAM of 16-bit weights.
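To put rough numbers on the tips above, here is a minimal sketch that estimates the size of a model's weights at different precisions. The bits-per-weight figures are approximations in the spirit of llama.cpp-style quants, and the function name `weight_gb` is illustrative; actual memory use is higher once you add the KV cache and runtime overhead.

```python
# Approximate bits per weight for common formats (illustrative values;
# K-quants carry some metadata overhead beyond their nominal bit width).
BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q8_0": 8.5,
    "Q5_K_M": 5.5,
    "Q4_K_M": 4.8,
}

def weight_gb(params_billion: float, fmt: str) -> float:
    """Estimated size of the weights alone, in GiB (no KV cache, no overhead)."""
    total_bits = BITS_PER_WEIGHT[fmt] * params_billion * 1e9
    return total_bits / 8 / 2**30  # bits -> bytes -> GiB

for fmt in BITS_PER_WEIGHT:
    print(f"7B @ {fmt}: ~{weight_gb(7, fmt):.1f} GiB")
```

For a 7B model this works out to roughly 13 GiB at FP16 versus about 4 GiB at Q4, which is why quantized models fit comfortably in unified memory or modest VRAM.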