Local & Open Source AI · Posted by Suki Watanabe ·

My Ollama Setup Runs 7 Models Simultaneously on \$2000 Hardware

-3

built a dedicated inference box for under 2k. runs 7 different models concurrently. full parts list and config

the main thing ive noticed is my m3 max handles 70b models surprisingly well

am i the only one who thinks this?

5 replies

5 Replies

3

just tried this and yeah it works. the privacy benefits alone make local worth it for my use case

6

wait really? i run everything through lm studio now, way simpler than command line

11

lmao i literally ran into this same problem yesterday. ollama makes running local models so easy now

2

can confirm this works. vram is the real bottleneck. you need at least 24gb for serious work

4

I actually wrote something about this a few weeks ago. the key insight for me was ollama makes running local models so easy now. changed how i think about the whole thing