Local & Open Source AI · Posted by Suki Watanabe · 3mo ago

Setting Up a Private AI Chat for Your Family

set up a private AI chat instance for my family. no data leaves the house. here's how

9 replies

3mo ago

good call. my M3 Max handles 70B models surprisingly well

2mo ago

lmao I literally ran into this same problem yesterday. quantized models have gotten insanely good. I barely notice the quality drop

2mo ago

how does this compare to a simpler approach? I've been using that and wondering if I should switch

3mo ago

thanks for the detailed response. GGUF basically won the local model format war

3mo ago

wait really? the privacy benefits alone make local worth it for my use case

2mo ago

just tried this and yeah it works. the latency improvement from local is massive for real-time apps

2mo ago

ok so I actually tested this pretty extensively last week and here's what I found: Ollama makes running local models so easy now. for those cases I had to modify the technique a bit. happy to share details if anyone's interested
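
for anyone who wants a starting point, here's a rough sketch of calling a local Ollama server from Python with only the standard library. it assumes Ollama is running on its default port (11434) and that a model has already been pulled. the model name `llama3` and the helper names `build_request` and `ask` are placeholders picked for illustration, not part of anyone's actual setup in this thread.

```python
import json
import urllib.request

# Default address of a local Ollama server; change it if yours differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming generate request (model name is a placeholder)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def ask(prompt: str, model: str = "llama3") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

with the server running, something like `print(ask("suggest a bedtime story topic"))` should return a reply. the request only goes to localhost, which is the whole point of a setup like this.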

2mo ago

yeah this matches my experience too. VRAM is the real bottleneck. you need at least 24 GB for serious work
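
to put rough numbers on that, here's a back-of-the-envelope sketch of how much memory the weights alone take at different precisions. the function name `estimate_weights_gb` is made up for this example, and the result is only a lower bound: real quantized files use a bit more than the nominal bits per weight, and the KV cache and runtime overhead come on top.

```python
def estimate_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes).

    Ignores KV cache, activations, and per-format overhead, so treat the
    result as a floor rather than a real requirement.
    """
    # billions of parameters * bits each / 8 bits per byte = billions of bytes
    return params_billions * bits_per_weight / 8


if __name__ == "__main__":
    for size in (7, 13, 70):
        for bits in (16, 8, 4):
            gb = estimate_weights_gb(size, bits)
            print(f"{size}B model at {bits}-bit: ~{gb:.1f} GB for weights")
```

by this estimate a 70B model at 4-bit needs about 35 GB for weights alone, so it won't fit on a single 24 GB card without offloading, while a 13B model at 4-bit (about 6.5 GB) fits with room to spare.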

2mo ago

ok that makes sense. my M3 Max handles 70B models surprisingly well