Setting Up a Private AI Chat for Your Family
set up a private AI chat instance for my family. no data leaves the house. here's how
9 Replies
thanks for the detailed response. GGUF format basically won the local model format war
wait really? the privacy benefits alone make local worth it for my use case
just tried this and yeah it works. the latency improvement from local is massive for real-time apps
ok so I actually tested this pretty extensively last week and here's what I found - Ollama makes running local models so easy now. for those cases I had to modify the technique a bit. happy to share details if anyone's interested
yeah this matches my experience too. VRAM is the real bottleneck. you need at least 24GB for serious work
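for anyone wondering where numbers like that come from, here's a rough back-of-envelope sketch. it assumes memory is dominated by the weights (parameter count times bits per weight) and ignores KV cache, context length, and runtime overhead, so treat the results as a floor rather than a real requirement. the function name and the model sizes in the loop are just illustrative, and real 4-bit quants usually average a bit more than 4 bits per weight.

```python
def estimate_weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed for model weights only, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit width. KV cache,
    activations, and runtime overhead are NOT included.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Illustrative model sizes and bit widths, not a list of specific releases.
    for params in (7, 13, 70):
        for bits in (16, 8, 4):
            gb = estimate_weight_memory_gb(params, bits)
            print(f"{params}B at {bits}-bit: ~{gb:.1f} GB for weights alone")
```

by this estimate a 70B model at 4-bit needs about 35 GB just for weights, which is over a single 24GB card but within reach of a machine with a large pool of unified memory.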
ok that makes sense. my M3 Max handles 70B models surprisingly well
good call. my M3 Max handles 70B models surprisingly well
lmao I literally ran into this same problem yesterday. quantized models have gotten insanely good. barely notice the quality drop
how does this compare to a simpler approach? I've been using that and wondering if I should switch