Fixing Ollama hangs: NVIDIA persistence mode
Ollama was randomly hanging during inference. The GPU would go idle mid-conversation and requests would time out.
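The title points at NVIDIA persistence mode, so a first diagnostic step is to see whether it is enabled on each GPU. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH and that it prints `Enabled` or `Disabled` for the `persistence_mode` query field. The helper names `parse_persistence` and `query_persistence` are hypothetical and used only for illustration.

```python
import subprocess


def parse_persistence(csv_output: str) -> dict[int, bool]:
    """Parse 'index, persistence_mode' CSV lines into {gpu_index: enabled}."""
    modes: dict[int, bool] = {}
    for line in csv_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        index, mode = (field.strip() for field in line.split(",", 1))
        # Assumption: nvidia-smi reports the mode as "Enabled" or "Disabled".
        modes[int(index)] = mode == "Enabled"
    return modes


def query_persistence() -> dict[int, bool]:
    """Ask nvidia-smi for the persistence mode of every GPU it can see."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,persistence_mode",
            "--format=csv,noheader",
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_persistence(result.stdout)
```

Calling `query_persistence()` on the affected machine returns a mapping such as GPU index to `True`/`False`. If a GPU reports `False`, persistence mode can be turned on with `sudo nvidia-smi -pm 1`. That setting does not survive a reboot on its own, so a setup that needs it permanently would likely rely on the `nvidia-persistenced` daemon or a boot-time service.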
This section contains lab notebook entries: experiments, observations, and work-in-progress findings.