
LLM installation

I installed Ollama and Stable Diffusion on the gaming desktop, along with the Open WebUI Docker container to connect to Ollama:

docker run -d -p 3000:8080 --gpus=all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
 
 

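To confirm the container came up (the --add-host flag maps host.docker.internal to the host gateway so the container can reach Ollama running on the host, default port 11434), standard Docker checks work:

docker ps --filter name=open-webui
docker logs -f open-webui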
To run Ollama, launch the exe (or check the system tray in the bottom-right corner to make sure it is already running), then open http://ai.wincometech.com:3000.
To check whether the Nvidia GPU is in use, run ollama ps in cmd.
To check the Nvidia driver and CUDA version, run nvidia-smi.
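Another quick sanity check that the Ollama server itself is up is to hit its local API (default port 11434), which lists the installed models:

curl http://localhost:11434/api/tags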

In Ollama I installed three models: llama3, Alibaba's qwen2, and llava:34b. To run llava:34b, open cmd and run:

ollama run llava:34b
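Since llava is a vision model, an image can be passed by including its file path in the prompt; for example, as a one-shot invocation (the image path below is just a placeholder):

ollama run llava:34b "Describe this image: C:\Users\Jiang\Pictures\example.png"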

To run Stable Diffusion, run C:\Users\Jiang\stable-diffusion-webui\webui-user.bat from cmd on the desktop, then open http://ai.wincometech.com:7860.
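For reference, a minimal webui-user.bat looks roughly like this (assuming the AUTOMATIC1111 stable-diffusion-webui; the --listen flag is what binds the UI to all interfaces so http://ai.wincometech.com:7860 works from other machines instead of only localhost):

@echo off
set PYTHON=
set GIT=
set VENV_DIR=
rem --listen exposes the UI on the network, not just 127.0.0.1
set COMMANDLINE_ARGS=--listen
call webui.bat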

 

On the Ollama Docker instance (created on the vnas server as a Docker stack), running a large model might fail with a 500 error. This is because the Ollama container does not see enough free memory: TrueNAS allocates most of the RAM to the ZFS ARC cache. To fix this, run the following command in the TrueNAS server CLI:

echo 32212254720 | sudo tee /sys/module/zfs/parameters/zfs_arc_sys_free

This tells ZFS to keep at least 30 GiB (32212254720 bytes) of system memory free, shrinking the ARC cache so Ollama can use that memory.
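To verify the setting took effect and watch the ARC size come down (standard OpenZFS paths; note the value resets on reboot, so to make it permanent it can be added as a TrueNAS post-init command):

cat /sys/module/zfs/parameters/zfs_arc_sys_free
grep -E '^(size|c_max)' /proc/spl/kstat/zfs/arcstats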

Just registered on Hugging Face (hf.co), another model repo. Next, try out the new DeepSeek text-to-image model, deepseek-ai/Janus-Pro-7B. It cannot be run on Ollama; instead, I installed the Invoke app on Windows. Invoke can use image models to generate pictures.
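To pull the model weights from Hugging Face, something like this should work (assuming the repo id deepseek-ai/Janus-Pro-7B and logging in with a token from the new account):

pip install -U "huggingface_hub[cli]"
huggingface-cli login
huggingface-cli download deepseek-ai/Janus-Pro-7B --local-dir Janus-Pro-7B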

