
LLM installation

I installed Ollama and Stable Diffusion on the gaming desktop, along with the Open WebUI Docker container to connect to Ollama:

docker run -d -p 3000:8080 --gpus=all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
 
 

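To confirm the container came up (the --add-host flag maps host.docker.internal to the host gateway so the container can reach Ollama running on the host, default port 11434), standard Docker checks work:

docker ps --filter name=open-webui
docker logs -f open-webui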
To run Ollama, launch the exe (or check the system tray in the bottom-right corner to make sure it is already running), then open http://ai.wincometech.com:3000.
To check whether the Nvidia GPU is in use, run ollama ps in cmd.
To check the Nvidia driver and CUDA version, run nvidia-smi.
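Another quick sanity check that the Ollama server itself is up is to hit its local API (default port 11434), which lists the installed models:

curl http://localhost:11434/api/tags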

In Ollama I installed three models: llama3, Alibaba's qwen2, and llava:34b. To run llava:34b, open cmd and run:

ollama run llava:34b
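Since llava is a vision model, an image can be passed by including its file path in the prompt; for example, as a one-shot invocation (the image path below is just a placeholder):

ollama run llava:34b "Describe this image: C:\Users\Jiang\Pictures\example.png"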

To run Stable Diffusion, run C:\Users\Jiang\stable-diffusion-webui\webui-user.bat from cmd on the desktop, then open http://ai.wincometech.com:7860.
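For reference, a minimal webui-user.bat looks roughly like this (assuming the AUTOMATIC1111 stable-diffusion-webui; the --listen flag is what binds the UI to all interfaces so http://ai.wincometech.com:7860 works from other machines instead of only localhost):

@echo off
set PYTHON=
set GIT=
set VENV_DIR=
rem --listen exposes the UI on the network, not just 127.0.0.1
set COMMANDLINE_ARGS=--listen
call webui.bat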

 

On the Ollama Docker instance (created on the vnas server as a Docker stack), running a large model might fail with a 500 error. This is because the Ollama container does not see enough free memory: TrueNAS allocates most of the RAM to the ZFS ARC cache. To fix this, run the following command in the TrueNAS server CLI:

echo 32212254720 | sudo tee /sys/module/zfs/parameters/zfs_arc_sys_free

This tells ZFS to keep at least 30 GiB (32212254720 bytes) of system memory free, shrinking the ARC cache so Ollama can use that memory.
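To verify the setting took effect and watch the ARC size come down (standard OpenZFS paths; note the value resets on reboot, so to make it permanent it can be added as a TrueNAS post-init command):

cat /sys/module/zfs/parameters/zfs_arc_sys_free
grep -E '^(size|c_max)' /proc/spl/kstat/zfs/arcstats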

Just registered on Hugging Face (hf.co), another model repo. Next, try out the new DeepSeek text-to-image model, deepseek-ai/Janus-Pro-7B. It cannot be run on Ollama; instead, I installed the Invoke app on Windows. Invoke can use image models to generate pictures.
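To pull the model weights from Hugging Face, something like this should work (assuming the repo id deepseek-ai/Janus-Pro-7B and logging in with a token from the new account):

pip install -U "huggingface_hub[cli]"
huggingface-cli login
huggingface-cli download deepseek-ai/Janus-Pro-7B --local-dir Janus-Pro-7B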

