
feat: Update docs/ai.md to include Ollama Quadlet instructions #93

Open · jroddev wants to merge 8 commits into main
Conversation

@jroddev commented Jan 29, 2025

WIP: use podman-compose instead of docker compose in the Ollama API docs.

Needs someone else to verify before these updates are merged; I have only tested on my Aurora install with an Nvidia GPU.

Nvidia GPU passthrough seems to need sudo until this bug is fixed: containers/podman#19338.
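For context, a minimal sketch of the compose file under discussion, assembled from the fragments reviewed below (the image tag and host directory name are assumptions):

# compose.yaml (sketch): Ollama under podman-compose, with an SELinux-relabeled
# volume (:z) and the Nvidia GPU exposed via CDI. Run with: podman-compose up -d
services:
  ollama:
    image: docker.io/ollama/ollama:latest
    ports:
      - 11434:11434
    volumes:
      - ./ollama_v:/root/.ollama:z
    devices:
      - nvidia.com/gpu=all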

Make some changes required to use podman instead of docker
docs/ai.md Outdated
ports:
- 11434:11434
volumes:
- ./ollama_v:/root/.ollama:z
@jroddev (Author):

Permission denied in podman without :z

docs/ai.md Outdated
volumes:
- ./ollama_v:/root/.ollama:z
devices:
- nvidia.com/gpu=all
@jroddev (Author):

This might work with docker-compose as well. If so, we can probably unify the two blocks (:z worked with docker-compose).

@jroddev (Author):

Does not work for docker:

Error response from daemon: could not select device driver "cdi" with capabilities: []

@castrojo (Member)

We could also just put a quadlet in there and tell people to paste it into the right file? I'm thinking, if we're going podman, we should go full podman/quadlet, which is what they prefer.

We should also leave the docker example there too: if someone needs to get something done and they need ollama, they shouldn't have to learn podman at the same time, so offering both feels great. What do you think?

@jroddev (Author) commented Jan 30, 2025

@castrojo I think that makes sense. I suspect it will still need to run as root, but I'll give it a try and report back.
Also, the LMStudio AppImage worked out of the box.

split podman from the docker section, also add a quadlet version
docs/ai.md Outdated
ContainerName=ollama
AutoUpdate=yes
PublishPort=11434:11434
Volume=./ollama_v:/root/.ollama:z
@jroddev (Author):

Not sure what the volume path should be in the quadlet

@jroddev (Author) commented Jan 30, 2025

OK, so the quadlet kind of works (see the command sketch after this list):

  • Starts fine under --user with ~/.local/share/systemd/ollama.container
  • Nvidia GPU is working (without sudo or --system)
  • I still needed to brew install ollama on the host to get the CLI frontend (or you could exec into the container)
  • ollama is picking up existing models from my system (I think from ~/.ollama). I'm not sure how, since that directory isn't mounted into the container. Maybe the frontend is doing it?
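A minimal sketch of the corresponding commands, assuming the unit file is picked up by the systemd user session and is named ollama.container (quadlet derives the service name from the file name):

# Regenerate user units from quadlet files, start the service, then tail its logs
systemctl --user daemon-reload
systemctl --user start ollama.service
journalctl --user -u ollama.service -f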

@sykoram commented Feb 24, 2025

Yesterday I created a guide for running Ollama with podman. It was suggested that I open a PR against the docs, and that is how I found this PR, which is trying to achieve basically the same thing.

Feel free to use any information from the guide.

Looking at this PR, maybe only the third step (creating the Quadlet unit) is sufficient, in which case the setup would be really easy. However, my Quadlet unit and the one in this PR differ (e.g. in how the Nvidia devices are added).

I'd be happy to help with this PR!

@sykoram commented Feb 24, 2025

Also, here is how the Quadlet unit was created when running the (now removed) ujust ollama.

Maybe we can add instructions for AMD GPUs too.

@jroddev (Author) commented Feb 26, 2025

@sykoram thanks for sharing.
My quadlet was blocked on this podman issue, containers/podman#19338, whose fix has now been merged, but I haven't had a chance to test it.

Though that might be irrelevant: as you pointed out, the ujust quadlet was using AddDevice=nvidia.com/gpu=all, and I believe that worked fine.

@jroddev (Author) commented Feb 26, 2025

The fix either didn't work, or I haven't gotten the version with the patch yet.
Either way, I think this page has enough options that we don't need the podman-compose version, especially while it requires sudo for Nvidia GPU support.

This PR now only adds the Quadlet section because I think that could still be a nice way of doing it.

@sykoram I took your suggestion and switched to using AddDevice=nvidia.com/gpu=all. I think it reads nicer than Device=/dev/nvidia*:ro. Feel free to make other suggestions.

@castrojo What are your thoughts on this PR? The docs have been updated since I created it and now have information about Podman Desktop and Ramalama, which look to be the preferred options. I've also switched to mostly using LMStudio myself.
Happy for this to be merged or closed.

@jroddev jroddev changed the title Draft: Ollama API with Podman Compose Update docs/ai.md to include Ollama Quadlet instructions Feb 26, 2025
@jroddev jroddev changed the title Update docs/ai.md to include Ollama Quadlet instructions feat: Update docs/ai.md to include Ollama Quadlet instructions Feb 26, 2025
@castrojo (Member)

I'll look tomorrow, but I think we should keep the ollama sections; lots of people still use it.

Over the past week I've been playing a lot with ramalama, and it solves lots of problems for us. The CLI tool still consumes ollama models, and I love that the built-in systemd integration is just a flag: the quadlet part gets totally automated, so I think it's a great default.

There's so much open source AI tooling, though, that having a few options in the docs for folks makes sense. I'll take a look and merge appropriately!
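As a rough sketch of that flag-driven workflow (the exact flag spelling and model name here are assumptions; check ramalama's own docs):

# Ask ramalama to emit a quadlet unit for serving a model, rather than
# running the container directly (flag and model name assumed)
ramalama serve --generate quadlet llama3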

@sykoram commented Feb 26, 2025

Thanks! Looks like we are converging to a nice solution.

Just a few remarks:

File location

According to the Quadlet guide in the Bazzite docs, the recommended location for the Quadlet unit file would be ~/.config/containers/systemd/ollama.container. This is also the location that the ujust script used.

Volume

They also note there to

Use absolute path for volume, e.g. /home/username/minecraft/data.

The ujust script uses the %h placeholder for the user's home directory, but the rest of this line also differs from our version:

Volume=%h/.ollama:/.ollama

vs ours

Volume=ollama:/root/.ollama:z

Unfortunately, I have no idea which is the right way.

Use the ujust version?

There are also many more settings in the ujust script. Maybe some of them are necessary, maybe some are just nice to have (I had to use SecurityLabelDisable=true, for example):

[Service]
...
# Ensure there's a userland podman.sock
ExecStartPre=/bin/systemctl --user enable podman.socket
# Ensure that the dir exists
ExecStartPre=-mkdir -p %h/.ollama

[Container]
...
RemapUsers=keep-id
RunInit=yes
NoNewPrivileges=no
PodmanArgs=--userns=keep-id
PodmanArgs=--group-add=keep-groups
PodmanArgs=--ulimit=host
PodmanArgs=--security-opt=label=disable
PodmanArgs=--cgroupns=host

Maybe it would be easiest to just use the version from the ujust script?

AMD GPU

Looks like the setup for AMD GPU is also easy. It just uses

AddDevice=/dev/dri
AddDevice=/dev/kfd

instead of

AddDevice=nvidia.com/gpu=all

I think this would be nice to note in the wiki.
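Pulling these remarks together, one possible consolidated unit could look like the sketch below; the image tag, the [Service]/[Install] sections, and keeping both GPU variants are assumptions on my part, not the exact file from this PR:

# ~/.config/containers/systemd/ollama.container (sketch)
[Container]
ContainerName=ollama
Image=docker.io/ollama/ollama:latest
AutoUpdate=registry
PublishPort=11434:11434
Volume=%h/.ollama:/root/.ollama:z
# Nvidia via CDI:
AddDevice=nvidia.com/gpu=all
# AMD alternative:
# AddDevice=/dev/dri
# AddDevice=/dev/kfd

[Service]
Restart=always

[Install]
WantedBy=default.target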

@jroddev (Author) commented Mar 3, 2025

Great suggestions. I have applied the changes and tested on Aurora with an Nvidia GPU; I don't have an AMD GPU to test with.

Having the AMD devices listed in my .container file didn't cause any issues, though:

AddDevice=/dev/dri
AddDevice=/dev/kfd
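If anyone wants to retest after these changes, a quick sketch (the service name comes from the .container file name; /api/tags is a standard Ollama endpoint that lists models):

systemctl --user daemon-reload
systemctl --user restart ollama.service
curl http://localhost:11434/api/tags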


docs/ai.md
To do so, first configure docker to use the nvidia drivers (that come preinstalled with Bluefin) with:
### Quadlet (recommended)
@jroddev (Author):

Expected: 1; Actual: 0; Below

I don't know what Codacy wants here.


Found a link to the Codacy website for this issue, but there isn't much information, and the suggested fix looks exactly the same.

@sykoram commented Mar 3, 2025

Since the Quadlet unit file is quite long now, it would be great if someone could look at it and decide which lines are necessary. Unfortunately, I don't have the knowledge for that.

Apart from that, it looks good in my opinion.

Successfully merging this pull request may close these issues:

docker-compose: Passing gpu with driver: cdi is not supported