feat: Update docs/ai.md to include Ollama Quadlet instructions #93
base: main
Conversation
Makes the changes required to use Podman instead of Docker.
docs/ai.md (outdated)
```yaml
    ports:
      - 11434:11434
    volumes:
      - ./ollama_v:/root/.ollama:z
```
Permission denied in Podman without the `:z` SELinux relabel on the volume.
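For anyone testing outside compose, a minimal sketch of the equivalent `podman run` invocation using the upstream Ollama image (the container name is an arbitrary choice):

```sh
# The :z suffix asks Podman to relabel the bind mount for SELinux sharing;
# without it, the container gets "permission denied" writing to ./ollama_v.
podman run -d --name ollama \
  -p 11434:11434 \
  -v ./ollama_v:/root/.ollama:z \
  docker.io/ollama/ollama:latest
```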
docs/ai.md (outdated)
```yaml
    volumes:
      - ./ollama_v:/root/.ollama:z
    devices:
      - nvidia.com/gpu=all
```
This might work with docker-compose as well. If so, we can probably unify the two blocks (`:z` worked with docker-compose).
Does not work for Docker:

```
Error response from daemon: could not select device driver "cdi" with capabilities: []
```
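Docker Compose exposes NVIDIA GPUs through a device reservation rather than a CDI device name, which would explain the error above. A sketch of what the Docker-side block might look like instead, assuming the NVIDIA Container Toolkit is configured as the daemon's runtime:

```yaml
    # Docker Compose GPU reservation; CDI names like nvidia.com/gpu=all
    # are a Podman convention and are not understood by the Docker daemon.
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```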
We could also just put a quadlet in there and tell people to paste it into the right file? I'm thinking, if we're going podman, we should go full podman/quadlet, which is what they prefer. We should also leave the docker example there too: if someone needs to get something done and they need ollama, they shouldn't have to learn podman at the same time, so offering both feels great. What do you think?
@castrojo I think that makes sense. I suspect that it will still need to run as root, but I'll give it a try and report back.
split podman from the docker section, also add a quadlet version
docs/ai.md (outdated)
```ini
ContainerName=ollama
AutoUpdate=registry
PublishPort=11434:11434
Volume=./ollama_v:/root/.ollama:z
```
Not sure what the volume path should be in the quadlet
OK so the quadlet kind of works.
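For reference, a sketch of what a complete `ollama.container` unit might look like at this stage of the discussion; the section layout, image tag, and GPU line are assumptions rather than the final version from this PR:

```ini
# ollama.container - assumed layout for a Quadlet unit serving Ollama
[Unit]
Description=Ollama API server

[Container]
ContainerName=ollama
Image=docker.io/ollama/ollama:latest
AutoUpdate=registry
PublishPort=11434:11434
# %h expands to the user's home directory in systemd units
Volume=%h/.ollama:/root/.ollama:z
# NVIDIA GPU via CDI; requires the nvidia-container-toolkit CDI specs
AddDevice=nvidia.com/gpu=all

[Service]
Restart=always

[Install]
WantedBy=default.target
```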
Yesterday I created a guide for running Ollama with Podman. It was suggested to do a PR on the docs, and that's how I found this PR, which tries to achieve basically the same thing. Feel free to use any information from the guide. Looking at this PR, maybe only the third step (creating the Quadlet unit) is sufficient, in which case the setup would be really easy, but my Quadlet unit and the one in this PR differ (e.g. in adding Nvidia devices). I'd be happy to help with this PR!
Also, here is how the Quadlet unit was being created when running (now removed). Maybe we can add instructions for AMD GPU too.
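For AMD, passing the ROCm device nodes is the usual approach; a hedged sketch of the Quadlet lines that would need, untested in this thread:

```ini
# AMD GPUs: ROCm inside the container needs the compute (kfd) and render (dri) nodes
AddDevice=/dev/kfd
AddDevice=/dev/dri
```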
@sykoram thanks for sharing. Though that might be irrelevant, as you pointed out.
remove podman-compose section
The fix either didn't work or I haven't gotten the version with the patch yet. This PR now only adds the Quadlet section because I think that could still be a nice way of doing it. @sykoram I took your suggestion and switched to using it.

@castrojo What are your thoughts on this PR? The docs have been updated since I created it and now have information about Podman Desktop and Ramalama, which look to be the preferred options. I've also switched to mostly using LMStudio myself.
I'll look tomorrow, but I think we should keep the ollama sections; lots of people still use it. Over the past week I've been playing a lot with ramalama and it solves lots of problems for us: the CLI tool still consumes ollama models, and I love that the built-in systemd integration is just a flag, so the quadlet part gets totally automated. I think it's a great default. There's so much open source AI tooling, though, that having a few options in the docs for folks makes sense. I'll take a look and merge appropriately!
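For context on that flag: Ramalama can emit a Quadlet unit for you instead of managing the container directly. A sketch of the invocation, where the model name is an arbitrary example and the exact flag spelling may vary between Ramalama versions:

```sh
# Generate a Quadlet unit that serves the model via systemd
# instead of running the container ad hoc
ramalama serve --name ollama-demo --generate quadlet tinyllama
```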
Thanks! Looks like we are converging to a nice solution. Just a few remarks:

**File location**

According to the Quadlet guide in the Bazzite docs, the recommended location for the Quadlet unit file would be `~/.config/containers/systemd/`.

**Volume**

They also note a different way to declare the volume there, and their `Volume=` line differs from ours. Unfortunately, I have no idea which is the right way.
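For reference, a minimal sketch of installing and starting a user Quadlet from that location, assuming the unit file is named `ollama.container`:

```sh
# Rootless Quadlet units are picked up from this directory
mkdir -p ~/.config/containers/systemd
cp ollama.container ~/.config/containers/systemd/

# Let systemd regenerate the service from the unit, then start it
systemctl --user daemon-reload
systemctl --user start ollama.service
```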
Great suggestions. I have applied the changes and tested on Aurora with an Nvidia GPU. Having the AMD devices listed in my unit as well didn't seem to cause any issues.
To do so, first configure docker to use the nvidia drivers (that come preinstalled with Bluefin) with:

### Quadlet (recommended)
`Expected: 1; Actual: 0; Below`
I don't know what Codacy wants here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found a link to the Codacy website for this issue, but there isn't much information, and the suggested fix looks exactly the same.
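If it helps, that message matches markdownlint's MD022 rule (headings should be surrounded by blank lines); "Below" would mean the blank line after the heading is missing, so the fix is presumably just:

```markdown
### Quadlet (recommended)

First paragraph of the section, separated from the heading by a blank line.
```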
Since the Quadlet unit file is quite long now, it would be great if someone could look at it and decide which lines are necessary; unfortunately, I don't have the knowledge for that. Apart from that, it looks good in my opinion.
WIP for using podman-compose instead of docker compose with the Ollama API docs.
Needs someone else to verify before merging these updates. I have only tested on my Aurora install with Nvidia GPU.
Nvidia GPU passthrough seems to need `sudo` until this bug is fixed: containers/podman#19338
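Until then, the workaround is to run the container as root; a sketch of the `sudo` invocation, with the image tag assumed:

```sh
# Rootless CDI GPU passthrough is affected by containers/podman#19338,
# so run as root until the fix lands in the shipped Podman version
sudo podman run -d --name ollama \
  -p 11434:11434 \
  -v ./ollama_v:/root/.ollama:z \
  --device nvidia.com/gpu=all \
  docker.io/ollama/ollama:latest
```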