Failing to attach devicemapper block #1758
Nice report @awprice ;-) /cc @ganeshmaharaj @amshinde
Working on recreating now...
@awprice I could use a bit more detail on the scenario. Check out this .md and let me know if it matches the scenario? I wasn't seeing the dind-based container utilizing the same snapshot (it had to pull), so either the dind volumes are wrong or I'm missing something else.
@egernst and I have spoken over Slack and came to the conclusion that if this is reproducible without Kata, then it's most likely not a problem with Kata but with containerd/the snapshotter we are using. I'm going to try reproducing this without Kata; if I can reproduce it successfully, I'll close out this issue and raise it with the https://github.com/containerd/containerd project. The method for reproducing suggested by @egernst is to add a delay/sleep to runc when the container is created, to replicate the delay between when the snapshot is requested and when it is actually used.
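For reference, a minimal sketch of what such a delay shim could look like. This is an assumption, not something from this thread: it presumes the real runc binary has been renamed to /usr/local/sbin/runc.real, this wrapper is installed in its place, and a 5-second delay is enough to widen the race window.

```go
// delay-shim: wraps runc and sleeps before "create" to mimic the extra time
// Kata spends between resolving the device list and actually using the devices.
package main

import (
	"os"
	"syscall"
	"time"
)

const realRunc = "/usr/local/sbin/runc.real" // assumed location of the renamed runc

func main() {
	// Delay only the "create" step; other subcommands pass straight through.
	for _, arg := range os.Args[1:] {
		if arg == "create" {
			time.Sleep(5 * time.Second) // arbitrary delay to widen the race window
			break
		}
	}
	// Hand over to the real runc with the original arguments and environment.
	argv := append([]string{realRunc}, os.Args[1:]...)
	if err := syscall.Exec(realRunc, argv, os.Environ()); err != nil {
		os.Exit(1)
	}
}
```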
@awprice Were you able to reproduce this without Kata?
@amshinde Unfortunately we have yet to replicate it without Kata, but we haven't given up yet; we have a couple more things to try. We are starting to focus on virtio-fs as a potential replacement for devicemapper, as the performance has been good enough.
An update: I've managed to replicate this.

TL;DR: this occurs when using privileged containers inside the pod spec. When using a privileged container, all of the host's block devices are added to the container's spec, including devicemapper snapshots that belong to other pods.

I've developed the following very rough Go script to replicate the behaviour. The script aims to create a large amount of "churn" on the Kata node, i.e. lots of pods being created and deleted quickly: https://gist.github.com/awprice/319b8f95db10de7757597305a8e37faa

You can either run this inside the cluster (with appropriate RBAC) or from your local machine. On the Kata node, follow the containerd logs with the following:
After a few minutes you should start to see logs like the following:
So why are these devicemapper snapshots in the container's spec at all? Because the containers are privileged, every block device present on the host when the spec is generated gets added to it, including snapshots that belong to other pods. Then, when the container is created, Kata iterates over each device and hot plugs it in: runtime/virtcontainers/container.go, lines 1363 to 1379 at 3255640
Due to the nature of Kata/VMs, there is a delay between when the device list is generated and when the devices are actually hot plugged into the VM. In that window a snapshot belonging to another, recently deleted pod can disappear from the host, and the hot plug then fails. A potential fix is to check that the block device still exists before hot plugging these "optional" block devices.
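As a rough illustration of that check (a sketch only, not the actual Kata code; the blockDevice type and the hotplug callback are hypothetical stand-ins), the idea is simply to stat the host path before attempting the hot plug and skip devices that have already disappeared:

```go
package main

import (
	"fmt"
	"os"
)

type blockDevice struct {
	HostPath string // e.g. a /dev/mapper/... path handed over by containerd
}

// hotplugDevices attaches each device, skipping any whose host path no longer
// exists (i.e. the devicemapper snapshot was deleted after the spec was built).
func hotplugDevices(devices []blockDevice, hotplug func(blockDevice) error) error {
	for _, dev := range devices {
		if _, err := os.Stat(dev.HostPath); os.IsNotExist(err) {
			fmt.Printf("skipping missing optional block device %s\n", dev.HostPath)
			continue
		}
		if err := hotplug(dev); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	devices := []blockDevice{{HostPath: "/dev/null"}, {HostPath: "/dev/does-not-exist"}}
	_ = hotplugDevices(devices, func(d blockDevice) error {
		fmt.Printf("hot plugging %s\n", d.HostPath)
		return nil
	})
}
```

As noted in the replies below, this only narrows the race window rather than eliminating it.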
Another possible fix is to not mount the host devices into the VM when using privileged, see #1568
@awprice The definition of a privileged container is one that gets access to all of the host's devices, so passing them through is arguably expected behaviour. OTOH, if your workload doesn't actually need every host device, you may be able to drop privileged and grant only the specific capabilities it requires.
@bergwolf Our use case for privileged is to run Docker in Docker. Whilst we can explicitly add the capabilities that Docker in Docker needs to run, we cannot explicitly enable write access to sysfs in the pod spec, which is one of the many things it needs. Until we can explicitly grant everything it requires, we need privileged to run Docker in Docker.
Nothing protects the device after you check for its existence and before someone else removes it. You only get a smaller race window and fewer random failures; the problem is still there.
Then why not fix containerd to do it? The problem you list above is that we get a container config that asks for a device that is missing on the host. For Kata, the right thing to do is to fail and report the missing device. If you don't want the device, make containerd not add it to the container's spec in the first place.
Going to close this out, as we have solved our issue by making a change to containerd/cri (containerd/cri#1225) to disable host devices being mounted into the VM for the Kata runtime when using privileged. My recommendation for the next person that runs into this issue with devicemapper and Kata:
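For anyone hitting this later, here is a sketch of the kind of containerd CRI configuration that change enables. The option name comes from containerd/cri#1225, but the exact TOML section layout and runtime name below are assumptions that depend on your containerd version, so verify them against your containerd documentation.

```toml
# Sketch only: containerd CRI plugin config (containerd 1.3-style layout assumed)
[plugins.cri.containerd.runtimes.kata]
  runtime_type = "io.containerd.kata.v2"
  # Don't pass every host device into privileged containers for this runtime,
  # so Kata never sees (and never tries to hot plug) other pods' dm snapshots.
  privileged_without_host_devices = true
```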
Description of problem
We've noticed that, when using containerd + devicemapper, occasionally a Kata pod cannot be started because a devicemapper block is not present on the host.
We have noticed this occurs when there is a large "churn" of pods on the Kata node, i.e. a large number of pods being destroyed and created.
We believe the following timeline occurs:
There is most likely a race: between the time the devicemapper snapshot is received from containerd and the time it is plugged into the VM, the snapshot may be deleted.
We haven't been able to replicate this without Kata, which leads me to believe that the extra steps and time Kata takes to hotplug the block devices make it more likely that the block has been deleted by the time it is used.
Expected result
The Kata pod to start without issues.
Actual result
When a large number of pods are scheduled to a Kata node, the occasional pod gets placed into the StartError state and does not start. The error we see in the kubelet logs:
Additional Details
We are using an AWS i3.metal instance
We are using Kubernetes 1.13.5
We are using containerd, including the devicemapper snapshotter, built from this commit: containerd/containerd@b99a66c
Show kata-collect-data.sh details
Meta details
Running kata-collect-data.sh version 1.7.0 (commit d4f4644312d2acbfed8a150e49831787f8ebdd90) at 2019-06-04.05:57:02.142789285+0000.
Runtime is /opt/kata/bin/kata-runtime.
kata-env
Output of "
/opt/kata/bin/kata-runtime kata-env
":Runtime config files
Runtime default config files
Runtime config file contents
Output of "
cat "/etc/kata-containers/configuration.toml"
":Output of "
cat "/opt/kata/share/defaults/kata-containers/configuration.toml"
":Config file
/usr/share/defaults/kata-containers/configuration.toml
not foundKSM throttler
version
Output of "
--version
":systemd service
Image details
Initrd details
No initrd
Logfiles
Runtime logs
No recent runtime problems found in system journal.
Proxy logs
No recent proxy problems found in system journal.
Shim logs
No recent shim problems found in system journal.
Throttler logs
No recent throttler problems found in system journal.
Container manager details
Have
docker
Docker
Output of "
docker version
":Output of "
docker info
":Output of "
systemctl show docker
":No
kubectl
No
crio
Have
containerd
containerd
Output of "
containerd --version
":Output of "
systemctl show containerd
":Output of "
cat /etc/containerd/config.toml
":Packages
No dpkg
No rpm