virtcontainers: Properly remove the container when shim gets killed
Here is an interesting case I have been debugging. I was trying to understand why "kubeadm reset" was not working with kata-runtime, compared to runc. In this scenario, the only pod started with Kata is the kube-dns pod. For some reason, when this pod is stopped and removed, its containers receive signals: two of them get SIGTERM, which seems to be the proper way to stop them, but the third container receives a SIGCONT. Obviously nothing happens in that case, but CRI-O apparently considers this should be the end of the container, and after a few seconds it kills the container process (the shim, in the Kata case). Because it uses SIGKILL, the signal does not get forwarded to the agent, since the shim itself is killed right away.

After this happens, CRI-O calls into "kata-runtime state", we detect the shim is not running anymore, and we try to stop the container. The code eventually calls into agent.RemoveContainer(), but this fails and returns an error because, inside the agent, the container is still running.

The approach taken here to solve this issue is to send a SIGKILL signal to the container after the shim has been waited for. This call does not check the returned error because in most cases, the regular use cases, it will return an error: the shim not being there usually means the container inside the VM has already terminated. But in the case where the shim was killed without being able to forward the signal (as described in the first paragraph), the SIGKILL will succeed and allow the following call to agent.stopContainer() to proceed with the removal of the container inside the agent.

Fixes #274

Signed-off-by: Sebastien Boeuf <[email protected]>
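A minimal sketch of the stop path described above, in Go. The type and helper names (waitForShim, signalProcess, removeContainer) are illustrative assumptions, not the exact virtcontainers API; they only show where the best-effort SIGKILL fits relative to the shim wait and the agent-side removal.

```go
package sketch

import "syscall"

// Minimal stand-ins for the virtcontainers types involved; the real
// structures are richer, these exist only to make the sketch compile.
type agent interface {
	signalProcess(c *Container, signal syscall.Signal, all bool) error
	removeContainer(c *Container) error
}

type Container struct {
	id      string
	shimPid int
	agent   agent
}

// waitForShim is a placeholder for waiting on the shim process to exit.
func waitForShim(pid int) error {
	return nil
}

// stopContainer illustrates the fix: once the shim has been waited
// for, send a best-effort SIGKILL to the container inside the VM.
func stopContainer(c *Container) error {
	if err := waitForShim(c.shimPid); err != nil {
		return err
	}

	// Deliberately ignore the returned error: in the common case the
	// container has already terminated (which is why the shim is gone)
	// and the kill fails harmlessly. If the shim was SIGKILLed before
	// it could forward any signal, this is what actually stops the
	// container inside the VM.
	c.agent.signalProcess(c, syscall.SIGKILL, true)

	// With the container guaranteed to be stopped, removal inside the
	// agent can now succeed.
	return c.agent.removeContainer(c)
}
```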