This repository has been archived by the owner on May 12, 2021. It is now read-only.

1.8.0-alpha0 containerd-shim-kata-v2 fails to start pod #1781

Closed
egernst opened this issue Jun 7, 2019 · 4 comments · Fixed by #1782
Labels
bug Incorrect behaviour needs-review Needs to be assessed by the team.

Comments

@egernst
Member

egernst commented Jun 7, 2019

Description of problem

While testing in Kubernetes, I see that the 1.8.0-alpha0 release fails to start a pod.

Expected result

The pod starts successfully.

Actual result

remote_runtime.go:105] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to start sandbox container: failed to create containerd task: io.containerd.kata.v2: failed to connect: dial unix /run/containerd/containerd.sock.ttrpc: connect: no such file or directory
kuberuntime_sandbox.go:68] CreatePodSandbox for pod "nginx-deployment-qemu-7fc7f55446-9vf4v_default(6d2f8906-896d-11e9-9995-001e67569165)" failed: rpc error: code = Unknown desc = failed to start sandbox container: failed to create containerd task: io.containerd.kata.v2: failed to connect: dial unix /run/containerd/containerd.sock.ttrpc: connect: no such file or directory

This is a regression between 1.7.0 and 1.8.0-alpha0. I narrowed this down to the following commits:

eabfd99 shimv2: Improve shim shutdown logic
590ed09 vendor: update gogo/protobuf, containerd and agent vendors

I suggest we revert while getting help from @lifupan to root-cause.

@egernst egernst added bug Incorrect behaviour needs-review Needs to be assessed by the team. labels Jun 7, 2019
egernst pushed a commit to egernst/runtime that referenced this issue Jun 7, 2019
This reverts:
 - 590ed09 vendor: update gogo/protobuf, containerd and agent vendors
 - eabfd99 shimv2: Improve shim shutdown logic

These introduce a regression for starting pods with k8s 1.14 + containerd
1.2.6

Fixes: kata-containers#1781

Signed-off-by: Eric Ernst <[email protected]>
@ganeshmaharaj
Contributor

ganeshmaharaj commented Jun 8, 2019

I was able to reproduce the issue by comparing 1.7.1 and 1.8.0-alpha0. This is the same issue we noticed during the backports for 1.7.1, and the PR in question was the one that had to be removed from the backport list to make that release. #1756

ganeshma@kata-shimv2:~$ sudo tar -C / -xf kata-static-1.8.0-alpha0-x86_64.tar
ganeshma@kata-shimv2:~$ sudo ctr run -t --rm --runtime io.containerd.kata.v2 docker.io/library/busybox:latest busybox1
ctr: io.containerd.kata.v2: failed to connect: dial unix /run/containerd/containerd.sock.ttrpc: connect: no such file or directory
: exit status 1: unknown
ganeshma@kata-shimv2:~$ sudo tar -C / -xf kata-static-1.7.1-x86_64.tar
ganeshma@kata-shimv2:~$ sudo ctr run -t --rm --runtime io.containerd.kata.v2 docker.io/library/busybox:latest busybox1
/ # uname -a
Linux kata-container 4.19.28 #1 SMP Wed Jun 5 14:32:34 PDT 2019 x86_64 GNU/Linux
/ # exit
ganeshma@kata-shimv2:~$ uname -a
Linux kata-shimv2 4.15.0-50-generic #54-Ubuntu SMP Mon May 6 18:46:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
ganeshma@kata-shimv2:~$

@bergwolf
Member

bergwolf commented Jun 9, 2019

@egernst @ganeshmaharaj which containerd version are you using? IIUC, #1732 requires the latest containerd to work.

@lifupan
Member

lifupan commented Jun 10, 2019

Hi @egernst @bergwolf, the root cause is that the latest shimv2 uses ttrpc to publish events from shimv2 to containerd, so it depends on containerd setting up the ttrpc server, which was enabled by the pull request containerd/containerd#3195.
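
For anyone who wants to check whether their containerd exposes that ttrpc socket before trying the new shim, here is a rough sketch. It only probes the unix socket path that appears in the error message above; `probe_unix_socket` is a hypothetical helper, not part of containerd or Kata, and the default path is an assumption taken from the log (containerd may be configured with a different address).

```python
import os
import socket
import stat

# Path copied from the error message in this issue; adjust it if containerd
# is configured with a different address (assumption, not a fixed default).
DEFAULT_TTRPC_SOCKET = "/run/containerd/containerd.sock.ttrpc"


def probe_unix_socket(path: str = DEFAULT_TTRPC_SOCKET, timeout: float = 1.0) -> str:
    """Classify a unix socket path as "missing", "not-a-socket", "refused" or "ok".

    Hypothetical diagnostic helper: it checks that the path exists, is a
    socket, and accepts a connection. It does not speak ttrpc.
    """
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        # Same condition the shim reports as "no such file or directory".
        return "missing"
    if not stat.S_ISSOCK(mode):
        return "not-a-socket"
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(path)
        except OSError:
            # The file is a socket, but nothing is listening on it.
            return "refused"
    return "ok"


if __name__ == "__main__":
    print(f"{DEFAULT_TTRPC_SOCKET}: {probe_unix_socket()}")
```

A result of "missing" matches the failure reported here. A result of "ok" only means something is listening on that path; it does not prove the listener is a ttrpc server.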

@egernst
Member Author

egernst commented Jun 10, 2019

1.2.6

I used latest available on AKS also for testing...

What’s the minimum version?

egernst pushed a commit to egernst/runtime that referenced this issue Jun 10, 2019
This reverts:
 - 590ed09 vendor: update gogo/protobuf, containerd and agent vendors
 - eabfd99 shimv2: Improve shim shutdown logic

These introduce a regression for starting pods with k8s 1.14 + containerd
1.2.6

Fixes: kata-containers#1781

Signed-off-by: Eric Ernst <[email protected]>