This repository has been archived by the owner on May 12, 2021. It is now read-only.

Kubernetes Pods stuck in ContainerCreating when using CRI-O #388

Closed
jamiehannaford opened this issue Jun 12, 2018 · 30 comments

@jamiehannaford

Description of problem

We're running Kata and CRI-O within GCE VMs that have nested virt enabled. Whilst it mostly works, we've been seeing consistent failures with certain pods:

root@k8s-jamie-terraform-4c6ef93f-master-2674:~# kubectl get pods -n kube-system -o wide
NAME                                                               READY     STATUS              RESTARTS   AGE       IP              NODE
calico-node-hlwsv                                                  2/2       Running             0          4h        10.142.0.7      k8s-jamie-terraform-4c6ef93f-default-pool-r0rs
calico-node-ktk7c                                                  2/2       Running             0          4h        10.142.0.10     k8s-jamie-terraform-4c6ef93f-master-2674
calico-node-smp6p                                                  2/2       Running             0          4h        10.142.0.6      k8s-jamie-terraform-4c6ef93f-default-pool-mp8v
calico-node-zf9n9                                                  2/2       Running             0          4h        10.142.0.8      k8s-jamie-terraform-4c6ef93f-default-pool-mv3n
etcd-k8s-jamie-terraform-4c6ef93f-master-2674                      1/1       Running             0          4h        10.142.0.10     k8s-jamie-terraform-4c6ef93f-master-2674
kube-apiserver-k8s-jamie-terraform-4c6ef93f-master-2674            1/1       Running             1          4h        10.142.0.10     k8s-jamie-terraform-4c6ef93f-master-2674
kube-controller-manager-k8s-jamie-terraform-4c6ef93f-master-2674   1/1       Running             0          4h        10.142.0.10     k8s-jamie-terraform-4c6ef93f-master-2674
kube-dns-86f4d74b45-cdwtf                                          3/3       Running             0          2h        192.168.3.162   k8s-jamie-terraform-4c6ef93f-default-pool-mp8v
kube-proxy-4s2rk                                                   1/1       Running             0          4h        10.142.0.6      k8s-jamie-terraform-4c6ef93f-default-pool-mp8v
kube-proxy-58df2                                                   1/1       Running             0          4h        10.142.0.8      k8s-jamie-terraform-4c6ef93f-default-pool-mv3n
kube-proxy-cmls5                                                   1/1       Running             0          4h        10.142.0.7      k8s-jamie-terraform-4c6ef93f-default-pool-r0rs
kube-proxy-wvfb7                                                   1/1       Running             0          4h        10.142.0.10     k8s-jamie-terraform-4c6ef93f-master-2674
kube-scheduler-k8s-jamie-terraform-4c6ef93f-master-2674            1/1       Running             0          4h        10.142.0.10     k8s-jamie-terraform-4c6ef93f-master-2674
kubernetes-dashboard-75d647ffd9-74gpm                              1/1       Running             0          2h        192.168.3.165   k8s-jamie-terraform-4c6ef93f-default-pool-mp8v
l7-default-backend-5d5b9874d5-dw4qv                                0/1       ContainerCreating   0          28m       <none>          k8s-jamie-terraform-4c6ef93f-default-pool-mv3n
root@k8s-jamie-terraform-4c6ef93f-master-2674:~# kubectl describe pods -n kube-system l7-default-backend-5d5b9874d5-dw4qv
Name:           l7-default-backend-5d5b9874d5-dw4qv
Namespace:      kube-system
Node:           k8s-jamie-terraform-4c6ef93f-default-pool-mv3n/10.142.0.8
Start Time:     Tue, 12 Jun 2018 13:45:23 +0000
Labels:         k8s-app=glbc
                name=glbc
                pod-template-hash=1816543081
Annotations:    <none>
Status:         Pending
IP:
Controlled By:  ReplicaSet/l7-default-backend-5d5b9874d5
Containers:
  default-http-backend:
    Container ID:
    Image:          k8s.gcr.io/defaultbackend:1.4
    Image ID:
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     10m
      memory:  20Mi
    Requests:
      cpu:        10m
      memory:     20Mi
    Liveness:     http-get http://:8080/healthz delay=30s timeout=5s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-4nzxm (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  default-token-4nzxm:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-4nzxm
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                 From                                                     Message
  ----     ------                  ----                ----                                                     -------
  Normal   Scheduled               28m                 default-scheduler                                        Successfully assigned l7-default-backend-5d5b9874d5-dw4qv to k8s-jamie-terraform-4c6ef93f-default-pool-mv3n
  Normal   SuccessfulMountVolume   28m                 kubelet, k8s-jamie-terraform-4c6ef93f-default-pool-mv3n  MountVolume.SetUp succeeded for volume "default-token-4nzxm"
  Warning  FailedCreatePodSandBox  27m                 kubelet, k8s-jamie-terraform-4c6ef93f-default-pool-mv3n  Failed create pod sandbox: rpc error: code = Unknown desc = container create failed: Failed to fire hook: Unix syslog delivery error
  Warning  FailedCreatePodSandBox  24m (x15 over 28m)  kubelet, k8s-jamie-terraform-4c6ef93f-default-pool-mv3n  Failed create pod sandbox: rpc error: code = Unknown desc = container create failed:
  Warning  FailedCreatePodSandBox  13m (x57 over 28m)  kubelet, k8s-jamie-terraform-4c6ef93f-default-pool-mv3n  Failed create pod sandbox: rpc error: code = Unknown desc = error reading container (probably exited) json message: EOF
  Warning  FailedCreatePodSandBox  8m (x14 over 15m)   kubelet, k8s-jamie-terraform-4c6ef93f-default-pool-mv3n  Failed create pod sandbox: rpc error: code = Unknown desc = exit status 1
  Warning  FailedCreatePodSandBox  3m (x33 over 25m)   kubelet, k8s-jamie-terraform-4c6ef93f-default-pool-mv3n  Failed create pod sandbox: rpc error: code = Unknown desc = signal: killed

When we dig into the kubelet logs on that node, we see the following:

E0612 14:09:44.580324   15604 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = signal: killed
E0612 14:09:44.580411   15604 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "l7-default-backend-5d5b9874d5-dw4qv_kube-system(dae96a2e-6e46-11e8-97ac-42010a8e000a)" failed: rpc error: code = Unknown desc = signal: killed
E0612 14:09:44.580427   15604 kuberuntime_manager.go:646] createPodSandbox for pod "l7-default-backend-5d5b9874d5-dw4qv_kube-system(dae96a2e-6e46-11e8-97ac-42010a8e000a)" failed: rpc error: code = Unknown desc = signal: killed
E0612 14:09:44.580514   15604 pod_workers.go:186] Error syncing pod dae96a2e-6e46-11e8-97ac-42010a8e000a ("l7-default-backend-5d5b9874d5-dw4qv_kube-system(dae96a2e-6e46-11e8-97ac-42010a8e000a)"), skipping: failed to "CreatePodSandbox" for "l7-default-backend-5d5b9874d5-dw4qv_kube-system(dae96a2e-6e46-11e8-97ac-42010a8e000a)" with CreatePodSandboxError: "CreatePodSandbox for pod \"l7-default-backend-5d5b9874d5-dw4qv_kube-system(dae96a2e-6e46-11e8-97ac-42010a8e000a)\" failed: rpc error: code = Unknown desc = signal: killed"

Earlier, we also saw the following errors:

E0612 13:23:15.346687    4471 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = container create failed: rom: file bios-256k.bin       : read error: rc=118768 (expected 262144)
qemu: could not load PC BIOS 'bios-256k.bin'
E0612 13:23:15.346763    4471 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "l7-default-backend-5d5b9874d5-r2qct_kube-system(575ea8d0-6e43-11e8-97ac-42010a8e000a)" failed: rpc error: code = Unknown desc = container create failed: rom: file bios-256k.bin       : read error: rc=118768 (expected 262144)
qemu: could not load PC BIOS 'bios-256k.bin'
E0612 13:23:15.346779    4471 kuberuntime_manager.go:646] createPodSandbox for pod "l7-default-backend-5d5b9874d5-r2qct_kube-system(575ea8d0-6e43-11e8-97ac-42010a8e000a)" failed: rpc error: code = Unknown desc = container create failed: rom: file bios-256k.bin       : read error: rc=118768 (expected 262144)
qemu: could not load PC BIOS 'bios-256k.bin'
E0612 13:23:15.346864    4471 pod_workers.go:186] Error syncing pod 575ea8d0-6e43-11e8-97ac-42010a8e000a ("l7-default-backend-5d5b9874d5-r2qct_kube-system(575ea8d0-6e43-11e8-97ac-42010a8e000a)"), skipping: failed to "CreatePodSandbox" for "l7-default-backend-5d5b9874d5-r2qct_kube-system(575ea8d0-6e43-11e8-97ac-42010a8e000a)" with CreatePodSandboxError: "CreatePodSandbox for pod \"l7-default-backend-5d5b9874d5-r2qct_kube-system(575ea8d0-6e43-11e8-97ac-42010a8e000a)\" failed: rpc error: code = Unknown desc = container create failed: rom: file bios-256k.bin       : read error: rc=118768 (expected 262144)\nqemu: could not load PC BIOS 'bios-256k.bin'\n\n"

We then tried manually setting the firmware field to /usr/share/qemu-lite/qemu, but started seeing container create failed: qemu: could not load PC BIOS '/usr/share/qemu-lite/qemu', so we reverted.

What types of problems might these errors indicate? Perhaps a problem with nested virt on GCE, or is there a common misconfiguration somewhere? Any help you might have would be super appreciated!


Output of kata-collect-data.sh: https://gist.github.com/jamiehannaford/f3024f6e45ce11c00312636945f397a9

@jamiehannaford
Author

More journal logs:

time="2018-06-12T14:26:43.468723516Z" level=error arch=amd64 name=kata-runtime pid=14369 source=virtcontainers subsystem=qmp
time="2018-06-12T14:26:43.468767496Z" level=error command=create name=kata-runtime pid=14369 source=runtime
time="2018-06-12T14:26:44.285191637Z" level=error msg="Container ID (92736d660fcdb7a51af24fc96509aa978196e8d037628d907de94956ba661c1e) does not exist" command=delete name=kata-runtime pid=14438 source=runtime
time="2018-06-12T14:26:46.568736376Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=mon-64b30793-40cd-4af8-93c8-d5 original-name=mon-64b30793-40cd-4af8-93c8-d5a0416bdef3 pid=14512 source=virtcontainers subsystem=qemu
time="2018-06-12T14:26:46.568810339Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=ctl-64b30793-40cd-4af8-93c8-d5 original-name=ctl-64b30793-40cd-4af8-93c8-d5a0416bdef3 pid=14512 source=virtcontainers subsystem=qemu
time="2018-06-12T14:27:02.001001031Z" level=error msg="Container ID (bade73313e9fef4a9fa51baa107975aa5327d2ec41b4e1da8a492225a64b9fe5) does not exist" command=delete name=kata-runtime pid=14566 source=runtime
time="2018-06-12T14:27:04.868835719Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=mon-1e8cd7df-3ced-4161-a793-51 original-name=mon-1e8cd7df-3ced-4161-a793-5119cd7adf08 pid=14640 source=virtcontainers subsystem=qemu
time="2018-06-12T14:27:04.868918419Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=ctl-1e8cd7df-3ced-4161-a793-51 original-name=ctl-1e8cd7df-3ced-4161-a793-5119cd7adf08 pid=14640 source=virtcontainers subsystem=qemu
time="2018-06-12T14:27:15.968256418Z" level=error msg="Unable to launch qemu: signal: killed" arch=amd64 name=kata-runtime pid=14640 source=virtcontainers subsystem=qmp
time="2018-06-12T14:27:15.968365523Z" level=error arch=amd64 name=kata-runtime pid=14640 source=virtcontainers subsystem=qmp
time="2018-06-12T14:27:15.968423665Z" level=error command=create name=kata-runtime pid=14640 source=runtime
time="2018-06-12T14:27:16.675173789Z" level=error msg="Container ID (7bb666901c3150842fe55c86c962e2578accc670f43c39c8d4edde68421189e6) does not exist" command=delete name=kata-runtime pid=14719 source=runtime
time="2018-06-12T14:27:18.46886587Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=mon-504cde9d-4741-4d20-b634-40 original-name=mon-504cde9d-4741-4d20-b634-405038568d25 pid=14793 source=virtcontainers subsystem=qemu
time="2018-06-12T14:27:18.468942798Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=ctl-504cde9d-4741-4d20-b634-40 original-name=ctl-504cde9d-4741-4d20-b634-405038568d25 pid=14793 source=virtcontainers subsystem=qemu
time="2018-06-12T14:27:30.068507175Z" level=error msg="Unable to launch qemu: signal: killed" arch=amd64 name=kata-runtime pid=14793 source=virtcontainers subsystem=qmp
time="2018-06-12T14:27:30.068622567Z" level=error arch=amd64 name=kata-runtime pid=14793 source=virtcontainers subsystem=qmp
time="2018-06-12T14:27:30.068681295Z" level=error command=create name=kata-runtime pid=14793 source=runtime
time="2018-06-12T14:27:31.141815818Z" level=error msg="Container ID (14451305ab199c32133107814617893a1222e96e80f4613fa36959e6cf40fd2c) does not exist" command=delete name=kata-runtime pid=14864 source=runtime
time="2018-06-12T14:27:33.76874369Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=mon-0e50ab58-fbaf-46e1-b9e6-6c original-name=mon-0e50ab58-fbaf-46e1-b9e6-6c75addad8d5 pid=14947 source=virtcontainers subsystem=qemu
time="2018-06-12T14:27:33.768833379Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=ctl-0e50ab58-fbaf-46e1-b9e6-6c original-name=ctl-0e50ab58-fbaf-46e1-b9e6-6c75addad8d5 pid=14947 source=virtcontainers subsystem=qemu
time="2018-06-12T14:27:44.668476979Z" level=error msg="Unable to launch qemu: signal: killed" arch=amd64 name=kata-runtime pid=14947 source=virtcontainers subsystem=qmp
time="2018-06-12T14:27:44.668553506Z" level=error arch=amd64 name=kata-runtime pid=14947 source=virtcontainers subsystem=qmp
time="2018-06-12T14:27:44.668604587Z" level=error command=create name=kata-runtime pid=14947 source=runtime
time="2018-06-12T14:27:45.877479552Z" level=error msg="Container ID (1d8f9fedd8d9742fa7f87f2a43b37caebb4d4ef3bf563fe06bd2d74124e00d9d) does not exist" command=delete name=kata-runtime pid=15013 source=runtime
time="2018-06-12T14:27:47.568879077Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=mon-3562e8d2-11bc-44a4-86da-de original-name=mon-3562e8d2-11bc-44a4-86da-de285e77efdf pid=15089 source=virtcontainers subsystem=qemu
time="2018-06-12T14:27:47.568953094Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=ctl-3562e8d2-11bc-44a4-86da-de original-name=ctl-3562e8d2-11bc-44a4-86da-de285e77efdf pid=15089 source=virtcontainers subsystem=qemu
time="2018-06-12T14:27:58.668517827Z" level=error msg="Unable to launch qemu: signal: killed" arch=amd64 name=kata-runtime pid=15089 source=virtcontainers subsystem=qmp
time="2018-06-12T14:27:58.668649461Z" level=error arch=amd64 name=kata-runtime pid=15089 source=virtcontainers subsystem=qmp
time="2018-06-12T14:27:58.668696353Z" level=error command=create name=kata-runtime pid=15089 source=runtime
time="2018-06-12T14:27:59.704855148Z" level=error msg="Container ID (fc85709b9fb263fa4b73c58a84ec980525f32b20882425146ec83c6b6cae510e) does not exist" command=delete name=kata-runtime pid=15138 source=runtime
time="2018-06-12T14:28:01.96868106Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=mon-da72cdce-6d84-460c-83bc-a2 original-name=mon-da72cdce-6d84-460c-83bc-a293f5d446fa pid=15214 source=virtcontainers subsystem=qemu
time="2018-06-12T14:28:01.968758671Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=ctl-da72cdce-6d84-460c-83bc-a2 original-name=ctl-da72cdce-6d84-460c-83bc-a293f5d446fa pid=15214 source=virtcontainers subsystem=qemu
time="2018-06-12T14:28:11.368608648Z" level=error msg="Unable to launch qemu: signal: killed" arch=amd64 name=kata-runtime pid=15214 source=virtcontainers subsystem=qmp
time="2018-06-12T14:28:11.368734672Z" level=error arch=amd64 name=kata-runtime pid=15214 source=virtcontainers subsystem=qmp
time="2018-06-12T14:28:11.368787627Z" level=error command=create name=kata-runtime pid=15214 source=runtime
time="2018-06-12T14:28:12.149356444Z" level=error msg="Container ID (cf7d77f528f40d9316795bdedec46036e3b41c0db701a06c4a82a6700cb6b35b) does not exist" command=delete name=kata-runtime pid=15276 source=runtime
time="2018-06-12T14:28:14.96880123Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=mon-bdf51b09-6d90-4acd-b39d-b1 original-name=mon-bdf51b09-6d90-4acd-b39d-b14516dcc1b7 pid=15350 source=virtcontainers subsystem=qemu
time="2018-06-12T14:28:14.968880241Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=ctl-bdf51b09-6d90-4acd-b39d-b1 original-name=ctl-bdf51b09-6d90-4acd-b39d-b14516dcc1b7 pid=15350 source=virtcontainers subsystem=qemu
time="2018-06-12T14:28:25.368226968Z" level=error msg="Unable to launch qemu: signal: killed" arch=amd64 name=kata-runtime pid=15350 source=virtcontainers subsystem=qmp
time="2018-06-12T14:28:25.368317382Z" level=error arch=amd64 name=kata-runtime pid=15350 source=virtcontainers subsystem=qmp
time="2018-06-12T14:28:25.368373084Z" level=error command=create name=kata-runtime pid=15350 source=runtime
time="2018-06-12T14:28:26.104856651Z" level=error msg="Container ID (69b9e49315f5d50250065f32d119a69ee76eeaa826eeb77e843dbc903ab5af88) does not exist" command=delete name=kata-runtime pid=15406 source=runtime
time="2018-06-12T14:28:29.068722214Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=mon-7db92c56-022d-4019-97df-76 original-name=mon-7db92c56-022d-4019-97df-7679e5ab9048 pid=15490 source=virtcontainers subsystem=qemu
time="2018-06-12T14:28:29.068794619Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=ctl-7db92c56-022d-4019-97df-76 original-name=ctl-7db92c56-022d-4019-97df-7679e5ab9048 pid=15490 source=virtcontainers subsystem=qemu
time="2018-06-12T14:28:39.468234484Z" level=error msg="Unable to launch qemu: signal: killed" arch=amd64 name=kata-runtime pid=15490 source=virtcontainers subsystem=qmp
time="2018-06-12T14:28:39.46832343Z" level=error arch=amd64 name=kata-runtime pid=15490 source=virtcontainers subsystem=qmp
time="2018-06-12T14:28:39.468374854Z" level=error command=create name=kata-runtime pid=15490 source=runtime
time="2018-06-12T14:28:40.212839126Z" level=error msg="Container ID (67eed42514d4a5ba04caabcc1e006f704a395c808fc0839626e648c99e2dab76) does not exist" command=delete name=kata-runtime pid=15550 source=runtime
time="2018-06-12T14:28:42.968745885Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=mon-0502895c-5573-4eff-bed1-d9 original-name=mon-0502895c-5573-4eff-bed1-d9fcde8e3490 pid=15649 source=virtcontainers subsystem=qemu
time="2018-06-12T14:28:42.968819917Z" level=warning msg="shortening QMP socket name" arch=amd64 name=kata-runtime new-name=ctl-0502895c-5573-4eff-bed1-d9 original-name=ctl-0502895c-5573-4eff-bed1-d9fcde8e3490 pid=15649 source=virtcontainers subsystem=qemu

@jodh-intel
Contributor

Hi @jamiehannaford - thanks for reporting.

The gist shows that your system is misconfigured:

/usr/share/defaults/kata-containers/configuration.toml: cannot specify an image and an initrd in configuration file

This implies you have built and installed the runtime manually, but did not disable either the initrd= or image= entry in the config file (you cannot use both). See:
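As an illustration, a minimal check-and-fix sketch (the config path is the default one from your gist; the sed expression is an assumption that you want to keep image= and drop initrd=, so review the file before editing):

$ grep -nE '^\s*(image|initrd)\s*=' /usr/share/defaults/kata-containers/configuration.toml   # which of the two is enabled?
$ sudo sed -i 's/^\(\s*initrd\s*=\)/# \1/' /usr/share/defaults/kata-containers/configuration.toml   # comment out initrd=, keeping image=
$ kata-runtime kata-env   # confirm the runtime parses the config again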

@jodh-intel
Contributor

Related: #332.

@jamiehannaford
Author

jamiehannaford commented Jun 12, 2018

@jodh-intel Thanks for responding so quickly. I actually noticed that just after posting, but we're still seeing Pods stuck. Here's a more recent output: https://gist.github.com/jamiehannaford/dd304f3884d3b2fefb50761a15045af0

@jodh-intel
Contributor

Clearly qemu is unhappy:

Unable to launch qemu: signal: killed

That, coupled with the BIOS error, suggests a corrupt file (or filesystem/disk). I'd reinstall the qemu-lite package and see if that helps:

$ sudo apt-get install --reinstall qemu-lite
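If reinstalling doesn't help, it's worth checking whether the firmware file on disk is actually truncated (the error above expected 262144 bytes for bios-256k.bin). A rough sketch, assuming the package is named qemu-lite as above and that debsums is available:

$ dpkg -L qemu-lite | grep bios-256k.bin   # locate the shipped BIOS image
$ stat -c '%n: %s bytes' $(dpkg -L qemu-lite | grep bios-256k.bin)   # expect 262144
$ sudo debsums -s qemu-lite   # list any files whose checksums no longer match the package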

@jamiehannaford
Author

@jodh-intel I reinstalled qemu-lite and enabled debug mode, it's still getting signal: killed. Here is output: https://gist.github.com/jamiehannaford/23adc9089ee3c054ab82233dc7fdf7b4

@jamiehannaford
Author

Some more info:

kata-runtime kata-check
INFO[0000] CPU property found                            description="Intel Architecture CPU" name=GenuineIntel pid=17498 source=runtime type=attribute
INFO[0000] CPU property found                            description="Virtualization support" name=vmx pid=17498 source=runtime type=flag
INFO[0000] CPU property found                            description="64Bit CPU" name=lm pid=17498 source=runtime type=flag
INFO[0000] CPU property found                            description=SSE4.1 name=sse4_1 pid=17498 source=runtime type=flag
INFO[0000] kernel property found                         description="Kernel-based Virtual Machine" name=kvm pid=17498 source=runtime type=module
INFO[0000] kernel property found                         description="Intel KVM" name=kvm_intel pid=17498 source=runtime type=module
WARN[0000] kernel module parameter has unexpected value  description="Intel KVM" expected=Y name=kvm_intel parameter=nested pid=17498 source=runtime type=module value=N
INFO[0000] Kernel property value correct                 description="Intel KVM" expected=Y name=kvm_intel parameter=unrestricted_guest pid=17498 source=runtime type=module value=Y
INFO[0000] kernel property found                         description="Host kernel accelerator for virtio" name=vhost pid=17498 source=runtime type=module
INFO[0000] kernel property found                         description="Host kernel accelerator for virtio network" name=vhost_net pid=17498 source=runtime type=module
INFO[0000] System is capable of running Kata Containers  name=kata-runtime pid=17498 source=runtime
INFO[0000] device available                              check-type=full device=/dev/kvm name=kata-runtime pid=17498 source=runtime
INFO[0000] feature available                             check-type=full feature=create-vm name=kata-runtime pid=17498 source=runtime
INFO[0000] System can currently create Kata Containers   name=kata-runtime pid=17498 source=runtime

@sboeuf

sboeuf commented Jun 12, 2018

@jamiehannaford could you reload the driver:

# modprobe -r kvm_intel
# modprobe kvm_intel nested=1

and re-run kata-runtime kata-check.
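To verify the reload took effect and make it persist across reboots, something along these lines is typical (generic KVM steps, not Kata-specific; the modprobe.d file name is arbitrary):

# cat /sys/module/kvm_intel/parameters/nested   # expect Y (or 1)
# echo "options kvm_intel nested=1" > /etc/modprobe.d/kvm-nested.conf   # persist the setting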

@grahamwhaley
Contributor

And, just to throw some ideas out there and garner a few more details, @jamiehannaford:

  • Is there anything specific/special about that node?
    • different setup (memory etc.)
    • running a different workload/pod to the other nodes?
  • How much memory does it have, and how large is the workload/pod? (just to try and rule out an OOM type error)

Just trying to narrow down things that might affect just that node (it sounds like you are always failing on the same node, yes?)

@jamiehannaford
Author

Before I try other things, I've just noticed something interesting: when I remove resource requests from the Pod manifest, the container is created. So this works:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    k8s-app: glbc
  name: l7-default-backend
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: glbc
  template:
    metadata:
      labels:
        k8s-app: glbc
        name: glbc
    spec:
      containers:
      - image: k8s.gcr.io/defaultbackend:1.4
        imagePullPolicy: IfNotPresent
        name: default-http-backend
        ports:
        - containerPort: 8080
          protocol: TCP
      restartPolicy: Always
root@k8s-jamie-terraform-4c6ef93f-master-2674:~# kubectl -n kube-system get pods
NAME                                                               READY     STATUS    RESTARTS   AGE
calico-node-hlwsv                                                  2/2       Running   0          6h
calico-node-ktk7c                                                  2/2       Running   0          6h
calico-node-smp6p                                                  2/2       Running   0          6h
calico-node-zf9n9                                                  2/2       Running   0          6h
etcd-k8s-jamie-terraform-4c6ef93f-master-2674                      1/1       Running   0          6h
kube-apiserver-k8s-jamie-terraform-4c6ef93f-master-2674            1/1       Running   1          6h
kube-controller-manager-k8s-jamie-terraform-4c6ef93f-master-2674   1/1       Running   0          6h
kube-dns-86f4d74b45-cdwtf                                          3/3       Running   0          4h
kube-proxy-4s2rk                                                   1/1       Running   0          6h
kube-proxy-58df2                                                   1/1       Running   0          6h
kube-proxy-cmls5                                                   1/1       Running   0          6h
kube-proxy-wvfb7                                                   1/1       Running   0          6h
kube-scheduler-k8s-jamie-terraform-4c6ef93f-master-2674            1/1       Running   0          6h
kubernetes-dashboard-75d647ffd9-74gpm                              1/1       Running   0          4h
l7-default-backend-7c87ddd57d-s2l74                                1/1       Running   0          36s

But this doesn't work:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    k8s-app: glbc
  name: l7-default-backend
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: glbc
  template:
    metadata:
      labels:
        k8s-app: glbc
        name: glbc
    spec:
      containers:
      - image: k8s.gcr.io/defaultbackend:1.4
        imagePullPolicy: IfNotPresent
        name: default-http-backend
        ports:
        - containerPort: 8080
          protocol: TCP
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"
      restartPolicy: Always

I just got a new error for the above manifest:

Failed create pod sandbox: rpc error: code = Unknown desc = container create failed: rpc error: code = Unavailable desc = transport is closing

@jamiehannaford
Author

jamiehannaford commented Jun 12, 2018

@sboeuf It doesn't seem to make a difference:

root@k8s-jamie-terraform-4c6ef93f-default-pool-mv3n:# modprobe -r kvm_intel
modprobe: FATAL: Module kvm_intel is in use.
root@k8s-jamie-terraform-4c6ef93f-default-pool-mv3n:# sudo modprobe kvm_intel nested=1
root@k8s-jamie-terraform-4c6ef93f-default-pool-mv3n:# kata-runtime kata-check
INFO[0000] CPU property found                            description="Intel Architecture CPU" name=GenuineIntel pid=25355 source=runtime type=attribute
INFO[0000] CPU property found                            description="Virtualization support" name=vmx pid=25355 source=runtime type=flag
INFO[0000] CPU property found                            description="64Bit CPU" name=lm pid=25355 source=runtime type=flag
INFO[0000] CPU property found                            description=SSE4.1 name=sse4_1 pid=25355 source=runtime type=flag
INFO[0000] kernel property found                         description="Intel KVM" name=kvm_intel pid=25355 source=runtime type=module
INFO[0000] Kernel property value correct                 description="Intel KVM" name=kvm_intel parameter=unrestricted_guest pid=25355 source=runtime type=module value=Y
WARN[0000] kernel module parameter has unexpected value  description="Intel KVM" expected=Y name=kvm_intel parameter=nested pid=25355 source=runtime type=module value=N
INFO[0000] kernel property found                         description="Host kernel accelerator for virtio" name=vhost pid=25355 source=runtime type=module
INFO[0000] kernel property found                         description="Host kernel accelerator for virtio network" name=vhost_net pid=25355 source=runtime type=module
INFO[0000] kernel property found                         description="Kernel-based Virtual Machine" name=kvm pid=25355 source=runtime type=module
INFO[0000] System is capable of running Kata Containers  name=kata-runtime pid=25355 source=runtime
INFO[0000] device available                              check-type=full device=/dev/kvm name=kata-runtime pid=25355 source=runtime
INFO[0000] feature available                             check-type=full feature=create-vm name=kata-runtime pid=25355 source=runtime
INFO[0000] System can currently create Kata Containers   name=kata-runtime pid=25355 source=runtime

@grahamwhaley Nope, it's just a standard GCE VM with nested virt enabled. More details:

canIpForward: true
cpuPlatform: Intel Haswell
creationTimestamp: '2018-06-12T02:20:32.075-07:00'
deletionProtection: false
disks:
- autoDelete: true
  boot: true
  deviceName: persistent-disk-0
  guestOsFeatures:
  - type: VIRTIO_SCSI_MULTIQUEUE
  index: 0
  interface: SCSI
  kind: compute#attachedDisk
  licenses:
  - https://www.googleapis.com/compute/v1/projects/vm-options/global/licenses/enable-vmx
  - https://www.googleapis.com/compute/v1/projects/ubuntu-os-cloud/global/licenses/ubuntu-1604-xenial
  mode: READ_WRITE
  type: PERSISTENT
kind: compute#instance
machineType: n1-standard-4 (4 vCPUs, 15 GB memory)
name: xxx
networkInterfaces:
- kind: compute#networkInterface
  name: nic0
  network: default
  networkIP: 10.142.0.8
  subnetwork: default
scheduling:
  automaticRestart: true
  onHostMaintenance: MIGRATE
  preemptible: false
startRestricted: false
status: RUNNING
zone: us-east1-c

@sboeuf

sboeuf commented Jun 12, 2018

@jamiehannaford did that modprobe -r actually work? You got a modprobe: FATAL: Module kvm_intel is in use error, so I would tend to think the module was not properly removed.

@sboeuf

sboeuf commented Jun 12, 2018

Can you see the qemu instance being started and living for some time?

I think @jon got Kata working on GCE a few weeks ago. @jon, are there any specific steps you had to go through when running Kata on GCE?
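(A quick, generic way to watch for the VM from the node while the kubelet retries the sandbox; the process-name pattern is just a guess, adjust as needed:)

$ watch -n1 'pgrep -af qemu'   # does a qemu process appear and then get killed?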

@jamiehannaford
Author

@sboeuf I killed all the Kata containers on that box, then ran modprobe -r kvm_intel and sudo modprobe kvm_intel nested=1, but it made no difference. Pods with resource requests still get stuck in ContainerCreating with this error message:

Failed create pod sandbox: rpc error: code = Unknown desc = container create failed: rpc error: code = Unavailable desc = transport is closing

I'm interested to know whether resource requests work at all, either in general or specifically on GCE.

FWIW, Pods without resource requests seem to be fine.

@sboeuf

sboeuf commented Jun 12, 2018

@jamiehannaford off the top of my head, I would say that if you request a small amount of resources (20MiB in your case, from the pod describe you posted), Kata will be asked to run the VM with too little memory for the kernel+agent to boot properly. Can you check the qemu command line issued when you try to start a Kata Containers pod with specific resources?

@jamiehannaford
Author

Can you check the qemu command line issued when you try to start a Kata Containers pod with specific resources.

Sure! How do I do that?

@jamiehannaford
Author

@sboeuf Oh that's interesting, when I bumped to

        resources:
          requests:
            memory: "500Mi"
            cpu: "500m"
          limits:
            memory: "500Mi"
            cpu: "500m"

it worked! So your hypothesis is probably correct. What are the minimum CPU/RAM requirements for the VM? We'd like to lock this down.

@sboeuf

sboeuf commented Jun 12, 2018

You can enable full logs by adding enable_debug = true to the /usr/share/defaults/kata-containers/configuration.toml config file, in the [hypervisor.qemu] section.
You'll then see the parameters QEMU was started with if you look into the journal with journalctl.
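For example, a rough sequence (the grep is just a convenience; any search over the journal output works):

$ sudo journalctl -o cat --since '10 minutes ago' | grep -i qemu   # shows the full QEMU command line once debug is enabled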

@grahamwhaley
Contributor

Oh, yeah, my brain didn't register that as 20MiB... yeah, that is too small to run the minimal Kata requirements, I expect - previously we have run small containers in 64MiB. I've not tried to see how small a memory allocation we need for a while though (so, if you happen to be doing a binary search/bisect on that @jamiehannaford - then please feel free to post your results ;-) )
Good spot @sboeuf ;-)
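(If you do end up bisecting, here's a crude sketch of the loop; the deployment name, namespace and label are the ones from this thread, and kubectl set resources / jsonpath are standard kubectl features:)

for m in 32 48 64 96 128 192 256; do
  kubectl -n kube-system set resources deployment/l7-default-backend \
    --requests=memory=${m}Mi --limits=memory=${m}Mi     # resize request/limit
  sleep 60                                              # give the replacement pod time to start
  echo "${m}Mi -> $(kubectl -n kube-system get pods -l k8s-app=glbc -o jsonpath='{.items[0].status.phase}')"
done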

@sboeuf

sboeuf commented Jun 12, 2018

Ok cool, that's good news; we definitely need to add some padding for the VM...
I'd say you want to add 100MiB as a safety margin on top of what your workload might use, but I'm really not sure about the numbers.

@jamiehannaford
Author

I'll try to figure it out through trial and error. Is there any way we can improve the error message for that? It seemed like a pretty random and nondeterministic error.

@sboeuf

sboeuf commented Jun 12, 2018

Feel free to open an issue for that ;)

@grahamwhaley
Contributor

@jamiehannaford yes, we should definitely look at improving that - perhaps best to open a new issue with a more accurate title now that we know the problem, and reference this issue in it (^^ I see @sboeuf echoed that ;-) )

@jodh-intel
Contributor

@jon has raised a PR for a GCE install guide, so it might be worth looking at kata-containers/documentation#154.

@jamiehannaford
Author

So I'm wondering how we can improve the communication between kubelet <-> cri-o <-> kata-runtime here so sandboxes aren't forever stuck in SANDBOX_NOTREADY. I think fixing this goes beyond just improving the error messages.
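(For reference, a quick way to list the stuck sandboxes from the node itself, assuming crictl is installed and the CRI-O socket is at its usual default path:)

$ sudo crictl --runtime-endpoint unix:///var/run/crio/crio.sock pods   # stuck sandboxes show up as NotReady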

If we trace the workflow:

  1. kubelet runs CreatePodSandbox here.

  2. which then invokes CRI-O RunPodSandbox here.

  3. it sets up all the networking, and gets to the point where it creates the container here.

  4. the VM is started and the proxy starts. I think it's at this point that the qemu-lite process is killed (question: what might do this?). The agent sometimes runs into problems here too.

  5. CRI-O launches conmon and waits for it to return here.

  6. conmon either returns a PID of -1, or just returns an error.

  7. This error is returned via CRI-O to the kubelet, which then reports that it couldn't perform CreatePodSandbox. But there seem to be different error responses:

    E0615 15:06:01.361808    4535 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "failure_default(d2aadf7c-7095-11e8-97ac-42010a8e000a)" failed: rpc error: code = Unknown desc = exit status 1
    E0615 14:59:27.887543    4535 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "failure_default(d2aadf7c-7095-11e8-97ac-42010a8e000a)" failed: rpc error: code = Unknown desc = signal: killed
    E0615 13:52:12.894431    4535 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "failure_default(d2aadf7c-7095-11e8-97ac-42010a8e000a)" failed: rpc error: code = Unknown desc = error reading container (probably exited) json message: EOF
    
  8. The kubelet keeps retrying container creation, which will always fail. kata-runtime then keeps trying to delete non-existent containers (I think it tries to reap unsuccessful starts).

@sboeuf

sboeuf commented Jun 15, 2018

Hi @jamiehannaford, nice and accurate analysis here. I'm not sure either why QEMU is killed, but regarding the last point (8), I think the behavior is caused by our mechanism for detecting whether a container has to be stopped by looking at the shim.
We check the containers we have stored on disk and whether we have associated shims running. If not, we call into container.stop() to make sure we properly remove the container from the agent, and that's where it might fail in this case.
To avoid such situations we need to implement proper rollback, which in this case I think means making sure we don't leave on-disk data for containers/pods that could not be created or started properly.

@jon

jon commented Jun 18, 2018

I've run Kata on GCE with Kubernetes, but not with CRI-O -- I used containerd for that experiment. From a qemu/kvm-intel perspective I didn't run into any issues at all (and thus didn't have to do any debugging :).

My experiments were with Ubuntu 18.04 -- I have not tried 16.04. If the agent is failing at network setup, I wonder if there are some subtle differences between distributions (or Calico versions?). In my experiment I ran Kubernetes 1.10 with containerd 1.1.0 and Calico 3.0. I bootstrapped the cluster using kubeadm (with the package from the Kubernetes Xenial repo).

Let me run through that experiment again to verify that it still works as advertised, and then we can maybe tease apart what the differences are.

@jon

jon commented Jun 19, 2018

Small update to my comment from yesterday -- I tried the single-node cluster bits and they no longer produce a functioning Kata cluster. The runtime installer DaemonSet tries to do its thing, but never succeeds, so unfortunately it seems there's currently no one-stop way to do this :(

I have not had time to diagnose whether the failure is Kata related, kubeadm related, or containerd related.

@gabibeyer
Contributor

Closing this issue since it seems to be resolved; it was related to not passing enough memory to the Kata VM for the kernel+agent to properly boot.

zklei pushed a commit to zklei/runtime that referenced this issue Jun 13, 2019
When memory is hot added the udevpath is:

 uevent-devpath=/devices/system/memory/memory81

But when the agent checks for the path it expects it
has to check in the /sys filesystem.

Fixes: kata-containers#388

Signed-off-by: Jose Carlos Venegas Munoz <[email protected]>
lifupan pushed a commit to lifupan/kata-runtime that referenced this issue Aug 5, 2020
config: there is no need to check vhost-vosck for FC