This repository has been archived by the owner on May 12, 2021. It is now read-only.

Do not require nested vt #820

Closed
bergwolf opened this issue Oct 15, 2018 · 5 comments

Comments

@bergwolf
Member

Kata Containers does not actually require nested VT to run, and the shipped guest kernel does not enable CONFIG_KVM either. So there is little point in having kata-check fail when nested VT is not present.

Let's remove this dependency and see if it fixes the recent CI failures. Some IaaS cloud vendors might decide not to offer nested VT to users and remove the corresponding code from the kvm-intel kernel module. That might be what we are seeing in the recent CI failures: the kvm-intel kernel module is present but its nested parameter is missing, e.g.:

time="2018-10-15T02:02:06Z" level=info msg="kernel property found" arch=amd64 description="Intel KVM" name=kvm_intel pid=16729 source=runtime type=module
time="2018-10-15T02:02:06Z" level=error msg="open /sys/module/kvm_intel/parameters/nested: no such file or directory" arch=amd64 name=kata-runtime pid=16729 source=runtime
open /sys/module/kvm_intel/parameters/nested: no such file or directory
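
The failure mode in the log above, a loaded `kvm_intel` module with no `nested` parameter file, can be reported as "unknown" instead of being treated as a fatal error. Below is a minimal sketch in Python, not kata-runtime's actual Go code. The helper names `read_module_param` and `nested_vt_status` and the `sysfs_root` argument are hypothetical and exist only for illustration.

```python
from pathlib import Path
from typing import Optional


def read_module_param(module: str, param: str,
                      sysfs_root: str = "/sys/module") -> Optional[str]:
    """Return the value of a kernel module parameter, or None if it is absent.

    Hypothetical helper: a missing file is an expected outcome here,
    not an error, because a vendor kernel may drop the parameter.
    """
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        return None


def nested_vt_status(sysfs_root: str = "/sys/module") -> str:
    """Classify nested VT as 'enabled', 'disabled' or 'unknown'."""
    value = read_module_param("kvm_intel", "nested", sysfs_root)
    if value is None:
        # Module not loaded, or loaded without the 'nested' parameter.
        return "unknown"
    # Accept both Y/N and 1/0 spellings of a boolean parameter.
    return "enabled" if value in ("Y", "1") else "disabled"
```

A caller could log the "unknown" result at info level and continue, since nested VT is not needed to run the containers themselves.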

bergwolf added a commit to bergwolf/kata-runtime that referenced this issue Oct 15, 2018
We do not really require nested VT to run kata containers. Let's not
depend on it being there.

Fixes: kata-containers#820

Signed-off-by: Peng Tao <[email protected]>
@bergwolf
Member Author

Well, after some experimenting in #819, it turns out the CI machines are sometimes simply missing vmx support:

time="2018-10-16T10:09:10Z" level=error msg="CPU property not found" arch=amd64 description="Virtualization support" name=vmx pid=12839 source=runtime type=flag

And sometimes a machine has vmx but lacks nested VT:

time="2018-10-15T02:02:06Z" level=error msg="CPU property not found" arch=amd64 description="Virtualization support" name=vmx pid=16729 source=runtime type=flag
time="2018-10-15T02:02:06Z" level=info msg="CPU property found" arch=amd64 description="64Bit CPU" name=lm pid=16729 source=runtime type=flag
time="2018-10-15T02:02:06Z" level=info msg="CPU property found" arch=amd64 description=SSE4.1 name=sse4_1 pid=16729 source=runtime type=flag
time="2018-10-15T02:02:06Z" level=info msg="kernel property found" arch=amd64 description="Intel KVM" name=kvm_intel pid=16729 source=runtime type=module
time="2018-10-15T02:02:06Z" level=error msg="open /sys/module/kvm_intel/parameters/nested: no such file or directory" arch=amd64 name=kata-runtime pid=16729 source=runtime

Other times, it has vmx but lacks the vhost/vhost-net kernel modules:

time="2018-10-16T07:27:35Z" level=info msg="CPU property found" arch=amd64 description="Virtualization support" name=vmx pid=14362 source=runtime type=flag
time="2018-10-16T07:27:35Z" level=info msg="CPU property found" arch=amd64 description="64Bit CPU" name=lm pid=14362 source=runtime type=flag
time="2018-10-16T07:27:35Z" level=info msg="kernel property found" arch=amd64 description="Intel KVM" name=kvm_intel pid=14362 source=runtime type=module
time="2018-10-16T07:27:35Z" level=info msg="Kernel property value correct" arch=amd64 description="Intel KVM" name=kvm_intel parameter=unrestricted_guest pid=14362 source=runtime type=module value=Y
time="2018-10-16T07:27:35Z" level=info msg="kernel property found" arch=amd64 description="Kernel-based Virtual Machine" name=kvm pid=14362 source=runtime type=module
time="2018-10-16T07:27:35Z" level=error msg="kernel property not found" arch=amd64 description="Host kernel accelerator for virtio" name=vhost pid=14362 source=runtime type=module
time="2018-10-16T07:27:35Z" level=error msg="kernel property not found" arch=amd64 description="Host kernel accelerator for virtio network" name=vhost_net pid=14362 source=runtime type=module
time="2018-10-16T07:27:35Z" level=error msg="ERROR: System is not capable of running Kata Containers" arch=amd64 name=kata-runtime pid=14362 source=runtime
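
To compare runs like the ones above, it can help to pull only the failing checks out of the kata-check output. The following is a rough sketch in Python. It assumes the `key=value` / `key="quoted value"` layout seen in these logs and that values contain no stray quote characters. The function names `parse_log_line` and `failed_checks` are hypothetical.

```python
import shlex
from typing import Dict, Iterable, List, Tuple


def parse_log_line(line: str) -> Dict[str, str]:
    """Split one key=value log line into a dict of fields."""
    fields = {}
    # shlex handles the double-quoted values (msg="...", description="...").
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def failed_checks(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (name, msg) for every line logged at level=error."""
    failures = []
    for line in lines:
        fields = parse_log_line(line)
        if fields.get("level") == "error":
            failures.append((fields.get("name", ""), fields.get("msg", "")))
    return failures
```

Feeding each CI run's output through `failed_checks` makes it easier to see whether a run failed on `vmx`, on the `nested` parameter, or on `vhost`/`vhost_net`.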

#819 should be able to handle the missing nested vt case. But for the missing vmx case, we have to fail. So, to solve the recent CI failures, I think we should:

  1. configure the CI to allocate test VMs with vmx support
  2. control the host kernel package used in CI so that the vhost/vhost-net kernel modules are always provided

@jcvenegas @grahamwhaley do we have control over these two things?

@grahamwhaley
Contributor

@bergwolf - no, we don't have that ability. AFAIK, the bottom line on vexxhost is that the Kata CI is configured to run only on machines in a certain domain (cluster?), and all machines there should have nesting enabled. Not all machines across the whole of vexxhost support nesting, you see, so our CI should be pinned to the cluster that does.
But it seems that we are either getting scheduled on machines outside of that cluster, or some machines in that cluster don't have nesting enabled right now.

@mnaser knows the real details, and has the ability to debug and set up the config etc. @mnaser - can you help here, or are you able to assign somebody else to help? This is really impacting our CI right now.

thanks!

@bergwolf
Member Author

@grahamwhaley I think there are three things we want to look at right now:

  1. the test VM is missing vmx capability: we need help from the vexxhost side to make sure our VMs are scheduled in the proper domain (cluster).
  2. the test VM has vmx capability but lacks nested VT capability: we need to stop requiring nested VT, as is done in #819 (kata-check: do not require nested vt).
  3. the test VM has vmx capability but lacks the vhost kernel modules: we need to make sure we are always testing against the same host kernel. I thought we had already ensured this. Does it mean kata-check is run before we switch the host kernel to the one specified in CI?
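
The three cases above can be written down as a small triage helper. This is only an illustrative sketch in Python: `cpu_has_flag` and `triage` are hypothetical names, and the action strings paraphrase the list above instead of quoting any real kata-check output.

```python
from typing import List


def cpu_has_flag(cpuinfo_text: str, flag: str) -> bool:
    """Check the first 'flags' line of /proc/cpuinfo-style text for a flag."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return flag in value.split()
    return False


def triage(has_vmx: bool, has_nested: bool, has_vhost: bool) -> List[str]:
    """Map the observed host state to the follow-up actions from the list."""
    if not has_vmx:
        # Case 1: nothing else matters until the VM lands on a capable host.
        return ["vmx missing: ask the hosting provider to schedule the VM "
                "on a nesting-capable cluster"]
    actions = []
    if not has_nested:
        # Case 2: not fatal for running containers.
        actions.append("nested VT missing: kata-check should not require it")
    if not has_vhost:
        # Case 3: points at the host kernel used for the run.
        actions.append("vhost modules missing: verify which host kernel is "
                       "running when kata-check executes")
    return actions
```

An empty result means none of the three known failure modes applies to that run.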

@grahamwhaley
Contributor

Hi @bergwolf I see on slack chat that hopefully (1) is now fixed.
Hence, I just nudged a CI build for #819.. let's see how that goes.
For (3) - I thought we had auto-module probe/load as well - @chavafg can you comment?

@bergwolf
Member Author

@grahamwhaley

For (3) - I thought we had auto-module probe/load as well

We do. That's why I think the following error means the host kernel itself doesn't ship the vhost/vhost-net modules:

time="2018-10-16T07:27:35Z" level=error msg="kernel property not found" arch=amd64 description="Host kernel accelerator for virtio" name=vhost pid=14362 source=runtime type=module
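
One way to tell "module not loaded" apart from "module not shipped by the host kernel" is to look the module up in the kernel's module index files. The sketch below is in Python and assumes the usual `/lib/modules/<release>/modules.dep` and `modules.builtin` files are present. The function names `module_in_index` and `host_kernel_ships` are hypothetical.

```python
import platform
from pathlib import Path


def module_in_index(name: str, index_text: str) -> bool:
    """Return True if a modules.dep or modules.builtin listing names the module."""
    wanted = name.replace("-", "_")
    for line in index_text.splitlines():
        # modules.dep lines look like "path/to/mod.ko: deps..."; keep the path.
        path = line.split(":", 1)[0].strip()
        base = path.rsplit("/", 1)[-1]
        # Strip ".ko" plus any compression suffix such as ".ko.xz".
        stem = base.split(".ko", 1)[0]
        if base != stem and stem.replace("-", "_") == wanted:
            return True
    return False


def host_kernel_ships(name: str, modules_root: str = "/lib/modules") -> bool:
    """Check the running kernel's index files for a loadable or built-in module."""
    release_dir = Path(modules_root) / platform.release()
    for index in ("modules.dep", "modules.builtin"):
        try:
            text = (release_dir / index).read_text()
        except OSError:
            continue  # Index file missing or unreadable; try the next one.
        if module_in_index(name, text):
            return True
    return False
```

If `host_kernel_ships("vhost_net")` is false on a CI machine, auto-loading cannot help and the host kernel package is the thing to fix.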
