-
Notifications
You must be signed in to change notification settings - Fork 373
Add support for intel cloud hypervisor #2164
Conversation
/cc @sameo |
/test-ch |
test-ch |
/test-ch |
2 similar comments
/test-ch |
/test-ch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial review added, to make travis happy could you also add to your commit message
Fixes: #2046
pkg/katautils/config.go
Outdated
@@ -638,7 +639,6 @@ func newQemuHypervisorConfig(h hypervisor) (vc.HypervisorConfig, error) { | |||
GuestHookPath: h.guestHookPath(), | |||
}, nil | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: not needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
REMOVED
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's still here :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks good and I think we should clean this PR up and merge it before switching to the HTTP based API. IMO this PR needs to be simplified:
- We should get rid of the arch abstraction
- We should remove the
govmm
dependency
With those 2 simplification steps, we'll pretty much have a self contained implementation for the Cloud Hypervisor support.
virtcontainers/ich.go
Outdated
type CloudHypervisorState struct { | ||
sync.RWMutex | ||
state ichState | ||
PID int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PID here is redundant with what's in CloudHypervisorInfo
, I think.
I feel it would be cleaner to clearly separate the state from the information:
type CloudHypervisorInfo struct {
PID int
VirtiofsdPid int
UUID string
}
type CloudHypervisorState struct {
sync.RWMutex
state ichState
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed the govmm dependency and added a new CLI builder into the driver itself as agreed. I've also merged the info and state struct to simplify things and checked that the state in hypervisor.json is correct at runtime. I've also noticed that IF the hypervisor fails to start for whatever reason the virtiofsd obviously is not teared-down by the hypervisor as a consequence so I use now the stored virtiofsd PID to kill the it to avoid accumulating unused processes.
virtcontainers/ich_arch_base.go
Outdated
type ichArch interface { | ||
|
||
// ichPath returns the path to the Ich binary | ||
ichPath() (string, error) | ||
|
||
// kernelParameters returns the kernel parameters | ||
kernelParameters(debug bool) []Param | ||
|
||
// appendImage appends an image to devices | ||
appendImage(devices []govmmIch.Device, path string) ([]govmmIch.Device, error) | ||
|
||
// appendNetwork appends a endpoint device to devices | ||
appendNetwork(devices []govmmIch.Device, endpoint Endpoint) ([]govmmIch.Device, error) | ||
|
||
// appendHybridVSock appends a hybrid vsock PCI to devices | ||
appendHybridVSock(devices []govmmIch.Device, vsock types.HybridVSock) ([]govmmIch.Device, error) | ||
|
||
appendRNGDevice(devices []govmmIch.Device, rngDev config.RNGDev) ([]govmmIch.Device, error) | ||
|
||
appendSerialConsole(devices []govmmIch.Device, consoletype string) ([]govmmIch.Device, error) | ||
|
||
appendVirtioConsole(devices []govmmIch.Device, consoletype string) ([]govmmIch.Device, error) | ||
|
||
appendApiSocket(devices []govmmIch.Device, apiSocket string) ([]govmmIch.Device, error) | ||
|
||
appendVirtualFilesystem(devices []govmmIch.Device, tag string, path string) ([]govmmIch.Device, error) | ||
|
||
appendMemory(memorySize uint64, path string) (govmmIch.Memory, error) | ||
|
||
appendProcessors(vcpu uint32) (govmmIch.VCPU, error) | ||
|
||
//capabilities returns the capabilities supported by ICH | ||
capabilities() types.Capabilities | ||
|
||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I assume this is something that was taken from the qemu implementation, where the command line varies across architectures. With Cloud Hypervisor, we want to have a consistent HTTP API and hopefully command line, and I don't think adding such an abstraction is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inline with your previous comment I can remove this but bear in mind that the main reason I added it was not the abstraction but the state aggregation into a device list before generating the final command line argument. This mechanism has to stay in the driver somehow because this builds by the 'add' calls into the driver (nic's are being added late in the process for instance) after which the state is ready to be translated into a working hypervisor. So to cut it short, I still need some state aggregation in the driver for devices if we cut off the iovmm logic.
How do you see this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes, you're going to implement a way to aggregate devices and other parameters into a command line string for now (an HTTP payload later on). What I'm saying is that it should be part of the driver itself, not a govmm implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good, then we're on the same page. I'll fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
virtcontainers/ich.go
Outdated
ctx context.Context | ||
socketPath string | ||
ichConfig govmmIch.Config | ||
arch ichArchBase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in another comment, we don't need an arch specific interface for Cloud Hypervisor.
virtcontainers/ich_arch_base.go
Outdated
} | ||
|
||
type ichArchBase struct { | ||
machineType string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, we don't expect to have different kind of machine types.
virtcontainers/ich_arch_base.go
Outdated
kernelParams []Param | ||
memory govmmIch.Memory | ||
vcpu govmmIch.VCPU | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And the rest of the structure is pretty much arch agnostic from a CH standpoint.
virtcontainers/ich.go
Outdated
} | ||
|
||
// Add memory to the cloud hypervisor | ||
memory, err := ich.arch.appendMemory(ich.config.MemorySize, "/dev/shm") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without the arch abstraction, this could be a cloudHypervisor
method instead.
virtcontainers/ich.go
Outdated
if err != nil { | ||
return err | ||
} | ||
kernel := govmmIch.Kernel{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another comment here: Similar to the arch abstraction, I think the govmm implementation is also modeled after the current qemu implementation.
For Cloud Hypervisor, I don't think there's a need to go through a govmm abstraction here. The cloudHypervisor
structure would have a local array of cloudHypervisorDevice
, etc.
The Cloud Hypervisor API is simple enough that we don't need to "hide" its complextiy behind the govmm package, and add a dependency between the Kata support for Cloud Hypervisor and the govmm
package. This will simplify and speed the development process.
virtcontainers/ich.go
Outdated
// | ||
//##################################### | ||
|
||
func (ich *cloudHypervisor) createSandbox(ctx context.Context, id string, networkNS NetworkNamespace, hypervisorConfig *HypervisorConfig, vcStore *store.VCStore) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we're going to eventually switch to an HTTP API based model, we should make sure that this routine creates a generic enough Cloud Hypervisor configuration for startSandbox
to consume and either build a command line for the cloud-hypervisor
binary or build an HTTP payload and then send it through an HTTP command.
virtcontainers/ich.go
Outdated
// ichConfig.Devices, err = ich.arch.appendApiSocket(ichConfig.Devices, apiSocketPath) | ||
// if err != nil { | ||
// return err | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should remove commented out code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. moved to rev 2 of the driver to be done next week
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for raising @ericooper!
Given the amount of new code, it would be great if you could add a few unit tests.
We have a doc explaining our table-driven approach and there are of course lots of examples in this repo and other Kata ones:
pkg/katautils/config.go
Outdated
return vc.HypervisorConfig{}, err | ||
} | ||
|
||
if initrd != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Odd indent compared to surrounding code (tabs/spaces issue maybe?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will reformat
virtcontainers/ich.go
Outdated
"fmt" | ||
"net" | ||
"syscall" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this file with git diff
shows a lot of extraneous whitespace (shows in red). I'd consider running gofmt
over all the changed files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As soon as this initial version of the driver isn't the subject of major requested structural changes, I'll start working on the unit tests for clh.go
virtcontainers/ich.go
Outdated
ichTimeout = 10 | ||
ichSocket = "ich.sock" | ||
vfsdSocket = "virtiofsd.sock" | ||
apiSocket = "cloud-hypervisor.9166" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is 9166
? Can this be simply cloud-hypervisor.sock
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it will be removed.
virtcontainers/ich.go
Outdated
|
||
{"root", "/dev/vda1"}, | ||
{"panic", "1"}, | ||
{"init", "/usr/lib/systemd/systemd"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary? /sbin/init
always exists in our images and is either init binary or a link to it (systemd or the kata-agent
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, we use our own images for other reasons so I'll see what I can do.
virtcontainers/ich.go
Outdated
var thePid = 0 | ||
strErr, err, thePid = govmmIch.LaunchIch(ich.ichConfig) | ||
if err != nil { | ||
return fmt.Errorf("fail to launch cloud-hypervisor: %s, error messages from log: %s", err, strErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: s/fail/failed/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed.
virtcontainers/ich.go
Outdated
} | ||
|
||
func (ich *cloudHypervisor) getSandboxConsole(id string) (string, error) { | ||
ich.Logger().WithField("function", "getSandboxConsole").WithField("ID", id).Info("Get Sandbox Console") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Structured logging fields should be all lower case, so id
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know. will change
virtcontainers/ich.go
Outdated
|
||
func (ich *cloudHypervisor) getSandboxConsole(id string) (string, error) { | ||
ich.Logger().WithField("function", "getSandboxConsole").WithField("ID", id).Info("Get Sandbox Console") | ||
// TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please can you add in a GitHub issue URL here so that the comment is in the form:
// FIXME: https://github.com/kata-containers/runtime/issues/...
That way we have both the issue and the standardised comment as a reminder of what remains to be done.
virtcontainers/ich.go
Outdated
} | ||
|
||
func (ich *cloudHypervisor) getPids() []int { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extraneous blank lines (ere and elsewhere).
|
||
func (config *Config) appendKernel() { | ||
|
||
//config.ichParams = append(config.ichParams, "--cmdline", fmt.Sprintf("\"%s\"", config.Kernel.Params)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented out code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will be refactored.
virtcontainers/ich.go
Outdated
case <-time.After(timeoutDuration): | ||
err = fmt.Errorf("timed out waiting for cloud-hypervisor vsock") | ||
} | ||
// delay between vsock is available but hypervisor is bstill ooting and starting the kata-agent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// delay between vsock is available but hypervisor is bstill ooting and starting the kata-agent | |
// delay between vsock is available but hypervisor is still booting and starting the kata-agent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It must have been late ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevertheless, this is an important issue I think. What I observed is that without this hard-coded wait the runtime shuts down the hypervisor even before it was able to start the agent. It's the time between the moment the vsock is available but the agent inside the hypervisor isn't ready yet. Somehow grpc doesn't wait during that time period but aborts immediately and causes a premature shutdown. This should be improved I think. Anyone an opinion on this?
Makefile
Outdated
|
||
# ICH-specific options (all should be suffixed by "_ICH") | ||
# currently, huge pages are required for virtiofsd support | ||
DEFENABLEHUGEPAGES_ICH := true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think we want to set this as true, we got virtiofs to work without huge pages afaik.
cc @ganeshmaharaj
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was more thinking about DPDK which is widely adopted in the telco domain. How do you see this?
pkg/katautils/config.go
Outdated
@@ -1222,6 +1313,8 @@ func checkHypervisorConfig(config vc.HypervisorConfig) error { | |||
} | |||
} | |||
} | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra lines added. Please remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
virtcontainers/ich.go
Outdated
|
||
{"root", "/dev/vda1"}, | ||
{"panic", "1"}, | ||
{"init", "/usr/lib/systemd/systemd"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree.
virtcontainers/ich.go
Outdated
if err != nil { | ||
return err | ||
} | ||
ich.Logger().WithField("function", "startSandbox").Infof("Starting virtiofsd") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, shouldn't this log line be inside the condition?
virtcontainers/ich.go
Outdated
} | ||
} | ||
|
||
ich.Logger().WithField("function", "startSandbox").Infof("virtiofsd starts sharing") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, this should be logged only when ich.config.SharedFS = config.VirtioFS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed.
Hi @ericooper - thanks for updating. Given all the comments flying around, please could you post a comment here when you think the PR is generally ready for re-review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking - adding some more comments to not forget in the refactor
virtcontainers/ich.go
Outdated
if ich.config.Debug { | ||
ich.Logger().WithField("source", "virtiofsd").Debug(scanner.Text()) | ||
} | ||
if !sent && strings.Contains(scanner.Text(), "Waiting for vhost-user socket connection...") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks that would be broken at any time virtiofsd changes the log, I wonder if woudl be a better way to check is ready
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not specific to my driver commit. it's the same in the other drivers using virtiofsd. I would like to keep it this way and focus on the essentials first until this is improved in general.
virtcontainers/ich_arch_base.go
Outdated
Tag: tag, | ||
Path: path, | ||
NumQueues: 1, | ||
QueueSize: 512, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: Add config option for this
cacheSize := "0Gib" | ||
if(vfsdev.Dax) { | ||
daxParam = "on" | ||
cacheSize = "8Gib" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to big to be hardcoded
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the default size of the hypervisor itself ie not my idea.
I think we have a misalignment of expectations here. This is a WIP for a hypervisor which itself is still in alfa stage and under heavy development. I've developed and published this driver to be able to start experimenting with cloud-hypervisor as soon as possible and to propell adoption of the cloud-hypervisor. I would like to focus first on the essentials and get it going and do the nitty gritty cosmetics at the end. This approach has proven to fail which is something I have to deal with. |
I think @jodh-intel comment was very much aligned with that approach. He was saying that it's up to you to decide which comments you want to address and then decide when the PR is ready for a second round of reviews. I also agree with the approach of getting something that's acceptable and working merged first and then work on some of the nits/details from all the reviews that you got. |
cli/config/configuration-ich.toml.in
Outdated
# Shared file system type: | ||
# - virtio-9p (do not use) | ||
# - virtio-fs (default & must) | ||
shared_fs = "@DEFSHAREDFS_ICH@" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC there is no virtio-9p
for cloud hypervisor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed both vsock and file-sharing options from Makefile, config template and hardwired them in newClhHypervisorConfig accordingly.
cli/config/configuration-ich.toml.in
Outdated
# If true and vsocks are supported, use vsocks to communicate directly | ||
# with the agent (no proxy is started). | ||
# Default true & must. cloud-hypervisor only supports vsock based agent communication | ||
use_vsock = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since cloud hypervisor must use vsock, we should not make it an option at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed both vsock and file-sharing options from Makefile, config template and hardwired them in newClhHypervisorConfig accordingly.
Hi @ericooper - this branch needs a rebase to resolve a conflict in |
cli/config/configuration-ich.toml
Outdated
virtio_fs_daemon = "/usr/bin/virtiofsd-x86_64" | ||
|
||
# cloud-hypervisor does not support virtio-fs caching | ||
virtio_fs_cache = "none" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does and it should be used for better performances.
cli/config/configuration-ich.toml
Outdated
[agent.kata] | ||
# If enabled, make the agent display debug-level messages. | ||
# (default: disabled) | ||
enable_debug = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need the debug to be enabled by default?
pkg/rootless/rootless.go
Outdated
@@ -90,7 +90,7 @@ func setRootless() error { | |||
if err != nil { | |||
return parseError | |||
} | |||
rangeUID, err := strconv.ParseUint(ids[1], 10, 0) | |||
rangeUID, err := strconv.ParseUint(ids[2], 10, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please submit this fix as its own separate commit since it's not directly related to Cloud-Hypervisor support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fix has already landed in master
so this change is no longer needed.
virtcontainers/clh.go
Outdated
} | ||
|
||
func (clh *cloudHypervisor) hotplugAddDevice(devInfo interface{}, devType deviceType) (interface{}, error) { | ||
clh.Logger().WithField("function", "hotplugAddDevice").Info("Add hotplug device") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The message should indicate that it's not supported. And I would suggest to use a Warn
instead of Info
.
virtcontainers/clh.go
Outdated
"function": "resizeVCPUs", | ||
"curr": currentVCPUs, | ||
"new": newVCPUs, | ||
}).Info("resize the VCPU's ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We seem to have lost the explanation for what the ditto means here @sboeuf ;)
virtcontainers/clh.go
Outdated
return nil | ||
} | ||
|
||
func (clh *cloudHypervisor) LaunchIch() (string, error, int) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another occurence of ich
|
||
// Add the hybrid vsock device to hypervisor | ||
clh.cliBuilder.SetVsock(&CLIVsock{ | ||
cid: 3, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't be the context ID provided by Kata? (instead of being hardcoded)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW Firecracker also uses a hardcoded 3
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't that imply you cannot run a FC container and a CLH container on the same host then? Could we use VMADDR_CID_ANY
(-1U
) as documented in vsock(7)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am adding quite a few comments, but I think all of them are small/details/nits. I think the architecture looks good now, and I like the standalone clh.go driver. We're getting quite close, thanks.
Makefile
Outdated
@@ -600,11 +640,12 @@ $(GENERATED_FILES): %: %.in $(MAKEFILE_LIST) VERSION .git-commit | |||
-e "s|@SHIMPATH@|$(SHIMPATH)|g" \ | |||
-e "s|@DEFVCPUS@|$(DEFVCPUS)|g" \ | |||
-e "s|@DEFMAXVCPUS@|$(DEFMAXVCPUS)|g" \ | |||
-e "s|@DEFMAXVCPUS_ACRN@|$(DEFMAXVCPUS_ACRN)|g" \ | |||
-e "s|@DEFMAXVCPUS_ACRN@|$(DEFMAXVCPUS_ACRN)|g" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please avoid modifying this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get it. Which line do you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ericooper - the whitespace in Makefile
(line 643 in your version) has been changed, but this looks like an oversight as the actual content hasn't changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
cli/config/configuration-clh.toml.in
Outdated
# to the proxy logs, but only when proxy debug is also enabled. | ||
# | ||
# Default false | ||
enable_debug = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to comment that one out as a default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed.
pkg/katautils/config.go
Outdated
@@ -638,7 +639,6 @@ func newQemuHypervisorConfig(h hypervisor) (vc.HypervisorConfig, error) { | |||
GuestHookPath: h.guestHookPath(), | |||
}, nil | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's still here :)
virtcontainers/clh.go
Outdated
// First take the default parameters defined by this driver | ||
clh.cliBuilder.AddKernelParameters(clhKernelParams) | ||
|
||
// Followed by extra debug parameters if debug enabled in configuration file | ||
if clh.config.Debug == true { | ||
clh.cliBuilder.AddKernelParameters(clhDebugKernelParams) | ||
} | ||
|
||
// Followed by extra debug parameters defined in the configuration file | ||
clh.cliBuilder.AddKernelParameters(clh.config.KernelParams) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this could be grouped into a single func (d *DefaultCLIBuilder) AddKernelParameters(hypervisorConfig *HypervisorConfig)
routine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only a few nits to address, and the virtiofs caching enabled by default. Getting closer.
cli/config/configuration-clh.toml.in
Outdated
# cloud-hypervisor does not support virtio-fs caching | ||
virtio_fs_cache = "none" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right, the documentation is outdated (It would not be proper documentation otherwise, wouldn't it ;-)).
I can confirm that caching is now enabled, and we systematically test both cases as part of our CI. See https://github.com/intel/cloud-hypervisor/blob/master/src/main.rs#L1775 for example.
I'll open an issue against the doc and we'll fix it. Meanwhile, as suggested by @sboeuf, I think your PR should support caching by default. Performance is much better with DAX on.
pkg/katautils/config.go
Outdated
@@ -1222,7 +1309,6 @@ func checkHypervisorConfig(config vc.HypervisorConfig) error { | |||
} | |||
} | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that you're removing an empty line here, which is not something that's related to the PR. It adds noise to the patch/PR, so I'm suggesting to keep the empty line there.
if clh.version.Major < supportedMajorVersion && clh.version.Minor < supportedMinorVersion { | ||
errorMessage := fmt.Sprintf("Unsupported version: cloud-hypervisor %d.%d not supported by this driver version (%d.%d)", | ||
clh.version.Major, | ||
clh.version.Minor, | ||
supportedMajorVersion, | ||
supportedMinorVersion) | ||
return errors.New(errorMessage) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it is a fact that any binary below 0.3.0 the driver will fail to drive due to changes in the cli.
I agree. I don't have a strong opinion on that one, we can keep it.
I would actually like to see a different versioning for both cli and http api
That one I disagree with. It's ok to break the CLI/API between minor versions, especially with such a new project, but the CLI and HTTP APIs should be kept in sync when they map. There will be HTTP API endpoints that don't have a CLI counterpart (e.g. the hotplug ones), but for those who do they should be consistent.
|
||
// Add the hybrid vsock device to hypervisor | ||
clh.cliBuilder.SetVsock(&CLIVsock{ | ||
cid: 3, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW Firecracker also uses a hardcoded 3
.
cli/config/configuration-clh.toml.in
Outdated
# cloud-hypervisor does not support virtio-fs caching | ||
virtio_fs_cache = "none" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ericooper: Can we set DisableVhostNet to true? I don't think we need it to be configurable, and today AFAICT the PR doesn't set it in the hypervisor config from TOML/Annotations anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ericooper changed the kernel to use kata virtiofs kernel build, feel free to drop or squash the commit
Thanks for fixing the PR. The change should not be dropped since it is expected that if we want CloudHypervisor working with Kata, Kata will have to use virtio-fs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating again @ericooper. There are only a few unresolved questions now so we're getting close!
As shown in https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md#updating-your-pr-based-on-review-comments, please can you squash all your commits down into one (since they all relate to the single task of adding CLH support). Once done, this PR will then have two commits: @jcvenegas's Makefile change and your single commit to add CLH support.
virtcontainers/clh.go
Outdated
@@ -416,7 +448,7 @@ func (clh *cloudHypervisor) addDevice(devInfo interface{}, devType deviceType) e | |||
}) | |||
|
|||
default: | |||
clh.Logger().WithField("function", "addDevice").Warn("Add device of type %v is not supported.", v) | |||
clh.Logger().WithField("function", "addDevice").Warn(fmt.Sprintf("Add device of type %v is not supported.", v)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: You can simplify this to use:
.Warnf("Add device of type %v is not supported.", v)
pkg/rootless/rootless.go
Outdated
@@ -90,7 +90,7 @@ func setRootless() error { | |||
if err != nil { | |||
return parseError | |||
} | |||
rangeUID, err := strconv.ParseUint(ids[1], 10, 0) | |||
rangeUID, err := strconv.ParseUint(ids[2], 10, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fix has already landed in master
so this change is no longer needed.
|
||
// Add the hybrid vsock device to hypervisor | ||
clh.cliBuilder.SetVsock(&CLIVsock{ | ||
cid: 3, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't that imply you cannot run a FC container and a CLH container on the same host then? Could we use VMADDR_CID_ANY
(-1U
) as documented in vsock(7)
?
virtcontainers/clh.go
Outdated
clh.Logger().Info("Cloud Hypervisor stopped") | ||
clh.state.PID = 0 | ||
clh.state.VirtiofsdPID = 0 | ||
clh.state.state = clhNotReady |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought: This might benefit from a cloudHypervisor.reset()
helper method (which possibly calls CloudHypervisorState.reset()
or NewCloudHypervisorState()
or similar). This might help avoid bugs if we ever extend CloudHypervisorState
.
} | ||
|
||
// Send a SIGTERM to the VM process to try to stop it properly | ||
if err = syscall.Kill(pid, syscall.SIGTERM); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having two kills like this is potentially racy. I'd remove the Kill(0)
and instead return nil
if Kill(SIGTERM)
fails with ESRCH
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be handled with a follow-on PR since this idiom is used for the other hypervisor implementations too.
virtcontainers/clh.go
Outdated
return 0, err | ||
} | ||
|
||
clh.Logger().WithField("Path", clh.config.VirtioFSDaemon).Info() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: All our log fields should be lower-case.
virtcontainers/clh.go
Outdated
"function": "resizeVCPUs", | ||
"curr": currentVCPUs, | ||
"new": newVCPUs, | ||
}).Info("resize the VCPU's ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We seem to have lost the explanation for what the ditto means here @sboeuf ;)
Thanks -- I'll fix the merge conflict, then let's merge! |
/test |
/test-ch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ericooper!
Codecov Report
@@ Coverage Diff @@
## master #2164 +/- ##
=========================================
- Coverage 51.25% 49.35% -1.9%
=========================================
Files 110 111 +1
Lines 15114 15791 +677
=========================================
+ Hits 7747 7794 +47
- Misses 6410 7032 +622
- Partials 957 965 +8 |
/test |
Initial release of cloud hypervisor driver for kata-runtime Fixes: #2046 Signed-off-by: Johan Kuijpers <[email protected]>
use virtiofs kernel to allow boot kata. Signed-off-by: Jose Carlos Venegas Munoz <[email protected]>
/test |
if elapsed < timeoutDuration { | ||
timeout = timeout - int(elapsed.Seconds()) | ||
} else { | ||
timeout = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't error be returned in this case since the wait times out ?
|
||
sourcePath := filepath.Join(kataHostSharedDir(), clh.id) | ||
if _, err := os.Stat(sourcePath); os.IsNotExist(err) { | ||
os.MkdirAll(sourcePath, os.ModePerm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better check return value and log accordingly.
"-o", "source=" + sourcePath, | ||
"-o", "cache=" + clh.config.VirtioFSCache} | ||
|
||
if len(clh.config.VirtioFSExtraArgs) != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check that VirtioFSExtraArgs contains none of the 3 args from line 665 to 667 ?
|
||
func (clh *cloudHypervisor) storeState() error { | ||
if clh.store != nil { | ||
if err := clh.store.Store(store.Hypervisor, clh.state); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need to check the return value - we can directly return it to caller.
conn, err := net.DialUnix("unix", nil, addr) | ||
|
||
if err != nil { | ||
time.Sleep(50 * time.Millisecond) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly log after continuously sleeping 200 times.
err = fmt.Errorf("timed out waiting for cloud-hypervisor vsock") | ||
} | ||
|
||
time.Sleep(1000 * time.Millisecond) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We know the ready status, why is this needed ?
Initial release of cloud hypervisor driver for kata-runtime
Related to issue 'Add support for cloud hypervisor 2046'
Signed-off-by: Johan Kuijpers [email protected]