Skip to content
This repository has been archived by the owner on May 12, 2021. It is now read-only.

Add support for intel cloud hypervisor #2164

Merged
merged 2 commits into from
Nov 15, 2019
Merged

Add support for intel cloud hypervisor #2164

merged 2 commits into from
Nov 15, 2019

Conversation

ericooper
Copy link
Contributor

Initial release of cloud hypervisor driver for kata-runtime

Related to issue 'Add support for cloud hypervisor 2046'

Signed-off-by: Johan Kuijpers [email protected]

@egernst
Copy link
Member

egernst commented Nov 5, 2019

/cc @sameo

@jcvenegas
Copy link
Member

/test-ch

@jcvenegas
Copy link
Member

test-ch

@jcvenegas
Copy link
Member

/test-ch

2 similar comments
@jcvenegas
Copy link
Member

/test-ch

@jcvenegas
Copy link
Member

/test-ch

Copy link
Member

@jcvenegas jcvenegas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial review added, to make travis happy could you also add to your commit message

Fixes: #2046

@@ -638,7 +639,6 @@ func newQemuHypervisorConfig(h hypervisor) (vc.HypervisorConfig, error) {
GuestHookPath: h.guestHookPath(),
}, nil
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: not needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REMOVED

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's still here :)

Copy link

@sameo sameo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good and I think we should clean this PR up and merge it before switching to the HTTP based API. IMO this PR needs to be simplified:

  • We should get rid of the arch abstraction
  • We should remove the govmm dependency

With those 2 simplification steps, we'll pretty much have a self contained implementation for the Cloud Hypervisor support.

type CloudHypervisorState struct {
sync.RWMutex
state ichState
PID int
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PID here is redundant with what's in CloudHypervisorInfo, I think.

I feel it would be cleaner to clearly separate the state from the information:

type CloudHypervisorInfo struct {
	PID int
        VirtiofsdPid int
        UUID string
}

type CloudHypervisorState struct {
	sync.RWMutex
	state ichState
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved.

Copy link
Contributor Author

@ericooper ericooper Nov 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the govmm dependency and added a new CLI builder into the driver itself as agreed. I've also merged the info and state struct to simplify things and checked that the state in hypervisor.json is correct at runtime. I've also noticed that IF the hypervisor fails to start for whatever reason the virtiofsd obviously is not teared-down by the hypervisor as a consequence so I use now the stored virtiofsd PID to kill the it to avoid accumulating unused processes.

Comment on lines 21 to 55
type ichArch interface {

// ichPath returns the path to the Ich binary
ichPath() (string, error)

// kernelParameters returns the kernel parameters
kernelParameters(debug bool) []Param

// appendImage appends an image to devices
appendImage(devices []govmmIch.Device, path string) ([]govmmIch.Device, error)

// appendNetwork appends a endpoint device to devices
appendNetwork(devices []govmmIch.Device, endpoint Endpoint) ([]govmmIch.Device, error)

// appendHybridVSock appends a hybrid vsock PCI to devices
appendHybridVSock(devices []govmmIch.Device, vsock types.HybridVSock) ([]govmmIch.Device, error)

appendRNGDevice(devices []govmmIch.Device, rngDev config.RNGDev) ([]govmmIch.Device, error)

appendSerialConsole(devices []govmmIch.Device, consoletype string) ([]govmmIch.Device, error)

appendVirtioConsole(devices []govmmIch.Device, consoletype string) ([]govmmIch.Device, error)

appendApiSocket(devices []govmmIch.Device, apiSocket string) ([]govmmIch.Device, error)

appendVirtualFilesystem(devices []govmmIch.Device, tag string, path string) ([]govmmIch.Device, error)

appendMemory(memorySize uint64, path string) (govmmIch.Memory, error)

appendProcessors(vcpu uint32) (govmmIch.VCPU, error)

//capabilities returns the capabilities supported by ICH
capabilities() types.Capabilities

}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I assume this is something that was taken from the qemu implementation, where the command line varies across architectures. With Cloud Hypervisor, we want to have a consistent HTTP API and hopefully command line, and I don't think adding such an abstraction is necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline with your previous comment I can remove this but bear in mind that the main reason I added it was not the abstraction but the state aggregation into a device list before generating the final command line argument. This mechanism has to stay in the driver somehow because this builds by the 'add' calls into the driver (nic's are being added late in the process for instance) after which the state is ready to be translated into a working hypervisor. So to cut it short, I still need some state aggregation in the driver for devices if we cut off the iovmm logic.

How do you see this?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, you're going to implement a way to aggregate devices and other parameters into a command line string for now (an HTTP payload later on). What I'm saying is that it should be part of the driver itself, not a govmm implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good, then we're on the same page. I'll fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

ctx context.Context
socketPath string
ichConfig govmmIch.Config
arch ichArchBase
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in another comment, we don't need an arch specific interface for Cloud Hypervisor.

}

type ichArchBase struct {
machineType string
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, we don't expect to have different kind of machine types.

kernelParams []Param
memory govmmIch.Memory
vcpu govmmIch.VCPU
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the rest of the structure is pretty much arch agnostic from a CH standpoint.

}

// Add memory to the cloud hypervisor
memory, err := ich.arch.appendMemory(ich.config.MemorySize, "/dev/shm")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without the arch abstraction, this could be a cloudHypervisor method instead.

if err != nil {
return err
}
kernel := govmmIch.Kernel{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another comment here: Similar to the arch abstraction, I think the govmm implementation is also modeled after the current qemu implementation.
For Cloud Hypervisor, I don't think there's a need to go through a govmm abstraction here. The cloudHypervisor structure would have a local array of cloudHypervisorDevice, etc.

The Cloud Hypervisor API is simple enough that we don't need to "hide" its complextiy behind the govmm package, and add a dependency between the Kata support for Cloud Hypervisor and the govmm package. This will simplify and speed the development process.

//
//#####################################

func (ich *cloudHypervisor) createSandbox(ctx context.Context, id string, networkNS NetworkNamespace, hypervisorConfig *HypervisorConfig, vcStore *store.VCStore) error {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we're going to eventually switch to an HTTP API based model, we should make sure that this routine creates a generic enough Cloud Hypervisor configuration for startSandbox to consume and either build a command line for the cloud-hypervisor binary or build an HTTP payload and then send it through an HTTP command.

// ichConfig.Devices, err = ich.arch.appendApiSocket(ichConfig.Devices, apiSocketPath)
// if err != nil {
// return err
// }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should remove commented out code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. moved to rev 2 of the driver to be done next week

Copy link
Contributor

@jodh-intel jodh-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for raising @ericooper!

Given the amount of new code, it would be great if you could add a few unit tests.

We have a doc explaining our table-driven approach and there are of course lots of examples in this repo and other Kata ones:

return vc.HypervisorConfig{}, err
}

if initrd != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Odd indent compared to surrounding code (tabs/spaces issue maybe?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will reformat

"fmt"
"net"
"syscall"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this file with git diff shows a lot of extraneous whitespace (shows in red). I'd consider running gofmt over all the changed files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As soon as this initial version of the driver isn't the subject of major requested structural changes, I'll start working on the unit tests for clh.go

ichTimeout = 10
ichSocket = "ich.sock"
vfsdSocket = "virtiofsd.sock"
apiSocket = "cloud-hypervisor.9166"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is 9166? Can this be simply cloud-hypervisor.sock?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will be removed.


{"root", "/dev/vda1"},
{"panic", "1"},
{"init", "/usr/lib/systemd/systemd"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary? /sbin/init always exists in our images and is either init binary or a link to it (systemd or the kata-agent).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we use our own images for other reasons so I'll see what I can do.

var thePid = 0
strErr, err, thePid = govmmIch.LaunchIch(ich.ichConfig)
if err != nil {
return fmt.Errorf("fail to launch cloud-hypervisor: %s, error messages from log: %s", err, strErr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/fail/failed/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed.

}

func (ich *cloudHypervisor) getSandboxConsole(id string) (string, error) {
ich.Logger().WithField("function", "getSandboxConsole").WithField("ID", id).Info("Get Sandbox Console")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Structured logging fields should be all lower case, so id here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know. will change


func (ich *cloudHypervisor) getSandboxConsole(id string) (string, error) {
ich.Logger().WithField("function", "getSandboxConsole").WithField("ID", id).Info("Get Sandbox Console")
// TODO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please can you add in a GitHub issue URL here so that the comment is in the form:

// FIXME: https://github.com/kata-containers/runtime/issues/...

That way we have both the issue and the standardised comment as a reminder of what remains to be done.

}

func (ich *cloudHypervisor) getPids() []int {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extraneous blank lines (ere and elsewhere).


func (config *Config) appendKernel() {

//config.ichParams = append(config.ichParams, "--cmdline", fmt.Sprintf("\"%s\"", config.Kernel.Params))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented out code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will be refactored.

case <-time.After(timeoutDuration):
err = fmt.Errorf("timed out waiting for cloud-hypervisor vsock")
}
// delay between vsock is available but hypervisor is bstill ooting and starting the kata-agent
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// delay between vsock is available but hypervisor is bstill ooting and starting the kata-agent
// delay between vsock is available but hypervisor is still booting and starting the kata-agent

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It must have been late ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevertheless, this is an important issue I think. What I observed is that without this hard-coded wait the runtime shuts down the hypervisor even before it was able to start the agent. It's the time between the moment the vsock is available but the agent inside the hypervisor isn't ready yet. Somehow grpc doesn't wait during that time period but aborts immediately and causes a premature shutdown. This should be improved I think. Anyone an opinion on this?

Makefile Outdated

# ICH-specific options (all should be suffixed by "_ICH")
# currently, huge pages are required for virtiofsd support
DEFENABLEHUGEPAGES_ICH := true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think we want to set this as true, we got virtiofs to work without huge pages afaik.
cc @ganeshmaharaj

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more thinking about DPDK which is widely adopted in the telco domain. How do you see this?

@@ -1222,6 +1313,8 @@ func checkHypervisorConfig(config vc.HypervisorConfig) error {
}
}
}


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra lines added. Please remove.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed


{"root", "/dev/vda1"},
{"panic", "1"},
{"init", "/usr/lib/systemd/systemd"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.

if err != nil {
return err
}
ich.Logger().WithField("function", "startSandbox").Infof("Starting virtiofsd")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, shouldn't this log line be inside the condition?

}
}

ich.Logger().WithField("function", "startSandbox").Infof("virtiofsd starts sharing")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, this should be logged only when ich.config.SharedFS = config.VirtioFS

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed.

@jodh-intel
Copy link
Contributor

Hi @ericooper - thanks for updating. Given all the comments flying around, please could you post a comment here when you think the PR is generally ready for re-review.

Copy link
Member

@jcvenegas jcvenegas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking - adding some more comments to not forget in the refactor

if ich.config.Debug {
ich.Logger().WithField("source", "virtiofsd").Debug(scanner.Text())
}
if !sent && strings.Contains(scanner.Text(), "Waiting for vhost-user socket connection...") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks that would be broken at any time virtiofsd changes the log, I wonder if woudl be a better way to check is ready

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not specific to my driver commit. it's the same in the other drivers using virtiofsd. I would like to keep it this way and focus on the essentials first until this is improved in general.

Tag: tag,
Path: path,
NumQueues: 1,
QueueSize: 512,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Add config option for this

cacheSize := "0Gib"
if(vfsdev.Dax) {
daxParam = "on"
cacheSize = "8Gib"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to big to be hardcoded

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the default size of the hypervisor itself ie not my idea.

@ericooper
Copy link
Contributor Author

Hi @ericooper - thanks for updating. Given all the comments flying around, please could you post a comment here when you think the PR is generally ready for re-review.

I think we have a misalignment of expectations here. This is a WIP for a hypervisor which itself is still in alfa stage and under heavy development. I've developed and published this driver to be able to start experimenting with cloud-hypervisor as soon as possible and to propell adoption of the cloud-hypervisor. I would like to focus first on the essentials and get it going and do the nitty gritty cosmetics at the end. This approach has proven to fail which is something I have to deal with.

@sameo
Copy link

sameo commented Nov 10, 2019

Hi @ericooper - thanks for updating. Given all the comments flying around, please could you post a comment here when you think the PR is generally ready for re-review.

I think we have a misalignment of expectations here. This is a WIP for a hypervisor which itself is still in alfa stage and under heavy development. I've developed and published this driver to be able to start experimenting with cloud-hypervisor as soon as possible and to propell adoption of the cloud-hypervisor. I would like to focus first on the essentials and get it going and do the nitty gritty cosmetics at the end. This approach has proven to fail which is something I have to deal with.

I think @jodh-intel comment was very much aligned with that approach. He was saying that it's up to you to decide which comments you want to address and then decide when the PR is ready for a second round of reviews. I also agree with the approach of getting something that's acceptable and working merged first and then work on some of the nits/details from all the reviews that you got.
We only need to agree on what's acceptable as a mergeable PR, and I'm hoping that you can drive that discussion.

# Shared file system type:
# - virtio-9p (do not use)
# - virtio-fs (default & must)
shared_fs = "@DEFSHAREDFS_ICH@"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC there is no virtio-9p for cloud hypervisor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed both vsock and file-sharing options from Makefile, config template and hardwired them in newClhHypervisorConfig accordingly.

# If true and vsocks are supported, use vsocks to communicate directly
# with the agent (no proxy is started).
# Default true & must. cloud-hypervisor only supports vsock based agent communication
use_vsock = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since cloud hypervisor must use vsock, we should not make it an option at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed both vsock and file-sharing options from Makefile, config template and hardwired them in newClhHypervisorConfig accordingly.

@jodh-intel
Copy link
Contributor

Hi @ericooper - this branch needs a rebase to resolve a conflict in rootless.go.

virtio_fs_daemon = "/usr/bin/virtiofsd-x86_64"

# cloud-hypervisor does not support virtio-fs caching
virtio_fs_cache = "none"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does and it should be used for better performances.

[agent.kata]
# If enabled, make the agent display debug-level messages.
# (default: disabled)
enable_debug = true
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the debug to be enabled by default?

@@ -90,7 +90,7 @@ func setRootless() error {
if err != nil {
return parseError
}
rangeUID, err := strconv.ParseUint(ids[1], 10, 0)
rangeUID, err := strconv.ParseUint(ids[2], 10, 0)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please submit this fix as its own separate commit since it's not directly related to Cloud-Hypervisor support.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix has already landed in master so this change is no longer needed.

}

func (clh *cloudHypervisor) hotplugAddDevice(devInfo interface{}, devType deviceType) (interface{}, error) {
clh.Logger().WithField("function", "hotplugAddDevice").Info("Add hotplug device")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message should indicate that it's not supported. And I would suggest to use a Warn instead of Info.

"function": "resizeVCPUs",
"curr": currentVCPUs,
"new": newVCPUs,
}).Info("resize the VCPU's ")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seem to have lost the explanation for what the ditto means here @sboeuf ;)

return nil
}

func (clh *cloudHypervisor) LaunchIch() (string, error, int) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another occurence of ich


// Add the hybrid vsock device to hypervisor
clh.cliBuilder.SetVsock(&CLIVsock{
cid: 3,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't be the context ID provided by Kata? (instead of being hardcoded)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW Firecracker also uses a hardcoded 3.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't that imply you cannot run a FC container and a CLH container on the same host then? Could we use VMADDR_CID_ANY (-1U) as documented in vsock(7)?

Copy link

@sameo sameo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am adding quite a few comments, but I think all of them are small/details/nits. I think the architecture looks good now, and I like the standalone clh.go driver. We're getting quite close, thanks.

Makefile Outdated
@@ -600,11 +640,12 @@ $(GENERATED_FILES): %: %.in $(MAKEFILE_LIST) VERSION .git-commit
-e "s|@SHIMPATH@|$(SHIMPATH)|g" \
-e "s|@DEFVCPUS@|$(DEFVCPUS)|g" \
-e "s|@DEFMAXVCPUS@|$(DEFMAXVCPUS)|g" \
-e "s|@DEFMAXVCPUS_ACRN@|$(DEFMAXVCPUS_ACRN)|g" \
-e "s|@DEFMAXVCPUS_ACRN@|$(DEFMAXVCPUS_ACRN)|g" \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please avoid modifying this line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get it. Which line do you mean?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ericooper - the whitespace in Makefile (line 643 in your version) has been changed, but this looks like an oversight as the actual content hasn't changed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

# to the proxy logs, but only when proxy debug is also enabled.
#
# Default false
enable_debug = true
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to comment that one out as a default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed.

@@ -638,7 +639,6 @@ func newQemuHypervisorConfig(h hypervisor) (vc.HypervisorConfig, error) {
GuestHookPath: h.guestHookPath(),
}, nil
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's still here :)

Comment on lines 171 to 187
// First take the default parameters defined by this driver
clh.cliBuilder.AddKernelParameters(clhKernelParams)

// Followed by extra debug parameters if debug enabled in configuration file
if clh.config.Debug == true {
clh.cliBuilder.AddKernelParameters(clhDebugKernelParams)
}

// Followed by extra debug parameters defined in the configuration file
clh.cliBuilder.AddKernelParameters(clh.config.KernelParams)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this could be grouped into a single func (d *DefaultCLIBuilder) AddKernelParameters(hypervisorConfig *HypervisorConfig) routine?

@ericooper ericooper requested a review from sameo November 13, 2019 11:26
Copy link

@sameo sameo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a few nits to address, and the virtiofs caching enabled by default. Getting closer.

Comment on lines 65 to 66
# cloud-hypervisor does not support virtio-fs caching
virtio_fs_cache = "none"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right, the documentation is outdated (It would not be proper documentation otherwise, wouldn't it ;-)).
I can confirm that caching is now enabled, and we systematically test both cases as part of our CI. See https://github.com/intel/cloud-hypervisor/blob/master/src/main.rs#L1775 for example.

I'll open an issue against the doc and we'll fix it. Meanwhile, as suggested by @sboeuf, I think your PR should support caching by default. Performance is much better with DAX on.

@@ -1222,7 +1309,6 @@ func checkHypervisorConfig(config vc.HypervisorConfig) error {
}
}
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean that you're removing an empty line here, which is not something that's related to the PR. It adds noise to the patch/PR, so I'm suggesting to keep the empty line there.

Comment on lines +126 to +140
if clh.version.Major < supportedMajorVersion && clh.version.Minor < supportedMinorVersion {
errorMessage := fmt.Sprintf("Unsupported version: cloud-hypervisor %d.%d not supported by this driver version (%d.%d)",
clh.version.Major,
clh.version.Minor,
supportedMajorVersion,
supportedMinorVersion)
return errors.New(errorMessage)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it is a fact that any binary below 0.3.0 the driver will fail to drive due to changes in the cli.

I agree. I don't have a strong opinion on that one, we can keep it.

I would actually like to see a different versioning for both cli and http api

That one I disagree with. It's ok to break the CLI/API between minor versions, especially with such a new project, but the CLI and HTTP APIs should be kept in sync when they map. There will be HTTP API endpoints that don't have a CLI counterpart (e.g. the hotplug ones), but for those who do they should be consistent.


// Add the hybrid vsock device to hypervisor
clh.cliBuilder.SetVsock(&CLIVsock{
cid: 3,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW Firecracker also uses a hardcoded 3.

Comment on lines 65 to 66
# cloud-hypervisor does not support virtio-fs caching
virtio_fs_cache = "none"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@egernst egernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ericooper: Can we set DisableVhostNet to true? I don't think we need it to be configurable, and today AFAICT the PR doesn't set it in the hypervisor config from TOML/Annotations anyway.

Copy link
Member

@jcvenegas jcvenegas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ericooper changed the kernel to use kata virtiofs kernel build, feel free to drop or squash the commit

@sboeuf
Copy link

sboeuf commented Nov 14, 2019

@jcvenegas

@ericooper changed the kernel to use kata virtiofs kernel build, feel free to drop or squash the commit

Thanks for fixing the PR. The change should not be dropped since it is expected that if we want CloudHypervisor working with Kata, Kata will have to use virtio-fs.

Copy link
Contributor

@jodh-intel jodh-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating again @ericooper. There are only a few unresolved questions now so we're getting close!

As shown in https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md#updating-your-pr-based-on-review-comments, please can you squash all your commits down into one (since they all relate to the single task of adding CLH support). Once done, this PR will then have two commits: @jcvenegas's Makefile change and your single commit to add CLH support.

@@ -416,7 +448,7 @@ func (clh *cloudHypervisor) addDevice(devInfo interface{}, devType deviceType) e
})

default:
clh.Logger().WithField("function", "addDevice").Warn("Add device of type %v is not supported.", v)
clh.Logger().WithField("function", "addDevice").Warn(fmt.Sprintf("Add device of type %v is not supported.", v))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: You can simplify this to use:

.Warnf("Add device of type %v is not supported.", v)

@@ -90,7 +90,7 @@ func setRootless() error {
if err != nil {
return parseError
}
rangeUID, err := strconv.ParseUint(ids[1], 10, 0)
rangeUID, err := strconv.ParseUint(ids[2], 10, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix has already landed in master so this change is no longer needed.


// Add the hybrid vsock device to hypervisor
clh.cliBuilder.SetVsock(&CLIVsock{
cid: 3,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't that imply you cannot run a FC container and a CLH container on the same host then? Could we use VMADDR_CID_ANY (-1U) as documented in vsock(7)?

clh.Logger().Info("Cloud Hypervisor stopped")
clh.state.PID = 0
clh.state.VirtiofsdPID = 0
clh.state.state = clhNotReady
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought: This might benefit from a cloudHypervisor.reset() helper method (which possibly calls CloudHypervisorState.reset() or NewCloudHypervisorState() or similar). This might help avoid bugs if we ever extend CloudHypervisorState.

}

// Send a SIGTERM to the VM process to try to stop it properly
if err = syscall.Kill(pid, syscall.SIGTERM); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having two kills like this is potentially racy. I'd remove the Kill(0) and instead return nil if Kill(SIGTERM) fails with ESRCH.

Copy link
Contributor

@jodh-intel jodh-intel Nov 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be handled with a follow-on PR since this idiom is used for the other hypervisor implementations too.

return 0, err
}

clh.Logger().WithField("Path", clh.config.VirtioFSDaemon).Info()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: All our log fields should be lower-case.

"function": "resizeVCPUs",
"curr": currentVCPUs,
"new": newVCPUs,
}).Info("resize the VCPU's ")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seem to have lost the explanation for what the ditto means here @sboeuf ;)

@egernst egernst changed the title [WIP] Add support for intel cloud hypervisor Add support for intel cloud hypervisor Nov 15, 2019
@egernst
Copy link
Member

egernst commented Nov 15, 2019

Thanks -- I'll fix the merge conflict, then let's merge!

@egernst
Copy link
Member

egernst commented Nov 15, 2019

/test

@egernst
Copy link
Member

egernst commented Nov 15, 2019

/test-ch

Copy link
Contributor

@jodh-intel jodh-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ericooper!

@codecov
Copy link

codecov bot commented Nov 15, 2019

Codecov Report

Merging #2164 into master will decrease coverage by 1.89%.
The diff coverage is 6.94%.

@@            Coverage Diff            @@
##           master    #2164     +/-   ##
=========================================
- Coverage   51.25%   49.35%   -1.9%     
=========================================
  Files         110      111      +1     
  Lines       15114    15791    +677     
=========================================
+ Hits         7747     7794     +47     
- Misses       6410     7032    +622     
- Partials      957      965      +8

@jcvenegas
Copy link
Member

/test

ericooper and others added 2 commits November 15, 2019 19:35
Initial release of cloud hypervisor driver for kata-runtime

Fixes: #2046

Signed-off-by: Johan Kuijpers <[email protected]>
use virtiofs kernel to allow boot kata.

Signed-off-by: Jose Carlos Venegas Munoz <[email protected]>
@jcvenegas
Copy link
Member

/test

@sboeuf sboeuf merged commit 547d580 into kata-containers:master Nov 15, 2019
if elapsed < timeoutDuration {
timeout = timeout - int(elapsed.Seconds())
} else {
timeout = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't error be returned in this case since the wait times out ?


sourcePath := filepath.Join(kataHostSharedDir(), clh.id)
if _, err := os.Stat(sourcePath); os.IsNotExist(err) {
os.MkdirAll(sourcePath, os.ModePerm)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better check return value and log accordingly.

"-o", "source=" + sourcePath,
"-o", "cache=" + clh.config.VirtioFSCache}

if len(clh.config.VirtioFSExtraArgs) != 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check that VirtioFSExtraArgs contains none of the 3 args from line 665 to 667 ?


func (clh *cloudHypervisor) storeState() error {
if clh.store != nil {
if err := clh.store.Store(store.Hypervisor, clh.state); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to check the return value - we can directly return it to caller.

conn, err := net.DialUnix("unix", nil, addr)

if err != nil {
time.Sleep(50 * time.Millisecond)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly log after continuously sleeping 200 times.

err = fmt.Errorf("timed out waiting for cloud-hypervisor vsock")
}

time.Sleep(1000 * time.Millisecond)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We know the ready status, why is this needed ?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants