-
Notifications
You must be signed in to change notification settings - Fork 373
[WIP] virtcontainers: netns_watcher: Monitor network changes #194
Conversation
bb4a8d2
to
058f461
Compare
@bergwolf @miaoyq @WeiZhang555 Still WIP, but I'd like to get a first review ! |
I guess it is better to put this in the cli directory? |
Codecov Report
@@ Coverage Diff @@
## master #194 +/- ##
=======================================
Coverage 66.67% 66.67%
=======================================
Files 93 93
Lines 9580 9580
=======================================
Hits 6387 6387
Misses 2506 2506
Partials 687 687 Continue to review full report at Codecov.
|
} | ||
|
||
/* Enter network namespace */ | ||
ret = enter_netns((const char*) params.netns_path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can run the monitor via network.run(networkNSPath string, cb func() error) error
in virtcontainers
, instead of enter_netns((const char*) params.netns_path)
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean that virtcontainers would start the process directly into the network namespace, and I agree this would work, but I think it is better to have this being performed from the C process directly as entering namespaces is safer in C than Go. See #148 for more details about namespaces issues in Go.
Moreover, this makes this program more generic as it is able to enter a namespace on its own.
ret = monitor_netns((const char*) params.pod_id, | ||
(const char*) params.runtime_path); | ||
if (ret) { | ||
goto exit; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redundant if statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I'll remove it.
0e67822
to
f008bff
Compare
@bergwolf @miaoyq @egernst @amshinde I'll try to summarize how this program works so that it is easier for you to review the code. This watcher is supposed to be started by virtcontainers after the initial network has been setup. For this reason, after it entered the given network namespace, it takes a snapshot of what the network looks like. This will serve as a reference whenever a netlink event is received. List of tasks to consider:
|
@sboeuf There are two things I am thinking differently:
|
I think we only care the
Agree.
Anybody have started working on sandbox API definition related to the document ?
What does this mean? I'm not clear with this. :-p @sboeuf |
We can say that we don't send an event if we only receive a new interface, but I think we still need to track this new interface internally, so that we can provide the whole interface when a new IP address is provided. I am saying this for two reasons:
@WeiZhang555 @egernst I'd really like your input on this.
Just to confirm, what do you agree on ? Both the fact that we should limit the interface type to veth and the fact that the network hotplug API should be responsible for this ? Or only one of them ?
Not yet, but this is tied to the new network hotplug API. I can make a proposal about what is going to be needed here
I am using a simple global array of structures to save the snapshot of the network, and this works only for interfaces index being between 1 and 49. But if we want to make sure we can support 50 interfaces, using a hash table would work better. |
The chain you're describing here is not really different (binary wise) since virtcontainers is a simple library and not a binary(different process) itself.
There is no such thing as cycle dependency here since the binary is in C and it does not import anything from virtcontainers. To be honest, I understand your concern and I am balanced between the two different possibilities. The main reason I'd like to keep it inside virtcontainers is that we need to know when we can start this from the CLI perspective. Virtcontainers knows better about this kind of thing. That being said, maybe starting this after the call to |
@sboeuf Both. |
@sboeuf If we put this in virtcontainers, we make virtcontainers depend on kata cli because the ns_watcher binary depends on it. The functionality is not working without kata cli. That's why I call it logical dependency not golang import dependency. |
|
||
nif.idx = if_idx; | ||
|
||
if (if_idx >= MAX_IFACES) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think MAX_IFACES=50 is far from enough.
if_idx is link index, if host has more than 50 interfaces, then you create a new container, it's interface link index is easily larger than 50.
So I think link index shouldn't be used as the array's index, the array index should be standalone counter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I get a segment fault
when running the process.
To reproduce the error, you can create 50 containers, then try to exec
$ netns_watcher -d -n /proc/<pid>/ns/net -p hello -r /usr/local/bin/kata-runtime
Pid should be process id of last container, and its interface index is larger than 50.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the table of interfaces, that's exactly what I was mentioning when I said I'd like to replace this with a hash table ;). I will make the change.
Now about the segmentation fault, I am not sure, and I'll try to reproduce !
Some inputs from me:
I think latter one is better. ns_watcher should take all responsibilities of watching network changes, and send all interface infos to kata cli. Let kata cli/virtcontainers take another scan isn't a good idea in my opinion. And it looks good to me to put this into virtcontainers/ lib. Most part of the code looks quite cool, only some comments on details. |
|
||
#define PROGRAM_NAME "netns-watcher" | ||
|
||
#define MAX_IFACES 50 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd do the following for maximum built-time flexibility:
#ifndef MAX_IFACES
#define MAX_IFACES 50
#endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
* Copyright (C) 2018 Intel Corporation | ||
* | ||
* This program is free software; you can redistribute it and/or | ||
* modify it under the terms of the GNU General Public License |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- GPL or Apache 2.0?
- SPDX license header required ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
void print_iface_list() { | ||
int i; | ||
|
||
for (i = 0; i < MAX_IFACES; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather this was:
int max_ifaces = (int)(sizeof(iface_list)/sizeof(iface_list[0]));
for (i = 0; i < max_ifaces; i++) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
*/ | ||
void print_version(void) | ||
{ | ||
printf("%s v0.1\n", PROGRAM_NAME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Hard-coded version number. How about a
#define
for it? - Version should be in semver format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
/* | ||
* Free internal fields of the route structure. | ||
*/ | ||
void free_route(struct route *rt) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add a check on the param:
if (! rt) {
return;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
.ip_addrs = NULL, | ||
}; | ||
|
||
iface_list[idx] = nif; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks wrong - you're appending a local variable to a global array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this is not simply to make sure every field of iface_list[idx]
get reset to the proper values. But I can do something different.
* Free internal fields of the iface structure. | ||
*/ | ||
void free_iface(struct iface *nif) { | ||
int i; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
*/ | ||
int fork_runtime_call(char *params[]) | ||
{ | ||
int ret; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
int create_iface_from_ifaddrs(struct ifaddrs *ifa, struct iface *nif) | ||
{ | ||
int sock; | ||
int family; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -0,0 +1,1251 @@ | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about giving it a prefix like kata
or vc
? :)
I assume it's intentional that this isn't actually mentioned in the Makefile
yet too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes let's call this kata-netmon
which stands for "Kata network monitor". And yes, the Makefile will come later. I need to finalize the program first.
Since this PR introduces C code, we're also going to have to add C code static analysis checks specifically for this file. Ideally, we'd also register the project with https://scan.coverity.com/, but fwics that only works with Travis which we're trying to get rid of. |
Could you update the commit message to explain what this tool does when it detects a change to a network ns? |
PSS Measurement: Memory inside container: |
btw, I think you'll need to add a commit to enable Coverity Scan as this is the first piece of C code in the runtime repo. Basically crib what you can from: .. and do: |
... for static analysis, we could start by setting up https://github.com/marketplace/lgtm which is free for OSS projects and has full github integration fwics. |
I can't think of any reason not to give lgtm a tryout at least... |
Hi @sboeuf - please could you push your latest branch as your comments don't appear to match up with reality for us :) |
@jodh-intel yes you're right, but also what happened is that I have re-written this code in Go to prototype faster here. I will push another PR soon, replacing this one. The goal being to validate the whole behavior of what we expect from this binary, and maybe port it to C later. |
Sounds good - thanks for the update @sboeuf. |
This commit introduces a new watcher dedicated to the monitoring of a specific network namespace in order to detect any change that could happen to the network. As a result of such a detection, the watcher should call into the appropriate runtime path with the proper arguments to modify the pod network accordingly. Fixes kata-containers#170 Signed-off-by: Sebastien Boeuf <[email protected]>
I want to keep this PR around since we might move to the C implementation later. But for now, we want to prototype this through the new PR #534 |
PSS Measurement: Memory inside container: |
Build succeeded (third-party-check pipeline).
|
/* | ||
* Add route. | ||
*/ | ||
int add_route(const struct nlmsghdr *nh, const char *sandbox_id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this kind of change, I would insist that either we make change to network config via the sandbox network hotplug API, or we put this binary in the cli directory. I do not think we should make virtcontainers call the kata-runtime binary directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. It sounds a bit recursive :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment here, let me know what you think about it !
Closing this PR as it's very unlikely that we'll use this C code. The code will still be available on my Github fork on the branch |
grpc: implement ListProcesses
This commit introduces a new watcher dedicated to the monitoring
of a specific network namespace in order to detect any change that
could happen to the network.
As a result of such a detection, the watcher should call into the
appropriate runtime path with the proper arguments to modify the
pod network accordingly.
Fixes #170
Signed-off-by: Sebastien Boeuf [email protected]