-
Notifications
You must be signed in to change notification settings - Fork 373
virtcontainers: netmon: Monitor network changes #534
virtcontainers: netmon: Monitor network changes #534
Conversation
This PR supersedes the original C implementation #194 |
virtcontainers/netmon/netmon.go
Outdated
|
||
netlinkFamily = netlink.FAMILY_V4 | ||
|
||
storageParentPath = "/var/run/kata-containers/netmon/sbs" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this path should be generated by in the Makefile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
virtcontainers/netmon/netmon.go
Outdated
os.Exit(1) | ||
} | ||
|
||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this commented? is this a TODO ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah, I forgot to remove it. I'll do that when I'll update the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
virtcontainers/netmon/netmon.go
Outdated
if addr.IPNet == nil { | ||
continue | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and this loop if for ... ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah interesting I forgot to complete the implementation there !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Things left to do:
|
PSS Measurement: Memory inside container: |
a890338
to
42ebbd4
Compare
PSS Measurement: Memory inside container: |
Build succeeded (third-party-check pipeline).
|
virtcontainers/netmon/netmon.go
Outdated
|
||
const ( | ||
netmonName = "kata-netmon" | ||
netmonVersion = "0.0.1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two variables should probably be set by the build (particularly version) as we do for the runtime itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes they could be generated through the build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
virtcontainers/netmon/netmon.go
Outdated
|
||
kataSuffix = "kata" | ||
|
||
netlinkFamily = netlink.FAMILY_V4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth mentioning anything about IPv6 here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I'll add a comment that we want to focus on ipv4 for now, just to make sure we're not introducing too much complexity at the first step.
virtcontainers/netmon/netmon.go
Outdated
} | ||
|
||
func printVersion() { | ||
fmt.Printf("%s version %s\n", netmonName, netmonVersion) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a new system component, so please could you update kata-env
to display details of it (including running kata-netmon --version
as we do for other components).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but this will come later, I need to get the code working and tested first.
virtcontainers/netmon/netmon.go
Outdated
flag.BoolVar(¶ms.debug, "d", false, "enable debug mode") | ||
flag.BoolVar(&version, "v", false, "display program version and exit") | ||
flag.StringVar(¶ms.sandboxID, "s", "", "sandbox id (required)") | ||
flag.StringVar(¶ms.runtimePath, "r", "", "runtime path (required)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flag
auto-generates a--help
option but it would be useful to override the default to display a description of the program too imho.- As mentioned on the original C-based PR, wouldn't it be better to call the vc API rather than having the overhead of launching instances of the runtime (even if they are short-lived)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flag auto-generates a --help option but it would be useful to override the default to display a description of the program too imho.
Do you mean the program usage ? Because that's the way it works by default. I think this improvement can definitely be part of a follow up PR.
As mentioned on the original C-based PR, wouldn't it be better to call the vc API rather than having the overhead of launching instances of the runtime (even if they are short-lived)?
Well we could do that from the Go implementation, but what if we move to C implementation later ? This is not gonna be pretty to provide the Kata API as a C library...
Also, by doing so, we pull the entire dependencies from Kata by importing the API as a package, meaning we'll end up with a pretty huge binary running on the system.
Technically, this is doable, but I don't know if we want to go this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, one benefit of calling the kata API directly is that if you modularize it properly, we can create a goroutine to monitor the netns in containerd kata shim and that presents no dependency on the kata-runtime cli, which is good from containerd kata shim perspective.
How about a structure like this:
- virtcontainers/netnsmon: a package that calls kata API to do netns monitoring
- cli/netmon: a command line built on top of virtcontainers/netnsmon. Other kata-runtime sub-command can invoke it to create a daemon to monitor the netns
- for containerd kata shim, it can create a goroutine that calls virtcontainers/netnsmon to monitor the netns when it wants to.
In both kata-runtime and containerd kata shim case, virtcontainers do not invoke netnsmon but let the callers decide and drive the procedure.
WDTY?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bergwolf
If we decide to go this way, I would appreciate if this could come as a follow up PR, since I have reworked this PR quite a few times now, and it's pretty big, and I would like to see it landing at some point.
virtcontainers/netmon/netmon.go
Outdated
|
||
src := "" | ||
if netRoute.Src.To4() != nil { | ||
dst = netRoute.Src.String() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean src =
right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for noticing ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
virtcontainers/netmon/netmon.go
Outdated
} | ||
} | ||
|
||
func main() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please could you add the standard signal and crash handing code. To avoid duplicating code inside this repo, you could do what I did in the throttler repo and create a new signals package:
func main() {
defer ksig.HandlePanic()
realMain()
}
And then add calls to:
- set
CrashOnErorr
behaviour. - Call that packages
SetLogger()
. - Create the signal-handling loop with calls to
HandledSignals()
,FatalSignal()
andNonFatalSignal()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like a nice thing to do ;)
virtcontainers/netmon/netmon.go
Outdated
|
||
storageParentPath = "/var/run/kata-containers/netmon/sbs" | ||
storageDirPerm = 0750 | ||
sharedFile = "shared.json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please document what this is for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
a60e56e
to
849e97e
Compare
PSS Measurement: Memory inside container: |
849e97e
to
80b905e
Compare
PSS Measurement: Memory inside container: |
Build succeeded (third-party-check pipeline).
|
To clarify, this depends on #287 for the kata cli network sub-command. Adding WiP because of it. |
80b905e
to
3124c9d
Compare
PSS Measurement: Memory inside container: |
Codecov Report
@@ Coverage Diff @@
## master #534 +/- ##
==========================================
- Coverage 65.25% 65.11% -0.15%
==========================================
Files 85 87 +2
Lines 9978 10373 +395
==========================================
+ Hits 6511 6754 +243
- Misses 2815 2942 +127
- Partials 652 677 +25 |
Build succeeded (third-party-check pipeline).
|
Hi @ydjainopensource - I think we'll still need this as the tracing can only show that changes occurred - this new application needs to modify the network. |
Okay no issues we will keep this open |
I have tested the commit base on branch master after add some log print, but there is a little strange problem: call
I will get error messages after
but it will work if execute the command manually: and the new tap
so maybe there's some permissions issue in netmon or the way of my use is wrong? |
@ningbo9 the PR is not completely ready as it does not include the modification inside kata-runtime itself to launch the network monitor. |
cli/config/configuration.toml.in
Outdated
# network being added to the existing network namespace, after the | ||
# sandbox has been created. | ||
# (default: disabled) | ||
#enable = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nits: as I put a test commit at kata-containers/tests#743, I noticed that it would be better to call this enable_netmon
instead, so that we can easily grep and replace it in scripts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course I can do that and I agree this might help!
btw, @sboeuf if you want to merge this PR feel free to do so. It would be great if you can rename the Anyway, whatever error I'm seeing w/ network hotplug, it is not related to your PR. So I'm giving my |
Maybe the different guest rootfs distro or kernel causes this error ? |
@caoruidong I've tried alpine based initrd w/ agent as init, and centos based rootfs image w/ systemd as init, both failed the same way. It seems kata-containers/tests#743 have given the same results:
@sboeuf what guest rootfs distro are you testing with? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
type Interface struct { | ||
Device string `json:"device,omitempty"` | ||
Name string `json:"name,omitempty"` | ||
IPAddresses []*IPAddress `json:"IPAddresses,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Capitalization? OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! You mean from the json definition?
Yes I'll change that!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, I did that so that it comply with the structure defined in agent/protocols/grpc/
. I'm afraid the unmarshalling might not work if I change this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough!
bd4ea50
to
bf8c37b
Compare
PSS Measurement: Memory inside container: |
Build failed (third-party-check pipeline) integration testing with
|
@amshinde PTAL and feel free to merge if you're fine with it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm @sboeuf
Just some minor nits.
// For simplicity the code will only focus on IPv4 addresses for now. | ||
netlinkFamily = netlink.FAMILY_V4 | ||
|
||
storageParentPath = "/var/run/kata-containers/netmon/sbs" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this go in the Makefile?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Makefile does not contain any other storage path yet. Should this be part of a more global PR pushing the default storage paths to the Makefile?
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok lets address in a separate PR.
} | ||
|
||
// First, ignore if the interface name contains "kata". This way we | ||
// are preventing from adding interfaces created by Kata Containers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/adding/deleting/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
netmon/netmon.go
Outdated
|
||
// First, ignore if the interface name contains "kata". This way we | ||
// are preventing from adding interfaces created by Kata Containers. | ||
if strings.Contains(linkAttrs.Name, kataSuffix) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use strings.HasSuffix here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Let's see what we can get in CI environment. Please DO NOT MERGE! Depends-on: github.com/kata-containers/runtime#534 Fixes: #999999999 Signed-off-by: Peng Tao <[email protected]>
This commit introduces a new watcher dedicated to the monitoring of a specific network namespace in order to detect any change that could happen to the network. As a result of such a detection, the watcher should call into the appropriate runtime path with the correct arguments to modify the pod network accordingly. Fixes kata-containers#170 Signed-off-by: Sebastien Boeuf <[email protected]>
In order to reduce the overhead due to the import of the whole agent protocol, only the needed structures are duplicated. This is a temporary solution, and those structures should be defined into their own package to prevent from such overhead. Note: the overhead of the binray size went down from 15MiB to 3MiB when this commit removed the dependency on the agent protocol. Signed-off-by: Sebastien Boeuf <[email protected]>
Instead of dumping logs through the standard output with fmt.Printf() function, this commit improves the logging by relying on logrus. Also, it relies on the syslog hook so that all the logs get redirected to the journal. Signed-off-by: Sebastien Boeuf <[email protected]>
This commit modifies the Makefile at the root of this repository so that the binary kata-netmon can be built from there. Signed-off-by: Sebastien Boeuf <[email protected]>
This commit adds some unit testing in order to validate some of the new code that have been introduced with the new network monitor. Signed-off-by: Sebastien Boeuf <[email protected]>
8abc400 agent: add test to WaitProcess() f746ed8 agent: allow multiple waitProcess() 157f1c1 travis: Add variable needed to run static checks ed54087 travis: bump golang version ba0c7fc client: wait for session to be fully closed 0865c98 agent: wait session to be fully shutdown 55f1480 vendor: update yamux dependency 5e36bfc network: Wait for network device in UpdateInterface 218ce89 device: Rename getBlockDeviceNodeName to getPCIDeviceName c9a4e2e uevent: Store the interface field as device name for network interfaces 74a5364 build: fix make proto error b1c2ad8 agent: add support for online memory and cpu separately. Signed-off-by: Sebastien Boeuf <[email protected]>
Because the network monitor will be listening to every event received through the netlink socket, it will be notified everytime a new link will be added/updated/modified in the network namespace it's running into. The goal being to detect new interface added by Docker such as a veth pair. The problem is that kata-runtime will add other internal interfaces when the network monitor will ask for the addition of the new veth pair. And we need a way to ignore those new interfaces being created as they relate to the veth pair that is being added. That's why, in order to prevent from running into an infinite loop, virtcontainers needs to tag the internal interfaces with the "kata" suffix so that the network monitor will be able to ignore them. Signed-off-by: Sebastien Boeuf <[email protected]>
This patch enables the code responsible for starting and stopping the network monitor. Signed-off-by: Sebastien Boeuf <[email protected]>
In order to choose if the network monitor should be used or not, this patch makes it configurable from the configuration.toml file. Signed-off-by: Sebastien Boeuf <[email protected]>
bf8c37b
to
0ffe81c
Compare
@amshinde changes applied, PTAL. |
lgtm. |
@amshinde yeah I've just restarted because three builds failed because of networking issues. Nothing related to the PR. |
PSS Measurement: Memory inside container: |
@amshinde could you please merge this one? |
This commit introduces a new watcher dedicated to the monitoring
of a specific network namespace in order to detect any change that
could happen to the network.
As a result of such a detection, the watcher should call into the
appropriate runtime path with the correct arguments to modify the
pod network accordingly.
Fixes #170
Signed-off-by: Sebastien Boeuf [email protected]