When using the noopShim type in the unit tests, we ended up with PID 1000, and when checking whether the shim was still around we always expected it to be "not running", on the assumption that no such process existed. That assumption was wrong, because we cannot control which PIDs are in use on the system. The simplest fix is to return PID 0 for noopShim, which waitForShim() handles as a special case. Fixes kata-containers#208 Signed-off-by: Sebastien Boeuf <[email protected]>
Because of the poor design of the cc_proxy_mock goroutine, we were leaving an infinite loop running inside it after the tests had finished with it. This consumed a lot of resources and slowed down the tests running in parallel. It is one of the reasons we were hitting the 10-second timeout when running the Go tests. Fixes kata-containers#208 Signed-off-by: Sebastien Boeuf <[email protected]>
These files all started a goroutine that eventually reported a result through a Go channel. The problem was the way those goroutines were implemented: in some cases nothing was listening on the channel, so they hung around forever and never ended. We detected this because our unit tests needed more time to pass: when they all ran in parallel, the resources consumed by those goroutines increased the time other tests took to complete. Fixes kata-containers#208 Signed-off-by: Sebastien Boeuf <[email protected]>
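To illustrate the kind of leak this commit describes, here is a minimal sketch. `doWork`, `startLeaky`, `startSafe` and `leaked` are hypothetical names, and the buffered channel is only one common remedy; the actual change in this PR may differ.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// doWork is a hypothetical placeholder for the real operation.
func doWork() error { return nil }

// startLeaky reports its result on an unbuffered channel. If the caller
// never receives, the goroutine blocks on the send forever.
func startLeaky() <-chan error {
	ch := make(chan error)
	go func() { ch <- doWork() }()
	return ch
}

// startSafe uses a buffer of one, so the send always completes and the
// goroutine exits even when nobody reads the result.
func startSafe() <-chan error {
	ch := make(chan error, 1)
	go func() { ch <- doWork() }()
	return ch
}

// leaked calls start n times without reading the result, waits briefly,
// and returns how many goroutines remain beyond the starting count.
func leaked(start func() <-chan error, n int) int {
	before := runtime.NumGoroutine()
	for i := 0; i < n; i++ {
		start()
	}
	time.Sleep(100 * time.Millisecond) // give goroutines time to finish
	return runtime.NumGoroutine() - before
}

func main() {
	fmt.Println("leaky:", leaked(startLeaky, 100))
	fmt.Println("safe: ", leaked(startSafe, 100))
}
```

Each leaked goroutine is cheap on its own, but many tests running in parallel multiply the cost, which matches the slowdown reported here.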
@devimc @grahamwhaley @egernst @bergwolf PTAL and let's merge this ;)
Nice hunting @sboeuf
lgtm
proxy.startListening()
go func() {
	for {
		proxy.serve()

		proxy.Lock()
Just curious - do you need the locking if you only have one reader and one writer?
Yes you do, otherwise Golang complains. I agree it is semantically overkill here.
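For context on the exchange above: under the Go memory model, one goroutine writing a variable while another reads it without synchronization is a data race, even with a single reader and a single writer. The "complaint" is presumably the race detector (`go test -race`), though that is an assumption. A minimal sketch of a mock serve loop that stops through a mutex-guarded flag follows; `mockProxy`, `stopped`, `start` and `stop` are hypothetical names, not the PR's code.

```go
package main

import (
	"fmt"
	"sync"
)

// mockProxy is a hypothetical stand-in for the test proxy.
type mockProxy struct {
	sync.Mutex
	stopped bool           // written by stop(), read by the serve loop
	served  int            // number of serve() calls, for illustration
	wg      sync.WaitGroup // lets stop() wait for the loop to exit
}

// serve stands in for handling one request.
func (p *mockProxy) serve() {
	p.Lock()
	p.served++
	p.Unlock()
}

// start launches the serve loop. The loop checks the flag under the lock
// on every iteration, so it ends once stop() has been called.
func (p *mockProxy) start() {
	p.wg.Add(1)
	go func() {
		defer p.wg.Done()
		for {
			p.serve()

			p.Lock()
			stopped := p.stopped
			p.Unlock()
			if stopped {
				return
			}
		}
	}()
}

// stop sets the flag under the lock and waits for the goroutine to
// return, so no loop is left running after the test.
func (p *mockProxy) stop() {
	p.Lock()
	p.stopped = true
	p.Unlock()
	p.wg.Wait()
}

func main() {
	p := &mockProxy{}
	p.start()
	p.stop()
	fmt.Println("served at least once:", p.served >= 1)
}
```

An atomic flag or a closed channel would also work; the mutex is used here only because the reviewed snippet already takes `proxy.Lock()`.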
Couple of strange things on the failed 17.10 CI - I'll give it a nudge (or is this a known issue @chavafg ?).
Codecov Report
@@            Coverage Diff             @@
##           master     #209      +/-   ##
==========================================
- Coverage   65.33%   65.33%   -0.01%
==========================================
  Files          73       73
  Lines        7702     7707       +5
==========================================
+ Hits         5032     5035       +3
- Misses       2127     2128       +1
- Partials      543      544       +1
Continue to review full report at Codecov.
looks good; please address Graham's comments, though.
The go netlink package has added more netlink socket protocols than we need. Specify one to avoid failure due to unsupported protocols. Fixes: kata-containers#209 Signed-off-by: Peng Tao <[email protected]>
Several CI failures have been reported through kata-containers/tests#225. A panic was triggered when our unit tests reached the defined timeout of 10s. We could have bumped this timeout to a larger value, but the root cause was poor management of the goroutines in our code and in our tests, which left many goroutines behind and slowed the tests down.
Fixes #208