This repository has been archived by the owner on May 12, 2021. It is now read-only.
If a proto file is changed then we want a larger number of folks
to know about and review it. Add them to the CODEOWNERS file.
Fixes: kata-containers#460
Signed-off-by: Graham Whaley <[email protected]>
Description of problem
Our current gRPC code defaults to an infinite timeout. We have seen this hang when the agent at the other end is not responding (for whatever reason). We should have some sort of safety timeout and/or retry mechanism in place to notice a dead connection, at least eventually.
There may be implicit complications here involving paused containers or containers that are quiescent for a long time, and maybe some of our gRPC calls expect a long delay. (But tbh, they are gRPC calls, and I don't think we parallelise them, do we? So maybe they never expect a very long delay.)
Here is a patch I knocked up whilst doing some debug, just for inspiration:
Expected result
I don't expect the runtime to ever hang up solid.
Actual result
Over in #406, due to some interesting yamux/qemu timeout death scenarios, we got the runtime to hang up solid in the grpc call dispatch.