RFC: Add host time sync in Kata #1279

Closed

mcastelino opened this issue Feb 25, 2019 · 23 comments

Comments

@mcastelino
Contributor

Time is not accurate within Kata containers

When running containers with runc, as long as the host systems are time-synchronized, time is accurate within the containers and consistent across the cluster.

However, when running Kata Containers, time is no longer accurate.

Any end-to-end traces that involve a Kata container will yield wrong/inconsistent results.

We should consider adding the ptp_kvm kernel module to the default Kata kernel and setting it up for time sync with the host using chronyd.
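For reference, a sketch of the chrony side, assuming the guest exposes the kvm-ptp clock as /dev/ptp0 (`refclock PHC` is standard chrony syntax for a PTP hardware clock; the poll values here are illustrative):

```
# chrony.conf fragment: use the kvm-ptp clock exposed by the host as the time source.
refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0
```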

The downside of this is that it adds an active component in the Kata VM in addition to the kata-agent.

Expected result

Time in Kata containers should be consistent with host time, matching runc behavior.

@mcastelino
Contributor Author

/cc @egernst @jcvenegas

@grahamwhaley
Contributor

grahamwhaley commented Feb 25, 2019

OOI @mcastelino - does it start out in sync and then drift, or does it start out out-of-sync? Any feel for the magnitude as well?
We use both in- and out-of-container times in the metrics 'launch time test', under the assumption that at early boot the time in the container will be 'near enough to the host', as the test only lasts <1s or so :-).

@jodh-intel
Contributor

/cc @bergwolf since this will impact agent-as-init as we'll have to make the agent itself start any extra services, rather than using systemd.

@mcastelino
Contributor Author

OOI @mcastelino - does it start out in sync and then drift, or does it start out out-of-sync? Any feel for the magnitude as well?

It starts out of sync and then drifts further by quite a bit. I wrote a little tool to model this:
https://github.com/mcastelino/testapi/tree/master/opencensus/http

If you look at the Zipkin traces you will see how bad the drift is right off the bat.

Also, some more details on time sync:
https://opensource.com/article/17/6/timekeeping-linux-vms
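A quick way to eyeball the offset without any tracing infrastructure (a rough sketch; assumes a running Kata container named `test` and that the exec round-trip is much smaller than the drift being measured):

```
# Print host time and guest time back to back and compare.
date -u +%s.%N
docker exec test date -u +%s.%N
```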

@devimc

devimc commented Feb 25, 2019

I think this can be fixed using kata-containers/agent#425 and doing something like this: #976

@egernst
Member

egernst commented Mar 5, 2019

@bergwolf WDYT?

Looks like we need an owner for this issue. Any takers?

@amshinde
Member

amshinde commented Mar 8, 2019

@mcastelino The article suggests adding ptp_kvm as a reference clock for chronyd. I was wondering if we could use systemd-timesyncd in place of chronyd since we already package that in our systemd images.

@dylanzr

dylanzr commented Mar 19, 2019

@amshinde I think timesyncd only works with NTP sources, but I may be wrong.

@amshinde
Member

@dylanzr You are right. I looked at this yesterday and realized that timesyncd is quite minimal and works only with NTP; it does not support PTP or hardware clocks.

@WeiZhang555
Member

This feature is really necessary, I love it. Adding chrony to the rootfs sounds good, though agent-as-init may not enjoy it.

@bergwolf
Member

I agree that PHC is quite useful for us. It's also worth noting that the KVM_HC_CLOCK_PAIRING hypercall was introduced in v4.10 (commit 55dd00a73a518), so the feature is not available on older host kernels. Kata's guest rootfs should be prepared to ignore errors when inserting the ptp_kvm kernel module (it fails when KVM_HC_CLOCK_PAIRING is not supported).
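In other words, wherever the rootfs loads the module, the load should be non-fatal. A minimal sketch (the exact hook, e.g. a systemd unit or init script, is left open):

```
# Load ptp_kvm if the host supports KVM_HC_CLOCK_PAIRING; carry on otherwise.
modprobe ptp_kvm || true
```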

@sboeuf

sboeuf commented Mar 20, 2019

@bergwolf good point, this needs to be handled and documented.
Is there any alternative for older hosts (before 4.10)?

@mcastelino
Contributor Author

@bergwolf it is imperative that we have time sync. I see issues in maintaining consistency across the cluster without it.

So we need a fallback for older kernels, and we should not hold up this PR. The VM (i.e. the container) cannot assume NTP connectivity, so if the host kernel is older we should fall back to the gRPC-based sync that was proposed.

@amshinde
Member

I have raised PR kata-containers/osbuilder#256 to add chrony to the rootfs. The PR also configures chrony to use virtual PTP as a source.
This will work for systemd-based rootfs.
For older kernels, I am thinking of disabling chrony based on whether /dev/ptp0 exists, using something like a `ConditionPathExists=/dev/ptp0` clause in the systemd service unit file for chrony (see the sketch below).
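A minimal sketch of that condition as a drop-in, assuming the guest unit is named chronyd.service (the unit name and drop-in path are assumptions and vary by distro):

```
# /etc/systemd/system/chronyd.service.d/ptp.conf
[Unit]
# Only start chronyd when the kvm-ptp clock device is present.
ConditionPathExists=/dev/ptp0
```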

As @mcastelino mentioned, we cannot assume NTP connectivity for the VM, so we can fall back to gRPC-based sync in that case (we cannot rely on NTP sources for chrony or systemd-timesyncd).
The runtime/agent could check the kernel version/chrony status and initiate the gRPC sync, roughly as sketched below.

What do others think about this approach? I would also like some input on how to handle time sync in the case of an initrd-based rootfs.
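For the gRPC path, a rough sketch of what the runtime side could do, assuming the SetGuestDateTime RPC proposed in kata-containers/agent#425 (treat the exact package path and request fields as assumptions):

```go
package main

import (
	"context"
	"time"

	pb "github.com/kata-containers/agent/protocols/grpc" // assumed import path
	"google.golang.org/grpc"
)

// syncGuestClock pushes the current host time into the guest via the agent.
// conn is an established gRPC connection to the kata-agent (e.g. over vsock).
func syncGuestClock(ctx context.Context, conn *grpc.ClientConn) error {
	client := pb.NewAgentServiceClient(conn)
	now := time.Now()
	_, err := client.SetGuestDateTime(ctx, &pb.SetGuestDateTimeRequest{
		Sec:  now.Unix(),
		Usec: int64(now.Nanosecond() / 1000),
	})
	return err
}
```

The runtime would call this once at VM start and then periodically to bound drift, only when /dev/ptp0 (and hence chrony's PHC refclock) is unavailable.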

@Pennyzct
Contributor

Hi~ @mcastelino @egernst @devimc @bergwolf
It's not just old kernels that need chrony disabled: for now, kvm-ptp is only supported on x86.
So for aarch64, we also need to fall back to the gRPC sync. ;)

@amshinde
Member

@Pennyzct gRPC sync is not really going to be accurate. We should be looking at adding PTP support for aarch64.

@Pennyzct
Contributor

@mcastelino yes, we should consider implementing kvm-ptp on aarch64. ;) @jongwu @justin-he

@evanfoster

evanfoster commented Oct 1, 2020

Is there a chance that the jump to QEMU 5.0 broke this? I'm seeing time-sync issues on a 4.19 x86 box running Kata 1.11.3 with QEMU 5.0 and the pre-packaged VM images.

@jodh-intel
Contributor

@evanfoster - could you provide further details of the sync issues? Do you get errors from chrony, or is the time just wrong or drifting? (If so, by how much?)

@jodh-intel
Contributor

@evanfoster - might be worth raising a fresh issue on it and referencing this one.

@evanfoster

evanfoster commented Oct 1, 2020

Hey @jodh-intel,

I haven't set up a debug image to test this yet, so I'm not sure if chrony is unhappy. Will I have to create a debug image, or can I just bump up the log levels to debug?

I'm seeing time drift issues, but the time may also be incorrect on startup. Some folks have reported a 20-second discrepancy on pod start (which breaks their application; they need ±2 seconds), and I have a pod that's running ~63 seconds slow after 15 days.

I don't have quite enough data yet to justify creating a new issue, but I'll work on gathering that so I can do so.

@jodh-intel
Contributor

Hi @evanfoster - yes, you'll need to build a debug image as documented in https://github.com/kata-containers/documentation/blob/master/Developer-Guide.md#set-up-a-debug-console. However, you could just enable full debug as a first step to see if that gives you anything interesting in the logs.
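(The full-debug switch is typically a set of enable_debug flags; a sketch assuming a Kata 1.x configuration.toml, whose location and section names may vary by install:)

```
# e.g. /usr/share/defaults/kata-containers/configuration.toml
[hypervisor.qemu]
enable_debug = true

[agent.kata]
enable_debug = true

[runtime]
enable_debug = true
```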

I hope the move to QEMU 5 didn't break this, since we do have a basic time drift test that should have caught the issue here.

Could you possibly do a bit of digging and maybe open a new issue with the output of kata-collect-data.sh, and reference this issue?

@evanfoster

Can do! It might be a few days, however.
