Use checksums instead of fsyncs to avoid slow discovery caching on MacOS #110851
Conversation
Hi @negz. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply accordingly; once the patch is verified, the new status will be reflected. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/assign @Jefftree
/ok-to-test
staging/src/k8s.io/client-go/discovery/cached/disk/round_tripper.go
/priority important-soon
Yeah, we never use that for discovery (yet), and OpenAPI changes all the time. This is as bad as installing a CRD on a cluster. I don't know if it warrants a release note.
/retest
This benchmark is intended to demonstrate the performance improvement gained by removing fsyncs. Refer to the issue below for more detail. kubernetes#110753 Signed-off-by: Nic Cope <nicc@rk0n.org>
Part of the API discovery cache uses an HTTP RoundTripper that transparently caches responses to disk. The upstream implementation of the disk cache is hard-coded to call Sync() on every file it writes. This has noticeably poor performance on modern Macs, which ask their disk controllers to flush all the way to persistent storage because Go uses the `F_FULLFSYNC` fcntl. Apple recommends minimizing this behaviour in order to avoid degrading performance and increasing disk wear. The content of the discovery cache is not critical; it is indeed just a cache and can be recreated by hitting the API servers' discovery endpoints. This commit replaces upstream httpcache's diskcache implementation with a similar implementation that can use CRC-32 checksums to detect corrupted cache entries at read time. When such an entry is detected (e.g. because it was only partially flushed to permanent storage before the host lost power) the cache will report a miss. This causes httpcache to fall back to its underlying HTTP transport (i.e. the real API server) and re-cache the resulting value. Apart from adding CRC-32 checksums and avoiding calling fsync, this implementation differs from upstream httpcache's diskcache package in that it uses FNV-32a hashes rather than MD5 hashes of cache keys in order to generate filenames. Signed-off-by: Nic Cope <nicc@rk0n.org>
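The checksum-at-read idea in this commit can be sketched as follows. This is an illustrative sketch, not the actual client-go code; `writeEntry` and `readEntry` are hypothetical names. A corrupted entry is reported as a miss rather than an error, so the caller simply re-fetches from the real transport.

```go
package main

import (
	"encoding/binary"
	"errors"
	"hash/crc32"
)

// writeEntry prepends a CRC-32 checksum to the cached response bytes
// before they are written to disk.
func writeEntry(data []byte) []byte {
	sum := crc32.ChecksumIEEE(data)
	buf := make([]byte, 4+len(data))
	binary.BigEndian.PutUint32(buf, sum)
	copy(buf[4:], data)
	return buf
}

// readEntry verifies the checksum; any mismatch (e.g. a partially
// flushed file) is treated as a cache miss.
func readEntry(entry []byte) ([]byte, error) {
	if len(entry) < 4 {
		return nil, errors.New("cache miss: entry too short")
	}
	want := binary.BigEndian.Uint32(entry)
	data := entry[4:]
	if crc32.ChecksumIEEE(data) != want {
		return nil, errors.New("cache miss: checksum mismatch")
	}
	return data, nil
}
```

Because the checksum is verified at read time, the write path never needs to fsync: in the worst case a torn write just costs one extra round trip to the API server.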
This helps avoid (potentially malicious) collisions when reading and writing cache data. Signed-off-by: Nic Cope <nicc@rk0n.org>
This is a little more computationally expensive but reduces the likelihood of a potentially malicious cache collision. Signed-off-by: Nic Cope <nicc@rk0n.org>
Force-pushed from d0b1b63 to c5957c2
/retest
/lgtm
Hey folks, just checking in ahead of code freeze next week. Do you feel this is mergeable as is?
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: lavalamp, negz, seans3. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Should this PR have a release note?
@negz Could you provide a one-liner for release notes?
@csantanapr Not sure how much I'd consider this a user-facing change, but I've added a one-liner nonetheless. Thanks!
Thanks @negz. End users can observe the effects of this change without using debug tools.
What type of PR is this?
/kind bug
What this PR does / why we need it:
Short version: Avoids kubectl "hanging" for 10+ seconds while writing to cache when an API server has many API groups.
Part of the API discovery cache uses an HTTP RoundTripper that transparently caches responses to disk. The upstream implementation of the disk cache is hard-coded to call `Sync()` on every file it writes. This has noticeably poor performance on modern Macs, which ask their disk controllers to flush all the way to persistent storage because Go uses the `F_FULLFSYNC` fcntl. Apple recommends minimizing this behavior in order to avoid degrading performance and increasing disk wear.

The content of the discovery cache is not critical; it is indeed just a cache and can be recreated by hitting the API servers' discovery endpoints. This commit replaces upstream httpcache's diskcache implementation with a similar implementation that can use SHA-256 sums to detect corrupted cache entries at read time. When such an entry is detected (e.g. because it was only partially flushed to permanent storage before the host lost power) the cache will report a miss. This causes httpcache to fall back to its underlying HTTP transport (i.e. the real API server) and re-cache the resulting value.

Apart from adding SHA-256 sums and avoiding calling fsync, this implementation differs from upstream httpcache's diskcache package in that it also uses SHA-256 sums rather than MD5 sums of cache keys in order to sanitize cache filenames.
Note that I'm adding this implementation here rather than submitting it upstream because httpcache does not appear to have been touched for the last three years - it seems unlikely that they are accepting contributions.
Which issue(s) this PR fixes:
Fixes #110753
Special notes for your reviewer:
Since this is intended to improve slow performance I've added a benchmark test. You can see there's a dramatic speed improvement on MacOS:
I also see a decent speed improvement on Linux, though note that I'm running this test inside a Parallels VM:
Finally, invoking kubectl against an API server with 300+ API groups now takes about a second rather than 10+ (as demonstrated in #110753).
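A benchmark along these lines, comparing small-file writes with and without `Sync()`, might be sketched as below. These are hypothetical names, not the PR's actual benchmark; on darwin the `Sync()` call is where Go issues the expensive `F_FULLFSYNC` fcntl.

```go
package main

import (
	"os"
	"path/filepath"
	"testing"
)

// benchmarkWrite writes a 4 KiB cache entry per iteration, optionally
// calling Sync() the way the upstream diskcache does.
func benchmarkWrite(b *testing.B, sync bool) {
	dir := b.TempDir()
	data := make([]byte, 4096)
	for i := 0; i < b.N; i++ {
		f, err := os.Create(filepath.Join(dir, "entry"))
		if err != nil {
			b.Fatal(err)
		}
		if _, err := f.Write(data); err != nil {
			b.Fatal(err)
		}
		if sync {
			// On macOS this triggers F_FULLFSYNC, flushing through
			// to persistent storage.
			if err := f.Sync(); err != nil {
				b.Fatal(err)
			}
		}
		f.Close()
	}
}

func BenchmarkWriteSync(b *testing.B)   { benchmarkWrite(b, true) }
func BenchmarkWriteNoSync(b *testing.B) { benchmarkWrite(b, false) }
```

Running both benchmarks side by side (`go test -bench .`) is the easiest way to reproduce the gap between synced and unsynced writes on a given machine.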
Does this PR introduce a user-facing change?