New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change CPUCFSQuotaPeriod default value from 100ms to 100us to match Linux default #111520
Change CPUCFSQuotaPeriod default value from 100ms to 100us to match Linux default #111520
Conversation
Welcome @paskal! |
Hi @paskal. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/retest |
Unit tests failed due to yaml file having |
E2E tests need to be passing to get the PR merged. Kubelet is failing to start because:
|
Thanks a lot for pointer to the logs location! I'll fix it, but I'm struggling to understand of why my changes triggered that problem, as I only changed existing value and haven't introduced configuration overrides anywhere in the code. |
cpu.cfs_period_us is 100μs by default despite having an "ms" unit for some unfortunate reason. Documentation: https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html#management The desired effect of that change is to match k8s default `CPUCFSQuotaPeriod` value (100ms before that change) with one used in k8s without the `CustomCPUCFSQuotaPeriod` flag enabled and Linux CFS (100us, 1000x smaller than 100ms).
I'm clueless about how to properly fix the error which is highlighted by CI now: for some reason thousands of lines supposed to be added to |
It was an unrelated failure now resolved in master /retest |
Thanks! Reviewers, please review the code and the changelog entry part in the PR description: the PR changed the scope since the original introduction and requires a proper review before merging. #111554 (code-comments-only change) should be reviewed as well. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: liggitt, paskal, szuecs The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@bobbypage, could you please tell me if something is still missing here and in #111554, which prevents merging them into master? I'm not familiar with k8s merge flow enough to know what's missing at this moment here, and in another PR, approval is missing, but I already pinged all approvers once, and nothing happened. |
@paskal It looks to me like there are three people still pending approval on this one: bobbypage It also says tide is pending and isn't mergeable, although I am not familiar with tide. I hope they see it soon, I've been anxiously awaiting this change since you first opened it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
it'll merge when we thaw the codebase after the release of 1.25.
/hold cancel |
/lgtm |
@paskal I read through your discussion in #112108 and while I now also follow, I believe that the setting should not include a unit if the default value is a conflicting unit. If the value can never be less than 1ms then there's never a reason for microseconds to come into play. I think the "_us" should be removed from the setting and the current description stating the range of valid values be left unchanged. Unfortunately, I don't know how to go about requesting that change. I'm coming at this from the same perspective as many other people who've stumbled on these conversations due to poorly performing applications under CFS. What do you suggest? |
@PixelOrange, I would propose you check changes in #112123 and verify if something is missing there from your perspective. The valid range of values is 1ms-1s, and the k8s source code will be updated to reflect it.
|
Everything looks good as far as I can tell. That's unfortunate about the public API. I found an article on how to submit kernel patches but I imagine that's a daunting task. If you want a second set of eyes for the documentation updates, please feel free to message me. |
Thanks @paskal for going the long way. |
cpu.cfs_period_us is 100μs by default despite having an "ms" unit for some unfortunate reason. Documentation: https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html#management
The desired effect of that change is to match k8s default
CPUCFSQuotaPeriod
value (100ms before that change) with one used in k8s whenCPUCFSQuotaPeriod
is not set, and Linux CFS (100μs, 1000x smaller than 100ms).This PR is a followup of the #63437, which introduced the default of 100ms in k8s v1.12.
/kind api-change
Does this PR introduce a user-facing change? Yes, described above.