clarify CPUCFSQuotaPeriod values, set the minimum to 1ms #112123

Merged
merged 1 commit into from Sep 12, 2022

Conversation

paskal
Contributor

@paskal paskal commented Aug 30, 2022

cpu.cfs_period_us is measured in microseconds in the kernel but is provided as a time.Duration by the user. This change clarifies the code to make that evident to the reader.

Also, the minimum value for that feature is 1ms and not 1μs, and this change alters the validation to reject values smaller than 1ms (kernel code reference).
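To illustrate the unit mismatch described above, here is a minimal Go sketch. It is not the kubelet's actual code: `periodToMicros` and `cpuMaxValue` are hypothetical helpers, and the `max <period>` layout assumes the cgroup v2 `cpu.max` format with an unlimited quota.

```go
package main

import (
	"fmt"
	"time"
)

// defaultPeriod mirrors the 100ms period mentioned in this PR.
const defaultPeriod = 100 * time.Millisecond

// tinyPeriod is an example of a value below the 1ms minimum.
const tinyPeriod = 1 * time.Microsecond

// periodToMicros is a hypothetical helper: it converts the user-facing
// time.Duration into the microsecond count the kernel works with.
func periodToMicros(period time.Duration) int64 {
	return period.Microseconds()
}

// cpuMaxValue is a hypothetical helper that renders an unlimited-quota
// cgroup v2 cpu.max value, assumed here to be "max <period in microseconds>".
func cpuMaxValue(period time.Duration) string {
	return fmt.Sprintf("max %d", periodToMicros(period))
}

func main() {
	fmt.Println(periodToMicros(defaultPeriod)) // 100000
	fmt.Println(cpuMaxValue(tinyPeriod))       // max 1
}
```

Under these assumptions, a 1μs period ends up as a period of `1` in the value handed to the kernel, which is below the 1ms (1000μs) minimum.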

This PR is a revert of #111554 plus updates based on the clarification of how CFS works in the kernel, provided by @odinuge in #112108 (comment)

/kind documentation
/kind bug

Does this PR introduce a user-facing change?

Yes and no. Previously, providing any value less than 1ms, such as 1μs, would result in errors like the following:

failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: failed to write "max 1": write /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod566772ddea2984f4084dd4cd59d51bff.slice/cri-containerd-78fd8bfc8b28488f6d87c32d6c79810266f119a935675ce8cdc435fd8a9a01c3.scope/cpu.max: invalid argument: unknown

After this change, any value below 1ms fails validation earlier, with a more straightforward error message.
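The earlier failure described above could look roughly like the following Go sketch. The function name `validateCFSQuotaPeriod` and the error wording are hypothetical, and the 1s upper bound is an assumption based on the kernel's CFS bandwidth limits rather than something stated in this PR.

```go
package main

import (
	"fmt"
	"time"
)

// Bounds used by this sketch. The 1ms minimum comes from the PR description;
// the 1s maximum is an assumption, not taken from this PR.
const (
	minPeriod = 1 * time.Millisecond
	maxPeriod = 1 * time.Second
)

// validateCFSQuotaPeriod is a hypothetical validator that rejects periods
// outside [minPeriod, maxPeriod] before anything is written to the cgroup.
func validateCFSQuotaPeriod(period time.Duration) error {
	if period < minPeriod || period > maxPeriod {
		return fmt.Errorf("cpuCFSQuotaPeriod %v must be between %v and %v, inclusive",
			period, minPeriod, maxPeriod)
	}
	return nil
}

func main() {
	for _, p := range []time.Duration{time.Microsecond, 100 * time.Millisecond} {
		if err := validateCFSQuotaPeriod(p); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", p)
	}
}
```

With a check like this, a 1μs value is rejected at configuration time instead of surfacing later as an `invalid argument` error from the container runtime.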

Clarified the CFS quota as 100ms in the code comments and set the minimum cpuCFSQuotaPeriod to 1ms to match Linux kernel expectations.
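For readers unfamiliar with how the period relates to a CPU limit, the sketch below shows the usual quota arithmetic in Go, with everything expressed in microseconds. The helper `milliCPUToQuota` and its constants are illustrative names, and the 1ms floor on the quota is an assumption for this example, not a quote of the kubelet code.

```go
package main

import "fmt"

const (
	milliCPUPerCPU      = 1000   // millicores in one full CPU
	defaultPeriodMicros = 100000 // the 100ms period, expressed in microseconds
	minQuotaMicros      = 1000   // assumed floor for the quota: 1ms in microseconds
)

// milliCPUToQuota is a hypothetical helper: it scales the period by the
// requested CPU fraction to get the runtime allowed per period.
func milliCPUToQuota(milliCPU, periodMicros int64) int64 {
	if milliCPU == 0 {
		return 0 // no CPU limit requested, so no quota is computed
	}
	quota := milliCPU * periodMicros / milliCPUPerCPU
	if quota < minQuotaMicros {
		quota = minQuotaMicros
	}
	return quota
}

func main() {
	fmt.Println(milliCPUToQuota(500, defaultPeriodMicros))  // 50000
	fmt.Println(milliCPUToQuota(2000, defaultPeriodMicros)) // 200000
}
```

In this sketch a 500m limit with the 100ms period yields 50ms of runtime per period, and a 2-CPU limit yields 200ms per period.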

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/documentation Categorizes issue or PR as related to documentation. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 30, 2022
@k8s-ci-robot
Contributor

Hi @paskal. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/code-generation area/kubelet kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 30, 2022
@paskal
Contributor Author

paskal commented Aug 30, 2022

Waiting for /ok-to-test on this one.

@odinuge
Member

odinuge commented Aug 30, 2022

/ok-to-test
/priority backlog
/triage accepted

Feel free to ping me when ready for me to look at. Took a brief look now, and things look good.

The validation change is a breaking API change, but IMO it's fine, since users setting values lower than 1ms will have problems anyway and will get much less understandable error messages.

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/backlog Higher priority than priority/awaiting-more-evidence. triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 30, 2022
@paskal paskal marked this pull request as ready for review August 30, 2022 23:27
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 30, 2022
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 31, 2022
@szuecs
Member

szuecs commented Sep 2, 2022

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 2, 2022
@szuecs
Member

szuecs commented Sep 2, 2022

/assign @bobbypage

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 3, 2022
@odinuge
Member

odinuge commented Sep 5, 2022

Thanks, looks good now!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 5, 2022
@pacoxu pacoxu moved this from Needs Reviewer to Needs Approver in SIG Node PR Triage Sep 7, 2022
@bobbypage
Member

Thanks for the help clarifying our misinterpretation regarding the units. A couple of suggestions to make it even clearer:

pkg/kubelet/cm/container_manager_linux.go (outdated, resolved)
pkg/kubelet/cm/helpers_linux.go (resolved)
pkg/kubelet/cm/helpers_linux.go (resolved)
cpu.cfs_period_us is measured in microseconds in the kernel but is
provided as a time.Duration by the user. This change clarifies the code
to make that evident to the reader.

Also, the minimum value for that feature is 1ms and not 1μs, and this
change alters the validation to reject values smaller than 1ms.
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 8, 2022
@bobbypage
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 9, 2022
@paskal
Contributor Author

paskal commented Sep 10, 2022

cc @liggitt @klueska for review, as you reviewed and approved the previous PR, which I'm now reverting and expanding.

@paskal paskal requested review from bobbypage and removed request for mrunalp, odinuge and derekwaynecarr September 10, 2022 08:50
@liggitt
Member

liggitt commented Sep 12, 2022

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt, paskal, szuecs

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 12, 2022
@k8s-ci-robot k8s-ci-robot merged commit 74469ca into kubernetes:master Sep 12, 2022
SIG Node PR Triage automation moved this from Needs Approver to Done Sep 12, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.26 milestone Sep 12, 2022
@paskal paskal deleted the paskal/cfs_clarification branch September 12, 2022 15:35
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/code-generation area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/bug Categorizes issue or PR as related to a bug. kind/documentation Categorizes issue or PR as related to documentation. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/backlog Higher priority than priority/awaiting-more-evidence. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/node Categorizes an issue or PR as relevant to SIG Node. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.

9 participants