Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GIT-110239: fix activeDeadlineSeconds enforcement bug #110294

Conversation

harshanarayana
Copy link
Contributor

@harshanarayana harshanarayana commented May 30, 2022

What type of PR is this?

/kind bug
/kind regression

What this PR does / why we need it:

The current implementation of the job_controller.go fails to enforce the activeDeadlineSeconds configuration properly when the JobReadyPods feature gate is enabled.

This causes the pods taking longer than activeDeadlineSeconds to not get deleted on time. When the pods actually do get terminated, it will then track this and mark this as DeadlineExceeded condition. This is not the documented behaviour of activeDeadlineSeconds. This PR addresses the changes required to fix the same.

Which issue(s) this PR fixes:

Fixes #110239

Special notes for your reviewer:

Implementation is based on the suggestion from @alculquicondor in #110239 (comment)

This changes also modifies one of the existing test to make sure the re-queue behaviour is asserted to prevent future regressions.

Does this PR introduce a user-facing change?

Fix 1.24 regression that prevented the job controller from enforcing activeDeadlineSeconds when set

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

None

@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/bug Categorizes issue or PR as related to a bug. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. kind/regression Categorizes issue or PR as related to a regression from a prior release. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels May 30, 2022
@k8s-ci-robot k8s-ci-robot added sig/apps Categorizes an issue or PR as relevant to SIG Apps. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 30, 2022
@harshanarayana harshanarayana force-pushed the bug/git-110239/fix-activedeadline-enforcement branch from 2bf662c to 4c125cc Compare May 30, 2022 19:17
@harshanarayana
Copy link
Contributor Author

/test all

@harshanarayana
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce-ubuntu-containerd

@harshanarayana harshanarayana force-pushed the bug/git-110239/fix-activedeadline-enforcement branch 3 times, most recently from 33f930a to c132ccf Compare May 31, 2022 06:09
@harshanarayana harshanarayana marked this pull request as ready for review May 31, 2022 06:09
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 31, 2022
@harshanarayana
Copy link
Contributor Author

/cc @alculquicondor

@harshanarayana harshanarayana force-pushed the bug/git-110239/fix-activedeadline-enforcement branch from c132ccf to caf33cf Compare May 31, 2022 07:23
@k8s-ci-robot k8s-ci-robot removed the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label May 31, 2022
@alculquicondor
Copy link
Member

/release-note-edit Fix bug that prevented the job controller from enforcing activeDeadlineSeconds when set

@alculquicondor
Copy link
Member

I don't know if that worked. Could you update the issue description with the suggested release note?

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Jun 14, 2022
@harshanarayana
Copy link
Contributor Author

@alculquicondor Done. PTAL. Updated the other PRs with the same

@alculquicondor
Copy link
Member

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 14, 2022
@k8s-ci-robot k8s-ci-robot merged commit 03b18bf into kubernetes:master Jun 14, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.25 milestone Jun 14, 2022
@alculquicondor
Copy link
Member

FYI, the test is flaky #110697

@harshanarayana
Copy link
Contributor Author

@alculquicondor taking a look now.

k8s-ci-robot added a commit that referenced this pull request Jul 7, 2022
…of-#110294-upstream-release-1.22-1655134642

Cherry pick of #110294 GIT-110239: fix activeDeadlineSeconds enforcement bug
k8s-ci-robot added a commit that referenced this pull request Jul 7, 2022
…of-#110294-upstream-release-1.24-1655134646

Cherry pick of #110294 GIT-110239: fix activeDeadlineSeconds enforcement bug
k8s-ci-robot added a commit that referenced this pull request Jul 7, 2022
…of-#110294-upstream-release-1.23-1655134684

Cherry pick of #110294 GIT-110239: fix activeDeadlineSeconds enforcement bug
@alculquicondor
Copy link
Member

/remove-sig scheduling

@k8s-ci-robot k8s-ci-robot removed the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Oct 17, 2022
mimowo added a commit to mimowo/kubernetes that referenced this pull request Oct 20, 2022
…gration/job

I think I'm ready to start review and LGTM code changes within this
package, but not necessarily for the entire sig-apps.

My PRs to the packages:
kubernetes#110292
kubernetes#111113
kubernetes#112948

PRs to the packages I contributed reviews to:
kubernetes#113166
kubernetes#110294
ivelichkovich pushed a commit to ivelichkovich/kubernetes that referenced this pull request Dec 20, 2022
…gration/job

I think I'm ready to start review and LGTM code changes within this
package, but not necessarily for the entire sig-apps.

My PRs to the packages:
kubernetes#110292
kubernetes#111113
kubernetes#112948

PRs to the packages I contributed reviews to:
kubernetes#113166
kubernetes#110294
jaehnri pushed a commit to jaehnri/kubernetes that referenced this pull request Jan 3, 2023
…gration/job

I think I'm ready to start review and LGTM code changes within this
package, but not necessarily for the entire sig-apps.

My PRs to the packages:
kubernetes#110292
kubernetes#111113
kubernetes#112948

PRs to the packages I contributed reviews to:
kubernetes#113166
kubernetes#110294
chenchun pushed a commit to chenchun/kubernetes that referenced this pull request Mar 20, 2024
chenchun pushed a commit to chenchun/kubernetes that referenced this pull request Mar 20, 2024
…equest !670)

Cherry pick of kubernetes#110294: fix activeDeadlineSeconds enforcement bug kubernetes#110543
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/regression Categorizes issue or PR as related to a regression from a prior release. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/backlog Higher priority than priority/awaiting-more-evidence. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/apps Categorizes an issue or PR as relevant to SIG Apps. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

activeDeadlineSeconds not working correctly for Jobs in 1.24.0
7 participants