New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable the "Retriable and non-retriable pod failures for jobs" feature into beta #113360
Enable the "Retriable and non-retriable pod failures for jobs" feature into beta #113360
Conversation
This PR may require API review. If so, when the changes are ready, complete the pre-review checklist and request an API review. Status of requested reviews is tracked in the API Review project. |
Can you link all the PRs that this depends on? |
/retest |
Done |
36edb2b
to
d3e7057
Compare
bf4309a
to
8927d37
Compare
/retest |
8927d37
to
94009c9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API constant change lgtm
a couple questions about the patch tests and one request on the default RBAC role if possible
@@ -327,6 +327,7 @@ func TestCreateNode(t *testing.T) { | |||
description string | |||
pods []v1.Pod | |||
node *v1.Node | |||
expectPatch bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this verify anything about the content of the patch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answered below
@@ -766,6 +770,7 @@ func TestEventualConsistency(t *testing.T) { | |||
newPod *v1.Pod | |||
oldNode *v1.Node | |||
newNode *v1.Node | |||
expectPatch bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this verify anything about the content of the patch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This a unit test and the logic to generate the patch content is beyond Taint Manager, which prepares and sends the request:
if _, err := c.CoreV1().Pods(pod.Namespace).ApplyStatus(ctx, podApply, metav1.ApplyOptions{FieldManager: fieldManager, Force: true}); err != nil { |
Test coverage for the feature would not change much as we already have integration tests which verifies the condition is added:
_, cond := podutil.GetPodCondition(&testPod.Status, v1.AlphaNoCompatGuaranteeDisruptionTarget) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The patch would be a slice of bytes. I'm fine not checking it, given that we have integration tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, closed there related issue with a comment: #111612 (comment)
@@ -337,6 +337,9 @@ func TestPostFilter(t *testing.T) { | |||
} | |||
// As we use a bare clientset above, it's needed to add a reactor here | |||
// to not fail Victims deletion logic. | |||
cs.PrependReactor("patch", "pods", func(action clienttesting.Action) (bool, runtime.Object, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's surprising these don't need to handle the patch/delete requests in any way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This unit test focuses asserts on nominating the node for preemption. It does not assert the modifications done to the pod so it ignores the DELETE (and now also PATCH) action.
As a proposal I pushed a new commit implementing another approach (I find simple and more powerful) to index the potential victim pods in the fake client so that the test does not fail trying to patch non-existing pods (proposed in the new commit).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alculquicondor @soltysh PTAL
plugin/pkg/auth/authorizer/rbac/bootstrappolicy/testdata/controller-roles.yaml
Outdated
Show resolved
Hide resolved
5726091
to
818e180
Compare
/retest |
RBAC and API changes lgtm will defer to area approvers for ack of patch test at #113360 (comment) |
@@ -766,6 +770,7 @@ func TestEventualConsistency(t *testing.T) { | |||
newPod *v1.Pod | |||
oldNode *v1.Node | |||
newNode *v1.Node | |||
expectPatch bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The patch would be a slice of bytes. I'm fine not checking it, given that we have integration tests.
for _, pod := range tt.pods { | ||
podItems = append(podItems, *pod) | ||
} | ||
cs := clientsetfake.NewSimpleClientset(&v1.PodList{Items: podItems}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great, this seems better.
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alculquicondor, liggitt, mimowo, soltysh The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
adding to milestone per code-freeze exception granted at https://groups.google.com/g/kubernetes-sig-release/c/tZPjSWW_g30/m/_FzB7nrUAAAJ?utm_medium=email&utm_source=footer /milestone v1.26 |
exception request approved: email Your updated deadline to make any changes to your PR is 18:00 PST Friday 11th November 2022. /milestone v1.26 |
/retest |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Which issue(s) this PR fixes:
Special notes for your reviewer:
It depends on the following PRs:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: