
fix error type when handling failures in scheduler #111999

Merged

kerthcet merged 3 commits into kubernetes:master from kerthcet:refactor/handle-failure on Aug 30, 2022

Conversation

@kerthcet (Member) commented Aug 24, 2022

Signed-off-by: kerthcet <kerthcet@gmail.com>

What type of PR is this?

/kind bug
/sig scheduling

What this PR does / why we need it:

When we report metrics about a scheduling error, we should also update the pod status with SchedulerError, to keep the two consistent.
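To make the intent concrete, here is a compressed sketch of the change (not the literal diff; the real schedulingCycle handles each failure case separately, and the FitError check shown here is an abridged stand-in):

// Inside the scheduler's schedulingCycle (sketch): classify the failure
// once, then use the same reason for both the metric and the pod status.
reason := v1.PodReasonUnschedulable
if _, isFitError := err.(*framework.FitError); !isFitError {
	// An internal error, not a "pod doesn't fit anywhere" outcome.
	metrics.PodScheduleError(fwk.ProfileName(), metrics.SinceInSeconds(start))
	reason = SchedulerError
}
sched.FailureHandler(ctx, fwk, podInfo, err, reason, nominatingInfo)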

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Pods that fail scheduling due to an expected error will have their status updated with the reason "SchedulerError"
rather than "Unschedulable".

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


Signed-off-by: kerthcet <kerthcet@gmail.com>
@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 24, 2022
@k8s-ci-robot (Contributor)

@kerthcet: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Aug 24, 2022
@kerthcet (Member, Author)

/retest

@sanposhiho (Member)

I haven't investigated deeply, but I think this test failure is related to the change, not flakiness.
The test failed here while waiting for the pod condition's reason to become PodReasonUnschedulable. Seems related.

cond.Reason == v1.PodReasonUnschedulable && pod.Spec.NodeName == "", nil
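For readers without the test open, the quoted condition sits inside a polling helper along these lines (a paraphrased sketch; the helper name and location are assumptions):

// Sketch of the integration-test wait condition being referenced: the test
// polls until the pod is marked unschedulable and is still unassigned.
func podUnschedulable(c clientset.Interface, ns, name string) wait.ConditionFunc {
	return func() (bool, error) {
		pod, err := c.CoreV1().Pods(ns).Get(context.TODO(), name, metav1.GetOptions{})
		if err != nil {
			return false, err
		}
		_, cond := podutil.GetPodCondition(&pod.Status, v1.PodScheduled)
		return cond != nil && cond.Status == v1.ConditionFalse &&
			cond.Reason == v1.PodReasonUnschedulable && pod.Spec.NodeName == "", nil
	}
}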

@kerthcet (Member, Author)

Thanks @sanposhiho, I'll dig into that.

Signed-off-by: kerthcet <kerthcet@gmail.com>
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. release-note-none Denotes a PR that doesn't merit a release note. labels Aug 25, 2022
@kerthcet (Member, Author)

/test pull-kubernetes-integration

Signed-off-by: kerthcet <kerthcet@gmail.com>
@kerthcet (Member, Author)

/retest

 }
-sched.FailureHandler(ctx, fwk, podInfo, err, v1.PodReasonUnschedulable, nominatingInfo)
+sched.FailureHandler(ctx, fwk, podInfo, err, reason, nominatingInfo)
@kerthcet (Member, Author)

This is a breaking change, since we'll change the reason used when updating the pod's status. I have no idea whether the previous behavior was intended, but IMO we should keep the metrics and the reason consistent. cc @Huang-Wei @alculquicondor @ahg-g

Member

Do we consider reason values part of the API after https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3329-retriable-and-non-retriable-failures?

Does CA rely on this value? I am guessing changing this to a different value is not an issue for CA, since we shouldn't be scaling up because of a scheduling error (and CA will not scale up anyway if the pod indeed fits).

Member

I guess it wouldn't break CA. Its logic for finding unschedulable pods uses a field selector like spec.nodeName==,status.phase!=Succeeded,status.phase!=Failed:

https://github.com/kubernetes/autoscaler/blob/81d70f94ad2094b2a20f1c8e52d15aff8df7bac1/cluster-autoscaler/utils/kubernetes/listers.go#L183-L194
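For context, the linked lister builds that server-side filter roughly like this (a sketch of the linked lines, not a verbatim copy):

// Sketch of CA's first-pass filter: list pods that are unassigned and
// neither Succeeded nor Failed, entirely via a field selector.
selector := fields.ParseSelectorOrDie(
	"spec.nodeName==,status.phase!=" + string(apiv1.PodSucceeded) +
		",status.phase!=" + string(apiv1.PodFailed))
listOptions := metav1.ListOptions{FieldSelector: selector.String()}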

However, in https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#how-does-scale-up-work, it says:

Whenever a Kubernetes scheduler fails to find a place to run a pod, it sets "schedulable" PodCondition to false and reason to "unschedulable".

So cc @x13n for confirmation. (TL;DR: we want to figure out whether the scheduler updating a pod with the internal SchedulerError reason would break CA or not.)

@kerthcet (Member, Author)

Do we consider reason values part of the API after https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3329-retriable-and-non-retriable-failures?

From the description https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3329-retriable-and-non-retriable-failures#using-pod-statusreason-field, the reason field is not considered, as it's arbitrary and hard to track across versions. cc @alculquicondor @mimowo for confirmation.

@x13n (Member) commented Aug 29, 2022

CA actually filters pods by reason client-side: https://github.com/kubernetes/autoscaler/blob/81d70f94ad2094b2a20f1c8e52d15aff8df7bac1/cluster-autoscaler/utils/kubernetes/listers.go#L168-L173

When do we expect to get SchedulerError instead of PodReasonUnschedulable? Any error returned by some plugin? If so, CA probably shouldn't attempt to find a place for such a pod; there's no way to tell whether the error can be addressed by adding more capacity to the cluster.
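The client-side filter being referenced looks roughly like this (a sketch of the linked lines, not a verbatim copy; candidates and unschedulablePods are sketch variables):

// Sketch: CA keeps only pods whose PodScheduled condition is False with
// reason "Unschedulable". A pod whose condition carries "SchedulerError"
// would fall through this check and be ignored for scale-up.
for _, pod := range candidates {
	_, cond := podutil.GetPodCondition(&pod.Status, apiv1.PodScheduled)
	if cond != nil && cond.Status == apiv1.ConditionFalse &&
		cond.Reason == "Unschedulable" {
		unschedulablePods = append(unschedulablePods, pod)
	}
}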

Member

If the reason is in the v1 package, it's considered part of the API. We don't filter based on it for Job failure policy, but that's a specific feature. In any case, we should be good to add new values.

@x13n an error is expected in the cases where apiserver is unavailable (this is already the case), or there are parsing errors #111791 (here we are currently returning Unschedulable). So yes, in general CA should skip these pods. This means that CA is not skipping these errors, which is actually a bug.

@alculquicondor (Member) left a comment

/lgtm
/approve
/hold on @x13n to confirm the current understanding


@@ -150,8 +151,9 @@ func (sched *Scheduler) schedulingCycle(ctx context.Context, state *framework.Cy
 nominatingInfo = clearNominatedNode
 klog.ErrorS(err, "Error selecting node for pod", "pod", klog.KObj(pod))
 metrics.PodScheduleError(fwk.ProfileName(), metrics.SinceInSeconds(start))
+reason = SchedulerError
Member

we should probably move it to the v1 package.

But I would do it in a separate PR so that we can cherry-pick this as is.

@kerthcet (Member, Author)

Yes, I was planning to.
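A sketch of what that follow-up could look like (hypothetical at the time of this thread; the constant name and comment are assumptions, mirroring the existing PodReasonUnschedulable):

// staging/src/k8s.io/api/core/v1/types.go (hypothetical follow-up):
const (
	// PodReasonSchedulerError reason in the PodScheduled PodCondition would
	// mean the scheduler hit an internal error while scheduling the pod.
	PodReasonSchedulerError = "SchedulerError"
)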

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 29, 2022
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 29, 2022
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, kerthcet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 29, 2022
@x13n (Member) commented Aug 30, 2022

Got it. In that case I think it should be safe to merge this.

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 30, 2022
@kerthcet (Member, Author)

/retest

@k8s-ci-robot k8s-ci-robot merged commit 04f8a5c into kubernetes:master Aug 30, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.26 milestone Aug 30, 2022
@kerthcet kerthcet deleted the refactor/handle-failure branch August 30, 2022 13:51
k8s-ci-robot added a commit that referenced this pull request Sep 7, 2022
…1999-upstream-release-1.23

Automated cherry pick of #111999: fix error type
k8s-ci-robot added a commit that referenced this pull request Sep 7, 2022
…1999-upstream-release-1.24

Automated cherry pick of #111999: fix error type
k8s-ci-robot added a commit that referenced this pull request Sep 7, 2022
…1999-upstream-release-1.25

Automated cherry pick of #111999: fix error type