fix error type when handling failures in scheduler #111999
Conversation
Signed-off-by: kerthcet <kerthcet@gmail.com>
@kerthcet: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the `triage/accepted` label and provide further guidance. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/retest
I haven't investigated deeply, but I think this test failure is related to the change, not a flake: kubernetes/test/integration/util/util.go, line 775 (at 6c0bab8).
Thanks @sanposhiho, I'll dig into that.
Signed-off-by: kerthcet <kerthcet@gmail.com>
/test pull-kubernetes-integration
/retest |
```diff
 	}
-	sched.FailureHandler(ctx, fwk, podInfo, err, v1.PodReasonUnschedulable, nominatingInfo)
+	sched.FailureHandler(ctx, fwk, podInfo, err, reason, nominatingInfo)
```
This is a breaking change, since we'll change the reason when updating the pod's status. I have no idea whether this was intended, but IMO we should keep the metrics and the reason consistent. cc @Huang-Wei @alculquicondor @ahg-g
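For illustration, here is a self-contained Go sketch of the invariant under discussion — the reason stamped on the pod status should match the metric family that was incremented. The names and metric strings here are toy stand-ins, not the actual scheduler source:

```go
package main

import (
	"errors"
	"fmt"
)

// Toy constants mirroring the reasons discussed in this thread.
const (
	PodReasonUnschedulable = "Unschedulable"  // no node fits the pod
	SchedulerError         = "SchedulerError" // internal scheduler failure
)

// recordFailure captures the invariant this PR restores: the reason written
// to the pod status must match the metric family that was incremented.
func recordFailure(err error, isFitError bool) (metric, reason string) {
	if isFitError {
		// Genuinely unschedulable: adding nodes may help, so CA may act.
		return "pod_unschedulable", PodReasonUnschedulable
	}
	// Internal error (e.g. apiserver unavailable): before this PR the error
	// metric was incremented but the pod was still stamped Unschedulable.
	return "pod_schedule_error", SchedulerError
}

func main() {
	m, r := recordFailure(errors.New("0/3 nodes are available"), true)
	fmt.Println(m, r) // pod_unschedulable Unschedulable
	m, r = recordFailure(errors.New("parsing error"), false)
	fmt.Println(m, r) // pod_schedule_error SchedulerError
}
```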
Do we consider `reason` values part of the API after https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3329-retriable-and-non-retriable-failures?
Does CA rely on this value? I am guessing changing this to a different value is not an issue with CA, since we shouldn't be scaling up because of a scheduling error (and CA will not scale up anyway if the pod indeed fits).
I guess it wouldn't break CA. Its logic for finding unschedulable pods uses the field selector `spec.nodeName==,status.phase!=Succeeded,status.phase!=Failed`.
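For reference, a minimal client-go sketch (assuming a local kubeconfig; this is not CA's actual code) of what listing pods with that field selector looks like:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the default local kubeconfig (an assumption here).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// The same server-side field selector CA uses to pre-filter candidates:
	// unassigned pods that have not already terminated.
	pods, err := client.CoreV1().Pods(metav1.NamespaceAll).List(context.TODO(), metav1.ListOptions{
		FieldSelector: "spec.nodeName==,status.phase!=Succeeded,status.phase!=Failed",
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("found %d unassigned, non-terminated pods\n", len(pods.Items))
}
```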
However, in https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#how-does-scale-up-work, it says:
Whenever a Kubernetes scheduler fails to find a place to run a pod, it sets "schedulable" PodCondition to false and reason to "unschedulable".
So cc @x13n for confirmation. (TL;DR: we want to figure out whether the scheduler updating a pod with an internal SchedulerError reason would break CA or not.)
> Do we consider `reason` values part of the API after https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3329-retriable-and-non-retriable-failures?
From the description at https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3329-retriable-and-non-retriable-failures#using-pod-statusreason-field, `reason` is not considered part of the API, as it's arbitrary and hard to track between different versions. cc @alculquicondor @mimowo for confirmation.
CA actually filters pods by reason client-side: https://github.com/kubernetes/autoscaler/blob/81d70f94ad2094b2a20f1c8e52d15aff8df7bac1/cluster-autoscaler/utils/kubernetes/listers.go#L168-L173
When do we expect to get `SchedulerError` instead of `PodReasonUnschedulable`? Any error returned by some plugin? If so, CA probably shouldn't attempt to find a place for such a pod; there's no way to tell whether the error can be addressed by adding more capacity to the cluster.
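For context, a simplified sketch of that client-side filter (condensed from the linked listers.go logic; the helper name is made up): only pods whose PodScheduled condition is False with reason Unschedulable pass, so pods stamped with SchedulerError would be skipped:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// isUnschedulableForCA sketches the client-side filter CA applies on top of
// the field selector: only pods whose PodScheduled condition is False with
// reason Unschedulable are treated as scale-up candidates.
func isUnschedulableForCA(pod *corev1.Pod) bool {
	for _, cond := range pod.Status.Conditions {
		if cond.Type == corev1.PodScheduled &&
			cond.Status == corev1.ConditionFalse &&
			cond.Reason == corev1.PodReasonUnschedulable {
			return true
		}
	}
	return false
}

func main() {
	pod := &corev1.Pod{}
	pod.Status.Conditions = []corev1.PodCondition{{
		Type:   corev1.PodScheduled,
		Status: corev1.ConditionFalse,
		Reason: "SchedulerError", // after this PR, internal errors use this reason
	}}
	fmt.Println(isUnschedulableForCA(pod)) // false: CA skips scheduler errors
}
```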
If the reason is in the v1 package, it's considered part of the API. We don't filter based on it for Job failure policy, but that's a specific feature. In any case, we should be good to add new values.
@x13n an error is expected in cases where the apiserver is unavailable (this is already the case) or there are parsing errors (#111791; here we are currently returning Unschedulable). So yes, in general CA should skip these pods. This means that CA is not currently skipping these errors, which is actually a bug.
/lgtm
/approve
/hold on @x13n to confirm the current understanding
```diff
@@ -150,8 +151,9 @@ func (sched *Scheduler) schedulingCycle(ctx context.Context, state *framework.Cy
 	nominatingInfo = clearNominatedNode
 	klog.ErrorS(err, "Error selecting node for pod", "pod", klog.KObj(pod))
 	metrics.PodScheduleError(fwk.ProfileName(), metrics.SinceInSeconds(start))
+	reason = SchedulerError
```
We should probably move it to the v1 package. But I would do that in a separate PR so that we can cherry-pick this one as is.
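A hypothetical sketch of what that follow-up could look like (the constant names here are assumptions based on this thread; later Kubernetes releases did add a similar constant to the core v1 package):

```go
// Hypothetical follow-up: promote the scheduler-local constant into the
// core v1 API package so external consumers (like CA) can reference it.
package v1

const (
	// PodReasonUnschedulable, set on the PodScheduled condition, means the
	// scheduler could not fit the pod on any node.
	PodReasonUnschedulable = "Unschedulable"

	// PodReasonSchedulerError would indicate an internal scheduler error
	// (e.g. the apiserver was unreachable), rather than a lack of capacity.
	PodReasonSchedulerError = "SchedulerError"
)
```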
Yes, I was planning to.
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, kerthcet

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
Got it. In that case I think it should be safe to merge this.

/hold cancel
/retest |
…1999-upstream-release-1.23 Automated cherry pick of #111999: fix error type
…1999-upstream-release-1.24 Automated cherry pick of #111999: fix error type
…1999-upstream-release-1.25 Automated cherry pick of #111999: fix error type
Signed-off-by: kerthcet <kerthcet@gmail.com>
What type of PR is this?
/kind bug
/sig scheduling
What this PR does / why we need it:
When we report metrics about a scheduling error, we should also update the pod status with `SchedulerError`, to keep them consistent.

Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: