
EndpointSlice with Pods without an existing Node #110639

Merged
merged 3 commits into kubernetes:master on Jun 22, 2022

Conversation

@aojea (Member) commented Jun 17, 2022

/kind bug
/kind cleanup
/kind documentation

What this PR does / why we need it:

endpointslices: node missing on Pod scenario

When a Pod references a Node that doesn't exist in the local
informer cache, the previous behavior was to return an error so the
sync would be retried later, and to stop processing.
However, this can cause scenarios where a missing node leaves a
Slice stuck: it can neither reflect other changes nor be created.
It also doesn't respect the publishNotReadyAddresses option on
Services, which considers it acceptable to publish pod addresses
that are known to not be ready.

The new behavior keeps retrying the problematic Service, but it
keeps processing the updates, reflecting the current state in the
EndpointSlice. If publishNotReadyAddresses is set, a missing
node on a Pod is not treated as an error.
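
A minimal sketch of that decision, assuming a hypothetical helper and NodeGetter interface rather than the controller's actual internals:

	package endpointslice

	import (
		"fmt"

		v1 "k8s.io/api/core/v1"
	)

	// NodeGetter is an illustrative stand-in for the node informer lister.
	type NodeGetter interface {
		Get(name string) (*v1.Node, error)
	}

	// nodeForPod sketches the new behavior: a missing Node is only an error
	// when the Service does not publish not-ready addresses.
	func nodeForPod(pod *v1.Pod, svc *v1.Service, nodes NodeGetter) (*v1.Node, error) {
		node, err := nodes.Get(pod.Spec.NodeName)
		if err != nil {
			if !svc.Spec.PublishNotReadyAddresses {
				// Surface the error so the Service keeps being retried,
				// while the caller continues processing the other pods.
				return nil, fmt.Errorf("node %q for pod %q not found: %w",
					pod.Spec.NodeName, pod.Name, err)
			}
			// publishNotReadyAddresses is set: publish the address anyway.
			node = nil
		}
		return node, nil
	}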

EndpointSlices with Pods referencing Nodes that don't exist couldn't be created or updated.
The behavior of the EndpointSlice controller has been modified to update the EndpointSlice without the Pods that reference non-existent Nodes, and to keep retrying until all Pods reference existing Nodes.
However, if service.Spec.PublishNotReadyAddresses is set, all the Pods are published without retrying.
Fixed EndpointSlice metrics to correctly reflect the number of desired EndpointSlices when no endpoints are present.

Fixes: #107927

@k8s-ci-robot added the release-note, kind/bug, size/L, kind/cleanup, kind/documentation, cncf-cla: yes, do-not-merge/needs-sig, and needs-triage labels Jun 17, 2022
@k8s-ci-robot (Contributor)

@aojea: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the needs-priority label Jun 17, 2022
@aojea changed the title from "Slice no node" to "EndpointSlice with Pods without an existing Node" Jun 17, 2022
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved, sig/apps, and sig/network labels and removed the do-not-merge/needs-sig label Jun 17, 2022
@aojea (Member, Author) commented Jun 17, 2022

/assign @thockin @robscott

@aojea (Member, Author) commented Jun 17, 2022

/priority important-soon

@k8s-ci-robot added the priority/important-soon label and removed the needs-priority label Jun 17, 2022
@@ -89,7 +89,13 @@ func (spc *ServicePortCache) totals(maxEndpointsPerSlice int) (int, int, int) {
 	for _, eInfo := range spc.items {
 		endpoints += eInfo.Endpoints
 		actualSlices += eInfo.Slices
-		desiredSlices += numDesiredSlices(eInfo.Endpoints, maxEndpointsPerSlice)
+		if eInfo.Endpoints > 0 {
+			desiredSlices += numDesiredSlices(eInfo.Endpoints, maxEndpointsPerSlice)
+		}
Member

@aojea Not suggesting a change necessarily, but would it be worth updating numDesiredSlices() to check for numEndpoints==0 and return 0, rather than what it does right now and return 1? Then you can just special-case desiredSlices=0 at the bottom but not have to special-case 0 in the loop.
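
A sketch of that suggestion (hypothetical; the change merged in this PR keeps the special case in the caller's loop instead, and the ceiling-division body is an assumption about the existing helper's math):

	package metrics

	import "math"

	// As suggested: zero endpoints need zero desired slices, so callers
	// special-case the "always one placeholder slice" total exactly once.
	func numDesiredSlices(numEndpoints, maxEndpointsPerSlice int) int {
		if numEndpoints == 0 {
			return 0
		}
		return int(math.Ceil(float64(numEndpoints) / float64(maxEndpointsPerSlice)))
	}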

Member

Also, do we need to change endpointslicemirroring at all?

@aojea (Member, Author) commented Jun 22, 2022

yeah, I think that is a matter of where we want to special-case "there is always one slice, even if there are no endpoints at all" ... I took this approach because numDesiredSlices is also used in

https://github.com/kubernetes/kubernetes/blob/ee08c977fd2e44ceb10506290fbaaf6b242d6a99/pkg/controller/endpointslicemirroring/metrics/cache.go#L139

@aojea (Member, Author)

hmm, seems you are right

@aojea (Member, Author)

the endpointslicemirroring controller seems to not use a placeholder, and already returns 0 slices for 0 endpoints, but better to have @robscott confirm

There is always a placeholder slice.

The ServicePortCache logic always counted one EndpointSlice per
Endpoint, but if there are multiple empty Endpoints we use a single
placeholder slice, not multiple placeholder slices.
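
Put together with the diff above, the corrected accounting sketches out roughly like this (a reconstruction from the visible hunk and commit message, not the verbatim merged code):

	func (spc *ServicePortCache) totals(maxEndpointsPerSlice int) (int, int, int) {
		var actualSlices, desiredSlices, endpoints int
		for _, eInfo := range spc.items {
			endpoints += eInfo.Endpoints
			actualSlices += eInfo.Slices
			// Only non-empty Endpoints contribute to the desired count.
			if eInfo.Endpoints > 0 {
				desiredSlices += numDesiredSlices(eInfo.Endpoints, maxEndpointsPerSlice)
			}
		}
		// There is always one placeholder slice, even with zero endpoints.
		if desiredSlices == 0 {
			desiredSlices = 1
		}
		return actualSlices, desiredSlices, endpoints
	}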
@dcbw (Member) commented Jun 22, 2022

/lgtm

@k8s-ci-robot added the lgtm label Jun 22, 2022
@dcbw (Member) commented Jun 22, 2022

@aojea the conformance test failure looks odd, because it's about EPS. However, it doesn't look related to EndpointSlice at all; the failure is that the test pods don't get their IPs within a 3m timeout. TLDR the image pulls are taking 2+ minutes for the pod:

Jun 22 08:09:18 kind-worker kubelet[280]: I0622 08:09:18.943215     280 status_manager.go:685] "Patch status for pod" pod="endpointslice-805/pod1" patch="{\"metadata\":{\"uid\":\"61a94b34-adbb-4fc7-8429-8bec0903d5ed\"},\"status\":{\"$setElementOrder/conditions\":[{\"type\":\"Initialized\"},{\"type\":\"Ready\"},{\"type\":\"ContainersReady\"},{\"type\":\"PodScheduled\"}],\"conditions\":[{\"lastProbeTime\":null,\"lastTransitionTime\":\"2022-06-22T08:09:11Z\",\"status\":\"True\",\"type\":\"Initialized\"},{\"lastProbeTime\":null,\"lastTransitionTime\":\"2022-06-22T08:09:11Z\",\"message\":\"containers with unready status: [container1]\",\"reason\":\"ContainersNotReady\",\"status\":\"False\",\"type\":\"Ready\"},{\"lastProbeTime\":null,\"lastTransitionTime\":\"2022-06-22T08:09:11Z\",\"message\":\"containers with unready status: [container1]\",\"reason\":\"ContainersNotReady\",\"status\":\"False\",\"type\":\"ContainersReady\"}],\"containerStatuses\":[{\"image\":\"registry.k8s.io/e2e-test-images/nginx:1.14-2\",\"imageID\":\"\",\"lastState\":{},\"name\":\"container1\",\"ready\":false,\"restartCount\":0,\"started\":false,\"state\":{\"waiting\":{\"reason\":\"ContainerCreating\"}}}],\"hostIP\":\"172.18.0.4\",\"startTime\":\"2022-06-22T08:09:11Z\"}}"
Jun 22 08:12:05 kind-worker kubelet[280]: I0622 08:12:05.273805     280 event.go:294] "Event occurred" object="endpointslice-805/pod1" fieldPath="spec.containers{container1}" kind="Pod" apiVersion="v1" type="Normal" reason="Pulled" message="Successfully pulled image \"registry.k8s.io/e2e-test-images/nginx:1.14-2\" in 2m51.156327184s"

and a few seconds after that the e2e test times out waiting for the pods to be up. Well yeah, if the image pull takes 2m51s then I'm not surprised the e2e test's 3m timeout is exceeded.

@dcbw (Member) commented Jun 22, 2022

/retest

@k8s-ci-robot merged commit ae35371 into kubernetes:master Jun 22, 2022
@k8s-ci-robot added this to the v1.25 milestone Jun 22, 2022
k8s-ci-robot added a commit that referenced this pull request Jul 7, 2022
…0639-upstream-release-1.24

Automated cherry pick of #110639: fix a bug on endpointslices tests comparing the wrong
dgrisonnet pushed a commit to dgrisonnet/kubernetes that referenced this pull request Sep 2, 2022
…ce_no_node

EndpointSlice with Pods without an existing Node
Labels
approved - Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes - Indicates the PR's author has signed the CNCF CLA.
kind/bug - Categorizes issue or PR as related to a bug.
kind/cleanup - Categorizes issue or PR as related to cleaning up code, process, or technical debt.
kind/documentation - Categorizes issue or PR as related to documentation.
lgtm - "Looks good to me", indicates that a PR is ready to be merged.
needs-triage - Indicates an issue or PR lacks a `triage/foo` label and requires one.
priority/important-soon - Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
release-note - Denotes a PR that will be considered when it comes time to generate release notes.
sig/apps - Categorizes an issue or PR as relevant to SIG Apps.
sig/network - Categorizes an issue or PR as relevant to SIG Network.
size/L - Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

EndpointSlice controller sync fails when Node is not found
5 participants