Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix image pulling failure when IMDS is unavailable in kubelet startup #110523

Merged
merged 1 commit into from Jun 13, 2022

Conversation

andyzhangx
Copy link
Member

@andyzhangx andyzhangx commented Jun 12, 2022

What type of PR is this?

/kind bug

What this PR does / why we need it:

fix image pulling failure when IMDS is unavailable in kubelet startup

This PR would try getting Azure token from IMDS if it failed in the beginning when kubelet startup, this PR also removes the unused registryClient

  • workaround: restart kubelet on the agent node by:
systemctl restart kubelet 

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

/kind bug
/priority important-soon
/sig cloud-provider
/area provider/azure
/triage accepted

Does this PR introduce a user-facing change?

make sure auto-mounted subpath mount source is already mounted

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

fix image pulling failure when IMDS is unavailable in kubelet startup
Kubelet startup disabling managed identity plugin

Whenever the kubelet service starts, it loads the credentials plugins, calls .Enabled() on them, and the ones that return true are part of the docker keyring for image pulls. The ones that return false are disabled for the life of the kubelet.

In this case we’re using a user-assigned Managed Identity as the identity for ACR pulls. The Enabled() call-chain looks like this:

kuberuntime_manger:: NewKubeGenericRuntimeManager
|- plugins.go::NewDockerKeyring
|- provider.go::Enabled
|- azure_credentials.go::Enabled ([source link](https://github.com/kubernetes/kubernetes/blob/520b991347b9c77635c0e4555f1703a86dcdd4ff/pkg/credentialprovider/azure/azure_credentials.go#L212))
|- azure_auth.go::GetServicePrincipalToken
|- token.go::NewServicePrincipalTokenFromMSIWithUserAssignedID
|- token.go::newServicePrincipalTokenFromMSI
|- token.go::getMSIType
|- token.go::MSIAvailable
|- token.go::getMSIEndpoint

If the IMDS endpoint isn’t available, then azure_credentials.go::Enabled() returns false which disables it. On our bad node we see this behavior in these snipped logs:

I0525 06:41:19.936396 750 kubelet.go:290] "Adding apiserver pod source"
I0525 06:41:19.936910 750 apiserver.go:42] "Waiting for node sync before watching apiserver pods"
E0525 06:41:19.937147 750 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://prod-aks-csp-us-wu2-0-fe-dns-629883d0.hcp.westus2.azmk8s.io:443/api/v1/nodes?fieldSelector=metadata.name%3Daks-cloudconnect-14073472-vmss000000&limit=500&resourceVersion=0": dial tcp: lookup prod-aks-csp-us-wu2-0-fe-dns-629883d0.hcp.westus2.azmk8s.io on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: connect: network is unreachable
E0525 06:41:19.939295 750 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://prod-aks-csp-us-wu2-0-fe-dns-629883d0.hcp.westus2.azmk8s.io:443/api/v1/services?limit=500&resourceVersion=0": dial tcp: lookup prod-aks-csp-us-wu2-0-fe-dns-629883d0.hcp.westus2.azmk8s.io on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: connect: network is unreachable
I0525 06:41:19.943322 750 kuberuntime_manager.go:245] "Container runtime initialized" containerRuntime="containerd" version="v1.5.9.m" apiVersion="v1alpha2"I0525 06:41:19.944919 750 azure_auth.go:94] azure: using managed identity extension to retrieve access tokenE0525 06:41:19.946121 750 azure_credentials.go:201] Failed to create service principal token: MSI not available...I0525 06:41:19.947741 750 server.go:1213] "Started kubelet"I0525 06:41:19.948086 750 server.go:149] "Starting to listen" address="0.0.0.0" port=10250

That explains why we can't pull images, as the kubelet has decided to not use the azure_credentials plugin. Also note the "network is unreachable" errors, as they lead to the next issue

Understanding why the IMDS endpoint isn't available

Since this seems to be an issue reported only on Mariner and not on Ubuntu, we tried to determine why the MSI wasn’t available at kubelet start. Investigating the logs we see a few interesting items:

The node originally started correctly
The issue started after a node reboot
From correlating timestamps, the kubelet service is starting (and hitting the MSI not available issue) during cloud-init, which is responsible for setting up the network
At the same time as we're seeing the missing MSI (which we can pull ourselves with curl when debugging) we're seeing network unreachable errors for other endpoints

From looking at the systemd service descriptions, there’s nothing specifying that the kubelet service should delay starting until cloud-init has finished. This is true on both Mariner and Ubuntu. We were able to repro the behavior on nodes by modifying the startup order to ensure kubelet starts before cloud-init, and validated that specifying the kubelet to start after cloud-init finishes fixes the issue. Thus it seems that we also have a problem that the kubelet is trying to use the network while it's still being configured.

Next Steps

First, Henry is going to continue driving the investigation of cloud-init ordering. He plans to see if there’s anything explicit or implicit in Ubuntu that’s ensuring the services start in the right order, and in any case will drive a fix with AKS to make the dependency explicit. 


Andy, would you be willing to look into the Enabled issue in the kubelet plugins? It seems at least in this case that we would want Enabled to be true if an MI is specified, regardless of if the IMDS endpoint is currently available or not. That would allow the node to recover from transient issues, rather than caching the failure for the life of the kubelet. Alternatively, we should differentiate the errors between IMDS not available and IMDS available but MI not available.

Last, since these fixes will take a few weeks to deploy, I’m planning on using a DaemonSet in our clusters to patch the systemd config for kubelet.service to ensure it always starts after cloud-init. That will unblock us, and we can easily remove the DaemonSet once the fix is live.

</details>

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. area/provider/azure Issues or PRs related to azure provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. triage/accepted Indicates an issue or PR is ready to be actively worked on. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jun 12, 2022
Copy link
Member

@feiskyer feiskyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 13, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andyzhangx, feiskyer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

if a.servicePrincipalToken == nil {
token, err := auth.GetServicePrincipalToken(a.config, a.environment)
if err != nil {
klog.Errorf("Failed to create service principal token: %v", err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the log message has to be like Failed to get service principal token: %v", err

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aha, good catch! let me work out another typo fix.

k8s-ci-robot added a commit that referenced this pull request Jul 6, 2022
…110523-upstream-release-1.22

Automated cherry pick of #110523: fix image pulling failure when IMDS is unavailalbe in kubelet
k8s-ci-robot added a commit that referenced this pull request Jul 6, 2022
…110523-upstream-release-1.24

Automated cherry pick of #110523: fix image pulling failure when IMDS is unavailalbe in kubelet
k8s-ci-robot added a commit that referenced this pull request Jul 6, 2022
…110523-upstream-release-1.23

Automated cherry pick of #110523: fix image pulling failure when IMDS is unavailalbe in kubelet
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/provider/azure Issues or PRs related to azure provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants