Avoid propagating "search ." into containers /etc/resolv.conf #112157

Merged
merged 1 commit into from Sep 2, 2022

Conversation

dghubble
Contributor

@dghubble dghubble commented Aug 31, 2022

Starting in Kubelet v1.25.0, on certain hosts (those with a FQDN hostname and recent systemd), a "search ." in /etc/resolv.conf will now propagate into containers (it did not before v1.25). This breaks musl-based DNS resolution (e.g. any alpine image) for Pods running on such nodes.

#112135 provides a lot of details.

What type of PR is this?

/kind bug
/kind regression

What this PR does / why we need it:

This reverts commit 5832b84 to fix a regression introduced in v1.25.0 (v1.25.0-alpha.1 to be specific). Without this change, alpine or other musl-based containers will fail DNS resolution on certain nodes (described in the issue).

Adapts #109441 to avoid propagating search . into containers /etc/resolv.conf

Which issue(s) this PR fixes:

Fixes #112135

Does this PR introduce a user-facing change?

Fixes a 1.25 regression propagating hosts' `search .` into containers' `/etc/resolv.conf` which can fail DNS resolution

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

Historical:

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. kind/regression Categorizes issue or PR as related to a regression from a prior release. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 31, 2022
@k8s-ci-robot
Contributor

@dghubble: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Aug 31, 2022
@k8s-ci-robot
Contributor

Hi @dghubble. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/kubelet sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 31, 2022
@pacoxu
Member

pacoxu commented Sep 1, 2022

/cc aojea dcbw

@thockin thockin self-assigned this Sep 1, 2022
@thockin
Member

thockin commented Sep 1, 2022

We can't just revert it. The revert-state is ALSO a bug.

It's obviously a MUSL bug; the question is whether we should "fix" this by simply removing any search item which is . entirely (instead of preserving . or converting it to an empty string). That will fix MUSL compat entirely, and honestly, propagating . is kind of wonky anyway.

Thoughts?

@thockin
Member

thockin commented Sep 1, 2022

			searches = []string{}
			for _, s := range fields[1:] {
				if s != "." {
-					s = strings.TrimSuffix(s, ".")
+					searches = append(searches, strings.TrimSuffix(s, "."))
				}
-				searches = append(searches, s)
			}
		}
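
For readers who want to try the suggested behavior in isolation, here is a minimal, self-contained sketch. It assumes a hypothetical helper named `filterSearch` that takes a single `search` line; it is not the kubelet's actual function, and the sample inputs in `main` are made up for illustration. It only mirrors the loop in the diff above: skip any entry that is exactly `.` and trim a trailing dot from the rest.

```go
package main

import (
	"fmt"
	"strings"
)

// filterSearch is a hypothetical helper, not the kubelet's real parser.
// It splits a resolv.conf "search" line into fields, skips any entry that
// is exactly ".", and trims a trailing dot from every remaining entry.
func filterSearch(line string) []string {
	fields := strings.Fields(line)
	searches := []string{}
	if len(fields) == 0 || fields[0] != "search" {
		return searches
	}
	for _, s := range fields[1:] {
		if s != "." {
			searches = append(searches, strings.TrimSuffix(s, "."))
		}
	}
	return searches
}

func main() {
	// Illustrative inputs only.
	lines := []string{"search .", "search . foo", "search foo . bar", "search example.com."}
	for _, line := range lines {
		fmt.Printf("%q -> %q\n", line, filterSearch(line))
	}
}
```

The point of the sketch is that a lone `.` never reaches the resulting list, so nothing built from that list can carry it into a container's /etc/resolv.conf.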

@dghubble
Contributor Author

dghubble commented Sep 1, 2022

Agreed, #109441 seems to be trying to prevent an empty entry. We could solve this differently than propagating "." and confusing musl. I'll update with your suggestion.

@dghubble dghubble changed the title Revert "kubelet: parseResolvConf: Handle "search ." Avoid propagating "search ." into containers /etc/resolv.conf Sep 1, 2022
@dghubble
Contributor Author

dghubble commented Sep 1, 2022

Updated title, description, and code/tests.

 {"search ", []string{}, []string{}, []string{}, false}, // search empty
-{"search .", []string{}, []string{"."}, []string{}, false},
+{"search .", []string{}, []string{}, []string{}, false}, // ignore lone dot
Member

Can you please add cases for "search . foo" -> ["foo"] and "search foo ." (same) and "search foo . bar" -> ["foo", "bar"].

Contributor Author

added

Member

@thockin thockin left a comment

Did you test this with a MUSL container?

* Adapt kubernetes#109441 but
ensure that `search .` does not get propagated into containers'
/etc/resolv.conf. There is no reason to put `.` in a container's
search field and it causes issues for musl
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Sep 1, 2022
Member

@thockin thockin left a comment

Thanks!

/lgtm
/approve

/hold

In case @aojea or @danwinship or others want a look

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Sep 1, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dghubble, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 1, 2022
@thockin
Member

thockin commented Sep 1, 2022

if no NAK in a day or two, go ahead and remove hold

@dghubble
Contributor Author

dghubble commented Sep 1, 2022

I've not built a cluster from this PR. But I've fiddled with this a good bit to discover this 😢 . For my part, in Typhoon I'm working around it by removing "search ." from hosts, which works. I also found manually removing the "." from inside alpine (or similar) containers would fix them, as would removing ndots (undesired). And reverting to Kubelet v1.24.4 also fixed it (because the "." doesn't propagate). So I'm hopeful this is the thing!
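
As a rough illustration of the "remove the `.` from resolv.conf" workaround described above, the sketch below rewrites resolv.conf text so that `search` lines no longer carry a lone `.` entry. The helper name `stripDotSearch` and the sample file contents are assumptions made for this example; this is not how Typhoon implements its workaround, and it only transforms a string rather than touching any file on a host.

```go
package main

import (
	"fmt"
	"strings"
)

// stripDotSearch is a hypothetical helper: it rewrites resolv.conf text so
// that "search" lines no longer contain a lone "." entry. A search line left
// with no domains is dropped; every other line passes through unchanged.
func stripDotSearch(conf string) string {
	var out []string
	for _, line := range strings.Split(conf, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 || fields[0] != "search" {
			out = append(out, line)
			continue
		}
		kept := []string{"search"}
		for _, d := range fields[1:] {
			if d != "." {
				kept = append(kept, d)
			}
		}
		// Keep the line only if at least one real domain remains.
		if len(kept) > 1 {
			out = append(out, strings.Join(kept, " "))
		}
	}
	return strings.Join(out, "\n")
}

func main() {
	// Made-up resolv.conf contents for illustration.
	conf := "nameserver 10.0.0.2\nsearch .\noptions ndots:5\n"
	fmt.Print(stripDotSearch(conf))
}
```

Note that this leaves `options ndots` untouched, in line with the comment above that removing ndots is undesired.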

@thockin thockin added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 1, 2022
@dghubble
Contributor Author

dghubble commented Sep 1, 2022

/retest

@aojea
Member

aojea commented Sep 1, 2022

+1

search . breaks compatibility; if there is a use case for it we should roll it out as a new feature. This sounds like a good solution to me
/lgtm

@lucab
Contributor

lucab commented Sep 2, 2022

I see the cherry-pick deadline for 1.25.1 is on 2022-09-09 (a week from now).
If this patch lands in main quickly within that timeframe, it would be good to aim the same fix at 1.25.1 too.

@aojea
Member

aojea commented Sep 2, 2022

I see the cherry-pick deadline for 1.25.1 is on 2022-09-09 (a week from now). If this patch lands in main quickly within that timeframe, it would be good to aim the same fix at 1.25.1 too.

if no NAK in a day or two, go ahead and remove hold

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 2, 2022
@k8s-triage-robot

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

  • The PR does not have any do-not-merge/* labels
  • The PR does not have the needs-ok-to-test label
  • The PR is mergeable (does not have a needs-rebase label)
  • The PR is approved (has cncf-cla: yes, lgtm, approved labels)
  • The PR is failing tests required for merge

You can:

/retest

@lucab
Contributor

lucab commented Sep 2, 2022

I proposed this for 1.25.1, cherry-pick at #112204.

k8s-ci-robot added a commit that referenced this pull request Sep 7, 2022
…7-upstream-release-1.25

Automated cherry pick of #112157: Avoid propagating `search .` into containers /etc/resolv.conf
dghubble added a commit to poseidon/typhoon that referenced this pull request Sep 20, 2022
dghubble added a commit to poseidon/typhoon that referenced this pull request Sep 20, 2022
Snaipe pushed a commit to aristanetworks/monsoon that referenced this pull request Apr 13, 2023
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/regression Categorizes issue or PR as related to a regression from a prior release. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

musl-based DNS resolution will break on v1.25.0 in certain configurations
7 participants