kube-proxy react on Node PodCIDR changes #111344

Merged
merged 3 commits into from Oct 18, 2022

Conversation

aojea
Member

@aojea aojea commented Jul 22, 2022

/kind bug

kube-proxy handle node PodCIDR changes
kube-proxy, in NodeCIDR local detector mode, uses the node.Spec.PodCIDRs
field to build the Services iptables rules.

The Node object depends on the kubelet, but if kube-proxy runs as a
static pod or as a standalone binary, it is not possible to guarantee
that the values obtained at bootstrap are valid, which can cause traffic outages.

kube-proxy has to react to Node changes to avoid this problem: it
simply restarts if it detects that the node PodCIDRs have changed.

If the Node has been deleted, kube-proxy only logs an error and keeps
working, since restarting could break graceful shutdowns of the
node.

Fixes #111321, #112739

kube-proxy will restart if it detects that the Node's assigned node.Spec.PodCIDRs have changed
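
As a rough illustration of the behavior described above, the Go sketch below records the PodCIDRs from the first Node event, asks for a restart when a later event carries different values, and only logs on deletion. It is a self-contained sketch, not the code from this PR: `Node`, `podCIDRWatcher`, and the `restart` callback are hypothetical stand-ins, and a real handler would presumably work on `*v1.Node` and exit the process instead of calling a callback.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for the parts of v1.Node used here.
type Node struct {
	Name     string
	PodCIDRs []string
}

// podCIDRWatcher is a hypothetical handler that remembers the first
// PodCIDRs it sees and reports when a later event disagrees with them.
type podCIDRWatcher struct {
	podCIDRs []string
	// restart is invoked when the PodCIDRs changed. A real process would
	// flush its logs and exit so that its supervisor starts it again.
	restart func(reason string)
}

// equalCIDRs compares two PodCIDR lists element by element.
func equalCIDRs(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// OnNodeChange handles both add and update events.
func (w *podCIDRWatcher) OnNodeChange(node *Node) {
	// Nothing recorded yet: treat this event as initialization.
	if len(w.podCIDRs) == 0 {
		w.podCIDRs = append([]string(nil), node.PodCIDRs...)
		return
	}
	if !equalCIDRs(w.podCIDRs, node.PodCIDRs) {
		w.restart(fmt.Sprintf("node %s PodCIDRs changed from %v to %v",
			node.Name, w.podCIDRs, node.PodCIDRs))
	}
}

// OnNodeDelete only logs and keeps the recorded PodCIDRs, so that a
// graceful node shutdown is not disturbed.
func (w *podCIDRWatcher) OnNodeDelete(node *Node) {
	fmt.Printf("node %s deleted, keeping PodCIDRs %v\n", node.Name, w.podCIDRs)
}

func main() {
	w := &podCIDRWatcher{restart: func(reason string) {
		fmt.Println("restart requested:", reason)
	}}
	w.OnNodeChange(&Node{Name: "n1", PodCIDRs: []string{"10.0.0.0/24"}}) // initializes
	w.OnNodeChange(&Node{Name: "n1", PodCIDRs: []string{"10.0.0.0/24"}}) // unchanged
	w.OnNodeDelete(&Node{Name: "n1"})                                    // logs only
	w.OnNodeChange(&Node{Name: "n1", PodCIDRs: []string{"10.0.1.0/24"}}) // asks for restart
}
```

Injecting `restart` as a callback is only a convenience of the sketch: it keeps the decision logic testable without terminating the test process.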

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jul 22, 2022
@k8s-ci-robot
Contributor

@aojea: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.


@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jul 22, 2022
@aojea aojea changed the title from "Kproxy node cidr" to "kube-proxy in LocalModeNodeCIDR mode restart if Node.Spec.PodCIDR has changed" Jul 22, 2022
@k8s-ci-robot k8s-ci-robot added area/ipvs sig/network Categorizes an issue or PR as relevant to SIG Network. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jul 22, 2022
@aojea
Member Author

aojea commented Jul 22, 2022

/sig network
/assign @thockin @danwinship @bowei

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 22, 2022
@danwinship
Contributor

PodCIDR is immutable once set but it is possible to delete and
recreate the Node object to get into this state.

Do we actually expect that to work? (Deleting and recreating a node while kubernetes components are running on that node without restarting them.)

@danwinship
Contributor

It's possible that the node IP would change in that circumstance as well, so this is not the only situation in which kube-proxy would need to restart itself if we want to support that.

if proxier.localDetector.IsImplemented() && proxier.localDetector.Mode() == proxyconfigapi.LocalModeNodeCIDR {
	nodeLocalCIDR := proxier.localDetector.IfLocal()[1]
	if !utilproxy.NodeContainsPodCIDR(node, nodeLocalCIDR) {
		klog.ErrorS(nil, "Inconsisten statu, kube-proxy configed Local CIDR not present on Node.Spec.PodCIDRs", "node", klog.KObj(node), "local CIDR", nodeLocalCIDR, "PodCIDRs", node.Spec.PodCIDRs)
Member

Match this log message with the one above.

Also, this code looks to be the same, we might want to factor it out to a func? (name it MaybeRestartOnInconsistentPodCIDR()?)

Member Author

I thought about that. The reason I didn't do it is that these are 7 lines of code with a lot of deps on proxier methods ... refactoring them into a func is not going to save more lines and will make it harder to read IMHO ...

Member Author

to be honest, I like your idea, but with current state of the code on both proxies, I don't know if it is better to keep them isolated or to try to consolidate them
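
For illustration, the check discussed in this thread could be factored into a small helper that both proxiers call and that leaves the log-and-exit decision to the caller. The Go sketch below is an assumption-laden example, not the code from this PR: `checkLocalCIDR` is a hypothetical name, it takes plain strings instead of a `*v1.Node`, and it compares parsed prefixes, which may differ from what `utilproxy.NodeContainsPodCIDR` actually does.

```go
package main

import (
	"fmt"
	"net/netip"
)

// checkLocalCIDR is a hypothetical helper. It returns nil when localCIDR
// matches one of the node's PodCIDRs and an error otherwise. Prefixes are
// masked before comparison so that 10.0.0.7/24 and 10.0.0.0/24 are equal.
func checkLocalCIDR(podCIDRs []string, localCIDR string) error {
	want, err := netip.ParsePrefix(localCIDR)
	if err != nil {
		return fmt.Errorf("invalid local CIDR %q: %w", localCIDR, err)
	}
	for _, c := range podCIDRs {
		got, err := netip.ParsePrefix(c)
		if err != nil {
			continue // skip malformed entries instead of failing the whole check
		}
		if got.Masked() == want.Masked() {
			return nil
		}
	}
	return fmt.Errorf("local CIDR %s not present in node PodCIDRs %v", localCIDR, podCIDRs)
}

func main() {
	podCIDRs := []string{"10.244.1.0/24", "fd00:10:244:1::/64"}
	// A caller would log the error and exit; here the result is just printed.
	fmt.Println(checkLocalCIDR(podCIDRs, "10.244.1.0/24"))
	fmt.Println(checkLocalCIDR(podCIDRs, "10.244.2.0/24"))
}
```

Returning an error rather than exiting inside the helper keeps it free of dependencies on proxier methods, which was the concern raised above.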

@aojea
Member Author

aojea commented Jul 29, 2022

It's possible that the node IP would change in that circumstance as well, so this is not the only situation in which kube-proxy would need to restart itself if we want to support that.

one bug at a time ;) ... until now we only have evidence of this one

pkg/proxy/util/utils.go Outdated (resolved)
@bowei
Member

bowei commented Jul 29, 2022

Left a comment about potentially being able to restrict the surface area of the change.

Member

@bowei bowei left a comment

Sorry, forgot to hit send on the review

cmd/kube-proxy/app/server_others_test.go Outdated (resolved)
cmd/kube-proxy/app/server_others_test.go Outdated (resolved)
cmd/kube-proxy/app/server_others_test.go Outdated (resolved)
pkg/proxy/iptables/proxier.go Outdated (resolved)
pkg/proxy/ipvs/proxier.go Outdated (resolved)
@@ -639,6 +640,17 @@ func (proxier *Proxier) OnNodeUpdate(oldNode, node *v1.Node) {
		return
	}

	// kube-proxy in LocalModeNodeCIDR mode may cache stale Node.PodCIDR.
Member

An alternative to avoid exposing the details of the detector everywhere:

add a method to localDetector

err := localDetector.IsValid(node)
if err != nil {
  klog.ErrorS(err, "proxier.localDetector is no longer valid, restarting kube-proxy")
  ...FlushAndExit()
}

Member Author

I've added a new method to return the Value() of the local traffic discriminator

Member

I meant instead of exposing the Value() and putting the logic outside, we expose just the notion that the detector is still valid (which is what is being checked here).

so the proxier call site just has

if proxier.localDetector.IsImplemented() {
  if err := proxier.localDetector.IsValid(node); err != nil {
    klog.ErrorS(...)
    klog.FlushAndExit(...)
  }
}

Member Author

@aojea aojea Aug 9, 2022

yeah, but localDetectors can use CIDRs, interface names/prefixes, or whatever we want to implement in the detector; a new method IsValid(node *v1.Node) is only meaningful for one of the traffic detector modes.
Do we want to add such a specific method to an Interface?

Member

I'm suggesting removing Mode() and Value() and instead having an IsValid(node *v1.Node).

For detectors that are always Valid, this can always return true.

Member Author

ah, that makes more sense
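
A minimal sketch of the shape suggested in this thread, where the detector exposes only a validity check and detectors that cannot become stale always report success. Everything here is illustrative: `Node`, `LocalTrafficDetector`, `noOpDetector`, `nodeCIDRDetector`, and `checkDetector` are hypothetical names for this example, and this is not necessarily what the PR ended up merging.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for the parts of v1.Node used here.
type Node struct {
	Name     string
	PodCIDRs []string
}

// LocalTrafficDetector is a hypothetical interface that hides the mode and
// value of the detector and exposes only whether it is still valid.
type LocalTrafficDetector interface {
	IsImplemented() bool
	IsValid(node *Node) error
}

// noOpDetector does not depend on the Node, so it is always valid.
type noOpDetector struct{}

func (noOpDetector) IsImplemented() bool  { return false }
func (noOpDetector) IsValid(*Node) error { return nil }

// nodeCIDRDetector was built from one PodCIDR and becomes invalid when the
// Node no longer lists that CIDR.
type nodeCIDRDetector struct{ cidr string }

func (d nodeCIDRDetector) IsImplemented() bool { return true }

func (d nodeCIDRDetector) IsValid(node *Node) error {
	for _, c := range node.PodCIDRs {
		if c == d.cidr {
			return nil
		}
	}
	return fmt.Errorf("local CIDR %s not present in PodCIDRs %v of node %s",
		d.cidr, node.PodCIDRs, node.Name)
}

// checkDetector mirrors the call site proposed in the review: the caller
// would log the error and exit instead of returning it.
func checkDetector(d LocalTrafficDetector, node *Node) error {
	if !d.IsImplemented() {
		return nil
	}
	return d.IsValid(node)
}

func main() {
	node := &Node{Name: "n1", PodCIDRs: []string{"10.244.1.0/24"}}
	fmt.Println(checkDetector(noOpDetector{}, node))
	fmt.Println(checkDetector(nodeCIDRDetector{cidr: "10.244.1.0/24"}, node))
	fmt.Println(checkDetector(nodeCIDRDetector{cidr: "10.244.2.0/24"}, node))
}
```

With this shape the proxier never inspects the detector's mode or value; each detector implementation decides what "valid" means for itself.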

@dcbw
Member

dcbw commented Oct 13, 2022

Implementation LGTM. I'll leave the larger/philosophical questions about Node object changes to others...

@aojea aojea force-pushed the kproxy_node_cidr branch 2 times, most recently from 1d2acdf to a95b033 Compare October 13, 2022 19:56
@aojea aojea changed the title from "kube-proxy react on Node changes" to "kube-proxy react on Node PodCIDR changes" Oct 13, 2022
@aojea
Member Author

aojea commented Oct 13, 2022

I've reduced the scope to PodCIDRs changes only when using nodeCIDR local detector, this is much safer now

nodeConfig := config.NewNodeConfig(currentNodeInformerFactory.Core().V1().Nodes(), s.ConfigSyncPeriod)
// https://issues.k8s.io/111321
if s.localDetectorMode == kubeproxyconfig.LocalModeNodeCIDR {
	nodeConfig.RegisterEventHandler(&proxy.NodePodCIDRHandler{})
Contributor

Wouldn't that make it panic on_add since there is no local node UID set?

Member Author

Please check: I changed the logic to panic on PodCIDR changes only and added test cases to cover the different states. If no PodCIDR exists in the handler, it means that the handler wasn't initialized, so it uses the new value for that.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 15, 2022
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Oct 15, 2022
Member

@thockin thockin left a comment

Thanks!

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 18, 2022
@k8s-ci-robot k8s-ci-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Oct 18, 2022
@aojea
Member Author

aojea commented Oct 18, 2022

I had to rebase @thockin :/

@thockin
Member

thockin commented Oct 18, 2022

Thanks!

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 18, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea, dcbw, thockin


Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/ipvs cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/network Categorizes an issue or PR as relevant to SIG Network. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kube-proxy in LocalModeNodeCIDR mode may cache stale Node.PodCIDR if the Node object is recreated
10 participants