Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Firewall Pinhole Fix for ILB and NetLB #109510

Merged
merged 2 commits into from Jun 23, 2022

Conversation

sugangli
Copy link

@sugangli sugangli commented Apr 15, 2022

/kind bug
Adding new gce firewall field DestinationRanges to ILB and NetLB. This change does not change the existing behaviors while creating these LBs. It simply restricts the access from Node & All IPs to Node & DestinationRanges.

Two major changes:

  1. Adding new patchFirewall function to support new fields.
  2. Moving firewall creation to the end of the function so that it can use the IP from the forwarding rule.

Before the introduction of GCE pinhole firewall feature, the destination-ranges of the firewall rule were default to all, i.e., any source allowed in the source-ranges was allowed to access any IP in the backend. This introduced a security risk, for example, any netlb service opens port for 22 would expose ssh access to the node in addition to other security issues. The fix here is aimed at restricting the access to the backend IPs via allowing traffic via ILB or NetLB IP only.

This change picks up the latest GCE pinhole firewall feature, which introduces destination-ranges in the ingress firewall-rules. It restricts the access to the backend IPs via allowing traffic via allowing traffic through ILB or NetLB IP only. This change does NOT change the existing ILB or NetLB behavior. 

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 15, 2022
@k8s-ci-robot
Copy link
Contributor

Welcome @sugangli!

It looks like this is your first PR to kubernetes/kubernetes 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kubernetes has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Apr 15, 2022
@k8s-ci-robot
Copy link
Contributor

Hi @sugangli. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@MrHohn
Copy link
Member

MrHohn commented Apr 15, 2022

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. area/apiserver area/cloudprovider area/code-generation area/dependency Issues or PRs related to dependency changes area/kubectl sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 15, 2022
@sugangli
Copy link
Author

It would be helpful for more context in the PR description as to why this is a bug and why its needed (I'm only supposed to approve bugs into legacy providers). Happy to approve when you add a little more context.

Thank you for reviewing this PR! I added some background in the description. Please let me know if I need to provide further detail.

@nckturner
Copy link
Contributor

/approve

This is a bug fix.

@thockin
Copy link
Member

thockin commented May 23, 2022

/approve

Who's doing the code-review on this? I can approve but it would be best to get someone who knows this code well to do a thorough review. It all looks OK to me but I didn't go super deep on it.

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 23, 2022
@bowei
Copy link
Member

bowei commented May 24, 2022

@sugangli -- can we sync for 15 minutes to understand why certain sections of logic was moved around in the function, it will help me review this quickly.

@kl52752
Copy link
Contributor

kl52752 commented May 24, 2022

/lgtm

@k8s-ci-robot
Copy link
Contributor

@kl52752: changing LGTM is restricted to collaborators

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -27,7 +27,7 @@ import (
"strings"

"github.com/GoogleCloudPlatform/k8s-cloud-provider/pkg/cloud"
"k8s.io/api/core/v1"
v1 "k8s.io/api/core/v1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is redundant..

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was added by vs code automatically, which I am not 100% sure why. But I see a earlier PR has this too:https://github.com/kubernetes/kubernetes/blame/master/staging/src/k8s.io/legacy-cloud-providers/gce/gce_loadbalancer_internal.go#L34

@@ -123,12 +123,6 @@ func (g *Cloud) ensureExternalLoadBalancer(clusterName string, clusterID string,
// keeping the static IP around even though this is more complicated because
// it makes it less likely that we'll run into quota issues. Only 7 static
// IP addresses are allowed per region by default.
//
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this comment removed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we moved the firewall configuration to the last.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline and we decided to move this back so we don't accidentally break implicit logic around this.

@@ -286,6 +256,30 @@ func (g *Cloud) ensureExternalLoadBalancer(clusterName string, clusterID string,
status := &v1.LoadBalancerStatus{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep this right before the return

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@@ -921,7 +915,7 @@ func (g *Cloud) ensureHTTPHealthCheckFirewall(svc *v1.Service, serviceName, ipAd
return fmt.Errorf("error getting firewall for health checks: %v", err)
}
klog.Infof("Creating firewall %v for health checks.", fwName)
if err := g.createFirewall(svc, fwName, desc, sourceRanges, ports, hosts); err != nil {
if err := g.createFirewall(svc, fwName, desc, "", sourceRanges, ports, hosts); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

g.createFirewall(..., /*destinationIP*/ "", ...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For HC, we don't have to set destination IP as the sourceRanges are known CIDRs. I set it anyway as it doesn't hurt.

@@ -934,7 +928,7 @@ func (g *Cloud) ensureHTTPHealthCheckFirewall(svc *v1.Service, serviceName, ipAd
!equalStringSets(fw.Allowed[0].Ports, []string{strconv.Itoa(int(ports[0].Port))}) ||
!equalStringSets(fw.SourceRanges, sourceRanges.StringSlice()) {
klog.Warningf("Firewall %v exists but parameters have drifted - updating...", fwName)
if err := g.updateFirewall(svc, fwName, desc, sourceRanges, ports, hosts); err != nil {
if err := g.updateFirewall(svc, fwName, desc, "", sourceRanges, ports, hosts); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/*destinationIP*/ ...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

@@ -1007,7 +1001,7 @@ func (g *Cloud) updateFirewall(svc *v1.Service, name, desc string, sourceRanges
return nil
}

func (g *Cloud) firewallObject(name, desc string, sourceRanges utilnet.IPNetSet, ports []v1.ServicePort, hosts []*gceInstance) (*compute.Firewall, error) {
func (g *Cloud) firewallObject(name, desc, destinationIP string, sourceRanges utilnet.IPNetSet, ports []v1.ServicePort, hosts []*gceInstance) (*compute.Firewall, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should document that destinationIP can be empty string "" and this means that it is not set.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

@@ -416,7 +421,7 @@ func (g *Cloud) teardownInternalHealthCheckAndFirewall(svc *v1.Service, hcName s
return nil
}

func (g *Cloud) ensureInternalFirewall(svc *v1.Service, fwName, fwDesc string, sourceRanges []string, portRanges []string, protocol v1.Protocol, nodes []*v1.Node, legacyFwName string) error {
func (g *Cloud) ensureInternalFirewall(svc *v1.Service, fwName, fwDesc, DestinationIP string, sourceRanges []string, portRanges []string, protocol v1.Protocol, nodes []*v1.Node, legacyFwName string) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't capitalize DestinationIP

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 11, 2022
Copy link
Member

@MrHohn MrHohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Walked through the changes with Sugang offline on this and overall LGTM.

@@ -123,12 +123,6 @@ func (g *Cloud) ensureExternalLoadBalancer(clusterName string, clusterID string,
// keeping the static IP around even though this is more complicated because
// it makes it less likely that we'll run into quota issues. Only 7 static
// IP addresses are allowed per region by default.
//
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline and we decided to move this back so we don't accidentally break implicit logic around this.

@@ -225,6 +220,16 @@ func (g *Cloud) ensureInternalLoadBalancer(clusterName, clusterID string, svc *v
}
}

fwd, err := g.GetRegionForwardingRule(loadBalancerName, g.region)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pasting comment from offline - merge the second GetRegionForwardingRule down below into this so we don't generate additional get call.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 23, 2022
@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Jun 23, 2022

@sugangli: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubernetes-e2e-capz-windows-containerd 451c0c46a35755008ff2253d30610b8e629698af link false /test pull-kubernetes-e2e-capz-windows-containerd
pull-kubernetes-e2e-capz-azure-file-vmss 451c0c46a35755008ff2253d30610b8e629698af link false /test pull-kubernetes-e2e-capz-azure-file-vmss
pull-kubernetes-e2e-capz-azure-disk-vmss 451c0c46a35755008ff2253d30610b8e629698af link false /test pull-kubernetes-e2e-capz-azure-disk-vmss
pull-kubernetes-e2e-capz-conformance 451c0c46a35755008ff2253d30610b8e629698af link false /test pull-kubernetes-e2e-capz-conformance
pull-kubernetes-e2e-capz-azure-file 451c0c46a35755008ff2253d30610b8e629698af link false /test pull-kubernetes-e2e-capz-azure-file
pull-kubernetes-e2e-gce-alpha-features 451c0c46a35755008ff2253d30610b8e629698af link false /test pull-kubernetes-e2e-gce-alpha-features
pull-kubernetes-e2e-capz-azure-disk 451c0c46a35755008ff2253d30610b8e629698af link false /test pull-kubernetes-e2e-capz-azure-disk

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Copy link
Member

@MrHohn MrHohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Could you update the release note since this changes the firewall behavior for new/existing L4 LBs?
/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 23, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MrHohn, nckturner, sugangli, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jun 23, 2022
@k8s-ci-robot k8s-ci-robot merged commit 487512b into kubernetes:master Jun 23, 2022
SIG Node PR Triage automation moved this from Triage to Done Jun 23, 2022
SIG Auth Old automation moved this from Needs Triage to Closed / Done Jun 23, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.25 milestone Jun 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/cloudprovider area/code-generation area/dependency Issues or PRs related to dependency changes area/kubectl cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Archived in project
SIG Auth Old
Closed / Done
Development

Successfully merging this pull request may close these issues.

None yet