New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Promote APIServerIdentity to Beta #113629
Promote APIServerIdentity to Beta #113629
Conversation
Changelog entry suggestion Promote the `APIServerIdentity` feature to Beta. By default, each kube-apiserver will now create a Lease in the `kube-system` namespace. These lease objects can be used to identify the number of active API servers in the cluster, and may also be used for future features such as the Storage Version API. Not sure what StorageVersionAPI is; if possible, follow https://kubernetes.io/docs/contribute/style/style-guide/ when working out how to refer to it. |
669e27a
to
d781f1d
Compare
Thanks for the suggestion! Updated :) |
/hold #113307 (comment) needs to be addressed first |
d781f1d
to
79b7b2e
Compare
pkg/controlplane/instance.go
Outdated
@@ -480,9 +477,9 @@ func (c completedConfig) New(delegationTarget genericapiserver.DelegationTarget) | |||
clock.RealClock{}, | |||
kubeClient, | |||
holderIdentity, | |||
int32(c.ExtraConfig.IdentityLeaseDurationSeconds), | |||
3600, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
Inside this function create a const
for this an re-use it in the two spots:
const identityLeaseDurationSeconds = 3600
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, updated
controlPlaneAddress := nodeAddresses[0] | ||
|
||
host := controlPlaneAddress + ":" + e2essh.SSHPort | ||
result, err := e2essh.SSH("hostname", host, framework.TestContext.Provider) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah, that is one way to do it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not using exec on pods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apiserver pods generally aren't part of the cluster (static pods managed only by a kubelet)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, exec a pod on that node as hostNetwork , not execing on the apiserver pod ... but maybe this is restricted, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kubectl debug
has tooling for running a more-privileged Pod on a node to learn about it; maybe we could use some of that approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to clarify, I commented because I was curious, we are close to code freeze and I don't mean to delay this PR
h := fnv.New32a() | ||
h.Write([]byte(strings.TrimSpace(hostname))) | ||
leaseName := "kube-apiserver-" + fmt.Sprint(h.Sum32()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reminder to update this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated, but test will fail until the other PR merges
f.NamespacePodSecurityEnforceLevel = admissionapi.LevelPrivileged | ||
|
||
ginkgo.It("kube-apiserver identity should persist after restart [Disruptive]", func() { | ||
e2eskipper.SkipUnlessProviderIs("gce") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something we do often?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, especially on tests where you have to ssh into the nodes or run some external operation against them
LabelSelector: "k8s.io/component=kube-apiserver", | ||
}) | ||
framework.ExpectNoError(err) | ||
framework.ExpectEqual(len(leases.Items), len(controlPlaneNodes), "unexpected number of leases") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this flake in CI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only if you run it on a cluster while it is still being provisioned. Otherwise the node check above should prevent flakes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it doesn't hurt to add a retry loop here
79b7b2e
to
70faaf6
Compare
// This shorten the GC check period to make the test run faster. | ||
// Since we are testing GC behavior on leases we create, what happens to | ||
// the real apiserver lease doesn't matter. | ||
[]string{"--identity-lease-duration-seconds=1"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to remove this test, because of the dependence to this flag. But I think the unit tests added in #113074 and the additional unit test I added in this PR should make up for the coverage we lose here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other places we just let the integration test override the "constant"
kubernetes/staging/src/k8s.io/client-go/transport/cert_rotation.go
Lines 37 to 38 in f8750e2
// CertCallbackRefreshDuration is exposed so that integration tests can crank up the reload speed. | |
var CertCallbackRefreshDuration = 5 * time.Minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm -- updating the lease parameters into global vars seems worthwhile so we can run integration tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to use public vars
899fc01
to
b423794
Compare
/lgtm |
/retest |
// the real apiserver lease doesn't matter. | ||
controlplane.IdentityLeaseDurationSeconds = 1 | ||
controlplane.IdentityLeaseGCPeriod = time.Second | ||
controlplane.IdentityLeaseRenewIntervalPeriod = time.Second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you set them back when you're done to avoid weird things in other tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch -- updated
Looks good except for the one comment |
… behavior when restarting kube-apiserver Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
…identity-lease-renew-interval-seconds Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
…ey can be set by integration tests Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
…public lease parameter vars Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
b423794
to
196a3b9
Compare
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andrewsykim, lavalamp The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test pull-kubernetes-e2e-gce-alpha-features |
@andrewsykim: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/hold cancel |
/triage accepted |
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR promotes the APIServerIdentity feature, per KEP-1965 updates for v1.26:
In addition, it adds e2e tests which were part of the Beta graduation criterias and also removes kube-apiserver flags used to configure the lease duration and renewal interval.
The following Beta criterias have been met:
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: