
kubeadm: fix the bug where a configurable KubernetesVersion is not respected during kubeadm join #110791

Merged
merged 1 commit into kubernetes:master on Jul 1, 2022

Conversation

Member

@SataQiu commented on Jun 26, 2022

What type of PR is this?

/kind bug

What this PR does / why we need it:

kubeadm: fix the bug where a configurable KubernetesVersion is not respected during kubeadm join

Which issue(s) this PR fixes:

Fixes kubernetes/kubeadm#2713

Special notes for your reviewer:

Does this PR introduce a user-facing change?

kubeadm: fix the bug where a configurable KubernetesVersion is not respected during kubeadm join

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added the release-note, size/XS, kind/bug, cncf-cla: yes, and do-not-merge/needs-sig labels on Jun 26, 2022
@k8s-ci-robot
Contributor

@SataQiu: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the needs-triage and needs-priority labels on Jun 26, 2022
@k8s-ci-robot added the area/kubeadm and sig/cluster-lifecycle labels and removed the do-not-merge/needs-sig label on Jun 26, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: SataQiu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label on Jun 26, 2022
@neolit123
Member

added some comments here:
kubernetes/kubeadm#2713 (comment)

@neolit123
Member

/hold
let's discuss what we want to do.

@k8s-ci-robot added the do-not-merge/hold label on Jun 27, 2022
@k8s-ci-robot added the size/S label and removed the size/XS label on Jun 28, 2022
// UnNormalizedKubernetesVersion stores the unnormalized target version of the control plane.
// Useful to restore the original target version before uploading the config.
// +k8s:conversion-gen=false
UnNormalizedKubernetesVersion string
Member


UnNormalized can be Unnormalized as a whole word:
https://en.wiktionary.org/wiki/unnormalize
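
For context, here is a minimal Go sketch of the field's stated purpose ("restore the original target version before uploading the config") at the upload-config site. This is an assumed shape for illustration, not the PR's exact code:

// Sketch: before marshaling the ClusterConfiguration into the
// kubeadm-config ConfigMap, put back the version string the user
// originally asked for, rather than the resolved semver.
if cfg.UnNormalizedKubernetesVersion != "" {
    cfg.KubernetesVersion = cfg.UnNormalizedKubernetesVersion
}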

@@ -100,6 +100,7 @@ func NormalizeKubernetesVersion(cfg *kubeadmapi.ClusterConfiguration) error {
if err != nil {
return err
}
cfg.UnNormalizedKubernetesVersion = cfg.KubernetesVersion
Member

@neolit123 commented on Jun 29, 2022


So I think the solution is clean, but it would basically take the user config and write it to the config to upload.
I have not tested any of this, so please correct me, but I think what happens is that if the user provides ci/latest, ClusterConfiguration.kubernetesVersion is written to the ConfigMap as ci/latest.

As mentioned on the issue, one problem with that is that an init node using ci/latest at that particular moment can end up as 1.25-alpha..., while a joining node added later (e.g. 4 months later) can end up as 1.26-alpha..., because ci/latest is by then a much newer version and will be resolved to that. This use case is strange and we should not worry too much about it, I agree, but what we can do instead is the following (see the sketch below):

  • check if the version is a CI version (kubeadmutil.KubernetesIsCIVersion above can be used)
  • resolve the version
  • write ci/<resolved-version> to cfg.UnnormalizedKubernetesVersion

This also puts a question mark on the name UnnormalizedKubernetesVersion, because ci/<resolved-version> is a mixture of normalized and unnormalized. Perhaps CIKubernetesVersion is better (it matches CIImageRepository too)?

WDYT?
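
A minimal Go sketch of the flow proposed above, for illustration only: kubeadmutil.KubernetesIsCIVersion is the helper named in this thread, while the kubeadmutil.KubernetesReleaseVersion resolver call and the exact field wiring are assumptions, not the PR's final code.

// Inside NormalizeKubernetesVersion (sketch): resolve once, then pin the
// resolved CI version so that later joins do not re-resolve a moving
// label such as ci/latest.
resolved, err := kubeadmutil.KubernetesReleaseVersion(cfg.KubernetesVersion)
if err != nil {
    return err
}
if kubeadmutil.KubernetesIsCIVersion(cfg.KubernetesVersion) {
    // e.g. ci/latest -> ci/v1.25.0-alpha...: a fixed target, not a label.
    cfg.UnnormalizedKubernetesVersion = "ci/" + resolved
}
cfg.KubernetesVersion = resolved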

Member Author


Yes, you are right. It seems that we should NOT save ci/latest directly in this scenario.

Member

@neolit123 left a comment


/lgtm
This seems better to me. Thanks.

/hold
@pacoxu does this seem right to you?

@k8s-ci-robot added the lgtm label on Jun 30, 2022
@k8s-ci-robot removed the do-not-merge/hold label on Jul 1, 2022
@k8s-ci-robot merged commit fa16bf8 into kubernetes:master on Jul 1, 2022
@k8s-ci-robot added this to the v1.25 milestone on Jul 1, 2022
k8s-ci-robot added a commit that referenced this pull request Jul 5, 2022
…#110791-upstream-release-1.24

Automated cherry pick of #110791: kubeadm: fix the bug that configurable KubernetesVersion not
k8s-ci-robot added a commit that referenced this pull request Jul 6, 2022
…#110791-upstream-release-1.23

Automated cherry pick of #110791: kubeadm: fix the bug that configurable KubernetesVersion not
k8s-ci-robot added a commit that referenced this pull request Jul 6, 2022
…#110791-upstream-release-1.22

Automated cherry pick of #110791: kubeadm: fix the bug that configurable KubernetesVersion not
@jackfrancis
Contributor

@SataQiu @neolit123 @pacoxu I'm seeing some failures in conformance tests against latest-1.24, perhaps since this was cherry-picked and a new build was pushed with these changes?

[2022-07-07 11:26:24] kubectl version:  Client Version: v1.24.3-rc.0.27+7ca32977924bc1 Kustomize Version: v4.5.4
[2022-07-07 11:26:24] kubelet version:  Kubernetes v1.24.3-rc.0.27+7ca32977924bc1
[2022-07-07 11:26:24] *************************************************
[2022-07-07 11:26:24] [preflight] Running pre-flight checks
[2022-07-07 11:26:25] [preflight] Reading configuration from the cluster...
[2022-07-07 11:26:25] [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[2022-07-07 11:26:25] error execution phase preflight: unable to fetch the kubeadm-config ConfigMap: failed to get component configs: could not parse "ci/v1.24.3-rc.0.27+7ca32977924bc1" as version

Does the above look related to these changes?

This is on a worker node btw, not a control plane node. Control plane node looks good:

$ k get nodes -o wide
NAME                                   STATUS   ROLES           AGE   VERSION                          INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
capz-conf-0fmzec-control-plane-sr69v   Ready    control-plane   11m   v1.24.3-rc.0.27+7ca32977924bc1   10.0.0.4      <none>        Ubuntu 18.04.6 LTS   5.4.0-1085-azure   containerd://1.6.2

@neolit123
Member

@SataQiu the version parser should work but perhaps something is missing and maybe we don't have a unit test.

@jackfrancis
Contributor

Repro cluster ConfigMap below; it certainly seems to be the result of these changes:

$ k get configmaps kubeadm-config -n kube-system -o yaml
apiVersion: v1
data:
  ClusterConfiguration: |
    apiServer:
      extraArgs:
        authorization-mode: Node,RBAC
        cloud-config: /etc/kubernetes/azure.json
        cloud-provider: azure
        feature-gates: ""
      extraVolumes:
      - hostPath: /etc/kubernetes/azure.json
        mountPath: /etc/kubernetes/azure.json
        name: cloud-config
        readOnly: true
      timeoutForControlPlane: 20m0s
    apiVersion: kubeadm.k8s.io/v1beta3
    certificatesDir: /etc/kubernetes/pki
    clusterName: capz-conf-3ikdkt
    controlPlaneEndpoint: capz-conf-3ikdkt-8249ef04.westus2.cloudapp.azure.com:6443
    controllerManager:
      extraArgs:
        allocate-node-cidrs: "false"
        cloud-config: /etc/kubernetes/azure.json
        cloud-provider: azure
        cluster-name: capz-conf-3ikdkt
        feature-gates: HPAContainerMetrics=true
        v: "4"
      extraVolumes:
      - hostPath: /etc/kubernetes/azure.json
        mountPath: /etc/kubernetes/azure.json
        name: cloud-config
        readOnly: true
    dns: {}
    etcd:
      local:
        dataDir: /var/lib/etcddisk/etcd
        extraArgs:
          quota-backend-bytes: "8589934592"
    imageRepository: k8s.gcr.io
    kind: ClusterConfiguration
    kubernetesVersion: ci/v1.24.3-rc.0.27+7ca32977924bc1
    networking:
      dnsDomain: cluster.local
      podSubnet: 192.168.0.0/16
      serviceSubnet: 10.96.0.0/12
    scheduler: {}
kind: ConfigMap
metadata:
  creationTimestamp: "2022-07-07T11:53:24Z"
  name: kubeadm-config
  namespace: kube-system
  resourceVersion: "235"
  uid: 172efec9-42aa-4d13-b2a7-a3e3fadae3b6

@jackfrancis
Contributor

Confirmed the obvious:

$ git diff staging/src/k8s.io/apimachinery/pkg/util/version/version_test.go
diff --git a/staging/src/k8s.io/apimachinery/pkg/util/version/version_test.go b/staging/src/k8s.io/apimachinery/pkg/util/version/version_test.go
index aa0675f7f12..1ea5a75de27 100644
--- a/staging/src/k8s.io/apimachinery/pkg/util/version/version_test.go
+++ b/staging/src/k8s.io/apimachinery/pkg/util/version/version_test.go
@@ -96,6 +96,7 @@ func TestSemanticVersions(t *testing.T) {
                {version: "2.1.0"},
                {version: "2.1.1"},
                {version: "42.0.0"},
+               {version: "ci/v1.24.3-rc.0.27+7ca32977924bc1"},
 
                // We also allow whitespace and "v" prefix
                {version: "   42.0.0", unparsed: "42.0.0", equalsPrev: true},

go test -timeout 10m -tags e2e -run ^TestSemanticVersions$ k8s.io/apimachinery/pkg/util/version

--- FAIL: TestSemanticVersions (0.00s)
    /Users/jackfrancis/go/src/github.com/kubernetes/kubernetes/staging/src/k8s.io/apimachinery/pkg/util/version/version_test.go:115: unexpected parse error: could not parse "ci/v1.24.3-rc.0.27+7ca32977924bc1" as version

@neolit123
Member

Kubeadm should strip and store the ci prefix before passing the version to the apimachinery version library.
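
A minimal sketch of that idea: version.ParseSemantic is the real parser in k8s.io/apimachinery/pkg/util/version, while the helper wrapped around it is hypothetical.

// import (
//     "strings"
//     "k8s.io/apimachinery/pkg/util/version"
// )

// parseMaybeCIVersion peels off the "ci/" prefix (a no-op when absent)
// before parsing, since the apimachinery parser only understands
// optionally v-prefixed semver strings.
func parseMaybeCIVersion(s string) (*version.Version, error) {
    return version.ParseSemantic(strings.TrimPrefix(s, "ci/"))
}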

@jackfrancis
Contributor

@neolit123 I'm not sure it's that simple

$ git diff cmd/kubeadm/app/util/config/common_test.go
diff --git a/cmd/kubeadm/app/util/config/common_test.go b/cmd/kubeadm/app/util/config/common_test.go
index b708657de71..5648deb648a 100644
--- a/cmd/kubeadm/app/util/config/common_test.go
+++ b/cmd/kubeadm/app/util/config/common_test.go
@@ -26,6 +26,7 @@ import (
        "k8s.io/apimachinery/pkg/util/version"
        apimachineryversion "k8s.io/apimachinery/pkg/version"
 
+       kubeadmapi "k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm"
        kubeadmapiv1old "k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2"
        kubeadmapiv1 "k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta3"
        "k8s.io/kubernetes/cmd/kubeadm/app/constants"
@@ -452,3 +453,32 @@ func TestIsKubeadmPrereleaseVersion(t *testing.T) {
                })
        }
 }
+
+func TestNormalizeKubernetesVersion(t *testing.T) {
+       tests := []struct {
+               name string
+               in   *kubeadmapi.ClusterConfiguration
+               out  error
+       }{
+               {
+                       name: "valid",
+                       in: &kubeadmapi.ClusterConfiguration{
+                               KubernetesVersion: "v1.24.3-rc.0.27+7ca32977924bc1",
+                       },
+               },
+               {
+                       name: "still valid",
+                       in: &kubeadmapi.ClusterConfiguration{
+                               KubernetesVersion: "ci/v1.24.3-rc.0.27+7ca32977924bc1",
+                       },
+               },
+       }
+
+       for _, test := range tests {
+               t.Run(test.name, func(t *testing.T) {
+                       if err := NormalizeKubernetesVersion(test.in); err != test.out {
+                               t.Errorf("expected err response be %q, got %q", test.out, err)
+                       }
+               })
+       }
+}

go test -timeout 10m -tags e2e -run ^TestNormalizeKubernetesVersion$ cmd/kubeadm/app/util/config

ok  	k8s.io/kubernetes/cmd/kubeadm/app/util/config	0.440s

@jackfrancis
Contributor

From the error returned in the cloud-init preflight stdout, it seems that the version parsing failure is occurring as we enumerate through the kube-proxy and kubelet component ConfigMap handlers:

Probably somewhere in here:

@SataQiu
Member Author

SataQiu commented Jul 7, 2022

@SataQiu the version parser should work but perhaps something is missing and maybe we don't have a unit test.

Yes, maybe our e2e tests are not robust enough. I'll look at the issue again.
Perhaps we should add a separate e2e test for the CI Kubernetes Version.

@neolit123
Member

The e2e tests pin a k8s version without the ci/ prefix. I guess that is mostly because we bake our own images from tars and treat it as a regular k8s version, not a CI one.

It should be possible to test this scenario with unit tests, or by modifying our e2e tests somehow (probably harder to do).

@SataQiu
Member Author

SataQiu commented Jul 8, 2022

Hi @jackfrancis, I guess you probably didn't update the kubeadm binary on the worker node.
I ran into this problem when using the old kubeadm.
The problem no longer occurs when I use the latest version.

On the control-plane node:

root@kind-control-plane:/# ./kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"25+", GitVersion:"v1.25.0-alpha.0.1261+d3092cd296f82d-dirty", GitCommit:"d3092cd296f82d24236f57f5928cd4755a080d5c", GitTreeState:"dirty", BuildDate:"2022-07-08T06:45:45Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/arm64"}
root@kind-control-plane:/# ./kubeadm init --kubernetes-version=ci/latest-1.24
[init] Using Kubernetes version: v1.24.3-rc.0.29+99b7713ba36e8f
[preflight] Running pre-flight checks
	[WARNING Swap]: swap is enabled; production deployments should disable swap unless testing the NodeSwap feature gate of the kubelet
	[WARNING SystemVerification]: missing optional cgroups: blkio
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kind-control-plane kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.18.0.3]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kind-control-plane localhost] and IPs [172.18.0.3 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kind-control-plane localhost] and IPs [172.18.0.3 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 5.503461 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node kind-control-plane as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node kind-control-plane as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: ri7v1v.ctuj2k2l9pt17id4
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.18.0.3:6443 --token ri7v1v.ctuj2k2l9pt17id4 \
	--discovery-token-ca-cert-hash sha256:3eb72a6cfa9616f8c1b8f9b061a91255e98add0870c966b84d7932e41a8a905f 

check kubeadm-config:

root@kind-control-plane:/# kubectl get cm -n kube-system kubeadm-config -oyaml
apiVersion: v1
data:
  ClusterConfiguration: |
    apiServer:
      extraArgs:
        authorization-mode: Node,RBAC
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta3
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controllerManager: {}
    dns: {}
    etcd:
      local:
        dataDir: /var/lib/etcd
    imageRepository: registry.k8s.io
    kind: ClusterConfiguration
    kubernetesVersion: ci/v1.24.3-rc.0.29+99b7713ba36e8f
    networking:
      dnsDomain: cluster.local
      serviceSubnet: 10.96.0.0/12
    scheduler: {}
kind: ConfigMap
metadata:
  creationTimestamp: "2022-07-08T08:01:43Z"
  name: kubeadm-config
  namespace: kube-system
  resourceVersion: "204"
  uid: 6022a1cf-7f75-4a2c-99f5-607b0c75fbae

join worker node using old kubeadm:

root@kind-worker:/# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.0", GitCommit:"4ce5a8954017644c5420bae81d72b09b735c21f0", GitTreeState:"clean", BuildDate:"2022-05-19T15:42:59Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/arm64"}
root@kind-worker:/# kubeadm join 172.18.0.3:6443 --token ri7v1v.ctuj2k2l9pt17id4 \
        --discovery-token-ca-cert-hash sha256:3eb72a6cfa9616f8c1b8f9b061a91255e98add0870c966b84d7932e41a8a905f 
[preflight] Running pre-flight checks
	[WARNING Swap]: swap is enabled; production deployments should disable swap unless testing the NodeSwap feature gate of the kubelet
	[WARNING SystemVerification]: missing optional cgroups: blkio
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
error execution phase preflight: unable to fetch the kubeadm-config ConfigMap: failed to get component configs: could not parse "ci/v1.24.3-rc.0.29+99b7713ba36e8f" as version
To see the stack trace of this error execute with --v=5 or higher

join worker using new kubeadm:

root@kind-worker:/# ./kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"25+", GitVersion:"v1.25.0-alpha.0.1261+d3092cd296f82d-dirty", GitCommit:"d3092cd296f82d24236f57f5928cd4755a080d5c", GitTreeState:"dirty", BuildDate:"2022-07-08T06:45:45Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/arm64"}
root@kind-worker:/# ./kubeadm join 172.18.0.3:6443 --token ri7v1v.ctuj2k2l9pt17id4      --discovery-token-ca-cert-hash sha256:3eb72a6cfa9616f8c1b8f9b061a91255e98add0870c966b84d7932e41a8a905f 
[preflight] Running pre-flight checks
	[WARNING Swap]: swap is enabled; production deployments should disable swap unless testing the NodeSwap feature gate of the kubelet
	[WARNING SystemVerification]: missing optional cgroups: blkio
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

@jackfrancis
Contributor

Thanks @SataQiu, I will check that!

@jackfrancis
Contributor

From a control-plane node that was able to init successfully:

[ 141.179848] cloud-init[1842]: [2022-07-08 10:12:59] kubeadm version: v1.24.3-rc.0.29+99b7713ba36e8f

From a worker node that fails pre-flight check:

[  585.910150] cloud-init[1807]: [2022-07-08 10:16:02] kubeadm version:  v1.24.3-rc.0.29+99b7713ba36e8f
[  585.910654] cloud-init[1807]: [2022-07-08 10:16:02] Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
[  585.911603] cloud-init[1807]: [2022-07-08 10:16:02] kubectl version:  Client Version: v1.24.3-rc.0.29+99b7713ba36e8f Kustomize Version: v4.5.4
[  585.912095] cloud-init[1807]: [2022-07-08 10:16:02] kubelet version:  Kubernetes v1.24.3-rc.0.29+99b7713ba36e8f
[  585.912575] cloud-init[1807]: [2022-07-08 10:16:02] *************************************************
[  585.913051] cloud-init[1807]: [2022-07-08 10:16:02] [preflight] Running pre-flight checks
[  585.913816] cloud-init[1807]: [2022-07-08 10:16:06] [preflight] Reading configuration from the cluster...
[  585.914380] cloud-init[1807]: [2022-07-08 10:16:06] [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[  585.914879] cloud-init[1807]: [2022-07-08 10:16:06] error execution phase preflight: unable to fetch the kubeadm-config ConfigMap: failed to get component configs: could not parse "ci/v1.24.3-rc.0.29+99b7713ba36e8f" as version

tl;dr we are using a new version of kubeadm that has the change in our repro:

$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.3-rc.0.29+99b7713ba36e8f", GitCommit:"99b7713ba36e8ffb82e2ca8d631c960853cbb637", GitTreeState:"clean", BuildDate:"2022-07-07T18:13:36Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"linux/amd64"}

@neolit123
Member

neolit123 commented Jul 8, 2022

From a worker node that fails pre-flight check:

a --v=5 would have helped here.

@SataQiu
Member Author

SataQiu commented Jul 8, 2022

@jackfrancis I have sent a PR to fix this: #111021

@jackfrancis
Contributor

@neolit123 I'll paste a verbosity=5 output from a failing node here

Thanks @SataQiu! Hopefully my --v=5 output will further confirm that what we're seeing is the same thing your fix addresses.

@neolit123
Member

Commented on #111021.
For different k8s versions the fix will have to be different: < 1.25 vs >= 1.25.

@jackfrancis
Contributor

/test pull-kubernetes-e2e-capz-conformance
