Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kubeadm: Add the option to cleanup the tmp directory #112172

Merged
merged 1 commit into from Sep 2, 2022

Conversation

chendave
Copy link
Member

@chendave chendave commented Sep 1, 2022

The tmp is created by kubeadm but is never removed, the
size is expected to be expanded as time goes by.

Add one bool option to cleanup the tmp dir, the flag is
off by default.

Here is what I see on my desktop,

# du -sh /etc/kubernetes/tmp
2.2G    /etc/kubernetes/tmp

Signed-off-by: Dave Chen dave.chen@arm.com

/kind feature

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

kubeadm: add the flag "--cleanup-tmp-dir" for "kubeadm reset". It will cleanup the contents of "/etc/kubernetes/tmp". The flag is off by default.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


/sig cluster-lifecycle

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. kind/feature Categorizes issue or PR as related to a new feature. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 1, 2022
@k8s-ci-robot
Copy link
Contributor

@chendave: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Sep 1, 2022
@pacoxu
Copy link
Member

pacoxu commented Sep 1, 2022

The flag may help for some use cases. The larger part of this dir would be back-up etcd data which may be very important for users.
Other files are backup manifests and dry-run data.

@@ -215,11 +223,16 @@ func (r *resetData) Cfg() *kubeadmapi.InitConfiguration {
return r.cfg
}

// DryRun returns the DryRun flag.
// DryRun returns the dryRun flag.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is irrelevant, but I'd like to clean this up, pls let me know whether you want to revert this back.

@chendave
Copy link
Member Author

chendave commented Sep 1, 2022

The flag may help for some use cases. The larger part of this dir would be back-up etcd data which may be very important for users.
Other files are backup manifests and dry-run data.

Yes, it might be important but also might not after tearing down the whole cluster. So, I made the flag off by default.

My understanding is that those files are created by kubeadm, and kubeadm should have the responsibility to take care of it. Leaving it as is imho is definitely not a ideal approach.
Besides, the dir occupy too large space, and it grows day by day, unless someone really needs those file or else there should be a choice for end-user to clean them up.

@@ -103,7 +104,14 @@ func runCleanupNode(c workflow.RunData) error {
if certsDir != kubeadmapiv1.DefaultCertificatesDir {
klog.Warningf("[reset] WARNING: Cleaning a non-default certificates directory: %q\n", certsDir)
}
dirsToClean = append(dirsToClean, certsDir)

// It's possible the tmp dir is overriden by Env, but kubeadm unables to know how that Env is defined
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unables -> is unable to

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -160,6 +164,10 @@ func AddResetFlags(flagSet *flag.FlagSet, resetOptions *resetOptions) {
&resetOptions.dryRun, options.DryRun, resetOptions.dryRun,
"Don't apply any changes; just output what would be done.",
)
flagSet.BoolVar(
&resetOptions.cleanupTmpDir, options.CleanupTmpDir, resetOptions.cleanupTmpDir,
"Indicate whether reset will cleanup the temp dir",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

("Cleanup the %q directory", path.Join(kubeadmconstants.KubernetesDir, kubeadmconstants.TempDirForKubeadm))

to show the user what we are cleaning

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

tempDir := path.Join(kubeadmconstants.KubernetesDir, kubeadmconstants.TempDirForKubeadm)
dirsToClean = append(dirsToClean, tempDir)
if r.CleanupTmpDir() {
dirsToClean = append(dirsToClean, certsDir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we always append certs dir and optionally clean the tmp dir?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aha, my fault :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

@neolit123 neolit123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the pr @chendave

@SataQiu @pacoxu I think i am +1 to this. WDYT?

@neolit123
Copy link
Member

/release-note-edit

kubeadm: add the flag "--cleanup-tmp-dir" for "kubeadm reset". It will cleanup the contents of "/etc/kubernetes/tmp". The flag is off by default.

@neolit123
Copy link
Member

I think i am +1 to this. WDYT?

on the other hand we discussed in the past we are not going to add more flags unless really needed. today users can cleanup manually.

@pacoxu
Copy link
Member

pacoxu commented Sep 1, 2022

@SataQiu @pacoxu I think i am +1 to this. WDYT?

+1 for the flag.

And I prefer to add an interactively confirm for removing backup data. (And the flag name is temp-dir. I am not sure if the flag can make users aware that this is a backup dir as well.)

@chendave
Copy link
Member Author

chendave commented Sep 1, 2022

Thanks for @neolit123 @pacoxu for the comments!

updated to fix the comments left by @neolit123 , we can continue to discuss the interactive confirmation.

@chendave
Copy link
Member Author

chendave commented Sep 1, 2022

I checked interactively confirmation, and I think it is a little over-killed in this case.

We need the confirmation when reset the cluster because the user might don't know what's going to happen next, but he should know what's going on if he explicitly enabled the the flag. Do we need his confirmation again when the flag is enabled?

I am fine with add a confirmation, but I personally don't think we need it in this case.

@neolit123 @SataQiu for thoughts?

@neolit123
Copy link
Member

don't we have like one global confirmation at the beginning of reset?

I think cleaning the etcd backup does need confirmation.

@chendave
Copy link
Member Author

chendave commented Sep 1, 2022

Confirmation is added, here is what you will see when the flag is enabled,

[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[cleanup-tmp-dir] Cleanup the "/etc/kubernetes/tmp" directory will clean up all backup data created by kubeadm, e.g. etcd data. Are you sure you want to proceed? [y/N]:

@pacoxu @neolit123 pls help to review it again, thanks!

@neolit123
Copy link
Member

does reset now have multiple confirmation prompts? i thought we had only one, which i think is better and simpler.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 1, 2022
@chendave
Copy link
Member Author

chendave commented Sep 1, 2022

Now it looks like,

W0901 10:18:55.389827  683315 preflight.go:58] [reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
W0901 10:18:55.389887  683315 preflight.go:62] [cleanup-tmp-dir] WARNING: Cleanup the "/etc/kubernetes/tmp" directory will clean up all backup data created by kubeadm, e.g. etcd data.
[reset] Are you sure you want to proceed? [y/N]:

Btw, I inherit the flag to cleanup-node phase, so that the directory could be cleanup when the phase is called. But we won't get the confirmation when this phase is called only.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 1, 2022
@pacoxu
Copy link
Member

pacoxu commented Sep 1, 2022

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 1, 2022
@@ -51,8 +53,14 @@ func runPreflight(c workflow.RunData) error {
return errors.New("preflight phase invoked with an invalid data struct")
}

if !r.ForceReset() && !r.DryRun() {
klog.Warning("[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.")
if (!r.ForceReset() && !r.DryRun()) || r.CleanupTmpDir() {
Copy link
Member

@neolit123 neolit123 Sep 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we can just remove handling of the cleanup temp dir flag here and printing the warning below. the flag is opt in, so users know they passed it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense, I will update and pls let me know whether that is what you meant.

dirsToClean = append(dirsToClean, certsDir)
if r.CleanupTmpDir() {
// It's possible the tmp dir is overridden by Env, but kubeadm is unable to know how that Env is defined
Copy link
Member

@neolit123 neolit123 Sep 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this comment accurate though? we are using hardcoded constants to form the path.
maybe we can remove the comment entirely.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments is valid, since the directory for dry-run could be overridden by a env.
pls see:

if dryRunDir, err = kubeadmconstants.CreateTempDirForKubeadm(os.Getenv("KUBEADM_INIT_DRYRUN_DIR"), "kubeadm-init-dryrun"); err != nil {
return nil, errors.Wrap(err, "couldn't create a temporary directory")
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking we might let other cmd able to redirect the tmp directory as well, for consistency. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the tmp dir for dryrun is not the tmp dir for backup, they are different concepts. the dry run dir ends up in /tmp, while the other tmp dir is under /kubernetes.

that env override was only added to make integration tests happy for init. preferably we should not add similar overrides for more commands.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I miss something? If the env is not defined, dryrun will use the /etc/kubernetes/tmp, right?

ls /etc/kubernetes/tmp/kubeadm-init-dryrun328586547
admin.conf                    ca.crt                   front-proxy-ca.key            kube-scheduler.yaml
apiserver.crt                 ca.key                   front-proxy-client.crt        sa.key
...

func CreateTempDirForKubeadm(kubernetesDir, dirName string) (string, error) {
tempDir := path.Join(KubernetesDir, TempDirForKubeadm)
if len(kubernetesDir) != 0 {
tempDir = path.Join(kubernetesDir, TempDirForKubeadm)
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But anyway, now that this env is only added for integration test, I agree that comments is not necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the env is not defined, dryrun will use the /etc/kubernetes/tmp, right?

ok, you are right.

i think we can drop the comment.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 2, 2022
Comment on lines 58 to 61
if r.CleanupTmpDir() {
tempDir := path.Join(kubeadmconstants.KubernetesDir, kubeadmconstants.TempDirForKubeadm)
klog.Warningf("[cleanup-tmp-dir] WARNING: Cleanup the %q directory will clean up all backup data created by kubeadm, e.g. etcd data.", tempDir)
}
Copy link
Member

@neolit123 neolit123 Sep 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if r.CleanupTmpDir() {
tempDir := path.Join(kubeadmconstants.KubernetesDir, kubeadmconstants.TempDirForKubeadm)
klog.Warningf("[cleanup-tmp-dir] WARNING: Cleanup the %q directory will clean up all backup data created by kubeadm, e.g. etcd data.", tempDir)
}

what i meant in my previous comment is that we can drop the tmp dir cleanup warning entirely.
-cleanup-tmp-dir just means "add another extra dir to cleanup"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need the check like

if (!r.ForceReset() && !r.DryRun()) || r.CleanupTmpDir()) 

So that this flag will hit the confirmation below as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think so?
if -f or -dryrun are passed the prompt will not be shown. else it will be shown independent on additional flags.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, got you, so we don't need any change here, will update soon.

The `tmp` is created by `kubeadm` but is never removed, the
size is expected to be expanded as time goes by.

Add one bool option to cleanup the `tmp` dir, the flag is
off by default.

Signed-off-by: Dave Chen <dave.chen@arm.com>
Copy link
Member

@neolit123 neolit123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @chendave
/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 2, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chendave, feroenmmek, neolit123

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 2, 2022
@k8s-ci-robot k8s-ci-robot merged commit 1913c6b into kubernetes:master Sep 2, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.26 milestone Sep 2, 2022
@chendave
Copy link
Member Author

chendave commented Sep 2, 2022

thanks @pacoxu @neolit123 for the review and comments!

@chendave chendave deleted the remove_tmp branch September 2, 2022 10:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubeadm cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants