Add borrowing between priority levels in APF #113485

Merged
merged 8 commits into from Nov 9, 2022

Conversation

MikeSpreitzer
Member

@MikeSpreitzer MikeSpreitzer commented Oct 31, 2022

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR implements the design changes in kubernetes/enhancements#3391 and kubernetes/enhancements#3479. That is, this PR adds borrowing of concurrency between priority levels in the API Priority and Fairness feature.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

The API change part of this was already reviewed in #112830.
The first six commits of this PR are the commits of that earlier PR, rebased onto a later revision of the master branch.
The "apiserver: update borrowing parameters for apf bootstrap objects" commit is a cherry-pick of #113016.

Does this PR introduce a user-facing change?

API Priority and Fairness has introduced a new feature called _borrowing_ that allows a priority level
to borrow a number of seats from other priority level(s). As a cluster operator, you can enable borrowing
for a given priority level configuration object via the two newly introduced fields, `lendablePercent` and
`borrowingLimitPercent`, located under the `.spec.limited` field of the designated priority level.
This PR adds the following metrics.
- `apiserver_flowcontrol_nominal_limit_seats`: Nominal number of execution seats configured for each priority level
- `apiserver_flowcontrol_lower_limit_seats`: Configured lower bound on number of execution seats available to each priority level
- `apiserver_flowcontrol_upper_limit_seats`: Configured upper bound on number of execution seats available to each priority level
- `apiserver_flowcontrol_demand_seats`: Observations, at the end of every nanosecond, of (the number of seats each priority level could use) / (nominal number of seats for that level)
- `apiserver_flowcontrol_demand_seats_high_watermark`: High watermark, over last adjustment period, of demand_seats
- `apiserver_flowcontrol_demand_seats_average`: Time-weighted average, over last adjustment period, of demand_seats
- `apiserver_flowcontrol_demand_seats_stdev`: Time-weighted standard deviation, over last adjustment period, of demand_seats
- `apiserver_flowcontrol_demand_seats_smoothed`: Smoothed seat demands
- `apiserver_flowcontrol_target_seats`: Seat allocation targets
- `apiserver_flowcontrol_seat_fair_frac`: Fair fraction of server's concurrency to allocate to each priority level that can use it
- `apiserver_flowcontrol_current_limit_seats`: Current derived number of execution seats available to each priority level
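
To make the relationship between the two new fields and the nominal, lower, and upper limit metrics concrete, here is a minimal Go sketch. It assumes the lendable and borrowable seat counts are obtained by rounding the given percentage of the nominal limit, which is my reading of the design rather than something stated in this PR. `deriveLimits` and `seatLimits` are hypothetical names used only for illustration and are not the apiserver's code. The sketch also assumes both fields are set.

```go
package main

import (
	"fmt"
	"math"
)

// seatLimits holds the derived bounds for one priority level.
// Hypothetical type, for illustration only.
type seatLimits struct {
	Nominal int // seats configured for the level
	Lower   int // seats the level keeps even when it lends as much as allowed
	Upper   int // seats the level can reach when it borrows as much as allowed
}

// deriveLimits is a hypothetical helper. It assumes:
//   lendable   = round(nominal * lendablePercent / 100)
//   borrowable = round(nominal * borrowingLimitPercent / 100)
func deriveLimits(nominal int, lendablePercent, borrowingLimitPercent float64) seatLimits {
	lendable := int(math.Round(float64(nominal) * lendablePercent / 100))
	borrowable := int(math.Round(float64(nominal) * borrowingLimitPercent / 100))
	return seatLimits{
		Nominal: nominal,
		Lower:   nominal - lendable,
		Upper:   nominal + borrowable,
	}
}

func main() {
	// A level with 40 nominal seats that may lend half of them
	// and borrow up to a quarter more.
	l := deriveLimits(40, 50, 25)
	fmt.Printf("nominal=%d lower=%d upper=%d\n", l.Nominal, l.Lower, l.Upper)
}
```

Under these assumptions, the current limit reported for a level would always fall between its lower and upper bounds.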

The possibility of borrowing means that the old metric `apiserver_flowcontrol_request_concurrency_limit` can no longer mean both the configured concurrency limit and the enforced concurrency limit. Henceforth it means the configured concurrency limit.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

[KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/1040-priority-and-fairness#dispatching

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Oct 31, 2022
@k8s-ci-robot k8s-ci-robot added area/code-generation kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 31, 2022
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. area/apiserver and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 31, 2022
@k8s-ci-robot k8s-ci-robot added area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Nov 1, 2022
@MikeSpreitzer MikeSpreitzer changed the title [WIP] Add borrowing between priority levels in APF Add borrowing between priority levels in APF Nov 1, 2022
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 1, 2022
@MikeSpreitzer MikeSpreitzer force-pushed the apf-borrowing branch 3 times, most recently from f13d70b to 5d0aea9 Compare November 1, 2022 04:54
@k8s-triage-robot

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

@MikeSpreitzer
Member Author

/cc @tkashem
/cc @wojtek-t
@cyang49

@lavalamp
Member

lavalamp commented Nov 8, 2022

I'm gonna tag this to make sure it meets the code freeze criteria

/lgtm
/approve

@MikeSpreitzer if you could rebase that would be great, I can live with it if you fix the nits in a followup.

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Nov 8, 2022
@k8s-ci-robot k8s-ci-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Nov 8, 2022
@lavalamp
Member

lavalamp commented Nov 8, 2022

/lgtm

Can you update the release note text per #113485 (comment)?

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2022
pkg/apis/flowcontrol/internalbootstrap/defaults_test.go Outdated
pkg/apis/flowcontrol/v1alpha1/defaults_test.go Outdated
pkg/apis/flowcontrol/v1beta1/defaults_test.go Outdated
pkg/apis/flowcontrol/v1beta2/defaults_test.go Outdated
pkg/apis/flowcontrol/v1beta3/defaults_test.go Outdated
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 9, 2022
Also make some design changes exposed in testing and review.

Do not remove the ambiguous old metric
`apiserver_flowcontrol_request_concurrency_limit` because reviewers
thought it is too early.  This creates a problem: that metric cannot
keep both of its old meanings.  I chose the configured concurrency
limit.

Testing has revealed a design flaw, which concerns the initialization
of the seat demand state tracking.  The current design in the KEP is
as follows.

> Adjustment is also done on configuration change … For a newly
> introduced priority level, we set HighSeatDemand, AvgSeatDemand, and
> SmoothSeatDemand to NominalCL-LendableSD/2 and StDevSeatDemand to
> zero.

But this does not work out well at server startup.  As part of its
construction, the APF controller does a configuration change with zero
objects read, to initialize its request-handling state.  As always,
the two mandatory priority levels are implicitly added whenever they
are not read.  So this initial reconfig has one non-exempt priority
level, the mandatory one called catch-all --- and it gets its
SmoothSeatDemand initialized to the whole server concurrency limit.
From there it decays slowly, as per the regular design.  So for a
fairly long time, it appears to have a high demand and competes
strongly with the other priority levels.  Its Target is higher than
all the others, once they start to show up.  It properly gets a low
NominalCL once other levels show up, which actually makes it compete
harder for borrowing: it has an exceptionally high Target and a rather
low NominalCL.

I have considered the following fix.  The idea is that the designed
initialization is not appropriate before all the default objects are
read.  So the fix is to have a mode bit in the controller.  In the
initial state, those seat demand tracking variables are set to zero.
Once the config-producing controller detects that all the default
objects are pre-existing, it flips the mode bit.  In the latter mode,
the seat demand tracking variables are initialized as originally
designed.

However, that still gives preferential treatment to the default
PriorityLevelConfiguration objects, over any that may be added later.

So I have made a universal and simpler fix: always initialize those
seat demand tracking variables to zero.  Even if a lot of load shows
up quickly, remember that adjustments are frequent (every 10 sec) and
the very next one will fully respond to that load.
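
The difference between the two initializations can be sketched in Go. This is a rough illustration under stated assumptions, not the controller's code: it assumes each adjustment replaces the smoothed demand with the larger of the current demand envelope and an exponentially decayed blend of the old value and the envelope, and it assumes a smoothing coefficient of 0.977 per 10-second period. The names `smooth`, `periodsUntilBelow`, and `smoothingCoefficient` are hypothetical.

```go
package main

import "fmt"

// smoothingCoefficient is an assumed per-period decay factor,
// chosen here for illustration.
const smoothingCoefficient = 0.977

// smooth returns the next smoothed seat demand. It assumes the rule
//   next = max(envelope, A*prev + (1-A)*envelope)
// so a rise in demand takes effect at once, while a fall decays slowly.
func smooth(prev, envelope float64) float64 {
	decayed := smoothingCoefficient*prev + (1-smoothingCoefficient)*envelope
	if envelope > decayed {
		return envelope
	}
	return decayed
}

// periodsUntilBelow counts the adjustment periods an idle priority level
// (envelope of zero) needs before its smoothed demand drops below threshold.
func periodsUntilBelow(initial, threshold float64) int {
	n := 0
	for s := initial; s >= threshold; s = smooth(s, 0) {
		n++
	}
	return n
}

func main() {
	// Idle level seeded with a large demand: slow to fall to half.
	fmt.Println("periods to halve from 600:", periodsUntilBelow(600, 300))
	// Idle level seeded with zero: already below the threshold.
	fmt.Println("periods to halve from 0:", periodsUntilBelow(0, 300))
	// Level seeded with zero that suddenly sees load: one step catches up.
	fmt.Println("after one adjustment with envelope 120:", smooth(0, 120))
}
```

With these assumed numbers, a level seeded with a large smoothed demand needs dozens of adjustment periods to decay to half of it, while a level seeded with zero reaches its real demand at the first adjustment after load appears.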

Also: revise logging logic, to log at numerically lower V level when
there is a change.

Also: bug fix in float64close.

Also, separate imports in some file

Co-authored-by: Han Kang <hankang@google.com>
@MikeSpreitzer
Member Author

The force-push to feb4227 renames the wait group to make its purpose clearer and switches from a formula to an array for the number of clients per flow.

@MikeSpreitzer
Member Author

I updated the release note to list the new metrics and explain what happened to apiserver_flowcontrol_request_concurrency_limit.

@wojtek-t
Member

wojtek-t commented Nov 9, 2022

I'm retagging after just nits were addressed based on the fact that it was tagged before code-freeze in:
#113485 (comment)

FWIW - it LGTM too.

/lgtm
/approve
/milestone v1.26

@k8s-ci-robot k8s-ci-robot added this to the v1.26 milestone Nov 9, 2022
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 9, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lavalamp, MikeSpreitzer, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 1193a9a into kubernetes:master Nov 9, 2022
@MikeSpreitzer MikeSpreitzer deleted the apf-borrowing branch November 9, 2022 15:51
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/code-generation area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Status: API review completed, 1.26