mvcc: simplify watch store by eliminating victims mechanism #21104
base: main
Conversation
Signed-off-by: Zhijun <[email protected]>
[APPROVALNOTIFIER] This PR is NOT APPROVED.

This pull-request has been approved by: zhijun42. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment.

Hi @zhijun42. Thanks for your PR. I'm waiting for an etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test` on its own line. Once the patch is verified, the new status will be reflected by the `ok-to-test` label.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Signed-off-by: Zhijun <[email protected]>
```go
}

// cancelWatcher removes references of the watcher from the watchableStore
func (s *watchableStore) cancelWatcher(wa *watcher) {
```
Please try to structure the code in a way that minimizes the diff. It's not clear to me why we drop the for loop and how it impacts the rest of the code.
Sounds good. I rearranged the code of `cancelWatcher` and `retryPendingEvents` to play well with git diff so it shows clear `---` and `+++` sections.
```go
		s.mu.Unlock()
		panic("watcher not victim but not in watch groups")
	}
	// Check if already canceled first to handle double-cancel scenarios
```
Why was it not needed before? If you are adding a new mechanism, maybe do it first as a separate change.
It was indeed needed before, in the `else if wa.ch == nil` check of the for loop.
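For readers following along, here is a minimal standalone illustration of the double-cancel pattern that a `ch == nil` check guards against. The `watcher`/`store` types and `cancel` method below are hypothetical, not the etcd code:

```go
package main

import "fmt"

// The first cancel removes the watcher and nils its channel; a later cancel
// becomes a no-op instead of closing an already-closed channel or touching a
// watcher that is no longer registered.
type watcher struct {
	ch chan struct{}
}

type store struct {
	watchers map[*watcher]struct{}
}

func (s *store) cancel(w *watcher) {
	if w.ch == nil {
		// Already canceled earlier; nothing left to remove or close.
		return
	}
	delete(s.watchers, w)
	close(w.ch)
	w.ch = nil
}

func main() {
	w := &watcher{ch: make(chan struct{})}
	s := &store{watchers: map[*watcher]struct{}{w: {}}}
	s.cancel(w)
	s.cancel(w) // safe: the nil-channel check short-circuits
	fmt.Println("remaining watchers:", len(s.watchers))
}
```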
```diff
-func (s *watchableStore) moveVictims() (moved int) {
+// retryPendingEvents tries to send events to watchers that have pending event batches.
+// Returns the number of watchers whose pending events were successfully sent.
+func (s *watchableStore) retryPendingEvents() (sent int) {
```
I fail to see the "eliminating victims mechanism" part, as you are just rewriting the same mechanism under a different name, "retry pending events". What's the benefit here? Can you maybe focus on that?
The benefit isn't just renaming: it's simplifying the state machine to only two states, synced and unsynced.

In the old implementation, when `moveVictims()` runs, it does:

```go
s.mu.Lock()
victims := s.victims
s.victims = nil
s.mu.Unlock()
```

During this window, a watcher being processed isn't in `unsynced`, `synced`, or `s.victims`; it only exists in the local snapshot `victims` variable. If `cancelWatcher` runs at this moment, it can't find the watcher anywhere and has to retry, hence the complexity of the for loop.

With the new approach, a watcher is never "in flight"; it is always findable in either `synced` or `unsynced`.

Besides, the `s.victims` field was a slice of maps (`[]watcherBatch`), which was very cumbersome to iterate over, add to, and delete from.
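To make that window concrete, here is a standalone sketch under assumed, simplified types. `oldStore`, a `watcherBatch` keyed by string events, and the `findAnywhere` helper are illustrative only, not the actual etcd code:

```go
package main

import "fmt"

type watcher struct{ id int }
type watcherBatch map[*watcher][]string // watcher -> pending events

type oldStore struct {
	synced   map[*watcher]bool
	unsynced map[*watcher]bool
	victims  []watcherBatch // a slice of maps: cumbersome to search and edit
}

// findAnywhere mimics what cancelWatcher has to do: look in every container.
func (s *oldStore) findAnywhere(w *watcher) bool {
	if s.synced[w] || s.unsynced[w] {
		return true
	}
	for _, wb := range s.victims { // nested scan over the slice of maps
		if _, ok := wb[w]; ok {
			return true
		}
	}
	return false
}

func main() {
	w := &watcher{id: 1}
	s := &oldStore{
		synced:   map[*watcher]bool{},
		unsynced: map[*watcher]bool{},
		victims:  []watcherBatch{{w: {"put foo"}}},
	}
	fmt.Println("visible before snapshot:", s.findAnywhere(w)) // true

	// moveVictims-style snapshot: grab the victims and clear the field.
	victims := s.victims
	s.victims = nil

	// During this window the watcher is "in flight": a concurrent
	// cancelWatcher cannot find it anywhere and has to retry until the
	// watcher is re-added to synced, unsynced, or victims.
	fmt.Println("visible during processing:", s.findAnywhere(w)) // false
	_ = victims
}
```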
```go
// It skips watchers with pending events (non-nil eventBatch) as they are handled by retryLoop.
// It also handles compacted watchers by sending compaction responses and removing them.
// Returns the selected watchers and the minimum revision needed for DB fetch.
func (s *watchableStore) selectForSync(maxWatchers int, curRev, compactRev int64) (*watcherGroup, int64) {
```
While I appreciate the refactor of the choose and chooseAll functions, I don't think it should be in the same PR as the removal of victims. The cost of review grows exponentially with size/complexity. Please consider moving the extraction of the selectForSync function to a separate PR.
Good call! Moved it to the new PR #21108
Signed-off-by: Zhijun <[email protected]>
Signed-off-by: Zhijun <[email protected]>
Part of issues #16839 and #19806.
I'm making a couple of improvements to optimize memory and CPU usage in the server-side watch implementation. To enable that, I first need to simplify and clean up the existing implementation; otherwise the system would be hard to reason about and maintain.
Once this pull request is merged, I will post the actual optimization PR.
Overview
The previous implementation had three places a watcher could exist:

- `synced` group - watchers caught up with the current revision
- `unsynced` group - watchers that need to fetch events from the DB
- `victims` slice - watchers with blocked send channels waiting to retry

This led to:

- extra complexity (a retry for loop) in `cancelWatcher`;
- the `victims` field was a slice of maps (`[]watcherBatch`), making it very cumbersome to iterate over, add to, and delete from.

So I eliminated `victims` and significantly simplified the function `cancelWatcher`. Now we keep all watchers in either the `synced` or `unsynced` group, using a non-nil `eventBatch` to track pending events; a standalone sketch of this flow is included below.
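For illustration, here is a minimal, self-contained sketch of that two-state flow. The `notify` shape, the `event` type, and storing `eventBatch` directly on the watcher are assumptions for this sketch, not the PR's exact code:

```go
package main

import "fmt"

// A watcher whose channel is blocked keeps its pending events in eventBatch
// and moves to unsynced; retryPendingEvents retries those batches later.

type event struct{ key string }

type watcher struct {
	ch         chan []event // buffered; a full channel means the client is slow
	eventBatch []event      // non-nil: events that could not be sent yet
}

type watchableStore struct {
	synced   map[*watcher]bool
	unsynced map[*watcher]bool
}

// notify tries to deliver events; a blocked watcher keeps its pending batch
// and moves to unsynced instead of going to a separate victims list.
func (s *watchableStore) notify(evs []event) {
	for w := range s.synced {
		select {
		case w.ch <- evs:
		default:
			w.eventBatch = append(w.eventBatch, evs...)
			delete(s.synced, w)
			s.unsynced[w] = true
		}
	}
}

// retryPendingEvents retries watchers whose eventBatch is non-nil and returns
// how many were successfully sent.
func (s *watchableStore) retryPendingEvents() (sent int) {
	for w := range s.unsynced {
		if w.eventBatch == nil {
			continue
		}
		select {
		case w.ch <- w.eventBatch:
			w.eventBatch = nil
			delete(s.unsynced, w)
			s.synced[w] = true
			sent++
		default: // still blocked; try again on the next tick
		}
	}
	return sent
}

func main() {
	w := &watcher{ch: make(chan []event, 1)}
	s := &watchableStore{synced: map[*watcher]bool{w: true}, unsynced: map[*watcher]bool{}}
	s.notify([]event{{key: "a"}}) // fits in the channel buffer
	s.notify([]event{{key: "b"}}) // blocked: recorded as a pending batch, watcher moves to unsynced
	<-w.ch                        // the client drains the first batch
	fmt.Println("retried:", s.retryPendingEvents())
}
```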
Other changes:

- `syncVictimsLoop`/`moveVictims` are renamed to `retryLoop`/`retryPendingEvents`, which are more intuitive.

Testing
All existing tests pass, including:
- `TestNewWatcherCountGauge` - validates gauge correctness for regular watchers, compacted watchers, and race conditions
No new tests are needed as this is a pure refactoring with no behavioral changes.