
Conversation


@zhijun42 (Contributor) commented on Jan 11, 2026

Part of issues #16839 and #19806.

I'm making a couple of improvements to optimize memory and CPU usage in the server-side watch implementation. To enable that, I first need to simplify and clean up the existing implementation; otherwise, the system would be hard to reason about and maintain.

Once this pull request is merged, I will post the actual optimization PR.

Overview

The previous implementation had three places a watcher could exist:

  1. synced group - watchers caught up with current revision
  2. unsynced group - watchers that need to fetch events from DB
  3. victims slice - watchers with blocked send channels waiting to retry

This led to:

  • a system that is hard to reason about, with watcher state spread across three places;
  • complex logic in the cancelWatcher function;
  • a victims field typed as a slice of maps ([]watcherBatch), which is very cumbersome to iterate, add to, and delete from.

So I eliminated victims and significantly simplified the function cancelWatcher.

Now we keep all watchers in either the synced or unsynced group, using a non-nil eventBatch to track pending events.
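
For reference, here is a rough before/after sketch of the watchableStore state. The field names (synced, unsynced, victims, eventBatch) come from this discussion; the surrounding Go is paraphrased, not the exact etcd source:

  // Before: three places a watcher could live.
  type watchableStore struct {
      synced   watcherGroup   // caught up with the current revision
      unsynced watcherGroup   // needs to fetch events from the DB
      victims  []watcherBatch // blocked sends awaiting retry (a slice of maps)
  }

  // After: two groups only; pending events ride on the watcher itself.
  type watchableStore struct {
      synced   watcherGroup
      unsynced watcherGroup
      // each *watcher carries an eventBatch pointer; non-nil means
      // retryLoop still has events to resend for that watcher
  }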

Other changes:

  • With victims gone, the functions syncVictimsLoop / moveVictims are renamed to retryLoop / retryPendingEvents, which are more intuitive names.
  • Added comments to existing code to improve readability.

Testing

All existing tests pass, including:

  • TestNewWatcherCountGauge - validates gauge correctness for regular, compacted, and race-condition scenarios
  • Full mvcc test suite

No new tests are needed as this is a pure refactoring with no behavioral changes.

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: zhijun42
Once this PR has been reviewed and has the lgtm label, please assign serathius for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot

Hi @zhijun42. Thanks for your PR.

I'm waiting for an etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

  }

  // cancelWatcher removes references of the watcher from the watchableStore
  func (s *watchableStore) cancelWatcher(wa *watcher) {
Member

Please try to structure the code in a way that minimizes the diff. It's not clear to me why we drop the for loop and how it impacts the rest of the code.

Contributor Author

Sounds good. I rearranged the code of cancelWatcher and retryPendingEvents to play well with git diff, so it shows clear ---- and +++ sections.

  s.mu.Unlock()
  panic("watcher not victim but not in watch groups")
  }
  // Check if already canceled first to handle double-cancel scenarios
Member

Why was it not needed before? If you are adding a new mechanism, maybe introduce it first as a separate change.

Contributor Author

It was indeed needed before in the else if wa.ch == nil check of the for loop.
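
For context, the old for loop had roughly this shape (paraphrased from the previous implementation, not the verbatim source):

  for {
      s.mu.Lock()
      if s.unsynced.delete(wa) || s.synced.delete(wa) || wa.compacted {
          break // found in a group (or compacted); finish cancellation below
      } else if wa.ch == nil {
          break // already canceled by a concurrent call: the double-cancel case
      }
      // Not in any group and not canceled: it must sit in a victims
      // snapshot held by moveVictims, so unlock and retry shortly.
      s.mu.Unlock()
      time.Sleep(time.Millisecond)
  }

The wa.ch == nil branch is the double-cancel handling that the new code keeps as an explicit up-front check.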

  func (s *watchableStore) moveVictims() (moved int) {
  // retryPendingEvents tries to send events to watchers that have pending event batches.
  // Returns the number of watchers whose pending events were successfully sent.
  func (s *watchableStore) retryPendingEvents() (sent int) {
Member

I fail to see the "eliminating victims mechanism" part, as you are just rewriting the same mechanism under a different name, "retry pending events". What's the benefit here? Can you maybe focus on that?

Contributor Author

The benefit isn't just renaming - it's simplifying the state machine to only two states: synced and unsynced.

In the old implementation, when moveVictims() runs, it does:

  s.mu.Lock()
  victims := s.victims
  s.victims = nil
  s.mu.Unlock()

During this window, a watcher being processed isn't in unsynced, synced, or s.victims - it exists only in the local victims snapshot. If cancelWatcher runs at this moment, it can't find the watcher anywhere and has to retry; hence the complexity of the for loop.

With the new approach, a watcher is never "in flight" - it is always findable in either synced or unsynced.

Besides, the s.victims field was a slice of maps ([]watcherBatch), which was very cumbersome to iterate, add to, and delete from.
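
To make the contrast concrete, here is a minimal sketch of the new retry shape; helper names like pendingWatchers and trySend are placeholders for illustration, not the actual helpers in this PR:

  // retryPendingEvents walks the group in place under the lock, so a
  // watcher with a pending batch is always discoverable by cancelWatcher.
  s.mu.Lock()
  for _, wa := range s.unsynced.pendingWatchers() { // hypothetical helper
      if wa.trySend(wa.eventBatch) { // hypothetical non-blocking send
          wa.eventBatch = nil // nothing pending anymore
          sent++
      }
  }
  s.mu.Unlock()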

  // It skips watchers with pending events (non-nil eventBatch) as they are handled by retryLoop.
  // It also handles compacted watchers by sending compaction responses and removing them.
  // Returns the selected watchers and the minimum revision needed for DB fetch.
  func (s *watchableStore) selectForSync(maxWatchers int, curRev, compactRev int64) (*watcherGroup, int64) {
Member

While I appreciate the refactor of the choose and chooseAll functions, I don't think it should be in the same PR as the removal of victims. The cost of review grows exponentially with size/complexity. Please consider moving the extraction of the selectForSync function to a separate PR.

Contributor Author

Good call! Moved it to the new PR #21108.
