Skip to content

Conversation

@ubik2
Copy link
Contributor

@ubik2 ubik2 commented May 29, 2025

No description provided.

@ubik2 ubik2 requested a review from anotherjesse May 29, 2025 17:29
Copy link
Collaborator

@jsantell jsantell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, simple fix! Some comments on unhandled rejections

In the scenario where a scheduler/worker crashes, disables all charms in a space, and is recreated, would ensureCharms get called to eventually dispose of that worker? We could possibly have some timing issues in space-manager, but let's see where this gets us

addCancel(scheduler.watch(didCharms));
}

const removedSpaces = new Set(this.charmSchedulers.keys()).difference(dids);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL Set.prototype.difference

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, these functions are convenient. Particularly here, where the natural thing would be to modify the map as you walk through it, which can mess up your iterator.

// we are no longer monitoring this space
const scheduler = this.charmSchedulers.get(did);
this.charmSchedulers.delete(did);
scheduler?.stop();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like shutting down old workers in space-manager.ts, this returns a promise mapping to the shutdown process, we don't want to await here, but we do want to handle a rejection here (or else we crash) -- consider something like scheduler?.stop().catch(e => console.error(Error stopping scheduler: ${e}));

Copy link
Contributor Author

@ubik2 ubik2 May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point. i wish there were a good way for me to await.
Edit: I also did add your suggested error handling.

@ubik2
Copy link
Contributor Author

ubik2 commented May 29, 2025

In the scenario where a scheduler/worker crashes, disables all charms in a space, and is recreated, would ensureCharms get called to eventually dispose of that worker? We could possibly have some timing issues in space-manager, but let's see where this gets us

When the bgcharms cell is updated, that sink method called earlier has registered us for callbacks, so when we disable a charm (writing to the bgcharms cell), we should invoke ensureCharms with the updated result.

You're right that this is a bit sticky, since we could have just done the write of the disabledAt in the worker. We do give a deactivation timeout in the stop function.

I think this should all work correctly, though, and after the third failure, the worker will be stopped and removed (assuming that was the last enabled charm in that space).

@ubik2 ubik2 merged commit c1198b8 into fix/limit-bgworkers-enabled May 29, 2025
@ubik2 ubik2 deleted the fix/remove-disabled-spaces branch May 29, 2025 18:32
ubik2 added a commit that referenced this pull request May 29, 2025
…1193)

* Don't spawn a charm scheduler for a space with only disabled charms.

* If all the charms in a space become disabled, stop the scheduler for that space. (#1194)

* If all the charms in a space become disabled, stop the scheduler for that space.

* catch errors in stop
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants