Discard rule results from runs superseded by Shake restarts #4988
Discard rule results from runs superseded by Shake restarts #4988crtschin wants to merge 2 commits into
Conversation
| A restart cancels the running build with an async exception, but the computation | ||
| it spawned can outlive it and finish under a later build. 'compute' stamps the | ||
| result with whatever the step is *now*, so a value read from stale inputs is | ||
| recorded as freshly built and dependents skip recomputing it: |
There was a problem hiding this comment.
I thought asyncWithCleanUp would cleanup all the chilren ? So the spawned one won't live after we stop the old session ? Where does it go wrong 🤔 should we fix it instead ?
There was a problem hiding this comment.
I thought asyncWithCleanUp would cleanup all the chilren ?
That's the intention, and it does.
But notice that incDatabase is called in a separate thread relative to all of the reads in compute, which means it can interleave and cause a rule's computation to assume the step of a latter rule build action. This is what causes the superceded diagnostic rule result to survive.
I don't think this is savable from the level of asyncWithCleanup. It's compute that writes the incorrect Clean with the new step value. A check has to occur in the same transaction that sets the status.
d09584c to
42382e0
Compare
Closes #4985.
When any rule is (re)ran, do to being invalidated (in the above issue, due to
documentDidChangenotifications from typing), we track in our shake setup which shake session rule and rule results belong to. Thestepbookkeeping is an implementation detail of our shake setup to track whether keys are dirty.What went wrong here is that a rule result being put in the database for a rule build demand, can interleave with the
stepincrementing done on shake restarts. This has the consequence that a shake rule may write it's result under a newstepvalue, or leave a now outdated result.This PR adds a guard against that by checking that the step value is that same as when the rule started, implying no restart occurred. If one did, we should not overwrite it with the old result, as a new build may have raced and written a new one.
I added the (in the previous setup) flaky test, and reran it multiple times.