Skip to content

[hf] don't await healthcheck job completion

aguestuser requested to merge hf-dont-await-healthcheck-jobs into main
  • bug: we expect healthcheck jobs to run every 10 minutes (and also timeout every 10 minutes). however, they are actually running every 20 minutes whenever one times out
  • likely cause: job runner is running job, then waiting timeout, then waiting interval, then running job again. in other words: the semantics of repeatUntilCancelled are causing each job call to await the promise returned by the thunk invoking diagnostics.sendHealthChecks(), when what we really want is for that job to be launched in a "fire and forget" way, so the interval timer starts immediately
  • attempted fix:
    • don't return the call to diagnostics.sendHealthChecks() in jobs.app.launchHealthchecks

Merge request reports