cron icon indicating copy to clipboard operation
cron copied to clipboard

SkipIfStillRunning Chain Not Applied When Manually Calling Entry.Job.Run()

Open thisisdev-patrick opened this issue 5 months ago • 1 comments

Bug Report: SkipIfStillRunning Chain Not Applied When Manually Calling Entry.Job.Run()

Description

When using cron.WithChain(cron.SkipIfStillRunning(...)) to prevent concurrent job execution, manually calling Entry.Job.Run() bypasses the chain wrapper, allowing concurrent execution despite the SkipIfStillRunning configuration.

Environment

  • Library Version: github.com/robfig/cron/v3 v3.0.1
  • Go Version: 1.23.2
  • OS: Any

Steps to Reproduce

package main

import (
    "fmt"
    "log"
    "time"
    "github.com/robfig/cron/v3"
)

func main() {
    // Create cron with SkipIfStillRunning chain
    c := cron.New(
        cron.WithChain(cron.SkipIfStillRunning(cron.VerbosePrintfLogger(log.New(os.Stdout, "cron: ", log.LstdFlags)))),
    )
    
    // Add a job that takes 5 seconds
    jobFunc := func() {
        fmt.Println("Job started at", time.Now())
        time.Sleep(5 * time.Second)
        fmt.Println("Job finished at", time.Now())
    }
    
    entryID, _ := c.AddFunc("@every 10s", jobFunc)
    
    // Start cron scheduler
    c.Start()
    
    // Get the entry
    entry := c.Entry(entryID)
    
    // Manually run the job twice - BYPASSES SkipIfStillRunning!
    go entry.Job.Run()  // First manual run
    
    // Small delay to ensure first job starts
    time.Sleep(100 * time.Millisecond)
    
    go entry.Job.Run()  // Second manual run - runs concurrently!
    
    // Wait to see both jobs running
    time.Sleep(10 * time.Second)
}

Expected Behavior

The second manual Run() call should be skipped because the first one is still running, respecting the SkipIfStillRunning chain configuration. The output should show only one job execution at a time.

Actual Behavior

Both manual Run() calls execute concurrently, completely bypassing the SkipIfStillRunning wrapper. The chain is only applied to scheduled executions, not manual calls.

Expected Output:

Job started at 18:43:29.377
Job finished at 18:43:29.578

Actual Output (Confirmed):

Job execution #1 started at 18:43:29.377
Job execution #2 started at 18:43:29.388  // Should have been skipped!
Job execution #1 finished at 18:43:29.578
Job execution #2 finished at 18:43:29.589

Test Results (Confirmed Working):

  1. Manual Run() - BUG CONFIRMED: Both jobs run concurrently despite SkipIfStillRunning
  2. Scheduled Runs - WORKS CORRECTLY: Properly skips concurrent executions (shows "skip" in logs)
  3. Workaround - SUCCESSFUL: Using atomic.Bool properly prevents concurrent execution

Root Cause Analysis

The Entry.Job field contains the original unwrapped job function, not the chain-wrapped version. When WithChain is used, the cron scheduler internally wraps the job with the chain decorators, but this wrapped version is not exposed through the Entry struct.

// Simplified internal structure
type Entry struct {
    ID       EntryID
    Schedule Schedule
    Next     time.Time
    Prev     time.Time
    Job      Job        // This is the ORIGINAL job, not the wrapped one
    // The wrapped job is stored internally but not exposed
}

When the scheduler runs a job, it uses the wrapped version. When you call entry.Job.Run() directly, you're calling the original, unwrapped function.

Impact

This behavior can lead to:

  1. Race conditions when manually triggering jobs that were designed to be singleton
  2. Resource contention if the job manages shared resources (database connections, file handles, etc.)
  3. Data corruption if the job performs non-idempotent operations
  4. Unexpected behavior in applications that rely on SkipIfStillRunning for concurrency control
  5. Production incidents when jobs that should never run concurrently do so

Real-World Use Case

This issue commonly affects applications that need to:

  • Run a job immediately on startup (manual trigger) to ensure fresh data
  • Then continue with scheduled execution
  • While ensuring no concurrent execution occurs

Example: A cache prewarming service that needs to run immediately on deployment and then every 5 minutes, but should never have two instances running simultaneously.

Workaround

Implement manual concurrency control outside of the cron library:

package main

import (
    "fmt"
    "sync/atomic"
    "time"
    "github.com/robfig/cron/v3"
)

func main() {
    var jobRunning atomic.Bool
    
    // Create the actual job function
    jobFunc := func() {
        fmt.Println("Job started at", time.Now())
        time.Sleep(5 * time.Second)
        fmt.Println("Job finished at", time.Now())
    }
    
    // Wrap it with concurrency control
    wrappedJob := func() {
        if !jobRunning.CompareAndSwap(false, true) {
            fmt.Println("Job already running, skipping")
            return
        }
        defer jobRunning.Store(false)
        
        jobFunc()
    }
    
    // Create cron (SkipIfStillRunning still useful for scheduled runs)
    c := cron.New(
        cron.WithChain(cron.SkipIfStillRunning(cron.VerbosePrintfLogger(log.New(os.Stdout, "cron: ", log.LstdFlags)))),
    )
    
    // Use wrappedJob for both cron and manual execution
    c.AddFunc("@every 10s", wrappedJob)
    c.Start()
    
    // Manual execution now respects the same lock
    go wrappedJob()  // First manual run
    time.Sleep(100 * time.Millisecond)
    go wrappedJob()  // Second manual run - properly skipped!
    
    time.Sleep(10 * time.Second)
}

Suggested Fixes

Option 1: Expose the Wrapped Job

Add a WrappedJob field to Entry that contains the chain-wrapped version:

type Entry struct {
    ID         EntryID
    Schedule   Schedule
    Next       time.Time
    Prev       time.Time
    Job        Job  // Original job (for backwards compatibility)
    WrappedJob Job  // Chain-wrapped job
}

Option 2: Add a RunWithChain() Method

Provide a method on Entry that runs the job through the chain:

func (e *Entry) RunWithChain() {
    // Internal implementation would use the wrapped job
}

Option 3: Make Job Private and Provide Safe Methods

type Entry struct {
    ID       EntryID
    Schedule Schedule
    Next     time.Time
    Prev     time.Time
    job      Job  // Now private
}

func (e *Entry) Run() {
    // Always uses the wrapped version
}

Option 4: Clear Documentation

At minimum, clearly document that Entry.Job.Run() bypasses chains and should not be used when chain behavior is required.

Additional Context

This issue is particularly problematic because:

  1. The behavior is not intuitive - users expect chains to always apply
  2. The bug only manifests under concurrent load, making it hard to catch in testing
  3. There's no warning or error when bypassing the chain
  4. The documentation doesn't clearly warn about this limitation

Related Issues

  • This may affect other chain decorators like DelayIfStillRunning and Recover
  • Custom chain implementations are also bypassed when using manual Run()

Test Case

Here's a test case that demonstrates the issue:

func TestSkipIfStillRunningBypass(t *testing.T) {
    var executions atomic.Int32
    
    c := cron.New(
        cron.WithChain(cron.SkipIfStillRunning(cron.DiscardLogger)),
    )
    
    jobFunc := func() {
        executions.Add(1)
        time.Sleep(200 * time.Millisecond)
    }
    
    entryID, _ := c.AddFunc("@every 1s", jobFunc)
    c.Start()
    
    entry := c.Entry(entryID)
    
    // Start two manual runs concurrently
    var wg sync.WaitGroup
    wg.Add(2)
    
    go func() {
        defer wg.Done()
        entry.Job.Run()
    }()
    
    time.Sleep(10 * time.Millisecond) // Ensure first starts
    
    go func() {
        defer wg.Done()
        entry.Job.Run()
    }()
    
    wg.Wait()
    
    // This will fail - both jobs run despite SkipIfStillRunning
    if executions.Load() != 1 {
        t.Errorf("Expected 1 execution, got %d", executions.Load())
    }
}

thisisdev-patrick avatar Aug 06 '25 16:08 thisisdev-patrick

Update: Found a workaround

After further investigation, I've confirmed this is indeed a limitation - Entry.Job.Run() bypasses the chain wrappers.

I've worked around it by manually creating chain-wrapped jobs that can be used for both scheduled and manual execution:

// Initialize cron with the chain
skipWrapper := cron.SkipIfStillRunning(cron.VerbosePrintfLogger(log))
cr0n := cron.New(
    cron.WithLogger(cron.VerbosePrintfLogger(log)),
    cron.WithChain(skipWrapper),
)

// Define job functions
prewarmerJob := func() {
    log.Infof("[daemon] running prewarmer")
    prewarmer.Reset(ctx)
    prewarmer.CreateHeat(ctx, errorChan)
}

enrichmentJob := func() {
    log.Infof("[daemon] running enrichment")
    enrichmentPrewarmer.Reset(ctx)
    enrichmentPrewarmer.CreateHeat(ctx, errorChan)
}

// Manually wrap the jobs with the same chain
pJob := cron.NewChain(skipWrapper).Then(cron.FuncJob(prewarmerJob))
eJob := cron.NewChain(skipWrapper).Then(cron.FuncJob(enrichmentJob))

// Add wrapped jobs to scheduler
cr0n.AddJob(cfg.Heater.Schedules.Prewarmer, pJob)
cr0n.AddJob(cfg.Heater.Schedules.Enrichment, eJob)

// Start scheduler and trigger initial runs
cr0n.Start()
go pJob.Run()  // Now respects SkipIfStillRunning!
go eJob.Run()  // Same here

The key is using the same skipWrapper for both the cron initialization and the manual chain wrapping. This ensures the skip logic works for both scheduled and manual runs.

For anyone else hitting this: you can't use Entry.Job.Run() - you need to manually wrap your job with NewChain().Then() and use that wrapped version for manual runs.

Would still be nice if the library exposed the wrapped job directly on the Entry struct.

thisisdev-patrick avatar Aug 06 '25 18:08 thisisdev-patrick

Solution available in maintained fork

This issue has been fixed in netresearch/go-cron, a maintained fork of this library.

Fix: Entry.Run() now properly invokes the wrapped job (with all chain wrappers applied) instead of bypassing the chain. The WrappedJob field is now properly used.

How to migrate:

// Change import from:
import "github.com/robfig/cron/v3"

// To:
import cron "github.com/netresearch/go-cron"

The API is 100% backward compatible. See the migration guide for details.

CybotTM avatar Dec 04 '25 23:12 CybotTM