Short JobIDs appear to work for some commands (job describe) but not others (job stop returns a 500 error).

Open · frrist opened this issue 11 months ago · 1 comment

Bacalhau Version: 050b873c3
Network: Private Network

Steps to reproduce (maybe? This seems hard to debug)

  1. Run a job and press Ctrl+C while its status is being checked. The JobID (from the output below) is ed1966b2-fa70-4010-be05-083db4bfb325:
$ bacalhau docker run ubuntu:latest echo hello worldssdsa
Job successfully submitted. Job ID: ed1966b2-fa70-4010-be05-083db4bfb325
Checking job status... (Enter Ctrl+C to exit at any time, your job will continue running):

	Communicating with the network  ................  done ✅  0.0s
	   Creating job for submission  ................           33.4s


Printout canceled (the job is still running).

To get more information at any time, run:
   bacalhau describe ed1966b2-fa70-4010-be05-083db4bfb325

To cancel the job, run:
   bacalhau cancel ed1966b2-fa70-4010-be05-083db4bfb325
  2. List jobs and see that it is in the Pending state:
$ bacalhau job list --limit=100 
 CREATED   ID          JOB     TYPE   STATE     
 21:10:28  8498d0cd    docker  batch  Completed 
 21:29:51  76e79ed9    docker  ops    Completed 
 21:30:45  a7cda682    docker  batch  Completed 
 21:31:15  cfaea1d6    docker  batch  Completed 
 23:32:51  2a15fcd6    docker  batch  Completed 
 00:04:07  921ac136    docker  batch  Failed    
 00:04:16  334236cf    docker  batch  Failed    
 16:45:17  595eaa6d    docker  batch  Stopped   
 16:51:02  ed1966b2    docker  batch  Pending   <- this one
  3. Describe the job via its short ID (this works):
$ bacalhau job describe ed1966b2
ID            = ed1966b2-fa70-4010-be05-083db4bfb325
Name          = ed1966b2-fa70-4010-be05-083db4bfb325
Namespace     = 69d3a761943b09c56b5561784a855a8f43e6ce9746132578fe7c26fd80c3af56
Type          = batch
State         = Pending
Count         = 1
Created Time  = 2024-03-21 16:51:02
Modified Time = 2024-03-21 16:51:02
Version       = 1

Summary
New = 1

Executions
 ID          NODE ID   STATE  DESIRED  REV.  CREATED    MODIFIED   COMMENT 
 e-dbf3ad1c  QmXE1McE  New    Pending  1     4m47s ago  4m47s ago          
  4. Stop the job via its short ID (this is the failure):
$ bacalhau job stop ed1966b2
Checking job status

	Connecting to network  ................  done ✅  0.0s
	  Verifying job state  ................  done ✅  0.0s
	         Stopping job  ................  err  ❌  0.0s
Error: unknown error trying to stop job (ID: ed1966b2): Unexpected response code: 500 ({
  "error": "Multiple jobs found for jobID prefix: ed1966b2, matching jobIDs: [ed1966b2 ed1966b2-fa70-4010-be05-083db4bfb325]",
  "message": "Internal Server Error"
})
Usage:
  bacalhau job stop [id] [flags]

Examples:
  # Stop a previously submitted job
  bacalhau job stop j-51225160-807e-48b8-88c9-28311c7899e1
  
  # Stop a job, with a short ID.
  bacalhau job stop j-51225160

Flags:
  -h, --help    help for stop
      --quiet   Do not print anything to stdout or stderr

Global Flags:
      --api-host string         The host for the client and server to communicate on (via REST).
                                Ignored if BACALHAU_API_HOST environment variable is set. (default "bootstrap.production.bacalhau.org")
      --api-port int            The port for the client and server to communicate on (via REST).
                                Ignored if BACALHAU_API_PORT environment variable is set. (default 1234)
      --cacert string           The location of a CA certificate file when self-signed certificates
                                	are used by the server
      --insecure                Enables TLS but does not verify certificates
      --log-mode logging-mode   Log format: 'default','station','json','combined','event' (default default)
      --repo string             path to bacalhau repo (default "/home/frrist/.bacalhau")
      --tls                     Instructs the client to use TLS

unknown error trying to stop job (ID: ed1966b2): Unexpected response code: 500 ({
  "error": "Multiple jobs found for jobID prefix: ed1966b2, matching jobIDs: [ed1966b2 ed1966b2-fa70-4010-be05-083db4bfb325]",
  "message": "Internal Server Error"
})

frrist, Mar 21 '24 17:03

Stopping the job with its long JobID appears to work:

$ bacalhau job stop ed1966b2-fa70-4010-be05-083db4bfb325
Checking job status

	Connecting to network  ................  done ✅  0.0s
	  Verifying job state  ................  done ✅  0.0s
	         Stopping job  ................  done ✅  0.0s

Job stop successfully submitted with evaluation ID: 7f6149b0-5553-4df0-8299-7646ebe5cb83
$ bacalhau job list
 CREATED   ID          JOB     TYPE   STATE     
 21:10:28  8498d0cd    docker  batch  Completed 
 21:29:51  76e79ed9    docker  ops    Completed 
 21:30:45  a7cda682    docker  batch  Completed 
 21:31:15  cfaea1d6    docker  batch  Completed 
 23:32:51  2a15fcd6    docker  batch  Completed 
 00:04:07  921ac136    docker  batch  Failed    
 00:04:16  334236cf    docker  batch  Failed    
 16:45:17  595eaa6d    docker  batch  Stopped   
 16:51:02  ed1966b2    docker  batch  Stopped   <- this one

frrist, Mar 21 '24 17:03