codalab-competitions Issues Scoring Submission - "Program command is not specified."

Hello, I am currently organizing a reinforcement learning-based competition where participants submit their agents via Python scripts which are then automatically evaluated, as defined by our custom Dockerfile:

  FROM ubuntu:22.10
  FROM python:3.7.9
  
  WORKDIR /cage
  
  COPY . /cage
  
  RUN pip install -e .
  
  ENTRYPOINT ["python", "/cage/CybORG/Evaluation/validation.py"]

This script both writes the results to standard output and creates a file within /cage/CybORG/Evaluation/, and I am trying to pull the relevant information from that file so I can write it out to scores.txt. However, I am getting the exception "Program command is not specified." and I am unsure how to check whether scores.txt was created by my score.py script, which is as follows:

  import os
  import argparse
  import re
  
  if __name__ == "__main__":
      parser = argparse.ArgumentParser(
          description="Score program for CAGE Challenge 4"
      )
      parser.add_argument(
          "-o",
          "--output-dir",
          type=str,
          default="",
          help="",
      )
  
      args, unknown = parser.parse_known_args()
      output_dir = os.path.abspath(args.output_dir)
  
      if not os.path.exists(args.output_dir):
          print("Path not found:", args.output_dir)
          os.makedirs(args.output_dir)
  
      results_file_name = f"/cage/CybORG/Evaluation/"
      evaluation_files = os.listdir(f"{results_file_name}")
      for f in evaluation_files:
          if "summary_text" in f:
              results_file_name = f"{results_file_name}{f}"
              print(f'{results_file_name}')
  
      with open(f"{results_file_name}", "r") as fin:
          results = fin.read()
          with open(f"{output_dir}/scores.txt", "w+") as fout:
              # Regex to find the reward
              reward = re.findall(
                  r"Average reward is: (-?[0-9]\d*\.\d+?) with a standard",
                  results,
              )
              if reward:
                  fout.write(f"avg_reward: {reward[0]}")
                  print(f'{reward}')

It is run using the command python $program/score.py $output.

I do know that it is at least running and evaluating properly, as the standard output of the scoring portion shows the expected output if no scoring program is provided. If anyone has any insight, it would be greatly appreciated.

Sep 28 '23 13:09 kcowan6

Hello, does your scoring program include a metadata file? If so, what is its content?

Sep 28 '23 14:09 Didayolo

Yes it does. It contains python $program/score.py $output. I admit I don't fully understand where $output is actually pointing to, so I'm unsure if there's additional information I need in the scoring program.

Sep 28 '23 19:09 kcowan6

Hi,

I am not sure I understand your problem. Where does the error message come from?

You can draw inspiration from this example bundle if it helps:

https://github.com/codalab/competition-examples/tree/master/codalab/Iris

The scoring program is here:

https://github.com/codalab/competition-examples/tree/master/codalab/Iris/iris_competition_bundle/scoring_program

Note that the metadata file contains command: python $program/score.py $input $output. $input points to the reference data, so it's usually useful for scoring submissions (but it depends on your specific problem).

I admit I don't fully understand where $output is actually pointing to

The variables $program and $output point to the right folders when the submission is running (the scoring program folder and the path where results are saved).

You can learn more in this Wiki page:

https://github.com/codalab/codalab-competitions/wiki/User_Building-a-Scoring-Program-for-a-Competition

Sep 29 '23 11:09 Didayolo

This error comes from submitting an example submission as I am testing the scoring portion of the competition before its publication.

For this competition, no input data is needed, as the reinforcement learning agent is just run on a scenario specified by the custom Docker container it is running in. I know the agent is being evaluated and scored, but the results are being written to a folder (/cage/CybORG/Evaluation/) different from the normal output folder, just because of how the validation script is written. I'm trying to open the results file in score.py and write its contents out to scores.txt, which is (hopefully) being written to the output folder.

I know that the submission is definitely being evaluated; I am just unsure whether the exception is caused by an issue with score.py, my metadata file, or something else.

Sorry for any confusion.

Sep 29 '23 13:09 kcowan6

To have the scores reflected on the leaderboard, you need to write them in the right format, and in the scores.txt file in the right folder.

The format is:

score_1: 0.44
score_2: 0.77

The numbers are only examples. score_1 and score_2 should be replaced by the keys of your leaderboard columns, as defined in the competition.yaml file.
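
As a minimal sketch of writing that format from Python: the helper name write_scores is hypothetical (it is not part of CodaLab), and the keys score_1 and score_2 are just the placeholders from the example above, so they would need to match your own leaderboard keys.

```python
import os


def write_scores(output_dir, scores):
    """Write one 'key: value' line per score to <output_dir>/scores.txt.

    write_scores is a hypothetical helper for illustration only.
    """
    # Create the output folder if it does not exist yet.
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, "scores.txt")
    with open(path, "w") as fout:
        for key, value in scores.items():
            # Each key is assumed to match a leaderboard column key.
            fout.write(f"{key}: {value}\n")
    return path


# Possible use inside a scoring program, where output_dir comes from $output:
# write_scores(output_dir, {"score_1": 0.44, "score_2": 0.77})
```

Ending every line with a newline keeps the file readable when more than one score is written.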

For the path, it should be:

# Scoring program
import sys, os, os.path

input_dir = sys.argv[1]
output_dir = sys.argv[2]

submit_dir = os.path.join(input_dir, 'res') 
truth_dir = os.path.join(input_dir, 'ref')
output_filename = os.path.join(output_dir, 'scores.txt')

This example bundle may be more relevant, as it is lighter and, like your problem, does not involve ground truth: https://github.com/codalab/competition-examples/blob/master/codalab/Compute_pi/compute_pi_competition_bundle/program/evaluate.py

I hope this helps.

Sep 30 '23 10:09 Didayolo

So I have updated the score.py script I am using for the scoring program to extract the score and write out the scores.txt file. It now looks like this:

import sys, os, os.path
import re

input_dir = sys.argv[1]
output_dir = sys.argv[2]

submit_dir = os.path.join(input_dir, 'res')
truth_dir = os.path.join(input_dir, 'ref')
output_filename = os.path.join(output_dir, 'scores.txt')

results_file_name = f"/cage/CybORG/Evaluation/"
evaluation_files = os.listdir(f"{results_file_name}")
for f in evaluation_files:
    if "summary_text" in f:
        results_file_name = f"{results_file_name}{f}"
        print(f'{results_file_name}')

with open(f"{results_file_name}", "r") as fin:
    results = fin.read()
    with open(f"{output_filename}", "w+") as fout:
        # Regex to find the reward
        reward = re.findall(
            r"Average reward is: (-?[0-9]\d*\.\d+?) with a standard",
            results,
        )
        if reward:
            fout.write(f"avg_reward: {reward[0]}")
            print(f'{reward}')

My metadata file for the scoring program just contains:

command: python $program/score.py $input $output

I am no longer getting the "Program command is not specified" error, but there is still no score being pulled to the leaderboard. With the logic the Docker image follows, the submission is evaluated immediately and a summary file is written to a specified file path (/cage/CybORG/Evaluation/) as well as printed to the console. The score.py I've written looks for the file in that folder and should be writing out "avg_reward: some_reward" to $output/scores.txt. However, the submission fails at the scoring step and the console contents are printed to the output log, so I am not sure if score.py is actually being run. I have also verified that the leaderboard key matches what is being written to scores.txt.

Oct 30 '23 19:10 kcowan6

the submission fails at the scoring step

Your scoring program is probably crashing. Try checking the "scoring error logs".
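
One possible cause worth ruling out: in the score.py above, if no file containing "summary_text" is found, results_file_name is still the directory itself, and open() on a directory raises an exception. The sketch below is a more defensive variant under some assumptions: the function names are hypothetical, the summary line is assumed to have the form matched by the regex in the thread, and whether /cage/CybORG/Evaluation/ is actually visible during the scoring step is not something this code can guarantee. It reports problems on standard error so they should show up in the error logs.

```python
import os
import re
import sys

# Assumed line format, taken from the regex in the thread:
# "Average reward is: <number> with a standard ..."
# This variant also accepts integer rewards such as "-7".
REWARD_RE = re.compile(r"Average reward is:\s*(-?\d+(?:\.\d+)?)\s+with a standard")


def find_summary_file(results_dir):
    """Return the first file whose name contains 'summary_text', or None."""
    if not os.path.isdir(results_dir):
        return None
    for name in sorted(os.listdir(results_dir)):
        if "summary_text" in name:
            return os.path.join(results_dir, name)
    return None


def parse_reward(text):
    """Return the average reward as a float, or None if no line matches."""
    match = REWARD_RE.search(text)
    return float(match.group(1)) if match else None


def main(results_dir, output_dir):
    """Write avg_reward to scores.txt; return 0 on success, 1 on failure."""
    summary = find_summary_file(results_dir)
    if summary is None:
        print(f"No summary_text file found in {results_dir}", file=sys.stderr)
        return 1
    with open(summary, "r") as fin:
        reward = parse_reward(fin.read())
    if reward is None:
        print(f"No reward line found in {summary}", file=sys.stderr)
        return 1
    os.makedirs(output_dir, exist_ok=True)
    with open(os.path.join(output_dir, "scores.txt"), "w") as fout:
        fout.write(f"avg_reward: {reward}\n")
    return 0


# With the command "python $program/score.py $input $output", score.py could end with:
# sys.exit(main("/cage/CybORG/Evaluation/", sys.argv[2]))
```

A non-zero exit code plus a message on standard error makes it easier to tell a missing summary file apart from a regex that did not match.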

Oct 31 '23 14:10 Didayolo