OpenML icon indicating copy to clipboard operation
OpenML copied to clipboard

Security: sanitize/replace shell invocations to avoid injection

Open lucifer4330k opened this issue 1 month ago • 1 comments

Potential command injection risk in shell invocations that include user-controlled values without shell-escaping.

Examples

  • openml_OS/helpers/api_helper.php validate_arff(): builds a sed command using $name and $did to prepend info to an ARFF file:

    $info = '% Data set "'.$name.'". ... d/'.$did;
    $string = '1s/^/'.$info.'\n/';
    $command2 = "sed -i -e '$string' $newUrl";
    exec(CMD_PREFIX . $command2, $res, $code);
    

    $name originates from dataset metadata and may contain characters that break quoting. This should avoid shell entirely.

  • openml_OS/controllers/Api_splits.php: multiple system()/exec() calls build Java commands with request-derived inputs. Some checks (is_safe, is_numeric) exist, but defense-in-depth suggests escaping args or using proc_open with argv arrays.

Proposed fix

  • Replace the sed call with pure PHP file I/O to prepend the info line.
  • When shelling out to Java, strictly validate and/or escape all args (e.g., via escapeshellarg) and prefer argv arrays.
  • Add unit tests covering edge-case names.

Acceptance criteria

  • No use of unescaped user-provided values in shell commands.
  • Prepend operation implemented without shell invocation.

lucifer4330k avatar Nov 17 '25 09:11 lucifer4330k

hi @lucifer4330k ,Could you please assign it to me. I can conttribute here well

Aymuos22 avatar Nov 24 '25 18:11 Aymuos22