Security: sanitize/replace shell invocations to avoid injection

Open lucifer4330k opened this issue 1 month ago • 1 comments

Potential command injection risk in shell invocations that include user-controlled values without shell-escaping.

Examples

openml_OS/helpers/api_helper.php validate_arff(): builds a sed command using $name and $did to prepend info to an ARFF file:
```
$info = '% Data set "'.$name.'". ... d/'.$did;
$string = '1s/^/'.$info.'\n/';
$command2 = "sed -i -e '$string' $newUrl";
exec(CMD_PREFIX . $command2, $res, $code);
```
$name originates from dataset metadata and may contain characters that break quoting. This should avoid shell entirely.
openml_OS/controllers/Api_splits.php: multiple system()/exec() calls build Java commands with request-derived inputs. Some checks (is_safe, is_numeric) exist, but defense-in-depth suggests escaping args or using proc_open with argv arrays.

Proposed fix

Replace the sed call with pure PHP file I/O to prepend the info line.
When shelling out to Java, strictly validate and/or escape all args (e.g., via escapeshellarg) and prefer argv arrays.
Add unit tests covering edge-case names.

Acceptance criteria

Nov 17 '25 09:11 lucifer4330k

hi @lucifer4330k ,Could you please assign it to me. I can conttribute here well

Nov 24 '25 18:11 Aymuos22