Security: sanitize/replace shell invocations to avoid injection
Potential command injection risk in shell invocations that include user-controlled values without shell-escaping.
Examples
-
openml_OS/helpers/api_helper.phpvalidate_arff(): builds asedcommand using$nameand$didto prepend info to an ARFF file:$info = '% Data set "'.$name.'". ... d/'.$did; $string = '1s/^/'.$info.'\n/'; $command2 = "sed -i -e '$string' $newUrl"; exec(CMD_PREFIX . $command2, $res, $code);$nameoriginates from dataset metadata and may contain characters that break quoting. This should avoid shell entirely. -
openml_OS/controllers/Api_splits.php: multiplesystem()/exec()calls build Java commands with request-derived inputs. Some checks (is_safe,is_numeric) exist, but defense-in-depth suggests escaping args or usingproc_openwith argv arrays.
Proposed fix
- Replace the
sedcall with pure PHP file I/O to prepend the info line. - When shelling out to Java, strictly validate and/or escape all args (e.g., via
escapeshellarg) and prefer argv arrays. - Add unit tests covering edge-case names.
Acceptance criteria
- No use of unescaped user-provided values in shell commands.
- Prepend operation implemented without shell invocation.
hi @lucifer4330k ,Could you please assign it to me. I can conttribute here well