Planenmo test with <has_line line="..."> fails when stdout is bigger than 2**14 characters long.
When we run tools the stdout is cut if its length is greater than 2**14 characters.
This behavior causes error in the has_line test, as it seems to rely on the "size-reduced" stdout instead of the "whole" stdout.
Success and fail examples
Here are two examples of xml tools. This one works fine because the output will be 1200 chars long:
<tool id="ok_planemo_gx" name="ok_planemo_gx" version="1.0.0" profile="22.01">
<requirements>
<requirement type="package">python</requirement>
</requirements>
<command><![CDATA[python -c 'print("\n".join(map("{:05}".format, range(200))))']]></command>
<inputs><param type="data" format="txt" name="i"/></inputs>
<outputs><data name="o" format="txt" /></outputs>
<tests>
<test>
<param name="i" value="input.txt" />
<assert_stdout>
<has_line line="00123" />
</assert_stdout>
</test>
</tests>
</tool>
But in this one, the test fails because the stdout is cut in the area shown above:
<tool id="bug_planemo_gx" name="bug_planemo_gx" version="1.0.0" profile="22.01">
<requirements>
<requirement type="package">python</requirement>
</requirements>
<command><![CDATA[python -c 'print("\n".join(map(map("{:05}".format, range(6000))))']]></command>
<inputs><param type="data" format="txt" name="i"/></inputs>
<outputs><data name="o" format="txt" /></outputs>
<tests>
<test>
<param name="i" value="input.txt" />
<assert_stdout>
<has_line line="03000" />
</assert_stdout>
</test>
</tests>
</tool>
Output is cut here:
02727
02728
02729
02
..
9
03270
03271
03272
So, the line "03000" cannot be found, even if it is present in the logs.
As we can see, the first part of the output is 6*2729+3 (=16377) characters long. Which is only seven characters away from to 2**14 (=16384).
I suppose other tests like has_text, not_has_text, has_text_matching - and so on - will be prone to this problem.
EDIT: has_text is error prone too.
tests
to test these files; I ran the following script:
VENV=.venv
TEST_DIR=/tmp/test_tool
if [ ! -d ".venv" ];then
python3 -m virtualenv ${VENV} ;
fi
. ${VENV}/bin/activate
if [ "$(which planemo)" = "" ];then
pip install planemo ;
fi
if [ ! -d "${TEST_DIR}" ];then
planemo conda_install --conda_prefix ${TEST_DIR} . ;
fi
mkdir -p test-data
touch test-data/input.txt
planemo lint --fail_level error ./*.xml || exit 255
planemo test \
--install_galaxy \
--conda_dependency_resolution \
--conda_prefix ${TEST_DIR} \
--no_cleanup \
;
There is another problem, that may be corrected by the same bugfix (so I did not open a new issue).
The stdout seems to be duplicated.
For example, I get an error when running this tool's test:
<tool id="ok_planemo_gx" name="ok_planemo_gx" version="1.0.0" profile="22.01">
<requirements>
<requirement type="package">python</requirement>
</requirements>
<command><![CDATA[python -c 'print("0" * 200)']]></command>
<inputs><param type="data" format="txt" name="i"/></inputs>
<outputs><data name="o" format="txt" /></outputs>
<tests>
<test>
<param name="i" value="input.txt" />
<assert_stdout>
<has_size value="200" /> <!-- EDIT: this is an error, the actual size is 201 -->
</assert_stdout>
</test>
</tests>
</tool>
Planemo's outputs:
---------------------- >> begin tool stdout << -----------------------
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
----------------------- >> end tool stdout << ------------------------
---------------------- >> begin tool stderr << -----------------------
----------------------- >> end tool stderr << ------------------------
FAIL
======================================================================
FAIL: ( ok_planemo_gx ) > Test-1
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/tmper40nzja/galaxy-dev/test/functional/test_toolbox.py", line 98, in test_tool
self.do_it(tool_version=tool_version, test_index=test_index)
File "/tmp/tmper40nzja/galaxy-dev/test/functional/test_toolbox.py", line 35, in do_it
verify_tool(tool_id, self.galaxy_interactor, resource_parameters=resource_parameters, test_index=test_index, tool_version=tool_version, register_job_data=register_job_data)
File "/tmp/tmper40nzja/galaxy-dev/lib/galaxy/tool_util/verify/interactor.py", line 1127, in verify_tool
raise e
File "/tmp/tmper40nzja/galaxy-dev/lib/galaxy/tool_util/verify/interactor.py", line 1123, in verify_tool
job_stdio = _verify_outputs(testdef, test_history, jobs, data_list, data_collection_list, galaxy_interactor, quiet=quiet)
File "/tmp/tmper40nzja/galaxy-dev/lib/galaxy/tool_util/verify/interactor.py", line 1299, in _verify_outputs
raise JobOutputsError(found_exceptions, job_stdio)
galaxy.tool_util.verify.interactor.JobOutputsError: Standard output of the job different than expected
Expected file size of 200+-0 found 403
As we can see, the stdout is duplicated, so the output's size is duplicated too.
But when the original stdout is bigger than 2**15, the duplicated stdout's size becomes bigger 2**16. And then, we have funnier things:
<tool id="bug_planemo_gx" name="bug_planemo_gx" version="1.0.0" profile="22.01">
<requirements>
<requirement type="package">python</requirement>
</requirements>
<command><![CDATA[python -c 'print("\n".join(map("{:05}".format, range(6000))))']]></command>
<inputs><param type="data" format="txt" name="i"/></inputs>
<outputs><data name="o" format="txt" /></outputs>
<tests>
<test>
<param name="i" value="input.txt" />
<assert_stdout>
<has_size value="35999" /><!-- EDIT: this is an error, the actual size is 36000 -->
</assert_stdout>
</test>
</tests>
</tool>
======================================================================
FAIL: ( bug_planemo_gx ) > Test-1
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/tmpmu71gnuy/galaxy-dev/test/functional/test_toolbox.py", line 98, in test_tool
self.do_it(tool_version=tool_version, test_index=test_index)
File "/tmp/tmpmu71gnuy/galaxy-dev/test/functional/test_toolbox.py", line 35, in do_it
verify_tool(tool_id, self.galaxy_interactor, resource_parameters=resource_parameters, test_index=test_index, tool_version=tool_version, register_job_data=register_job_data)
File "/tmp/tmpmu71gnuy/galaxy-dev/lib/galaxy/tool_util/verify/interactor.py", line 1127, in verify_tool
raise e
File "/tmp/tmpmu71gnuy/galaxy-dev/lib/galaxy/tool_util/verify/interactor.py", line 1123, in verify_tool
job_stdio = _verify_outputs(testdef, test_history, jobs, data_list, data_collection_list, galaxy_interactor, quiet=quiet)
File "/tmp/tmpmu71gnuy/galaxy-dev/lib/galaxy/tool_util/verify/interactor.py", line 1299, in _verify_outputs
raise JobOutputsError(found_exceptions, job_stdio)
galaxy.tool_util.verify.interactor.JobOutputsError: Standard output of the job different than expected
Expected file size of 35999+-0 found 65537
Stdout is 65537 chars long,according to planemo (2**16+1) instead of either 36000 (the real size) or 72001 (duplicated size). This looks like the stdout is cut to never be bigger that 2**16, which is fine. But the test "has_size" sould only rely on original stdoutm instead of the cut one.
The problem comes from the way galaxy tests a tool:
galaxy.tool_util.verify.script._test_tool function uses the galaxy.tool_util.verify.interactor.verify_tool, which uses the interactor (so, the api) to retrieve the job's json data...
This json's stdout attribute is always reduced in size, even with ?full=true.
So, this is galaxy issue.
EDIT: in fact, this comes from the way job's stdout/stderr are stored in DB... much bigger problem... EDIT2:
DATABASE_MAX_STRING_SIZE = 32768
where does this limit come from..??
https://github.com/galaxyproject/galaxy/commit/4f93f0a04d0d19d4afa116f3bf33d75bd0b624a2
And why are Strings length limits applied to Text fields..?
class Job(Base, JobLike, UsesCreateAndUpdateTime, Dictifiable, RepresentById):
"""
A job represents a request to run a tool given input datasets, tool
parameters, and output datasets.
"""
__tablename__ = 'job'
...
runner_name = Column(String(255))
job_stdout = Column(TEXT)
job_stderr = Column(TEXT)
tool_stdout = Column(TEXT)
tool_stderr = Column(TEXT)
exit_code = Column(Integer, nullable=True)
traceback = Column(TEXT)
Hi @Lain-inrae, thanks for looking into this!
Maybe we can add a line to the docs for assert_stderr and assert_stdout warning users that stderr and stdout get cut when stored in the database.