planemo Planenmo test with <has_line line="..."> fails when stdout is bigger than 2**14 characters long.

When we run tools the stdout is cut if its length is greater than 2**14 characters.

This behavior causes error in the has_line test, as it seems to rely on the "size-reduced" stdout instead of the "whole" stdout.

Success and fail examples

Here are two examples of xml tools. This one works fine because the output will be 1200 chars long:

<tool id="ok_planemo_gx" name="ok_planemo_gx" version="1.0.0" profile="22.01">
    <requirements>
        <requirement type="package">python</requirement>
    </requirements>
    <command><![CDATA[python -c 'print("\n".join(map("{:05}".format, range(200))))']]></command>
    <inputs><param type="data" format="txt" name="i"/></inputs>
    <outputs><data name="o" format="txt" /></outputs>
    <tests>
        <test>
            <param name="i" value="input.txt" />
            <assert_stdout>
                <has_line line="00123" />
            </assert_stdout>
        </test>
    </tests>
</tool>

But in this one, the test fails because the stdout is cut in the area shown above:

<tool id="bug_planemo_gx" name="bug_planemo_gx" version="1.0.0" profile="22.01">
    <requirements>
        <requirement type="package">python</requirement>
    </requirements>
    <command><![CDATA[python -c 'print("\n".join(map(map("{:05}".format, range(6000))))']]></command>
    <inputs><param type="data" format="txt" name="i"/></inputs>
    <outputs><data name="o" format="txt" /></outputs>
    <tests>
        <test>
            <param name="i" value="input.txt" />
            <assert_stdout>
                <has_line line="03000" />
            </assert_stdout>
        </test>
    </tests>
</tool>

Output is cut here:

So, the line "03000" cannot be found, even if it is present in the logs.

As we can see, the first part of the output is 6*2729+3 (=16377) characters long. Which is only seven characters away from to 2**14 (=16384).

I suppose other tests like has_text, not_has_text, has_text_matching - and so on - will be prone to this problem.

EDIT: has_text is error prone too.

tests

to test these files; I ran the following script:

VENV=.venv
TEST_DIR=/tmp/test_tool

if [ ! -d ".venv" ];then
  python3 -m virtualenv ${VENV} ;
fi


. ${VENV}/bin/activate

if [ "$(which planemo)" = "" ];then
   pip install planemo ;
fi


if [ ! -d "${TEST_DIR}" ];then
  planemo conda_install --conda_prefix ${TEST_DIR} . ;
fi

mkdir -p test-data
touch test-data/input.txt

planemo lint --fail_level error ./*.xml || exit 255

planemo test \
  --install_galaxy \
  --conda_dependency_resolution \
  --conda_prefix ${TEST_DIR} \
  --no_cleanup \
;

May 19 '22 13:05 Lain-inrae

There is another problem, that may be corrected by the same bugfix (so I did not open a new issue).

The stdout seems to be duplicated.

For example, I get an error when running this tool's test:

<tool id="ok_planemo_gx" name="ok_planemo_gx" version="1.0.0" profile="22.01">
    <requirements>
        <requirement type="package">python</requirement>
    </requirements>
    <command><![CDATA[python -c 'print("0" * 200)']]></command>
    <inputs><param type="data" format="txt" name="i"/></inputs>
    <outputs><data name="o" format="txt" /></outputs>
    <tests>
        <test>
            <param name="i" value="input.txt" />
            <assert_stdout>
                <has_size value="200" /> <!-- EDIT: this is an error, the actual size is 201 -->
            </assert_stdout>
        </test>
    </tests>
</tool>

Planemo's outputs:

---------------------- >> begin tool stdout << -----------------------
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

----------------------- >> end tool stdout << ------------------------

---------------------- >> begin tool stderr << -----------------------

----------------------- >> end tool stderr << ------------------------
FAIL

======================================================================
FAIL: ( ok_planemo_gx ) > Test-1
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/tmper40nzja/galaxy-dev/test/functional/test_toolbox.py", line 98, in test_tool
    self.do_it(tool_version=tool_version, test_index=test_index)
  File "/tmp/tmper40nzja/galaxy-dev/test/functional/test_toolbox.py", line 35, in do_it
    verify_tool(tool_id, self.galaxy_interactor, resource_parameters=resource_parameters, test_index=test_index, tool_version=tool_version, register_job_data=register_job_data)
  File "/tmp/tmper40nzja/galaxy-dev/lib/galaxy/tool_util/verify/interactor.py", line 1127, in verify_tool
    raise e
  File "/tmp/tmper40nzja/galaxy-dev/lib/galaxy/tool_util/verify/interactor.py", line 1123, in verify_tool
    job_stdio = _verify_outputs(testdef, test_history, jobs, data_list, data_collection_list, galaxy_interactor, quiet=quiet)
  File "/tmp/tmper40nzja/galaxy-dev/lib/galaxy/tool_util/verify/interactor.py", line 1299, in _verify_outputs
    raise JobOutputsError(found_exceptions, job_stdio)
galaxy.tool_util.verify.interactor.JobOutputsError: Standard output of the job different than expected
Expected file size of 200+-0 found 403

As we can see, the stdout is duplicated, so the output's size is duplicated too.

But when the original stdout is bigger than 2**15, the duplicated stdout's size becomes bigger 2**16. And then, we have funnier things:

<tool id="bug_planemo_gx" name="bug_planemo_gx" version="1.0.0" profile="22.01">
    <requirements>
        <requirement type="package">python</requirement>
    </requirements>
    <command><![CDATA[python -c 'print("\n".join(map("{:05}".format, range(6000))))']]></command>
    <inputs><param type="data" format="txt" name="i"/></inputs>
    <outputs><data name="o" format="txt" /></outputs>
    <tests>
        <test>
            <param name="i" value="input.txt" />
            <assert_stdout>
                <has_size value="35999" /><!-- EDIT: this is an error, the actual size is 36000 -->
            </assert_stdout>
        </test>
    </tests>
</tool>

======================================================================
FAIL: ( bug_planemo_gx ) > Test-1
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/tmpmu71gnuy/galaxy-dev/test/functional/test_toolbox.py", line 98, in test_tool
    self.do_it(tool_version=tool_version, test_index=test_index)
  File "/tmp/tmpmu71gnuy/galaxy-dev/test/functional/test_toolbox.py", line 35, in do_it
    verify_tool(tool_id, self.galaxy_interactor, resource_parameters=resource_parameters, test_index=test_index, tool_version=tool_version, register_job_data=register_job_data)
  File "/tmp/tmpmu71gnuy/galaxy-dev/lib/galaxy/tool_util/verify/interactor.py", line 1127, in verify_tool
    raise e
  File "/tmp/tmpmu71gnuy/galaxy-dev/lib/galaxy/tool_util/verify/interactor.py", line 1123, in verify_tool
    job_stdio = _verify_outputs(testdef, test_history, jobs, data_list, data_collection_list, galaxy_interactor, quiet=quiet)
  File "/tmp/tmpmu71gnuy/galaxy-dev/lib/galaxy/tool_util/verify/interactor.py", line 1299, in _verify_outputs
    raise JobOutputsError(found_exceptions, job_stdio)
galaxy.tool_util.verify.interactor.JobOutputsError: Standard output of the job different than expected
Expected file size of 35999+-0 found 65537

Stdout is 65537 chars long,according to planemo (2**16+1) instead of either 36000 (the real size) or 72001 (duplicated size). This looks like the stdout is cut to never be bigger that 2**16, which is fine. But the test "has_size" sould only rely on original stdoutm instead of the cut one.

May 19 '22 14:05 Lain-inrae

The problem comes from the way galaxy tests a tool: galaxy.tool_util.verify.script._test_tool function uses the galaxy.tool_util.verify.interactor.verify_tool, which uses the interactor (so, the api) to retrieve the job's json data... This json's stdout attribute is always reduced in size, even with ?full=true.

So, this is galaxy issue.

EDIT: in fact, this comes from the way job's stdout/stderr are stored in DB... much bigger problem... EDIT2:

 DATABASE_MAX_STRING_SIZE = 32768

where does this limit come from..?? https://github.com/galaxyproject/galaxy/commit/4f93f0a04d0d19d4afa116f3bf33d75bd0b624a2 And why are Strings length limits applied to Text fields..?



class Job(Base, JobLike, UsesCreateAndUpdateTime, Dictifiable, RepresentById):
    """
    A job represents a request to run a tool given input datasets, tool
    parameters, and output datasets.
    """
    __tablename__ = 'job'

    ...

    runner_name = Column(String(255))
    job_stdout = Column(TEXT)
    job_stderr = Column(TEXT)
    tool_stdout = Column(TEXT)
    tool_stderr = Column(TEXT)
    exit_code = Column(Integer, nullable=True)
    traceback = Column(TEXT)

May 20 '22 08:05 Lain-inrae

Hi @Lain-inrae, thanks for looking into this!

Maybe we can add a line to the docs for assert_stderr and assert_stdout warning users that stderr and stdout get cut when stored in the database.

Jun 02 '22 08:06 simonbray