fdb-joshua

Describe how to run the standard simulation tests with joshua

Open oleg68 opened this issue 3 years ago • 10 comments

There are lots of simulation tests of FoundationDB in the test subdirectory of the source code.

joshua requires a tarball for running a test. What should be in the tarball for running the standard simulation tests from the test subdirectory? Do I need to pack the test subdirectory into the tarball? Do I need to pack fdbserver and other executables into the tarball?

It would be nice to have an example of such a tarball described in the README.md.

oleg68 avatar Apr 08 '21 16:04 oleg68

Yes, that's something missing in the README.md.

FYI, the tarball can be generated when building foundationdb, e.g., ninja package_tests. The package is located at cmake_outputdir/packages/correctness-VERSION.tar.gz.

jzhou77 avatar Apr 08 '21 16:04 jzhou77
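
To see what a generated correctness tarball actually contains, one option is to list its top-level entries with Python's standard tarfile module. The sketch below is only an illustration: it builds a tiny in-memory tarball so that it runs anywhere, and the member names in it are assumptions made up for the demo (only joshua_test is suggested by the agent output later in this thread). To inspect a real package, open cmake_outputdir/packages/correctness-VERSION.tar.gz with tarfile.open instead.

```python
import io
import tarfile


def list_top_level(tar):
    """Return the sorted top-level entry names of an open tarfile."""
    names = set()
    for member in tar.getmembers():
        path = member.name
        # Normalise a leading "./" so "./bin/x" and "bin/x" look the same.
        if path.startswith("./"):
            path = path[2:]
        first = path.split("/", 1)[0]
        if first and first != ".":
            names.add(first)
    return sorted(names)


def build_demo_tarball():
    """Build a small in-memory .tar.gz; the member names are hypothetical."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in ("joshua_test", "bin/fdbserver", "tests/fast/Example.txt"):
            data = b"placeholder\n"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


with tarfile.open(fileobj=build_demo_tarball(), mode="r:gz") as demo:
    top_level = list_top_level(demo)

print(top_level)
```

Running the same list_top_level helper on a real package answers the question of whether the test files and the executables are already bundled.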

After

python3 -m joshua.joshua -C ../devops/clusters/joshua/fdb.cluster start --tarball '/home/oleg/work/fdb/FoundationDb/bld/packages/correctness-6.2.33.tar.gz'

something went wrong. The agent failed with

[oleg@oleg2 FdbJoshua]$ docker run --rm  --security-opt label=disable -v /home/oleg/work/fdb/devops/clusters/joshua:/opt/joshua -it foundationdb/joshua-agent:latest
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/rh/rh-python38/root/usr/local/lib64/python3.8/site-packages/joshua/joshua_agent.py", line 658, in agent
    retcode = run_ensemble(chosen_ensemble, save_on, work_dir=work_dir, timeout_command_timeout=timeout_command_timeout)
  File "/opt/rh/rh-python38/root/usr/local/lib64/python3.8/site-packages/joshua/joshua_agent.py", line 362, in run_ensemble
    for k, v in env_settings:
ValueError: not enough values to unpack (expected 2, got 1)

oleg68 avatar Apr 08 '21 17:04 oleg68

This seems to be a bug introduced by #3.

jzhou77 avatar Apr 08 '21 18:04 jzhou77

Can you try editing line 360 of /opt/rh/rh-python38/root/usr/local/lib64/python3.8/site-packages/joshua/joshua_agent.py to:

    if 'env' in properties and properties['env']:

I think this change can fix the bug.

jzhou77 avatar Apr 08 '21 18:04 jzhou77
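
For readers wondering why the extra truthiness check could matter, here is a minimal sketch of one way this kind of ValueError can arise. It is not the actual joshua_agent.py code: the 'A=1:B=2' format of the env property and both function names are assumptions chosen for illustration. With an empty string, splitting yields a single one-element item, and unpacking it into k, v fails exactly as in the traceback above.

```python
def parse_env_unguarded(properties):
    """Hypothetical parser that only checks for the key's presence."""
    env = {}
    if 'env' in properties:
        # ''.split(':') is [''], and ''.split('=', 1) is [''], so an empty
        # value produces one item with a single element.
        env_settings = [item.split('=', 1) for item in properties['env'].split(':')]
        for k, v in env_settings:
            env[k] = v
    return env


def parse_env_guarded(properties):
    """Same parser with the guard proposed in this thread."""
    env = {}
    if 'env' in properties and properties['env']:
        env_settings = [item.split('=', 1) for item in properties['env'].split(':')]
        for k, v in env_settings:
            env[k] = v
    return env


try:
    parse_env_unguarded({'env': ''})
    unguarded_error = None
except ValueError as exc:
    unguarded_error = str(exc)

print(unguarded_error)
print(parse_env_guarded({'env': ''}))
print(parse_env_guarded({'env': 'A=1:B=2'}))
```

Under these assumptions the guarded version simply skips an empty env value, which is consistent with the agent no longer crashing once the property is checked for emptiness.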

I couldn't test the change you proposed, but I tested #12. The agent stopped crashing:

[oleg@oleg2 FdbJoshua]$ docker run --rm  --security-opt label=disable -v /home/oleg/work/fdb/devops/clusters/joshua:/opt/joshua -it foundationdb/joshua-agent:latest
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
Unpacking/var/joshua/ensembles/20210409-151019-oleg-618f4164d06d707d
20210409-151019-oleg-618f4164d06d707d6151521263419774359./joshua_test
103

20210409-151019-oleg-618f4164d06d707d4568811136083724774./joshua_test
103

20210409-151019-oleg-618f4164d06d707d5441746154163069208./joshua_test
103

20210409-151019-oleg-618f4164d06d707d3795326571262142719./joshua_test
103

20210409-151019-oleg-618f4164d06d707d2012513511425241517./joshua_test
103

20210409-151019-oleg-618f4164d06d707d4496758372103192668./joshua_test
103

20210409-151019-oleg-618f4164d06d707d2537493284973285639./joshua_test
103

20210409-151019-oleg-618f4164d06d707d6154259346310925684./joshua_test
103

20210409-151019-oleg-618f4164d06d707d1768467256300473424./joshua_test
103

20210409-151019-oleg-618f4164d06d707d6198835866733904722./joshua_test
103

20210409-151019-oleg-618f4164d06d707d3421236884922169869./joshua_test
<jobstopped>
20210409-151019-oleg-618f4164d06d707d805089129740123916./joshua_test
<jobstopped>
20210409-151019-oleg-618f4164d06d707d2605797536880722932./joshua_test
<jobstopped>
removing 20210409-151019-oleg-618f4164d06d707d /var/joshua/ensembles/20210409-151019-oleg-618f4164d06d707d

But I can't see their results:

[oleg@oleg2 FdbJoshua]$ python3 -m joshua.joshua -C ../devops/clusters/joshua/fdb.cluster tail
No active ensembles

oleg68 avatar Apr 09 '21 15:04 oleg68

The tail command looks for the active ensemble or the given one, so in order to see your results, use:

python3 -m joshua.joshua -C ../devops/clusters/joshua/fdb.cluster tail 20210409-151019-oleg-618f4164d06d707d

You can optionally give --errors --xml arguments.

jzhou77 avatar Apr 09 '21 16:04 jzhou77
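
When checking several ensembles, it can help to assemble the tail invocation programmatically. The helper below is a small sketch that only builds the argument list from the pieces mentioned in this thread (the -C cluster file, the tail subcommand, an ensemble ID, and the optional --errors and --xml flags). The function name tail_command is hypothetical, and placing the flags after the ensemble ID is an assumption about the command line, not something confirmed here.

```python
import shlex


def tail_command(cluster_file, ensemble_id=None, errors=False, xml=False):
    """Build the argv list for a joshua 'tail' call (hypothetical helper)."""
    argv = ["python3", "-m", "joshua.joshua", "-C", cluster_file, "tail"]
    if ensemble_id:
        # Without an ID, tail looks for an active ensemble.
        argv.append(ensemble_id)
    if errors:
        argv.append("--errors")
    if xml:
        argv.append("--xml")
    return argv


cmd = tail_command(
    "../devops/clusters/joshua/fdb.cluster",
    ensemble_id="20210409-151019-oleg-618f4164d06d707d",
    errors=True,
)
# shlex.join (Python 3.8+) renders the list as a copy-pasteable shell line.
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where joshua is installed.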

  1. Is there a capability of displaying a list of tests run in the past?
  2. Is there a forum to discuss joshua?

oleg68 avatar Apr 12 '21 06:04 oleg68

  1. Is there a capability of displaying a list of tests run in the past?

Yes. Use python3 -m joshua.joshua list --stopped.

  2. Is there a forum to discuss joshua?

I think https://forums.foundationdb.org/ could be a good place.

jzhou77 avatar Apr 12 '21 16:04 jzhou77

I started a topic at https://forums.foundationdb.org/t/simulation-testing-of-foundationdb/2654 to discuss how to run tests.

It seems some extra info should be added to README.md.

oleg68 avatar Apr 14 '21 16:04 oleg68

I would love a full example that includes:

  • how to start Joshua for testing purpose on a single machine,
  • how to use Joshua.

My use case is to run some simulations on the Rust client using the bindingTester.

PierreZ avatar Oct 05 '21 09:10 PierreZ