Collect all intermittently failing tests

Open rkeene opened this issue 6 years ago • 71 comments

If you have a test that fails intermittently, post it here so we can investigate these failures systematically.

rkeene avatar Aug 29 '18 19:08 rkeene

On Linux/x86_64 from Travis-CI:

[ RUN      ] rpc.wallet_destroy
core_test: /workspace/rai/secure/blockstore.cpp:102: rai::mdb_iterator<T, U>::mdb_iterator(MDB_txn*, MDB_dbi, const MDB_val&, rai::epoch) [with T = rai::uint256_union; U = rai::wallet_value; MDB_txn = MDB_txn; MDB_dbi = unsigned int; MDB_val = MDB_val]: Assertion `status == 0' failed.

FIXED - the original error is not reproducible on develop, but there is a TSAN race, which is fixed by https://github.com/nanocurrency/nano-node/pull/2615

rkeene avatar Aug 29 '18 19:08 rkeene

On Linux/x86_64:

[ RUN      ] node.auto_bootstrap
/home/rkeene/devel/raiblocks/rai/core_test/node.cpp:227: Failure
Value of: system.poll ()
  Actual: Deadline expired
Expected:

FIXED

rkeene avatar Aug 29 '18 22:08 rkeene

On Travis (trusty):

[ RUN      ] wallet.create_open_receive
/workspace/rai/qt_test/qt.cpp:400: Failure
Value of: json1.empty ()
  Actual: true
Expected: false

FIXED

cryptocode avatar Aug 31 '18 13:08 cryptocode

~~Failing on master:~~ Fixed by @SergiySW

[ RUN      ] rpc.payment_begin_end
/workspace/rai/core_test/rpc.cpp:1162: Failure
Value of: ec
  Actual: Deadline expired
Expected: 
[  FAILED  ] rpc.payment_begin_end (10080 ms)

The test is quick when it succeeds; when it fails, it seems to be stuck forever (all threads are waiting on some condition variable).
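One way to turn a hang like this into a diagnosable failure is to wait on the condition variable with both a predicate and a timeout. The sketch below is only an illustration under that assumption, not code from the node; toy_signal, notify and wait_for_ready are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper for illustration; not part of nano-node.
class toy_signal
{
public:
	// Sets the flag while holding the mutex, then wakes all waiters.
	void notify ()
	{
		{
			std::lock_guard<std::mutex> guard (mutex);
			ready = true;
		}
		condition.notify_all ();
	}

	// Returns true if the flag was set before the timeout elapsed.
	// The predicate is checked before blocking and after every wake-up,
	// so an earlier notify () or a spurious wake-up is handled correctly.
	bool wait_for_ready (std::chrono::milliseconds timeout)
	{
		std::unique_lock<std::mutex> lock (mutex);
		return condition.wait_for (lock, timeout, [this] { return ready; });
	}

private:
	std::mutex mutex;
	std::condition_variable condition;
	bool ready{ false };
};
```

A waiter that gets false back can log which condition it was waiting for, which is usually more helpful than a test-level deadline expiring.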

cryptocode avatar Sep 04 '18 10:09 cryptocode

[ RUN      ] node.unlock_search
core_test: /workspace/rai/node/wallet.cpp:1385:
                     void rai::wallets::foreach_representative(
                         const rai::transaction&,
                         const std::function<void(
                             const rai::uint256_union&,
                             const rai::raw_key&
                         )>&
                     ): Assertion `!error' failed.

FIXED by https://github.com/nanocurrency/raiblocks/commit/b1d05d0ce83e6aa2950f8ecfa1a3c266cfcde2a0

rkeene avatar Sep 23 '18 19:09 rkeene

EDIT: Fixed by https://github.com/nanocurrency/raiblocks/pull/1532

On local Linux box with some deadlock detection code that may exacerbate the race:

/var/hdd/programming/cpp/raiblocks/rai/core_test/network.cpp:92: Failure
Value of: peers1.size ()
  Actual: 0
Expected: 1
[  FAILED  ] network.send_node_id_handshake (238 ms)

PlasmaPower avatar Sep 27 '18 04:09 PlasmaPower

On 051df7be9b4a9f755e6a94569da1f430f9982ca0:

[ RUN      ] bulk.offline_send
/var/hdd/programming/cpp/raiblocks/rai/core_test/network.cpp:826: Failure
Value of: system.poll ()
  Actual: Deadline expired
Expected:
[  FAILED  ] bulk.offline_send (20410 ms)

FIXED

PlasmaPower avatar Oct 04 '18 22:10 PlasmaPower

On master 37de74f795e32daa02150da49b5f0140211a8c44

[ RUN      ] rpc.online_reps
/home/lee/programming/cpp/raiblocks2/rai/core_test/rpc.cpp:3662: Failure
Expected: (nullptr) != (receive), actual: 8-byte object <00-00 00-00 00-00 00-00> vs 16-byte object <00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00>
[  FAILED  ] rpc.online_reps (238 ms)

EDIT: This test now fails with:

[ RUN      ] rpc.online_reps
nano/rpc_test/rpc.cpp:6121: Failure
Value of: system.poll ()
  Actual: Deadline expired
Expected:

PlasmaPower avatar Nov 06 '18 23:11 PlasmaPower

On Linux/x86_64 from Travis-CI:

[ RUN      ] rpc.wallet_destroy
core_test: /workspace/rai/secure/blockstore.cpp:102: rai::mdb_iterator<T, U>::mdb_iterator(MDB_txn*, MDB_dbi, const MDB_val&, rai::epoch) [with T = rai::uint256_union; U = rai::wallet_value; MDB_txn = MDB_txn; MDB_dbi = unsigned int; MDB_val = MDB_val]: Assertion `status == 0' failed.

This one happens because wallet_destroy sets handle = 0 after do_wallet_actions has already done the wallet->live () check, so it appears to be a TOCTTOU (time-of-check-to-time-of-use) race.
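The shape of that race can be sketched in isolation. The following is a minimal illustration, not the actual wallet code: toy_wallet, its handle field and use_if_live are hypothetical stand-ins. A separate live () check followed by a later use leaves a window in which destroy () can run, whereas checking and using under a single lock closes that window.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a wallet; not the nano-node class.
struct toy_wallet
{
	std::mutex mutex;
	int handle{ 1 }; // 0 means the wallet has been destroyed

	// Check only. A caller that tests live () and then uses the handle in a
	// separate step can be interleaved with destroy () in between.
	bool live ()
	{
		std::lock_guard<std::mutex> guard (mutex);
		return handle != 0;
	}

	void destroy ()
	{
		std::lock_guard<std::mutex> guard (mutex);
		handle = 0;
	}

	// Check and use inside one critical section, so destroy () cannot run
	// between the two. Returns false and leaves `out` untouched if destroyed.
	bool use_if_live (int & out)
	{
		std::lock_guard<std::mutex> guard (mutex);
		if (handle == 0)
		{
			return false;
		}
		out = handle;
		return true;
	}
};
```

Whether the real fix should hold a lock across the check and the use, or re-validate the handle at the point of use, depends on the surrounding code; the sketch only shows the first option.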

cryptocode avatar Dec 09 '18 01:12 cryptocode

[ RUN      ] node.fork_publish
/home/rkeene/devel/raiblocks/nano/core_test/node.cpp:643: Failure
Value of: node1.active.roots.size ()
  Actual: 0
Expected: 1
[  FAILED  ] node.fork_publish (668 ms)

and

[ RUN      ] node.fork_multi_flip
/home/rkeene/devel/raiblocks/nano/core_test/node.cpp:788: Failure
Value of: node2.active.roots.size ()
  Actual: 1
Expected: 2
[  FAILED  ] node.fork_multi_flip (1820 ms)

rkeene avatar Dec 20 '18 22:12 rkeene

[ RUN      ] wallet.select_account
/workspace/rai/qt_test/qt.cpp:96: Failure
Value of: key4
  Actual: 32-byte object <52-E8 58-72 7C-65 C8-2F DE-A5 30-C8 EC-DF BC-B9 BA-14 06-D0 82-64 06-20 5C-03 92-67 06-DB 0F-F1>
Expected: key2
Which is: 32-byte object <2C-29 08-C1 5F-B9 6C-C6 6F-4E 2E-AB A6-3F 5F-C7 E2-5E CE-53 AC-0E 3C-09 01-A4 23-4E 5B-07 4B-AD>
[  FAILED  ] wallet.select_account (55 ms)

FIXED

cryptocode avatar Dec 22 '18 02:12 cryptocode

node.unlock_search fails sporadically. When debugging it, this sequence happens:

vote_generator::send -> foreach_representative -> wallet.store.fetch

...where fetch returns an error here:

	if (!result)
	{
		nano::public_key compare (nano::pub_key (prv.data));
		std::cout << pub.to_string() << " == " << compare.to_string() << std::endl;
		if (!(pub == compare))
		{
			result = true;  // <<======== oops
		}
	}

FIXED (https://github.com/nanocurrency/nano-node/issues/1121#issuecomment-450866102)

cryptocode avatar Jan 02 '19 13:01 cryptocode

@cryptocode this was fixed in 17.0, but not in master: https://github.com/nanocurrency/raiblocks/pull/1409/commits/b1d05d0ce83e6aa2950f8ecfa1a3c266cfcde2a0

SergiySW avatar Jan 02 '19 13:01 SergiySW

Rare node.vote_republish test failure: node::process_confirmed () can start before the fork block has been rolled back.

[ RUN      ] node.vote_republish
/root/rai_test/nano/core_test/node.cpp:1835: Failure
Value of: system.poll ()
  Actual: Deadline expired
Expected:
[  FAILED  ] node.vote_republish (5307 ms)

FIXED

SergiySW avatar Jan 29 '19 11:01 SergiySW

[ RUN      ] wallet.no_work
Expected equality of these values:
  0
  cached_work
    Which is: 4545208490332381752

Happened once, unable to reproduce

FIXED

cryptocode avatar Mar 21 '19 09:03 cryptocode

rpc.online_reps fails about 50% of the time for me (on MSVC at least)

[ RUN      ] rpc.online_reps
c:\users\wesley\documents\raiblocks\nano\core_test\rpc.cpp(4246): error: Value of: weight2
 Actual: "340281366920938463463374607431768211455"
Expected: system.nodes[1]->weight (nano::test_genesis_key.pub).convert_to<std::string> ()
Which is: "340282366920938463463374607431768211455"

FIXED #1848

wezrule avatar Mar 21 '19 10:03 wezrule

[ RUN      ] work.eco_pow
work_pool.cpp:185: Failure
Expected: (future1.get ()) < (future2.get ()), actual: 8-byte object <A4-A5 57-00 00-00 00-00> vs 8-byte object <9E-2C 34-00 00-00 00-00>

FIXED

cryptocode avatar Apr 07 '19 01:04 cryptocode

@cryptocode This should be fixed in #1882; can you give it a try? I couldn't actually reproduce it on my Windows machine, but I do know the test could have an issue (I had a comment in the test, but didn't really explain why).

FIXED

wezrule avatar Apr 07 '19 20:04 wezrule

[ RUN      ] node.broadcast_elected
Assertion (!error) failed c:\users\wesley\documents\raiblocks\nano\node\node.cpp:3114

Failing on this release_assert in nano::active_transactions::add:

auto error (nano::work_validate (*block_a, &difficulty));
release_assert (!error);

wezrule avatar Apr 09 '19 13:04 wezrule

[ RUN      ] conflicts.adjusted_difficulty
conflicts.cpp:247: Failure
Expected equality of these values:
  11
  nodes.active.size ()
    Which is: 10

Fails consistently when running TSAN on Ubuntu/GCC

FIXED (no longer able to reproduce locally, feel free to re-check @wezrule)

wezrule avatar Apr 11 '19 09:04 wezrule

[ RUN      ] network.send_node_id_handshake
nano/core_test/network:247: Failure
Expected equality of these values:
  0
  system.nodes[0]->network.size ()
    Which is: 1

Fails consistently when running TSAN on Ubuntu/GCC

FIXED in https://github.com/nanocurrency/nano-node/pull/2612

wezrule avatar Apr 11 '19 09:04 wezrule

[ RUN      ] receivable_processor.send_with_receive
/workspace/nano/core_test/network.cpp:337: Failure
Expected equality of these values:
  system.nodes[0]->config.receive_minimum.number ()
    Which is: 1000000000000000000000000
  system.nodes[1]->balance (key2.pub)
    Which is: 0

FIXED in #1940

wezrule avatar Apr 21 '19 10:04 wezrule

[ RUN      ] rpc.work_peer_bad
d:\agent\_work\2\s\src\vctools\crt\crtw32\stdcpp\thr\mutex.c(51): mutex destroyed while busy

Stack trace

wezrule avatar Apr 23 '19 09:04 wezrule

[ RUN      ] network.replace_port
c:\users\wesley\documents\raiblocks\nano\core_test\network.cpp(2037): error: Expected equality of these values:
  system.nodes[0]->network.udp_channels.size ()
    Which is: 1
  0
[  FAILED  ] network.replace_port (129 ms)

FIXED in #2630

wezrule avatar Apr 24 '19 07:04 wezrule

[ RUN      ] node.vote_by_hash_republish
/workspace/nano/core_test/node.cpp:2196: Failure
Value of: system.poll ()
  Actual: Deadline expired

wezrule avatar May 09 '19 16:05 wezrule

[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from active_transactions
[ RUN      ] active_transactions.adjusted_difficulty_priority
nano/core_test/active_transactions.cpp:143: Failure
Expected: (i->adjusted_difficulty) < (last_adjusted), actual: 18446724124103881201 vs 18446724124103881201
[  FAILED  ] active_transactions.adjusted_difficulty_priority (3781 ms)
[----------] 1 test from active_transactions (3781 ms total)

FIXED

cryptocode avatar May 26 '19 14:05 cryptocode

[ RUN      ] ledger.unchecked_epoch_invalid
core_test/ledger.cpp:2580: Failure
Value of: node1.active.empty ()
  Actual: false
Expected: true

FIXED

cryptocode avatar Jun 02 '19 19:06 cryptocode

[ RUN      ] active_transactions.prioritize_chains
nano/core_test/active_transactions.cpp:211: Failure
Value of: system.poll ()
  Actual: Deadline expired
Expected:

FIXED

cryptocode avatar Jun 07 '19 13:06 cryptocode

confirmation_height.modified_chain sometimes gets stuck (no timeout). Presumably it's one of the while (!node->write_database_queue.contains (nano::writer::confirmation_height)) loops, @wezrule?

FIXED (was reproducible, but now survives thousands of iterations, so seems fixed by recent changes)
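A loop of that shape can be given a bound so that a stuck condition shows up as a test failure rather than a hang. This is a minimal sketch under that assumption; wait_until_true is a hypothetical helper, not something from the test suite.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: polls `condition` until it holds or `timeout` elapses.
// Returns true if the condition became true, false on timeout.
bool wait_until_true (std::function<bool ()> const & condition, std::chrono::milliseconds timeout)
{
	auto const deadline = std::chrono::steady_clock::now () + timeout;
	while (!condition ())
	{
		if (std::chrono::steady_clock::now () >= deadline)
		{
			return false; // Report the timeout instead of spinning forever
		}
		// Short sleep so the polling loop does not monopolize a core
		std::this_thread::sleep_for (std::chrono::milliseconds (1));
	}
	return true;
}
```

In a test, the loop quoted above could then be written as ASSERT_TRUE (wait_until_true ([&] { return node->write_database_queue.contains (nano::writer::confirmation_height); }, timeout)), with a timeout chosen by the test author.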

cryptocode avatar Jun 28 '19 23:06 cryptocode

Occasionally node.bootstrap_confirm_frontiers times out waiting for the election to start, in:

while (node1->active.empty ())
{
	ASSERT_NO_ERROR (system0.poll ());
	ASSERT_NO_ERROR (system1.poll ());
}

edit: No longer failing

guilhermelawless avatar Aug 26 '19 19:08 guilhermelawless