
RegTests with epoll failures

Open kostasrim opened this issue 1 year ago • 8 comments

See https://github.com/dragonflydb/dragonfly/actions/runs/12764061257/job/35575450779#step:6:3437

  • [ ] test_migration_timeout_on_sync
  • [ ] test_network_disconnect_during_migration
  • [ ] test_cluster_migration_huge_container
  • [ ] test_reply_count
  • [ ] test_big_containers
  • [ ] test_cluster_migration_while_seeding
  • [ ] test_denyoom_commands

test_big_containers would almost never finish because of a bug in the epoll socket (https://github.com/romange/helio/pull/363). After the fix, memory locally no longer exceeds 1.4x (down from 2x), but on the CI it still seems to fail at 2x. I also:

  • compared INFO MEMORY output from EPOLL and IOURING: the memory footprints were very similar for both variants
  • calculated total allocations and deallocations with the memory tracker for both EPOLL and IOURING: the results were very similar as well

I suspect something around the EPOLL socket operations "wastes" memory, but we need to narrow it down somehow.
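For anyone repeating the INFO MEMORY comparison, here is a minimal sketch of how the two outputs could be diffed field by field. It is not the script used above. It assumes the output of `redis-cli -p <port> INFO MEMORY` was saved as text for an epoll run and an io_uring run. The helper names (`parse_info`, `compare_memory`), the 10% threshold and the sample values are all hypothetical.

```python
def parse_info(text: str) -> dict[str, str]:
    """Parse Redis-style INFO output: "key:value" lines, "#" section headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def compare_memory(epoll_info: str, iouring_info: str, threshold: float = 0.10):
    """Return numeric fields whose relative difference exceeds `threshold`."""
    a, b = parse_info(epoll_info), parse_info(iouring_info)
    diffs = {}
    for key in sorted(a.keys() & b.keys()):
        try:
            x, y = float(a[key]), float(b[key])
        except ValueError:
            continue  # non-numeric field, e.g. a human-readable size
        base = max(abs(x), abs(y))
        if base and abs(x - y) / base > threshold:
            diffs[key] = (x, y)
    return diffs


# Made-up sample values, for illustration only.
epoll_sample = "# Memory\nused_memory:1000\nused_memory_rss:2000\n"
iouring_sample = "# Memory\nused_memory:1010\nused_memory_rss:1400\n"
print(compare_memory(epoll_sample, iouring_sample))
```

Fields that differ by less than the threshold are dropped, so only the counters worth a closer look remain.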

kostasrim avatar Jan 14 '25 09:01 kostasrim

Full list of skipped tests: https://github.com/dragonflydb/dragonfly/pull/4426/files

kostasrim avatar Jan 17 '25 11:01 kostasrim

Two more: test_cluster_migration_while_seeding https://github.com/dragonflydb/dragonfly/actions/runs/12875654869/job/35897346679 and https://github.com/dragonflydb/dragonfly/actions/runs/12840327559/job/35808838383

kostasrim avatar Jan 21 '25 07:01 kostasrim

https://github.com/dragonflydb/dragonfly/actions/runs/12873612142/job/35891503547 test_denyoom_commands

kostasrim avatar Jan 21 '25 07:01 kostasrim

#4812

kostasrim avatar Mar 24 '25 07:03 kostasrim

issues with test_take_over_counters

  • https://github.com/dragonflydb/dragonfly/actions/runs/14373421041/job/40300580227#step:6:1021
  • https://github.com/dragonflydb/dragonfly/actions/runs/14369307821/job/40289169729#step:6:2126
  • https://github.com/dragonflydb/dragonfly/actions/runs/14366893644/job/40281910082#step:6:2050
  • https://github.com/dragonflydb/dragonfly/actions/runs/14363918820/job/40272082908#step:6:1945

abhijat avatar Apr 10 '25 07:04 abhijat

  • test_replication_all https://github.com/dragonflydb/dragonfly/actions/runs/16138817366/job/45541251264
  • test_redis_replication_all https://github.com/dragonflydb/dragonfly/actions/runs/16154282647/job/45592990184

BagritsevichStepan avatar Jul 10 '25 17:07 BagritsevichStepan

test_replicate_disconnect_redis_cluster https://github.com/dragonflydb/dragonfly/actions/runs/16280701062/job/45969533984#step:6:357

Curiously, the segfault was in pytest, so it might not be an issue with dragonfly but with the pytest setup:

Fatal Python error: Segmentation fault

abhijat avatar Jul 15 '25 06:07 abhijat

https://github.com/dragonflydb/dragonfly/actions/runs/18256603537/job/51978806842

vyavdoshenko avatar Oct 05 '25 17:10 vyavdoshenko