hardlinkpy

Unit tests failing

Open akaihola opened this issue 7 years ago • 5 comments

Commit 7b66fa1 (ping @chlunde) broke the unit test suite:

$ python setup.py test
running test
running egg_info
writing hardlinkpy.egg-info/PKG-INFO
writing top-level names to hardlinkpy.egg-info/top_level.txt
writing dependency_links to hardlinkpy.egg-info/dependency_links.txt
writing entry points to hardlinkpy.egg-info/entry_points.txt
reading manifest file 'hardlinkpy.egg-info/SOURCES.txt'
writing manifest file 'hardlinkpy.egg-info/SOURCES.txt'
running build_ext
test_hardlink_tree (tests.TestHappy) ... ok
test_hardlink_tree_dryrun (tests.TestHappy) ... ok
test_hardlink_tree_exclude (tests.TestHappy) ... ok
test_hardlink_tree_filenames_equal (tests.TestHappy) ... FAIL
test_hardlink_tree_timestamp_ignore (tests.TestHappy) ... ok

======================================================================
FAIL: test_hardlink_tree_filenames_equal (tests.TestHappy)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/mnt/root/home/kaiant/repos/python/hardlinkpy/tests.py", line 105, in test_hardlink_tree_filenames_equal
    self.assertEqual(get_inode("dir1/name1.ext"), get_inode("dir2/name1.ext"))
AssertionError: 771490 != 771491

----------------------------------------------------------------------
Ran 5 tests in 0.008s

FAILED (failures=1)

akaihola avatar May 07 '18 19:05 akaihola

Which OS, filesystem, and Python version is this? I can't reproduce it.

chlunde avatar May 08 '18 14:05 chlunde

  • Fedora 26
  • btrfs
  • Python 2.7.14

akaihola avatar May 09 '18 05:05 akaihola

Yes, when the "filenames-equal" option is enabled the algorithm is order dependent, so the tests can fail depending on the order in which the OS returns filenames when iterating over a directory, if some of those files are already hardlinked.

For example, the dir1/name1.ext and dir1/link files are hardlinked before the test begins. If the files are returned in the order "dir2/name1.ext", "dir1/link", "dir1/name1.ext", then the test fails because "dir1/name1.ext" won't be linked to "dir2/name1.ext", as it is already linked to "dir1/link". However, if the order is "dir1/name1.ext", "dir1/link", "dir2/name1.ext", then "dir2/name1.ext" will be linked to "dir1/name1.ext", because "dir2/name1.ext" is not already linked to a previously seen file (such as "dir1/link").
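The two orderings above can be replayed with a toy model. This is only a sketch of the behaviour as described in this comment, not hardlink.py's actual code: `link_pass` and `initial` are hypothetical names, inodes are simulated with a plain dict, and all three files are assumed to have identical content.

```python
import os

def link_pass(order, inodes):
    """Toy model of one pass with "filenames-equal" enabled.

    `inodes` maps path -> simulated inode number and is updated in place.
    """
    seen = []
    for path in order:
        # Early exit: the file already shares an inode with a file seen before.
        if any(inodes[path] == inodes[other] for other in seen):
            continue
        for other in seen:
            # filenames-equal: only link files that have the same basename.
            if os.path.basename(path) == os.path.basename(other):
                inodes[path] = inodes[other]  # simulate hard-linking
                break
        else:
            # Nothing to link to, so remember this file for later candidates.
            seen.append(path)
    return inodes

def initial():
    # dir1/name1.ext and dir1/link start out hardlinked (same inode).
    return {"dir1/name1.ext": 1, "dir1/link": 1, "dir2/name1.ext": 2}

bad = link_pass(["dir2/name1.ext", "dir1/link", "dir1/name1.ext"], initial())
good = link_pass(["dir1/name1.ext", "dir1/link", "dir2/name1.ext"], initial())

print(bad["dir1/name1.ext"] == bad["dir2/name1.ext"])    # False: test would fail
print(good["dir1/name1.ext"] == good["dir2/name1.ext"])  # True: test would pass
```

In the failing order, "dir1/name1.ext" hits the early exit because it already shares an inode with "dir1/link", so it is never compared against "dir2/name1.ext".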

It should be possible to ensure that when the "filenames-equal" flag is set, the program doesn't abort the search early when it finds a file that it is already linked to (unless perhaps it has the same basename). I'll work on a solution.

chadnetzer avatar Jun 26 '18 21:06 chadnetzer

Would it make sense to normalize directory content iteration to always happen in alphabetically sorted order?
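A minimal sketch of what that could look like, assuming the traversal is built on `os.walk` (hardlink.py's own traversal may be structured differently; `walk_sorted` is a hypothetical helper name):

```python
import os

def walk_sorted(root):
    """Yield file paths under `root` in a deterministic, alphabetical order."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting dirnames in place makes os.walk descend in sorted order.
        dirnames.sort()
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)
```

This only makes the visiting order reproducible across filesystems; it does not change which files the linking step considers equal.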

akaihola avatar Jun 29 '18 18:06 akaihola

@akaihola Well, sorting directory iteration might at least make the tests work consistently, but there are other problems with the existing algorithm (which I discussed in my other reply) that mean that not all identical files remain hardlinked together. I think if we solve that issue (which is alluded to in the hardlink.py TODO), it will also fix the tests.

chadnetzer avatar Jun 29 '18 19:06 chadnetzer