librpm.rs icon indicating copy to clipboard operation
librpm.rs copied to clipboard

Searching for two packages in succession makes the program hang

Open mkpankov opened this issue 5 years ago • 5 comments

Add a new find call to the end of db_find_test:

fn db_find_test() {
    configure();

    let mut matches = Index::Name.find(PACKAGE_NAME);

    if let Some(package) = matches.next() {
        assert_eq!(package.name, PACKAGE_NAME);
        assert_eq!(package.summary, PACKAGE_SUMMARY);
        assert_eq!(package.license.as_str(), PACKAGE_LICENSE);
        assert!(matches.next().is_none(), "expected one result, got more!");
    } else {
        if librpm::db::installed_packages().count() == 0 {
            eprintln!("*** warning: No RPMs installed! Tests skipped!")
        } else {
            panic!("some RPMs installed, but not `rpm-devel`?!");
        }
    }

    // Add this line
    let mut matches = Index::Name.find("glibc-common");
}

Log:

[root@3d483c5561e3 work]# cargo test --no-default-features --target-dir docker-target -- 
   Compiling librpm v0.1.1 (/work)
warning: unused variable: `matches`
  --> tests/integration.rs:40:13
   |
40 |     let mut matches = Index::Name.find("glibc-common");
   |             ^^^^^^^ help: consider prefixing with an underscore: `_matches`
   |
   = note: `#[warn(unused_variables)]` on by default

warning: variable does not need to be mutable
  --> tests/integration.rs:40:9
   |
40 |     let mut matches = Index::Name.find("glibc-common");
   |         ----^^^^^^^
   |         |
   |         help: remove this `mut`
   |
   = note: `#[warn(unused_mut)]` on by default

    Finished dev [unoptimized + debuginfo] target(s) in 0.55s
     Running docker-target/debug/deps/librpm-04d878a85d1dcd2e

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

     Running docker-target/debug/deps/integration-9c17917f995b4ddf

running 1 test
BDB2053 Freeing read locks for locker 0x225: 4201/139700743603968
BDB2053 Freeing read locks for locker 0x226: 4201/139700743603968
BDB2053 Freeing read locks for locker 0x227: 4201/139700743603968
^C
[root@3d483c5561e3 work]# ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.0  11836  2376 pts/0    Ss   Nov22   0:00 /bin/bash
root        4247  0.5  0.1 145596 10096 pts/0    Sl   06:56   0:00 /work/docker-target/debug/deps/integration-9c17917f995b4ddf
root        4249  0.0  0.0  51760  3464 pts/0    R+   06:57   0:00 ps aux
[root@3d483c5561e3 work]# kill -9 4247

mkpankov avatar Nov 25 '19 07:11 mkpankov

The issue is this:

impl MatchIterator {
    /// Create a new `MatchIterator` for the current RPM database, searching
    /// by the (optionally) given search key.
    pub(crate) fn new(tag: Tag, key_opt: Option<&str>) -> Self {
        // Next line is the issue
        let mut txn = GlobalTS::create();
        let next_item = None;

When I step over let mut txn = GlobalTS::create(); in GDB, program hangs.

When I abort it, backtrace is this:

#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00005609499d9693 in wait () at src/libstd/sys/unix/condvar.rs:71
#2  wait () at src/libstd/sys_common/condvar.rs:41
#3  wait<()> ()	at src/libstd/sync/condvar.rs:204
#4  std::thread::park::h26069396648f9b17 () at src/libstd/thread/mod.rs:911
#5  0x00005609499df9b2 in std::sync::mpsc::blocking::WaitToken::wait::h27bf0c2b77166d78 () at src/libstd/sync/mpsc/blocking.rs:71
#6  0x000056094998ea52 in std::sync::mpsc::shared::Packet$LT$T$GT$::recv::h518641c3bb24750b () at /rustc/4560ea788cb760f0a34127156c78e2552949f734/src/libstd/sync/mpsc/shared.rs:234
#7  0x000056094998db3c in std::sync::mpsc::Receiver$LT$T$GT$::recv::h8c771889fde2897a () at /rustc/4560ea788cb760f0a34127156c78e2552949f734/src/libstd/sync/mpsc/mod.rs:1205
#8  0x00005609499a9271 in run_tests<closure-2> () at src/libtest/lib.rs:1133
#9  test::run_tests_console::h2ef88807f9bb22d7 () at src/libtest/lib.rs:957
#10 0x000056094999fd62 in test::test_main::h248c73a93c9c81cf ()	at src/libtest/lib.rs:295
#11 0x00005609499a057d in test::test_main_static::h927a79ac80550204 () at src/libtest/lib.rs:329
#12 0x0000560949989b26 in integration::main::ha57f13047a0e8a7f ()
#13 0x000056094998a9b0 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::hf2ce7b33b92f543a () at /rustc/4560ea788cb760f0a34127156c78e2552949f734/src/libstd/rt.rs:64
#14 0x00005609499e2d43 in {{closure}} () at src/libstd/rt.rs:49
#15 std::panicking::try::do_call::h7c2a8488f72db90c () at src/libstd/panicking.rs:292

mkpankov avatar Nov 25 '19 07:11 mkpankov

To be a little more specific, each MatchIterator holds a reference to the global lock, so if you try to create a new MatchIterator while another one is already live, boom you now have a deadlock. Mutex::lock() will block the thread and there is nothing to release the other lock.

https://doc.rust-lang.org/std/sync/struct.Mutex.html#method.lock

@tarcieri Do you have a preferred method of addressing this? Perhaps parking_lot::ReentrantMutex?

https://docs.rs/parking_lot/0.11.2/parking_lot/type.ReentrantMutex.html

dralley avatar Nov 16 '21 05:11 dralley

@dralley sounds like a plausible solution

tarcieri avatar Nov 16 '21 13:11 tarcieri

It works btw. I'm not sure if it's slightly overkill though, would a RwLock be sufficient?

dralley avatar Dec 26 '21 06:12 dralley

I haven't seen the thread safety of librpm documented well enough to know if an RwLock is sound. In fact, I'm not entirely certain the current Mutex is sound. Previously I had implemented database access as being !Send and relaxed that based on some discussions with RPM devs.

Perhaps you could try to dig up some authoritative documentation on librpm thread safety? An RwLock-based approach might be worth exploring if it is actually sound.

tarcieri avatar Dec 27 '21 15:12 tarcieri