qBittorrent
qBittorrent copied to clipboard
[API] inconsistencies with v2/hybrid torrents (TorrentID)
qBittorrent & operating system versions
All versions, All OS. Reproduced freshly on master branch last night.
What is the problem?
There seems to be some problems with API dealing with v2/hybrid torrent. Basically, i believe most operations only work with truncated v2 hash with v2/hybrid torrents, but fail with the full v2 hash. They also do not work with v1 hash for hybrid torrents.
List of operations supporting all hash forms:
- [ ] properties
- [ ] delete
- [ ] info
- ... to be completed
Why that's a problem
I'm writing software that interfaces with qBittorrent. Only dealing with "TorrentID" forces me to duplicate qBittorrent torrent state across my programs. That's not really a problem for long-lived operations (although it's a waste of resources), but it is a great loss for short-lived operations like a CLI to get information about a torrent by infohash... which has to download the entire torrent list to display info about a single torrent.
Steps to reproduce
#! /usr/bin/env bash
tempDir="$(mktemp -d)"
api="http://localhost:8080/api/v2"
magnet="magnet:?xt=urn:btih:631a31dd0a46257d5078c0dee4e66e26f73e42ac&xt=urn:btmh:1220d8dd32ac93357c368556af3ac1d95c9d76bd0dff6fa9833ecdac3d53134efabb&dn=bittorrent-v1-v2-hybrid-test"
magnet2="magnet:?xt=urn:btmh:1220caf1e1c30e81cb361b9ee167c4aa64228a7fa4fa9f6105232b28ad099f3a302e&dn=bittorrent-v2-test"
if ! curl --silent -c "$tempDir"/cookies.jar --data 'username=admin&password=adminadmin' "$api"/auth/login 2>&1 | grep "Ok" 2>&1 >/dev/null; then
echo "Failed to login"
exit 1
fi
echo "Logged in."
# $1 magnet
# $2 name
addTorrent() {
if curl -b "$tempDir"/cookies.jar "$api"/torrents/info 2>/dev/null | grep "$1" 2>&1 >/dev/null; then
echo "$2 torrent already here"
else
echo "Adding $2 torrent"
if curl --silent -b "$tempDir"/cookies.jar -F "urls=$3" -F "paused=true" "$api"/torrents/add | grep "Fails" 2>&1 >/dev/null; then
echo "Adding failed"
return 1
fi
fi
}
# $1 check for this value
# $2 name
# $3 make sure this value is not here
checkTorrent() {
OUTPUT="$(curl --silent -b "$tempDir"/cookies.jar --data "hashes=$1" "$api"/torrents/info)"
if [ ! $? -eq 0 ]; then
echo "Checking torrent failed"
return 1
fi
if echo "$OUTPUT" | grep "$1" 2>&1 >/dev/null; then
if echo "$OUTPUT" | grep "daf1e1c30e81cb361b9ee167c4aa64228a7fa4fa9f6105232b28ad099f3adddd" 2>&1 >/dev/null; then
echo "$OUTPUT"
echo "FAIL $2 (bogus data)"
else
echo OK "$2"
return 0
fi
else
echo "$OUTPUT"
echo FAIL "$2"
return 1
fi
}
addTorrent "d8dd32ac93357c368556af3ac1d95c9d76bd0dff6fa9833ecdac3d53134efabb" "Hybrid" "magnet:?xt=urn:btih:631a31dd0a46257d5078c0dee4e66e26f73e42ac&xt=urn:btmh:1220d8dd32ac93357c368556af3ac1d95c9d76bd0dff6fa9833ecdac3d53134efabb&dn=bittorrent-v1-v2-hybrid-test" || exit 1
addTorrent "caf1e1c30e81cb361b9ee167c4aa64228a7fa4fa9f6105232b28ad099f3a302e" "V2" "magnet:?xt=urn:btmh:1220caf1e1c30e81cb361b9ee167c4aa64228a7fa4fa9f6105232b28ad099f3a302e&dn=bittorrent-v2-test" || exit 1
addTorrent "daf1e1c30e81cb361b9ee167c4aa64228a7fa4fa9f6105232b28ad099f3adddd" "bogus" "magnet:?xt=urn:btmh:1220daf1e1c30e81cb361b9ee167c4aa64228a7fa4fa9f6105232b28ad099f3adddd&dn=bogus-torrent-test" || exit 1
checkTorrent "d8dd32ac93357c368556af3ac1d95c9d76bd0dff" "HYBRID TORRENTID"
checkTorrent "d8dd32ac93357c368556af3ac1d95c9d76bd0dff6fa9833ecdac3d53134efabb" "HYBRID HASHV2"
checkTorrent "631a31dd0a46257d5078c0dee4e66e26f73e42ac" "HYBRID HASHV1"
checkTorrent "caf1e1c30e81cb361b9ee167c4aa64228a7fa4fa9f6105232b28ad099f3a302e" "V2 HASHV2"
checkTorrent "caf1e1c30e81cb361b9ee167c4aa64228a7fa4fa" "V2 TORRENTID"
rm -R "$tempDir"
Additional context
Possible solutions
From a quick read, it appears the TorrentIDs generated from API are just stringy types. TorrentFilter::matchHash
only tries to check against the TorrentID returned by libtorrent, which is the v1 hash for v1 torrents, and the truncated v2 hash for v2/hybrid torrents. However, it appears:
- TorrentID::fromString does not truncate input strings, resulting in full v2 hashes being matched against truncated hashes
- TorrentID::fromXHash returns 000...000 in circumstances i have not investigated (was it with full infohash v2?)
Here's an example dirty quick patch:
diff --git a/src/base/bittorrent/infohash.cpp b/src/base/bittorrent/infohash.cpp
index dcef97874..e43194eda 100644
--- a/src/base/bittorrent/infohash.cpp
+++ b/src/base/bittorrent/infohash.cpp
@@ -93,7 +93,12 @@ BitTorrent::InfoHash::operator WrappedType() const
BitTorrent::TorrentID BitTorrent::TorrentID::fromString(const QString &hashString)
{
- return TorrentID(BaseType::fromString(hashString));
+ if (hashString.length() > 40) {
+ return TorrentID(BaseType::fromString(hashString.left(40)));
+ }
+ else {
+ return TorrentID(BaseType::fromString(hashString));
+ }
}
BitTorrent::TorrentID BitTorrent::TorrentID::fromInfoHash(const BitTorrent::InfoHash &infoHash)
diff --git a/src/base/torrentfilter.cpp b/src/base/torrentfilter.cpp
index 9eb2a8833..d4edc8be7 100644
--- a/src/base/torrentfilter.cpp
+++ b/src/base/torrentfilter.cpp
@@ -199,7 +199,22 @@ bool TorrentFilter::matchHash(const BitTorrent::Torrent *const torrent) const
if (!m_idSet)
return true;
- return m_idSet->contains(torrent->id());
+ // TorrentID is truncated to 40 chars so a full v2 infohash will match here
+ // Matching by torrentID is the most likely scenario so it's the first thing we try
+ if (m_idSet->contains(torrent->id())) {
+ return true;
+ }
+
+#ifdef QBT_USES_LIBTORRENT2
+ // Match hybrid torrent by infohash v1 (edge case)
+ if (torrent->infoHash().v1() != SHA1Hash()) {
+ if (m_idSet->contains(BitTorrent::TorrentID::fromSHA1Hash(torrent->infoHash().v1()))) {
+ return true;
+ }
+ }
+#endif
+
+ return false;
}
It produces the following results for the test case presented above:
Logged in.
Hybrid torrent already here
V2 torrent already here
bogus torrent already here
OK HYBRID TORRENTID
OK HYBRID HASHV2
OK HYBRID HASHV1
OK V2 HASHV2
OK V2 TORRENTID
Follow-up
Should i continue investigating this and make reproducible test cases for other endpoints? Should i try to fix it myself? To be honest it looks a little intimidating because there's a lot of indirection, and it's my first time hacking on C++ code, so if other persons more experienced with the code base want to take this on, please do. Otherwise, i will try... but i did not find docs on qBittorrent internals and how it deals with v1/hybrid/v2 torrents. Some stuff is really unclear to me, like why SessionImpl::findTorrent would lookup m_hybridTorrentsByAltID when infohash is not an hybrid (just an example).
PS: This new split-textarea Github interface is horrible for opening tickets. I'm not against an HTML form with several fields, but these Github fields change size and UI elements depending on focus so that makes it really hard to use. Also, it does not have a global preview button to ensure the bug is globally coherent (which could be not the case after i had to split my message across several form fields full of interactive JavaScript crap). This is the worst bug-submitting UX i have ever seen in my life (and i've used bugzilla/redmine). I know it's not your fault but i'm just letting you know in case you haven't experienced it yet (it seems to be a new github feature).
Log(s) & preferences file(s)
No response
So i added tests for the properties and delete endpoints. These use SessionImpl::getTorrent (not matchHash) so they were not covered by my previous patch. Here's the updated test:
#! /usr/bin/env bash
tempDir="$(mktemp -d)"
api="http://localhost:8080/api/v2"
V2_V2HASH="caf1e1c30e81cb361b9ee167c4aa64228a7fa4fa9f6105232b28ad099f3a302e"
V2_MAGNET="magnet:?xt=urn:btmh:1220${V2_V2HASH}&dn=bittorrent-v2-test"
V2_ID="${V2_V2HASH:0:40}"
HYBRID_V1HASH="631a31dd0a46257d5078c0dee4e66e26f73e42ac"
HYBRID_V2HASH="d8dd32ac93357c368556af3ac1d95c9d76bd0dff6fa9833ecdac3d53134efabb"
HYBRID_MAGNET="magnet:?xt=urn:btih:${HYBRID_V1HASH}&xt=urn:btmh:1220${HYBRID_V2HASH}&dn=bittorrent-v1-v2-hybrid-test"
HYBRID_ID="${HYBRID_V2HASH:0:40}"
BOGUS_V2HASH="daf1e1c30e81cb361b9ee167c4aa64228a7fa4fa9f6105232b28ad099f3adddd"
BOGUS_MAGNET="magnet:?xt=urn:btmh:1220${BOGUS_V2HASH}&dn=bogus-torrent-test"
BOGUS_ID="${BOGUS_V2HASH:0:40}"
if ! curl --silent -c "$tempDir"/cookies.jar --data 'username=admin&password=adminadmin' "$api"/auth/login 2>&1 | grep "Ok" 2>&1 >/dev/null; then
echo "Failed to login"
exit 1
fi
echo "Logged in."
addTorrent() {
if curl -b "$tempDir"/cookies.jar "$api"/torrents/info 2>/dev/null | grep "$1" 2>&1 >/dev/null; then
echo "$2 torrent already here"
else
echo "Adding $2 torrent"
if curl --silent -b "$tempDir"/cookies.jar -F "urls=$3" -F "paused=true" "$api"/torrents/add | grep "Fails" 2>&1 >/dev/null; then
echo "Adding failed"
return 1
fi
fi
}
# $1 name
# $2 require info about this hash/id
checkInfo() {
echo -n "$1 ("
OUTPUT="$(curl --silent -b "$tempDir"/cookies.jar --data "hashes=$2" "$api"/torrents/info)"
if [ ! $? -eq 0 ]; then
echo -n "FAIL: connection)"
return 1
fi
if echo "$OUTPUT" | grep "$2" 2>&1 >/dev/null; then
if echo "$OUTPUT" | grep "$BOGUS_V2HASH" 2>&1 >/dev/null; then
echo "$OUTPUT" > /dev/stderr
echo -n "FAIL: bogus data)"
else
echo -n "OK)"
return 0
fi
else
echo "$OUTPUT" > /dev/stderr
echo -n "FAIL: missing)"
return 1
fi
}
# $1 name
# $2 require info about this hash/id
checkProperties() {
echo -n "$1 ("
OUTPUT="$(curl --silent -b "$tempDir"/cookies.jar --data "hash=$2" "$api"/torrents/properties)"
if [ ! $? -eq 0 ]; then
echo -n "FAIL: connection)"
return 1
fi
if echo "$OUTPUT" | grep "$2" 2>&1 >/dev/null; then
if echo "$OUTPUT" | grep "$BOGUS_V2HASH" 2>&1 >/dev/null; then
echo "$OUTPUT" > /dev/stderr
echo -n "FAIL: bogus data)"
else
echo -n "OK)"
return 0
fi
else
echo "$OUTPUT" > /dev/stderr
echo -n "FAIL: missing)"
return 1
fi
}
# $1 name
# $2 delete this hash/id
# $3 magnet to readd after deletion
checkDelete() {
echo -n "$1 ("
curl --silent -b "$tempDir"/cookies.jar --data "hashes=$2&deleteFiles=false" "$api"/torrents/delete >/dev/null
if [ ! $? -eq 0 ]; then
echo -n "FAIL: delete_connection)"
return 1
fi
# Check in full torrent list if torrent is still here
OUTPUT="$(curl --silent -b "$tempDir"/cookies.jar "$api"/torrents/info)"
if [ ! $? -eq 0 ]; then
echo -n "FAIL: check1_connection)"
return 1
fi
if echo "$OUTPUT" | grep "$2" 2>&1 >/dev/null; then
echo -n "FAIL: not_deleted)"
return 1
fi
# Readd torrent
curl --silent -b "$tempDir"/cookies.jar -F "paused=true" -F "urls=$3" "$api"/torrents/add >/dev/null
if [ ! $? -eq 0 ]; then
echo -n "FAIL: readd_connection)"
return 1
fi
# Check torrent was readded in full torrent list
OUTPUT="$(curl --silent -b "$tempDir"/cookies.jar "$api"/torrents/info)"
if [ ! $? -eq 0 ]; then
echo -n "FAIL: check2_connection)"
return 1
fi
if ! echo "$OUTPUT" | grep "$2" 2>&1 >/dev/null; then
echo -n "FAIL: not_readded)"
return 1
else
echo -n "OK)"
fi
}
# $1 name
# $2 TorrentID (infoHash v1 / truncated v2)
# [$3 infoHash v2]
# [$4 infoHash v1 for hybrid torrents]
# MAGNET (env variable): full magnet link
checkTorrent() {
echo "$1"
# info endpoint
echo -n " - info: "
checkInfo "ID" "$2"
if [[ "$3" != "" ]]; then
echo -n ", "
checkInfo "V2" "$3"
fi
if [[ "$4" != "" ]]; then
echo -n ", "
checkInfo "V1" "$4"
fi
echo
# properties endpoint
echo -n " - properties: "
checkProperties "ID" "$2"
if [[ "$3" != "" ]]; then
echo -n ", "
checkProperties "V2" "$3"
fi
if [[ "$4" != "" ]]; then
echo -n ", "
checkProperties "V1" "$4"
fi
echo
# delete endpoint
echo -n " - delete: "
checkDelete "ID" "$2" "$MAGNET"
if [[ "$3" != "" ]]; then
echo -n ", "
checkDelete "V2" "$3" "$MAGNET"
fi
if [[ "$4" != "" ]]; then
echo -n ", "
checkDelete "V1" "$4" "$MAGNET"
fi
echo
}
addTorrent "$HYBRID_V2HASH" "Hybrid" "$HYBRID_MAGNET" || exit 1
addTorrent "$V2_V2HASH" "V2" "$V2_MAGNET" || exit 1
addTorrent "$BOGUS_V2HASH" "bogus" "$BOGUS_MAGNET" || exit 1
MAGNET="$HYBRID_MAGNET" checkTorrent "HYBRID" "$HYBRID_ID" "$HYBRID_V2HASH" "$HYBRID_V1HASH"
MAGNET="$V2_MAGNET" checkTorrent "V2" "$V2_ID" "$V2_V2HASH"
rm -R "$tempDir"
Here's a new patch:
diff --git a/src/base/bittorrent/sessionimpl.cpp b/src/base/bittorrent/sessionimpl.cpp
index 18644a370..a68f6781c 100644
--- a/src/base/bittorrent/sessionimpl.cpp
+++ b/src/base/bittorrent/sessionimpl.cpp
@@ -2208,7 +2208,11 @@ void SessionImpl::fileSearchFinished(const TorrentID &id, const Path &savePath,
Torrent *SessionImpl::getTorrent(const TorrentID &id) const
{
- return m_torrents.value(id);
+ auto t = m_torrents.value(id);
+ if (!t) {
+ return m_hybridTorrentsByAltID.value(id);
+ }
+ return t;
}
Torrent *SessionImpl::findTorrent(const InfoHash &infoHash) const
It produces the following test results:
Logged in.
Hybrid torrent already here
V2 torrent already here
bogus torrent already here
HYBRID
- info: ID (OK), V2 (OK), V1 (OK)
- properties: ID (OK), V2 (OK), V1 (OK)
- delete: ID (OK), V2 (OK), V1 (OK)
V2
- info: ID (OK), V2 (OK)
- properties: ID (OK), V2 (OK)
- delete: ID (OK), V2 (OK)
I'm not sure if there's other methods which need patching, but SessionImpl::findTorrent still looks suspicious to me.
Note that in both patches i proposed here, there is no implication for performance on the happy path (if the torrent is found). Extra lookups should not take place during normal qBittorrent operations (which use TorrentID in all cases and should not query a non-existing TorrentID), but we could add some method arguments or compilation flags to make sure that never happens (for example when running the test suite).
There seems to be some problems with API dealing with v2/hybrid torrent. Basically, i believe most operations only work with truncated v2 hash with v2/hybrid torrents, but fail with the full v2 hash. They also do not work with v1 hash for hybrid torrents.
Torrents are identified by qBittorrent using so-called "torrent ID" (which is info-hash of v1 torrents and truncated sha256 info-hash of v2/hybrid torrents). So "hash" or "info-hash" in WebAPI is usually refers to "torrent ID". The names are preserved for backward compatibility (I suppose).
Torrents are identified by qBittorrent using so-called "torrent ID" (...) So "hash" or "info-hash" in WebAPI is usually refers to "torrent ID"
I am aware of this and i've been working around it. It makes sense internally, but i find it inconvenient to consume on a high-level from the API, because end-users may not be aware of that.
So for example, if i make a command to edit trackers on a torrent, and a user uses a full v2 hash (or a v1 hash for a hybrid torrent), i can either:
- fail saying the torrent does not exist (like the API does), which is a lie and is not user-friendly
- query the entire torrent list to match against the requested infohash, which is what i'm doing right now but is not bandwidth/memory-efficient
I feel like at the very least info/properties should support other hash forms to find the corresponding TorrentID for use in other methods. I find it sad (but acceptable) that i would have to make two network round trips (three counting auth) to effectively use the API, but as long as it does not involve sending the entire torrent list over the network that's an acceptable middleground for me. Of course my preferred solution would be if qbittorrent API supported all hash forms (including torrent ID) in all API methods.
Do you understand the usecase and find it reasonable?
It's not clear. When you request some action on an existing (in qBittorrent) torrent, you are already aware that it exists, aren't you? I.e. you have previously received information about all existing torrents and, accordingly, you know the IDs of all these torrents. So why to complicate things? IMO, matching all info-hash variants are really useful only in very specific cases, e.g. when you add new torrent.
you are already aware that it exists, aren't you?
That is the case for long-lived applications interacting with qbittorrent (who will probably cache the full torrent list anyway). That is not the case for a short-lived CLI. The problem is interaction with end-user based on infohash. The user may not know all variants of the hash and/or may not know the difference between torrentID and an actual infohash.
For example, a lot of tooling (such as torrent indexes) only expose infohashv1 in the interface. Copying that infohash to query the qBittorrent API just fails mysteriously even if the torrent is already known to qBittorrent.
why to complicate things?
Matching only against TorrentID makes sense in internal APIs, because qBittorrent knows the entire state of the system. But for consumers of the web API it makes sense to support all variants because:
- returning "not found" when torrent is known to qBittorrent is weird
- placing the burden of hash->torrentID mapping on the API client makes more work for API clients (and i think the goal of an API is to make interop easy), and requires more network calls/data
But even in internal APIs, i note that qBittorrent already keeps a separate mapping called m_hybridTorrentsByAltID
used to find torrent by infohash in SessionImpl::findTorrent
, which is used in the add torrent dialog. So such mapping/resolution is not entirely useless.
Also, in my patch i tried not to make things more complicated. The changes are very simple and do not introduce performance degradation for current API calls. If you have some critique of the proposed patches i'm willing to address them.
But even in internal APIs, i note that qBittorrent already keeps a separate mapping called
m_hybridTorrentsByAltID
used to find torrent by infohash inSessionImpl::findTorrent
, which is used in the add torrent dialog. So such mapping/resolution is not entirely useless.
As I said above:
IMO, matching all info-hash variants are really useful only in very specific cases, e.g. when you add new torrent.
There is some sense in your proposal, but I still don't get the basics of it, since you haven't described the details of the use case.
Also, in my patch i tried not to make things more complicated. The changes are very simple and do not introduce performance degradation for current API calls. If you have some critique of the proposed patches i'm willing to address them.
If you really interested in providing some changes it's better to publish Pull Request.
Did you notice that if you insert v2 hash only it sometimes cannot load the torrent directory in the menu until you press okay?
Did you notice that if you insert v2 hash only it sometimes cannot load the torrent directory in the menu until you press okay?
Unfortunately I didn't understand what you meant at all.
There is a preview when you load a magnet. It does not work on windows for v2 hashes. (I use libtorrent v2.)
There is a preview when you load a magnet.
So I suppose you talk about regular desktop client. But this Issue is about web API.
There is a preview when you load a magnet.
So I suppose you talk about regular desktop client. But this Issue is about web API.
Do you reproduce it at least? I can open a new bug.
Did you notice that if you insert v2 hash only it sometimes cannot load the torrent directory in the menu until you press okay?
I can't reproduce and "sometimes" is actually a non issue.... unless you can pinpoint an actual problem e.g. it only happens with a proxy or whatever. And yes do make an issue if you intend to continue with this because it's off topic here.