Decode exception when decoding string from http://bttracker.debian.org/

Open mikewolfxyou opened this issue 4 years ago • 4 comments

Hi,

I may have found an issue: when I use bencode.decode on a string returned by the Debian BitTorrent tracker, I always get an exception.

How to reproduce:

Make a GET call to the following URL: http://bttracker.debian.org:6969/announce?info_hash=N%f5%ad%16%c0%9e%b5%0b%8c%3b%90~%02u%24%eeC%ad%ad%01&peer_id=%ff%fe%fd%fc%fb%fa%f9%f8%f7%f6%f5%f4%f3%f2%f1%f0%ef%ee%ed%ec&port=6881&uploaded=0&downloaded=0&left=659554304%22

The response, read as a UTF-8 string, is:

"d8:intervali900e5:peers2:ip13:213.49.181.984:porti6881e2:ip14:216.195.129.274:porti6881e2:ip12:146.71.73.214:porti6881e2:ip12:146.71.73.514:porti6881eee"

When I try to decode the above string,

Bencode bencode = new Bencode(StandardCharsets.UTF_8, true);
bencode.decode("d8:intervali900e5:peers2:ip13:213.49.181.984:porti6881e2:ip14:216.195.129.274:porti6881e2:ip12:146.71.73.214:porti6881e2:ip12:146.71.73.514:porti6881eee".getBytes(StandardCharsets.UTF_8), Type.DICTIONARY);

I get an exception:

"Unexcept token 'i'"

I tried debugging it and found that the first three characters, "d8:", are fine, but the fourth character, "i", triggers the exception.

I just want to confirm: is this a string encoding problem, or something else?

By the way, my environment is Java 11 on macOS. Decoding the example string from README.md works fine.

Thank you for your help!

best regards

Mike

mikewolfxyou avatar Jan 16 '21 15:01 mikewolfxyou

I am trying to understand how this is formatted. Parsing it by hand into JSON, I get:

{
  "interval": 900,
  "peers": "ip",
  "213.49.181.98": "port", // the 13: prefix covers exactly 213.49.181.98; the 4 after it begins 4:port
  // This is where it fails: the next token, i6881e, is a number, and a number cannot be a key in a bencode dict
}

I have also tried running it through some other bencode parsers, and they all fail at the same spot. Are you actually able to get it to decode with anything? As far as I can tell, it is malformed. If you are able to get it to decode, please let me know.
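The hand parse above can be reproduced mechanically. The sketch below is a minimal tracing decoder written only for illustration; the class MiniBencode and its methods are hypothetical and are not this library's implementation. It assumes the input is complete enough to scan and does little bounds checking. Run against the posted string, it stops where a dictionary key is expected but an integer token appears.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical tracing decoder for illustration only; not the library's implementation.
class MiniBencode {
    private final byte[] data;
    private int pos = 0;

    MiniBencode(String text) {
        this.data = text.getBytes(StandardCharsets.UTF_8);
    }

    static Object decode(String text) {
        return new MiniBencode(text).readValue();
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    private Object readValue() {
        char token = (char) data[pos];
        if (token == 'i') return readInteger();
        if (token == 'l') return readList();
        if (token == 'd') return readDictionary();
        if (isDigit(token)) return readString();
        throw unexpected(token, "expected i, l, d, or a string length");
    }

    // Integer form: i<digits>e
    private Long readInteger() {
        int end = find('e');
        long value = Long.parseLong(new String(data, pos + 1, end - pos - 1, StandardCharsets.US_ASCII));
        pos = end + 1;
        return value;
    }

    // String form: <length>:<bytes>, where length counts bytes, not characters
    private String readString() {
        int colon = find(':');
        int length = Integer.parseInt(new String(data, pos, colon - pos, StandardCharsets.US_ASCII));
        String value = new String(data, colon + 1, length, StandardCharsets.UTF_8);
        pos = colon + 1 + length;
        return value;
    }

    private List<Object> readList() {
        pos++; // skip 'l'
        List<Object> list = new ArrayList<>();
        while (data[pos] != 'e') list.add(readValue());
        pos++; // skip 'e'
        return list;
    }

    private Map<String, Object> readDictionary() {
        pos++; // skip 'd'
        Map<String, Object> map = new LinkedHashMap<>();
        while (data[pos] != 'e') {
            char token = (char) data[pos];
            // A key must start with a string length; anything else is rejected here.
            if (!isDigit(token)) throw unexpected(token, "dictionary keys must be strings");
            String key = readString();
            map.put(key, readValue());
        }
        pos++; // skip 'e'
        return map;
    }

    private int find(char target) {
        for (int i = pos; i < data.length; i++) {
            if (data[i] == target) return i;
        }
        throw new IllegalStateException("Missing '" + target + "' after offset " + pos);
    }

    private IllegalStateException unexpected(char token, String why) {
        return new IllegalStateException("Unexpected token '" + token + "' at offset " + pos + ": " + why);
    }

    public static void main(String[] args) {
        String posted = "d8:intervali900e5:peers2:ip13:213.49.181.984:porti6881e2:ip14:216.195.129.274:"
                + "porti6881e2:ip12:146.71.73.214:porti6881e2:ip12:146.71.73.514:porti6881eee";
        try {
            decode(posted);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the posted string, this sketch consumes interval, then peers with the value ip, then the key 213.49.181.98 with the value port, and fails on the i of i6881e at offset 49.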

My guess is that the peers are supposed to be an array of dictionaries, as that would make the most logical sense, but that is not what the data specifies.
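For comparison, here is what a list of peer dictionaries would look like in well-formed bencode. This is a minimal sketch: the encoder class PeerListSketch and its helpers peer and sampleResponse are hypothetical names introduced for illustration, not part of this library, and the two peers are taken from the posted string on the assumption that each ip/port pair was meant to be one dictionary.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical encoder for illustration only; not the library's API.
class PeerListSketch {
    static String encode(Object value) {
        if (value instanceof Number) {
            return "i" + ((Number) value).longValue() + "e";
        }
        if (value instanceof String) {
            String s = (String) value;
            // The length prefix counts bytes, not characters.
            return s.getBytes(StandardCharsets.UTF_8).length + ":" + s;
        }
        if (value instanceof List) {
            StringBuilder out = new StringBuilder("l");
            for (Object item : (List<?>) value) out.append(encode(item));
            return out.append('e').toString();
        }
        if (value instanceof Map) {
            // Keys are emitted in sorted order; TreeMap ordering matches byte order for ASCII keys.
            TreeMap<String, Object> sorted = new TreeMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                sorted.put((String) entry.getKey(), entry.getValue());
            }
            StringBuilder out = new StringBuilder("d");
            for (Map.Entry<String, Object> entry : sorted.entrySet()) {
                out.append(encode(entry.getKey())).append(encode(entry.getValue()));
            }
            return out.append('e').toString();
        }
        throw new IllegalArgumentException("Unsupported type: " + value.getClass());
    }

    // Builds one peer dictionary with the keys seen in the posted string.
    static Map<String, Object> peer(String ip, int port) {
        Map<String, Object> peer = new LinkedHashMap<>();
        peer.put("ip", ip);
        peer.put("port", port);
        return peer;
    }

    static String sampleResponse() {
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("interval", 900);
        response.put("peers", List.of(peer("213.49.181.98", 6881), peer("216.195.129.27", 6881)));
        return encode(response);
    }

    public static void main(String[] args) {
        System.out.println(sampleResponse());
    }
}
```

The output starts with d8:intervali900e5:peersld2:ip13:213.49.181.984:porti6881ee. Unlike the posted string, it has an l after peers and a d ... e pair around each peer, which is the structure a parser needs in order to accept i6881e as a value rather than a key.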

dampcake avatar Jan 16 '21 18:01 dampcake

Please consider the Chinese encoding issue, and also the value of the pieces field in torrent files.

  1. Consider Chinese encoding

  2. torrent --> info --> pieces

blanexie avatar Nov 11 '21 12:11 blanexie

@blanexie Chinese characters should parse fine as long as you pass a Charset that can understand them. Do you have an example of valid bencode formatting with those characters in a string that is not parsing correctly?
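One detail that matters for non-ASCII text: the length prefix of a bencoded string counts bytes, not characters, so the prefix depends on the charset used to encode the text. A minimal sketch follows; the class ByteLengthPrefix and its method bencodeString are hypothetical names for illustration, and the sample text is the two characters U+4E2D U+6587 written as escapes so the source stays ASCII.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not the library's API.
class ByteLengthPrefix {
    // Returns <byte length>:<text>, computing the length in the given charset.
    static String bencodeString(String text, Charset charset) {
        int byteLength = text.getBytes(charset).length;
        return byteLength + ":" + text;
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters
        System.out.println(text.length());                                       // 2 characters
        System.out.println(text.getBytes(StandardCharsets.UTF_8).length);        // 6 bytes in UTF-8
        System.out.println(bencodeString(text, StandardCharsets.UTF_8));         // prefix is 6, not 2
    }
}
```

A decoder that reads the 6 bytes and interprets them with a different charset than the one used to write them will produce garbled text even though the bencode structure itself parses correctly.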

dampcake avatar Nov 19 '21 08:11 dampcake

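Thanks, it runs normally now. What worked:

  1. useBytes: true
  2. Convert the ByteBuffer into a String

Kotlin code:

val bencode = Bencode(Charsets.UTF_8, true)

// With useBytes = true, the name value is a ByteBuffer here, so decode it explicitly
Charsets.UTF_8.decode(infoMap["name"] as ByteBuffer).toString()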
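The same idea in Java, as a minimal sketch using only the standard library. The class ByteBufferText and its methods are hypothetical names for illustration. It shows the explicit ByteBuffer-to-String step from the Kotlin snippet above, and why a binary field such as pieces is better left as bytes: arbitrary bytes generally do not survive a round trip through a UTF-8 String.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helpers for illustration only; not the library's API.
class ByteBufferText {
    // Decodes a text field; duplicate() leaves the caller's buffer position untouched.
    static String asText(ByteBuffer buffer) {
        return StandardCharsets.UTF_8.decode(buffer.duplicate()).toString();
    }

    // Reports whether raw bytes come back unchanged after bytes -> String -> bytes.
    static boolean survivesStringRoundTrip(byte[] raw) {
        byte[] back = new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(raw, back);
    }

    public static void main(String[] args) {
        ByteBuffer name = ByteBuffer.wrap("\u4e2d\u6587.iso".getBytes(StandardCharsets.UTF_8));
        System.out.println(asText(name));

        // Bytes like these are typical of hash data and are not valid UTF-8.
        byte[] hashLike = {(byte) 0xff, (byte) 0xfe, 0x00, 0x41};
        System.out.println(survivesStringRoundTrip(hashLike)); // false
    }
}
```

Invalid byte sequences are replaced with U+FFFD during decoding, so the original bytes cannot be recovered afterwards. Keeping values as bytes and decoding only the fields known to be text avoids that loss.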

blanexie avatar Dec 07 '21 11:12 blanexie