python-bitcoinlib icon indicating copy to clipboard operation
python-bitcoinlib copied to clipboard

CTransaction.stream_deserialize allows flagbyte != 1

Open dgpv opened this issue 5 years ago • 7 comments

https://github.com/petertodd/python-bitcoinlib/blob/ebc85bf3f75508aba92d3a94cd024670bf25be76/bitcoin/core/init.py#L425

if flagbyte != 1, then it will try to deserialize as non-witness, whereas Core in UnserializeTransaction checks that flags are actually expected (only first bit is expected now) and will do throw std::ios_base::failure("Unknown transaction optional data"); if there's unexpected bits set in flags.

It certainly looks like that the CTransaction.deserialize() is not correct and should fail if it sees flagbytes != 1

dgpv avatar Dec 31 '19 12:12 dgpv

Hi @dgpv, this looks simple enough to tip my toes in without messing up, so you are referring to:

https://github.com/bitcoin/bitcoin/blob/80c8a02f1b4f6ad2b5c02595d66a74db22373ed4/src/primitives/transaction.h#L219

in this library it means raising an exception at around the line you referenced when flagbyte is not 0 and not 1, in particular a ValueError with the string "Unknown transaction optional data", and then adding an invalid tx in the invalid_tx.json file of the tests/data folder.

Is there something else to be taken into consideration?

ThomasRossi avatar Oct 19 '20 16:10 ThomasRossi

I've handled it this way: https://github.com/Simplexum/python-bitcointx/blob/e3e3726d53c3948ed249ee79cf2b808c7f875add/bitcointx/core/init.py#L1100-L1108

Note that this does not allow a transaction with empty inputs, and the unserialize code in Core allows it, but for the case when both inputs and outputs are empty. AFAIR it checks for empty inputs/outputs in another place. I think there is some fringe case where Core needs to unserialize a transaction with empty inputs and outputs, but I can't remember what it is. I decided to just straightforwardly disallow this and fail for markerbyte != 1.

I also did not add a test for this. If you add test data, I might use if for tests in bitcointx.

dgpv avatar Oct 19 '20 16:10 dgpv

Hi, thanks! I gave it a quick try yesterday evening, that approach breaks 6 tests, I think the reason lies in the comment: "...transactions which have zero inputs: they are invalid but may be (de)serialized anyway for the purpose of signing them and adding inputs."

so there are two ways:

  • changing tests, I'll prepare an example of exactly what goes wrong but I'm not sure it's safe for me to procede this way without being very knowledgeable of the library.
  • moving the raise ValueError in the else block, so that I can check for len(vin) and raise only if there's >0 inputs, markerbyte==0 and flagbyte!=1, which doesn't break any test, I should just add one that raises it

let me know what you think, thanks!

https://github.com/petertodd/python-bitcoinlib/blob/ebc85bf3f75508aba92d3a94cd024670bf25be76/bitcoin/core/init.py#L432-L434

ThomasRossi avatar Oct 20 '20 07:10 ThomasRossi

I thought a bit about my solution and decided that it was not entirely adequate, as indeed deserializing empty transactions can be useful in some contexts. So I decided to just copy the logic from the Core: https://github.com/Simplexum/python-bitcointx/commit/3bd0ce82683ccb180dc10451d5ac96b727004131

This has added benefit that it gets rid of the requirement for the stream to be seekable, as f.tell() and f.seek() are not used anymore

dgpv avatar Oct 20 '20 08:10 dgpv

Hello, so after reading more in depth, it looks to me the tests want exceptions raised from CheckTransaction

https://github.com/petertodd/python-bitcoinlib/blob/ebc85bf3f75508aba92d3a94cd024670bf25be76/bitcoin/core/init.py#L784

To do so, I've added "markerbyte" and "flagbyte" to the class around here: https://github.com/petertodd/python-bitcoinlib/blob/ebc85bf3f75508aba92d3a94cd024670bf25be76/bitcoin/core/init.py#L388-L392

and then in stream_deserialize the first if clause remains as is, in the second one I try to deserialise everything if marker=0 regardless of the flag, if that fails, roll back the stream with f.seek and continue as before

Inside CheckTransactions there are two new checks:

if(tx.markerbyte==0 and tx.flagbyte!=1):
    raise CheckTransactionError("CheckTransaction() : Unknown transaction optional data")
if(tx.markerbyte==0 and tx.flagbyte==1 and tx.wit.is_null()):
    raise CheckTransactionError("CheckTransaction() : Superfluous witness record")

to test I'm just using a segwit transaction from the bitcoincpp tests and switching the flagbyte to "2":

["Witness with SigHash Single|AnyoneCanPay (same signature as previous)"], [[["0000000000000000000000000000000000000000000000000000000000000100", 0, "0x51", 1000], ["0000000000000000000000000000000000000000000000000000000000000100", 1, "0x00 0x14 0x4c9c3dfac4207d5d8cb89df5722cb3d712385e3f", 2000], ["0000000000000000000000000000000000000000000000000000000000000100", 2, "0x51", 3000]], "01000000000**2**0300010000000000000000000000000000000000000000000000000000000000000000000000ffffffff00010000000000000000000000000000000000000000000000000000000000000100000000ffffffff00010000000000000000000000000000000000000000000000000000000000000200000000ffffffff03e8030000000000000151d0070000000000000151b80b0000000000000151000248304502210092f4777a0f17bf5aeb8ae768dec5f2c14feabf9d1fe2c89c78dfed0f13fdb86902206da90a86042e252bcd1e80a168c719e4a1ddcc3cebea24b9812c5453c79107e9832103596d3451025c19dbbdeb932d6bf8bfb4ad499b95b6f88db8899efac102e5fc710000000000", "P2SH,WITNESS"],

to use this as-is (with a 4th element in prevout), we'll need >=3 here https://github.com/petertodd/python-bitcoinlib/blob/ebc85bf3f75508aba92d3a94cd024670bf25be76/bitcoin/tests/test_transactions.py#L33

So I'll clean it up and post it, thanks for sharing also Simplexum, that library though is quite different, for instance even if we could remove seek and pos from this deserialisation, it is still used elsewhere. Moreover & 1 and ^=1 may work as intended also in python, but in cpp may have a different meaning, it may work, I didn't test it, it just looked odd to me.

ThomasRossi avatar Oct 26 '20 23:10 ThomasRossi

Moreover & 1 and ^=1 may work as intended also in python, but in cpp may have a different meaning, it may work, I didn't test it, it just looked odd to me.

As far as I know, the semantics of bit manipulation operations is the same, only the bit width in python is not fixed: https://wiki.python.org/moin/BitwiseOperators

I don't think that adding a property (markerbyte) to the instance is a good idea. The markerbyte issue is very local to the deserialization, and introducing a slot to the whole class will pollute its namespace needlessly.

dgpv avatar Oct 27 '20 00:10 dgpv

Hello, sorry got stuck on something else. I've followed your advice, used only flagbyte, and just sent the pull request with the simple modifications.

isn't "x^=1" the same as "x = x^1" in python? I didn't test though, it just caught my eye

ThomasRossi avatar Nov 18 '20 21:11 ThomasRossi