python-bitcoinlib
python-bitcoinlib copied to clipboard
CTransaction.stream_deserialize allows flagbyte != 1
https://github.com/petertodd/python-bitcoinlib/blob/ebc85bf3f75508aba92d3a94cd024670bf25be76/bitcoin/core/init.py#L425
if flagbyte != 1, then it will try to deserialize as non-witness, whereas Core in UnserializeTransaction
checks that flags are actually expected (only first bit is expected now) and will do throw std::ios_base::failure("Unknown transaction optional data");
if there's unexpected bits set in flags.
It certainly looks like that the CTransaction.deserialize()
is not correct and should fail if it sees flagbytes != 1
Hi @dgpv, this looks simple enough to tip my toes in without messing up, so you are referring to:
https://github.com/bitcoin/bitcoin/blob/80c8a02f1b4f6ad2b5c02595d66a74db22373ed4/src/primitives/transaction.h#L219
in this library it means raising an exception at around the line you referenced when flagbyte is not 0 and not 1, in particular a ValueError with the string "Unknown transaction optional data", and then adding an invalid tx in the invalid_tx.json file of the tests/data folder.
Is there something else to be taken into consideration?
I've handled it this way: https://github.com/Simplexum/python-bitcointx/blob/e3e3726d53c3948ed249ee79cf2b808c7f875add/bitcointx/core/init.py#L1100-L1108
Note that this does not allow a transaction with empty inputs, and the unserialize code in Core allows it, but for the case when both inputs and outputs are empty. AFAIR it checks for empty inputs/outputs in another place. I think there is some fringe case where Core needs to unserialize a transaction with empty inputs and outputs, but I can't remember what it is. I decided to just straightforwardly disallow this and fail for markerbyte != 1
.
I also did not add a test for this. If you add test data, I might use if for tests in bitcointx.
Hi, thanks! I gave it a quick try yesterday evening, that approach breaks 6 tests, I think the reason lies in the comment: "...transactions which have zero inputs: they are invalid but may be (de)serialized anyway for the purpose of signing them and adding inputs."
so there are two ways:
- changing tests, I'll prepare an example of exactly what goes wrong but I'm not sure it's safe for me to procede this way without being very knowledgeable of the library.
- moving the raise ValueError in the else block, so that I can check for len(vin) and raise only if there's >0 inputs, markerbyte==0 and flagbyte!=1, which doesn't break any test, I should just add one that raises it
let me know what you think, thanks!
https://github.com/petertodd/python-bitcoinlib/blob/ebc85bf3f75508aba92d3a94cd024670bf25be76/bitcoin/core/init.py#L432-L434
I thought a bit about my solution and decided that it was not entirely adequate, as indeed deserializing empty transactions can be useful in some contexts. So I decided to just copy the logic from the Core: https://github.com/Simplexum/python-bitcointx/commit/3bd0ce82683ccb180dc10451d5ac96b727004131
This has added benefit that it gets rid of the requirement for the stream to be seekable, as f.tell()
and f.seek()
are not used anymore
Hello, so after reading more in depth, it looks to me the tests want exceptions raised from CheckTransaction
https://github.com/petertodd/python-bitcoinlib/blob/ebc85bf3f75508aba92d3a94cd024670bf25be76/bitcoin/core/init.py#L784
To do so, I've added "markerbyte" and "flagbyte" to the class around here: https://github.com/petertodd/python-bitcoinlib/blob/ebc85bf3f75508aba92d3a94cd024670bf25be76/bitcoin/core/init.py#L388-L392
and then in stream_deserialize the first if clause remains as is, in the second one I try to deserialise everything if marker=0 regardless of the flag, if that fails, roll back the stream with f.seek and continue as before
Inside CheckTransactions there are two new checks:
if(tx.markerbyte==0 and tx.flagbyte!=1):
raise CheckTransactionError("CheckTransaction() : Unknown transaction optional data")
if(tx.markerbyte==0 and tx.flagbyte==1 and tx.wit.is_null()):
raise CheckTransactionError("CheckTransaction() : Superfluous witness record")
to test I'm just using a segwit transaction from the bitcoincpp tests and switching the flagbyte to "2":
["Witness with SigHash Single|AnyoneCanPay (same signature as previous)"], [[["0000000000000000000000000000000000000000000000000000000000000100", 0, "0x51", 1000], ["0000000000000000000000000000000000000000000000000000000000000100", 1, "0x00 0x14 0x4c9c3dfac4207d5d8cb89df5722cb3d712385e3f", 2000], ["0000000000000000000000000000000000000000000000000000000000000100", 2, "0x51", 3000]], "01000000000**2**0300010000000000000000000000000000000000000000000000000000000000000000000000ffffffff00010000000000000000000000000000000000000000000000000000000000000100000000ffffffff00010000000000000000000000000000000000000000000000000000000000000200000000ffffffff03e8030000000000000151d0070000000000000151b80b0000000000000151000248304502210092f4777a0f17bf5aeb8ae768dec5f2c14feabf9d1fe2c89c78dfed0f13fdb86902206da90a86042e252bcd1e80a168c719e4a1ddcc3cebea24b9812c5453c79107e9832103596d3451025c19dbbdeb932d6bf8bfb4ad499b95b6f88db8899efac102e5fc710000000000", "P2SH,WITNESS"],
to use this as-is (with a 4th element in prevout), we'll need >=3 here https://github.com/petertodd/python-bitcoinlib/blob/ebc85bf3f75508aba92d3a94cd024670bf25be76/bitcoin/tests/test_transactions.py#L33
So I'll clean it up and post it, thanks for sharing also Simplexum, that library though is quite different, for instance even if we could remove seek and pos from this deserialisation, it is still used elsewhere. Moreover & 1 and ^=1 may work as intended also in python, but in cpp may have a different meaning, it may work, I didn't test it, it just looked odd to me.
Moreover & 1 and ^=1 may work as intended also in python, but in cpp may have a different meaning, it may work, I didn't test it, it just looked odd to me.
As far as I know, the semantics of bit manipulation operations is the same, only the bit width in python is not fixed: https://wiki.python.org/moin/BitwiseOperators
I don't think that adding a property (markerbyte
) to the instance is a good idea. The markerbyte
issue is very local to the deserialization, and introducing a slot to the whole class will pollute its namespace needlessly.
Hello, sorry got stuck on something else. I've followed your advice, used only flagbyte, and just sent the pull request with the simple modifications.
isn't "x^=1" the same as "x = x^1" in python? I didn't test though, it just caught my eye