bitcoin-etl icon indicating copy to clipboard operation
bitcoin-etl copied to clipboard

About nonstandard output

Open SuperLSH opened this issue 4 years ago • 14 comments

@medvedev1088 Hi, I found that there ara two kinds of output in transactions.json. One is nonstandard, and the other is pubkeyhash. I thought the nonstandard output is the op_return output, but i found outputs of many (not all ) coinbase txs also are nonstandard. And address of miner is like“nonstandard3318537dfb3135df9f3d950dbdf8a7ae68dd7c7d”. Here is an example: transaction: { "hash": "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b", "block_number": 0, "block_hash": "000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f", "is_coinbase": true, "outputs": [{"index": 0, "script_asm": "04678afdb0fe5548271967f1a67130b7105cd6a828e03909a67962e0ea1f61deb649f6bc3f4cef38c4f35504e51ec112de5c384df7ba0b8d578a4c702b6bf11d5f OP_CHECKSIG", "script_hex": "4104678afdb0fe5548271967f1a67130b7105cd6a828e03909a67962e0ea1f61deb649f6bc3f4cef38c4f35504e51ec112de5c384df7ba0b8d578a4c702b6bf11d5fac", "required_signatures": null, "type": "nonstandard", "addresses": ["nonstandard3318537dfb3135df9f3d950dbdf8a7ae68dd7c7d"], "value": 5000000000}], "input_count": 0, "output_count": 1, "input_value": 0, "output_value": 5000000000, "fee": 0} The address above is 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa according to [https://www.blockchain.com/btc/tx/4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b]

I'm confused. Perhaps you know why? Thank you very much!!!!

SuperLSH avatar Apr 29 '20 11:04 SuperLSH

That's interesting. What version of bitcoin-cli are you using? bitcoin-etl uses the types and addresses returned by the JSON RPC api so I'd check what the api returns for this tx.

medvedev1088 avatar Apr 29 '20 13:04 medvedev1088

$ bitcoin-cli -version Bitcoin Core RPC client version v0.19.99.0-41fa292

SuperLSH avatar Apr 29 '20 13:04 SuperLSH

[sihao@localhost ~]$ bitcoin-cli getrawtransaction 792c4a613e51966fc1c852cd8d94fd10a05efa55fffdd6a41281c8d139540b23 1 { "txid": "792c4a613e51966fc1c852cd8d94fd10a05efa55fffdd6a41281c8d139540b23", "hash": "792c4a613e51966fc1c852cd8d94fd10a05efa55fffdd6a41281c8d139540b23", "version": 1, "size": 134, "vsize": 134, "weight": 536, "locktime": 0, "vin": [ { "coinbase": "0415112a1c010f", "sequence": 4294967295 } ], "vout": [ { "value": 50.00000000, "n": 0, "scriptPubKey": { "asm": "042f7edc4ff409281bdbd3cdca6ae365e10d0473b527025473fafccb1bd87f43855ddcde9b662fae2ca306b694a74923ae7055e8e60d10e8af68c705e923f86372 OP_CHECKSIG", "hex": "41042f7edc4ff409281bdbd3cdca6ae365e10d0473b527025473fafccb1bd87f43855ddcde9b662fae2ca306b694a74923ae7055e8e60d10e8af68c705e923f86372ac", "type": "pubkey" } } ], "hex": "01000000010000000000000000000000000000000000000000000000000000000000000000ffffffff070415112a1c010fffffffff0100f2052a010000004341042f7edc4ff409281bdbd3cdca6ae365e10d0473b527025473fafccb1bd87f43855ddcde9b662fae2ca306b694a74923ae7055e8e60d10e8af68c705e923f86372ac00000000", "blockhash": "00000000237ce1b692f5ef9f1f9ac64d62d5e6a24dcfec08633fa2d339434e2a", "confirmations": 578264, "time": 1270917916, "blocktime": 1270917916 } It turns out that scripts of the "pubkey" type are parsed to nonstandard. If I change the bitcoin-cli version to 0.19.0, according to your experience, will the result be correct?

@medvedev1088 I'm sorry to trouble you so much. Looking forward to your reply.

SuperLSH avatar Apr 30 '20 12:04 SuperLSH

That's unlikely. We don't parse the scripts in bitcoin-etl, we only use the data that bitcoin-cli provides. I don't see the address 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa anywhere in the JSON output.

medvedev1088 avatar Apr 30 '20 13:04 medvedev1088

Thank you!!!!!!! I see.

SuperLSH avatar Apr 30 '20 13:04 SuperLSH

@medvedev1088 Do you have a plan parsing the script to extract bitcoin address from the public key? For me, both are fine, but what I only need is preserving "type" value even if "address" is not in response. I'm planning to support another coin that has multiple "type" values without "address" in their responses.

tokusyu avatar May 05 '20 10:05 tokusyu

@tokusyu preserving the "type" field sounds like a good idea. Parsing scripts can also be done in this tool, if can find a good Python library or wrapper that already implemented script parsing.

medvedev1088 avatar May 05 '20 13:05 medvedev1088

@SuperLSH @medvedev1088 @allenday I looked into a similar case mentioned here.

Transaction Hash: 03453deedd0d3051a7752e2b04c58a907c5ff5cb59a7e1fbabba0bf479fa1b66. The first output of this transaction has the following script: 020e46e79a2a8d12b9b5d12c7a91adb4e454edfae43c0a0cb805427d2ac7613fd9 OP_CHECKSIG

Here are the result from several Bitcoin Blockchain Explorers:

Both Blockchain.com and BlockChair.com correctly parse the first output as P2PKH with address 1P3rU1Nk1pmc2BiWC8dEy9bZa1ZbMp5jfg and value 0.00098800 BTC.

However, the Google BigQuery as well as BTC.com are unable to parse it correctly. BTC.com shows “unable to decode output address”. Google BigQuery parse it to nonstandardf4acdc49b8af461468a5e65549db65f9b294c9fe. GBQ uses this bitcoin-etl tool to etl from bitcoin core.

I suspect that the reason why cli sets the address to nonstandard is that the script for this transaction does not conform to the P2PKH standard format. For P2PKH script, the spec should be

  • scriptSig |
  • scriptPubKey | OP_DUP OP_HASH160 <pkHash> OP_EQUALVERIFY OP_CHECKSIG
  • assembled | <scriptSig><scriptPubKey>

The script should start with OP_DUP OP_HASH160 and has OP_EQUALVERIFY before OP_CHECKSIG. For example, OP_DUP OP_HASH160 000012a6ddb2584c883aeab296d6a08419e0cc71 OP_EQUALVERIFY OP_CHECKSIG.

However, the script in this cases does not contain OP_DUP OP_HASH160 nor OP_EQUALVERIFY, i.e., 020e46e79a2a8d12b9b5d12c7a91adb4e454edfae43c0a0cb805427d2ac7613fd9 OP_CHECKSIG. The cli therefore does not know how to handle such script and set the address to nonstandard.

Any plan and timeline to enhance and solve this issue? Thanks.

jingli85 avatar May 13 '21 18:05 jingli85

Hey @medvedev1088 , can I get some help on this issue?

I'm thinking about using a library like btcpy to implement script parsing. Is this a good approach?

Here's my intial code. Let me know what you think.

neelriyer avatar Aug 11 '21 01:08 neelriyer

Sorry quite busy nowadays.

Will try to look into it as soon as I have time.

medvedev1088 avatar Aug 15 '21 14:08 medvedev1088

@jingli85, @SuperLSH, @tokusyu Are you still working on this? What kind of progress did you make?

neelriyer avatar Aug 17 '21 23:08 neelriyer

@spiyer99 unfortunately no progress. using a library like btcpy is interesting, we need to consider coins other than bitcoin that have their prefix code eg. https://bitcoin.stackexchange.com/questions/62781/litecoin-constants-and-prefixes.

tokusyu avatar Aug 19 '21 00:08 tokusyu

@tokusyu, @medvedev1088 after looking at the script asm and script hex for the nonstandard addresses I’m seeing a lot of them are prefixed with OP_RETURN. According to this source (​​https://en.bitcoin.it/wiki/OP_RETURN) these would correspond to invalid transactions. Does that mean we can safely disregard all nonstandard addresses?

neelriyer avatar Sep 10 '21 00:09 neelriyer

I'm interested in implementing such feature. So, if it's desired, I can make a PR. I was thinking of something similar to this I've posted here.

akhorshidi avatar Jan 15 '22 12:01 akhorshidi