Decoding a Legacy Bitcoin Transaction
In this article, we will dig into a legacy Bitcoin transaction, decode its different parts, and write a simple Python script that we can use to easily decode legacy Bitcoin transactions.
The following is a raw Bitcoin transaction in hex format :
010000000104dde43b0e4724f1e3b45782a9bfbcc91ea764c7cb1c245fba1fefa175c3a5d0010000006a4730440220519f7867349790ee441e83e545afbd25b954a34e0733cd4da3b5f1e5588625050220166730d053c3672973bcb2bb1a977b747837023b647e3af2ac9c15728b0681da01210236ccb7ee3a9f154127f384a05870c4fd86a8727eab7316f1449a0b9e65bfd90dffffffff025d360100000000001976a91478364a559841329304188cd791ad9dabbb2a3fdb88ac605b0300000000001976a914064e0aa817486573f4c2de09f927697e1e6f233f88ac00000000
This transaction is made up of several fields that can be grouped into 4 main sections :
Version
Inputs
Outputs
Locktime
Version
Most Bitcoin transactions are version 1. Version 2 was introduced in BIP68, which gives the sequence number of version 2 transactions an additional meaning (relative lock-time).
The version is stored in the first 4 bytes of the serialized transaction.
In our case it is :
01000000
The field is stored in little endian. Reversing the bytes of 01000000 gives 00000001 in big endian, so the version is 1.
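As a minimal sketch of this conversion, Python's built-in int.from_bytes can read the 4 bytes in little-endian order directly. The variable names are only illustrative.

```python
# First 4 bytes of the example transaction, as they appear in the raw hex
version_hex = "01000000"

# Interpret the bytes with the least significant byte first (little endian)
version = int.from_bytes(bytes.fromhex(version_hex), "little")

print(version)  # 1
```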
Inputs
- Number of inputs
The first field of the inputs section is the number of inputs in the transaction. This field is a 'varint', also called a 'CompactSize Unsigned Integer', as explained in the book Mastering Bitcoin.
Figure 1 : Bytes used by CompactSize Unsigned Integers, Mastering Bitcoin, Chap 6
In our case the first byte is :
01
01 is less than or equal to 252, so it is the only byte used to encode the number of inputs. (Use the table in Figure 1 to determine how many bytes to read.)
01 (hex) is 1 in decimal, which indicates that the transaction has only one input.
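The CompactSize rule can be sketched in a few lines of Python. A first byte from 0 to 252 is the value itself, while fd, fe and ff announce that the value is stored in the next 2, 4 or 8 bytes (little endian). The function name read_compact_size is an illustrative choice, not part of any library.

```python
def read_compact_size(data: bytes, offset: int = 0):
    """Return (value, bytes_consumed) for the CompactSize integer at offset."""
    first = data[offset]
    if first < 0xfd:
        # 0..252: the byte itself is the value
        return first, 1
    if first == 0xfd:
        size = 2   # value stored in the next 2 bytes
    elif first == 0xfe:
        size = 4   # value stored in the next 4 bytes
    else:
        size = 8   # 0xff: value stored in the next 8 bytes
    value = int.from_bytes(data[offset + 1:offset + 1 + size], "little")
    return value, 1 + size


print(read_compact_size(bytes.fromhex("01")))      # (1, 1)
print(read_compact_size(bytes.fromhex("fd0302")))  # (515, 3)
```

Returning the number of bytes consumed lets the caller advance its position in the transaction without knowing which of the four encodings was used.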
- ID of the input transaction
This is a fixed-size field of 32 bytes that holds the ID of the transaction whose output is being spent.
In this example, the ID is :
04dde43b0e4724f1e3b45782a9bfbcc91ea764c7cb1c245fba1fefa175c3a5d0
The ID is serialized in little endian, so we need to reverse its bytes to get the big-endian form used by block explorers :
d0a5c375a1ef1fba5f241ccbc764a71ec9bcbfa98257b4e3f124470e3be4dd04
You can search for this ID on Mempool.space to see more details about the transaction.
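A short sketch of this reversal in Python is shown below. Note that the order of the bytes is reversed, not the order of the individual hex characters. The variable names are illustrative.

```python
# Previous transaction ID exactly as it is serialized in the raw transaction
txid_serialized = "04dde43b0e4724f1e3b45782a9bfbcc91ea764c7cb1c245fba1fefa175c3a5d0"

# Reverse the byte order to get the form shown by block explorers
txid_display = bytes.fromhex(txid_serialized)[::-1].hex()

print(txid_display)
# d0a5c375a1ef1fba5f241ccbc764a71ec9bcbfa98257b4e3f124470e3be4dd04
```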
- Input index in the transaction
This is a fixed-size field of 4 bytes that gives the index of the output being spent in the previous transaction.
In our case, we have :
01000000
Reversing the bytes gives 00000001 in big endian, which is 1 in decimal. Output indexes start at 0, so this input spends the second output of the previous transaction.
- Size of Unlocking script
This is a varint that indicates the size of the input's unlocking script.
Here the first byte is 6a (106 in decimal, which is <= 252), so it is the only byte to consider.
6a
It indicates the size of the unlocking script as 106 bytes.
- Input Unlocking script
This is a variable-size field containing the unlocking script of the input. From the previous field, we know the script is 106 bytes long.
It is as follows :
4730440220519f7867349790ee441e83e545afbd25b954a34e0733cd4da3b5f1e5588625050220166730d053c3672973bcb2bb1a977b747837023b647e3af2ac9c15728b0681da01210236ccb7ee3a9f154127f384a05870c4fd86a8727eab7316f1449a0b9e65bfd90d
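For an input like this one, the unlocking script is made of two data pushes: a signature (ending with a signature hash type byte) and a public key. Each push starts with a byte between 01 and 4b that gives the number of bytes that follow. The sketch below splits the script under that assumption only; it does not handle other opcodes, and split_pushes is a hypothetical helper name.

```python
def split_pushes(script: bytes):
    """Split a script made only of direct data pushes (opcodes 0x01 to 0x4b)."""
    items = []
    i = 0
    while i < len(script):
        length = script[i]
        if not 0x01 <= length <= 0x4b:
            raise ValueError("not a direct push opcode: 0x{:02x}".format(length))
        items.append(script[i + 1:i + 1 + length])
        i += 1 + length
    return items


unlock_script = bytes.fromhex("4730440220519f7867349790ee441e83e545afbd25b954a34e0733cd4da3b5f1e5588625050220166730d053c3672973bcb2bb1a977b747837023b647e3af2ac9c15728b0681da01210236ccb7ee3a9f154127f384a05870c4fd86a8727eab7316f1449a0b9e65bfd90d")

signature, pubkey = split_pushes(unlock_script)
print(len(unlock_script))  # 106
print(len(signature))      # 71 (DER signature plus 1 hash type byte)
print(len(pubkey))         # 33 (compressed public key)
```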
Note : The last four fields, together with the sequence number described next, are repeated for each input in the transaction. In our case, the transaction has only one input, so after the sequence number we can move on to the outputs.
- Sequence Number
This is a fixed-size field of 4 bytes. Among other uses, it signals whether the transaction opts in to replace-by-fee (BIP125), which is the case when the value is less than ffffffff - 1.
ffffffff
It is stored in little endian. Here the value is ffffffff, so this input does not signal replace-by-fee.
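The replace-by-fee check described above can be sketched as a simple comparison. The variable names are illustrative, and the sketch looks at a single input only.

```python
sequence_hex = "ffffffff"
sequence = int.from_bytes(bytes.fromhex(sequence_hex), "little")

# BIP125: an input signals replace-by-fee when its sequence is below 0xfffffffe
signals_rbf = sequence < 0xffffffff - 1

print(sequence, signals_rbf)  # 4294967295 False
```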
Outputs
- Number of outputs
The first field of the outputs section is the number of outputs in the transaction. It is a varint.
In our case :
02
It indicates that the transaction has 2 outputs.
- Amount of sats in output 0 (first output)
This is a fixed size section of 8 bytes that indicates the amount of sats locked in an output.
In our example :
5d36010000000000
This field is little endian. Converting it to big endian and then to decimal gives 79453.
So the number of sats locked in the first output is 79453. This can be confirmed on a block explorer.
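As a small sketch, the amount can be decoded and also displayed in BTC, knowing that 1 BTC is 100,000,000 sats. Integer arithmetic is used for the display to avoid floating point rounding; the variable names are illustrative.

```python
amount_hex = "5d36010000000000"

# 8 bytes, little endian, expressed in sats
sats = int.from_bytes(bytes.fromhex(amount_hex), "little")

# Split into whole BTC and the remaining sats for an exact display
whole, remainder = divmod(sats, 100_000_000)
btc_text = "{}.{:08d} BTC".format(whole, remainder)

print(sats)      # 79453
print(btc_text)  # 0.00079453 BTC
```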
- Output 0 locking script size
This is a varint section that indicates the size of the locking script of the output.
In our example :
19
19 (hex) is 25 (dec) indicating the next 25 bytes represent the locking script of the first output.
- Output 0 locking script
This is a variable length section that contains the locking script of the first output.
From the previous field, we know it is 25 bytes long in this example.
It is as follows :
76a91478364a559841329304188cd791ad9dabbb2a3fdb88ac
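These 25 bytes follow the common pay-to-public-key-hash (P2PKH) template: OP_DUP (76), OP_HASH160 (a9), a push of 20 bytes (14), the public key hash, then OP_EQUALVERIFY (88) and OP_CHECKSIG (ac). The sketch below recognizes only this template and rejects anything else; describe_p2pkh is a hypothetical helper name.

```python
def describe_p2pkh(script: bytes) -> str:
    """Render a 25-byte P2PKH locking script as opcode names plus the hash."""
    is_p2pkh = (
        len(script) == 25
        and script[:3] == bytes.fromhex("76a914")
        and script[23:] == bytes.fromhex("88ac")
    )
    if not is_p2pkh:
        raise ValueError("not a standard P2PKH locking script")
    pubkey_hash = script[3:23].hex()
    return "OP_DUP OP_HASH160 {} OP_EQUALVERIFY OP_CHECKSIG".format(pubkey_hash)


lock_script = bytes.fromhex("76a91478364a559841329304188cd791ad9dabbb2a3fdb88ac")
print(describe_p2pkh(lock_script))
# OP_DUP OP_HASH160 78364a559841329304188cd791ad9dabbb2a3fdb OP_EQUALVERIFY OP_CHECKSIG
```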
Note : The last three steps are repeated once per output in order to get full information on the outputs.
- Amount of sats in output 1 (Second output)
Fixed-size (8 bytes) little-endian field.
In our example :
605b030000000000
Converting it to big endian and then to decimal gives 220000 sats locked in the second output.
- Output 1 locking script size
As for output 0, this is a varint that indicates the size of the output 1 locking script.
In our example :
19
Conversion gives 25 in decimal, so the output 1 locking script is 25 bytes long.
- Output 1 locking script
Variable-size field containing the locking script of output 1. From the previous field, we know its size is 25 bytes.
In our example :
76a914064e0aa817486573f4c2de09f927697e1e6f233f88ac
Note : This transaction has only 2 outputs, but a transaction can have many more. We should take this into account when working with other transactions.
nLocktime
This is a fixed-size field of 4 bytes that indicates the earliest point at which the transaction can be included in a block.
If the value is less than 500,000,000, it represents a block height; otherwise, it is parsed as an epoch time (the number of seconds since 1970-01-01T00:00 UTC).
In this example :
00000000
nLocktime is 0, so the transaction has no lock and can be included in a block immediately.
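The interpretation rule above can be sketched as follows. The function name describe_locktime and the returned strings are illustrative choices.

```python
from datetime import datetime, timezone

LOCKTIME_THRESHOLD = 500_000_000


def describe_locktime(locktime_hex: str) -> str:
    """Interpret the 4-byte little-endian nLocktime field."""
    locktime = int.from_bytes(bytes.fromhex(locktime_hex), "little")
    if locktime == 0:
        return "no lock"
    if locktime < LOCKTIME_THRESHOLD:
        return "block height {}".format(locktime)
    # Values at or above the threshold are seconds since the Unix epoch
    when = datetime.fromtimestamp(locktime, tz=timezone.utc)
    return "time " + when.isoformat()


print(describe_locktime("00000000"))  # no lock
print(describe_locktime("40420f00"))  # block height 1000000
```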
Bonus : Raw transaction Decoder Script in Python
This is a simple Python script that can be used to easily decode a raw bitcoin legacy transaction.
Do not hesitate to try it and share your feedback with me.
def decoder():
    trx_h = "010000000104dde43b0e4724f1e3b45782a9bfbcc91ea764c7cb1c245fba1fefa175c3a5d0010000006a4730440220519f7867349790ee441e83e545afbd25b954a34e0733cd4da3b5f1e5588625050220166730d053c3672973bcb2bb1a977b747837023b647e3af2ac9c15728b0681da01210236ccb7ee3a9f154127f384a05870c4fd86a8727eab7316f1449a0b9e65bfd90dffffffff025d360100000000001976a91478364a559841329304188cd791ad9dabbb2a3fdb88ac605b0300000000001976a914064e0aa817486573f4c2de09f927697e1e6f233f88ac00000000"
    print("Raw Transaction in Hex:", trx_h, end="\n\n")
    trx_bytes = bytes.fromhex(trx_h)

    # Version: first 4 bytes, little endian
    i = 4
    version_bytes = trx_bytes[0:i]
    version = int(version_bytes[::-1].hex(), 16)
    print("Version : {}\n".format(version))

    # Inputs
    print("-----Inputs-----")
    # Counts and script sizes are read as a single byte, which is only
    # valid for values up to 252 (longer varints are not handled here).
    num_input = trx_bytes[i]
    i += 1  # skip the input count once, before the loop
    print("Number of input :", num_input)
    for j in range(num_input):
        print("Input_{}".format(j))
        # Previous transaction ID, reversed into explorer (big-endian) form
        inputjtrxid = trx_bytes[i:i + 32][::-1].hex()
        print("Input_{} Trx ID : {}".format(j, inputjtrxid))
        i += 32
        # Index of the spent output in the previous transaction
        inputjindex = int(trx_bytes[i:i + 4][::-1].hex(), 16)
        print("Input_{} index : {}".format(j, inputjindex))
        i += 4
        unlockscriptsize = trx_bytes[i]
        print("Input_{} unlock Script Size : {}".format(j, unlockscriptsize))
        i += 1
        unlockscript = trx_bytes[i:i + unlockscriptsize].hex()
        print("Input_{} unlock Script : {}".format(j, unlockscript))
        i += unlockscriptsize
        sequencenumber = trx_bytes[i:i + 4][::-1].hex()
        print("Input_{} Sequence number : {}".format(j, sequencenumber))
        i += 4
        print()

    # Outputs
    print("-----Outputs-----")
    num_output = trx_bytes[i]
    i += 1  # skip the output count once, before the loop
    print("Number of output :", num_output)
    for k in range(num_output):
        print("Output_{}".format(k))
        # Amount in sats: 8 bytes, little endian
        amountoutputk = int(trx_bytes[i:i + 8][::-1].hex(), 16)
        print("Amount Output_{} : {}".format(k, amountoutputk))
        i += 8
        lockscriptsize = trx_bytes[i]
        i += 1
        lockscript = trx_bytes[i:i + lockscriptsize].hex()
        print("Output_{} Script : {}".format(k, lockscript))
        i += lockscriptsize
        print()

    # Locktime is the last 4 bytes
    print("Locktime :", trx_bytes[i:].hex())


if __name__ == "__main__":
    decoder()
Conclusion
Looking at a raw Bitcoin transaction of more than 200 bytes can be intimidating. But going through it shows that it is just the different parts of the transaction put together. In this article, I tried to break down a raw Bitcoin transaction in hex to show the different sections it contains. At the end, I shared a simple transaction decoder script in Python that we can use to easily see the different sections of a transaction.
I hope you learned something from it, and I look forward to sharing new articles with you.
Do not hesitate to comment with feedback and share if you like the content.
We Keep Learning and Growing...
Alphonse.
Written by
Alphonse Mehounme
I am Dev interested in Bitcoin and FinTech in Africa. Currently building Flash...