Introduction
One of the most discussed and important problems in Bitcoin today is scaling. The recent implementation of Thin Blocks in Bitcoin Unlimited and the planned implementation of Compact Blocks
in the Bitcoin Core client address the problem of traffic spikes during block propagation on the Bitcoin network. This is especially important for nodes with slow internet connections (users
running nodes from home). However, these solutions do not solve the issue of blockchain
size, which grows over time: new users still need to download the whole blockchain
before they can start operating as a node. This might take several days on a low-speed connection.
As an approximation of the average internet connection speed we take 1/5 of the current average internet speed
in the U.S. (12.6 MB/s) [1], which is 2.5 MB/s. This figure is lower than the average internet
connection speed in almost all countries in the report (see Fig.3 in Appendix), except Bolivia,
Paraguay and Venezuela. While this figure is much lower than the average world internet speed, it
gives a good guarantee that the Bitcoin network stays decentralized, as practically everyone will be able
to run their own node.
Storage. One of the issues that blockchain size imposes is that users need high-capacity
drives to store the blockchain. This is not actually a problem, as existing
pruning solutions allow a node to prune the blockchain on the client side, reducing the amount
of stored data. The current blockchain size is 74 GB [2] (see Fig.4 in Appendix). Pruning of
blockchain files has been supported since Core client v.0.11 [3], and savings can reach up to 98%, as the current
size of the UTXO (unspent transaction output) set is only 1.3 GB [4] (see Fig.5 in Appendix).
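The "up to 98%" figure can be checked with a short calculation. This is a minimal sketch that assumes only the UTXO set is retained after pruning; it ignores any recent block files a pruned node still keeps, so it is an upper bound. The function name `pruning_savings` is hypothetical.

```python
# Sizes quoted in the text, in gigabytes.
full_chain_gb = 74.0  # raw blockchain
utxo_set_gb = 1.3     # UTXO set


def pruning_savings(full_gb: float, kept_gb: float) -> float:
    """Fraction of storage saved when only kept_gb of full_gb is retained."""
    return 1.0 - kept_gb / full_gb


savings = pruning_savings(full_chain_gb, utxo_set_gb)
print(f"{savings:.1%}")  # prints 98.2%
```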
Download and validation. These are the main problems and limits to scalability. Even
with pruning of blockchain files available, a node client still needs to download the full blockchain
from the network and validate all transactions (validation takes the most time). It is often necessary to
redownload the blockchain, e.g. to rebuild the index from scratch, in which case local pruning does not help
much. In the following sections we describe a method of organizing data structures so that
pruning happens on-chain.
4. Including the UTXO block header hash changes the header and hash of a raw block. That is
how it is included in the main blockchain and is covered by all the mining power available.
5. UTXO blocks are built with a predefined sorting algorithm for the UTXOs within a given
chunk of publicly available blocks. That means all miners can pregenerate the UTXO block, check
whether the included hash of the UTXO block is correct, and reject the whole block if it is
not.
6. A UTXO block should be created only when 8192 blocks have been mined since the last block
included in the previous UTXO block. This means that at least 1 month of
transactions (4096 blocks) is always kept on the raw blockchain and not included in a UTXO
block, to preserve a reasonable security level for the network.
7. The UTXO set is calculated based on the state as of the last raw block included in the UTXO
block.
8. Nodes can choose whether they download the UTXO block or work only with raw blocks (as
they do now); this will keep part of the network storing the whole history of all transactions.
9. Blocks that can include a UTXO block hash must have a height that is a multiple of 4096.
Every UTXO block is built upon the whole history of raw blocks minus the last 4096 blocks, or, for
efficiency, upon the last available UTXO block and all transactions in the 4096
blocks since then.
10. The parameter of 4096 is chosen based on reasonable expectations of how the network operates. It can be chosen differently. The requirement to always have 4096 raw blocks on the
main blockchain is dictated by security (if someone has the hashing power to rewrite the last
4096 blocks, they will effectively be able to rewrite all history by faking the UTXO blockchain,
but only for those nodes that decided to use the light UTXO block).
This parameter also needs to depend on the average lifetime of a transaction output [todo:
has to be a separate study]. But keeping in mind that many transactions form long chains
included in the same block, such pruning can have an immediate effect even with a value much
smaller than 4096.
One more argument here is that the bigger the value chosen for this parameter, the longer the chain of raw
blocks not included in a UTXO block, which forces new nodes to download and
validate more transactional data in raw format (including spent outputs).
So a balanced value needs to be chosen here, one that will keep the network private and
increase performance for nodes operating on UTXO blocks.
11. This method requires miners to accept it at a high approval rate (95%) and can be implemented via a soft fork without changing consensus rules.
12. The pruned UTXO blocks can be delivered in any form (it is best to have delivery implemented
natively by a client), but torrent networks work here as well. An advantage of the
described method is that the user does not need to trust the source of the file, as its
hash is written on the blockchain and can be easily verified.
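The points above about a predefined UTXO ordering and an independently reproducible hash can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the proposal does not specify the sort key, the serialization, or the hash function, so the code sorts by (txid, output index), uses a simple text record, and applies double SHA-256. The names `apply_block` and `utxo_block_hash` and the simplified transaction tuples are hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

# Hypothetical simplified types: an outpoint is (txid, output_index) and
# an output is (value_in_satoshi, script_hex).
Outpoint = Tuple[str, int]
Output = Tuple[int, str]
# A transaction is (txid, outpoints it spends, outputs it creates).
Tx = Tuple[str, List[Outpoint], List[Output]]


def apply_block(utxos: Dict[Outpoint, Output], txs: Iterable[Tx]) -> None:
    """Update the UTXO set in place with the transactions of one raw block."""
    for txid, inputs, outputs in txs:
        for outpoint in inputs:
            del utxos[outpoint]  # spent outputs drop out of the set
        for index, output in enumerate(outputs):
            utxos[(txid, index)] = output


def utxo_block_hash(utxos: Dict[Outpoint, Output]) -> str:
    """Hash the UTXO set in a fixed order so every miner gets the same digest."""
    hasher = hashlib.sha256()
    # Assumed ordering: ascending by (txid, output_index).
    for (txid, index), (value, script) in sorted(utxos.items()):
        record = f"{txid}:{index}:{value}:{script}\n"  # assumed serialization
        hasher.update(record.encode("ascii"))
    # Assumed hash: a second SHA-256 over the first digest.
    return hashlib.sha256(hasher.digest()).hexdigest()


# Two nodes that processed the same blocks agree on the hash, whatever
# order their local UTXO maps happen to be stored in.
node_a: Dict[Outpoint, Output] = {}
apply_block(node_a, [("aa", [], [(50, "51")])])
apply_block(node_a, [("bb", [("aa", 0)], [(20, "52"), (30, "53")])])
node_b = {("bb", 1): (30, "53"), ("bb", 0): (20, "52")}
print(utxo_block_hash(node_a) == utxo_block_hash(node_b))  # prints True
```

A miner validating a raw block at an inclusion height would compare its own locally computed digest with the committed one and reject the block on a mismatch.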
Sample scenario
The current blockchain height is 418000. Assume miners have already accepted and voted for the proposal
and it takes effect now. The next block for UTXO block inclusion will be 421888 (see Fig.1). So
before we reach this block, all miners will take the first 417792 raw blocks (starting with the genesis
block), take only the UTXOs from them and put them in a separate UTXO block structure. Every
miner will be able to create this data structure independently and calculate the hash of this UTXO
block, which will be included in block 421888 and become part of the history. If the
hash is incorrect or is not included in the coinbase transaction, the absolute majority of miners will
reject the block and continue mining their own block #421888. As the order of UTXOs is
predefined, every miner can pregenerate the hash of the required UTXO block long before block
#421888 is mined.
Figure 1: Bitcoin On-chain Pruning First UTXO block
Assuming the UTXO block hash is correct, mining continues. By the time we reach block
425984, all miners will already have created the next UTXO block, which will be built upon raw
blocks 1-421888 or, for more efficiency, upon the previous UTXO block and the transactions in
blocks 417793-421888 (see Fig.2). The hash of the new UTXO block will be written in block
425984. And so on.
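The heights in this scenario follow from the 4096-block interval and can be reproduced with a short sketch. The function names are hypothetical, and `incremental_range` only applies once a previous UTXO block exists.

```python
INTERVAL = 4096  # parameter from the proposal; it could be chosen differently


def next_commit_height(current_height: int, interval: int = INTERVAL) -> int:
    """First height above current_height that is a multiple of interval."""
    return (current_height // interval + 1) * interval


def covered_up_to(commit_height: int, interval: int = INTERVAL) -> int:
    """Last raw block summarized by the UTXO block committed at commit_height."""
    return commit_height - interval


def incremental_range(commit_height: int, interval: int = INTERVAL) -> tuple:
    """Raw blocks to apply on top of the previous UTXO block (inclusive)."""
    last = covered_up_to(commit_height, interval)
    return (last - interval + 1, last)


first = next_commit_height(418000)
second = next_commit_height(first)
print(first, covered_up_to(first))        # prints 421888 417792
print(second, incremental_range(second))  # prints 425984 (417793, 421888)
```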
When a new node starts and chooses to work with UTXO blocks, only the UTXO data (the last
available UTXO block) plus all the raw blocks since this last UTXO block will be downloaded.
Currently that will result in only about 6 GB of data instead of 74 GB of raw blocks (1.3 GB
of UTXO set and 4 GB for the last 4096 blocks in raw format). The UTXO blockchain will
grow much more slowly, due to the steady filtering out of spent outputs. The advantage of this
method is that it keeps the amount of blockchain data a newly started node has to download
relatively stable, dependent mainly on the number of blocks kept from the raw blockchain (4096
in this proposal) and the block size (1 MB currently). So the total data to be downloaded will
fluctuate between 4 and 8 GB of raw blocks, depending on when the last pruning happened, plus the size of
the last pruned UTXO block (1.3 GB currently).
With the assumed internet speed of 2.5 MB/s it will take roughly 40 minutes to download that
much data (6 GB). CPU usage for signature verification will be reduced dramatically, as most
transaction chains will be reduced to only the UTXOs in the block. So starting a node
within a couple of hours on a low-speed internet connection will become possible, compared to
several days now.
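The download estimate can be reproduced with a short calculation. This sketch reads the assumed speed as 2.5 megabytes per second, which is the reading under which the 40-minute figure holds, and uses decimal units (1 GB = 1000 MB). It covers transfer time only, not validation. The function name `download_minutes` is hypothetical.

```python
def download_minutes(size_gb: float, speed_mb_per_s: float) -> float:
    """Minutes needed to transfer size_gb gigabytes at speed_mb_per_s MB/s."""
    return size_gb * 1000 / speed_mb_per_s / 60


SPEED_MB_PER_S = 2.5   # assumed connection speed from the text
UTXO_BLOCK_GB = 1.3    # current UTXO set size from the text

# Rounded total used in the text (about 6 GB).
print(download_minutes(6.0, SPEED_MB_PER_S))  # prints 40.0

# Range implied by 4 to 8 GB of raw blocks on top of the UTXO block.
low = download_minutes(UTXO_BLOCK_GB + 4.0, SPEED_MB_PER_S)
high = download_minutes(UTXO_BLOCK_GB + 8.0, SPEED_MB_PER_S)
print(f"{low:.0f} to {high:.0f} minutes")
```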
Conclusion
The described method introduces a way to perform on-chain pruning for Bitcoin. It creates
a separate data structure called a UTXO block, which stores only the UTXO set and reduces download times dramatically compared to downloading the full raw blockchain now. Some nodes will
choose to use the light UTXO block in order to use fewer resources and operate faster. At the same
time they will still be fully validating nodes on the network. So this method introduces a new
type of node between current fully validating nodes that have the full blockchain downloaded
and light client nodes (SPV).
Appendix