BlockStor: A Distributed Storage System for Blockchain-based Cryptocurrencies
Arlo Miles
November 16, 2016
Abstract
Current implementations of blockchain-based cryptocurrencies require
a large amount of disk space to store the transaction history of the network,
and require that the transaction history be downloaded before use.
The most widely used solution to these problems is to rely on a trusted
third party to store and verify transactions. This allows resource-constrained
devices to conduct cryptocurrency transactions, but makes cryptocurrency
networks more centralized. We describe a distributed storage system
for blockchain-based cryptocurrencies. We built computer models of the
system and analyzed their results. We found that our protocol is secure
enough and has low enough latency to support a cryptocurrency at
a large scale.

1 Introduction

1.1 Problem Description

Cryptocurrencies allow financial transactions to occur over a computer network
without a trusted third party. However, current implementations of
blockchain-based cryptocurrencies require a large amount of disk space to
store the transaction history of the network, and require the transaction
history to be downloaded before use. The most widely used solution to these
problems is to rely on a trusted third party to store and verify transactions
[1]. This process allows resource-constrained devices to conduct
cryptocurrency transactions, but it makes cryptocurrency networks more
centralized. As it becomes less feasible for each computer to download and
store the entire transaction history, fewer computers will directly connect to
the network, which will make the network less secure and less reliable than it
is currently. No decentralized, scalable solution to this problem has been
proposed. What is needed is a distributed system for storing and verifying
cryptocurrency transactions. Each computer in the network should only be
required to store a fraction of the transactions in the transaction history,
but each computer must be able to verify every new transaction that takes
place in the network, without having to trust data coming from any other
computers.

1.2 Review of Literature

The first description of a cryptocurrency, known as b-money, was published
in 1998 [2], but it was never implemented. Like the cryptocurrencies that
would follow, it described a system in which units of currency were created by
submitting a completed proof-of-work to the network. A proof-of-work is a
problem that is designed to be difficult to solve but easy to check [3]. After
a proof-of-work was submitted, the rest of the computers would increment the
balance of the account that submitted it. Transfers were authenticated using
cryptographic signatures. However, unlike the majority of cryptocurrencies
currently in use, b-money required the storage of account balances, instead of
the transaction history, on each computer.
The next major improvement to this technology was disclosed in 2008, when
the description of a cryptocurrency known as Bitcoin was published [1]. The
major improvement it described was the use of a data structure known as a
block chain to maintain proper ordering of transactions. The rate at which
blocks were created was limited by requiring that a proof-of-work be included in
each one. Each time a block was created, the computer that created it received
a fixed amount of cryptocurrency. Blocks also stored transactions that had
been verified while the proof-of-work was being computed, and a hash of the
previous block to maintain ordering. All blocks, since the first transaction of a
given cryptocurrency took place, were stored on each computer in the network.
This system would theoretically allow each computer to independently verify
every transaction that occurred. A process that allowed computers to use the
network without storing the entire transaction history, known as simplified
payment verification, was also described. However, this process did not allow
these computers to verify each transaction that was submitted to the network.

1.3 Our Investigation

We investigated the possibility of allowing computers in a cryptocurrency
network to verify every transaction submitted to the network while each stores
only a fraction of the transaction history. This approach was taken because
the transaction history for the Bitcoin cryptocurrency network, the most
popular cryptocurrency network in existence, is over 50 gigabytes in size.
This is large enough that it is very inconvenient for most users of
cryptocurrencies to download and store. As a result, most users of
cryptocurrencies use simplified payment verification. This has reduced the
number of computers in the network that can verify every transaction, making
the network more vulnerable to an attack. A process for allowing computers in
the network to verify all transactions without requiring that the entire
transaction history be stored on each computer would make the network less
vulnerable and more scalable.

1.4 Method and Results

A protocol for allowing computers to verify transactions without storing the
transaction history locally was developed. Computer models of the security
and performance of the network were developed. We found that our protocol
is resistant to attack, even when a majority of computers in the network act
maliciously, and that the network continues to perform adequately, even when
the network is very large.

2 Protocol and Computer Model Description

2.1 Protocol Description

Our protocol consists of three messages: potential transaction,
previously-verified transaction, and potential block. A potential transaction
message only contains a transaction. A previously-verified transaction message
contains a transaction, but also contains the cryptographic hash of the block
in which it was included, and a partial Merkle tree so that each client can
independently verify that it was included in the block. A potential block
message contains a block and the previously-verified transaction messages
necessary to verify each transaction in the block. On each computer, there is
a message cache, a database containing only the headers of every block in the
blockchain, and a block storage facility.
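
The inclusion check that a previously-verified transaction message makes possible can be sketched as follows. This is a minimal illustration, not the implementation used in this work. It assumes Bitcoin-style double SHA-256 hashing, a partial Merkle tree encoded as a list of (sibling hash, side) pairs from leaf to root, and a header database reduced to a mapping from block hash to Merkle root. The names `PreviouslyVerifiedTransaction`, `merkle_root_from_path`, and `is_included` are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


def sha256d(data: bytes) -> bytes:
    """Double SHA-256 (assumed here; it is the hash Bitcoin uses for Merkle nodes)."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


@dataclass
class PreviouslyVerifiedTransaction:
    """Hypothetical layout of a previously-verified transaction message."""
    transaction: bytes                     # serialized transaction (format assumed)
    block_hash: bytes                      # hash of the block that included it
    merkle_path: List[Tuple[bytes, bool]]  # (sibling hash, sibling_is_left), leaf to root


def merkle_root_from_path(msg: PreviouslyVerifiedTransaction) -> bytes:
    """Recompute the Merkle root implied by the transaction and its path."""
    node = sha256d(msg.transaction)
    for sibling, sibling_is_left in msg.merkle_path:
        # Concatenate in tree order, then hash to get the parent node.
        node = sha256d(sibling + node) if sibling_is_left else sha256d(node + sibling)
    return node


def is_included(msg: PreviouslyVerifiedTransaction,
                header_merkle_roots: Dict[bytes, bytes]) -> bool:
    """Accept the message only if its block is known and the roots match."""
    expected = header_merkle_roots.get(msg.block_hash)
    return expected is not None and merkle_root_from_path(msg) == expected
```

Because only block headers are consulted, a computer can run this check without holding the block's other transactions.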
When a potential transaction message is received, it is broadcast to the rest
of the network. The message cache is then checked for any previously-verified
transactions that have an output that is used by the potential transaction. If
such a transaction exists, that input of the potential transaction is marked as
funded. The message cache is checked for any previously-verified transactions
that have an input that is also an input of the potential transaction. If such
a transaction exists, the potential transaction is marked as invalid. Then, the
potential transaction is placed in the message cache. If any transactions that
either have an output that is used by the potential transaction or an input that
is also an input of the potential transaction are found in permanent storage,
a previously-verified transaction message is constructed and broadcast to the
network. If all the inputs in the potential transaction are marked as funded, the
potential transaction is marked as funded.
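
The cache checks performed when a potential transaction arrives can be sketched as follows. This is a simplified illustration under several assumptions: an output is identified by a (transaction id, output index) pair, the message cache is an in-memory dictionary, and broadcasting and the lookup in permanent storage are omitted. The names `Transaction`, `PotentialEntry`, `MessageCache`, and `on_potential_transaction` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Outpoint = Tuple[str, int]  # (transaction id, output index); representation assumed


@dataclass
class Transaction:
    txid: str
    inputs: List[Outpoint]  # outputs of earlier transactions that this one spends
    n_outputs: int          # number of outputs this transaction creates


@dataclass
class PotentialEntry:
    tx: Transaction
    funded_inputs: Set[Outpoint] = field(default_factory=set)
    invalid: bool = False

    @property
    def funded(self) -> bool:
        # The transaction is funded once every one of its inputs is funded.
        return set(self.tx.inputs) <= self.funded_inputs


@dataclass
class MessageCache:
    verified: Dict[str, Transaction] = field(default_factory=dict)
    potential: Dict[str, PotentialEntry] = field(default_factory=dict)


def on_potential_transaction(cache: MessageCache, tx: Transaction) -> PotentialEntry:
    entry = PotentialEntry(tx)
    for prev in cache.verified.values():
        # Mark each input that spends an output of a previously-verified transaction.
        for txid, index in tx.inputs:
            if txid == prev.txid and index < prev.n_outputs:
                entry.funded_inputs.add((txid, index))
        # A shared input with a previously-verified transaction is a double spend.
        if set(prev.inputs) & set(tx.inputs):
            entry.invalid = True
    cache.potential[tx.txid] = entry
    return entry
```

For example, a transaction spending both outputs of a cached verified transaction comes back with `funded` set, while one that reuses an already spent input comes back with `invalid` set.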
When a previously-verified transaction message is received, the contained block
hash is checked to verify that the block in which the previously-verified
transaction was contained was accepted into the blockchain. The hash of the
contained transaction is calculated, and the contained partial Merkle tree is
checked. If these checks succeed, the transaction is broadcast to the rest of
the network and placed in the message cache, which is then checked for any
potential transactions that have an input that is an output of the
previously-verified transaction. If such a transaction is found, that input of
the potential transaction is marked as funded. If all the inputs of the
potential transaction are marked as funded, the potential transaction is
marked as funded. The message cache is also checked for potential transactions
that have an input that is also an input of the previously-verified
transaction. If such a transaction is found, it is marked as invalid.
When a potential block message is received, the hash and proof-of-work of the
contained block are checked in the same way they are in current cryptocurrency
clients, and the transactions contained within the block are checked for conflicts.
The included previously-verified transaction messages are checked, and are then
used along with the previously-verified transactions stored in the cache to verify
the transactions in the block. If all the transactions are verified, the potential
block message is broadcast to the rest of the network and added to the message
cache. After a waiting period, whose duration is based on the estimated
number of computers in the network and the average network latency,
the message cache is checked for any previously-verified transaction messages
showing that an input to a transaction in the block has already been used
by a previous transaction. If such a message is found, the block is marked as
invalid. If none are found, the block header is added to the permanent storage.
The transactions within the block may be added to the permanent storage as
well. Whether or not a block is stored is decided randomly, with a probability
of storage based on the rate at which new blocks are generated, the free storage
capacity of the computer, and the block size.
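
The text above names the quantities that determine the storage probability but does not give a formula. The sketch below shows one possible choice, purely as an assumption for illustration: the probability is the free capacity divided by the volume of block data expected over a planning horizon, clamped to the range 0 to 1. The function names, the `horizon_days` parameter, and the formula itself are hypothetical and are not taken from the protocol.

```python
import random


def storage_probability(free_bytes: float, block_size_bytes: float,
                        blocks_per_day: float, horizon_days: float = 365.0) -> float:
    """Assumed rule: store the share of upcoming block data that would fit on disk."""
    expected_bytes = block_size_bytes * blocks_per_day * horizon_days
    if expected_bytes <= 0:
        return 1.0  # nothing expected to arrive, so storing costs nothing
    return max(0.0, min(1.0, free_bytes / expected_bytes))


def should_store(free_bytes: float, block_size_bytes: float, blocks_per_day: float,
                 horizon_days: float = 365.0, rng: random.Random = None) -> bool:
    """Randomly decide whether to keep a block's transactions."""
    rng = rng or random.Random()
    p = storage_probability(free_bytes, block_size_bytes, blocks_per_day, horizon_days)
    return rng.random() < p
```

Under this assumed rule, a computer with little free space stores few blocks, and one with space for the whole horizon stores every block.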

2.2 Computer Model Description

Two computer models of the network were created using Java. The first model
was used to determine the probability that a properly-functioning computer
would connect only to malicious computers, thereby putting itself at risk of
not receiving some messages. An initial set of simulated computers was created, and a fraction of those computers were marked as malicious. Next, new
simulated computers were created, a fraction of which were also marked as
malicious. For each new computer, a random set of computers already in the
network were selected, to which connections from the new computer would be
simulated. The number of properly-functioning computers that only connected
to malicious computers was recorded. A formula for approximating this number
was produced.
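
The first model can be approximated with a short Monte Carlo sketch. The models in this work were written in Java. The Python version below is an independent illustration, not the original code. It assumes that each computer is marked malicious independently with the given probability, and that only newly added computers open connections. The name `simulate_isolation` and its parameters are hypothetical.

```python
import random


def simulate_isolation(initial: int, added: int, frac_mal_initial: float,
                       frac_mal_added: float, connections: int, seed=None) -> int:
    """Count added honest computers whose connections all go to malicious computers."""
    if initial < 1 or connections < 1:
        raise ValueError("need at least one initial computer and one connection")
    rng = random.Random(seed)
    # True marks a malicious computer; the initial set comes first.
    malicious = [rng.random() < frac_mal_initial for _ in range(initial)]
    isolated = 0
    for _ in range(added):
        is_malicious = rng.random() < frac_mal_added
        # Connect to a random set of computers already in the network.
        k = min(connections, len(malicious))
        peers = rng.sample(range(len(malicious)), k)
        if not is_malicious and all(malicious[p] for p in peers):
            isolated += 1
        malicious.append(is_malicious)
    return isolated
```

Running the function over a grid of fractions and connection counts, with several seeds each, gives data that an approximating formula can be compared against.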
The second model was used to determine the amount of time required for a
message to reach some fraction of computers in the network. An initial set of
simulated computers was created, and each simulated computer was connected
with a number of other randomly-selected computers. The broadcast of a message
was simulated, and the number of computers that had received the message
at a given time was recorded. A formula for approximating this number was
produced.
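
The second model can be sketched in the same way. The version below is an illustration in Python rather than the original Java model. It assumes that connections are bidirectional and that every connection has the same latency, so time advances in whole hops. The name `simulate_broadcast` is hypothetical.

```python
import random


def simulate_broadcast(n: int, connections: int, hops: int, seed=None) -> list:
    """Return the fraction of computers reached after 0, 1, ..., hops hops."""
    rng = random.Random(seed)
    neighbors = [set() for _ in range(n)]
    for node in range(n):
        # Sample one extra candidate so the node itself can be dropped.
        candidates = rng.sample(range(n), min(connections + 1, n))
        peers = [p for p in candidates if p != node][:connections]
        for peer in peers:
            neighbors[node].add(peer)
            neighbors[peer].add(node)
    reached = {0}   # computer 0 originates the message
    frontier = {0}
    fractions = [len(reached) / n]
    for _ in range(hops):
        # Every computer that received the message last hop forwards it to its peers.
        frontier = {p for node in frontier for p in neighbors[node]} - reached
        reached |= frontier
        fractions.append(len(reached) / n)
    return fractions
```

Multiplying the hop index by an average connection latency turns the returned list into a curve of fraction reached against time.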

3 Results

3.1 Security

We found that the fraction of computers that were malicious or only connected
to malicious computers had a positive relationship with the initial fraction of
malicious computers in the network and the fraction of malicious computers
added to the network. We observed an inverse relationship with the proportion
of added computers to the number of initial computers and with the number
of connections that each computer made. This number can be approximated
by the formula

$$F = \frac{M_I I + M_A A}{I + A},$$

where $F$ is the fraction of malicious computers and computers that only
connect to malicious computers, $I$ is the initial number of computers in the
network, $A$ is the number of computers added to the network, $M_I$ is the
fraction of initial computers that are malicious, and $M_A$ is the fraction of
additional computers that are malicious. The probability that a
randomly-selected computer is not malicious itself, but only connects to
malicious computers, can be approximated by the formula $P = F^C$, where $C$
is the number of connections made by each computer added to the network.
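
As a worked example, the two approximations can be evaluated directly. The sketch below reads them as F = (M_I I + M_A A) / (I + A) and P = F^C. The function names are hypothetical, and the input values used afterwards are illustrative rather than results from the model.

```python
def malicious_fraction(initial: int, added: int,
                       frac_mal_initial: float, frac_mal_added: float) -> float:
    """F = (M_I * I + M_A * A) / (I + A)."""
    return (frac_mal_initial * initial + frac_mal_added * added) / (initial + added)


def isolation_probability(f: float, connections: int) -> float:
    """P = F ** C: every one of the C connections reaches a malicious computer."""
    return f ** connections
```

With 1000 initial computers of which 10% are malicious, 1000 added computers of which 30% are malicious, and 8 connections per computer, F is 0.2 and P is about 2.56e-6.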

3.2 Latency

We found that the fraction of computers in a network that a message reached
after a period of time could be approximated by the curve

$$F(t) = \frac{1}{1 + e^{-k\left(\frac{t}{L} - x_0\right)}},$$

where $F(t)$ is the fraction of computers in a network that a message has
reached, $t$ is the amount of time since the message was sent, $L$ is the
average connection latency,

$$k = \frac{\ln(N - 1)\,\ln C}{\ln N - \ln 2}, \qquad x_0 = \frac{\ln N - \ln 2}{\ln C},$$

$N$ is the number of computers in the network, and $C$ is the average number
of connections per computer. We found that this approximation is most accurate
when $t$ is close to zero and when $F(t)$ is close to one.
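
The latency curve can be evaluated numerically as a sanity check. The sketch below reads the curve as F(t) = 1 / (1 + exp(-k (t/L - x0))) with k = ln(N - 1) ln C / (ln N - ln 2) and x0 = (ln N - ln 2) / ln C. The function name `reach_fraction` and the restrictions on N and C are assumptions made for this illustration.

```python
import math


def reach_fraction(t: float, latency: float, n: int, connections: float) -> float:
    """Approximate fraction of n computers reached t time units after sending."""
    if n <= 2 or connections <= 1:
        raise ValueError("the approximation needs n > 2 and connections > 1")
    # x0 is the number of hops needed for the message to reach half the network.
    x0 = (math.log(n) - math.log(2)) / math.log(connections)
    k = math.log(n - 1) * math.log(connections) / (math.log(n) - math.log(2))
    return 1.0 / (1.0 + math.exp(-k * (t / latency - x0)))
```

With this reading, k times x0 equals ln(N - 1), so F(0) is exactly 1/N: only the sender holds the message at time zero. F reaches one half at t = x0 L, the time for C^(t/L) to grow to N/2.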

4 Discussion

A distributed storage system for blockchain-based cryptocurrencies has been
described. A computer model was created and the results of that model were
used to construct a mathematical approximation. We found that our protocol
was secure enough and had low enough latency to support a cryptocurrency at
large scale.

4.1 Sybil Attack Prevention

The least resource-intensive attack that would allow a malicious party to gain
control of a significant part of the network is a Sybil attack, in which a single
computer claims to be multiple independent computers. This sort of attack can
be prevented by requiring each computer to periodically submit a completed
proof-of-work to each computer connected to it. This proof-of-work may be
different than the one used in the cryptocurrency. Such a requirement would
increase the resources required to conduct a Sybil attack.
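
A peer-to-peer proof-of-work of this kind can be sketched as a hashcash-style puzzle. The scheme below is an assumption chosen for illustration and is not specified by the protocol. A peer issues a challenge, and the connecting computer must find a nonce such that the SHA-256 hash of the challenge and nonce starts with a given number of zero bits. The function names and the 8-byte nonce encoding are hypothetical.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def check_proof(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Cheap verification: a single hash evaluation."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def solve(challenge: bytes, difficulty_bits: int) -> int:
    """Expensive search: about 2 ** difficulty_bits hash evaluations on average."""
    nonce = 0
    while not check_proof(challenge, nonce, difficulty_bits):
        nonce += 1
    return nonce
```

A single machine posing as many independent computers would have to solve one such puzzle per identity and per connected peer, so the cost of the attack grows with the number of identities claimed.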

4.2 Implementation

Our protocol was originally designed to be separate from the underlying cryptocurrency protocol. Implementing our protocol in this way would allow existing
cryptocurrency clients to continue being used. However, our protocol could also
be integrated into the existing cryptocurrency protocol. Although this would
require an upgrade of cryptocurrency client software, it may improve the integration of our protocol with existing cryptocurrency networks.

References
[1] S. Nakamoto, Bitcoin: A Peer-to-Peer Electronic Cash System,
Bitcoin - Open source P2P money, 31-Oct-2008. [Online]. Available:
https://bitcoin.org/bitcoin.pdf. [Accessed: 27-Sep-2016].
[2] W. Dai, b-money, 1998. [Online]. Available:
http://www.weidai.com/bmoney.txt. [Accessed: 27-Sep-2016].
[3] M. Jakobsson and A. Juels, Proofs of Work and Bread Pudding Protocols,
Secure Information Networks, pp. 258-272, 1999.
