Analyzing the Bitcoin Blockchain

Bitcoin is the first modern cryptocurrency, and still the most important in many respects. One interesting property of Bitcoin and most blockchain-based cryptocurrencies is that all transactions ever made are, by definition, stored in the blockchain in a way that anyone can access.
On this page, we describe the content of the blockchain to provide an intuitive understanding of what is going on in Bitcoin.
Since there are many misconceptions about how Bitcoin works and about its nature, it is important to first stress one point: trading activity is not present in the blockchain. Trading BTC against another currency takes place on platforms (websites) called exchanges, which handle these trades internally between their customers' accounts, as simple entries in their databases. Thus, exchanging BTC against dollars or Tether on such a platform is not really a Bitcoin transaction. In reality, the bitcoins you are buying or selling do not belong to you in Bitcoin terms: they belong to the wallet/account of the exchange platform, and the fact that they are yours is only recorded in the platform's database.
What we study here are therefore only real Bitcoin transactions, written in the blockchain, up to January 2021.

BITUNAM presentation

BITUNAM is a research project funded by the ANR, the French National Research Agency. It stands for Bitcoin User Network Analysis and Mining, and its objective is to better understand the nature of the interactions between actors of the Bitcoin blockchain.
Who are we
Remy Cazabet

Transactions statistics

Number of transactions

A first property we can look at is the evolution of the number of Bitcoin transactions present in the Blockchain by month. The total number of transactions at the end of the dataset is 609,437,067.

Inputs/Outputs

Bitcoin transactions have from zero (mining) to several inputs, and one or several outputs (more on that later). This is the evolution of the average number of inputs and outputs per transaction.
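As a rough sketch of how such a monthly average could be computed, assume each transaction is available as a record with a timestamp, a list of inputs and a list of outputs. The field names `time`, `inputs` and `outputs` are hypothetical and do not describe the format of the dataset used here.

```python
from collections import defaultdict
from datetime import datetime, timezone


def monthly_average_io(transactions):
    """Return {"YYYY-MM": (avg_inputs, avg_outputs)}.

    Each transaction is a dict with hypothetical keys:
      "time"    - Unix timestamp of the block containing the transaction
      "inputs"  - list of inputs (empty for a mining transaction)
      "outputs" - list of outputs
    """
    # month -> [number of transactions, total inputs, total outputs]
    totals = defaultdict(lambda: [0, 0, 0])
    for tx in transactions:
        when = datetime.fromtimestamp(tx["time"], tz=timezone.utc)
        entry = totals[when.strftime("%Y-%m")]
        entry[0] += 1
        entry[1] += len(tx["inputs"])
        entry[2] += len(tx["outputs"])
    return {
        month: (n_in / n_tx, n_out / n_tx)
        for month, (n_tx, n_in, n_out) in totals.items()
    }
```

Mining transactions count as transactions with zero inputs here, so they pull the average number of inputs down slightly.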

BTC sent by month

Let's now observe the evolution of the total amount of bitcoin sent by month (the sum of the output values).

Bitcoin Exchange rate

Bitcoin Price

The value of Bitcoin in dollars varies greatly over time. Here is the average Bitcoin value for each month. We can clearly observe some bubbles. It is often interesting to observe correlations between those prices and other properties.

USD sent by month

Because the price of Bitcoin varies so much over time, the amount of BTC sent gives only a partial picture. To get a better sense of how much value is exchanged, we can plot the total value exchanged in USD, computed at the time of each transaction. We can observe that it is much more correlated with changes in the Bitcoin price than with changes in the amount of bitcoin exchanged.
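The conversion "at the time of the transaction" can be sketched as follows. This is only an illustration under stated assumptions: the input layout (pairs of timestamp and BTC amount) and the daily price table `usd_price_by_day` are hypothetical, and the granularity of the price data actually used for the plot is not specified here.

```python
from collections import defaultdict
from datetime import datetime, timezone


def usd_sent_by_month(transactions, usd_price_by_day):
    """Sum the USD value of the bitcoins sent, per month.

    transactions: iterable of (unix_time, btc_sent) pairs, where btc_sent
        is the sum of the output values of one transaction (assumed layout).
    usd_price_by_day: dict mapping "YYYY-MM-DD" to a BTC price in USD
        (hypothetical price table).
    """
    totals = defaultdict(float)
    for unix_time, btc_sent in transactions:
        when = datetime.fromtimestamp(unix_time, tz=timezone.utc)
        # Convert with the price of the day on which the transaction occurred.
        price = usd_price_by_day[when.strftime("%Y-%m-%d")]
        totals[when.strftime("%Y-%m")] += btc_sent * price
    return dict(totals)
```

Converting each transaction with the price of its own day, rather than multiplying the monthly BTC total by a single monthly price, matters in months where the price moves sharply.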

Mining

Number of mining transactions

The number of mining transactions stays mostly constant over time because the mining task is controlled automatically by the Bitcoin protocol. Since each successful mining operation appends a block to the blockchain, i.e., validates a set of transaction requests, the objective is to have such a validation about every 10 minutes.
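The 10-minute target directly gives the order of magnitude to expect on the plot. A small back-of-the-envelope computation, assuming blocks arrive exactly on target (in practice the observed interval fluctuates around it):

```python
TARGET_BLOCK_INTERVAL_SECONDS = 600  # about 10 minutes, as stated above


def expected_blocks(days):
    """Expected number of blocks (hence mining transactions) in `days` days."""
    return days * 24 * 3600 / TARGET_BLOCK_INTERVAL_SECONDS


print(expected_blocks(1))   # 144.0
print(expected_blocks(30))  # 4320.0
```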

Individual mining reward: BTC

When a miner mines a block, it receives a reward composed of 1) newly minted coins (blue), and 2) fees paid by the customers who send bitcoins (red). The amount of newly minted coins is controlled by the Bitcoin protocol, and we can clearly see the effect of the successive halvings, programmed to progressively reduce the amount of newly created bitcoins. Fees are regulated by a market, not by the Bitcoin protocol: if more customers want to make transactions, fees tend to increase, since each block can only contain a limited number of transactions. In 2021, the newly minted part of the reward is fixed at 6.25 BTC, but we observe an average mining gain of 7 BTC, i.e., fees are becoming increasingly important in the mining economy.
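The halving schedule can be written in a few lines. The sketch below relies on two protocol constants not given in the text above: an initial reward of 50 BTC and a halving every 210,000 blocks. The helper `fee_share` is only an illustration of the arithmetic behind the 6.25 BTC versus 7 BTC comparison.

```python
HALVING_INTERVAL = 210_000                   # blocks between two halvings
INITIAL_SUBSIDY_SATOSHI = 50 * 100_000_000   # 50 BTC expressed in satoshis


def block_subsidy_btc(height):
    """Newly minted coins for the block at `height`, in BTC."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:  # the integer subsidy has reached zero long before this
        return 0.0
    # Each halving divides the subsidy by two (integer shift on satoshis).
    return (INITIAL_SUBSIDY_SATOSHI >> halvings) / 100_000_000


def fee_share(average_reward_btc, subsidy_btc):
    """Fraction of the average block reward that comes from fees."""
    return (average_reward_btc - subsidy_btc) / average_reward_btc


print(block_subsidy_btc(0))        # 50.0
print(block_subsidy_btc(630_000))  # 6.25
print(fee_share(7.0, 6.25))        # about 0.107
```

With an average gain of 7 BTC for a 6.25 BTC subsidy, fees account for roughly 11% of what miners receive per block.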

Total mining reward: USD

A well-known problem of Bitcoin is its huge waste of electric power, often compared with the consumption of a medium-sized country. This consumption is the consequence of miners competing with ever more powerful hardware for a chance to win mining rewards. Again, this is a market: if all miners combined receive 1 billion USD over a month, then what they collectively spend on electricity and hardware has to be around 50% to 90% of this value for mining to remain profitable. If the Bitcoin price were to remain stable, Bitcoin's energy waste would naturally decrease until the end of Bitcoin emission, as successive halvings shrink the reward.

Fees

Fees by transaction - USD

Each transaction must pay a transaction fee to miners. The average transaction fee in dollars varies greatly over time. Beware, this value might be misleading, since fees vary according to some properties of the transaction (its weight in bits, which depends on various factors).

% of transaction value paid as fee

Another way to understand fees is to compute the average percentage of the transaction value paid as fee. For comparison, the average credit card fee is 2% of each transaction (hidden from you). We compute two variants of this percentage: 1) Red: the average of the fee fraction per transaction, 2) Blue: the fraction of all amounts sent that is paid as fees, for each month. Fees are mostly independent of the values sent, so low-amount transactions pay expensive fees (about 2%), while large transactions pay negligible fees (less than 0.01%).
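The difference between the two variants is the difference between a mean of ratios and a ratio of sums. A minimal sketch, assuming the transactions of one month are given as pairs of value sent and fee in the same unit, and that the fee fraction is defined as fee divided by value sent (the exact definition used for the plot is an assumption here):

```python
def fee_percentages(transactions):
    """Return (mean_of_ratios, ratio_of_sums), both in percent.

    transactions: list of (value_sent, fee) pairs for one month,
        both expressed in the same unit (assumed layout).
    """
    # Ignore zero-value transactions, for which the ratio is undefined.
    txs = [(value, fee) for value, fee in transactions if value > 0]

    # Variant 1 (red): average of the per-transaction fee fraction.
    mean_of_ratios = 100 * sum(fee / value for value, fee in txs) / len(txs)

    # Variant 2 (blue): total fees divided by total value sent.
    total_value = sum(value for value, _ in txs)
    total_fee = sum(fee for _, fee in txs)
    ratio_of_sums = 100 * total_fee / total_value

    return mean_of_ratios, ratio_of_sums
```

For two transactions paying the same fee of 2, one sending 100 and one sending 10,000, the first variant gives 1.01% while the second gives about 0.04%: the first is dominated by small transactions, the second by large ones.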

From addresses to actors

Discovering actors/wallets

To go deeper in the analysis of Bitcoin transactions, we need to think in terms of actors. In Bitcoin, transactions are made from bitcoin addresses to bitcoin addresses. But these addresses do not correspond to individual actors: anyone can create as many addresses as wanted, instantaneously, at no cost. As a consequence, to guarantee some level of anonymity, actors avoid reusing addresses, and instead create a new one for each transaction. If 10 people want to send me some coins, I will provide them with 10 different addresses. Thus, in the blockchain, it is not possible to link those different addresses as belonging to the same person, the same actor.
But this anonymity has some limits: when I want to spend those coins, I will typically plug several of them as inputs of the same transaction to make a payment. For instance, if I received 10 transactions of 1 BTC and I want to make a payment of 4 BTC, I will create one transaction with 4 of the outputs I received as inputs. At this point, it is possible to make the link between them, and to conclude that those 4 addresses belong to the same actor. The process is iterative: if addresses A and B appear as inputs of the same transaction, and later B and C do, then I know that A, B and C all belong to the same actor.
For several reasons, this approach is quite effective, at least to identify major actors such as companies and exchanges. For simplicity, we can consider that this approach works well for actors that consider it more important to optimize transaction costs than to hide their transactions. That is for instance the case of most public wallet applications. See this article for details.
Note that I prefer the term actor to wallet, because in some cases actors can change their wallet or use multiple ones, and the term "wallet" might be misleading for companies/services that use complex, custom transaction management.
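The iterative merging described above (A with B, then B with C, hence A, B and C together) maps naturally onto a union-find (disjoint-set) structure. The following is a minimal sketch of that idea, not the pipeline actually used for this study; the input layout, one list of input addresses per transaction, is an assumption.

```python
def cluster_addresses(transactions):
    """Group addresses that appear together as inputs of a transaction.

    transactions: iterable of lists of input addresses, one list per
        transaction (assumed layout).
    Returns a list of sets of addresses, one set per inferred actor.
    """
    parent = {}  # address -> parent address in the union-find forest

    def find(address):
        parent.setdefault(address, address)
        root = address
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every visited address directly to the root.
        while parent[address] != root:
            parent[address], address = root, parent[address]
        return root

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        if not inputs:  # mining transactions have no input
            continue
        first = inputs[0]
        find(first)  # register addresses that are always used alone
        for other in inputs[1:]:
            union(first, other)

    clusters = {}
    for address in parent:
        clusters.setdefault(find(address), set()).add(address)
    return list(clusters.values())
```

For example, `cluster_addresses([["A", "B"], ["B", "C"], ["D"]])` returns two clusters, one containing A, B and C, and one containing only D.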

Actors

Addresses by actors

The distribution of the number of addresses per actor follows a line on a log-log plot, which is typical of a power-law distribution. There is no "typical scale": most actors (tens of millions) have fewer than 3 addresses, while a few actors have tens of millions of addresses. Note that the number of addresses per actor must be understood as a lower bound: we know that an identified actor has at least those addresses, but it could have many more that we could not identify.
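To illustrate what is plotted, here is a small sketch that turns a list of actor sizes (number of addresses per actor) into the points of such a log-log plot. The function names are hypothetical, and no fitting of the power-law exponent is attempted.

```python
import math
from collections import Counter


def size_distribution(addresses_per_actor):
    """Return sorted (n_addresses, n_actors) pairs."""
    return sorted(Counter(addresses_per_actor).items())


def loglog_points(distribution):
    """Coordinates for a log-log plot.

    A power law shows up as points lying roughly on a straight line.
    """
    return [(math.log10(size), math.log10(count)) for size, count in distribution]
```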

Transaction inputs/outputs

Distribution of input addresses

We observe that most transactions have a single input address. This means that these transactions do not allow us to create clusters of addresses. Note however that those singleton addresses might be reused, and thus might be re-identified elsewhere. It is for instance common for companies to use a "peeling" strategy, i.e., to make many successive payments from one address, sending the change back to the same address.

Distribution of output addresses

As expected, most transactions have 2 output addresses, probably corresponding to the payment and the "change address". This information confirms the importance of change address identification for tracking bitcoin users.