Forecasting Bitcoin Price Based on Blockchain Information Using Long-Short Term Method

Since its founding in 2008, Bitcoin (financial code: BTC) has emerged as a leading digital currency by market cap and continues to attract the attention of investors and policymakers. In recent years, BTC has shown high price volatility, with a substantial increase in 2016 followed by a significant decline in 2018. Unlike stock markets, BTC trades 24/7 and has no closing period, so anyone can trade it at any time. However, this flexibility carries investment risk. This research attempts to forecast BTC's price using blockchain information in order to reduce that risk. We employ Long-Short Term Memory (LSTM), an artificial Recurrent Neural Network (RNN) architecture designed to avoid the long-term dependency problem. The data used are BTC's price and blockchain information from August 4, 2018, to January 21, 2020. The model with 20 neurons and 500 epochs has the smallest MSE value, and its predictions reach an accuracy rate of 91.07%.


INTRODUCTION
Technology is presented as something new as it pushes for increasing change. In reality, information technology (IT) processes data, collects information, stores collected material, accumulates knowledge, and makes communication easier, and it plays an important role in many aspects of the daily operations of the business world today. In response to technological change, both criminals and society find new ways to develop. Therefore, technology has been introduced into many financial mechanisms, such as cryptocurrency (Bakar & Rosbi, 2017).
Bitcoin is a cryptocurrency that was founded in 2008. BTC has emerged as a leading digital currency by market cap and continues to attract the attention of investors and policymakers. In recent years, BTC has shown high price volatility, with a substantial increase in 2016 followed by a significant decline in 2018 (Al-Yahyaee, 2019). The first bitcoin transaction occurred in January 2009. A little more than two years later, various reports estimated that more than 6.5 million bitcoins were in circulation, with approximately 10,000 users. Unlike stock markets, Bitcoin and other cryptocurrencies can be traded at any time because they have no closing period. However, this flexibility carries risk. Forecasting BTC's price from blockchain information can reduce that risk, provided the price is predicted accurately.
One method that can be used is Long-Short Term Memory (LSTM), a development of the Recurrent Neural Network (RNN). A common LSTM unit consists of a cell, an input gate, an output gate, and a forget gate. The cell remembers values over arbitrary time intervals, and the gates regulate the flow of information into and out of the cell. The LSTM model filters information through this gate structure to maintain and update the state of the memory cells. Each memory cell has three sigmoid layers and one tanh layer (Qiu, Wang, & Zhou, 2020).
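To make the gate structure concrete, the following is a minimal sketch of a single LSTM time step in plain Python. It assumes the standard formulation with three sigmoid gates (forget, input, output) and one tanh candidate layer. The function name `lstm_step`, the weight layout in `params`, and the toy numbers are illustrative assumptions, not the configuration used in this study.

```python
import math


def sigmoid(x):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell (scalar input and state).

    params maps each layer name ("f", "i", "o", "g") to a tuple
    (w_x, w_h, b): input weight, recurrent weight, and bias.
    This layout is a hypothetical choice made for illustration.
    """
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("f"))    # forget gate: how much of the old cell state to keep
    i = sigmoid(pre("i"))    # input gate: how much new information to admit
    o = sigmoid(pre("o"))    # output gate: how much of the cell state to expose
    g = math.tanh(pre("g"))  # tanh layer: candidate values for the cell state

    c = f * c_prev + i * g   # updated memory cell
    h = o * math.tanh(c)     # new hidden state (the cell's output)
    return h, c


# Toy weights, chosen arbitrarily for the example.
toy_params = {name: (0.5, 0.1, 0.0) for name in ("f", "i", "o", "g")}

h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.6]:    # a short, already normalized input sequence
    h, c = lstm_step(x, h, c, toy_params)
```

Because the output gate lies in (0, 1) and tanh lies in (-1, 1), the hidden state always stays inside (-1, 1), which is one reason the inputs are usually rescaled before training.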
LSTM models applied to financial data are able to produce a high coefficient of determination with a small mean square error. The models used include LSTM with wavelet denoising and the Gated Recurrent Unit (GRU) neural network (Qiu, Wang, & Zhou, 2020) (Fischer & Krauss, 2018) (Aldi, Jondri, & Aditsania, 2018). LSTM implementations on electricity load and hydrology (rainfall) data also achieve high accuracy (Kratzert, Klotz, Brenner, Schulz, & Herrnegger, 2018) (Zheng, Xu, Zhang, & Li, 2017). LSTM can be used for univariate or multivariate time series data. For forecasting the highly fluctuating bitcoin price as a univariate series, LSTM is able to compete with the accuracy of the ARIMA model (McNally, Roche, & Caton, 2018). However, in one case an LSTM combined with an AR model was able to compete with the performance of an LSTM on its own (Wu, Lu, Ma, & Lu, 2018).
Blockchain information such as the difficulty level, the average number of blocks, and transactions per block can be used to forecast the bitcoin price. The bitcoin price is also affected by macro-financial indicators such as stock indices and the gold price. Because bitcoin is valued in fiat currencies (USD, GBP, JPY, etc.), currency exchange rates affect the ups and downs of the bitcoin price (Jang & Lee, 2018) (Sriwiji & Primandari, 2019).

Data Source
This study uses secondary data. The data were obtained from several websites: www.blockchain.com for bitcoin blockchain information and www.investing.com for historical bitcoin prices (in USD).

Research Variables
This research uses 26 variables. The operational definition of a research variable explains the variable in terms of the indicators that make it up. It can be seen in

Data Analysis Methods
The steps carried out in this study are as follows:
1. Preprocess the data by checking for missing values.
2. Perform a descriptive analysis of the daily Bitcoin data.
3. Conduct the multivariate LSTM analysis with the following steps:
• Calculate the sigmoid and tanh values using the equations (Ma, 2015):
$$\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
where $x$ is the input data and $e$ is a constant (Euler's number).
• Normalize the research data by converting the actual values into the interval [0,1] using min-max scaling (Aldi, Jondri, & Aditsania, 2018):
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x'$ is the normalized value, $x$ is the value to be normalized, $x_{\min}$ is the minimum value of the data, and $x_{\max}$ is the maximum value of the data.
• Split the data into training data and testing data.
• Determine the number of neurons and epochs.
• Forecast the data.
• Denormalize the data using the formula:
$$x = x'\,(x_{\max} - x_{\min}) + x_{\min}$$
where $x$ is the denormalized result, $x'$ is the value to be denormalized, $x_{\min}$ is the minimum value of the data, and $x_{\max}$ is the maximum value of the data.
4. Compare the MSE, RMSE, and MAPE values from each experiment using the formulas (Budiman, 2016):
$$\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2, \qquad \mathrm{RMSE} = \sqrt{\mathrm{MSE}}, \qquad \mathrm{MAPE} = \frac{100\%}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right|$$
where $y_t$ is the actual value, $\hat{y}_t$ is the predicted value, and $n$ is the number of time periods predicted.
Below is a flowchart of the analysis.
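The normalization and denormalization steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the helper names `min_max_scale` and `min_max_inverse` are hypothetical, and the prices are made-up values, not data from this study.

```python
def min_max_scale(values):
    """Scale values into [0, 1]; also return the min and max that were used."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for constant data")
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled, lo, hi


def min_max_inverse(scaled, lo, hi):
    """Map values from [0, 1] back to the original scale (denormalization)."""
    return [s * (hi - lo) + lo for s in scaled]


# Made-up prices for illustration only.
prices = [3300.0, 5200.0, 8100.0, 13000.0, 9400.0]

scaled, lo, hi = min_max_scale(prices)
restored = min_max_inverse(scaled, lo, hi)  # matches prices up to rounding
```

The minimum and maximum must be stored, because the same two numbers are needed later to convert the network's normalized forecasts back into prices. A common variant, which the text neither confirms nor rules out, computes them from the training portion only so that no information from the testing data leaks into the scaling.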

Descriptive Analysis
Descriptive analysis of the daily Bitcoin data was carried out to give a general description of Bitcoin from April 4th, 2018 to January 4th, 2020. Figure 2 shows that the Bitcoin price tended to increase over this period. On June 20th, 2019, bitcoin reached its highest price in the period, US$ 13,063.80. This price is still low compared to the bitcoin price in 2017, when it managed to reach US$ 19,533 per coin. The lowest bitcoin price was around US$ 3,300 in early December 2018, a decrease of about 83% from the 2017 high (Farras, 2018).

Multivariate Long-Short Term Memory Analysis
The research data are split into training and testing data, with about 80% (420 observations) for training and about 20% (107 observations) for testing. The network is then formed with 10 input variables and 1 output layer. The numbers of neurons in the hidden layer tried in the experiments are 10, 20, 30, 40, and 50, and the numbers of epochs are 100, 500, and 1000. The best combination of neurons and epochs is the one with the smallest loss value, where loss here means the MSE value.
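The split and the experiment grid described here can be sketched as follows. This is a minimal illustration: the helper names `chronological_split` and `pick_best` are hypothetical, the observations are placeholders, and no model is trained. The sketch assumes the split is chronological (no shuffling), which is the usual choice for time series but is not stated explicitly in the text.

```python
from itertools import product


def chronological_split(series, n_train):
    """Split a time-ordered sequence without shuffling."""
    return series[:n_train], series[n_train:]


def pick_best(losses):
    """Return the (neurons, epochs) pair with the smallest loss (MSE).

    losses maps (neurons, epochs) tuples to MSE values, for example
    as collected from one training run per combination.
    """
    return min(losses, key=losses.get)


# Placeholder for the 527 daily observations (420 + 107 in the text).
observations = list(range(527))
train, test = chronological_split(observations, n_train=420)
train_share = len(train) / len(observations)  # about 0.797, i.e. roughly 80%

# Every combination of hidden-layer size and epoch count to be tried.
neuron_options = [10, 20, 30, 40, 50]
epoch_options = [100, 500, 1000]
grid = list(product(neuron_options, epoch_options))  # 15 experiments
```

Each of the 15 combinations would be trained once and its MSE recorded in a dictionary, after which `pick_best` selects the combination with the smallest loss.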
Parameters are optimized using the Adam optimizer, which uses a bias correction technique. There is no rule for determining the number of neurons and epochs, so they are obtained through simulation to find the best values for forecasting the time series. The following table shows the results for several numbers of neurons and epochs. With too few epochs, the network that is formed is too general, meaning that its ability to recognize the pattern is almost none. Meanwhile, too many epochs can lead to overfitting, where the network is too specific to the training data.
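The bias correction in Adam can be illustrated with a short sketch of the update rule for a single parameter. The function name `adam_step` is hypothetical, the default hyperparameters are the commonly cited ones rather than values reported in this study, and the quadratic objective is a toy problem chosen only to show the behavior.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the step count, starting at 1."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)               # bias correction: m and v start at 0,
    v_hat = v / (1 - beta2 ** t)               # so early estimates are scaled back up
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Toy problem (an assumption for illustration): minimize (theta - 3)^2.
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.05)
```

At the first step the corrected estimates reduce to the gradient and its square, so the update has magnitude close to `lr` regardless of the gradient's scale. Without the correction, the zero-initialized averages would distort the earliest steps.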

Forecasting Data
The model obtained from the training process is tested using the testing data. Whether the model performs well can be judged from the plot in Figure 3.

Figure 3. Comparison Plot between Actual and Prediction Data
Figure 3 shows that the pattern of the predicted data follows the pattern of the actual data, which means the model gives a good forecasting (prediction) result. The actual data are shown as the blue line and the predicted data as the orange line.
The error is measured by the Mean Absolute Percentage Error (MAPE). On the testing data, the LSTM model has a MAPE of 8.93%, which means the model accuracy is 91.07%.
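The error measures used in this study, and the step from MAPE to accuracy, can be sketched as follows. The function names are illustrative, and the numbers in the example are made up to keep the arithmetic easy to follow; they are not results from this study.

```python
import math


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up numbers for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

error_pct = mape(actual, predicted)  # (10% + 5% + 0%) / 3 = 5%
accuracy_pct = 100.0 - error_pct     # accuracy as reported in the text: 100% - MAPE
```

With the reported MAPE of 8.93%, the same subtraction gives 100% - 8.93% = 91.07%.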

CONCLUSION
The Bitcoin price tended to increase over the study period. The highest bitcoin price in the period, US$ 13,063.80, was reached on June 20th, 2019. This price is still low compared to the bitcoin price in 2017, when it managed to reach US$ 19,533 per coin. The lowest bitcoin price was around US$ 3,300 in early December 2018, a decrease of about 83% from the 2017 high.
In data preprocessing, the dependent variable (bitcoin price) and the independent variables (blockchain information) are normalized by min-max scaling. The data are then split into training and testing data with a ratio of 80% for training and 20% for testing. The optimal number of neurons and epochs was tuned by simulation, choosing the combination with the smallest MSE value. In this research we used 20 neurons and 500 epochs. The LSTM model's error on the testing data is a MAPE of 8.93%, which means the accuracy of the model is 91.07%.