Abstract:
The application of machine learning to stock prediction has attracted considerable attention in
recent years. A large body of research has been conducted in this area, and multiple
existing results have shown that machine learning methods can be successfully used
for stock prediction using stocks’ historical data. Most of these existing approaches
have focused on short-term prediction using stocks’ historical price and technical
indicators. In this thesis, we prepared 22 years’ worth of quarterly stock financial data
and investigated three machine learning algorithms: Feed-forward Neural Network
(FNN), Random Forest (RF) and Adaptive Neuro-Fuzzy Inference System (ANFIS) for
stock prediction based on fundamental analysis. In addition, we applied RF-based
feature selection and bootstrap aggregation in order to improve model performance and
aggregate predictions from different models. Our results show that the RF model achieves
the best prediction results, and that feature selection improves the test performance of
FNN and ANFIS. Moreover, the aggregated model outperforms all baseline models as
well as the benchmark DJIA index by an acceptable margin for the test period. Our
findings demonstrate that machine learning models can be used to aid fundamental
analysts in making stock investment decisions.

In recent years, a variety
of research fields, including finance, have begun to place great emphasis on machine
learning techniques because of their broad ability to model complex
problems. In contrast to the traditional linear regression scheme that is usually used to
describe the relationship between a stock’s forward return and company characteristics,
the field of finance has seen the rapid development of tree-based algorithms and
neural network paradigms for modeling complex stock dynamics. These nonlinear
methods have proved to be effective in predicting stock prices and selecting stocks that
can outperform the general market. This article implements and evaluates the
robustness of the random forest (RF) model in the context of a stock selection
strategy. The model is trained on stocks in the Chinese stock market, and two types of
feature spaces, a fundamental/technical feature space and a pure momentum feature
space, are adopted to forecast the price trend in the long run and the short run,
respectively. Both feature paradigms have produced remarkable excess
returns during the past five out-of-sample years, with Sharpe ratios
of 2.75 and 5 for the portfolio net value of the multi-factor space strategy
and the momentum space strategy, respectively. Although the excess return has weakened
in recent years with respect to the multi-factor strategy, our findings point to a less
efficient market that is far from equilibrium.
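
The Sharpe ratios quoted above are computed on the portfolio net value. As a rough illustration of how such a figure is typically obtained, the following is a minimal sketch, not the procedure used in the study. The function names, the zero risk-free rate and the 252-period annualization factor are all assumptions introduced here.

```python
import math
from statistics import mean, stdev


def returns_from_net_values(net_values):
    """Convert a portfolio net-value series into per-period simple returns."""
    return [later / earlier - 1.0
            for earlier, later in zip(net_values, net_values[1:])]


def annualized_sharpe(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a series of per-period simple returns.

    risk_free_rate is an annual rate (assumed 0 by default) and is spread
    evenly across periods; periods_per_year=252 assumes daily data.
    """
    if len(returns) < 2:
        raise ValueError("need at least two returns to estimate volatility")
    rf_per_period = risk_free_rate / periods_per_year
    excess = [r - rf_per_period for r in returns]
    volatility = stdev(excess)  # sample standard deviation of excess returns
    if volatility == 0:
        raise ValueError("volatility is zero; Sharpe ratio is undefined")
    # Mean excess return per unit of volatility, scaled to a yearly horizon.
    return mean(excess) / volatility * math.sqrt(periods_per_year)
```

Given a net-value series `nv`, the ratio would be obtained as `annualized_sharpe(returns_from_net_values(nv))`. The result depends on the sampling frequency and on the risk-free rate chosen, so figures computed under different conventions are not directly comparable.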