# Stock Market Prediction Using Machine Learning: A Step-by-Step Implementation

shikhar mishra Python 2022-1-10 23:42

This article was published as a part of the Data Science Blogathon

Machine learning algorithms are widely used by organizations for stock market prediction. This article walks through a simple Python implementation that analyzes and forecasts the stock price of Microsoft Corporation (MSFT) using an LSTM (Long Short-Term Memory) neural network.

Before implementing the program to predict stock market values, let us look at the data we will be working with. We will examine the stock price of Microsoft Corporation (MSFT) as reported by NASDAQ (National Association of Securities Dealers Automated Quotations). The stock price data is supplied as a comma-separated values (.csv) file, which can be opened and inspected in Excel or any other spreadsheet program.

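MSFT stock is listed on NASDAQ, and its value is updated on every working day of the stock market. Note that the market does not allow trading on Saturdays and Sundays, so there are gaps between some dates. For each date, the data records the opening value of the stock, its highest and lowest values on that day, and the closing value at the end of the day.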

```python
# Importing the libraries
import pandas as pd
import numpy as np
# Jupyter notebook magic; remove this line when running as a plain script
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_squared_error, r2_score
from sklearn import linear_model
import keras.backend as K
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout
from keras.callbacks import EarlyStopping
from keras.utils import plot_model
```

Using pandas, we load the stock data from the local system as a comma-separated values (.csv) file and store it in a pandas DataFrame. Then we examine the data.
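A minimal sketch of this loading step, assuming the file has a `Date` column followed by the six price and volume columns of the MSFT data. To stay self-contained it reads the first five rows from an in-memory string; with a real file you would pass its path to `pd.read_csv` instead. The names `csv_text` and `sample_df` are chosen here for illustration.

```python
import io
import pandas as pd

# First five rows of the MSFT data, inlined so the sketch runs without a file.
# With a real file, pass its path to pd.read_csv instead of io.StringIO(csv_text).
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
1990-01-02,0.605903,0.616319,0.598090,0.616319,0.447268,53033600
1990-01-03,0.621528,0.626736,0.614583,0.619792,0.449788,113772800
1990-01-04,0.619792,0.638889,0.616319,0.638021,0.463017,125740800
1990-01-05,0.635417,0.638889,0.621528,0.622396,0.451678,69564800
1990-01-08,0.621528,0.631944,0.614583,0.631944,0.458607,58982400
"""

# Parse the Date column as datetimes and use it as the index
sample_df = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

print("Dataframe Shape: ", sample_df.shape)
print("Null Value Present: ", sample_df.isnull().values.any())
```

Using the date as the index keeps the six numeric columns separate from the timestamps, which is convenient for plotting and for time-ordered splitting later on.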

```python
# Get the dataset
# `df` is the DataFrame loaded from the .csv file, indexed by date

# Print the shape of the DataFrame and check for null values
print("Dataframe Shape: ", df.shape)
print("Null Value Present: ", df.isnull().values.any())
```

Output:

```
Dataframe Shape: (7334, 6)
Null Value Present: False
```

| Date | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|
| 1990-01-02 | 0.605903 | 0.616319 | 0.598090 | 0.616319 | 0.447268 | 53033600 |
| 1990-01-03 | 0.621528 | 0.626736 | 0.614583 | 0.619792 | 0.449788 | 113772800 |
| 1990-01-04 | 0.619792 | 0.638889 | 0.616319 | 0.638021 | 0.463017 | 125740800 |
| 1990-01-05 | 0.635417 | 0.638889 | 0.621528 | 0.622396 | 0.451678 | 69564800 |
| 1990-01-08 | 0.621528 | 0.631944 | 0.614583 | 0.631944 | 0.458607 | 58982400 |

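```python
# Plot the true Adj Close value
df['Adj Close'].plot()
```

The target variable is the Adj Close value. The features used to predict it are:

• Open
• High
• Low
• Volume

```python
# Set the target variable
output_var = pd.DataFrame(df['Adj Close'])
# Select the features
features = ['Open', 'High', 'Low', 'Volume']
```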

```python
# Scaling: rescale every feature column to the [0, 1] range
scaler = MinMaxScaler()
feature_transform = scaler.fit_transform(df[features])
feature_transform = pd.DataFrame(columns=features, data=feature_transform, index=df.index)
feature_transform.head()
```

| Date | Open | High | Low | Volume |
|---|---|---|---|---|
| 1990-01-02 | 0.000129 | 0.000105 | 0.000129 | 0.064837 |
| 1990-01-03 | 0.000265 | 0.000195 | 0.000273 | 0.144673 |
| 1990-01-04 | 0.000249 | 0.000300 | 0.000288 | 0.160404 |
| 1990-01-05 | 0.000386 | 0.000300 | 0.000334 | 0.086566 |
| 1990-01-08 | 0.000265 | 0.000240 | 0.000273 | 0.072656 |
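`MinMaxScaler` rescales each column independently with x' = (x - min) / (max - min). The sketch below checks that formula by hand on a small made-up array; the numbers are illustrative and not taken from the dataset.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Illustrative values only: two columns on very different scales
toy = np.array([[1.0, 100.0],
                [2.0, 300.0],
                [4.0, 500.0]])

scaled = MinMaxScaler().fit_transform(toy)

# Same computation by hand: (x - column min) / (column max - column min)
manual = (toy - toy.min(axis=0)) / (toy.max(axis=0) - toy.min(axis=0))

print(scaled)
print(np.allclose(scaled, manual))
```

Here the scaler is fitted on the full dataset, as in the walkthrough. Fitting it on the training portion only is a common alternative that keeps test-set value ranges out of the preprocessing.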

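To split the data into a training set and a test set, we use the TimeSeriesSplit class from the scikit-learn library. We set the number of splits to 10. In the final split, which is the one kept after the loop, roughly the last 9% of the rows (666 of 7,334) form the test set and the preceding 91% (6,668 rows) are used to train the LSTM model. The advantage of this time series split is that the samples are examined at regular time intervals and the training data always precedes the test data.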

```python
# Splitting into training set and test set
timesplit = TimeSeriesSplit(n_splits=10)
for train_index, test_index in timesplit.split(feature_transform):
    # Each iteration overwrites the previous one, so the last split is the one kept
    X_train, X_test = feature_transform[:len(train_index)], feature_transform[len(train_index): (len(train_index)+len(test_index))]
    y_train, y_test = output_var[:len(train_index)].values.ravel(), output_var[len(train_index): (len(train_index)+len(test_index))].values.ravel()
```
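To see what the loop above ends up with, this sketch prints the train and test sizes of every fold for an array with the same number of rows as the dataset (7,334). The placeholder array `X` and the list `fold_sizes` are assumptions made for illustration; only the row count matters.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

n_rows = 7334                  # number of rows reported for the DataFrame
X = np.zeros((n_rows, 4))      # placeholder features; only the length matters

fold_sizes = []
for train_index, test_index in TimeSeriesSplit(n_splits=10).split(X):
    # Training indices always come before test indices in time
    assert train_index.max() < test_index.min()
    fold_sizes.append((len(train_index), len(test_index)))

for i, (n_train, n_test) in enumerate(fold_sizes, start=1):
    print(f"fold {i:2d}: train={n_train}, test={n_test}")

# Only the values from the last fold survive the loop
last_train, last_test = fold_sizes[-1]
print(last_train, last_test)
```

The last fold has 6,668 training rows and 666 test rows. With a batch size of 8, 6,668 rows give 834 steps per epoch, which matches the `834/834` counter in the training log.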

Data Processing for LSTM

```python
# Process the data for LSTM
trainX = np.array(X_train)
testX = np.array(X_test)
# Reshape to (samples, timesteps, features) with a single timestep per sample
X_train = trainX.reshape(X_train.shape[0], 1, X_train.shape[1])
X_test = testX.reshape(X_test.shape[0], 1, X_test.shape[1])
```
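Keras LSTM layers expect 3-D input of shape (samples, timesteps, features). The sketch below shows the same reshape on a small made-up array, treating every row as a sequence of length 1. The names `flat` and `sequences` are illustrative.

```python
import numpy as np

# Illustrative 2-D feature matrix: 3 samples, 4 features (Open, High, Low, Volume)
flat = np.arange(12, dtype=float).reshape(3, 4)

# Insert a timesteps axis of length 1 -> (samples, timesteps, features)
sequences = flat.reshape(flat.shape[0], 1, flat.shape[1])

print(flat.shape)       # (3, 4)
print(sequences.shape)  # (3, 1, 4)
```

With one timestep per sample, the LSTM sees each trading day in isolation. Feeding a window of several consecutive days would require building sequences with a timesteps axis longer than 1.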

```python
# Building the LSTM model
lstm = Sequential()
plot_model(lstm, show_shapes=True, show_layer_names=True)
```
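A Sequential model needs layers and a call to `compile` before it can be trained. As a minimal sketch for input of shape (1, 4), the block below stacks one LSTM layer and one Dense output and compiles it with mean squared error. The layer size (32 units), the `adam` optimizer, and the names `sketch_model`, `dummy_batch`, and `dummy_pred` are assumptions chosen for illustration, not values confirmed by the article.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Input, LSTM, Dense

n_features = 4  # Open, High, Low, Volume

# Assumed architecture: 32 LSTM units feeding a single regression output
sketch_model = Sequential([
    Input(shape=(1, n_features)),   # (timesteps, features)
    LSTM(32),
    Dense(1),
])
sketch_model.compile(loss="mean_squared_error", optimizer="adam")

# Forward pass on dummy data to confirm that the shapes line up
dummy_batch = np.zeros((5, 1, n_features), dtype="float32")
dummy_pred = sketch_model.predict(dummy_batch, verbose=0)
print(dummy_pred.shape)
```

A model of this form, assigned to `lstm`, would accept the reshaped `X_train` and return one predicted value per sample.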

```python
# Model training
history = lstm.fit(X_train, y_train, epochs=100, batch_size=8, verbose=1, shuffle=False)
```

Output:

```
Epoch 1/100
834/834 [==============================] - 3s 2ms/step - loss: 67.1211
Epoch 2/100
834/834 [==============================] - 1s 2ms/step - loss: 70.4911
Epoch 3/100
834/834 [==============================] - 1s 2ms/step - loss: 48.8155
Epoch 4/100
834/834 [==============================] - 1s 2ms/step - loss: 21.5447
Epoch 5/100
834/834 [==============================] - 1s 2ms/step - loss: 6.1709
Epoch 6/100
834/834 [==============================] - 1s 2ms/step - loss: 1.8726
Epoch 7/100
834/834 [==============================] - 1s 2ms/step - loss: 0.9380
Epoch 8/100
834/834 [==============================] - 2s 2ms/step - loss: 0.6566
Epoch 9/100
834/834 [==============================] - 1s 2ms/step - loss: 0.5369
Epoch 10/100
834/834 [==============================] - 2s 2ms/step - loss: 0.4761
.
.
.
.
Epoch 95/100
834/834 [==============================] - 1s 2ms/step - loss: 0.4542
Epoch 96/100
834/834 [==============================] - 2s 2ms/step - loss: 0.4553
Epoch 97/100
834/834 [==============================] - 1s 2ms/step - loss: 0.4565
Epoch 98/100
834/834 [==============================] - 1s 2ms/step - loss: 0.4576
Epoch 99/100
834/834 [==============================] - 1s 2ms/step - loss: 0.4588
Epoch 100/100
834/834 [==============================] - 1s 2ms/step - loss: 0.4599
```

LSTM Prediction

```python
# LSTM prediction
y_pred = lstm.predict(X_test)
```

```python
# Predicted vs true Adj Close value (LSTM)
plt.plot(y_test, label='True Value')
plt.plot(y_pred, label='LSTM Value')
plt.title("Prediction by LSTM")
plt.xlabel('Time Scale')
plt.ylabel('Scaled USD')
plt.legend()
plt.show()
```
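Beyond the visual comparison, the `mean_squared_error` and `r2_score` functions imported at the start can quantify the fit. A minimal sketch with made-up arrays standing in for `y_test` and `y_pred`; the names `y_true_demo` and `y_pred_demo` and their values are illustrative, not results from this model.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Illustrative stand-ins for y_test and the model predictions
y_true_demo = np.array([10.0, 12.0, 11.0, 13.0])
# predict() returns shape (n, 1), so flatten before comparing
y_pred_demo = np.array([[10.5], [11.5], [11.0], [12.0]]).ravel()

# Root mean squared error, in the same units as the target
rmse = np.sqrt(mean_squared_error(y_true_demo, y_pred_demo))
# Coefficient of determination: 1.0 is a perfect fit
r2 = r2_score(y_true_demo, y_pred_demo)

print(f"RMSE: {rmse:.4f}")
print(f"R2:   {r2:.4f}")
```

For these sample values the RMSE is about 0.6124 and R2 is 0.7. Applied to the real `y_test` and `y_pred.ravel()`, the same two calls would give a numeric summary to accompany the plot.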

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
