Just wanted to comment on (and ask about) the linear regression lesson where you calculate stock prices using linear regression. In the forecasting and prediction section (https://pythonprogramming.net/forecasting-predicting-machine-learning-tutorial/ ) you make a prediction for five days into the future using the last five days of data, and then draw the forecasted data on the graph.
So what seems to happen is this: you have data for (day1, day2, ..., dayn-1, dayn) in df. When you drop the NaN values you are left with (day1, day2, ..., dayn-5) in df, and X_lately is (dayn-4, dayn-3, ..., dayn). Then, using the prediction model, you try to predict (dayn+1, dayn+2, ..., dayn+5) into forecast_set. But when you get the last date with df.iloc[-1].name you get dayn-5, so from there you insert the forecasted values at (dayn-4, dayn-3, ..., dayn), draw your graph with this index, and assume that you are seeing five days into the future in your graph.
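To illustrate the shift I mean, here is a minimal sketch on toy data. The dates, prices, and forecast_out = 3 are made up for the example and are not the tutorial's Quandl data; only the shift/dropna steps mirror the tutorial.

```python
import numpy as np
import pandas as pd

forecast_out = 3  # assumed toy value, not the tutorial's

# Toy stand-in for the price data: 10 consecutive business days.
dates = pd.bdate_range("2024-01-01", periods=10)
df = pd.DataFrame({"Adj. Close": np.arange(100.0, 110.0)}, index=dates)

true_last_date = df.index[-1]  # this is "dayn"

# Same steps as the tutorial: the label is the price forecast_out rows ahead,
# so the last forecast_out rows have NaN labels and are dropped.
df["label"] = df["Adj. Close"].shift(-forecast_out)
df.dropna(inplace=True)

last_date_after_dropna = df.iloc[-1].name  # this is "dayn - forecast_out"
rows_lost = len(dates) - len(df)

print("last date in raw data:    ", true_last_date.date())
print("last date after dropna(): ", last_date_after_dropna.date())
print("rows lost:                ", rows_lost)
```

If this matches what the tutorial does, any forecast dates counted forward from df.iloc[-1].name start forecast_out rows too early.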
I think you somehow need to store the "Adj. Close" values in a separate list before dropping the NaNs, and then add the forecasted values to that list to get a correct presentation.
Am I right? Below you can find my alternative code. But it seems there is still something not correct...
import quandl, math
import numpy as np
import pandas as pd
from sklearn import preprocessing, cross_validation, svm
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
from matplotlib import style
import datetime
X = np.array(df.drop(['label'], 1))
X = preprocessing.scale(X)
X_lately = X[-forecast_out:]
X = X[:-forecast_out]
# new line
y_spare = df['Adj. Close']
# new line end
df.dropna(inplace=True)
"""
for i in forecast_set:
    next_date = datetime.datetime.fromtimestamp(next_unix)
    next_unix += 86400
    df.loc[next_date] = [np.nan for _ in range(len(df.columns)-1)]+[i]
"""
# new line
last_date = y_spare.index[-1]
forecast_date = last_date + datetime.timedelta(days=1)
forecast_index = []
for value in forecast_set:
    if forecast_date.weekday() < 5:  # checking if the date is a weekday
        forecast_index.append(forecast_date)
    else:  # correcting it to a weekday (the next Monday)
        forecast_date += datetime.timedelta(days=(8 - forecast_date.isoweekday()))
        forecast_index.append(forecast_date)
    forecast_date += datetime.timedelta(days=1)
forecast_series = pd.Series(forecast_set, index=forecast_index)
# new line end
#df['Adj. Close'].plot()
#df['Forecast'].plot()
y_spare.plot()  # new line
forecast_series.plot()  # new line
plt.legend(loc=4)
plt.xlabel('Date')
plt.ylabel('Price')
plt.show()
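As a possible alternative to the manual weekday check, here is a rough sketch that lets pandas generate the business-day index. The last_date and forecast_set values below are hypothetical placeholders for y_spare.index[-1] and the model output, and this only skips weekends, not market holidays.

```python
import pandas as pd

# Hypothetical stand-ins for y_spare.index[-1] and the model's forecast_set.
last_date = pd.Timestamp("2024-01-12")  # a Friday
forecast_set = [110.0, 111.0, 112.0, 113.0, 114.0]

# Start on the first business day after last_date and take one
# business day per forecasted value (Saturdays and Sundays are skipped).
forecast_index = pd.bdate_range(
    start=last_date + pd.offsets.BDay(1),
    periods=len(forecast_set),
)
forecast_series = pd.Series(forecast_set, index=forecast_index)

print(forecast_series)
```

The resulting forecast_series could then be plotted next to y_spare in the same way as above.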