ADF vs KPSS test for stationarity and ACF vs PACF
The series is stationary according to both the ADF and KPSS tests. However, the ACF and PACF plots still show significant lags. Why is that?
See also questions close to this topic

Do I inverse transform my predictions and test dataset before measuring a model's performance?
I've created a toy example of time-series forecasting with the series [1, 2, 3, ..., 999, 1000]. I split the series into training (2/3) and test (1/3) sets, and transformed the training set with scikit-learn's MinMaxScaler.
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
import numpy as np
import pandas as pd

# define dataset
series = pd.DataFrame([i+1 for i in range(1000)])
train = series[:int(len(series)*0.67)]
test = series[int(len(series)*0.67):]

# Scale
scaler = MinMaxScaler()
trainNorm = scaler.fit_transform(train)
testNorm = scaler.transform(test)

# TimeseriesGenerator takes a funny shape and I don't know why
trainNorm = np.array(trainNorm).reshape(len(trainNorm))
testNorm = np.array(testNorm).reshape(len(testNorm))
I use TimeSeriesGenerator to convert the training set into a lagged training set according to the number of time steps I desire. I also construct a simple neural network.
# Number of steps to "look back" for forecasting
n_input = 5
# define generator
generator = TimeseriesGenerator(trainNorm, trainNorm, length=n_input, batch_size=795)
# define model
model = Sequential()
model.add(Dense(64, activation='relu', input_dim=n_input))
model.add(Dropout(0.2))
model.add(Dense(64))
model.add(Dropout(0.2))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.summary()
# fit model
model.fit_generator(generator, steps_per_epoch=1, epochs=200, verbose=2)
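As an aside, the "funny shape" is just a sliding window: with `length=n_input`, sample i pairs the window `series[i:i+n_input]` with the target `series[i+n_input]`. A plain-numpy sketch of the equivalent pairing (not the Keras internals, just the same windowing):

```python
import numpy as np

series = np.arange(10, dtype=float)
n_input = 5

# same (input, target) pairs that
# TimeseriesGenerator(series, series, length=n_input) yields:
X = np.stack([series[i:i + n_input] for i in range(len(series) - n_input)])
y = series[n_input:]
# X[0] is [0, 1, 2, 3, 4] and y[0] is 5.0
```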
I then created a list of predictions to compare to the test set. I also use the training set to perform walkforward validation.
"""This section creates a list of predictions and performs walkforward validation.""" preds = [] history = [x for x in trainNorm] # step over each timestep in the test set for i in range(len(testNorm)): # Forecast for predictions x_input = np.array(history[n_input:]).reshape((1, n_input)) y_pred = model.predict(x_input, verbose=0) # store forecast in list of predictions # and add actual observation to history for the next loop preds.append(y_pred[0]) history.append(testNorm[i]) # Reverse normalization to original values for scoring preds = scaler.inverse_transform(preds) history = np.array(history).reshape(1, 1) history = scaler.inverse_transform(history) test = np.array(test) # estimate prediction error mse = mean_squared_error(test, preds) predError = np.sqrt(mse) print(f"Mean Square Error: {round(mse, 2)}") print(f"Root Mean Square Error: {round(predError, 2)}")
I think my model trains properly and my scoring seems accurate, but I'm not sure. My questions concern the latter part of my code: I'm not sure when to introduce an inverse transformation for scoring my model, or whether I need one at all.
Can I score my model without the inverse transformation? If I do need an inverse transformation, do I do it before the code for scoring and after the code for the walkforward validation loop? Did I code the inverse transformation and reshape my data properly? I'd just like to know whether I'm on the right track with how to do things with my toy model.
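On the ordering question, one common pattern is to keep the walkforward loop entirely in scaled space and apply `inverse_transform` once, after the loop and just before scoring, so the error is reported in the original units. A minimal sketch with the same MinMaxScaler setup (perfect "predictions" stand in for model output here, so the error should be near zero):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error

series = np.arange(1, 1001).reshape(-1, 1)   # the toy series [1..1000]
train, test = series[:670], series[670:]      # ~2/3 train, 1/3 test

scaler = MinMaxScaler().fit(train)            # fit on training data only

# pretend the model predicted the scaled test values exactly
preds_norm = scaler.transform(test)

# inverse-transform once, after the loop and before scoring
preds = scaler.inverse_transform(preds_norm)
mse = mean_squared_error(test, preds)         # error in original units
```

Scoring in scaled space is also valid for comparing models on the same data, but the MSE/RMSE then has no interpretable unit; inverse-transforming first makes the RMSE readable in the series' own scale.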

How can you graph multiple overlapping 18-month periods with daily data?
I am doing an exploratory data analysis of data collected at the daily level over many years. The relevant time period is about 18-20 months from the same date each year. What I would like to do is visually inspect these 18-month periods one on top of the other. I can do this as below by adding data with a separate geom_point() call for each period, but I would like to avoid one call per period.
Minimal example:

library(tidyverse)
library(lubridate)  # for mdy()

minex <- data.frame(dts = seq(mdy('01/01/2010'), mdy('11/10/2013'), by = 'days'))
minex$day <- as.numeric(minex$dts - min(minex$dts))
minex$MMDD <- paste0(month(minex$dts), "-", day(minex$dts))
minex$v1 <- 20 + minex$day^0.4 - cos(2*pi*minex$day/365) + rnorm(nrow(minex), 0, 0.3)

ggplot(filter(minex, dts %in% seq(mdy('11/10/2013') - (365 + 180), mdy('11/10/2013'), by = 'days')),
       aes(day, v1)) +
  geom_point() +
  geom_point(data = filter(minex, dts %in% seq(mdy('11/10/2012') - (365 + 180), mdy('11/10/2012'), by = 'days')),
             aes(day + 365, v1), color = 'red')
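The usual fix is a reshape rather than repeated layers: label each row with the window it belongs to and its day-offset within that window, then a single plotting call mapped to the label overlays all periods. A pandas sketch of that labeling step (deterministic toy values; the window endpoints and 365 + 180-day span are taken from the example above):

```python
import numpy as np
import pandas as pd

dts = pd.date_range("2010-01-01", "2013-11-10", freq="D")
df = pd.DataFrame({"dts": dts, "v1": 20 + np.arange(len(dts)) ** 0.4})

frames = []
for anchor in ["2012-11-10", "2013-11-10"]:        # one entry per window end
    end = pd.Timestamp(anchor)
    start = end - pd.Timedelta(days=365 + 180)     # ~18-month window
    win = df[(df.dts >= start) & (df.dts <= end)].copy()
    win["period"] = anchor                         # group/color key
    win["offset"] = (win.dts - start).dt.days      # shared x axis
    frames.append(win)

stacked = pd.concat(frames, ignore_index=True)
# a single scatter of offset vs v1, colored by "period", overlays all windows
# (in ggplot terms: aes(offset, v1, color = period) + geom_point())
```

Because every window shares the same offset axis, the overlapping periods line up without a manual day + 365 shift, and adding another year means adding one anchor string, not another plotting layer.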