This article was published as a part of the Data Science Blogathon.
Any data that is indexed by time, where each observation depends on when it was recorded, can be termed time-series data. In such data we can see trends, nonstationarity, and seasonality on a daily, weekly, or yearly basis. Forecasts built on this type of data suffer when any of these components is ignored or handled badly. As a data scientist involved in time-series analysis, it is your job to analyze the data and make useful predictions from it, so that the resulting model can serve as a benchmark for future forecasts.
There are many models for the predictive analysis of time series, such as ARIMA (Auto-Regressive Integrated Moving Average), the Auto-Regressive model, Exponential Smoothing, and LSTM (Long Short-Term Memory) networks. These models need to be fed the data and then tweaked and fine-tuned before they make good predictions. But what if there were a third-party library that handled most of the fine-tuning internally, so that you just feed it the data and wait for the magic to happen?
The answer to this question is the Facebook Prophet library. Facebook released it as an open-source library for forecasting time-series data. It is powerful enough to handle nonstationary data and seasonal components on its own. By stationarity, we mean that the mean, variance, and covariance stay constant when the data is divided into segments over time. Seasonality means that the data follows the same pattern again and again over fixed time intervals.
Let’s say that ice cream sales are high in summer and low in winter, and that this pattern repeats every year. Such data is termed seasonal.
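To make the two definitions concrete, here is a minimal sketch on synthetic monthly data. The numbers are made up for illustration, and `segment_stats` is a hypothetical helper, not part of any library. It compares the mean and variance across time segments (stationarity) and averages by calendar month to expose a repeating pattern (seasonality).

```python
import numpy as np
import pandas as pd

# Synthetic monthly data for illustration only (10 years, made-up values).
rng = np.random.default_rng(0)
index = pd.date_range("2010-01-01", periods=120, freq="MS")
noise = rng.normal(0, 0.5, size=len(index))
month = index.month.to_numpy()

stationary = pd.Series(50 + noise, index=index)
trending = pd.Series(50 + 0.5 * np.arange(len(index)) + noise, index=index)
# Peaks in July and bottoms out in January, like the ice cream example.
seasonal = pd.Series(50 - 10 * np.cos(2 * np.pi * (month - 1) / 12) + noise, index=index)


def segment_stats(series, n_segments=4):
    """Mean and variance of consecutive, equally sized time segments."""
    segments = np.array_split(series.to_numpy(), n_segments)
    return pd.DataFrame(
        {"mean": [s.mean() for s in segments], "var": [s.var() for s in segments]}
    )


# Roughly constant rows suggest stationarity; a drifting mean does not.
print(segment_stats(stationary))
print(segment_stats(trending))

# Averaging by calendar month reveals the repeating yearly pattern.
monthly_profile = seasonal.groupby(seasonal.index.month).mean()
print(monthly_profile)
```

The segment means of the trending series climb steadily, while those of the stationary series stay close together.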
Classical models such as ARIMA and AR expect the series to be stationary, with no seasonality left in it. With these models, removing seasonality and nonstationarity from the data first is always a chore. Prophet takes this problem off your hands, because it models the trend and the seasonality directly.
This library offers many parameters to play around with and tune the model, e.g. specifying holidays, switching daily seasonality on or off, setting the Fourier order of a seasonal component, etc.
So, without wasting much time, let’s take a look at this library by implementing it in Python.
To install this library, make sure that Python/Anaconda is already on your system along with pip. You should also know how to create a new environment in Anaconda so that the libraries can be installed through pip or conda. So, let’s see the installation:
1. To install Fbprophet one must first install PyStan, the Python interface to Stan, which Fbprophet uses to fit its models. To install PyStan just open your Command Prompt or Anaconda Prompt and then type:
pip install pystan
Wait for the installation to finish.
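2. Once PyStan is successfully installed, the next step is to install Fbprophet with either pip or conda. In the same Command Prompt, run one of the two commands below. Note that from version 1.0 onward the package is published under the name prophet, so on newer setups the package name in these commands is prophet instead of fbprophet.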
pip install fbprophet
conda install -c conda-forge fbprophet
3. Once the installation finishes without errors, the packages are installed and you are ready for the implementation.
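A quick way to confirm the installation from Python is to check whether the package can be found. This is a small sketch using only the standard library; `is_installed` is a hypothetical helper name.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(module_name) is not None


# The library is importable as fbprophet (older releases) or prophet (1.0+).
for name in ("fbprophet", "prophet"):
    status = "found" if is_installed(name) else "not found"
    print(f"{name}: {status}")
```

If both lines print "not found", the install step did not complete in the active environment.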
Implementation With Python
To use Fbprophet you need a few libraries installed on the system, such as Pandas, Matplotlib, and Numpy, along with Jupyter Notebook or JupyterLab. The warnings module (optional) ships with Python and can be used to silence warning messages.
The steps involved in carrying out predictive analysis with the Fbprophet library are:
a. Importing the necessary libraries
b. Importing the data with the help of the Pandas library.
c. Data preprocessing, i.e., keeping only two columns, the date column and the target column, and ignoring the others. Also, converting the date column to datetime format and renaming the two columns to “ds” for the date and “y” for the target. You can also apply feature scaling such as normalization or standardization for faster execution and better predictions.
d. Fitting/training the Prophet model on the data.
e. Creating future dates with the help of Prophet and then predicting the output for them.
f. Plotting the forecast as obtained.
The sections below walk through each of these steps on a sample dataset to make you familiar with the code.
Dataset – https://raw.githubusercontent.com/Sagu12/FBPROPHET-TIME-SERIES-FORECASTING/main/milk.csv
Importing the data:
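A minimal sketch of this step, assuming the dataset is a plain CSV that `pandas.read_csv` can parse with default settings. `load_data` is a hypothetical helper, and the in-memory sample uses made-up rows and hypothetical column names so that the snippet also runs offline.

```python
import io

import pandas as pd

DATA_URL = (
    "https://raw.githubusercontent.com/Sagu12/"
    "FBPROPHET-TIME-SERIES-FORECASTING/main/milk.csv"
)


def load_data(source=DATA_URL):
    """Read the CSV from a URL, a file path, or a file-like object."""
    return pd.read_csv(source)


# Offline stand-in with made-up rows and hypothetical column names.
sample_csv = io.StringIO("month,production\n2020-01,600\n2020-02,570\n2020-03,640\n")
df = load_data(sample_csv)
print(df.head())
```

With internet access, calling `load_data()` without arguments fetches the dataset linked above instead.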
Checking the datatype:
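A sketch of the check on a stand-in frame (the column names and values are hypothetical). A date column read from CSV typically arrives as text, not as a datetime type, which is why the later conversion step is needed.

```python
import pandas as pd

# Stand-in frame with hypothetical column names and made-up values.
df = pd.DataFrame(
    {"month": ["2020-01", "2020-02", "2020-03"], "production": [600, 570, 640]}
)

print(df.dtypes)  # one dtype per column
df.info()         # dtypes plus non-null counts

date_is_datetime = pd.api.types.is_datetime64_any_dtype(df["month"])
print("date column already datetime:", date_is_datetime)
```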
Plotting the data as a line plot to see seasonality and stationarity:
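A minimal line-plot sketch with Matplotlib. The series here is synthetic (an upward trend plus a yearly cycle, made-up values) and stands in for the real data; the column names are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in: 5 years of monthly values with a trend and a yearly cycle.
dates = pd.date_range("2015-01-01", periods=60, freq="MS")
values = 600 + 2.0 * np.arange(60) + 40 * np.sin(2 * np.pi * dates.month.to_numpy() / 12)
df = pd.DataFrame({"date": dates, "value": values})

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(df["date"], df["value"])
ax.set_xlabel("Date")
ax.set_ylabel("Value")
ax.set_title("Monthly series")
fig.tight_layout()
```

In a notebook the figure is displayed automatically; in a script, `plt.show()` would be needed.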
In the plot we can clearly see that the data is nonstationary, as it has an increasing trend, and that it is seasonal, because of the regular fluctuations at fixed time intervals. Normally we would have to make it stationary and nonseasonal first. Fbprophet can handle this type of data itself, so we need not worry about that part of the preprocessing.
Renaming the columns as required by Prophet. The Fbprophet library works with a single target variable over time (a univariate analysis), so we need not pass other columns to it. So, now let’s rename the columns to ds and y, as required by the library.
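One way to do the selection and renaming together is sketched below. `to_prophet_frame` is a hypothetical helper, and the input column names are stand-ins because they depend on the CSV being used.

```python
import pandas as pd


def to_prophet_frame(df, date_col, target_col):
    """Keep only the date and target columns and rename them to ds and y."""
    return df[[date_col, target_col]].rename(columns={date_col: "ds", target_col: "y"})


# Stand-in frame with hypothetical column names and made-up values.
raw = pd.DataFrame(
    {
        "month": ["2020-01", "2020-02", "2020-03"],
        "production": [600, 570, 640],
        "note": ["a", "b", "c"],  # extra column that Prophet does not need
    }
)
df = to_prophet_frame(raw, date_col="month", target_col="production")
print(df.columns.tolist())
```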
Changing the data type of the date column to Date time:
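A sketch of the conversion with `pandas.to_datetime`. The `format` string is an assumption about how the dates are written in the file and should be adjusted (or dropped) to match the real data; the values are made up.

```python
import pandas as pd

# Stand-in frame with made-up values; dates stored as text.
df = pd.DataFrame({"ds": ["2020-01", "2020-02", "2020-03"], "y": [600, 570, 640]})

# Parse the text into proper timestamps (year-month format assumed here).
df["ds"] = pd.to_datetime(df["ds"], format="%Y-%m")
print(df.dtypes)
```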
Importing Fbprophet for time series forecasting and training the model. Here we can see that the mechanism is the same as in any machine learning algorithm, that is, fitting the model and then predicting the output. The only difference is that we provide the whole dataset in the training step and do not split it into train and test sets.
Creating the future dataset with the help of Prophet so that we can make predictions on unseen dates:
In the predictions table, we are only concerned with ds, yhat, yhat_lower, and yhat_upper, because these are the columns that give us the predicted results for each date.
yhat is the predicted value for the given date, while yhat_lower and yhat_upper are the lower and upper bounds of the uncertainty interval around that prediction (an 80% interval by default), that is, the range within which the value can fluctuate.
Getting the desired columns:
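Selecting these columns is a plain Pandas operation. The sketch below uses a tiny stand-in for the frame returned by `predict`, with made-up numbers, so that it runs without fitting a model; a real forecast frame has more columns and rows.

```python
import pandas as pd

FORECAST_COLUMNS = ["ds", "yhat", "yhat_lower", "yhat_upper"]

# Tiny stand-in for a Prophet forecast frame (made-up numbers).
forecast = pd.DataFrame(
    {
        "ds": pd.date_range("2020-01-01", periods=3, freq="MS"),
        "trend": [700.0, 702.0, 704.0],
        "yhat_lower": [690.0, 655.0, 730.0],
        "yhat_upper": [720.0, 685.0, 760.0],
        "yhat": [705.0, 670.0, 745.0],
    }
)

result = forecast[FORECAST_COLUMNS]
print(result.tail())
```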
Plotting the output:
In the plot we can see the predictions made by the Prophet library. The black dots represent the actual data points that we supplied in the training step. The line represents the predictions, and the shaded band around it is the uncertainty interval. At the extreme right-hand side, where there are no dots, we can see the predictions made on the unseen dates that we created. For verification purposes, you can match the data frame timestamps with the graph.
Checking the trends in the data:
Below we can see the overall trend and the yearly seasonality. The first graph shows an increasing trend as the years progress, and the second shows how milk sales fluctuate across the months of the year: low in some months and high in others.
So, this is how one can use the Fbprophet library to easily predict future time series data without wasting much time on tuning the model. There is also a provision to perform cross-validation with the Prophet library, which measures the forecast error on held-out periods and so helps in tuning the model for more accurate predictions.
One can also compute performance metrics such as MSE, RMSE, and MAE on the data by calling them from the Scikit-learn library. Go, explore this wonderful library yourself, and create wonders.
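A small sketch of computing these metrics with Scikit-learn. The two arrays are made-up numbers; in practice `y_true` would hold the actual values and `y_pred` the matching yhat values from the forecast for the same dates.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Made-up actual and predicted values for illustration.
y_true = np.array([100.0, 110.0, 120.0, 130.0])
y_pred = np.array([98.0, 112.0, 119.0, 133.0])

mse = mean_squared_error(y_true, y_pred)
rmse = float(np.sqrt(mse))  # RMSE is the square root of MSE
mae = mean_absolute_error(y_true, y_pred)
print(f"MSE={mse:.2f}  RMSE={rmse:.2f}  MAE={mae:.2f}")
```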