What is time series analysis, and how does it benefit a data analyst? We look at a number of models that may be employed to help describe time series.
In data analysis, a time series is a collection of data points organized in time. According to some definitions, the data points in a time series should be equally spaced, although this is not always the case. The varying definitions for a time series can be illustrated with three examples:
- A dataset compares the performance of athletes to their height. Since neither the performance of athletes nor their height relates to time, this is not a time series by any definition.
- A dataset records the weather in a town, with readings taken at random times of the year. Since time is one of the variables, but the intervals between the data points are not evenly spaced, this may or may not be a time series depending on the definition chosen.
- A dataset records the weather in a town, with readings taken at the same time every month. Since time is one of the variables, and the intervals between the data points are evenly spaced (one month apart), this is a time series by all definitions.
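The difference between the second and third examples comes down to whether the gaps between readings are equal. A minimal sketch of that check follows, assuming the readings carry Python datetime timestamps; the function name is_evenly_spaced and the sample timestamps are made up for illustration. It uses fixed six-hour gaps rather than calendar months, because months differ in length when measured in days.

```python
from datetime import datetime


def is_evenly_spaced(timestamps):
    """Return True if every pair of consecutive timestamps is the same distance apart."""
    if len(timestamps) < 3:
        return True  # fewer than two gaps, so there is nothing to compare
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return all(gap == gaps[0] for gap in gaps)


# Hypothetical readings taken every six hours.
regular = [datetime(2024, 1, 1, hour) for hour in (0, 6, 12, 18)]
# Hypothetical readings taken at irregular times of day.
irregular = [datetime(2024, 1, 1, hour) for hour in (0, 5, 12, 20)]

print(is_evenly_spaced(regular))
print(is_evenly_spaced(irregular))
```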
What Is Time Series Analysis?
Time series analysis is the process of analyzing a time series. It is chiefly concerned with identifying three different aspects of the time series, which can be used to better clean, understand, and forecast the data. To do so, it may use a range of models which can process the time series.
Factors in Time Series Analysis
Time series analysis involves identifying at least three insightful aspects of the data. These factors are autocorrelation, seasonality, and stationarity.
Autocorrelation
In a time series, autocorrelation is the correlation between the series and a lagged copy of itself. It measures how strongly data observations and patterns tend to repeat themselves. If these observations and patterns repeat themselves at regular intervals, the result may also be known as seasonality.
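To make the idea concrete, here is a minimal sketch of a sample autocorrelation at a given lag, written in plain Python. The function name autocorrelation and the repeating sample series are assumptions for illustration; statistical libraries offer more complete versions.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series with itself shifted by `lag` steps."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    numerator = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return numerator / denominator


# Hypothetical series that repeats every four observations.
series = [1, 2, 3, 4] * 6

print(autocorrelation(series, 4))  # strongly positive: the pattern lines up with itself
print(autocorrelation(series, 2))  # negative: highs line up with lows
```

A value near 1 at some lag suggests the series resembles itself that many steps later, which is the behavior described above.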
Seasonality
As touched on above, seasonality is when observations and patterns repeat themselves at regular intervals. The best example of seasonality would be a graph of temperatures across multiple years. During the summer, temperatures are high; during the winter, temperatures are low.
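When the length of the cycle is already known, one simple way to expose seasonality is to average all observations that fall at the same position in the cycle. The sketch below assumes a fixed, known period; the function name seasonal_profile and the quarterly temperature figures are hypothetical.

```python
def seasonal_profile(series, period):
    """Average the observations that share the same position within each cycle."""
    if period < 1 or len(series) < period:
        raise ValueError("need at least one full cycle of data")
    buckets = [[] for _ in range(period)]
    for index, value in enumerate(series):
        buckets[index % period].append(value)
    return [sum(bucket) / len(bucket) for bucket in buckets]


# Hypothetical quarterly temperatures over three years (winter, spring, summer, autumn).
temperatures = [2, 12, 24, 11, 3, 13, 25, 12, 4, 14, 26, 13]

print(seasonal_profile(temperatures, period=4))
```

The third position (summer) has the highest average and the first (winter) the lowest, which matches the temperature example above.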
Stationarity
Stationarity is a measure of how little a time series' mean and variance change over time. For example, if the temperatures measured across a period of ten years are of similar magnitude and variance (after accounting for the seasonality of the dataset), then the time series would be said to have high stationarity.
As a more illustrative example of stationarity, consider the effect of global warming on temperatures measured every month. Although the dataset may continue to show signs of autocorrelation and seasonality, the stationarity of the dataset would decrease due to a higher mean temperature and greater variance in temperatures (due to lower and higher extremes).
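A rough way to see this is to split a series in half and compare the mean and variance of each half. The sketch below is only an informal check under that assumption, not a formal statistical test; the function name compare_halves and both sample series are made up for illustration.

```python
from statistics import fmean, pvariance


def compare_halves(series):
    """Return (mean, variance) for the first half and for the second half of a series."""
    if len(series) < 4:
        raise ValueError("need at least four observations")
    middle = len(series) // 2
    first, second = series[:middle], series[middle:]
    return (fmean(first), pvariance(first)), (fmean(second), pvariance(second))


# Hypothetical series whose level and spread stay the same.
stable = [10, 12, 10, 12, 10, 12, 10, 12]
# Hypothetical series whose level and spread both grow over time.
drifting = [10, 12, 11, 13, 14, 17, 15, 19]

print(compare_halves(stable))
print(compare_halves(drifting))
```

For the stable series the two halves match; for the drifting series the second half has both a higher mean and a larger variance, which corresponds to the decrease in stationarity described above.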
Benefits of Time Series Analysis
Time series analysis has various benefits for the data analyst: it helps to clean data, to understand it, and to forecast future data points. All of this is achieved through the application of various time series models, which we'll touch on later.
Cleaning data
The first benefit of time series analysis is that it can help to clean data. This makes it possible to find the true "signal" in a data set, by filtering out the noise. This can mean removing outliers, or applying various averages so as to gain an overall perspective of the meaning of the data.
Of course, cleaning data is a prominent part of almost any kind of data analysis. The true benefit of time series analysis is that this cleaning is accomplished with little extra effort.
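As one illustration of handling outliers, the sketch below flags values that sit far from the median and replaces them with the median, so the series keeps its length and spacing. The approach (median absolute deviation with a threshold of 3), the function name replace_outliers, and the sample readings are all assumptions chosen for illustration, not a prescribed method.

```python
from statistics import median


def replace_outliers(series, threshold=3.0):
    """Replace values far from the median with the median itself.

    A value counts as an outlier when its distance from the median exceeds
    `threshold` times the median absolute deviation (MAD).
    """
    center = median(series)
    mad = median(abs(x - center) for x in series)
    if mad == 0:
        return list(series)  # no spread to measure against, so leave data as-is
    return [x if abs(x - center) <= threshold * mad else center for x in series]


# Hypothetical temperature readings with one obviously faulty value.
readings = [20, 21, 19, 22, 20, 95, 21, 20]

print(replace_outliers(readings))
```

Replacing rather than deleting the faulty reading is a design choice: it keeps the data points evenly spaced, which some definitions of a time series require.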
Understanding data
Another benefit of time series analysis is that it can help an analyst to better understand a data set. This is because the models used in time series analysis help to interpret the true meaning of the data, as touched on previously.
Forecasting data
Last but not least, a major benefit of time series analysis is that it can be the basis to forecast data. This is because time series analysis — by its very nature — uncovers patterns in data, which can then be used to predict future data points.
For example, autocorrelation patterns and seasonality measures can be used to predict when a certain data point can be expected. Further, stationarity measures can be used to estimate what the value of that data point will be.
Really, it's the forecasting aspect of time series analysis that makes it so popular in business applications. Analyzing and understanding past data is all well and good, but it's being able to predict the future that helps to make optimal business decisions.
Models for Time Series Analysis
There are a number of models that can be used to describe and predict data points in a time series. In this section, we'll look at two of the most basic models: moving averages and exponential smoothing.
Moving averages
A moving average model suggests that an upcoming data point will be equal to the average of the most recent past data points. This rudimentary model is powerful in smoothing out data sets so as to observe their overall trend, with little regard for outlying data points. However, it may smooth out the seasonality of some time series.
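A minimal sketch of a moving average follows, assuming a fixed window of the most recent observations. The function names moving_average and forecast_next, the window size, and the sample data are hypothetical.

```python
def moving_average(series, window):
    """Average each run of `window` consecutive observations."""
    if not 1 <= window <= len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[end - window:end]) / window
        for end in range(window, len(series) + 1)
    ]


def forecast_next(series, window):
    """Predict the next point as the average of the last `window` observations."""
    if not 1 <= window <= len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


# Hypothetical observations.
data = [3, 5, 4, 6, 8, 7]

print(moving_average(data, window=3))
print(forecast_next(data, window=3))
```

A larger window gives a smoother result but reacts more slowly to change, and a window as long as the seasonal cycle will flatten that cycle out.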
Exponential smoothing
Exponential smoothing is another model, in which upcoming data points are predicted based on a weighted average of past data points, with weights that decrease exponentially for older observations. It's said to be preferable to a moving average model in time series where there is no clear trend or pattern.
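The sketch below shows simple exponential smoothing in plain Python. It assumes the common convention of starting from the first observation and using a smoothing factor alpha between 0 and 1; the function name exponential_smoothing and the sample values are illustrative.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing.

    Each smoothed value blends the newest observation (weight `alpha`) with the
    previous smoothed value (weight 1 - alpha), so older observations fade out
    exponentially.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the range (0, 1]")
    smoothed = [series[0]]  # assumption: seed with the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical observations.
observations = [10, 20, 30]
smoothed = exponential_smoothing(observations, alpha=0.5)

print(smoothed)
print(smoothed[-1])  # the last smoothed value serves as the forecast for the next point
```

An alpha close to 1 follows recent observations closely, while an alpha close to 0 gives a smoother series that changes slowly.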
Final Thoughts
Time series analysis is an advanced area of data analysis that focuses on processing, describing, and forecasting time series, which are time-ordered datasets. There are numerous factors to consider when interpreting a time series, such as autocorrelation patterns, seasonality, and stationarity. As a result, a number of models may be employed to help describe time series, including moving averages and exponential smoothing models. More advanced time series analysis models, which have not been discussed in this article, can be used to predict time series behavior with greater accuracy.