Answer: E
Option A is incorrect. The Amazon SageMaker developer guide page titled Linear Learner Algorithm states: “Linear models are supervised learning algorithms used for solving either classification or regression problems.” However, you are trying to solve a one-dimensional time series problem so that you can extrapolate the baseball play time series into the future.
Option B is incorrect. The Amazon SageMaker developer guide page titled Neural Topic Model (NTM) Algorithm states: “Amazon SageMaker NTM is an unsupervised learning algorithm used to organize a corpus of documents into topics that contain word groupings based on their statistical distribution.” This algorithm is used for natural language processing, not for time series problems.
Option C is incorrect. The k-means algorithm is a clustering algorithm. The Amazon SageMaker developer guide page titled K-Means Algorithm states: “K-means is an unsupervised learning algorithm. It attempts to find discrete groupings within data, where members of a group are as similar as possible to one another and as different as possible from members of other groups.” You are trying to solve a one-dimensional time series problem so that you can extrapolate the baseball play time series into the future, not a data clustering problem.
Option D is incorrect. The Amazon SageMaker developer guide page titled Random Cut Forest (RCF) Algorithm states: “Amazon SageMaker Random Cut Forest (RCF) is an unsupervised algorithm for detecting anomalous data points within a data set.” However, you are trying to solve a one-dimensional time series problem so that you can extrapolate the baseball play time series into the future, not an anomaly detection problem.
Option E is correct. The Amazon SageMaker developer guide page titled DeepAR Forecasting Algorithm states: “... you have many similar time series across a set of cross-sectional units. For example, you might have time series groupings for demand for different products, server loads, and requests for webpages. For this type of application, you can benefit from training a single model jointly over all of the time series. DeepAR takes this approach. When your dataset contains hundreds of related time series, DeepAR outperforms the standard ARIMA and ETS methods. You can also use the trained model to generate forecasts for new time series that are similar to the ones it has been trained on.” The same developer guide also states: “The training input for the DeepAR algorithm is one or, preferably, more target time series that the same process or similar processes have generated. Based on this input dataset, the algorithm trains a model that learns an approximation of this process/processes and uses it to predict how the target time series evolves.” So the DeepAR algorithm is designed for forecasting one-dimensional time series, which makes it the right fit for a complex forecasting task such as baseball play prediction.
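To make the "one or, preferably, more target time series" input concrete, here is a minimal sketch of how several related series could be written in the JSON Lines layout that DeepAR accepts, where each record carries a "start" timestamp and a "target" list of values. The helper names and the sample numbers are hypothetical and are for illustration only; they are not taken from the question or from StatCast data.

```python
import json
from datetime import datetime


def to_deepar_record(start, values):
    """Build one training record: a start timestamp plus the target values."""
    return {
        "start": start.strftime("%Y-%m-%d %H:%M:%S"),
        "target": [float(v) for v in values],
    }


def to_json_lines(series):
    """Serialize (start, values) pairs as JSON Lines, one time series per line."""
    return "\n".join(json.dumps(to_deepar_record(s, v)) for s, v in series)


# Hypothetical example: two related one-dimensional series (made-up numbers).
series = [
    (datetime(2019, 4, 1), [3, 5, 4, 6, 7]),
    (datetime(2019, 4, 1), [1, 0, 2, 2, 3]),
]

json_lines = to_json_lines(series)
print(json_lines)
```

Each line is an independent series, which is what lets a single model be trained jointly over all of them.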
Option F is incorrect. The XGBoost algorithm is a gradient boosting algorithm. The Amazon SageMaker developer guide page titled XGBoost Algorithm states: “gradient boosting is a supervised learning algorithm that attempts to accurately predict a target variable by combining an ensemble of estimates from a set of simpler, weaker models.” You are not trying to predict a target variable from a set of input features; you are trying to extrapolate a one-dimensional time series into the future.
Reference:
Please see the Amazon SageMaker developer guide titled Use Amazon SageMaker Built-in Algorithms, the AWS Machine Learning Blog titled Now Available in Amazon SageMaker: DeepAR algorithm for more accurate time series forecasting, and the AWS StatCast AI page titled See how AI on AWS gives baseball fans new insights into the game.