Answer: C
Option A is incorrect. From the Amazon SageMaker developer guide titled Linear Learner Algorithm, “Linear models are supervised learning algorithms used for solving either classification or regression problems.” But you are trying to solve a data clustering problem so that you can find the ten best clustered sightings to determine where to place your shark detection sensors.
Option B is incorrect. From the Amazon SageMaker developer guide titled Neural Topic Model (NTM) Algorithm, “Amazon SageMaker NTM is an unsupervised learning algorithm that is used to organize a corpus of documents into topics that contain word groupings based on their statistical distribution.” So this algorithm is used for natural language processing, not data clustering.
Option C is correct. The k-means algorithm is a clustering algorithm. From the Amazon SageMaker developer guide titled K-Means Algorithm, “K-means is an unsupervised learning algorithm. It attempts to find discrete groupings within data, where members of a group are as similar as possible to one another and as different as possible from members of other groups.” By setting the k hyperparameter to 10, this algorithm will allow you to find the 10 best groupings of shark sightings worldwide.
Option D is incorrect. From the Amazon SageMaker developer guide titled Random Cut Forest (RCF) Algorithm, “Amazon SageMaker Random Cut Forest (RCF) is an unsupervised algorithm for detecting anomalous data points within a data set.” But you are trying to solve a data clustering problem so you can find the ten best clustered sightings to determine where to place your shark detection sensors.
Option E is incorrect. From the Amazon SageMaker developer guide titled Semantic Segmentation Algorithm, “The Amazon SageMaker semantic segmentation algorithm provides a fine-grained, pixel-level approach to developing computer vision applications.” So the Semantic Segmentation algorithm is used for computer vision applications, but you are trying to solve a data clustering problem.
Option F is incorrect. The XGBoost algorithm is a gradient boosting algorithm. From the Amazon SageMaker developer guide titled XGBoost Algorithm, “gradient boosting is a supervised learning algorithm that attempts to accurately predict a target variable by combining an ensemble of estimates from a set of simpler, weaker models.” You are not trying to predict a target value; you are trying to find discrete groupings in your dataset.
Reference:
Please see the Amazon SageMaker developer guide titled Use Amazon SageMaker Built-in Algorithms.