AWS Certified Machine Learning Specialty Exam Questions

Amazon

AWS Certified Machine Learning Specialty

64 / 258

Question 64:

You work as a machine learning specialist for the National Oceanic and Atmospheric Administration (NOAA Research). NOAA has developed a great white shark detection program to help warn shore populations when the sharks are in the area of a populated beach. Your assignment is to use your machine learning expertise to decide where to place 10 high-tech shark detection sensors on the ocean floor as part of a pilot that will determine whether NOAA invests broadly in these very expensive sensors. You have great white sightings data from around the globe, gathered over the past several years, to use as your model training and test data. The model dataset contains several useful features, such as the longitude and latitude of each sighting.
You have decided to use an unsupervised learning algorithm that attempts to find discrete groupings within the data. Specifically, you want to group the sightings by the similarity of their longitude and latitude. You need to produce 10 longitude and latitude pairs to determine where to place the sensors.
Which algorithm can you use in SageMaker that best suits this task?

Answer options:

A. Linear Learner
B. Neural Topic Model
C. K-Means
D. Random Cut Forest
E. Semantic Segmentation
F. XGBoost

Correct answer:

Answer: C

Option A is incorrect. From the Amazon SageMaker developer guide titled Linear Learner Algorithm, “Linear models are supervised learning algorithms used for solving either classification or regression problems.” But you are trying to solve a data clustering problem so that you can find the ten best clusters of sightings to determine where to place your shark detection sensors.

Option B is incorrect. From the Amazon SageMaker developer guide titled Neural Topic Model (NTM) Algorithm, “Amazon SageMaker NTM is an unsupervised learning algorithm that is used to organize a corpus of documents into topics that contain word groupings based on their statistical distribution.” So this algorithm is used for natural language processing, not for clustering numeric coordinate data.

Option C is correct. The k-means algorithm is a clustering algorithm. From the Amazon SageMaker developer guide titled K-Means Algorithm, “K-means is an unsupervised learning algorithm. It attempts to find discrete groupings within data, where members of a group are as similar as possible to one another and as different as possible from members of other groups.” By setting the k hyperparameter to 10, this algorithm will allow you to find the 10 best groupings of shark sightings worldwide. The centers (centroids) of those 10 clusters give you the 10 longitude and latitude pairs for sensor placement.

Option D is incorrect. From the Amazon SageMaker developer guide titled Random Cut Forest (RCF) Algorithm, “Amazon SageMaker Random Cut Forest (RCF) is an unsupervised algorithm for detecting anomalous data points within a data set.” But you are trying to solve a data clustering problem, not an anomaly detection problem.

Option E is incorrect. From the Amazon SageMaker developer guide titled Semantic Segmentation Algorithm, “The Amazon SageMaker semantic segmentation algorithm provides a fine-grained, pixel-level approach to developing computer vision applications.” So the Semantic Segmentation algorithm is used for computer vision applications, but you are trying to solve a data clustering problem.

Option F is incorrect. The XGBoost algorithm is a gradient boosting algorithm. From the Amazon SageMaker developer guide titled XGBoost Algorithm, “gradient boosting is a supervised learning algorithm that attempts to accurately predict a target variable by combining an ensemble of estimates from a set of simpler, weaker models.” You are not trying to predict a target value; you are trying to find discrete groupings in your dataset.

Reference: Please see the Amazon SageMaker developer guide titled Use Amazon SageMaker Built-in Algorithms.
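To make the k-means reasoning concrete, here is a minimal plain-Python sketch of the idea (Lloyd's algorithm), not the SageMaker built-in implementation. The `kmeans` function, the synthetic `sightings` data, and the `sensor_sites` name are all hypothetical, invented for illustration. It also assumes plain Euclidean distance on raw (longitude, latitude) values, which ignores the Earth's curvature and the wrap-around at the 180° meridian.

```python
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm on 2-D points; returns k centroids as (x, y) tuples."""
    rng = random.Random(seed)
    # Start from k distinct sightings chosen at random.
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    (
                        sum(m[0] for m in members) / len(members),
                        sum(m[1] for m in members) / len(members),
                    )
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # Converged: assignments no longer change the centroids.
        centroids = new_centroids
    return centroids


# Synthetic stand-in for the sightings data: 10 made-up hotspots,
# each with 50 noisy (longitude, latitude) sightings around it.
data_rng = random.Random(42)
hotspots = [(data_rng.uniform(-170, 170), data_rng.uniform(-60, 60)) for _ in range(10)]
sightings = [
    (lon + data_rng.gauss(0, 1), lat + data_rng.gauss(0, 1))
    for lon, lat in hotspots
    for _ in range(50)
]

# k = 10 yields 10 centroids, i.e. 10 candidate sensor locations.
sensor_sites = kmeans(sightings, k=10)
for lon, lat in sensor_sites:
    print(f"longitude={lon:.2f}, latitude={lat:.2f}")
```

The point of the sketch is that the output of k-means with k set to 10 is exactly what the question asks for: 10 coordinate pairs, one per cluster center. Results depend on the random initialization, so a real run would typically compare several starts.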
