ExamQuestions.com

AWS Certified Machine Learning Specialty Exam Questions

Question 82:

You have just landed a position as a machine learning specialist at a large financial services firm. Your new team is working on a fraud detection model using the SageMaker built-in linear learner algorithm. You are gathering the data required for your machine learning model. The dataset you intend to produce will contain well over 5,000 objects that need to be labeled. Your team wants to control the costs of cleaning your data. Therefore, the team has decided to use SageMaker Ground Truth active learning to automate your data labeling.
The Ground Truth automated labeling job initially follows this set of steps:
1. Selects a random sample of data.
2. Sends the sample data to human workers.
3. Uses the human-labeled data as validation data.
4. Runs a SageMaker batch transform using the validation set, which generates a quality metric used to estimate the potential quality of auto-labeling the rest of the unlabeled data.
5. Runs a SageMaker batch transform on the unlabeled data.
6. Labels the data for which the expected quality of automatic labeling is above the requested level of accuracy.
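
The initial pass described above can be illustrated with a minimal Python sketch. It is not Ground Truth's internal implementation: the function names `pick_confidence_threshold` and `auto_label`, the tuple layouts, and the sample transactions are all hypothetical, and the threshold rule is only one plausible reading of "expected quality above the requested level of accuracy". The sketch covers the initial steps only and says nothing about what happens afterwards.

```python
from typing import Dict, List, Optional, Tuple


def pick_confidence_threshold(
    validation: List[Tuple[float, bool]], target_accuracy: float
) -> Optional[float]:
    """Assumed rule: return the lowest confidence t such that predictions with
    confidence >= t are at least `target_accuracy` accurate on the
    human-labeled validation set. Returns None if no threshold qualifies."""
    ranked = sorted(validation, key=lambda pair: pair[0], reverse=True)
    best = None
    correct = 0
    for i, (confidence, is_correct) in enumerate(ranked, start=1):
        correct += is_correct
        # Only evaluate at boundaries between distinct confidence values.
        at_boundary = i == len(ranked) or ranked[i][0] != confidence
        if at_boundary and correct / i >= target_accuracy:
            best = confidence
    return best


def auto_label(
    predictions: List[Tuple[str, str, float]], threshold: Optional[float]
) -> Tuple[Dict[str, str], List[str]]:
    """Split model predictions on unlabeled data into auto-labeled objects
    and the IDs of objects that remain unlabeled."""
    labeled: Dict[str, str] = {}
    remaining: List[str] = []
    for object_id, label, confidence in predictions:
        if threshold is not None and confidence >= threshold:
            labeled[object_id] = label
        else:
            remaining.append(object_id)
    return labeled, remaining


# Hypothetical validation results: (model confidence, matched the human label?)
validation = [(0.99, True), (0.95, True), (0.90, True),
              (0.80, False), (0.70, True), (0.60, False)]
threshold = pick_confidence_threshold(validation, target_accuracy=0.95)

# Hypothetical predictions on unlabeled data: (object ID, label, confidence)
unlabeled = [("tx-1", "fraud", 0.97), ("tx-2", "legit", 0.85), ("tx-3", "legit", 0.92)]
labeled, remaining = auto_label(unlabeled, threshold)
```

With these made-up numbers the threshold comes out at 0.90, so `tx-1` and `tx-3` are auto-labeled and `tx-2` stays in the unlabeled pool.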
After performing the above steps, what does Ground Truth do next to complete the labeling of ALL of your data?

Answer options:

A. Selects a new sample of unlabeled data and sends it to human workers; it uses the existing labeled data to verify the new human-labeled data; repeats this latter set of steps until all the data in the dataset is labeled.
B. Selects a new sample of unlabeled data and sends it to human workers; it uses the existing labeled data and the new human-labeled data to train a new model; repeats this latter set of steps until all the data in the dataset is labeled.
C. Selects a new sample of the hardest-to-identify unlabeled data and sends it to human workers; it uses the existing labeled data to verify the new human-labeled data; repeats this latter set of steps until all the data in the dataset is labeled.
D. Selects a new sample of the hardest-to-identify unlabeled data and sends it to human workers; it uses the existing labeled data and the new human-labeled data to train a new model; repeats this latter set of steps until all the data in the dataset is labeled.