AWS Certified Machine Learning Specialty Exam Questions

Question 235:

You are a machine learning specialist at a company that is exploring conversational user interface application development. As an experiment, your team is building a natural language processing application that needs to process the transcribed conversation data from your conversational user interface. For training, you are starting with a dataset of 5 million sentences. You plan to run a model based on the Word2Vec algorithm to generate embeddings of the sentences, which will allow your team to make different types of predictions.
Consider this example sentence: “My funy LARGE MEME went over the audiences head.”
Which operations should your team perform to sanitize and prepare the data in a repeatable manner? (Choose three.)

Answer options:

A. Correct the spelling of “funy” to “funny” and “audiences” to “audience’s”.
B. Perform normalization by making the sentence lowercase.
C. Using an English stopword dictionary, remove all stop words.
D. Use one-hot encoding on the sentence.
E. Use part-of-speech tagging to keep the action verbs and the nouns only.
F. Perform tokenization of the sentence, creating a word vector.
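
For reference, three of the operations named in the options (lowercasing, stop word removal, and tokenization) can be sketched in a few lines of plain Python. This is only an illustration of what those operations do to the example sentence. The function name `prepare` and the tiny `STOP_WORDS` set are assumptions made for this sketch, not part of the question; a real pipeline would load a full English stop word dictionary.

```python
import re
from typing import List

# Tiny illustrative stop word set (an assumption for this sketch; a real
# pipeline would load a complete English stop word dictionary).
STOP_WORDS = {"my", "over", "the", "a", "an", "and", "of", "to", "in"}


def prepare(sentence: str) -> List[str]:
    """Lowercase a sentence, split it into word tokens, and drop stop words."""
    # Normalization: make everything lowercase so "LARGE" and "large" match.
    lowered = sentence.lower()
    # Tokenization: keep runs of letters, digits, and apostrophes as tokens,
    # which also discards punctuation such as the trailing period.
    tokens = re.findall(r"[a-z0-9']+", lowered)
    # Stop word removal: filter out tokens found in the stop word set.
    return [token for token in tokens if token not in STOP_WORDS]


tokens = prepare("My funy LARGE MEME went over the audiences head.")
print(tokens)
```

Because the same deterministic function is applied to every sentence, the step is repeatable across the whole dataset. Note that the misspelling "funy" passes through unchanged, since this sketch performs no spelling correction.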