Correct Answers: C and E
Option A is incorrect. The Scikit-learn Normalizer rescales each sample to a unit norm. It neither encodes categories nor fills in missing values. You need to transform categorical values into numerical representations, and you need to replace missing values.
Option B is incorrect. The Scikit-learn Standardizer (the StandardScaler class) standardizes features by removing the mean and scaling to unit variance, not to a unit norm. You need to transform categorical values into numerical representations, and you need to replace missing values.
Option C is correct. The SimpleImputer fills in missing values, using a statistic such as the mean, median, or most frequent value of each column, or a constant. This is one of the two data sanitization tasks you need to perform.
Option D is incorrect. The Scikit-learn Binarizer sets feature values to 0 or 1 according to a threshold. You need to transform categorical values into numerical values that can represent many different categories, and you need to replace missing values.
Option E is correct. The OneHotEncoder encodes categorical features into a one-hot numeric array, with each entry in the array representing a category. There are as many entries in the array as there are categories in the feature. For a given sample, the entry for its category is 1 and all other entries are 0, so the position of the ‘one’ represents the categorical value numerically. This is the other sanitization task you need to perform.
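The two correct options can be illustrated with a minimal sketch. The toy data here (the `ages` and `colors` arrays) is hypothetical and not taken from the question, and the mean strategy is just one of the imputation strategies SimpleImputer supports.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data (an assumption for illustration only):
# one numeric column with a missing value, one categorical column.
ages = np.array([[25.0], [np.nan], [35.0]])
colors = np.array([["red"], ["green"], ["red"]])

# Option C: replace the missing value with the column mean.
imputer = SimpleImputer(strategy="mean")
ages_filled = imputer.fit_transform(ages)

# Option E: turn each category into its own 0/1 column.
encoder = OneHotEncoder(handle_unknown="ignore")
# fit_transform returns a sparse matrix by default; densify it for display.
colors_encoded = encoder.fit_transform(colors).toarray()

print(ages_filled.ravel())      # the missing age becomes 30.0, the mean of 25 and 35
print(encoder.categories_[0])   # learned categories, sorted: green, red
print(colors_encoded)           # one row per sample, one column per category
```

In this sketch the categories are sorted alphabetically, so "red" is encoded as [0, 1] and "green" as [1, 0]. In practice the two transformers are often combined per column, for example with Scikit-learn's ColumnTransformer.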
References:
AWS Machine Learning blog post titled Preprocess input data before making predictions using Amazon SageMaker inference pipelines and Scikit-learn (https://aws.amazon.com/blogs/machine-learning/preprocess-input-data-before-making-predictions-using-amazon-sagemaker-inference-pipelines-and-scikit-learn/)
Amazon SageMaker Examples notebook titled Inference Pipeline with Scikit-learn and Linear Learner (https://github.com/aws/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/scikit_learn_inference_pipeline/Inference%20Pipeline%20with%20Scikit-learn%20and%20Linear%20Learner.ipynb)
Amazon SageMaker developer guide page titled Use Scikit-learn with Amazon SageMaker (https://docs.aws.amazon.com/sagemaker/latest/dg/sklearn.html)
Scikit-learn API page titled sklearn.preprocessing.OneHotEncoder (https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html)
Scikit-learn API page titled sklearn.impute.SimpleImputer (https://scikit-learn.org/stable/modules/generated/sklearn.impute.SimpleImputer.html)
Scikit-learn page titled API Reference (https://scikit-learn.org/stable/modules/classes.html)