Question 240:
You work as a machine learning specialist for an online retailer that is expanding into fresh produce as one of its new product categories. You and your machine learning team have been tasked with creating a model to classify each of your new fresh produce products. Examples of features in your data source include weight, price, country of origin, food group (fruit, vegetable, etc.), and other numeric and categorical features. You plan on using either k-nearest neighbors (KNN) or support vector machines (SVM) to classify your fresh produce products. Which data cleansing technique should you use on your data so that your features with potentially large values, such as weight, don’t take on exaggerated importance in the model when compared to features with potentially smaller values, such as price per unit?
Answer options:
A. Scale your data using scikit-learn MinMaxScaler
B. Normalize your data using scikit-learn normalize
C. Bin your data using scikit-learn KBinsDiscretizer with the uniform strategy
D. Quantile bin your data using scikit-learn KBinsDiscretizer with the quantile strategy
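
To make the four options concrete, here is a minimal sketch of what each scikit-learn call does to a small feature matrix. The data is hypothetical (two made-up columns standing in for weight and price per unit), and n_bins=2 with encode="ordinal" are illustrative choices, not values given in the question.

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer, MinMaxScaler, normalize

# Hypothetical toy data: column 0 is weight in grams, column 1 is price per unit.
X = np.array([
    [150.0, 0.50],
    [1200.0, 3.00],
    [300.0, 1.25],
    [4500.0, 6.00],
])

# Option A: MinMaxScaler rescales each feature (column) independently,
# by default to the range [0, 1].
scaled = MinMaxScaler().fit_transform(X)

# Option B: normalize rescales each sample (row) to unit norm
# (L2 norm and axis=1 are the defaults).
normalized = normalize(X)

# Options C and D: KBinsDiscretizer replaces each value with a bin index.
# "uniform" uses equal-width bins; "quantile" uses roughly equal-count bins.
uniform_bins = KBinsDiscretizer(
    n_bins=2, encode="ordinal", strategy="uniform"
).fit_transform(X)
quantile_bins = KBinsDiscretizer(
    n_bins=2, encode="ordinal", strategy="quantile"
).fit_transform(X)

print("MinMaxScaler:\n", scaled)
print("normalize:\n", normalized)
print("uniform bins:\n", uniform_bins)
print("quantile bins:\n", quantile_bins)
```

Printing the four results side by side shows which transformations work per column, which work per row, and which discard the original magnitudes in favor of bin indices. That distinction matters for distance-based models such as KNN and SVM.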