Data Engineering on Microsoft Azure (DP-203) Exam Questions

Microsoft

Question 235:

Henry is a data engineer at Whizlabs Inc. working on Databricks Spark streaming. He is using PySpark to develop DataFrames. He needs to perform an aggregation and a distinct count on a DataFrame.
Which of the following is the correct code snippet in this scenario?

Answer options:

A. countDistinctDF = nonNullDF.select("emp_id", "emp_name") \
       .groupBy("emp_id").agg(countDistinct("emp_name").alias("distinct_emp_name"))
   display(countDistinctDF)
B. countDistinctDF = nonNullDF.select("emp_id", "emp_name") \
       .groupBy("emp_id").aggregate(countDistinct("emp_name").alias("distinct_emp_name"))
   display(countDistinctDF)
C. countDistinctDF = nonNullDF.select("emp_id", "emp_name") \
       .agg(countDistinct("emp_name").alias("distinct_emp_name")).groupBy("emp_id")
   display(countDistinct)
D. countDistinctDF = nonNullDF.select("emp_id", "emp_name") \
       .groupBy("emp_id").aggregate().(countDistinct("emp_name").alias("distinct_emp_name"))
   display(countDistinctDF)
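
To make the intended result concrete, here is a minimal plain-Python sketch (no Spark needed) of what "group by emp_id, then count the distinct emp_name values in each group" produces. The sample rows and the helper name count_distinct_by_key are hypothetical and exist only for illustration. The trailing comment shows how the same computation is commonly written in PySpark, assuming countDistinct is imported from pyspark.sql.functions.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for nonNullDF: (emp_id, emp_name).
# Assumed to contain no nulls, as the DataFrame name suggests.
rows = [
    (1, "Asha"),
    (1, "Asha"),
    (1, "Ravi"),
    (2, "Mei"),
    (2, "Mei"),
]


def count_distinct_by_key(pairs):
    """Return {key: number of distinct values seen for that key}."""
    seen = defaultdict(set)
    for key, value in pairs:
        seen[key].add(value)  # a set keeps each value only once per key
    return {key: len(values) for key, values in seen.items()}


distinct_emp_name = count_distinct_by_key(rows)
print(distinct_emp_name)  # {1: 2, 2: 1}

# Rough PySpark equivalent (not executed here; it needs a Spark session):
#   from pyspark.sql.functions import countDistinct
#   nonNullDF.select("emp_id", "emp_name") \
#       .groupBy("emp_id") \
#       .agg(countDistinct("emp_name").alias("distinct_emp_name"))
```

The grouping step has to come before the aggregation: the distinct count is computed once per emp_id group, not once over the whole DataFrame.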
