AWS Certified Big Data Specialty (Expired on July 1, 2020) Exam Questions

Question 205:

Allianz Financial Services (AFS) is a banking group offering end-to-end banking and financial solutions in South East Asia through its consumer banking, business banking, Islamic banking, investment finance and stock broking businesses as well as unit trust and asset administration, having served the financial community over the past five decades.
AFS uses Redshift on AWS to fulfill its data warehousing needs and uses S3 as the staging area to host files. AFS uses other services like DynamoDB, Aurora, and Amazon RDS on remote hosts to fulfill other needs. The data modeling team is designing the tables on Redshift and wants to adopt best practices for querying. Please advise. Select 4 options.

Answer options:

A. Choose the best sort key
B. Choose the best distribution style
C. Specify compression encodings when the table is created
D. Use automatic compression
E. Define primary key and foreign key constraints between tables wherever appropriate, even though they are only informational
F. Use CHAR/VARCHAR for date columns

Correct answer:

Answer: A, B, D, E

Option A is correct - Amazon Redshift stores your data on disk in sorted order according to the sort key. The Amazon Redshift query optimizer uses sort order when it determines optimal query plans. If recent data is queried most frequently, specify the timestamp column as the leading column for the sort key. Queries are more efficient because they can skip entire blocks that fall outside the time range. If you do frequent range filtering or equality filtering on one column, specify that column as the sort key. Amazon Redshift can then skip reading entire blocks of data for that column, because it tracks the minimum and maximum column values stored on each block and can skip blocks that don't apply to the predicate range. If you frequently join a table, specify the join column as both the sort key and the distribution key. Doing this enables the query optimizer to choose a sort merge join instead of a slower hash join. Because the data is already sorted on the join key, the query optimizer can bypass the sort phase of the sort merge join.
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-sort-key.html

Option B is correct - The query optimizer redistributes the rows to the compute nodes as needed to perform any joins and aggregations. The goal in selecting a table distribution style is to minimize the impact of the redistribution step by locating the data where it needs to be before the query is executed. Distribute the fact table and one dimension table on their common columns. Choose the largest dimension based on the size of the filtered dataset. Choose a column with high cardinality in the filtered result set. Change some dimension tables to use ALL distribution.
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-best-dist-key.html

Option C is incorrect - The recommended practice is to let COPY choose compression encodings rather than specifying them when the table is created. Automatic compression balances overall performance when choosing compression encodings, as described under Option D.
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-use-auto-compression.html

Option D is correct - Let COPY choose compression encodings. Automatic compression balances overall performance when choosing compression encodings. Range-restricted scans might perform poorly if sort key columns are compressed much more highly than other columns in the same query. As a result, automatic compression chooses a less efficient compression encoding to keep the sort key columns balanced with other columns.
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-use-auto-compression.html

Option E is correct - Define primary key and foreign key constraints between tables wherever appropriate. Even though they are informational only, the query optimizer uses those constraints to generate more efficient query plans. Do not define primary key and foreign key constraints unless your application enforces the constraints, because Amazon Redshift does not enforce unique, primary-key, and foreign-key constraints.
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-defining-constraints.html

Option F is incorrect - Amazon Redshift stores DATE and TIMESTAMP data more efficiently than CHAR or VARCHAR, which results in better query performance. Use the DATE or TIMESTAMP data type, depending on the resolution you need, rather than a character type when storing date/time information.
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-timestamp-date-columns.html
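
The block-skipping behavior described under Option A can be illustrated with a small toy model. This is only a sketch of the general min/max idea, not Redshift's actual storage engine: the Block class, the make_blocks and range_scan helpers, the block size of 3, and the sample values are all hypothetical names and numbers chosen for illustration. It compares how many blocks a range filter has to read when the same values are stored unsorted versus sorted on the filtered column.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """A toy storage block that exposes the min/max of the values it holds."""
    values: List[int]

    @property
    def min_value(self) -> int:
        return min(self.values)

    @property
    def max_value(self) -> int:
        return max(self.values)


def make_blocks(values: List[int], block_size: int) -> List[Block]:
    """Split values into fixed-size blocks, preserving the order given."""
    return [Block(values[i:i + block_size])
            for i in range(0, len(values), block_size)]


def range_scan(blocks: List[Block], low: int, high: int) -> Tuple[List[int], int]:
    """Return the matching values and the number of blocks actually read."""
    matches: List[int] = []
    blocks_read = 0
    for block in blocks:
        # Skip the block when its [min, max] range cannot overlap [low, high].
        if block.max_value < low or block.min_value > high:
            continue
        blocks_read += 1
        matches.extend(v for v in block.values if low <= v <= high)
    return matches, blocks_read


# Hypothetical data: 12 integer "timestamps", stored unsorted vs. sorted.
unsorted_values = [7, 1, 12, 4, 9, 2, 11, 5, 3, 10, 6, 8]
sorted_values = sorted(unsorted_values)

unsorted_blocks = make_blocks(unsorted_values, block_size=3)
sorted_blocks = make_blocks(sorted_values, block_size=3)

# Filter for the most recent values, analogous to a time-range predicate.
unsorted_hits, unsorted_read = range_scan(unsorted_blocks, low=10, high=12)
sorted_hits, sorted_read = range_scan(sorted_blocks, low=10, high=12)

print(f"unsorted layout: read {unsorted_read} of {len(unsorted_blocks)} blocks")
print(f"sorted layout:   read {sorted_read} of {len(sorted_blocks)} blocks")
```

In this toy data the sorted layout touches a single block while the unsorted layout touches three of four, because the recent values are scattered across blocks whose min/max ranges all overlap the filter. That is the intuition behind putting the most frequently filtered column first in the sort key.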
