AWS Certified Big Data Specialty (Expired on July 1, 2020) Exam Questions

Amazon

AWS Certified Big Data Specialty (Expired on July 1, 2020)

226 / 370

Question 226:

FundsLawn, a financial services company, provides fully automated funding to small businesses in minutes. It leverages data generated through business activity to understand the performance and processing of funding requests. The company uses multi-shard Kinesis data streams as its data integration backbone: the KPL ingests data generated by the various business segments, and KCL-based consumers load the data into applications such as Redshift, ES, and DynamoDB for invoicing and into S3 for long-term storage.
Following a recent successful campaign, requests from existing and new customers across business segments have increased sharply, and FundsLawn needs a strategy to address the following issues with the existing platform:
TCO for maintaining the streaming platform is too high
Performance of the streaming platform does not meet SLAs
Ingestion performance must be understood in order to improve throughput
Please select 3 options

Answer options:

A. Batching, Aggregation
B. Resharding, Shard Split Operation
C. Batching, Collection
D. Resharding, Shard Merge Operation
E. Enhanced Kinesis Data Streams monitoring level Metrics
F. KPL Metrics at SHARD granularity

Correct answer:

A, C, F

Option A is correct: Aggregation helps to improve the per-shard throughput, which also optimizes the overall TCO of the stream. Batching refers to performing a single action on multiple items instead of repeatedly performing the action on each individual item. Aggregation refers to the storage of multiple records in a single Kinesis Data Streams record. Aggregation allows customers to increase the number of records sent per API call, which effectively increases producer throughput. Kinesis Data Streams shards support up to 1,000 Kinesis Data Streams records per second, or 1 MB per second of write throughput. The records-per-second limit binds customers whose records are smaller than 1 KB. Record aggregation allows such customers to combine multiple records into a single Kinesis Data Streams record and so improve their per-shard throughput.
https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-concepts.html

Option B is incorrect: This may not be a viable option because we are still working on the strategy to redesign our sharding mechanism. We first need metrics to identify hot and cold shards before proceeding with that redesign. Besides, the purpose of resharding in Amazon Kinesis Data Streams is to enable your stream to adapt to changes in the rate of data flow. You split shards to increase the capacity (and cost) of your stream. You merge shards to reduce the cost (and capacity) of your stream.
https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-resharding-strategies.html

Option C is correct: Collection reduces the overhead of making many separate HTTP requests for a multi-shard stream. Batching refers to performing a single action on multiple items instead of repeatedly performing the action on each individual item. Collection refers to batching multiple Kinesis Data Streams records and sending them in a single HTTP request with a call to the API operation PutRecords, instead of sending each Kinesis Data Streams record in its own HTTP request. This increases throughput compared to using no collection because it reduces the overhead of making many separate HTTP requests.
https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-concepts.html

Option D is incorrect: This may not be a viable option because we are still working on the strategy to redesign our sharding mechanism. We first need metrics to identify hot and cold shards before proceeding with that redesign. Besides, the purpose of resharding in Amazon Kinesis Data Streams is to enable your stream to adapt to changes in the rate of data flow. You split shards to increase the capacity (and cost) of your stream. You merge shards to reduce the cost (and capacity) of your stream.
https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-resharding-strategies.html

Option E is incorrect: Enhanced Kinesis Data Streams monitoring-level metrics provide information about the stream at the shard level. They do not provide information about data ingestion. Kinesis sends shard-level metrics to CloudWatch every minute. These metrics are not enabled by default, and there is a charge for enhanced metrics emitted from Kinesis. Shard-level metrics are intended for specific monitoring tasks, usually related to troubleshooting.
https://docs.aws.amazon.com/streams/latest/dev/monitoring-with-cloudwatch.html#kinesis-metrics

Option F is correct: The Kinesis Producer Library (KPL) for Amazon Kinesis Data Streams publishes custom Amazon CloudWatch metrics. You specify an application name when launching the KPL, and that name is used as part of the namespace when uploading metrics. You can also configure the KPL to add arbitrary additional dimensions to the metrics, which is useful if you want finer-grained data in your CloudWatch metrics. Two important settings for KPL metrics are level and granularity. The levels are NONE, SUMMARY, and DETAILED. The granularities are GLOBAL, STREAM, and SHARD. When SHARD is chosen, metrics are emitted with the stream name and shard ID as dimensions. Metrics for the current KPL instance are available locally in real time; you can query the KPL at any time to get them. The KPL locally computes the sum, average, minimum, maximum, and count of every metric, as in CloudWatch.
https://docs.aws.amazon.com/streams/latest/dev/monitoring-with-kpl.html
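
The throughput arguments for Options A and C can be checked with some quick arithmetic. The following is a minimal sketch, not AWS code: the function names are hypothetical, 1 MB is taken as 10^6 bytes for simplicity, aggregation framing overhead is ignored, and the batch size of 500 is the documented maximum number of records per PutRecords call.

```python
import math

# Per-shard write limits quoted in the explanation for Option A.
SHARD_RECORDS_PER_SEC = 1_000
SHARD_BYTES_PER_SEC = 1_000_000  # assumption: 1 MB taken as 10**6 bytes
PUT_RECORDS_MAX_BATCH = 500      # PutRecords accepts up to 500 records per call


def user_records_per_sec(record_bytes: int, aggregated: bool) -> int:
    """Upper bound on user records one shard can ingest per second."""
    by_bandwidth = SHARD_BYTES_PER_SEC // record_bytes
    if aggregated:
        # Many user records share one Kinesis Data Streams record, so in
        # this simplified model only the byte limit applies.
        return by_bandwidth
    # Without aggregation each user record counts against the
    # 1,000 records-per-second limit as well as the byte limit.
    return min(SHARD_RECORDS_PER_SEC, by_bandwidth)


def http_requests(kinesis_records: int, collected: bool) -> int:
    """HTTP requests needed to send the given number of Kinesis records."""
    if collected:
        # Collection: records are batched into PutRecords calls.
        return math.ceil(kinesis_records / PUT_RECORDS_MAX_BATCH)
    # No collection: one request per record.
    return kinesis_records


if __name__ == "__main__":
    for size in (100, 500, 2_000):
        plain = user_records_per_sec(size, aggregated=False)
        packed = user_records_per_sec(size, aggregated=True)
        print(f"{size}-byte records: {plain}/s plain, {packed}/s aggregated")
    print(http_requests(10_000, collected=False), "requests without collection")
    print(http_requests(10_000, collected=True), "requests with collection")
```

Under these assumptions, 100-byte records are capped at 1,000 per second per shard without aggregation but could reach about 10,000 per second with it, while records of 1 KB or more gain nothing because the byte limit already binds. Likewise, 10,000 Kinesis records need 10,000 HTTP requests without collection but only 20 with it.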
