Answer: A, C, F
Option A is correct - Aggregation helps to improve the per-shard throughput. This also optimizes the overall TCO of the stream.
Batching refers to performing a single action on multiple items instead of repeatedly performing the action on each individual item.
Aggregation refers to the storage of multiple records in a Kinesis Data Streams record. Aggregation allows customers to increase the number of records sent per API call, which effectively increases producer throughput.
Each Kinesis Data Streams shard supports up to 1,000 Kinesis Data Streams records per second, or 1 MB per second of throughput, whichever is reached first. The records-per-second limit is the binding constraint for customers whose records are smaller than 1 KB. Record aggregation allows customers to combine multiple records into a single Kinesis Data Streams record, which improves their per-shard throughput.
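The effect of the two per-shard limits can be illustrated with a little arithmetic. The following is a minimal sketch, not part of the KPL: the function name `user_records_per_sec` is hypothetical, "1 MB" is assumed to mean 10**6 bytes, and aggregation framing overhead and partition-key bytes are ignored.

```python
# Per-shard write limits quoted in the explanation above.
MAX_RECORDS_PER_SEC = 1_000
MAX_BYTES_PER_SEC = 1_000_000  # assumption: "1 MB" treated as 10**6 bytes


def user_records_per_sec(user_record_bytes: int, records_per_aggregate: int = 1) -> int:
    """Rough upper bound on user records per second for one shard.

    Hypothetical helper: ignores aggregation overhead and partition keys.
    """
    # Size of one Kinesis Data Streams record after aggregation.
    stream_record_bytes = user_record_bytes * records_per_aggregate
    # The shard stops accepting data at whichever limit is hit first.
    stream_records = min(MAX_RECORDS_PER_SEC, MAX_BYTES_PER_SEC // stream_record_bytes)
    return stream_records * records_per_aggregate


if __name__ == "__main__":
    print(user_records_per_sec(100))      # no aggregation, 100-byte records
    print(user_records_per_sec(100, 10))  # ten user records per stream record
```

Under these assumptions, 100-byte records sent individually are capped by the record-count limit, while packing ten of them into each stream record lets the shard reach the byte limit instead.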
https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-concepts.html
Option B is incorrect - This may not be a viable option because we are still working out the strategy for redesigning our sharding mechanism. We first need metrics to identify hot and cold shards before we proceed with the redesign.
Besides, the purpose of resharding in Amazon Kinesis Data Streams is to enable your stream to adapt to changes in the rate of data flow. You split shards to increase the capacity (and cost) of your stream. You merge shards to reduce the cost (and capacity) of your stream.
https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-resharding-strategies.html
Option C is correct - Collection reduces the overhead of making many separate HTTP requests for a multi-shard stream.
Batching refers to performing a single action on multiple items instead of repeatedly performing the action on each individual item.
Collection refers to batching multiple Kinesis Data Streams records and sending them in a single HTTP request with a call to the API operation PutRecords, instead of sending each Kinesis Data Streams record in its own HTTP request.
This increases throughput compared to using no collection because it reduces the overhead of making many separate HTTP requests.
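The idea of collection can be sketched as a simple batching loop. This is an illustrative sketch only, not how the KPL implements it: the function `collect` is hypothetical, and the default limits (500 records and 5,000,000 bytes per request) are assumptions that should be checked against the current PutRecords documentation. Each yielded batch would correspond to one PutRecords call; the call itself is left out so the example stays self-contained.

```python
from typing import Iterable, Iterator, List


def collect(
    records: Iterable[bytes],
    max_records: int = 500,        # assumed per-request record limit
    max_bytes: int = 5_000_000,    # assumed per-request payload limit
) -> Iterator[List[bytes]]:
    """Group records into batches, one batch per HTTP request."""
    batch: List[bytes] = []
    batch_bytes = 0
    for record in records:
        # Close the current batch if adding this record would exceed a limit.
        if batch and (len(batch) >= max_records or batch_bytes + len(record) > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(record)
        batch_bytes += len(record)
    if batch:
        yield batch  # flush the final, possibly partial, batch


if __name__ == "__main__":
    batches = list(collect([b"x" * 10] * 1200))
    print(len(batches), [len(b) for b in batches])
```

With these assumed limits, 1,200 small records would need three requests rather than 1,200.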
https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-concepts.html
Option D is incorrect - This may not be a viable option because we are still working out the strategy for redesigning our sharding mechanism. We first need metrics to identify hot and cold shards before we proceed with the redesign.
Besides, the purpose of resharding in Amazon Kinesis Data Streams is to enable your stream to adapt to changes in the rate of data flow. You split shards to increase the capacity (and cost) of your stream. You merge shards to reduce the cost (and capacity) of your stream.
https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-resharding-strategies.html
Option E is incorrect - Enhanced (shard-level) monitoring in Kinesis Data Streams provides metrics about the stream at the shard level. This does not provide information about data ingestion.
Kinesis sends shard-level metrics to CloudWatch every minute. These metrics are not enabled by default. There is a charge for enhanced metrics emitted from Kinesis. Shard-level metrics are for specific monitoring tasks, usually related to troubleshooting.
https://docs.aws.amazon.com/streams/latest/dev/monitoring-with-cloudwatch.html#kinesis-metrics
Option F is correct - The Kinesis Producer Library (KPL) for Amazon Kinesis Data Streams publishes custom Amazon CloudWatch metrics. You can specify an application name when launching the KPL, which is then used as part of the namespace when uploading metrics. You can also configure the KPL to add arbitrary additional dimensions to the metrics. This is useful if you want finer-grained data in your CloudWatch metrics.
Two important settings for a metric are its level and its granularity. The levels are NONE, SUMMARY, and DETAILED, while the granularities are GLOBAL, STREAM, and SHARD. When SHARD is chosen, metrics are emitted with the stream name and shard ID as dimensions. Metrics for the current KPL instance are available locally in real time; you can query the KPL at any time to get them. The KPL locally computes the sum, average, minimum, maximum, and count of every metric, as in CloudWatch.
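To make the granularity setting and the local statistics concrete, here is a small model in Python. It is a sketch under assumptions, not the KPL API: the functions `metric_dimensions` and `summarize` are hypothetical, the dimension keys `StreamName` and `ShardId` are illustrative names, and the behavior for GLOBAL and STREAM (no dimensions, and stream name only) is inferred from the SHARD case described above.

```python
from typing import Dict, List


def metric_dimensions(granularity: str, stream: str, shard_id: str) -> Dict[str, str]:
    """Dimensions attached to a metric at a given granularity (assumed model)."""
    if granularity == "GLOBAL":
        return {}
    if granularity == "STREAM":
        return {"StreamName": stream}
    if granularity == "SHARD":
        # Stated in the text: stream name and shard ID are both dimensions.
        return {"StreamName": stream, "ShardId": shard_id}
    raise ValueError(f"unknown granularity: {granularity}")


def summarize(values: List[float]) -> Dict[str, float]:
    """The five statistics the text says are computed locally for every metric."""
    return {
        "sum": sum(values),
        "average": sum(values) / len(values),
        "minimum": min(values),
        "maximum": max(values),
        "count": len(values),
    }


if __name__ == "__main__":
    print(metric_dimensions("SHARD", "orders", "shardId-000000000001"))
    print(summarize([2.0, 4.0, 6.0]))
```

SHARD granularity is the one that makes hot and cold shards distinguishable, because each data point carries the shard ID.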
https://docs.aws.amazon.com/streams/latest/dev/monitoring-with-kpl.html