AWS Certified CloudOps Engineer Associate Practice Test (SOA-C03)
Use the form below to configure your AWS Certified CloudOps Engineer Associate Practice Test (SOA-C03). The practice test can be configured to include only certain exam objectives and domains. You can choose between 5 and 100 questions and set a time limit.

AWS Certified CloudOps Engineer Associate SOA-C03 Information
The AWS Certified CloudOps Engineer – Associate certification validates your ability to deploy, operate, and manage cloud workloads on AWS. It’s designed for professionals who maintain and optimize cloud systems while ensuring they remain reliable, secure, and cost-efficient. This certification focuses on modern cloud operations and engineering practices, emphasizing automation, monitoring, troubleshooting, and compliance across distributed AWS environments. You’ll be expected to understand how to manage and optimize infrastructure using services like CloudWatch, CloudTrail, EC2, Lambda, ECS, EKS, IAM, and VPC.
The exam covers the full lifecycle of cloud operations through five key domains: Monitoring and Performance, Reliability and Business Continuity, Deployment and Automation, Security and Compliance, and Networking and Content Delivery. Candidates are tested on their ability to configure alerting and observability, apply best practices for fault tolerance and high availability, implement infrastructure as code, and enforce security policies across AWS accounts. You’ll also demonstrate proficiency in automating common operational tasks and handling incident response scenarios using AWS tools and services.
Earning this certification shows employers that you have the technical expertise to manage AWS workloads efficiently at scale. It’s ideal for CloudOps Engineers, Cloud Support Engineers, and Systems Administrators who want to prove their ability to keep AWS environments running smoothly in production. By earning this credential, you demonstrate the hands-on skills needed to ensure operational excellence and reliability in today’s fast-moving cloud environments.

Free AWS Certified CloudOps Engineer Associate SOA-C03 Practice Test
- 20 Questions
- Unlimited
- Monitoring, Logging, Analysis, Remediation, and Performance Optimization
- Reliability and Business Continuity
- Deployment, Provisioning, and Automation
- Security and Compliance
- Networking and Content Delivery
An operations team must migrate 500 TB of data from an on-premises NFS file server to an Amazon S3 bucket in us-east-1 within seven days. The site has a dedicated, mostly idle 10 Gbps AWS Direct Connect link. The team wants the simplest AWS DataSync configuration that can saturate the link during the bulk copy and then be reused for scheduled incremental syncs after the cut-over. What should the team do?
Deploy a single DataSync agent on-premises, create one DataSync task that copies the entire share to the bucket, and leave the task bandwidth setting at "Use available".
Deploy two DataSync agents and assign both agents to the same task so the task can reach 20 Gbps.
Deploy two DataSync agents and create two separate tasks, each copying half of the directory tree to different prefixes in the bucket.
Use the AWS CLI with aws s3 sync for the full copy, and reserve DataSync only for incremental syncs.
Answer Description
A single DataSync task can use up to 10 Gbps, which is approximately 1.25 GB/s or about 108 TB per day in ideal conditions. Even allowing for protocol overhead, a single agent running at roughly 70 TB per day can move 500 TB in seven days (7 × 70 ≈ 490 TB). Therefore, deploying one on-premises agent and configuring the task to use all available bandwidth is sufficient to meet the timeline while keeping the setup simple. DataSync automatically tracks metadata and can be re-run on a schedule for incremental copies. Adding extra agents or splitting the dataset adds complexity without increasing throughput, because the 10 Gbps Direct Connect link remains the bottleneck.
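The sizing math above can be checked with a quick back-of-the-envelope calculation. This sketch uses decimal units (1 TB = 10^12 bytes) and treats the 70 TB/day figure as an assumed effective rate after overhead, not an AWS-published number:

```python
# Back-of-the-envelope check of the DataSync sizing math.
# Assumes decimal units and an ideal, fully saturated link; the
# 70 TB/day figure is an assumed effective rate after protocol overhead.

link_gbps = 10                        # Direct Connect link speed, gigabits/s
bytes_per_sec = link_gbps * 1e9 / 8   # 1.25e9 bytes/s = 1.25 GB/s
seconds_per_day = 86_400

ideal_tb_per_day = bytes_per_sec * seconds_per_day / 1e12
print(f"Ideal throughput: {ideal_tb_per_day:.0f} TB/day")   # 108 TB/day

effective_tb_per_day = 70             # assumed rate with overhead
moved_in_week = 7 * effective_tb_per_day
print(f"Moved in 7 days at {effective_tb_per_day} TB/day: {moved_in_week} TB")
```

Even at roughly 65% of the ideal rate, seven days is enough to move approximately 490 TB, which is in line with the 500 TB requirement.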
Ask Bash
Bash is our AI bot, trained to help you pass your exam. AI Generated Content may display inaccurate information, always double-check anything important.
What is AWS DataSync used for?
What is the bandwidth limit per task in AWS DataSync?
How does AWS Direct Connect benefit data transfers?
What are the benefits of using DataSync for both bulk and incremental transfers?
A DevOps engineer installed the unified CloudWatch agent on dozens of Amazon EC2 instances in two Regions. The agent publishes the mem_used_percent metric in the CWAgent namespace with the InstanceId dimension. The engineer must receive a single SNS notification whenever any instance's memory usage is above 80% for three consecutive 1-minute periods while minimizing management effort and CloudWatch costs. Which approach satisfies these requirements?
Create an EventBridge rule that matches every PutMetricData API call from the CWAgent namespace, route the events to an SNS topic, and use SNS message filtering to detect values above 80%.
Create a standard CloudWatch alarm for each EC2 instance that monitors mem_used_percent, then create a composite alarm that aggregates these alarms and publishes a notification to SNS.
Stream the mem_used_percent metrics to CloudWatch Logs, configure a metric filter that counts occurrences above 80% in a 1-minute window, and create a CloudWatch alarm on that filter to send an SNS notification.
Create a CloudWatch alarm on the Metrics Insights query SELECT MAX(mem_used_percent) FROM "CWAgent" with a 60-second period, threshold 80, and three evaluation periods, and configure the alarm to publish to Amazon SNS.
Answer Description
A Metrics Insights query can aggregate the memory-utilization metrics from every instance into one time series, so the engineer needs only one alarm. Using SELECT MAX(mem_used_percent) FROM "CWAgent" with a 60-second period returns the highest utilization observed in each minute. Configuring the alarm with a threshold of 80 and three evaluation periods generates an SNS notification only when any instance breaches the limit for three consecutive minutes. This single alarm is cheaper and simpler than creating individual alarms, building composite alarms, or forwarding every PutMetricData call through EventBridge or log filters.
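As a sketch, the alarm in the correct option could be defined with parameters along these lines (boto3 put_metric_alarm naming; the alarm name and SNS topic ARN are placeholders, and only the parameter dictionary is built here, without making an API call):

```python
# Sketch of the parameters for a CloudWatch alarm on a Metrics Insights
# query, using boto3 put_metric_alarm parameter names. The alarm name
# and SNS topic ARN are placeholders; no API call is made.

alarm_params = {
    "AlarmName": "any-instance-memory-above-80",
    "Metrics": [
        {
            "Id": "q1",
            # Metrics Insights query aggregating memory across instances
            "Expression": 'SELECT MAX(mem_used_percent) FROM "CWAgent"',
            "Period": 60,          # 1-minute granularity
            "ReturnData": True,
        }
    ],
    "ComparisonOperator": "GreaterThanThreshold",
    "Threshold": 80,
    "EvaluationPeriods": 3,        # three consecutive 1-minute periods
    "AlarmActions": ["arn:aws:sns:us-east-1:123456789012:ops-alerts"],
}
print(alarm_params["Metrics"][0]["Expression"])
```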
Ask Bash
What is the CloudWatch unified agent?
How does a Metrics Insights query work in CloudWatch?
What is the purpose of evaluation periods in a CloudWatch alarm?
What is the advantage of using a single CloudWatch alarm with aggregated metrics?
An operations engineer is troubleshooting a Java application running on an EC2 instance in a private subnet that suddenly fails to connect to an Amazon RDS for MySQL database in the same VPC. The instance is attached to security group sg-app, whose only outbound rules allow TCP ports 80 and 443 to 0.0.0.0/0. The database is attached to sg-db, whose inbound rules allow TCP 3306 from sg-app. Network ACLs and route tables already permit all traffic between the subnets. Which change will most effectively restore connectivity while adhering to the principle of least privilege?
Add an outbound rule to sg-app that allows TCP 3306 with sg-db as the destination.
Associate both the EC2 instance and the database with the default security group.
Add an inbound rule to sg-app that allows TCP 3306 from sg-db.
Broaden sg-db's inbound rule to allow TCP 3306 from 0.0.0.0/0.
Answer Description
The initial client connection originates from the EC2 instance, so sg-app must allow egress on the destination port. Because sg-app's outbound rules currently restrict traffic to ports 80 and 443, the SYN packet for MySQL (TCP 3306) is dropped before it ever reaches sg-db. Adding an outbound rule that permits TCP 3306 with sg-db (or its CIDR) as the destination allows the connection to be established, while still limiting other outbound traffic. Opening sg-db to 0.0.0.0/0 or moving both resources to the default security group would grant unnecessary access, and adding an inbound rule to sg-app is irrelevant because the instance is the initiator and security groups are stateful, so return traffic is already allowed.
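The least-privilege egress rule can be sketched as follows (boto3 authorize_security_group_egress naming; both security group IDs are placeholders, and only the request dictionary is built):

```python
# Sketch of the least-privilege egress rule for sg-app, using boto3
# authorize_security_group_egress parameter names. Group IDs are
# placeholders; no API call is made.

egress_rule = {
    "GroupId": "sg-0app0000000000000",            # sg-app (placeholder)
    "IpPermissions": [
        {
            "IpProtocol": "tcp",
            "FromPort": 3306,                     # MySQL port
            "ToPort": 3306,
            # Reference sg-db directly instead of a CIDR so only members
            # of the database security group are reachable.
            "UserIdGroupPairs": [{"GroupId": "sg-0db00000000000000"}],
        }
    ],
}
```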
Ask Bash
What is the principle of least privilege in the context of security groups?
How do security groups differ from network ACLs in AWS?
What are the implications of using 0.0.0.0/0 in a security group rule?
Why does sg-app need an outbound rule for TCP 3306?
Why don’t we need to modify sg-db's inbound rules for this scenario?
An Ops team will launch a new VPC (10.0.0.0/16) spanning two Availability Zones. Each AZ will host one public and one private subnet. Resources in private subnets must initiate outbound internet connections even if one AZ becomes unavailable, and networking costs should be kept as low as AWS best practices allow. Which subnet and NAT configuration meets these requirements?
Create one NAT gateway in a public subnet of Availability Zone A and associate both private subnet route tables with this gateway.
Launch a single NAT instance in one public subnet and update both private subnet route tables to forward 0.0.0.0/0 traffic to that instance.
Deploy a NAT gateway in each public subnet and configure each private subnet's route table to use the NAT gateway located in the same Availability Zone.
Provision two NAT gateways in a dedicated services subnet located in Availability Zone A and point all private subnets to those gateways for internet access.
Answer Description
Placing a NAT gateway in each Availability Zone and ensuring that the private subnet in that AZ routes traffic to the local gateway provides resiliency against an AZ outage; if one zone fails, the other zone's NAT gateway continues to operate. This design also avoids the cross-AZ data processing charges that occur when a subnet routes through a NAT gateway in another AZ. A single NAT gateway or NAT instance represents a single point of failure, and locating both gateways in one AZ undermines availability while still incurring cross-AZ traffic costs.
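The zonal routing described above can be sketched as two create_route requests, one per private route table (boto3 naming; all resource IDs are placeholders):

```python
# Sketch of the per-AZ routing, using boto3 create_route parameter
# names. All IDs are placeholders; each private route table points at
# the NAT gateway in its own Availability Zone, so no traffic crosses
# AZ boundaries on the way out.

routes = [
    {   # AZ a: private subnet's default route -> NAT gateway in AZ a
        "RouteTableId": "rtb-private-az-a",
        "DestinationCidrBlock": "0.0.0.0/0",
        "NatGatewayId": "nat-az-a",
    },
    {   # AZ b: private subnet's default route -> NAT gateway in AZ b
        "RouteTableId": "rtb-private-az-b",
        "DestinationCidrBlock": "0.0.0.0/0",
        "NatGatewayId": "nat-az-b",
    },
]
```

If AZ a fails, the AZ b route table and NAT gateway are unaffected, which is the resiliency property the correct option relies on.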
Ask Bash
Why does each AZ need its own NAT gateway?
What are cross-AZ data processing charges?
What is a NAT gateway and how does it differ from a NAT instance?
A company runs an application on Amazon EC2 Linux instances that are launched by an Auto Scaling group. Operations must collect Apache access logs and memory utilization from every instance, send the data to Amazon CloudWatch, and ensure that any update to the collection settings is applied automatically to new and running instances without storing credentials on the servers. Which solution meets these requirements with the LEAST operational overhead?
Bake the CloudWatch Logs agent and a cron-based script that runs the aws cloudwatch put-metric-data CLI command into the AMI, passing long-lived access keys to the instances with user data. Rebuild the AMI whenever the configuration changes.
Turn on AWS CloudTrail management and data events for the account, enable CloudTrail Insights, and create a CloudWatch Logs subscription filter to capture Apache logs and memory metrics.
Enable detailed monitoring on the Auto Scaling group and write a shell script that copies Apache logs to an S3 bucket every five minutes; configure an S3 event to import the logs into CloudWatch Logs.
Store a CloudWatch agent JSON configuration in Systems Manager Parameter Store. Attach an IAM instance profile that includes AmazonSSMManagedInstanceCore and CloudWatchAgentServerPolicy in the launch template. Use Systems Manager Run Command with AWS-ConfigureAWSPackage to install the CloudWatch agent and AmazonCloudWatch-ManageAgent to start it, so the agent automatically downloads the configuration and publishes Apache logs and memory metrics.
Answer Description
The unified CloudWatch agent can collect both system-level metrics such as memory utilization and application logs like Apache access logs. By storing the agent's JSON configuration centrally in AWS Systems Manager Parameter Store, any change to the configuration can be fetched when the agent starts or is refreshed, so new and existing Auto Scaling instances stay consistent without rebuilding AMIs. Installing the package with the AWS-ConfigureAWSPackage Run Command document and then starting or reloading it with the AmazonCloudWatch-ManageAgent document lets you push the latest configuration fleet-wide. An instance profile that includes AmazonSSMManagedInstanceCore and CloudWatchAgentServerPolicy supplies the necessary permissions, so no static credentials are stored on the servers.
The older CloudWatch Logs agent publishes only logs and would still require custom scripts for memory metrics and a manual update process. Copying logs to Amazon S3 and importing them into CloudWatch Logs does not solve in-guest memory monitoring and adds scripting overhead. Enabling CloudTrail and CloudTrail Insights records management and data-plane API events but cannot collect operating-system metrics or web-server log files. Therefore, the Systems Manager-based deployment of the CloudWatch agent is the lowest-maintenance approach.
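The two Run Command invocations can be sketched as follows (boto3 send_command naming; the target tag and the Parameter Store key holding the agent JSON are assumed placeholders):

```python
# Sketch of the two Systems Manager Run Command invocations, using
# boto3 send_command parameter names. The target tag and the Parameter
# Store name are placeholders; no API calls are made.

install_agent = {
    "DocumentName": "AWS-ConfigureAWSPackage",
    "Targets": [{"Key": "tag:App", "Values": ["web"]}],   # assumed tag
    "Parameters": {
        "action": ["Install"],
        "name": ["AmazonCloudWatchAgent"],                # agent package
    },
}

start_agent = {
    "DocumentName": "AmazonCloudWatch-ManageAgent",
    "Targets": [{"Key": "tag:App", "Values": ["web"]}],
    "Parameters": {
        "action": ["configure"],
        "mode": ["ec2"],
        "optionalConfigurationSource": ["ssm"],
        # Parameter Store key holding the agent JSON (placeholder name)
        "optionalConfigurationLocation": ["AmazonCloudWatch-Config"],
        "optionalRestart": ["yes"],
    },
}
```

Re-running the second command after editing the Parameter Store value pushes the new configuration fleet-wide without touching the AMI.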
Ask Bash
What is the role of AWS Systems Manager Parameter Store in this solution?
What permissions are provided by AmazonSSMManagedInstanceCore and CloudWatchAgentServerPolicy?
Why is the unified CloudWatch agent preferred over the older CloudWatch Logs agent?
Why is IAM instance profile important for this solution?
An operations team runs a Lambda function named ValidateTag in AWS account 111111111111. A custom AWS Config rule that resides in account 222222222222 must invoke this function. Security policy states that all permissions must be managed from account 111111111111 and that no IAM roles may be created or modified in account 222222222222. Which approach meets these requirements while following the principle of least privilege?
Share the ValidateTag function with account 222222222222 by using AWS Resource Access Manager; the share automatically grants invoke permissions to AWS Config.
Add a permission to the Config rule that allows it to assume a role in account 111111111111 which has lambda:* permissions on the function.
Create an IAM role in account 222222222222 that trusts account 111111111111, attach the lambda:InvokeFunction permission to it, and reference the role ARN in the Config rule.
From account 111111111111 run aws lambda add-permission --function-name ValidateTag --statement-id AllowConfigCrossAccount --action lambda:InvokeFunction --principal config.amazonaws.com --source-account 222222222222, which adds a resource-based policy to the function.
Answer Description
Because AWS Config is a service, the safest way to let a Config rule in another account invoke the function is to add a resource-based policy statement to the Lambda function in the owning account. Using the lambda:add-permission API (or the console) adds a statement that names config.amazonaws.com as the principal, grants only the lambda:InvokeFunction action, and scopes the permission to the caller's account with a SourceAccount or SourceArn condition. No new roles are needed in the calling account. The other options either rely on creating or changing roles in account 222222222222, use unsupported sharing mechanisms, or grant the wrong action, so they violate the stated constraints or least-privilege practice.
Ask Bash
What is a resource-based policy in AWS Lambda?
What does the `lambda:add-permission` API command do?
What is the principle of least privilege in AWS?
What is the purpose of the `SourceAccount` condition in the policy?
A DevOps engineer updates a networking CloudFormation stack that currently exports its VPC ID as DevVpcId. The revised template exports a different VPC ID but retains the same export name. On update CloudFormation fails with "Export DevVpcId cannot be updated as it is in use by stack AppStack." AppStack must stay running and unchanged. Which action enables deployment of the new VPC without triggering the export error?
Add the CAPABILITY_NAMED_IAM flag to the update command so CloudFormation can overwrite the existing export.
Enable termination protection on AppStack before updating the networking stack to suppress the export conflict.
Rename the new VPC ID output to a unique export name (for example DevVpcIdV2) and then update the networking stack.
Grant the deployment role cloudformation:UpdateExport permission and retry the stack update.
Answer Description
CloudFormation does not permit changing or deleting an exported value while it is being imported by another stack. Because AppStack still imports the output named DevVpcId, the networking stack cannot modify that export. The correct remediation is to keep the original export intact and create a new export with a unique name (for example DevVpcIdV2) that contains the new VPC ID. This allows the networking stack update to succeed without disrupting AppStack. The other options do not address the export-name constraint: adding CAPABILITY_NAMED_IAM only affects IAM resources, granting extra IAM permissions to the deployment role will not bypass the limitation, and enabling termination protection has no impact on export-name conflicts.
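A sketch of the networking template's Outputs section after the fix, in JSON template syntax (the logical resource IDs OldVpc and NewVpc are illustrative); the original DevVpcId export is left untouched while the new VPC is exported under a new name:

```json
"Outputs": {
  "DevVpcId": {
    "Value": { "Ref": "OldVpc" },
    "Export": { "Name": "DevVpcId" }
  },
  "DevVpcIdV2": {
    "Value": { "Ref": "NewVpc" },
    "Export": { "Name": "DevVpcIdV2" }
  }
}
```

AppStack can later be migrated to import DevVpcIdV2 at its own pace, after which the old export can be retired.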
Ask Bash
Why does CloudFormation not allow changing exported values?
What is the purpose of CAPABILITY_NAMED_IAM in CloudFormation?
How can an export conflict be safely resolved in CloudFormation?
What does creating a unique export name like DevVpcIdV2 achieve?
Your company manages infrastructure for multiple AWS accounts using Terraform. You must build a CI/CD pipeline that: validates plans on every commit, stores Terraform state centrally with locking to prevent simultaneous writes, and avoids long-lived credentials in the pipeline environment. Which approach meets these requirements while following AWS and Terraform best practices?
Store the state file in a CodeCommit repository and enable repository versioning; store each account's access keys in Secrets Manager and inject them into the build environment.
Wrap Terraform modules in CloudFormation StackSets and use CloudFormation as the remote backend; pass cross-account role ARNs to CodePipeline through environment variables.
Configure an encrypted, versioned S3 bucket with a DynamoDB table for state locking; have CodeBuild assume an environment-specific IAM role via STS and run Terraform with the S3 backend.
Use the local backend on the CodeBuild container and rely on CodePipeline artifact versioning; create a single IAM user with AdministratorAccess and embed its access keys in the buildspec file.
Answer Description
Storing the Terraform state in an S3 bucket that has server-side encryption and versioning, while using a DynamoDB table for state locking, satisfies the requirement for a central, collision-free state store. In the pipeline, CodeBuild can assume an account-specific IAM role through AWS STS, so no permanent access keys are exposed. Terraform is initialized with the S3 backend and automatically uses the temporary credentials provided by the assumed role. The other options either lack state locking, rely on insecure long-lived credentials, or misuse services (for example, CodeCommit and CloudFormation are not supported remote backends for Terraform state).
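A minimal sketch of the backend block described above (the bucket name, state key, and lock table name are placeholders):

```hcl
terraform {
  backend "s3" {
    bucket         = "tf-state-central"            # placeholder bucket
    key            = "accounts/dev/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true                          # server-side encryption
    dynamodb_table = "tf-state-locks"              # placeholder lock table
  }
}
```

When CodeBuild assumes the environment-specific role before terraform init, the backend and the providers both pick up the temporary STS credentials from the environment, so no keys ever appear in the buildspec.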
Ask Bash
Why is an S3 bucket with DynamoDB used for managing Terraform state?
How does AWS STS help avoid long-lived credentials in pipelines?
Why are the other options for Terraform state management incorrect?
What is state locking in Terraform and why is it important?
What is the Terraform S3 backend and how does it work?
Your company has three AWS accounts (A, B, and C) that belong to the same organization. Operations wants a single CloudWatch dashboard in account A (us-east-1) that shows EC2 CPUUtilization metrics from accounts B and C in both us-east-1 and eu-west-1. They need the simplest solution that avoids copying data between Regions or running additional agents. Which set of steps will meet these requirements according to AWS best practices?
Enable cross-Region replication for CloudWatch in accounts B and C so that their eu-west-1 metrics are copied to us-east-1, then create a single-Region dashboard in account A.
Install the CloudWatch agent on all instances in accounts B and C with a configuration that publishes the metrics directly into the log group of account A.
Share each Region's metrics from accounts B and C to account A by using AWS Resource Access Manager, then add the shared metrics to a dashboard in account A.
In accounts B and C, create an IAM role that trusts account A and grants CloudWatch read-only access; from account A, build the dashboard widgets using metric identifiers that include the source account IDs and Regions.
Answer Description
CloudWatch dashboards can display metrics that reside in another Region or another account without any data replication. Each source account must allow the dashboard account to read its metrics. The documented way is to create an IAM role in each source account that trusts the monitoring account, attach a read-only CloudWatch policy, and then reference the remote Region and account ID when you build the widget (for example, account-id:region, namespace, metric-name, dimensions). No resource shares, data exports, or agents are required. The console and PutDashboard API automatically assume the role when rendering the widget, so the dashboard refreshes seamlessly across accounts and Regions. Options suggesting AWS RAM, data replication, or running agents introduce unnecessary complexity and are not required for cross-account, cross-Region dashboards.
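As an illustration, a dashboard widget in account A might reference a remote metric with per-metric accountId and region rendering options along these lines (the instance ID and account ID are placeholders):

```json
{
  "type": "metric",
  "properties": {
    "title": "Cross-account CPU",
    "region": "us-east-1",
    "metrics": [
      [ "AWS/EC2", "CPUUtilization", "InstanceId", "i-0123456789abcdef0",
        { "accountId": "222222222222", "region": "eu-west-1" } ]
    ]
  }
}
```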
Ask Bash
Why is creating an IAM role necessary for sharing metrics between AWS accounts?
How does CloudWatch support cross-account and cross-region metric dashboards without data replication?
What is the significance of the metric identifiers when building CloudWatch dashboard widgets across accounts?
What is a CloudWatch dashboard and how is it used?
An auto-scaling script sometimes goes out of control and issues a flood of RunInstances API requests, quickly exhausting the AWS account's service quotas. You need an AWS-native mechanism that detects the abnormal surge in RunInstances call rate and immediately invokes a Lambda function that disables the script's IAM role. Which solution provides the required automation with the least ongoing operational overhead?
Enable AWS Config and write a custom rule that counts RunInstances API calls; have the rule invoke the Lambda function when the count exceeds the allowed limit.
Enable CloudTrail Insights for management events and create an EventBridge rule that matches "AWS Insight via CloudTrail" events where insightType is ApiCallRateInsight; set the rule's target to the Lambda function that disables the IAM role.
Turn on VPC Flow Logs and use CloudWatch Contributor Insights to detect traffic spikes; create an EventBridge rule that triggers the Lambda function when flow-log entries exceed a threshold.
Send CloudTrail logs to CloudWatch Logs, build a metric filter to count RunInstances calls per minute, add a CloudWatch alarm on the metric, and configure the alarm to invoke the Lambda function through SNS.
Answer Description
CloudTrail Insights analyzes write-management events such as RunInstances. When the RunInstances call rate deviates from the baseline, CloudTrail generates an Insight event whose detail-type is "AWS Insight via CloudTrail" with insightType set to ApiCallRateInsight. EventBridge can match this event and invoke a Lambda target. Enabling CloudTrail Insights plus a single EventBridge rule needs no manual thresholds or extra infrastructure.
Streaming CloudTrail logs to CloudWatch Logs with a metric filter would work but requires hand-tuned thresholds and ongoing maintenance. VPC Flow Logs and Contributor Insights monitor network traffic, not API frequencies. AWS Config evaluates resource configurations on a schedule and cannot track API call rates in near real time. Therefore, CloudTrail Insights with an EventBridge rule is the most operationally efficient choice.
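The EventBridge rule in the correct option would use an event pattern along these lines, matching the fields named in the scenario:

```json
{
  "detail-type": ["AWS Insight via CloudTrail"],
  "detail": {
    "insightType": ["ApiCallRateInsight"]
  }
}
```

The rule's target is the Lambda function that disables the script's IAM role, so remediation runs as soon as CloudTrail detects the anomalous call rate, with no hand-tuned threshold to maintain.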
Ask Bash
What is CloudTrail Insights and how does it work?
How does EventBridge integrate with CloudTrail Insights?
Why are alternatives like CloudWatch metric filters less efficient for detecting API anomalies?
A company hosts an internal REST API on Amazon EC2 instances in a "service VPC" that resides in Account A. Several developer teams in other AWS accounts need to consume this API from private subnets in their own VPCs. Security policy states that traffic must stay on the AWS network, the service VPC must not accept any inbound connections over VPC peering, and each consumer VPC must be able to use its own CIDR range without overlap constraints. Which approach satisfies the requirements with the least operational effort?
Expose the API through an internet-facing Application Load Balancer and require each consumer subnet to use a NAT gateway for outbound calls.
Attach all VPCs to an AWS Transit Gateway and advertise the service VPC subnet routes to the consumer VPCs through Transit Gateway route tables.
Establish VPC peering connections between the service VPC and every consumer VPC, then update route tables to point traffic to the peering links.
Place the API behind a Network Load Balancer, create a VPC endpoint service, and let each consumer VPC connect through an interface VPC endpoint (AWS PrivateLink).
Answer Description
Publishing the API through AWS PrivateLink keeps all traffic on the AWS backbone and removes the need to manage complex routing rules or overlapping CIDRs. The service owner places the API behind a Network Load Balancer and creates a VPC endpoint service. Consumer accounts create interface VPC endpoints in their private subnets; these appear as elastic network interfaces, so no inbound traffic reaches the service VPC directly. VPC peering and Transit Gateway attachments fail the inbound-restriction or CIDR-overlap requirements, and a NAT gateway plus public ALB forces traffic across the public internet and adds needless cost.
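On the service-owner side, publishing the endpoint service can be sketched as follows (boto3 create_vpc_endpoint_service_configuration naming; the NLB ARN is a placeholder, and only the request dictionary is built):

```python
# Sketch of publishing the API as a PrivateLink endpoint service, using
# boto3 create_vpc_endpoint_service_configuration parameter names. The
# Network Load Balancer ARN is a placeholder; no API call is made.

service_config = {
    "NetworkLoadBalancerArns": [
        "arn:aws:elasticloadbalancing:us-east-1:111111111111:"
        "loadbalancer/net/internal-api/0123456789abcdef"
    ],
    # Require the service owner to approve each consumer's interface
    # endpoint before traffic can flow.
    "AcceptanceRequired": True,
}
```

Each consumer account then creates an interface VPC endpoint to the resulting service name from its own private subnets; because PrivateLink presents the service as local ENIs, overlapping CIDRs between VPCs are irrelevant.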
Ask Bash
What is AWS PrivateLink and how does it help in this scenario?
What are the benefits of using a Network Load Balancer with AWS PrivateLink?
Why are VPC peering and Transit Gateway not suitable for this use case?
How does AWS PrivateLink address overlapping CIDR issues and enforce inbound restrictions?
An operations engineer installed the CloudWatch agent on several Amazon Linux 2 EC2 instances by using the Systems Manager document AWS-ConfigureAWSPackage. A custom JSON file (shown below) was deployed to each instance and the agent was restarted.
{
  "agent": {"metrics_collection_interval": 60},
  "metrics": {
    "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
    "aggregation_dimensions": [["InstanceId"]]
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/opt/app/server.log",
            "log_group_name": "app-logs",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  }
}
Application logs are now visible in CloudWatch Logs, but no memory or disk space metrics appear in CloudWatch Metrics. What is the simplest way to collect these missing metrics on every instance?
Insert mem and disk sections under metrics_collected in the agent JSON file, then restart the CloudWatch agent on each instance.
Edit the AWS-ConfigureAWSPackage document to run the agent in collectd compatibility mode.
Turn on detailed monitoring for the instances in the EC2 console.
Attach the managed policy CloudWatchAgentAdminPolicy to the instance profile role.
Answer Description
The CloudWatch agent publishes only the metrics that are explicitly listed in the metrics_collected section of its JSON configuration. The current file defines no collectors, so the agent sends no memory or disk data even though it is running. Adding the appropriate collectors (for example, a mem block for memory and a disk block for file-system usage) and then restarting or reloading the agent causes the agent to gather and publish those metrics. Enabling EC2 detailed monitoring affects only the built-in instance metrics (CPU, network, etc.) and cannot add memory or disk metrics. Changing the instance role's permissions or modifying the Systems Manager document does not cause the agent to start collecting additional metrics when they are not specified in the configuration.
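For example, the metrics section of the agent file could be extended with a metrics_collected block along these lines (measurement names follow the agent's mem and disk plugin conventions):

```json
"metrics": {
  "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
  "aggregation_dimensions": [["InstanceId"]],
  "metrics_collected": {
    "mem": {
      "measurement": ["mem_used_percent"]
    },
    "disk": {
      "measurement": ["used_percent"],
      "resources": ["*"]
    }
  }
}
```

After restarting or reloading the agent, mem_used_percent and disk used_percent appear in the CWAgent namespace alongside the existing log delivery.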
Ask Bash
What is the role of the CloudWatch agent in collecting metrics?
How does the JSON configuration file in CloudWatch agent affect monitoring?
Why doesn’t enabling detailed monitoring in EC2 add memory or disk metrics?
What is the purpose of the metrics_collected section in the CloudWatch agent JSON file?
How do you add memory and disk metrics to the CloudWatch agent configuration file?
What is the difference between CloudWatch agent metrics and detailed monitoring in the EC2 console?
An organization stores sensitive logs in the prod-private-logs S3 bucket in its production AWS account. To run periodic queries, an analytics account currently accesses the bucket through a bucket policy that grants s3:GetObject to an IAM role in that account. Security policy now mandates that every cross-account access path uses an external ID. What is the most secure way to comply without breaking the analytics workflow?
Attach a service control policy (SCP) to the analytics account that denies s3:GetObject unless the request includes the required external ID header.
Add a Condition element with sts:ExternalId to the existing S3 bucket policy so that the analytics role must present the correct external ID when calling GetObject.
Enable S3 Object Lock in compliance mode for the bucket and require callers to specify the external ID through object version IDs when fetching objects.
Create an IAM role in the production account that trusts the analytics account, includes a Condition requiring a specific sts:ExternalId value, and attaches a policy allowing s3:GetObject on the bucket; remove the direct bucket policy statement. Have the analytics workflow assume this role before accessing S3.
Answer Description
An external ID can only be evaluated by AWS STS when an external principal tries to assume a role. S3 bucket policies do not recognize the sts:ExternalId condition key, so the requirement cannot be enforced there. The secure pattern is to replace direct bucket access with a production-side IAM role whose trust policy allows the analytics account to call sts:AssumeRole only when the expected external ID is supplied. The role then carries a permissions policy that grants the minimum s3:GetObject access to the bucket. The analytics workflow assumes the role and uses the temporary credentials to read the objects, satisfying both the security mandate and the functional need. Adding sts:ExternalId to the bucket policy, creating an SCP, or enabling Object Lock would not enforce the requirement or would block legitimate access.
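A sketch of the two policies involved, with a hypothetical account ID and external ID, could look like this; the trust policy enforces the external ID at AssumeRole time, while the permissions policy grants the least-privilege read access:

```python
# Hypothetical account ID and external ID, for illustration only.
ANALYTICS_ACCOUNT_ID = "222222222222"
EXTERNAL_ID = "example-external-id"

# Trust policy for the production-side role: the analytics account may
# assume the role only when it supplies the expected sts:ExternalId.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{ANALYTICS_ACCOUNT_ID}:root"},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": EXTERNAL_ID}},
        }
    ],
}

# Permissions policy attached to the same role: minimum read access
# to the objects in the log bucket.
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::prod-private-logs/*",
        }
    ],
}
```

The external ID lives only in the trust policy, which is why the condition cannot be enforced from the S3 bucket policy side.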
Ask Bash
What is sts:ExternalId and why is it required in cross-account access scenarios?
Why can't sts:ExternalId be used in S3 bucket policies?
What is the difference between a permissions policy and a trust policy in IAM roles?
What is an external ID in AWS?
How does sts:AssumeRole work in the context of cross-account access?
An Amazon RDS for PostgreSQL database running on a db.t3.medium instance shows sustained high DB load. Performance Insights issues a proactive recommendation stating that the CPU wait dimension is saturated. Which modification best follows the recommendation to improve performance efficiency?
Scale the instance to a larger class such as db.m6g.large.
Turn on automatic minor version upgrades to apply the latest patch.
Enable storage autoscaling and double the gp2 volume size.
Create a read replica in another Availability Zone for analytic traffic.
Answer Description
When CPU is identified as the dominant wait dimension, Performance Insights proactive recommendations advise adding compute capacity. Scaling the DB instance class to a larger size (for example, moving from a burstable db.t3.medium to a compute-optimized or general-purpose db.m6g.large) directly increases available vCPUs and memory, reducing CPU saturation. Enlarging storage, creating replicas, or applying patches can help other bottlenecks but do not address immediate CPU exhaustion flagged by the recommendation.
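Applied programmatically, the instance class change could be sketched as the payload for a boto3 rds.modify_db_instance() call; the instance identifier here is hypothetical:

```python
# Payload for rds.modify_db_instance(): move the instance to a larger
# class to add vCPUs and memory. Identifier is hypothetical.
modify_params = {
    "DBInstanceIdentifier": "example-postgres-db",  # hypothetical name
    "DBInstanceClass": "db.m6g.large",              # larger class relieves CPU saturation
    "ApplyImmediately": True,                       # apply now instead of the maintenance window
}
```

Note that ApplyImmediately triggers the change right away, which involves a brief outage on a single-AZ instance.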
Ask Bash
What is CPU wait dimension in RDS Performance Insights?
How do instance classes like db.t3.medium or db.m6g.large differ?
Why does scaling the instance improve performance for CPU saturation?
What is the difference between burstable and general-purpose DB instance classes in AWS RDS?
What does the 'CPU wait dimension is saturated' mean in Performance Insights?
What is the role of Storage Autoscaling in AWS RDS, and why doesn't it solve CPU saturation?
An operations team manages an Amazon ECS cluster that uses the EC2 launch type. They need to collect host-level CPU, memory, disk, and network metrics and forward all container application logs to Amazon CloudWatch Logs. The solution must start automatically on every new container instance without requiring changes to existing application task definitions. Which approach meets these requirements with the least operational effort?
Add the CloudWatch agent as a sidecar container to every existing and future application task definition so it starts alongside each application task.
Edit the configuration file of the Amazon ECS agent on every container instance so that it emits host metrics and container logs directly to CloudWatch.
Create a task definition that runs the unified CloudWatch agent (with Fluent Bit) and deploy it as an ECS service that uses the DAEMON scheduling strategy. Store the agent configuration in Parameter Store and grant the task IAM permissions to write to CloudWatch.
Use EC2 user data to install and start the CloudWatch agent on each container instance when it boots.
Answer Description
Running the unified CloudWatch agent as its own task with the DAEMON scheduling strategy ensures that one copy of the agent (and optional Fluent Bit container) is launched on every container instance in the cluster. The agent configuration can be stored in AWS Systems Manager Parameter Store, and the task's IAM role needs permissions such as those granted by the CloudWatchAgentServerPolicy managed policy. Because the agent is deployed independently of the application task definitions, no modification of those tasks is required, and every new EC2-based container instance automatically starts collecting and publishing the required metrics and logs.
Embedding the agent as a sidecar in each application task would meet the functional requirement but would require editing every task definition and updating them whenever new services are added. Installing the agent once with user data or by manually updating the ECS agent does not guarantee that future instances receive the agent without additional steps and does not automatically collect container logs. Therefore, deploying the CloudWatch agent as a DAEMON task is the most operationally efficient solution.
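As a sketch, the daemon service could be created with a payload like the following for ecs.create_service(); the cluster, service, and task definition names are hypothetical:

```python
# Payload for ecs.create_service(): run the monitoring task definition
# as a daemon so one copy lands on every container instance.
# Cluster, service, and task definition names are hypothetical.
daemon_service = {
    "cluster": "example-cluster",
    "serviceName": "cwagent-daemon",
    "taskDefinition": "cwagent-fluent-bit:1",
    "schedulingStrategy": "DAEMON",  # one task per container instance, including future ones
    "launchType": "EC2",
}
```

With the DAEMON strategy no desired count is specified; ECS places exactly one task on each active container instance as instances join the cluster.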
Ask Bash
What is the DAEMON scheduling strategy in ECS?
How does the unified CloudWatch agent work with Fluent Bit?
Why is storing the CloudWatch agent configuration in Parameter Store beneficial?
What is the difference between Fluent Bit and the CloudWatch agent?
How does AWS Systems Manager Parameter Store support this solution?
A company has associated a Route 53 Resolver DNS Firewall rule group with several production VPCs to block known malware domains. An auditor requires proof that the blocking rules are enforced and insists that the DNS log records be retained for at least 5 years at the lowest possible cost. Which solution meets these requirements with the least operational overhead?
Enable AWS CloudTrail Lake and periodically join CloudTrail management events with VPC Flow Logs to infer blocked DNS requests.
Turn on Route 53 Resolver query logging to CloudWatch Logs and create a subscription filter that forwards the logs to an S3 bucket.
Enable Route 53 Resolver query logging for the production VPCs and write the logs directly to an Amazon S3 bucket that has a lifecycle policy to transition objects to the S3 Glacier Flexible Retrieval storage class after 30 days.
Configure Amazon GuardDuty DNS Malware Protection and export its findings to AWS Security Hub for long-term retention.
Answer Description
Route 53 Resolver query logging can stream logs directly to an Amazon S3 bucket. Each entry includes the query_name plus DNS Firewall fields such as firewall_rule_group_id and firewall_rule_action, demonstrating that the rule evaluated the request. Applying an S3 lifecycle policy moves older objects to S3 Glacier Flexible Retrieval, providing inexpensive five-year retention with no additional infrastructure. Sending logs first to CloudWatch or relying on CloudTrail, VPC Flow Logs, or GuardDuty adds cost or lacks per-query evidence.
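The retention side of the solution can be sketched as the lifecycle configuration passed to s3.put_bucket_lifecycle_configuration(); the rule ID is hypothetical, and the expiration of 1825 days corresponds to the five-year retention requirement:

```python
# Lifecycle configuration for s3.put_bucket_lifecycle_configuration():
# transition query-log objects to Glacier Flexible Retrieval after 30
# days and expire them after the 5-year (1825-day) retention period.
lifecycle_config = {
    "Rules": [
        {
            "ID": "dns-logs-archive",       # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},       # apply to the whole bucket
            "Transitions": [
                # "GLACIER" is the API storage class for Glacier Flexible Retrieval
                {"Days": 30, "StorageClass": "GLACIER"}
            ],
            "Expiration": {"Days": 1825},   # delete once retention is satisfied
        }
    ]
}
```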
Ask Bash
What is Route 53 Resolver DNS Firewall?
How does an S3 lifecycle policy work?
What data does Route 53 Resolver query logging capture?
What is an S3 lifecycle policy and how does it help with cost optimization?
Why is Route 53 Resolver query logging sent directly to S3 preferred over CloudWatch Logs for this scenario?
An e-commerce application runs on EC2 instances in two Availability Zones, fronted by an Application Load Balancer (ALB). Some checkout requests take 3 to 4 minutes to complete, and users intermittently receive 504 Gateway Timeout responses. CloudWatch shows the targets are healthy and no Auto Scaling scale-in events occurred. Which change will most effectively prevent these timeouts without redesigning the application?
Enable connection draining by setting the target group deregistration delay to 300 seconds.
Increase the ALB idle timeout to a value higher than the longest expected request processing time.
Replace the ALB with a Network Load Balancer to remove all timeout limits.
Enable cross-zone load balancing on the ALB.
Answer Description
The ALB returns a 504 Gateway Timeout when it closes the connection because no response is received from the target before the load balancer's idle timeout elapses. The default idle timeout for an ALB is 60 seconds, which is shorter than the 3- to 4-minute checkout operations. Increasing the idle timeout to exceed the longest expected request duration allows the load balancer to keep the connection open until the target responds, eliminating the observed 504 errors.
Changing the deregistration delay only affects in-flight requests during scale-in, not long-running requests on healthy targets. Enabling cross-zone load balancing distributes traffic across zones and does not influence connection timeouts. A Network Load Balancer also has a connection idle timeout (with a default of 350 seconds for TCP listeners), so replacing the ALB with an NLB would not remove timeout limits. Simply adjusting the ALB's idle timeout is the most direct and cost-effective fix.
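The fix can be sketched as the attribute payload for elbv2.modify_load_balancer_attributes(); 300 seconds is an illustrative value chosen to exceed the 4-minute worst case, and attribute values are passed as strings:

```python
# Attribute payload for elbv2.modify_load_balancer_attributes(): raise
# the idle timeout above the longest expected request (4 min = 240 s).
# The 300-second value is illustrative; ALB attribute values are strings.
idle_timeout_attrs = [
    {"Key": "idle_timeout.timeout_seconds", "Value": "300"}
]
```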
Ask Bash
What is the ALB idle timeout, and why does it matter?
How does the deregistration delay impact in-flight requests?
Why is replacing the ALB with an NLB not effective for timeout issues?
How does connection draining and deregistration delay work in target groups?
What is the difference between an ALB and an NLB in handling timeouts?
A company runs a production MySQL database on a single-AZ Amazon RDS instance. Backups are generated once each night by an AWS Backup plan. After a recent incident, 18 hours of data were lost. The operations team must achieve a maximum RPO of 5 minutes while minimizing cost and operational effort. Which solution meets these requirements?
Create a read replica in the same Region and promote it to primary after an incident to recover the latest data.
Configure AWS Backup to create manual DB snapshots every 5 minutes and delete snapshots older than one day.
Convert the DB instance to a Multi-AZ deployment so a failover to the standby can be triggered during an outage.
Enable automated backups on the DB instance with a 7-day retention period and use point-in-time restore for recovery.
Answer Description
Enabling automated backups activates continuous transaction-log capture for the RDS instance. During recovery, Amazon RDS first restores the most recent daily snapshot and then applies the transaction logs, allowing a point-in-time restore to any second within the retention window, up to the latest restorable time (typically about five minutes behind the current time). This limits potential data loss to under 5 minutes and requires no additional administration or infrastructure.
Manual snapshots taken every 5 minutes could meet the RPO but would create thousands of snapshots each day, increasing cost and management overhead.
A Multi-AZ deployment provides a synchronous standby and a zero-RPO failover, but it roughly doubles database cost and is unnecessary for a 5-minute RPO target.
A read replica is asynchronously updated; it can lag beyond five minutes and still involves extra cost and manual promotion steps, so it does not guarantee the stated RPO.
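The two steps can be sketched as boto3 payloads for rds.modify_db_instance() (enable automated backups) and rds.restore_db_instance_to_point_in_time() (recover after an incident); the instance identifiers are hypothetical:

```python
# Payload for rds.modify_db_instance(): a retention period greater than
# zero enables automated backups and continuous transaction-log capture.
# Identifiers are hypothetical.
enable_backups = {
    "DBInstanceIdentifier": "prod-mysql",
    "BackupRetentionPeriod": 7,     # days; > 0 turns on automated backups
    "ApplyImmediately": True,
}

# Payload for rds.restore_db_instance_to_point_in_time(): restore into
# a new instance at the latest restorable time after an incident.
restore_request = {
    "SourceDBInstanceIdentifier": "prod-mysql",
    "TargetDBInstanceIdentifier": "prod-mysql-restored",
    "UseLatestRestorableTime": True,  # alternatively, pass RestoreTime
}
```

Point-in-time restore always creates a new DB instance; after restoring, the application is repointed (or the instances renamed) to use the recovered copy.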
Ask Bash
What is an RPO in AWS Backup?
How do automated backups with point-in-time restore work?
What is the difference between Multi-AZ deployments and automated backups for disaster recovery?
What is an RPO (Recovery Point Objective)?
How does enabling automated backups on Amazon RDS achieve point-in-time recovery?
What is the difference between Multi-AZ and automated backups for data recovery in RDS?
A company runs a production Amazon RDS for PostgreSQL db.r5.large instance with 2 vCPUs. After enabling Performance Insights, the operations team notices that query latency rises when the database load exceeds the number of vCPUs. They need an automated Systems Manager runbook to execute whenever this situation persists for 5 minutes, while keeping operational overhead low. Which solution meets the requirement?
Configure a CloudWatch alarm on the instance's CPUUtilization metric with an 80% threshold for 5 minutes and target the Systems Manager runbook.
Create a CloudWatch alarm in the AWS/RDS namespace for the DBLoad metric (statistic: Average, period: 60 seconds, evaluation periods = 5, threshold = 2) and set the alarm action to run the Systems Manager Automation document.
Enable Enhanced Monitoring at 1-second granularity and deploy a Lambda function that polls CPU metrics every minute; if CPUUtilization > 80% for 5 checks, invoke the runbook.
Create an RDS event subscription for source type 'db-instance' and event category 'failure'; subscribe an SNS topic that triggers the Systems Manager runbook.
Answer Description
Performance Insights automatically publishes the DBLoad (average active sessions) metric to the AWS/RDS namespace in CloudWatch. A common best practice is to compare DBLoad to the vCPU count; sustained values above the vCPU count indicate CPU saturation. Creating a CloudWatch alarm on the DBLoad metric with a period of 60 seconds, five evaluation periods, and a threshold of 2 (the vCPU count) directly monitors the required condition. CloudWatch alarms can invoke Systems Manager runbooks through alarm actions, so no custom polling or additional services are needed.
The CPUUtilization metric does not measure active sessions and can miss database-specific contention like I/O waits. Building a custom Lambda poller with Enhanced Monitoring adds unnecessary complexity and operational overhead. RDS event subscriptions do not emit events for high DB load; they are for state changes like failures or reboots, so they cannot trigger the runbook for this scenario.
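The alarm described above can be sketched as a cloudwatch.put_metric_alarm() payload; the instance identifier is hypothetical, and the AlarmActions entry (the Systems Manager runbook target ARN) is omitted because it is account- and Region-specific:

```python
# Payload for cloudwatch.put_metric_alarm(): DBLoad averaged over 60 s,
# five consecutive periods above 2 (the instance's vCPU count).
# Instance identifier is hypothetical; AlarmActions would carry the
# account-specific Systems Manager runbook target and is omitted here.
dbload_alarm = {
    "AlarmName": "rds-dbload-exceeds-vcpus",
    "Namespace": "AWS/RDS",
    "MetricName": "DBLoad",
    "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": "example-postgres"}],
    "Statistic": "Average",
    "Period": 60,                 # seconds per datapoint
    "EvaluationPeriods": 5,       # 5 consecutive minutes
    "Threshold": 2,               # the vCPU count of db.r5.large
    "ComparisonOperator": "GreaterThanThreshold",
}
```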
Ask Bash
What is Performance Insights in Amazon RDS?
What is the DBLoad metric used in CloudWatch?
How do CloudWatch alarms and Systems Manager Automation documents work together?
What is the DBLoad metric in Amazon RDS Performance Insights?
How can Systems Manager Automation documents (runbooks) integrate with CloudWatch alarms?
Why is the DBLoad metric preferred over CPUUtilization for monitoring database performance?
A company has multiple production AWS accounts. For every account, critical CloudWatch alarms already publish state-change events to the account's default event bus. Operations engineers sign in only to the management account and must see a pop-up notification in the AWS Management Console whenever any of those alarms enters the ALARM state. Using AWS User Notifications, what is the MOST efficient way to meet this requirement?
Create a cross-account Amazon EventBridge rule in each production account that forwards CloudWatch AlarmStateChange events to the management account event bus, then configure an AWS User Notifications rule in the management account that targets the AWS Console channel.
Share the CloudWatch alarms through a cross-account dashboard and rely on the dashboard icons to indicate alarm state when engineers open it.
Create an AWS Systems Manager Incident Manager response plan that watches the alarms across accounts and selects console notifications as the engagement channel.
Subscribe each alarm's SNS topic to an AWS Chatbot Slack channel and enable Slack as the preferred channel in AWS User Notifications.
Answer Description
CloudWatch automatically sends an AlarmStateChange event to the local EventBridge default event bus. A cross-account EventBridge rule can forward that event to an event bus in the management account. AWS User Notifications consumes EventBridge events; creating a notification rule that matches the forwarded alarm event and chooses the AWS Console channel causes a banner/pop-up to appear in the Notification Center of anyone signed in to that account. The other options do not surface the alert in the console: AWS Chatbot sends to chat platforms, shared dashboards only show status when manually opened, and Incident Manager engages contacts by phone, SMS, or email but not via the console notification center.
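The production-account side can be sketched as events.put_rule() and events.put_targets() payloads; the rule name and the management-account event bus ARN are hypothetical, and a cross-account target also needs a RoleArn permitting events:PutEvents on the destination bus:

```python
import json

# Event pattern matching CloudWatch alarm transitions into ALARM.
alarm_pattern = {
    "source": ["aws.cloudwatch"],
    "detail-type": ["CloudWatch Alarm State Change"],
    "detail": {"state": {"value": ["ALARM"]}},
}

# Payload for events.put_rule() in each production account.
put_rule_params = {
    "Name": "forward-alarm-state-changes",  # hypothetical rule name
    "EventPattern": json.dumps(alarm_pattern),
    "State": "ENABLED",
}

# Payload for events.put_targets(): forward matched events to the
# management account's default event bus (hypothetical ARN).
put_targets_params = {
    "Rule": "forward-alarm-state-changes",
    "Targets": [
        {
            "Id": "mgmt-account-bus",
            "Arn": "arn:aws:events:us-east-1:111111111111:event-bus/default",
            # A RoleArn with events:PutEvents on the destination bus
            # would also be supplied for cross-account delivery.
        }
    ],
}
```

In the management account, the User Notifications rule then matches the forwarded event pattern and delivers to the AWS Console channel.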
Ask Bash
What is EventBridge, and how does it enable cross-account communication?
What are AWS User Notifications, and how do they work with EventBridge?
Why are cross-account dashboards or Incident Manager not efficient for console notifications?
What is Amazon EventBridge and how does it relate to CloudWatch alarms?
How does cross-account EventBridge rule forwarding work?
What are AWS User Notifications and how can they trigger console pop-ups?