Monitoring, Optimization, and Security (GCP PDE) Flashcards

| Front | Back |
| --- | --- |
| How can you ensure sensitive data is not exposed in logs | Use log exclusions and redact sensitive data in Cloud Logging |
| How can you improve Spark job performance in Dataproc | Tune executor memory and use dynamic allocation |
| How can you optimize query performance in BigQuery | Use partitioned and clustered tables |
| How can you reduce costs in a data processing environment by optimizing storage usage | Use lower-cost storage tiers like Coldline or Archive for infrequently accessed data |
| How does Cloud Armor help secure data workflows | Protects against DDoS attacks and enforces security policies at the edge |
| How does Cloud Logging help with security | Logs access and actions for auditing purposes |
| What does the BigQuery reservation model help optimize | Cost efficiency for workloads with predictable query patterns |
| What GCP feature allows you to manage access and permissions for resources | IAM (Identity and Access Management) |
| What GCP feature can help you set spending limits and avoid unexpected costs | Budget alerts and quotas |
| What GCP feature enables automatic adjustment of processing resources to match workload demands | Autoscaling |
| What GCP practice helps reduce costs with data egress | Store data closer to the region where it will be processed or consumed |
| What GCP service allows analysis of logs for troubleshooting and auditing purposes | Cloud Logging |
| What GCP service can inspect, classify, and redact sensitive data in your workflows | Data Loss Prevention API (DLP API) |
| What GCP tool helps to visualize system performance and bottlenecks in real time | Cloud Monitoring dashboards |
| What is a cost-saving technique for managing idle resources | Use preemptible VMs or automate resource shutoff during low usage |
| What is the best practice for setting up alerts for anomalies in workflows | Configure alerting policies in Cloud Monitoring |
| What is the main advantage of using Regional buckets over Multi-Regional buckets | Lower cost and latency for region-specific workloads |
| What is the purpose of a Service Account in GCP | Provide applications or VM instances with identities for accessing resources securely |
| What practice should you follow to ensure secure data transmission in GCP | Use encryption in transit with TLS |
| What service enables centralized log export and analysis across multiple projects | Cloud Logging log sinks (including aggregated sinks) |
| What tool automatically identifies anomalous patterns in metric data in GCP | Cloud Monitoring's Anomaly Detection feature |
| What tool in GCP is used for monitoring resource metrics and creating dashboards | Cloud Monitoring |
| Why is enabling Audit Logs important for cloud resources | Tracks who did what, when, and where for security and compliance |
| Why should you audit IAM role assignments regularly | To ensure the principle of least privilege is maintained |
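Several of the answers above map directly to CLI operations. The following is a minimal sketch, not a production setup; the project, dataset, table, sink, and bucket names (`my-project`, `my_dataset.events`, `audit-sink`, `my-bucket`) and the schema fields are all placeholders:

```shell
# BigQuery optimization: create a table partitioned by date and
# clustered by customer_id (placeholder dataset, table, and schema).
bq mk --table \
  --time_partitioning_field event_date \
  --time_partitioning_type DAY \
  --clustering_fields customer_id \
  my_dataset.events \
  event_date:DATE,customer_id:STRING,amount:NUMERIC

# Centralized log export: route audit logs to a BigQuery dataset
# via a log sink (destination and filter are illustrative).
gcloud logging sinks create audit-sink \
  bigquery.googleapis.com/projects/my-project/datasets/audit_logs \
  --log-filter='logName:"cloudaudit.googleapis.com"'

# Storage tiering: lifecycle rule that moves objects older than
# 90 days to the lower-cost Coldline class.
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {
      "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
      "condition": {"age": 90}
    }
  ]
}
EOF
gsutil lifecycle set lifecycle.json gs://my-bucket
```

Each command assumes an authenticated `gcloud`/`bq`/`gsutil` session with the appropriate IAM permissions on the target project.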
About the Flashcards
Flashcards for the GCP Professional Data Engineer exam help you review essential GCP terminology, concepts, and key ideas tested on the exam. The deck covers monitoring and observability with Cloud Monitoring (dashboards, anomaly detection), centralized logging and export with Cloud Logging and log sinks, alerting policy configuration, and real-time performance visualization.
It also reinforces security, privacy, and access control (IAM, service accounts, audit logs, Cloud Armor, and DLP), alongside cost and performance strategies such as storage tiering, regional buckets, BigQuery optimization (partitioning, reservations), autoscaling, Dataproc tuning, preemptible VMs, and budget alerts for managing spend.
Topics covered in this flashcard deck:
- Monitoring and logging
- Alerting and dashboards
- Security and IAM
- Cost optimization techniques
- Storage tiers and regions
- Compute and autoscaling