Data Storage and Databases (GCP PDE) Flashcards
GCP Professional Data Engineer Flashcards
| Front | Back |
| --- | --- |
| Archive Storage ideal use case | Best for long-term storage of data accessed less than once a year |
| BigQuery clustering | Sorts table data by up to four clustering columns so that queries filtering or aggregating on those columns scan less data |
| BigQuery data sharing | Uses datasets and authorized views to share data securely across projects |
| BigQuery export formats | Supports exporting data in formats like CSV, JSON, and Avro to Cloud Storage |
| BigQuery federated queries | Enables querying data that lives outside BigQuery storage, such as Google Sheets or Cloud Storage files, without loading it first |
| BigQuery flat-rate pricing | Provides predictable costs by purchasing a dedicated amount of query processing capacity (slots) instead of paying per byte scanned |
| BigQuery integration with Looker Studio | Enables interactive visualization and reporting for datasets |
| BigQuery optimization | Use partitioned or clustered tables for improved query performance |
| BigQuery partitioning | Reduces query cost by dividing a table into segments on a column such as a date, so queries that filter on that column scan only the matching partitions |
| BigQuery pricing model | Charges are based on storage and the amount of data processed during queries |
| BigQuery reserved slots | Offers guaranteed computing resources to improve query performance |
| BigQuery scalability | Supports petabytes of data with serverless querying |
| BigQuery slot-based pricing | Charges for the number of slots purchased, which sets the compute capacity available to queries |
| BigQuery use case | Best for analyzing large-scale datasets using SQL-like queries |
| Bigtable backup options | Supports table-level backups within an instance that can be restored to a new table |
| Bigtable consistency model | Strongly consistent within a single cluster; replication across clusters is eventually consistent by default |
| Bigtable data model | Wide-column database optimized for sparse data |
| Bigtable indexing | Uses row keys for primary indexing with no built-in secondary indexes |
| Bigtable locality | Stores data physically adjacent based on row keys for faster access |
| Bigtable query limitations | Requires row-key optimization as it does not natively support complex joins or aggregations |
| Bigtable region distribution | Can be deployed across multiple zones for disaster tolerance |
| Bigtable replication | Used for availability and disaster recovery purposes |
| Bigtable row design | Design row keys to optimize data access patterns |
| Bigtable scaling | Nodes can be added or removed, manually or through autoscaling, to handle changes in throughput or storage without downtime |
| Bigtable use case | Best for low-latency operations on large-scale time-series data |
| Cloud Datastore vs Bigtable | Datastore is better for transactional consistency, while Bigtable is better for analytics and high throughput |
| Cloud Firestore use case | Best for mobile and web applications requiring offline support and real-time synchronization |
| Cloud Firestore vs Datastore | Firestore provides advanced querying and offline support, while Datastore offers simpler APIs |
| Cloud Spanner use case | Best for horizontally scalable relational databases with strong consistency requirements |
| Cloud SQL backups | Supports automated and on-demand backups for disaster recovery |
| Cloud SQL replication types | High-availability instances replicate synchronously to a standby in another zone, while read replicas replicate asynchronously |
| Cloud SQL use case | Best for relational databases requiring compatibility with MySQL, PostgreSQL, or SQL Server |
| Cloud Storage data lifecycle management | Policies automatically delete or transition objects between tiers based on age |
| Cloud Storage IAM | Used for granular control over who can access and manage data |
| Cloud Storage scalability | Automatically scales to handle large amounts of unstructured data |
| Cloud Storage signed URLs | Provides temporary access to specific objects using a time-limited URL |
| Cloud Storage tiers | Standard, Nearline, Coldline, and Archive |
| Cloud Storage use case | Best for storing unstructured data like images, videos, and backups |
| Cloud Storage vs Persistent Disk | Cloud Storage is object storage, while Persistent Disk is block storage attached to VMs |
| Coldline Storage ideal use case | Best for data accessed about once a quarter or less |
| Datastore access control | Uses IAM roles, typically granted at the project level, rather than per-entity or per-key permissions |
| Datastore indexing | Automatically indexes properties for queries but allows custom index configuration |
| Datastore queries | Support strong or eventual consistency depending on query type |
| Datastore relationship modeling | Supports nested entities and ancestor query patterns |
| Datastore transactions | Allow atomicity for multiple operations across multiple entities |
| Datastore use case | Best for scalable NoSQL applications and transactional workloads |
| Nearline Storage ideal use case | Best for data accessed less than once a month but more often than once a quarter |
| Streaming data to BigQuery use case | Best for real-time analytics and dashboards |
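
The Bigtable cards on row design, indexing, and locality can be made concrete with a small sketch. Bigtable keeps rows sorted lexicographically by row key, so the key layout decides which rows sit next to each other. The example below is plain Python with no Bigtable client involved; the `device_id#reversed_timestamp` layout, the `make_row_key` helper, and the `MAX_TS_MS` bound are illustrative assumptions rather than anything the deck prescribes.

```python
# Hypothetical row-key builder for time-series data (not a Bigtable API).
# Assumption: timestamps are non-negative milliseconds that fit in 13 digits.
MAX_TS_MS = 9_999_999_999_999


def make_row_key(device_id: str, ts_ms: int) -> str:
    """Build a key that groups rows by device and sorts newest readings first."""
    reversed_ts = MAX_TS_MS - ts_ms  # a larger timestamp gives a smaller suffix
    return f"{device_id}#{reversed_ts:013d}"  # zero-pad so string order matches numeric order


readings = [
    ("sensor-b", 1_700_000_000_000),
    ("sensor-a", 1_700_000_000_000),
    ("sensor-a", 1_700_000_060_000),  # one minute later than the reading above
]

# Sorting the strings mimics the lexicographic order in which rows would be stored.
keys = sorted(make_row_key(device, ts) for device, ts in readings)

# A prefix scan on "sensor-a#" would touch one contiguous range of rows.
sensor_a_keys = [k for k in keys if k.startswith("sensor-a#")]

for key in keys:
    print(key)
```

Leading with the device identifier rather than the raw timestamp is one common way to avoid sending all new writes to the same end of the key range; whether it suits a given workload depends on the queries it has to serve.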
This deck focuses on GCP storage solutions, including Cloud Storage, Bigtable, Datastore, BigQuery, and how to choose the right storage option for a given use case.
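
As a rough illustration of choosing among the four Cloud Storage tiers, the sketch below maps an expected access interval to a storage class. The `suggest_storage_class` function is hypothetical, and the 30, 90, and 365 day cut-offs are a rule of thumb based on the minimum storage durations of Nearline, Coldline, and Archive; real decisions should also weigh retrieval fees and latency requirements.

```python
def suggest_storage_class(days_between_accesses: float) -> str:
    """Suggest a Cloud Storage class from the expected days between reads.

    Hypothetical helper: the thresholds are rule-of-thumb assumptions,
    not an official selection algorithm.
    """
    if days_between_accesses < 0:
        raise ValueError("days_between_accesses must be non-negative")
    if days_between_accesses < 30:
        return "STANDARD"   # frequently accessed data
    if days_between_accesses < 90:
        return "NEARLINE"   # roughly monthly access
    if days_between_accesses < 365:
        return "COLDLINE"   # roughly quarterly access
    return "ARCHIVE"        # accessed less than once a year


for days in (1, 45, 120, 400):
    print(days, suggest_storage_class(days))
```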