
Data Storage and Databases (GCP PDE) Flashcards

GCP Professional Data Engineer Flashcards

Front | Back
Archive Storage ideal use case | Best for long-term storage of data accessed less than once a year (365-day minimum storage duration)
BigQuery clustering | Sorts table data by up to four columns so that queries filtering on those columns scan less data
BigQuery data sharing | Uses dataset-level access and authorized views to share data securely across projects
BigQuery export formats | Supports exporting data to Cloud Storage in formats such as CSV, newline-delimited JSON, and Avro
BigQuery federated queries | Enable querying external data sources such as Google Sheets or Cloud Storage files without loading the data into BigQuery
BigQuery flat-rate pricing | Provides predictable costs by purchasing a dedicated amount of query processing capacity (slots) instead of paying per byte scanned
BigQuery integration with Looker Studio | Enables interactive visualization and reporting on BigQuery datasets
BigQuery optimization | Use partitioned and clustered tables so that queries scan less data and run faster
BigQuery partitioning | Reduces query cost by dividing a table into segments based on a column such as a date, so queries can skip partitions they do not need
BigQuery pricing model | Charges for storage and, under on-demand pricing, for the amount of data processed by queries
BigQuery reserved slots | Provide guaranteed compute capacity for more predictable query performance
BigQuery scalability | Scales to petabytes of data with serverless querying
BigQuery slot-based pricing | Charges for the number of slots purchased, which also bounds the compute available to queries
BigQuery use case | Best for analyzing large-scale datasets using SQL queries
Bigtable backup options | Supports table backups that can be restored to a new table; a backup captures the table as of the time it was taken
Bigtable consistency model | Strongly consistent within a single cluster; replicated instances are eventually consistent across clusters by default
Bigtable data model | Wide-column NoSQL database optimized for sparse data
Bigtable indexing | The row key is the only index; there are no built-in secondary indexes
Bigtable locality | Stores rows sorted lexicographically by row key, so rows with adjacent keys are stored together and can be scanned quickly
Bigtable query limitations | Does not natively support complex joins or aggregations, so row keys must be designed around the queries you need
Bigtable region distribution | With replication, clusters can be placed in multiple zones or regions for disaster tolerance
Bigtable replication | Copies data across clusters for availability and disaster recovery
Bigtable row design | Design row keys around the most frequent access patterns, and avoid keys that concentrate traffic on a few nodes (for example, keys that start with a timestamp)
Bigtable scaling | Scales throughput and storage by adding nodes, manually or through autoscaling, without downtime
Bigtable use case | Best for low-latency, high-throughput reads and writes on large-scale data such as time series
Cloud Datastore vs Bigtable | Datastore is better for transactional consistency, while Bigtable is better for high-throughput analytical and time-series workloads
Cloud Firestore use case | Best for mobile and web applications requiring offline support and real-time synchronization
Cloud Firestore vs Datastore | Firestore adds real-time updates and offline support for mobile and web clients, while Datastore offers a simpler API aimed at server workloads
Cloud Spanner use case | Best for horizontally scalable relational databases that require strong consistency
Cloud SQL backups | Supports automated and on-demand backups for disaster recovery
Cloud SQL replication types | High-availability configurations replicate synchronously to a standby in another zone; read replicas replicate asynchronously
Cloud SQL use case | Best for relational workloads requiring compatibility with MySQL, PostgreSQL, or SQL Server
Cloud Storage data lifecycle management | Rules automatically delete objects or move them to a colder storage class based on conditions such as age
Cloud Storage IAM | Controls who can access and manage buckets and objects
Cloud Storage scalability | Automatically scales to handle large amounts of unstructured data
Cloud Storage signed URLs | Provide temporary access to a specific object through a time-limited URL
Cloud Storage tiers | Standard, Nearline, Coldline, and Archive
Cloud Storage use case | Best for storing unstructured data like images, videos, and backups
Cloud Storage vs Persistent Disk | Cloud Storage is object storage, while Persistent Disk is block storage attached to VMs
Coldline Storage ideal use case | Best for data accessed at most about once a quarter (90-day minimum storage duration)
Datastore access control | Uses IAM roles granted at the project level; permissions cannot be set on individual entity groups or keys
Datastore indexing | Automatically indexes each property for simple queries; composite indexes for more complex queries are defined in an index configuration file
Datastore queries | Consistency depends on query type: ancestor queries are strongly consistent, while global queries have historically been eventually consistent
Datastore relationship modeling | Supports embedded entities and parent-child relationships queried with ancestor queries
Datastore transactions | Provide atomicity for operations across multiple entities
Datastore use case | Best for scalable NoSQL applications and transactional workloads
Nearline Storage ideal use case | Best for data accessed about once a month or less (30-day minimum storage duration)
Streaming data to BigQuery use case | Best for real-time analytics and dashboards
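
The Bigtable row design and locality cards can be made concrete with a small sketch. Bigtable stores rows sorted by row key, so a time-series key that leads with an entity identifier and ends with a reversed timestamp keeps one entity's readings together and puts the newest reading first. The function name, the `#` delimiter, the 13-digit millisecond bound, and the device IDs below are all illustrative assumptions, not part of any Bigtable API; the sketch also assumes device IDs never contain `#`.

```python
# Hypothetical upper bound on millisecond timestamps (an assumption for this
# sketch). It keeps every reversed value exactly 13 digits wide, so byte-wise
# ordering of the keys matches numeric ordering of the reversed timestamps.
MAX_TS_MILLIS = 10**13 - 1


def make_row_key(device_id: str, ts_millis: int) -> bytes:
    """Build a key of the form b'<device_id>#<reversed timestamp>'.

    Leading with the device ID spreads writes across devices instead of
    piling them onto the "latest timestamp" end of the key space.
    Reversing the timestamp makes newer readings sort before older ones.
    """
    if not 0 <= ts_millis <= MAX_TS_MILLIS:
        raise ValueError("timestamp out of range for this key scheme")
    reversed_ts = MAX_TS_MILLIS - ts_millis
    return f"{device_id}#{reversed_ts:013d}".encode("utf-8")


if __name__ == "__main__":
    older = make_row_key("sensor-a", 1_700_000_000_000)
    newer = make_row_key("sensor-a", 1_700_000_060_000)
    other = make_row_key("sensor-b", 1_700_000_030_000)
    # Sorted the way Bigtable would store them: sensor-a's rows are
    # contiguous, newest first, followed by sensor-b's row.
    for key in sorted([older, newer, other]):
        print(key)
```

A prefix scan on `sensor-a#` would then read one device's most recent readings first, which is the usual reason for choosing this layout. Whether it suits a given workload depends on the queries that workload actually runs.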
This deck focuses on GCP storage solutions, including Cloud Storage, Bigtable, Datastore, BigQuery, and how to choose the right storage option for a given use case.
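
As a study aid for choosing among the Cloud Storage tiers, here is a minimal sketch that maps an expected access interval to a storage class. It assumes the thresholds line up with the minimum storage durations (30 days for Nearline, 90 for Coldline, 365 for Archive). The function name and its single-parameter decision rule are illustrative simplifications; a real choice would also weigh retrieval fees and availability needs.

```python
def suggest_storage_class(days_between_reads: float) -> str:
    """Suggest a Cloud Storage class from how often the data is read.

    The cutoffs mirror the minimum storage durations of each class.
    This is a rule of thumb for study purposes, not a pricing calculator.
    """
    if days_between_reads < 0:
        raise ValueError("days_between_reads must be non-negative")
    if days_between_reads >= 365:
        return "ARCHIVE"    # read less than about once a year
    if days_between_reads >= 90:
        return "COLDLINE"   # read at most about once a quarter
    if days_between_reads >= 30:
        return "NEARLINE"   # read about once a month or less
    return "STANDARD"       # read frequently


if __name__ == "__main__":
    for days in (1, 30, 120, 400):
        print(days, suggest_storage_class(days))
```

The returned strings match the storage class identifiers used by Cloud Storage, so they could be compared against a bucket's configured class.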