A payment-processing company ingests transaction records from multiple branches into a BigQuery table. Each record contains the cardholder's full name and the 16-digit primary account number (PAN). Compliance requires the following before data can be queried by data scientists in the analytics project:
Names must be pseudonymized in a way that lets datasets from different branches still be joined on the same customer.
PANs must be rendered non-reversible, but analysts need the last four digits for charge-back investigations.
You are designing a Dataflow pipeline that calls Cloud Data Loss Prevention (DLP) for in-stream de-identification. Which approach best meets both requirements while minimizing the risk of re-identification?
Encrypt the entire table with Cloud KMS at rest and allow analysts to decrypt on read; rely on Data Catalog column tags to warn users about personal data.
Apply a CryptoDeterministicConfig transform to the name field using a shared Cloud KMS key, and apply a CharacterMaskConfig that masks the first 12 digits of the PAN, leaving the last 4 digits visible.
Store the raw table in a restricted project and grant analysts a BigQuery view that excludes the name and PAN columns; do not perform any in-pipeline transformation.
Apply a CryptoReplaceFfxFpeConfig transform to the name field and to the PAN field using the same Cloud KMS key so that both values remain reversible for auditors.
Deterministic encryption (CryptoDeterministicConfig) with a centrally managed Cloud KMS key maps identical names to the same surrogate value across all branches, so joins remain possible and the mapping can be reversed only by a team that controls the key. Character masking (CharacterMaskConfig) that replaces the first 12 digits of the PAN with a fixed symbol irreversibly removes those digits while keeping the last four available to analysts. The other options fall short: format-preserving encryption (CryptoReplaceFfxFpeConfig) leaves the PAN reversible by anyone with access to the key; encrypting the whole table at rest and decrypting on read exposes full names and PANs to analysts instead of only the last four digits; and a restricted view relies on access controls alone without de-identifying the data, while also hiding the name and PAN columns that analysts need for joins and charge-back investigations. None of these satisfies the stated compliance goals.
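The two transformations in the correct answer can be sketched as a DLP de-identification configuration. This is a minimal illustration, not a definitive pipeline: the column names `cardholder_name` and `pan`, the KMS resource path, and the helper names are assumptions that do not appear in the question. The configuration is built as a plain dictionary and no API call is made. The two local helpers only mimic the observable behavior (a stable surrogate for equal names, and a PAN with only the last four digits visible); in particular, the HMAC stand-in is one-way and is not the algorithm DLP uses for deterministic encryption.

```python
import hashlib
import hmac

# Hypothetical column names and KMS key path; the question does not name them.
NAME_FIELD = "cardholder_name"
PAN_FIELD = "pan"
KMS_KEY_NAME = (
    "projects/example-project/locations/global/"
    "keyRings/example-ring/cryptoKeys/example-key"
)


def build_deidentify_config(wrapped_key: bytes) -> dict:
    """Return a DeidentifyConfig-shaped dict; nothing is sent to the API."""
    return {
        "record_transformations": {
            "field_transformations": [
                {
                    # Same key for every branch -> same surrogate for the same name.
                    "fields": [{"name": NAME_FIELD}],
                    "primitive_transformation": {
                        "crypto_deterministic_config": {
                            "crypto_key": {
                                "kms_wrapped": {
                                    "wrapped_key": wrapped_key,
                                    "crypto_key_name": KMS_KEY_NAME,
                                }
                            }
                        }
                    },
                },
                {
                    # Mask the first 12 characters, leaving the last 4 visible.
                    "fields": [{"name": PAN_FIELD}],
                    "primitive_transformation": {
                        "character_mask_config": {
                            "masking_character": "#",
                            "number_to_mask": 12,
                            "reverse_order": False,
                        }
                    },
                },
            ]
        }
    }


def mask_pan(pan: str, visible: int = 4, mask_char: str = "#") -> str:
    """Local illustration of the masking result (not a DLP call)."""
    hidden = max(len(pan) - visible, 0)
    return mask_char * hidden + pan[hidden:]


def pseudonymize_name(name: str, key: bytes) -> str:
    """Local stand-in for a deterministic surrogate: keyed HMAC-SHA256.

    Assumption for illustration only: unlike DLP's deterministic
    encryption, this cannot be reversed even by the key holder.
    """
    return hmac.new(key, name.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    print(mask_pan("4111111111111111"))
    print(pseudonymize_name("Ada Lovelace", b"demo-key"))
```

In a real pipeline, a dictionary of this shape would typically be passed as the `deidentify_config` of a `deidentify_content` request from the Dataflow worker, with the wrapped key produced by Cloud KMS ahead of time. Sharing one wrapped key across branches is what keeps the name surrogates joinable.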
GCP Professional Data Engineer
Designing data processing systems