Your organization is preparing to send a subset of e-commerce transactions to an external marketing analytics vendor. Payment Card Industry Data Security Standard (PCI-DSS) rules forbid disclosing customers' 16-digit primary account numbers (PANs). At the same time, the data science team must be able to match the vendor's clustering results back to the correct customers after the analysis is returned. Which data-obfuscation technique best satisfies both requirements?
Add calibrated Laplace noise to the PAN values to achieve differential privacy.
Encrypt each PAN using an AES-FF1 format-preserving cipher and share the ciphertext.
Generate a salted SHA-256 hash of each PAN before export.
Replace each PAN with a randomly generated surrogate stored in a secure token vault.
Replacing each PAN with a vault-managed surrogate value (tokenization) strips the export of any data that PCI-DSS defines as cardholder data, yet preserves a reversible mapping inside the organization. When the results come back, the team can securely detokenize the surrogates and re-identify the customers. A salted hash is intentionally one-way, so the returned values cannot be reversed to recover the PANs. Differential-privacy noise breaks the one-to-one correspondence between records and customers entirely. Format-preserving encryption keeps the PAN recoverable, but the ciphertext itself is still classified as sensitive card data: either the key must be shared with the vendor or the vendor remains in PCI scope, and neither outcome meets the stated goal.
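The tokenization flow described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `TokenVault` class standing in for a real, access-controlled vault service; the class name, the `tok_` prefix, the field names, and the sample PANs are all illustrative placeholders rather than part of any standard or product.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory stand-in for a secure token vault.

    A real vault would be a hardened, access-controlled service;
    this sketch only shows the PAN <-> token mapping idea.
    """

    def __init__(self):
        self._pan_to_token = {}
        self._token_to_pan = {}

    def tokenize(self, pan: str) -> str:
        # Reuse the existing token so the same PAN always maps to one surrogate.
        if pan in self._pan_to_token:
            return self._pan_to_token[pan]
        # The token is random, so it has no mathematical relationship to the PAN.
        while True:
            token = "tok_" + secrets.token_hex(16)
            if token not in self._token_to_pan:
                break
        self._pan_to_token[pan] = token
        self._token_to_pan[token] = pan
        return token

    def detokenize(self, token: str) -> str:
        # Only callable inside the organization, where the vault lives.
        return self._token_to_pan[token]


vault = TokenVault()

# Placeholder transactions; the PAN values are for illustration only.
transactions = [
    {"pan": "4111111111111111", "amount": 25.00},
    {"pan": "5500005555555559", "amount": 310.50},
    {"pan": "4111111111111111", "amount": 12.75},
]

# The export sent to the vendor carries tokens instead of PANs.
export = [
    {"customer_token": vault.tokenize(t["pan"]), "amount": t["amount"]}
    for t in transactions
]

# When the vendor's results come back keyed by token, map them to PANs internally.
recovered = [vault.detokenize(row["customer_token"]) for row in export]
```

Because the surrogate is random rather than derived from the PAN, the vendor cannot reverse it without access to the vault, while repeated transactions from the same card still share one token for clustering.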