Your data science team must release a patient readmission data set to external researchers. The file currently contains direct identifiers (patient name, Social Security number) and quasi-identifiers (full date of birth, 5-digit ZIP code). Compliance requires that (a) the released data no longer be considered protected health information (PHI) under the HIPAA Privacy Rule, so no patient authorization is needed, and (b) age and regional patterns remain analytically useful. Which preprocessing approach best meets both requirements?
Mask all but the last four digits of each Social Security number and hash patient names with SHA-256 while leaving other fields unchanged.
Replace each name and Social Security number with a random UUID but keep the full date of birth and 5-digit ZIP code intact.
Encrypt the entire data set with AES-256 and provide researchers with the decryption key after they sign a data-use agreement.
Remove the 18 identifiers listed in 45 CFR §164.514(b)(2), convert dates of birth to age in years (grouping ages over 89 as 90 or older), and truncate ZIP codes to their first three digits, substituting 000 where the combined three-digit area has 20,000 or fewer residents.
Under the HIPAA Safe Harbor de-identification method in 45 CFR §164.514(b)(2), data are no longer individually identifiable once 18 specific identifiers are removed, geographic and date fields are reduced to coarser values, and the covered entity has no actual knowledge that the remaining information could identify a patient. For this data set, that means dropping names and Social Security numbers, truncating ZIP codes to their first three digits (using 000 where the combined three-digit area has 20,000 or fewer residents), and generalizing dates, for example by converting date of birth to age in years with all ages over 89 grouped as 90 or older. The resulting data set is not PHI and may be disclosed without individual authorization. Replacing names and Social Security numbers with random UUIDs is not enough, because full dates of birth and 5-digit ZIP codes are themselves Safe Harbor identifiers, so the data remain PHI. Masking Social Security numbers and hashing names also fails: the last four digits and a hash derived from the name still count as identifiers, and the other fields are left untouched. Encryption protects the file only until it is decrypted, so researchers who hold the key receive the full PHI. Only the Safe Harbor-compliant transformation both removes the authorization requirement and preserves useful age and regional information.
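The age and ZIP code generalization described above can be sketched in Python. This is a minimal illustration, not a complete Safe Harbor implementation. The field names (`date_of_birth`, `zip`, `readmitted`), the function names, and the `low_population_prefixes` set are assumptions made for the example. The real set of low-population three-digit prefixes has to be derived from current Census data, and any other dates in the file (admission, discharge) would need similar treatment.

```python
from datetime import date


def age_in_years(dob: date, as_of: date) -> int:
    """Completed years between dob and as_of."""
    years = as_of.year - dob.year
    # Subtract one if the birthday has not yet occurred in the as_of year.
    if (as_of.month, as_of.day) < (dob.month, dob.day):
        years -= 1
    return years


def generalize_age(dob: date, as_of: date) -> str:
    """Age in years, with all ages over 89 grouped into one category."""
    age = age_in_years(dob, as_of)
    return "90+" if age > 89 else str(age)


def generalize_zip(zip5: str, low_population_prefixes: set[str]) -> str:
    """First three ZIP digits, or 000 for low-population prefixes."""
    prefix = zip5[:3]
    return "000" if prefix in low_population_prefixes else prefix


def deidentify(record: dict, as_of: date, low_population_prefixes: set[str]) -> dict:
    """Build the output from an allow-list of fields, so name, SSN, and any
    other column not named here are dropped rather than copied through."""
    return {
        "age": generalize_age(record["date_of_birth"], as_of),
        "zip3": generalize_zip(record["zip"], low_population_prefixes),
        "readmitted": record["readmitted"],
    }


# Hypothetical record and placeholder prefix set, for illustration only.
record = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "date_of_birth": date(1950, 6, 15),
    "zip": "02139",
    "readmitted": True,
}
released = deidentify(record, as_of=date(2024, 6, 14), low_population_prefixes={"036"})
print(released)
```

Building the released record from an allow-list, rather than deleting known identifier columns, means a column added to the source file later is excluded by default instead of leaking into the release.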