A multinational manufacturer's analytics team is training a demand-forecasting model. The raw training file contains line-item component costs that are labeled "Company Proprietary - Internal Use Only". External data-science contractors who maintain the firm's cloud development repository will need access to whatever the team uploads. According to widely adopted proprietary-data handling practices, which mitigation should the team apply before placing the dataset in the vendor-managed repository?
Keep the raw dataset intact but change its classification label to public domain to avoid contractual conflicts.
Attach a Creative Commons BY-SA license header to the original CSV and share the unmodified data with the vendor.
Publish only aggregated or derived cost features (for example, cost indices by product family) so individual proprietary prices cannot be reconstructed.
Rely solely on TLS encryption during file upload while providing the vendor full access to the raw dataset once stored.
Proprietary data is trade-secret or otherwise competitively sensitive information. Industry classification policies typically restrict such data to employees and third parties who have signed nondisclosure agreements, and require that any external sharing expose no details that could be reverse-engineered to reveal the protected information. The safest approach is therefore to transform the dataset so that only aggregated or otherwise derived features (values that preserve modeling utility but cannot be traced back to exact prices) are shared. This implements the data-minimization principle recommended by NIST and is a common form of data masking/obfuscation. Transport-layer encryption alone protects data only in transit; the vendor's staff would still see every proprietary value once the file is stored. Merely attaching a Creative Commons license or re-classifying the file as public domain conflicts with the original proprietary label and does nothing to mitigate exposure, leaving the organization vulnerable to loss of trade-secret protection and contractual breach.
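As a rough illustration of the correct option, the sketch below derives cost indices by product family from line-item costs. Everything in it is an assumption made for the example: the function name `cost_indices_by_family`, the `(product_family, unit_cost)` row layout, the sample figures, and the `min_group_size` threshold, which suppresses families with too few line items because a small group's index could hint at an individual price. It is one possible way to aggregate, not a prescribed method.

```python
from collections import defaultdict
from statistics import mean


def cost_indices_by_family(rows, min_group_size=3):
    """Return {product_family: cost index}, where 100 is the overall mean cost.

    rows: iterable of (product_family, unit_cost) pairs (hypothetical layout).
    Families with fewer than min_group_size line items are dropped.
    """
    by_family = defaultdict(list)
    for family, unit_cost in rows:
        by_family[family].append(float(unit_cost))

    # Keep only families large enough that no single price dominates the mean.
    kept = {f: costs for f, costs in by_family.items() if len(costs) >= min_group_size}
    if not kept:
        return {}

    # Express each family mean relative to the overall mean of the kept items,
    # so the shared values are ratios rather than prices.
    overall = mean(c for costs in kept.values() for c in costs)
    return {f: round(100 * mean(costs) / overall, 1) for f, costs in sorted(kept.items())}


# Made-up sample data standing in for the proprietary line items.
raw = [
    ("motors", 10.0), ("motors", 12.0), ("motors", 14.0),
    ("housings", 4.0), ("housings", 6.0), ("housings", 8.0),
    ("sensors", 99.0),  # single line item: suppressed
]
indices = cost_indices_by_family(raw)
print(indices)
```

Only the resulting `indices` mapping would be uploaded to the vendor-managed repository; the raw rows stay internal. Whether a given threshold and index definition are strong enough to prevent reconstruction depends on the real data and should be reviewed against the organization's own classification policy.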