A financial services corporation is developing a fraud detection model. The model requires training on a large dataset containing highly sensitive, personally identifiable information (PII) that, due to strict regulatory compliance, must not leave the corporation's on-premises data center. The model training process itself is computationally demanding, requiring elastic access to a powerful GPU cluster that the company finds more cost-effective to use from a public cloud provider. Given these constraints, which of the following deployment strategies is the most appropriate for training this model?
An edge deployment strategy where small, containerized versions of the model are trained directly on local servers within branch offices to reduce data movement.
A hybrid deployment strategy where data is preprocessed and anonymized on-premises, and the resulting dataset is then securely transferred to the public cloud for model training on scalable GPU instances.
An on-premises deployment where the company procures, installs, and maintains a dedicated GPU cluster within its own data center to perform the model training.
A full cloud deployment where all data is encrypted and moved to a secure cloud environment to leverage the provider's end-to-end managed machine learning platform.
The correct answer describes a hybrid deployment strategy. This approach is optimal because it balances the need for data security and regulatory compliance with the requirement for scalable, cost-effective computational resources. Sensitive data (PII) remains within the secure on-premises environment for preprocessing, such as anonymization or feature extraction. The resulting less-sensitive, transformed data is then securely transferred to the public cloud to leverage its powerful and elastic GPU resources for the computationally intensive model training phase. An entirely on-premises solution would be expensive and inefficient, as it would require a large capital investment in GPU hardware that may be underutilized. A full cloud solution is not viable as it would violate the strict data residency and compliance requirements for the sensitive PII. An edge deployment is irrelevant for this large-scale training scenario; it is a strategy for inference on decentralized devices, not for centralized model training.
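The on-premises preprocessing step can be illustrated with a minimal sketch. This example assumes keyed hashing (HMAC-SHA-256) is an acceptable pseudonymization technique for the PII fields; the field names, the `PEPPER` secret, and the `anonymize_record` helper are all hypothetical and shown only to make the idea concrete, not to represent any specific compliance standard.

```python
import hashlib
import hmac

# Hypothetical secret key ("pepper") that never leaves the on-premises
# environment; without it, the tokens below cannot be reversed or re-linked.
PEPPER = b"on-prem-secret-pepper"

def pseudonymize(value: str) -> str:
    """Replace a PII value with an irreversible keyed token (HMAC-SHA-256)."""
    return hmac.new(PEPPER, value.encode("utf-8"), hashlib.sha256).hexdigest()

def anonymize_record(record: dict, pii_fields: set) -> dict:
    """Tokenize PII fields; pass non-sensitive model features through unchanged."""
    return {
        key: pseudonymize(val) if key in pii_fields else val
        for key, val in record.items()
    }

# On-premises: strip direct identifiers before any data leaves the data center.
record = {"ssn": "123-45-6789", "name": "Jane Doe", "txn_amount": 1250.0}
clean = anonymize_record(record, pii_fields={"ssn", "name"})
# `clean` keeps the training feature (txn_amount) but carries only keyed
# hashes for ssn/name, so it can be pushed to cloud GPU instances.
```

Only the transformed output (`clean`) would be transferred to the cloud; the raw record and the keying secret remain on-premises, which is what preserves the data-residency requirement.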