Deployment and Operationalization Flashcards
AWS Machine Learning Engineer Associate MLA-C01 Flashcards
| Front | Back |
| --- | --- |
| Approach for handling imbalanced traffic across endpoints | Use traffic shifting with weighted routing for dynamic load distribution |
| Benefits of using Amazon CloudFront in ML deployments | Speeds up content delivery and reduces latency with a global CDN |
| Benefits of using AWS autoscaling for SageMaker endpoints | Automatically adjusts resources based on traffic demands |
| Best practices for AWS Lambda function design for inference | Optimize memory and runtime for lower costs and higher throughput |
| Challenges of scaling ML models horizontally | Increased costs and possible inconsistencies in system state between replicas |
| Common challenge in deploying ML models and its solution | Model drift; implement regular re-training and monitoring |
| Concept of model monitoring in production | Ensuring performance and detecting drift in model predictions |
| Difference between batch and real-time inference | Batch inference processes data in large chunks on a schedule, while real-time inference returns low-latency predictions per request |
| Difference between online and offline model retraining | Online retraining updates a model incrementally with new data, while offline retraining requires retraining from scratch |
| Difference between synchronous and asynchronous inference in SageMaker | Synchronous (real-time) inference returns predictions in the response immediately, while asynchronous inference queues requests and delivers results once processing completes, suiting large payloads and long-running jobs |
| Difference between warm starts and cold starts in AWS Lambda | Warm starts reuse the existing container, while cold starts initiate a new one, causing a delay |
| Explain the purpose of model explainability in deployment | Understand and validate model predictions to ensure transparency and fairness |
| How Amazon SageMaker aids in model deployment | Provides a managed service for hosting endpoints and scaling models |
| How AWS Inferentia accelerates ML inference workloads | Uses dedicated chips optimized for high-performance ML tasks at lower costs |
| How AWS Step Functions assist in complex ML workflows | Orchestrates multiple services and tasks into a serverless workflow |
| How Data Wrangler simplifies data pre-processing in SageMaker | Offers a visual interface for cleaning, transforming, and analyzing data |
| How to connect Amazon S3 with SageMaker for model deployment | Use S3 for retrieving and storing model artifacts |
| How to enable monitoring for a SageMaker endpoint | Set up CloudWatch alarms and logs for performance metrics |
| How to implement feature logging during inference | Store input features along with predictions for monitoring and debugging |
| How to version machine learning models in S3 | Use distinct prefixes or labels to track different versions of artifacts |
| Impact of cold starts on real-time serving latency | Initial container start-up delays inference, affecting response time for users |
| Importance of scaling ML models in production | Handling increasing load and maintaining low latency |
| Key advantage of deploying machine learning models on AWS | Scalability and cost efficiency |
| Key logging tools for monitoring ML models on AWS | Amazon CloudWatch (metrics, logs, and alarms) and logs archived to Amazon S3 |
| Methodology for A/B testing ML models on SageMaker | Deploy multiple endpoints and route traffic proportionally to assess performance |
| Purpose of AWS Lambda in operationalizing ML models | Automating inference tasks with serverless compute |
| Purpose of containerization in ML model deployment | Ensures consistency across environments and simplifies scalability |
| Purpose of SageMaker Multi-Model endpoints | Hosts multiple models on a single endpoint to optimize costs and resource use |
| Purpose of using a custom Docker container in SageMaker | Allows packaging specific dependencies and configurations required by the model |
| Role of Amazon S3 in model deployment | Storing model artifacts and data for inference |
| Role of Elastic Load Balancing in ML model deployment | Distributes incoming traffic across multiple instances to ensure availability and reliability |
| Security measures for deploying ML models on AWS | Configure least-privilege IAM roles, encrypt data at rest and in transit, and apply network isolation such as VPC endpoints |
| Steps to create a SageMaker endpoint | Create a model from the container and artifact, define an endpoint configuration, then create the endpoint (see the boto3 sketch after this deck) |
| Strategies for reducing latency in real-time model inference | Optimize code and infrastructure, use GPUs for large computations |
| Use case for batch inference in a deployment scenario | Processing large datasets periodically for predictions |
| What is endpoint lifecycle management in SageMaker | Managing creation, scaling, and deletion of endpoints |
| What is the role of autoscaling policies in AWS application load balancers | Scale instances in or out based on metrics such as request count per target or response latency to manage changing traffic loads |
| What is the role of AWS API Gateway in ML operationalization | Acts as an interface to invoke ML models hosted on AWS |
| Why is canary deployment used in ML operations | Gradually routes traffic to a new model version to test changes and minimize risks |
This deck explores methodologies for deploying, monitoring, and scaling machine learning models with AWS services such as S3, Lambda, and SageMaker endpoints.
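The cards above describe these workflows at flashcard depth; the sketches that follow show what a few of them can look like in boto3. They are minimal sketches, not definitive implementations, and every resource name, ARN, bucket, and image URI in them is hypothetical. First, the S3 versioning card: tracking model artifacts with distinct version prefixes, assuming a bucket named `my-ml-bucket`.

```python
import boto3

s3 = boto3.client("s3")

# Store each model build under its own version prefix (hypothetical bucket/keys).
s3.upload_file(
    Filename="model.tar.gz",
    Bucket="my-ml-bucket",
    Key="models/churn/v2/model.tar.gz",  # the version lives in the key prefix
)

# Enumerate existing versions by listing the shared prefix.
resp = s3.list_objects_v2(Bucket="my-ml-bucket", Prefix="models/churn/")
for obj in resp.get("Contents", []):
    print(obj["Key"])
```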
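Creating a SageMaker endpoint follows the three steps on the "Steps to create a SageMaker endpoint" card: register a model, define an endpoint configuration, then create the endpoint. A minimal sketch, assuming the artifact above plus a hypothetical ECR image and execution role.

```python
import boto3

sm = boto3.client("sagemaker")

# 1. Register the model: container image + S3 model artifact.
sm.create_model(
    ModelName="churn-model-v1",  # hypothetical name
    PrimaryContainer={
        "Image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/churn:latest",  # hypothetical image
        "ModelDataUrl": "s3://my-ml-bucket/models/churn/v1/model.tar.gz",      # hypothetical artifact
    },
    ExecutionRoleArn="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # hypothetical role
)

# 2. Describe how the endpoint should host the model (instance type and count).
sm.create_endpoint_config(
    EndpointConfigName="churn-config-v1",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "churn-model-v1",
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
    }],
)

# 3. Create the endpoint from the configuration (provisioning runs asynchronously).
sm.create_endpoint(EndpointName="churn-endpoint", EndpointConfigName="churn-config-v1")
```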
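Autoscaling for a SageMaker endpoint (the autoscaling card) is configured through Application Auto Scaling rather than on the endpoint itself. A sketch of a target-tracking policy on invocations per instance, assuming the hypothetical endpoint and variant created above.

```python
import boto3

aas = boto3.client("application-autoscaling")

# The scalable target is the desired instance count of one endpoint variant.
resource_id = "endpoint/churn-endpoint/variant/AllTraffic"  # hypothetical endpoint/variant

aas.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

# Target tracking: keep invocations per instance near the target value.
aas.put_scaling_policy(
    PolicyName="churn-invocations-target",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        "ScaleOutCooldown": 60,
        "ScaleInCooldown": 300,
    },
)
```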
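For the A/B testing, weighted routing, and canary cards, a traffic split can be expressed as weighted production variants in a single endpoint configuration, with the weights shifted later without redeploying. A sketch assuming two hypothetical model versions registered as above.

```python
import boto3

sm = boto3.client("sagemaker")

# Two variants share traffic 90/10: a canary-style split for the candidate model.
sm.create_endpoint_config(
    EndpointConfigName="churn-config-ab",
    ProductionVariants=[
        {"VariantName": "Current", "ModelName": "churn-model-v1",
         "InstanceType": "ml.m5.large", "InitialInstanceCount": 1,
         "InitialVariantWeight": 0.9},
        {"VariantName": "Candidate", "ModelName": "churn-model-v2",
         "InstanceType": "ml.m5.large", "InitialInstanceCount": 1,
         "InitialVariantWeight": 0.1},
    ],
)

# Later, shift more traffic to the candidate without touching the endpoint config.
sm.update_endpoint_weights_and_capacities(
    EndpointName="churn-endpoint",
    DesiredWeightsAndCapacities=[
        {"VariantName": "Current", "DesiredWeight": 0.5},
        {"VariantName": "Candidate", "DesiredWeight": 0.5},
    ],
)
```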
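For the Lambda and API Gateway cards, a handler can forward the request payload to the endpoint and log input features alongside the prediction (the feature-logging card). A sketch assuming an API Gateway proxy event with a JSON body and a hypothetical `ENDPOINT_NAME` environment variable; creating the client outside the handler lets warm starts reuse it.

```python
import json
import os

import boto3

# Client created at module load is reused across warm invocations.
runtime = boto3.client("sagemaker-runtime")
ENDPOINT = os.environ.get("ENDPOINT_NAME", "churn-endpoint")  # hypothetical env var


def handler(event, context):
    """Forward a JSON payload from API Gateway to a SageMaker endpoint."""
    features = json.loads(event["body"])  # assumes an API Gateway proxy event
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT,
        ContentType="application/json",
        Body=json.dumps(features),
    )
    prediction = json.loads(response["Body"].read())
    # Log features with the prediction so drift and errors can be investigated later.
    print(json.dumps({"features": features, "prediction": prediction}))
    return {"statusCode": 200, "body": json.dumps(prediction)}
```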
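Finally, the monitoring cards: a simple starting point is a CloudWatch alarm on the built-in `AWS/SageMaker` endpoint metrics, here `ModelLatency`, which CloudWatch reports in microseconds. A sketch assuming a hypothetical SNS topic for notifications.

```python
import boto3

cw = boto3.client("cloudwatch")

# Alarm when average model latency stays above ~200 ms for three minutes.
cw.put_metric_alarm(
    AlarmName="churn-endpoint-high-latency",  # hypothetical alarm name
    Namespace="AWS/SageMaker",
    MetricName="ModelLatency",                # reported in microseconds
    Dimensions=[
        {"Name": "EndpointName", "Value": "churn-endpoint"},
        {"Name": "VariantName", "Value": "AllTraffic"},
    ],
    Statistic="Average",
    Period=60,
    EvaluationPeriods=3,
    Threshold=200_000,                        # 200 ms expressed in microseconds
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="notBreaching",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ml-alerts"],  # hypothetical SNS topic
)
```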