
Deployment and Operationalization Flashcards

AWS Machine Learning Engineer Associate MLA-C01 Flashcards

Front | Back
Approach for handling imbalanced traffic across endpoints | Use traffic shifting with weighted routing for dynamic load distribution
Benefits of using Amazon CloudFront in ML deployments | Speeds up content delivery and reduces latency with a global CDN
Benefits of using AWS autoscaling for SageMaker endpoints | Automatically adjusts resources based on traffic demands (see the autoscaling sketch after this table)
Best practices for AWS Lambda function design for inference | Optimize memory and runtime for lower costs and higher throughput
Challenges of scaling ML models horizontally | Increased costs and possible inconsistencies in system state between replicas
Common challenge in deploying ML models and its solution | Model drift; implement regular re-training and monitoring
Concept of model monitoring in production | Ensuring performance and detecting drift in model predictions
Difference between batch and real-time inference | Batch processes data in chunks, while real-time serves low-latency predictions one request at a time
Difference between online and offline model retraining | Online retraining updates the model incrementally as new data arrives, while offline retraining rebuilds it from scratch on the full dataset
Difference between synchronous and asynchronous inference in SageMaker | Synchronous inference returns predictions immediately in the response, while asynchronous inference queues requests and writes results to S3, suiting large payloads and long processing times (see the async invocation sketch after this table)
Difference between warm starts and cold starts in AWS Lambda | Warm starts reuse the existing container, while cold starts initiate a new one, causing a delay
Explain the purpose of model explainability in deployment | Understand and validate model predictions to ensure transparency and fairness
How Amazon SageMaker aids in model deployment | Provides a managed service for hosting endpoints and scaling models
How AWS Inferentia accelerates ML inference workloads | Uses dedicated chips optimized for high-performance ML tasks at lower costs
How AWS Step Functions assist in complex ML workflows | Orchestrates multiple services and tasks into a serverless workflow
How Data Wrangler simplifies data pre-processing in SageMaker | Offers a visual interface for cleaning, transforming, and analyzing data
How to connect Amazon S3 with SageMaker for model deployment | Use S3 for retrieving and storing model artifacts
How to enable monitoring for a SageMaker endpoint | Set up CloudWatch alarms and logs for performance metrics (see the CloudWatch alarm sketch after this table)
How to implement feature logging during inference | Store input features along with predictions for monitoring and debugging
How to version machine learning models in S3 | Use distinct prefixes or labels to track different versions of artifacts (see the S3 versioning sketch after this table)
Impact of cold starts on real-time serving latency | Initial container start-up delays inference, affecting response time for users
Importance of scaling ML models in production | Handling increasing load and maintaining low latency
Key advantage of deploying machine learning models on AWS | Scalability and cost efficiency
Key logging tools for monitoring ML models on AWS | AWS CloudWatch and Amazon S3 logs
Methodology for A/B testing ML models on SageMaker | Deploy multiple production variants behind one endpoint and route traffic proportionally by weight to compare performance (see the variant-weight sketch after this table)
Purpose of AWS Lambda in operationalizing ML models | Automating inference tasks with serverless compute (see the Lambda handler sketch after this table)
Purpose of containerization in ML model deployment | Ensures consistency across environments and simplifies scalability
Purpose of SageMaker Multi-Model endpoints | Hosts multiple models on a single endpoint to optimize costs and resource use
Purpose of using a custom Docker container in SageMaker | Allows packaging specific dependencies and configurations required by the model
Role of Amazon S3 in model deployment | Storing model artifacts and data for inference
Role of Elastic Load Balancing in ML model deployment | Distributes incoming traffic across multiple instances to ensure availability and reliability
Security measures for deploying ML models on AWS | Configure IAM roles, encrypt data at rest and in transit, and apply network controls such as VPC endpoints and security groups
Steps to create a SageMaker endpoint | Create the model, create an endpoint configuration, then create the endpoint from that configuration (see the deployment sketch after this table)
Strategies for reducing latency in real-time model inference | Optimize code and infrastructure, use GPUs for large computations
Use case for batch inference in a deployment scenario | Processing large datasets periodically for predictions
What is endpoint lifecycle management in SageMaker | Managing creation, scaling, and deletion of endpoints
What is the role of autoscaling policies in AWS application load balancers | Scale capacity in or out based on metrics such as request count per target or response latency to handle changing traffic loads
What is the role of AWS API Gateway in ML operationalization | Acts as an interface to invoke ML models hosted on AWS
Why is canary deployment used in ML operations | Gradually routes traffic to a new model version to test changes and minimize risks
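
The deployment-steps card above glosses three API calls. Below is a minimal boto3 sketch of that flow; the model name, container image URI, S3 path, role ARN, and instance type are placeholders for illustration, not values from this deck.

```python
import boto3

sm = boto3.client("sagemaker")

# 1. Register the model: a container image plus a model artifact in S3 (placeholder values).
sm.create_model(
    ModelName="churn-model-v1",
    PrimaryContainer={
        "Image": "<account>.dkr.ecr.<region>.amazonaws.com/churn-inference:latest",  # hypothetical image
        "ModelDataUrl": "s3://my-ml-bucket/models/churn/v1/model.tar.gz",            # hypothetical artifact
    },
    ExecutionRoleArn="arn:aws:iam::<account>:role/SageMakerExecutionRole",
)

# 2. Define an endpoint configuration: variant name, instance type, and count.
sm.create_endpoint_config(
    EndpointConfigName="churn-config-v1",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "churn-model-v1",
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
    }],
)

# 3. Create the endpoint from the configuration; SageMaker provisions and hosts the instances.
sm.create_endpoint(EndpointName="churn-endpoint", EndpointConfigName="churn-config-v1")
```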
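
For the A/B-testing and weighted-routing cards, a sketch of hosting two production variants behind one endpoint and shifting their traffic weights; the model and endpoint names are hypothetical.

```python
import boto3

sm = boto3.client("sagemaker")

# Two variants on one endpoint; weights control the traffic split (90/10 to start).
sm.create_endpoint_config(
    EndpointConfigName="churn-ab-config",
    ProductionVariants=[
        {"VariantName": "ModelA", "ModelName": "churn-model-v1",
         "InstanceType": "ml.m5.large", "InitialInstanceCount": 1, "InitialVariantWeight": 0.9},
        {"VariantName": "ModelB", "ModelName": "churn-model-v2",
         "InstanceType": "ml.m5.large", "InitialInstanceCount": 1, "InitialVariantWeight": 0.1},
    ],
)
sm.create_endpoint(EndpointName="churn-ab-endpoint", EndpointConfigName="churn-ab-config")

# Later, shift more traffic to ModelB without redeploying the endpoint.
sm.update_endpoint_weights_and_capacities(
    EndpointName="churn-ab-endpoint",
    DesiredWeightsAndCapacities=[
        {"VariantName": "ModelA", "DesiredWeight": 0.5},
        {"VariantName": "ModelB", "DesiredWeight": 0.5},
    ],
)
```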
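
For the autoscaling cards, a sketch of registering an endpoint variant with Application Auto Scaling and attaching a target-tracking policy; the resource names, capacity bounds, and target value are assumptions for illustration.

```python
import boto3

aas = boto3.client("application-autoscaling")

# The scalable resource is the endpoint's production variant (placeholder names).
resource_id = "endpoint/churn-endpoint/variant/AllTraffic"

aas.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

# Target-tracking policy: keep invocations per instance near the target value.
aas.put_scaling_policy(
    PolicyName="churn-invocations-target",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 100.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        "ScaleInCooldown": 300,
        "ScaleOutCooldown": 60,
    },
)
```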
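
For the monitoring card, a sketch of a CloudWatch alarm on a variant's ModelLatency metric; the endpoint name and threshold are illustrative.

```python
import boto3

cw = boto3.client("cloudwatch")

# Alarm when average model latency stays high for two consecutive periods.
cw.put_metric_alarm(
    AlarmName="churn-endpoint-high-latency",
    Namespace="AWS/SageMaker",
    MetricName="ModelLatency",
    Dimensions=[
        {"Name": "EndpointName", "Value": "churn-endpoint"},
        {"Name": "VariantName", "Value": "AllTraffic"},
    ],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=500000,          # ModelLatency is reported in microseconds
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="notBreaching",
)
```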
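
For the Lambda and API Gateway cards, a sketch of a serverless handler that forwards an API Gateway proxy request body to a SageMaker endpoint; the endpoint name and payload shape are assumptions.

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

def handler(event, context):
    # With API Gateway proxy integration, the request body arrives as a JSON string.
    payload = event.get("body", "{}")
    response = runtime.invoke_endpoint(
        EndpointName="churn-endpoint",      # hypothetical endpoint name
        ContentType="application/json",
        Body=payload,
    )
    prediction = response["Body"].read().decode("utf-8")
    return {"statusCode": 200, "body": json.dumps({"prediction": prediction})}
```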
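
For the synchronous-versus-asynchronous card, a sketch of an asynchronous invocation; it assumes an endpoint already configured for async inference and a request payload staged in S3 (both names are hypothetical).

```python
import boto3

runtime = boto3.client("sagemaker-runtime")

# The request payload must already be in S3; SageMaker queues the job and
# writes the result to the output location configured on the async endpoint.
response = runtime.invoke_endpoint_async(
    EndpointName="churn-async-endpoint",                        # hypothetical async endpoint
    InputLocation="s3://my-ml-bucket/async-requests/req-001.json",
    ContentType="application/json",
)
print(response["OutputLocation"])  # S3 URI where the prediction will appear
```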
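
For the S3 versioning card, a sketch of prefix-based artifact versioning with optional object tags; the bucket and key layout are assumptions.

```python
import boto3

s3 = boto3.client("s3")

# Version artifacts by prefix so each training run is addressable and immutable.
model_version = "v3"
artifact_key = f"models/churn/{model_version}/model.tar.gz"

s3.upload_file(
    Filename="model.tar.gz",
    Bucket="my-ml-bucket",
    Key=artifact_key,
)

# Optionally tag the object so the deployed version is easy to audit later.
s3.put_object_tagging(
    Bucket="my-ml-bucket",
    Key=artifact_key,
    Tagging={"TagSet": [{"Key": "model-version", "Value": model_version}]},
)
```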
This deck explores methodologies for deploying, monitoring, and scaling machine learning models with AWS services such as S3, Lambda, and SageMaker endpoints.