AWS ML - Part 4
Domain 4: Machine Learning Implementation and Operations
4.1 Build machine learning solutions for performance, availability, scalability, resiliency, and fault tolerance.
AWS environment logging and monitoring
o CloudTrail and CloudWatch
o Build error monitoring
Multiple Regions, multiple Availability Zones (AZs)
AMI/golden image
Docker containers
Auto Scaling groups
Rightsizing
o Instances
o Provisioned IOPS
o Volumes
Load balancing
AWS best practices
4.2 Recommend and implement the appropriate machine learning services and features for a given problem.
ML on AWS (application services)
o Polly
o Lex
o Transcribe
AWS service limits
Build your own model vs. SageMaker built-in algorithms
Infrastructure (Spot Instances, instance types) and cost considerations
o Using Spot Instances to train deep learning models with AWS Batch
4.3 Apply basic AWS security practices to machine learning solutions.
IAM
S3 bucket policies
Security groups
VPC
Encryption/anonymization
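As an illustration of the anonymization item above, here is a minimal sketch of one possible approach: replacing identifying fields with a keyed hash (HMAC-SHA256) before the data is used for training. Strictly speaking this is pseudonymization rather than full anonymization, since anyone holding the key can recompute the mapping. The function names, field names, and key below are hypothetical, and in practice the key would come from a managed secret store rather than source code.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest of value as a 64-character hex string."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize_record(record: dict, sensitive_fields: set, secret_key: bytes) -> dict:
    """Copy record, replacing each sensitive field with its keyed digest."""
    return {
        key: pseudonymize(str(val), secret_key) if key in sensitive_fields else val
        for key, val in record.items()
    }


# Hypothetical example data; the key is a placeholder, not a real secret.
EXAMPLE_KEY = b"replace-with-a-managed-secret"
example = {"customer_id": "C-1001", "email": "user@example.com", "purchases": 7}
masked = anonymize_record(example, {"customer_id", "email"}, EXAMPLE_KEY)
```

Because the digest is deterministic for a given key, the same customer maps to the same token across datasets, so joins still work while the raw identifier stays out of the training data.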
4.4 Deploy and operationalize machine learning solutions.
Exposing endpoints and interacting with them
ML model versioning
A/B testing
Retrain pipelines
ML debugging/troubleshooting
o Detect and mitigate drop in performance
o Monitor performance of the model
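The monitoring items above can be sketched as a rolling-accuracy check: keep the most recent labeled predictions, compare their accuracy against a baseline measured at deployment time, and flag a drop beyond a tolerance. This is a simplified illustration, not a description of any AWS service; the class name, window size, and thresholds are assumptions chosen for the example.

```python
from collections import deque


class PerformanceMonitor:
    """Tracks accuracy over a sliding window of recent labeled predictions."""

    def __init__(self, baseline_accuracy: float, window_size: int = 100,
                 tolerance: float = 0.05):
        self.baseline_accuracy = baseline_accuracy
        self.window_size = window_size
        self.tolerance = tolerance
        # deque with maxlen drops the oldest outcome once the window is full
        self.outcomes = deque(maxlen=window_size)

    def record(self, prediction, actual) -> None:
        """Store whether a prediction matched its ground-truth label."""
        self.outcomes.append(prediction == actual)

    def rolling_accuracy(self):
        """Return accuracy over the current window, or None if it is empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def has_degraded(self) -> bool:
        """True once a full window falls below baseline minus tolerance."""
        if len(self.outcomes) < self.window_size:
            return False  # not enough data to judge yet
        return self.rolling_accuracy() < self.baseline_accuracy - self.tolerance
```

A degradation flag from a check like this is a natural trigger for the mitigation steps listed above, such as rolling back to a previous model version or starting a retrain pipeline.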