# AWS ML - Part 3

## Domain 3: Modeling

3.1 Frame business problems as machine learning problems.

Determine when to use/when not to use ML

Know the difference between supervised and unsupervised learning

Select from among classification, regression, forecasting, clustering, recommendation, etc.

3.2 Select the appropriate model(s) for a given machine learning problem.

XGBoost, logistic regression, k-means, linear regression, decision trees, random forests, RNN, CNN, ensemble, transfer learning

Express intuition behind models

3.3 Train machine learning models.

Train/validation/test split, cross-validation

Optimizer, gradient descent, loss functions, local minima, convergence, batches, probability, etc.

Compute choice (GPU vs. CPU, distributed vs. non-distributed, platform [Spark vs. non-Spark])

Model updates and retraining (batch vs. real-time/online)
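Several of the 3.3 topics (train/validation/test split, loss function, gradient descent, batches, convergence) fit into one small sketch. The example below is illustrative only: the synthetic data, the 70/15/15 split, the learning rate, and every function name are assumptions chosen for the demo, not anything prescribed by the exam guide.

```python
import random

def make_data(n, seed=0):
    """Synthetic points from y = 3x + 2 plus small Gaussian noise (demo data)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 3.0 * x + 2.0 + rng.gauss(0.0, 0.1)) for x in xs]

def split(data, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle once, then cut into train / validation / test partitions."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * train_frac)
    n_val = int(len(rows) * val_frac)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

def mse(rows, w, b):
    """Mean squared error loss of the line y = w*x + b on the given rows."""
    return sum((w * x + b - y) ** 2 for x, y in rows) / len(rows)

def train(train_rows, val_rows, lr=0.1, epochs=50, batch_size=16, seed=0):
    """Mini-batch gradient descent; records validation loss after each epoch."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    history = []
    for _ in range(epochs):
        rows = train_rows[:]
        rng.shuffle(rows)  # new batch composition every epoch
        for i in range(0, len(rows), batch_size):
            batch = rows[i:i + batch_size]
            # Gradients of the batch MSE with respect to w and b.
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
        history.append(mse(val_rows, w, b))
    return w, b, history

data = make_data(400)
train_rows, val_rows, test_rows = split(data)
w, b, history = train(train_rows, val_rows)
test_mse = mse(test_rows, w, b)  # touched only once, after training is finished
```

The validation loss in `history` is what you would watch for convergence or early stopping; the test partition stays out of the loop so that `test_mse` remains an honest estimate.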

3.4 Perform hyperparameter optimization.

Regularization (dropout, L1/L2)

Cross-validation

Model initialization

Neural network architecture (layers/nodes), learning rate, activation functions

Tree-based models (number of trees, number of levels)

Linear models (learning rate)
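To make the L1/L2 and cross-validation items concrete, here is a minimal sketch for the simplest possible case: one feature, no intercept, squared-error loss. Under that assumption both penalties have closed-form solutions, which shows the usual intuition directly (L2 shrinks the weight, L1 can set it exactly to zero). The data, the candidate penalty values, and all function names are hypothetical.

```python
import random

def ridge_weight(rows, lam):
    """L2: minimise sum((y - w*x)^2) + lam * w^2 for one feature, no intercept."""
    sxy = sum(x * y for x, y in rows)
    sxx = sum(x * x for x, _ in rows)
    return sxy / (sxx + lam)

def lasso_weight(rows, lam):
    """L1: minimise sum((y - w*x)^2) + lam * |w| (soft-thresholding solution)."""
    sxy = sum(x * y for x, y in rows)
    sxx = sum(x * x for x, _ in rows)
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)
    return (shrunk if sxy >= 0 else -shrunk) / sxx

def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds over n rows."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, val_idx
        start += size

def cv_error(rows, lam, k=5):
    """Average validation MSE of ridge regression across k folds."""
    total = 0.0
    for train_idx, val_idx in k_fold_indices(len(rows), k):
        w = ridge_weight([rows[i] for i in train_idx], lam)
        total += sum((rows[i][1] - w * rows[i][0]) ** 2 for i in val_idx) / len(val_idx)
    return total / k

# Demo data: y = 2x plus noise (an assumption for illustration).
rng = random.Random(1)
xs = [rng.uniform(-1.0, 1.0) for _ in range(50)]
rows = [(x, 2.0 * x + rng.gauss(0.0, 0.5)) for x in xs]

# Hyperparameter search: pick the penalty with the lowest cross-validated error.
candidates = [0.0, 0.1, 1.0, 10.0, 100.0]
best_lam = min(candidates, key=lambda lam: cv_error(rows, lam))
```

This is a plain grid search over `candidates`; random search or Bayesian optimization follow the same pattern with a different way of proposing the next value to try.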

3.5 Evaluate machine learning models.

Avoid overfitting/underfitting (detect and handle bias and variance)

Metrics (AUC-ROC, accuracy, precision, recall, RMSE, F1 score)

Confusion matrix

Offline and online model evaluation, A/B testing

Compare models using metrics (time to train a model, quality of model, engineering costs)

Cross-validation
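The metrics listed in 3.5 can all be derived from a few counts, which the sketch below spells out for binary labels (1 = positive, 0 = negative). The function names and toy inputs are assumptions for illustration; in practice a library implementation would normally be used instead.

```python
import math

def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 computed from the confusion matrix."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

def auc_roc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rmse(y_true, y_pred):
    """Root mean squared error for regression outputs."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

The pairwise `auc_roc` is quadratic in the number of samples, which is fine for a study example but not for large evaluation sets.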