ML Quiz — KiranVision

1. What is supervised learning?

- Learning without any data
- Learning from labeled input-output pairs
- Learning only from unlabeled clusters
- Learning through robot hardware only

2. Which metric is most appropriate for imbalanced classification?

- Raw accuracy alone
- Dataset file size
- F1-score or ROC-AUC
- Number of features only

3. Linear regression minimizes which loss (by default)?

- Mean Squared Error
- Cross-entropy only
- Hinge loss
- Log-likelihood of clusters

4. What causes overfitting?

- Too little training data always helps
- Model memorizes noise; high variance
- Using a validation set
- Regularization always hurts performance

5. Random forests reduce variance by:

- Using a single deep tree
- Removing all randomness
- Averaging many decorrelated trees
- Increasing learning rate only

6. k-fold cross-validation is used to:

- Estimate model performance more reliably
- Replace the test set permanently
- Increase training set labels artificially
- Eliminate need for any data
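
For readers unfamiliar with the mechanics behind question 6, here is a minimal sketch of how k-fold splitting partitions sample indices. The function name `k_fold_indices` and the contiguous, unshuffled folds are assumptions made for illustration; real libraries usually offer shuffling and stratification on top of this.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds (hypothetical helper)."""
    # Spread any remainder over the first folds so fold sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        # Everything outside the validation fold is used for training.
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validate:", val_idx)
```

Each sample lands in the validation fold exactly once, so a model is scored k times on data it was not trained on in that round.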

7. Logistic regression outputs:

- Unbounded real numbers only
- Probabilities via sigmoid
- Cluster centroids
- Image pixels directly

8. Gradient boosting builds models by:

- Training all trees in parallel on identical data
- Using only linear regression
- Sequentially fitting residuals/errors
- Ignoring the loss function

9. StandardScaler is important before SVM because:

- Features on different scales skew distance calculations
- SVM cannot run on numeric data
- It adds more training labels
- It removes the need for kernels
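
As background for question 9, the sketch below shows the kind of rescaling a standard scaler performs on a single feature column: subtract the mean, then divide by the standard deviation. The helper `standardize` is a hypothetical pure-Python stand-in, not the scikit-learn class itself; it assumes population variance and leaves constant columns at zero.

```python
import math


def standardize(column):
    """Return the column rescaled to zero mean and unit variance (hypothetical helper)."""
    mean = sum(column) / len(column)
    # Population variance (divide by n), an assumption of this sketch.
    variance = sum((x - mean) ** 2 for x in column) / len(column)
    # Fall back to 1.0 for constant columns to avoid division by zero.
    std = math.sqrt(variance) or 1.0
    return [(x - mean) / std for x in column]


# Two features on very different scales end up on a comparable scale.
ages = standardize([20.0, 30.0, 40.0])
incomes = standardize([20000.0, 30000.0, 40000.0])
print(ages)
print(incomes)
```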

10. The bias-variance tradeoff means:

- More data always increases both equally
- Simplifying models can reduce variance but increase bias
- Test set should be used for tuning
- Neural nets have no variance

Machine Learning Knowledge Quiz