Unit 3: Evaluating Models

What is Model Evaluation?
Model evaluation means checking how well a machine learning model performs using different evaluation metrics.

✔ Why is it important?
Helps us know if the model is performing well.
Works like a report card for the AI model.
Gives feedback → so we can improve the model.
Helps us select the best model

Why Do We Need Model Evaluation?
Model evaluation:
Tells the strengths and weaknesses of a model
Shows how well a model will work on future / unseen data
Helps build reliable and trustworthy AI systems
Is a necessary step before using the model in real life
Just like a school report card helps students improve,
model evaluation helps AI models improve.

Train–Test Split (Evaluation Technique)
✔ What is train-test split?
It is a method to check a model’s performance by dividing the dataset into:
Training set → Used to teach the model
Testing set → Used to check the model

✔ Why is train-test split needed?
To check how the model performs on new data
To avoid overfitting (when a model memorizes the training data and fails on new data)
To estimate future performance
To build a model that predicts correctly for unseen cases

✔ Key Point:
You should not test the model on the same data used for training.
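A minimal sketch of a train-test split in Python, assuming scikit-learn is available; the feature list X and labels y below are toy values invented purely for illustration:

    # Train-test split sketch with scikit-learn (toy data for illustration).
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression

    X = [[1], [2], [3], [4], [5], [6], [7], [8]]   # input features
    y = [0, 0, 0, 0, 1, 1, 1, 1]                   # class labels

    # Hold out 25% of the data for testing; train on the remaining 75%.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=42
    )

    model = LogisticRegression()
    model.fit(X_train, y_train)    # the model learns only from the training set

    # Evaluate only on data the model has never seen.
    print("Test accuracy:", model.score(X_test, y_test))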


Accuracy and Error
✔ Bob & Billy Example (Simple Understanding)
Entry fee = ₹500
Bob brings ₹300 → error = 500 – 300 = 200
Billy brings ₹550 → error = 550 – 500 = 50
Billy is more accurate because his error (₹50) is smaller than Bob's (₹200); he is closer to the correct amount.

What is Accuracy?
Accuracy tells us how many predictions the model got correct.
Accuracy = Correct Predictions / Total Predictions
✔ Higher accuracy = better model performance.
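As a quick illustration of the formula, here is a short Python sketch on made-up prediction lists:

    # Accuracy = correct predictions / total predictions (made-up data).
    actual    = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
    predicted = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

    correct = sum(a == p for a, p in zip(actual, predicted))
    accuracy = correct / len(actual)
    print(f"{correct}/{len(actual)} correct -> accuracy = {accuracy:.0%}")  # 8/10 -> 80%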

What is Error?
Error is the difference between the predicted value and actual value.
It shows how wrong the model’s prediction is.
✔ Goal: Minimize error
✔ Example:
If the model says “no disease” but the person actually has a disease → this is an error.
Evaluation helps choose the best model and avoid overfitting.

EVALUATION METRICS FOR CLASSIFICATION 

What is Classification?
Classification means sorting or grouping items into different categories.
Simple Example:
You are in a supermarket with two trolleys:
One trolley → fruits & vegetables
Another trolley → grocery items
You are classifying items into:
Fruits/Vegetables
Grocery

Definition
Classification is a machine learning task where the model predicts a class label based on input data.
Examples of Classification
Predicting whether an item is a vegetable or grocery
Email: spam or not spam
Predicting whether a patient has disease = Yes/No
Credit card fraud detection (Fraud / Not Fraud)

Classification Metrics
These are methods to measure how good a classification model is.
Common metrics are:
Confusion Matrix
Accuracy
Precision
Recall
F1 Score

Confusion Matrix
A confusion matrix shows:
Actual values on the Y-axis
Predicted values on the X-axis
It contains four important terms:
              Predicted Yes   Predicted No
Actual Yes    TP              FN
Actual No     FP              TN


Meaning of the Four Terms
1. True Positive (TP)
Model predicted Yes → actually Yes
Example: You predicted a person has disease, and they really have it.

2. True Negative (TN)
Model predicted No → actually No
Example: You predicted a person does not have disease, and they are healthy.

3. False Positive (FP)
Model predicted Yes → actually No
A false alarm (Type-I error)
Example: You predicted a person has disease, but they are healthy.

4. False Negative (FN)
Model predicted No → actually Yes
A missed detection, which can be very dangerous (Type-II error)
Example: You predicted a patient is healthy, but they actually have the disease.
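The four terms can be read off directly with scikit-learn's confusion_matrix; a small sketch using invented disease-test labels (1 = disease, 0 = healthy):

    # Confusion matrix with scikit-learn (toy labels, 1 = disease, 0 = healthy).
    from sklearn.metrics import confusion_matrix

    actual    = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
    predicted = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0]

    # With labels=[0, 1], ravel() returns the counts in the order TN, FP, FN, TP
    # (scikit-learn lists the negative class first).
    tn, fp, fn, tp = confusion_matrix(actual, predicted, labels=[0, 1]).ravel()
    print(f"TP={tp}  TN={tn}  FP={fp}  FN={fn}")   # TP=4  TN=4  FP=1  FN=1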


Accuracy
Accuracy tells us how many predictions were correct.

Accuracy = (TP + TN) / (TP + TN + FP + FN)

✔ When is accuracy useful?

When dataset is balanced (equal number of Yes/No examples)
⚠ When NOT to use accuracy?
When dataset is unbalanced

Example in textbook:
900 Yes, 100 No
Model predicts everything “Yes” → still 90% accuracy (but it's useless)
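The textbook's 900/100 example is easy to reproduce in a few lines of Python:

    # The accuracy trap on an unbalanced dataset: 900 Yes vs 100 No.
    actual    = [1] * 900 + [0] * 100   # 1 = Yes, 0 = No
    predicted = [1] * 1000              # a "model" that always answers Yes

    accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
    print(f"accuracy = {accuracy:.0%}")  # 90%, yet the model never detects a single No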

Precision
Precision tells us, out of all predicted YES, how many were actually YES.

Precision = TP / (TP + FP)

✔ When to use Precision?
When False Positives (FP) must be minimized.
Example use case: Satellite Launch
Predicting bad weather as good weather is dangerous
FP must be avoided → use Precision

Recall (Sensitivity / True Positive Rate)
Recall tells us, out of all actual YES cases, how many were predicted correctly.

Recall = TP / (TP + FN)

✔ When to use Recall?
When False Negatives (FN) must be minimized.
Example use case: COVID-19 diagnosis
Predicting a sick person as healthy (FN) is dangerous
Recall must be high → use Recall

F1 Score
The F1 Score combines Precision and Recall into a single value (their harmonic mean).

F1 Score = 2 × (Precision × Recall) / (Precision + Recall)

✔ When to use F1 Score?
Dataset is unbalanced
Can't decide whether FP or FN is more important
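To tie the three metrics together, here is a minimal sketch computing Precision, Recall, and F1 Score on the same toy labels used in the confusion matrix example (assuming scikit-learn):

    # Precision, recall, and F1 on toy labels with scikit-learn.
    from sklearn.metrics import precision_score, recall_score, f1_score

    actual    = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
    predicted = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0]

    print("Precision:", precision_score(actual, predicted))  # TP/(TP+FP) = 4/5 = 0.8
    print("Recall:   ", recall_score(actual, predicted))     # TP/(TP+FN) = 4/5 = 0.8
    print("F1 Score: ", f1_score(actual, predicted))         # harmonic mean = 0.8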


Classification Metric Selection – Summary Table

Scenario                                                         Most Important Metric   Why?
Satellite launch (bad weather predicted as good = dangerous)     Precision               Avoid FP
COVID-19 detection (sick person predicted as healthy = risky)    Recall                  Avoid FN
Fraud detection                                                  Recall                  Missing a fraud (FN) is costly
Balanced dataset                                                 Accuracy                All errors equally important
Unbalanced dataset & unsure                                      F1 Score                Balanced measure

Ethical Concerns in Model Evaluation
While evaluating a model, we must consider:
Fairness
Ensure the model is not biased toward any group.
Privacy
Do not expose personal or sensitive data.
Transparency
The decision-making logic should be explainable.
Avoid Harm
Wrong predictions should not result in physical, emotional, or financial harm.
Accountability
Developers must take responsibility for model errors.
