Videos

Train, Test, & Validation Data Sets (7 min): https://www.youtube.com/watch?v=Zi-0rlM4RDs
Confusion Matrix (7 min): https://www.youtube.com/watch?v=Kdsp6soqA7o
ROC and AUC, Clearly Explained (16 min): https://www.youtube.com/watch?v=4jRBRDbJemM (this one is good 🙂)
Evaluating Classifiers: Gains and Lift Charts (14 min): https://www.youtube.com/watch?v=1dYOcDaDJLY

Articles

Exercise: Understanding the Confusion Matrix in R: https://www.datacamp.com/community/tutorials/confusion-matrix-calculation-r
Exercise with answers: Model Evaluation 1: https://www.r-exercises.com/2016/12/02/model-evaluation-exercise-1/
Exercise with answers: Model Evaluation 2: https://www.r-exercises.com/2016/12/22/model-evaluation-2/
Tutorial (updated by IBM): Cumulative Gains and Lift Charts: https://www.ibm.com/docs/en/spss-statistics/28.0.0?topic=customers-cumulative-gains-lift-charts
Tutorial: Generate ROC Curve Charts for Print and Interactive Use: https://cran.r-project.org/web/packages/plotROC/vignettes/examples.html

Measures derived from a confusion matrix

condition positive (P)
the number of real positive cases
condition negative (N)
the number of real negative cases

true positive (TP)
a prediction that correctly indicates the presence of a condition or characteristic
true negative (TN)
a prediction that correctly indicates the absence of a condition or characteristic
false positive (FP)
a prediction that wrongly indicates that a particular condition or attribute is present
false negative (FN)
a prediction that wrongly indicates that a particular condition or attribute is absent
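The four counts above can be computed directly in R from 0/1 vectors. A minimal sketch, where `actual` and `predicted` are made-up vectors used only for illustration:

```r
# Made-up 0/1 vectors for illustration (1 = condition present)
actual    <- c(1, 1, 0, 0, 1, 0, 0, 1)
predicted <- c(1, 0, 0, 1, 1, 0, 0, 1)

TP <- sum(predicted == 1 & actual == 1)  # correctly predicted positives
TN <- sum(predicted == 0 & actual == 0)  # correctly predicted negatives
FP <- sum(predicted == 1 & actual == 0)  # predicted positive, actually negative
FN <- sum(predicted == 0 & actual == 1)  # predicted negative, actually positive

c(TP = TP, TN = TN, FP = FP, FN = FN)    # TP 3, TN 3, FP 1, FN 1
```

`table(predicted, actual)` gives the same four counts as a 2x2 table.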

sensitivity, recall, hit rate, or true positive rate (TPR)
TPR = TP / P = TP / (TP + FN) = 1 - FNR

specificity, selectivity, or true negative rate (TNR)
TNR = TN / N = TN / (TN + FP) = 1 - FPR

precision or positive predictive value (PPV)
PPV = TP / predicted positives = TP / (TP + FP) = 1 - FDR
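As a quick sanity check of these rate formulas, here is a sketch in R using counts from a hypothetical confusion matrix (the numbers are assumptions, not taken from any model on this page):

```r
# Hypothetical confusion-matrix counts
TP <- 3; FN <- 1; FP <- 1; TN <- 3

TPR <- TP / (TP + FN)  # sensitivity / recall / true positive rate
TNR <- TN / (TN + FP)  # specificity / true negative rate
PPV <- TP / (TP + FP)  # precision / positive predictive value

c(TPR = TPR, TNR = TNR, PPV = PPV)  # all 0.75 with these counts
```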
negative predictive value (NPV)
miss rate or false negative rate (FNR)
fall-out or false positive rate (FPR)
false discovery rate (FDR)
false omission rate (FOR)
Positive likelihood ratio (LR+)
Negative likelihood ratio (LR-)
prevalence threshold (PT)
threat score (TS) or critical success index (CSI)

Prevalence

Accuracy

accuracy (ACC)
balanced accuracy (BA)
F1 score
is the harmonic mean of precision and sensitivity
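The harmonic-mean definition of F1 can be sketched in R; the precision and recall values below are assumed for illustration:

```r
# Assumed precision and recall for illustration
precision <- 0.75
recall    <- 0.75

# F1 is the harmonic mean of precision and recall
F1 <- 2 * precision * recall / (precision + recall)
F1  # equals 0.75 when precision == recall == 0.75
```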

phi coefficient (φ or rφ) or Matthews correlation coefficient (MCC)

Fowlkes–Mallows index (FM)

informedness or bookmaker informedness (BM)

markedness (MK) or deltaP (Δp)

Diagnostic odds ratio (DOR)

Sources: Fawcett (2006),[2] Piryonesi and El-Diraby (2020),[3] Powers (2011),[4] Ting (2011),[5] CAWCR,[6] D. Chicco & G. Jurman (2020, 2021),[7][8] Tharwat (2018).[9]

==================================================================================

# To create a confusion matrix with a complete analysis of measures

library(caret)

KNNconfusionMatrix <- caret::confusionMatrix(Predictionsvector, actualvector)
KNNconfusionMatrix

# Easiest visualization of gain and lift charts in R:

# install.packages("CustomerScoringMetrics")
library(CustomerScoringMetrics)
CustomerScoringMetrics::cumGainsChart(Predictionsvector, actualvector)
CustomerScoringMetrics::liftChart(Predictionsvector, actualvector)

# ROC, gain, and lift with more elaboration in R (ROCR)

##################################################
# Create a prediction object from the predictions and actuals of any test data.
# Various performance() measures then give ROC, cumulative gain, and lift charts.
##################################################
# ROCRpredictionObjfromAnyModel <- ROCR::prediction(as.numeric(PredictionsfromAnyTestset), as.numeric(ActualsfromAnyTestset))
##################################################
library(ROCR)
ROCRpredictionObjfromKNN <- ROCR::prediction(as.numeric(amirKNNextraxtedPredictionsvector), as.numeric(actualvectoradmitancefromtestdata))

# ROC chart:
ROCKNN <- ROCR::performance(ROCRpredictionObjfromKNN, measure = "tpr", x.measure = "fpr")
plot(ROCKNN, col = "orange", lwd = 2, main = "ROC curve ROCRpredictionObjKNN")

# Gains chart:
gainKNN <- ROCR::performance(ROCRpredictionObjfromKNN, measure = "tpr", x.measure = "rpp")
plot(gainKNN, col = "orange", lwd = 2, main = "gain curve KNN")

# Lift chart:
liftchartKNN <- ROCR::performance(ROCRpredictionObjfromKNN, "lift", "rpp")
plot(liftchartKNN, main = "Lift curve KNN", colorize = TRUE)

# AUC
library(ROCit)
roc_empirical <- ROCit::rocit(score = as.numeric(amirKNNextraxtedPredictionsvector), class = as.numeric(actualvectoradmitancefromtestdata), negref = 1)
# summary() reports the area under the curve (AUC)
summary(roc_empirical)
plot(roc_empirical)

=============================================

http://mlwiki.org/index.php/Cumulative_Gain_Chart

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4608333/

What is a gain chart? https://idswater.com/2020/09/19/what-is-a-gain-chart/

http://www2.cs.uregina.ca/~dbd/cs831/notes/lift_chart/lift_chart.html

AUC:  https://www.r-bloggers.com/2016/11/calculating-auc-the-area-under-a-roc-curve/

https://www.rdocumentation.org/packages/caret/versions/5.07-001/topics/predict.train

https://www.datanovia.com/en/lessons/determining-the-optimal-number-of-clusters-3-must-know-methods/

https://cran.r-project.org/web/packages/ROCit/index.html