Kaggle: Fraudulent or Genuine

April 9, 2017

Worked on the Kaggle credit card fraud detection dataset, which is highly imbalanced, and balanced it with oversampling.
Because the data was highly imbalanced, I balanced it by oversampling with a Python script and then trained a Two-Class Decision Forest. Given the high class imbalance ratio, I recommend measuring performance with the Area Under the Precision-Recall Curve (AUPRC). Accuracy computed from the confusion matrix is not meaningful for unbalanced classification: a model that labels every transaction as genuine would still score a very high accuracy while catching no fraud at all.
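The Python script used for the oversampling is not included here, so the following is only a minimal sketch of the two ideas above, not the original code. It assumes simple random oversampling (duplicating minority-class rows until the classes are the same size) and estimates AUPRC as average precision, which is one common way to summarise the precision-recall curve. The function names `oversample_minority`, `average_precision` and `accuracy` are hypothetical, labels are assumed to be 0 for genuine and 1 for fraud, and tied scores are not given special handling.

```python
import random


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until both classes are equal in size.

    Assumption: plain random oversampling with replacement; the original
    script may have used a different scheme.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        raise ValueError("both classes must be present")
    # Draw extra minority indices with replacement to close the gap.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = majority + minority + extra
    rng.shuffle(idx)
    return [rows[i] for i in idx], [labels[i] for i in idx]


def average_precision(labels, scores):
    """Estimate AUPRC as the mean of precision at the rank of each true positive."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank examples from highest to lowest predicted fraud score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_pos += 1
            ap += true_pos / rank  # precision at this recall step
    return ap / total_pos


def accuracy(labels, predictions):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


if __name__ == "__main__":
    # Toy data: 98 genuine transactions and 2 fraudulent ones.
    labels = [0] * 98 + [1] * 2
    rows = [[i] for i in range(len(labels))]

    # Predicting "genuine" for everything looks accurate but finds no fraud.
    print("accuracy of always-genuine model:", accuracy(labels, [0] * len(labels)))

    balanced_rows, balanced_labels = oversample_minority(rows, labels)
    print("fraud rows after oversampling:", sum(balanced_labels), "of", len(balanced_labels))
```

If a sketch like this is used, the oversampling should be applied to the training split only, with AUPRC measured on untouched held-out data, so that duplicated fraud rows do not leak into the evaluation.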