The Celebrities in Frontal-Profile Wild (CPFW) dataset contains images of 500 subjects, with 10 frontal images and 4 profile images per subject. The 5000 frontal images were pre-processed using dlib to crop and align the faces. Face detection failed on 37 images (failure to detect, FTD), leaving 4963 images. The final gallery has the following distribution of the number of images per subject:
# of Subjects    # of Images per Subject
6                8
25               9
469              10
Total images = 4963
Total subjects = 500
You should assume the system is symmetric, that is, comparing image A to image B yields the same score as comparing image B to image A. You may use the programming language/tool of your choice (R, Python, MATLAB, etc.) in the analysis of the data. Please indicate which tool/language is being used, and include a text file of code with your submission.
Files needed for this assignment (see Module 2):
Answer the following questions:
Genuine and Impostor Score Distributions:
a. Extract genuine and impostor scores from the similarity matrix.
b. Generate and plot the score distribution histograms for genuine and impostor scores on the same graph.
c. Additionally, plot the relative score distribution for genuine and impostor scores.
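As a rough illustration of steps a to c, the sketch below splits a symmetric similarity matrix into genuine and impostor scores and prepares both a count histogram and a relative-frequency histogram. The names `split_scores` and `plot_score_histograms`, the toy matrix, and the toy labels are all hypothetical; the real similarity matrix and subject labels come from the assignment files, whose format is not assumed here.

```python
import numpy as np


def split_scores(sim, labels):
    """Return (genuine, impostor) scores from a symmetric similarity matrix."""
    sim = np.asarray(sim, dtype=float)
    labels = np.asarray(labels)
    # Symmetric system: keep each unordered pair once (strict upper triangle).
    # k=1 also drops the self-comparisons on the diagonal.
    rows, cols = np.triu_indices(sim.shape[0], k=1)
    scores = sim[rows, cols]
    same = labels[rows] == labels[cols]
    return scores[same], scores[~same]


def plot_score_histograms(genuine, impostor, bins=50):
    """Count histogram (left) and relative-frequency histogram (right)."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    # Shared bin edges so the two classes are directly comparable.
    edges = np.histogram_bin_edges(np.concatenate([genuine, impostor]), bins=bins)
    fig, (ax_count, ax_rel) = plt.subplots(1, 2, figsize=(10, 4))
    for scores, name in ((genuine, "genuine"), (impostor, "impostor")):
        ax_count.hist(scores, bins=edges, alpha=0.5, label=name)
        # Weights of 1/N make the bar heights of each class sum to 1.
        weights = np.full(scores.size, 1.0 / scores.size)
        ax_rel.hist(scores, bins=edges, alpha=0.5, label=name, weights=weights)
    ax_count.set(xlabel="similarity score", ylabel="count")
    ax_rel.set(xlabel="similarity score", ylabel="relative frequency")
    ax_count.legend()
    ax_rel.legend()
    return fig


# Toy data (assumption): 5 images from 3 subjects.
labels = np.array([0, 0, 1, 1, 2])
sim = np.array([
    [1.0, 0.9, 0.2, 0.3, 0.1],
    [0.9, 1.0, 0.4, 0.2, 0.3],
    [0.2, 0.4, 1.0, 0.8, 0.5],
    [0.3, 0.2, 0.8, 1.0, 0.2],
    [0.1, 0.3, 0.5, 0.2, 1.0],
])
genuine, impostor = split_scores(sim, labels)
```

The relative-frequency view matters here because impostor pairs vastly outnumber genuine pairs, so raw counts tend to hide the genuine distribution.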
D-prime Calculation: Compute the d-prime (d’) value to assess the separation between genuine and impostor score distributions.
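One common definition of d-prime divides the absolute difference of the two means by the square root of the average of the two variances. The sketch below assumes that definition and population variances (`ddof=0`); the course material may use a different variant, and the function name `d_prime` and the toy scores are hypothetical.

```python
import numpy as np


def d_prime(genuine, impostor):
    """d' = |mu_g - mu_i| / sqrt((var_g + var_i) / 2), using population variances."""
    g = np.asarray(genuine, dtype=float)
    i = np.asarray(impostor, dtype=float)
    pooled_std = np.sqrt(0.5 * (g.var() + i.var()))
    return abs(g.mean() - i.mean()) / pooled_std


# Toy scores (assumption): means 0.9 and 0.2, both variances 0.01.
d = d_prime([0.8, 1.0], [0.1, 0.3])
```

A larger value indicates better separation between the genuine and impostor distributions.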
Receiver Operating Characteristic (ROC) Curve:
a. Calculate the True Positive Rate (TPR) and False Positive Rate (FPR) for varying thresholds (a minimum of 10 thresholds).
b. Plot the ROC curve and compute the Area Under the Curve (AUC).
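A minimal sketch of the ROC computation follows, assuming similarity scores where a pair is accepted when its score is greater than or equal to the threshold. The helper names `roc_points` and `auc_trapezoid`, the choice of 100 evenly spaced thresholds, and the toy scores are assumptions for illustration only.

```python
import numpy as np


def roc_points(genuine, impostor, n_thresholds=100):
    """TPR and FPR at evenly spaced thresholds spanning the observed scores."""
    g = np.asarray(genuine, dtype=float)
    i = np.asarray(impostor, dtype=float)
    lo = min(g.min(), i.min())
    hi = max(g.max(), i.max())
    thresholds = np.linspace(lo, hi, n_thresholds)
    # Accept a comparison when score >= threshold.
    tpr = np.array([(g >= t).mean() for t in thresholds])
    fpr = np.array([(i >= t).mean() for t in thresholds])
    return thresholds, tpr, fpr


def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    # Thresholds increase, so both rates decrease; reverse them and pin the
    # curve to the corners (0, 0) and (1, 1).
    x = np.concatenate([[0.0], fpr[::-1], [1.0]])
    y = np.concatenate([[0.0], tpr[::-1], [1.0]])
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))


# Toy scores (assumption): perfectly separated classes.
thresholds, tpr, fpr = roc_points([0.8, 0.9], [0.1, 0.2, 0.3])
auc = auc_trapezoid(fpr, tpr)
```

To draw the curve, `fpr` can be plotted on the x-axis against `tpr` on the y-axis, for example with `matplotlib.pyplot.plot(fpr, tpr)`.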
Cumulative Match Characteristic (CMC) Curve: Generate a CMC curve to evaluate the rank-based identification performance of the biometric system.
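The assignment does not spell out an identification protocol, so the sketch below assumes a closed-set, leave-one-out setup: every image acts as a probe against all other images, each subject is scored by its best-matching image, and ties are resolved optimistically. The function name `cmc_curve` and the toy data are hypothetical, and every subject is assumed to have at least two images (true for this gallery, where the minimum is 8).

```python
import numpy as np


def cmc_curve(sim, labels, max_rank=None):
    """Identification rate at ranks 1..max_rank (closed-set, leave-one-out)."""
    sim = np.asarray(sim, dtype=float).copy()
    labels = np.asarray(labels)
    np.fill_diagonal(sim, -np.inf)  # a probe is never compared with itself
    subjects, true_col = np.unique(labels, return_inverse=True)
    # Best score of each probe against each subject: shape (n_images, n_subjects).
    best = np.stack([sim[:, labels == s].max(axis=1) for s in subjects], axis=1)
    true_score = best[np.arange(labels.size), true_col]
    # Rank = 1 + number of subjects scoring strictly higher than the true one.
    ranks = 1 + (best > true_score[:, None]).sum(axis=1)
    max_rank = max_rank or subjects.size
    return np.array([(ranks <= r).mean() for r in range(1, max_rank + 1)])


# Toy data (assumption): 6 images, 3 subjects with 2 images each.
labels = np.array([0, 0, 1, 1, 2, 2])
upper = np.array([
    [0, 0.9, 0.2, 0.3, 0.10, 0.2],
    [0, 0.0, 0.4, 0.2, 0.25, 0.1],
    [0, 0.0, 0.0, 0.8, 0.50, 0.2],
    [0, 0.0, 0.0, 0.0, 0.20, 0.4],
    [0, 0.0, 0.0, 0.0, 0.00, 0.3],
    [0, 0.0, 0.0, 0.0, 0.00, 0.0],
])
sim = upper + upper.T + np.eye(6)  # symmetric, with self-scores of 1
cmc = cmc_curve(sim, labels)
```

The returned array is non-decreasing and reaches 1.0 by the last rank in a closed-set setting; plotting it against rank 1, 2, 3, ... gives the CMC curve.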
False Match Rate (FMR) and False Non-Match Rate (FNMR) Curves:
a. Plot the FMR and FNMR curves on the same graph relative to the threshold.
b. Identify and mark the operating threshold that minimizes the difference between FMR and FNMR.
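The two error curves and the crossover threshold can be sketched as follows, again assuming that a comparison is accepted when its similarity score is greater than or equal to the threshold. The names `fmr_fnmr`, `crossover_threshold`, and `plot_error_curves`, the 200-point threshold grid, and the toy scores are assumptions for illustration.

```python
import numpy as np


def fmr_fnmr(genuine, impostor, n_thresholds=200):
    """FMR and FNMR at evenly spaced thresholds spanning the observed scores."""
    g = np.asarray(genuine, dtype=float)
    i = np.asarray(impostor, dtype=float)
    thresholds = np.linspace(min(g.min(), i.min()), max(g.max(), i.max()), n_thresholds)
    fmr = np.array([(i >= t).mean() for t in thresholds])   # impostors accepted
    fnmr = np.array([(g < t).mean() for t in thresholds])   # genuines rejected
    return thresholds, fmr, fnmr


def crossover_threshold(thresholds, fmr, fnmr):
    """Index and value of the threshold that minimizes |FMR - FNMR|."""
    k = int(np.argmin(np.abs(fmr - fnmr)))
    return k, float(thresholds[k])


def plot_error_curves(thresholds, fmr, fnmr, t_star):
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, ax = plt.subplots()
    ax.plot(thresholds, fmr, label="FMR")
    ax.plot(thresholds, fnmr, label="FNMR")
    ax.axvline(t_star, linestyle="--", color="gray", label="operating threshold")
    ax.set(xlabel="threshold", ylabel="error rate")
    ax.legend()
    return fig


# Toy scores (assumption): one impostor score overlaps the genuine range.
thresholds, fmr, fnmr = fmr_fnmr([0.6, 0.8, 0.9], [0.1, 0.2, 0.3, 0.7])
k, t_star = crossover_threshold(thresholds, fmr, fnmr)
```

With a finite grid the two curves rarely meet exactly, so the selected threshold is the grid point where they are closest; a finer grid narrows the gap.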