Performance metrics for machine learning algorithms: precision, recall, accuracy, sensitivity, and specificity @ Murphy 的書房

Precision, recall, and accuracy

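When I hear precision / recall, I still cannot grasp intuitively what they mean.

So I collected the definitions and a few examples to build a more intuitive understanding.

Definitions: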

\( \text{Precision} = \frac{\text{# of true positives}}{\text{# of predicted positives}} = \frac{\text{# of true positives}}{\text{# of true positives + # of false positives}} \)

\( \text{Recall} = \frac{\text{# of true positives}}{\text{# of actual positives}} = \frac{\text{# of true positives}}{\text{# of true positives + # of false negatives}} \)

\( \text{Accuracy} = \frac{\text{# of true positives + # of true negatives}}{\text{# of actual positives + # of actual negatives}} \)

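Meaning:

(Assume each case is predicted as either positive or negative.)

Precision: of the cases predicted as positive, the proportion that really are positive. (How often a positive prediction is correct, i.e. how precise the predictions are.)

Recall: of the actual positive cases, the proportion correctly predicted as positive. (How many of the positive cases are correctly recalled.)

Accuracy: of all cases, the proportion whose prediction, positive or negative, is correct.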
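
The definitions above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the function names, the label encoding (1 = positive, 0 = negative), and the toy `y_true` / `y_pred` lists are all assumptions made for this example.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (assumed: 1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # predicted positive, actually positive
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted positive, actually negative
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # predicted negative, actually positive
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # predicted negative, actually negative
    return tp, fp, fn, tn


def precision(tp, fp):
    # True positives over all predicted positives; undefined if nothing was predicted positive.
    return tp / (tp + fp) if (tp + fp) else float("nan")


def recall(tp, fn):
    # True positives over all actual positives; undefined if there are no actual positives.
    return tp / (tp + fn) if (tp + fn) else float("nan")


def accuracy(tp, fp, fn, tn):
    # Correct predictions (positive and negative) over all cases.
    return (tp + tn) / (tp + fp + fn + tn)


# Made-up toy data: 3 actual positives, 5 actual negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(tp, fp, fn, tn)
print(precision(tp, fp), recall(tp, fn), accuracy(tp, fp, fn, tn))
```

Returning NaN when a denominator is zero is one possible choice here; raising an error or returning 0 would be equally defensible depending on the use case.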

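Taking a cancer prediction algorithm as an example:

Precision measures, among all the people the algorithm predicts to have cancer, what proportion actually have cancer.

Recall measures, among all the people who actually have cancer, what proportion the algorithm predicts to have cancer.

Accuracy measures, among all the people tested, what proportion the algorithm predicts correctly.

With 100 people tested, of whom 7 are true positives, 5 are false positives, 8 are false negatives, and 80 are true negatives: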

\( \text{Precision} = \frac{7}{7+5} \approx 0.58 \)

\( \text{Recall} = \frac{7}{7+8} \approx 0.47 \)

\( \text{Accuracy} = \frac{7+80}{7+8+5+80} = 0.87 \)

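Looking at accuracy alone can be misleading. In the example above, only a small fraction of the people tested actually have cancer, so an algorithm can reach a fairly high accuracy simply by leaning toward predicting "no cancer".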

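In an extreme case, where the algorithm predicts cancer for only 2 of the 100 people (1 true positive, 1 false positive, 14 false negatives, and 84 true negatives):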

\( \text{Precision} = \frac{1}{1+1} = 0.5 \)

\( \text{Recall} = \frac{1}{1+14} \approx 0.07 \)

\( \text{Accuracy} = \frac{1+84}{1+14+1+84} = 0.85 \)
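
To see how far this blind spot goes, the sketch below compares the extreme example with a hypothetical classifier that never predicts cancer at all. The `metrics` helper and the "always negative" counts are assumptions introduced for illustration; they simply reuse the same 15 actual positives and 85 actual negatives.

```python
def metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy).

    Precision is None when nothing is predicted positive, and recall is None
    when there are no actual positives, since the ratios are undefined then.
    """
    predicted_pos = tp + fp
    actual_pos = tp + fn
    precision = tp / predicted_pos if predicted_pos else None
    recall = tp / actual_pos if actual_pos else None
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy


# The extreme example from the text: only 2 people are predicted to have cancer.
extreme = metrics(tp=1, fp=1, fn=14, tn=84)

# Hypothetical baseline: predict "no cancer" for everyone, so all 15 actual
# positives become false negatives and all 85 actual negatives are true negatives.
always_negative = metrics(tp=0, fp=0, fn=15, tn=85)

print("extreme:        ", extreme)
print("always negative:", always_negative)
```

The baseline finds no cancer patients at all (recall 0, precision undefined), yet its accuracy matches the extreme example at 0.85, which is why precision and recall are worth reporting alongside accuracy when positives are rare.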

Sensitivity and specificity

\( \text{Sensitivity} = \text{Recall} = \text{True Positive Rate} = \frac{\text{# of true positives}}{\text{# of actual positives}}\)

\( \text{Specificity} = \text{True Negative Rate} = \frac{\text{# of true negatives}}{\text{# of actual negatives}}\)

Meaning:

(Assume each case is predicted as either positive or negative.)

Sensitivity: of the actual positive cases, the proportion correctly predicted as positive.

Specificity: of the actual negative cases, the proportion correctly predicted as negative.

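Taking a cancer prediction algorithm as an example:

Sensitivity measures, among all the people who actually have cancer, what proportion the algorithm predicts to have cancer.

Specificity measures, among all the people who actually do not have cancer, what proportion the algorithm predicts not to have cancer.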

\( \text{Sensitivity} = \frac{7}{7+8} \approx 0.47 \)

\( \text{Specificity} = \frac{80}{5+80} \approx 0.94 \)
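
The two ratios above can be checked with a short Python sketch. The function names are chosen here for illustration only; the counts are the ones appearing in the formulas (7 true positives, 8 false negatives, 5 false positives, 80 true negatives).

```python
def sensitivity(tp, fn):
    # True positive rate: actual positives that were predicted positive.
    return tp / (tp + fn)


def specificity(tn, fp):
    # True negative rate: actual negatives that were predicted negative.
    return tn / (tn + fp)


tp, fn, fp, tn = 7, 8, 5, 80

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
# prints: sensitivity = 0.47, specificity = 0.94
```

Sensitivity only looks at the people who actually have cancer, and specificity only at those who do not, so neither value changes if the ratio of sick to healthy people in the test group changes while the per-group error rates stay the same.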

References

[Wikipedia] Precision and recall

[Wikipedia] Sensitivity and specificity
