We now know that a logistic regression model gives us the probability that the outcome variable is true, that is, P(y = 1). In real-world use cases, however, we need to make decisions, not just report probabilities. Often these decisions are binary predictions, such as Yes/No, Good/Bad, or Go/Stop. A threshold value (t) lets us turn a probability into a decision as follows:
- If P(y=1) >= t, then we predict y = 1
- If P(y=1) < t, then we predict y = 0
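
The two rules above can be sketched in a few lines of Python. This is only an illustration: the function name `predict_label` and the example probabilities are made up for this sketch and do not come from a fitted model.

```python
def predict_label(probability, threshold=0.5):
    """Apply the threshold rule: predict 1 if P(y=1) >= t, otherwise 0.

    `predict_label` is a hypothetical helper used only for illustration.
    """
    return 1 if probability >= threshold else 0


# Made-up probabilities standing in for the output of a logistic regression model.
probabilities = [0.10, 0.45, 0.50, 0.80]

# The same probabilities yield different decisions under different thresholds.
print([predict_label(p, threshold=0.5) for p in probabilities])  # [0, 0, 1, 1]
print([predict_label(p, threshold=0.7) for p in probabilities])  # [0, 0, 0, 1]
```

Raising t from 0.5 to 0.7 turns the third prediction from 1 into 0, even though the underlying probability is unchanged.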
The challenge now is choosing a suitable value of t. Indeed, what does "suitable" mean in this context?
In real-world use cases, some types of error are more acceptable than others. Imagine that you are a doctor testing a large group of patients for a particular ...