Let’s start with the basics.
Logistic regression is a classification technique. It is used when the response variable is categorical. The idea of logistic regression is to find the conditional probability that the output y = 1 given the input X.
Logistic regression can be classified as:
> Binomial: Response variable or target variable can have only two possible types like 0 or 1, true or false, pass or fail, win or loss.
> Multinomial: Response variable or target variable can have 3 or more possible types which are not ordered (types that do not have quantitative significance).
Q: Why are we using logistic regression and not linear regression?
> The value predicted by linear regression can be less than 0 or more than 1, so it cannot be read as a probability. Logistic regression was introduced to keep the output between 0 and 1.
> In logistic regression, the outcome has only a limited number of possible values, whereas in linear regression the outcome can take an infinite number of possible values.
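To see the first point concretely, here is a minimal sketch in Python that fits an ordinary least-squares line to 0/1 labels. The hours and labels are made-up illustrative data, and the helper name fit_line is hypothetical. On this data the fitted line drops below 0 at the low end and rises above 1 at the high end, so its output cannot be read as a probability.

```python
# Made-up illustrative data: hours studied and pass (1) / fail (0) labels.
hours = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
passed = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = b0 + b1*x (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


b0, b1 = fit_line(hours, passed)
predictions = [b0 + b1 * x for x in hours]

# The straight line is not bounded, so some "probabilities" leave [0, 1].
print(min(predictions), max(predictions))
```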
Logistic regression is represented as: P(y=1 | X), where y is the output and X is the input.
Logistic regression uses the logit function, also called the log-odds function, to calculate the conditional probability.
The logit or log-odds function is represented as:
log( p(x) / (1 - p(x)) ) = b0 + b1*x
where the left-hand side is called the logit or log-odds function, and p(x)/(1-p(x)) is called the odds. The odds are the ratio of the probability of success to the probability of failure.
Let p(x) = y
Then the logistic regression equation is: y = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))
This is known as the sigmoid function, and it gives an S-shaped curve. The probability here always lies strictly between 0 and 1 (0 < p < 1).
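As a small sketch of how the sigmoid and the log-odds relate, the Python snippet below computes both and shows that they undo each other. The coefficients b0 and b1 are hypothetical values chosen only for illustration, and the sigmoid is written as 1 / (1 + e^(-z)), which is algebraically the same as e^z / (1 + e^z).

```python
import math


def sigmoid(z):
    """Map a log-odds value z = b0 + b1*x to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log-odds of a probability p; the inverse of sigmoid."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -4.0, 1.0

for x in [0, 2, 4, 6, 8]:
    z = b0 + b1 * x
    p = sigmoid(z)
    odds = p / (1.0 - p)  # probability of success / probability of failure
    print(f"x={x}  p={p:.3f}  odds={odds:.3f}  log-odds={logit(p):.3f}")
```

At x = 4 the log-odds b0 + b1*x is 0, so the odds are 1 and the probability is 0.5; larger x pushes the probability toward 1 without ever reaching it.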
Examples where logistic regression can be used:
> Let the response variable have two outcomes, pass or fail. Find the probability that a student will pass depending on the number of hours he/she studied.
> Let X indicate the CGPA of a student and Y indicate admit=1 or no admit=0. Find the probability of being admitted given that the CGPA is X.
> Determine the presence or absence of certain diseases, such as cancer, based on symptoms and other medical data.
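The pass/fail example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hours and labels are made-up data, the function names predict and fit are hypothetical, and the coefficients are estimated with plain gradient ascent on the log-likelihood, which is one simple way to fit the model and not necessarily what a library would do.

```python
import math

# Made-up illustrative data: hours studied and pass (1) / fail (0) labels.
hours = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
passed = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]


def predict(b0, b1, x):
    """P(pass | hours = x) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def fit(xs, ys, learning_rate=0.01, steps=20000):
    """Estimate b0 and b1 by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        # Each gradient term is (label - predicted probability).
        grad_b0 = sum(y - predict(b0, b1, x) for x, y in zip(xs, ys))
        grad_b1 = sum((y - predict(b0, b1, x)) * x for x, y in zip(xs, ys))
        b0 += learning_rate * grad_b0
        b1 += learning_rate * grad_b1
    return b0, b1


b0, b1 = fit(hours, passed)
for h in (2, 5, 9):
    print(f"hours={h}  P(pass)={predict(b0, b1, h):.2f}")
```

The fitted b1 comes out positive on this data, so the predicted probability of passing rises with the number of hours studied while staying between 0 and 1.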