In this blog post, I will show that the decision function of Naive Bayes is the same as that of logistic regression when the observations come from an exponential family.
1. Naive Bayes with Gaussian
The basic assumption of Naive Bayes is that the features are conditionally independent given the class:

$$p(x \mid y) = \prod_{j=1}^{d} p(x_j \mid y).$$

The posterior is

$$p(y = c \mid x) = \frac{p(y = c) \prod_{j} p(x_j \mid y = c)}{\sum_{c'} p(y = c') \prod_{j} p(x_j \mid y = c')}.$$
Usually in books, $p(x_j \mid y)$ is a multinomial distribution. However, there is nothing preventing us from using other distributions, as long as the variables of different dimensions are independent given the class. For example, for a continuous variable, we can use a Gaussian:

$$p(x_j \mid y = c) = \frac{1}{\sqrt{2\pi}\,\sigma_{jc}} \exp\!\left(-\frac{(x_j - \mu_{jc})^2}{2\sigma_{jc}^2}\right).$$
The posterior is

$$p(y = c \mid x) = \frac{\pi_c \prod_j \frac{1}{\sqrt{2\pi}\,\sigma_{jc}} \exp\!\left(-\frac{(x_j - \mu_{jc})^2}{2\sigma_{jc}^2}\right)}{\sum_{c'} \pi_{c'} \prod_j \frac{1}{\sqrt{2\pi}\,\sigma_{jc'}} \exp\!\left(-\frac{(x_j - \mu_{jc'})^2}{2\sigma_{jc'}^2}\right)},$$

where $\pi_c = p(y = c)$ is the class prior.
If $\sigma_{jc} = \sigma_j$ for all $c$, then the normalizing constants $\frac{1}{\sqrt{2\pi}\,\sigma_j}$ and the quadratic terms $-\frac{x_j^2}{2\sigma_j^2}$ appear in every term of the numerator and denominator, so they are all canceled. The posterior becomes

$$p(y = c \mid x) = \frac{\exp\!\left(w_c^\top x + b_c\right)}{\sum_{c'} \exp\!\left(w_{c'}^\top x + b_{c'}\right)}, \qquad w_{cj} = \frac{\mu_{jc}}{\sigma_j^2}, \quad b_c = \log \pi_c - \sum_j \frac{\mu_{jc}^2}{2\sigma_j^2}.$$
We can see that Naive Bayes is a linear model, whose decision function is the same as that of (softmax) logistic regression, given that the variance is shared across all classes.
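As a quick numerical sanity check, the sketch below (a minimal NumPy example; the means, variances, and priors are toy values chosen for illustration) verifies that the Gaussian Naive Bayes log-posterior-odds is affine in $x$ when each dimension's variance is shared across classes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class, two-dimensional model (assumed values for illustration).
sigma = np.array([1.0, 2.0])           # sigma_j, shared by both classes
mu = np.array([[0.0, 1.0],             # mu_{j, c=0}
               [2.0, -1.0]])           # mu_{j, c=1}
prior = np.array([0.4, 0.6])           # pi_c

def log_posterior_odds(x):
    """log p(y=1|x) - log p(y=0|x) under Gaussian Naive Bayes."""
    # Per-class log-likelihood, dropping the shared 1/sqrt(2*pi*sigma^2)
    # constants, which cancel in the odds.
    log_lik = -0.5 * ((x - mu) ** 2 / sigma ** 2).sum(axis=1)
    return (np.log(prior[1]) + log_lik[1]) - (np.log(prior[0]) + log_lik[0])

# Closed-form affine coefficients predicted by the derivation above.
w = (mu[1] - mu[0]) / sigma ** 2
b = np.log(prior[1] / prior[0]) - ((mu[1]**2 - mu[0]**2) / (2 * sigma**2)).sum()

for _ in range(5):
    x = rng.normal(size=2)
    assert np.isclose(log_posterior_odds(x), w @ x + b)
```

The quadratic terms $x_j^2/(2\sigma_j^2)$ never appear in `w` or `b`: they are identical across classes and cancel in the odds, which is exactly why the shared-variance assumption is needed.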
2. Naive Bayes with Exponential Family
This is a sub-family of the general exponential family introduced in a previous blog post. It is also widely used, for example in generalized linear models. This family of distributions has some special properties. First, the variables of all dimensions are independent, which meets the requirement of Naive Bayes. Second, the sufficient statistic is only the first-order moment, i.e. $T(x_j) = x_j$:

$$p(x \mid \eta) = \prod_j h(x_j)\exp\!\left(\eta_j x_j - A(\eta_j)\right). \tag{1}$$
The maximum likelihood estimate of $\eta_j$ is simple: setting the derivative of the log-likelihood to zero gives

$$A'(\hat\eta_j) = \frac{1}{N}\sum_{i=1}^{N} x_{ij},$$

i.e. the mean parameter $A'(\hat\eta_j)$ is just the sample mean.
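A small sketch of this fact, using the Poisson distribution as a concrete member of family (1) (for Poisson, $\eta = \log\lambda$ and $A(\eta) = e^\eta$, so the mean parameter is $A'(\eta) = \lambda$; the rate 3.5 and sample size are arbitrary toy choices):

```python
import numpy as np

rng = np.random.default_rng(1)

# Draw a large Poisson sample; the true mean parameter is lambda = 3.5.
samples = rng.poisson(lam=3.5, size=100_000)

# The MLE matches the mean parameter A'(eta_hat) to the sample mean.
mean_param_mle = samples.mean()        # A'(eta_hat) = lambda_hat
eta_hat = np.log(mean_param_mle)       # natural-parameter MLE, eta = log(lambda)

assert abs(mean_param_mle - 3.5) < 0.05   # close to the true mean
```

No iterative optimization is needed: because the log-likelihood of (1) is concave in $\eta$ and linear in the data, moment matching is the entire fitting procedure.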
The posterior again is

$$p(y = c \mid x) = \frac{\pi_c \prod_j h(x_j)\exp\!\left(\eta_{jc} x_j - A(\eta_{jc})\right)}{\sum_{c'} \pi_{c'} \prod_j h(x_j)\exp\!\left(\eta_{jc'} x_j - A(\eta_{jc'})\right)} = \frac{\exp\!\left(\eta_c^\top x + b_c\right)}{\sum_{c'} \exp\!\left(\eta_{c'}^\top x + b_{c'}\right)},$$

where $b_c = \log \pi_c - \sum_j A(\eta_{jc})$ and the factors $h(x_j)$ cancel because they do not depend on the class.
From these results, we can see that, for Naive Bayes, the decision boundary is always linear, as long as the conditional distribution of the observations belongs to the exponential family (1).
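To illustrate the general claim with another member of family (1), the sketch below (toy parameters, assumed for illustration) uses Bernoulli features, where $\eta_{jc} = \log\frac{p_{jc}}{1 - p_{jc}}$, and checks that the Naive Bayes log-odds is affine in $x$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy Bernoulli parameters p_{jc} for 3 binary features and 2 classes.
p = np.array([[0.2, 0.7, 0.5],     # p_{j, c=0}
              [0.6, 0.3, 0.9]])    # p_{j, c=1}
prior = np.array([0.5, 0.5])

def log_odds(x):
    """log p(y=1|x) - log p(y=0|x) under Bernoulli Naive Bayes."""
    ll = (x * np.log(p) + (1 - x) * np.log(1 - p)).sum(axis=1)
    return np.log(prior[1] / prior[0]) + ll[1] - ll[0]

# Affine coefficients: w_j = eta_{j1} - eta_{j0} (difference of log-odds
# parameters), b = log prior ratio minus the difference of log-partition terms.
w = np.log(p[1] / p[0]) - np.log((1 - p[1]) / (1 - p[0]))
b = np.log(prior[1] / prior[0]) + (np.log(1 - p[1]) - np.log(1 - p[0])).sum()

for _ in range(5):
    x = rng.integers(0, 2, size=3).astype(float)
    assert np.isclose(log_odds(x), w @ x + b)
```

The same check would pass for Gaussian (shared variance), Poisson, or exponential features: any per-dimension member of (1) yields a log-odds of the form $w^\top x + b$, which is precisely the hypothesis class of logistic regression.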