Do the Maximum Entropy Principle and Vapnik’s Support Vector Machines Have the Same Mathematical Derivations?
Fadi Chakik*, Fadi Dornaika**,***, and Ahmad Shahin*
*LaMA Research group, Lebanese University, Tripoli, Lebanon
**IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
***Department of Computer Science and Artificial Intelligence
University of the Basque Country, San Sebastian, Spain
Abstract
This paper investigates the mathematical derivations of the Maximum Entropy Principle (MEP) and of Support Vector Machines (SVM) for data-driven classification tasks. The main contribution is the derivation of the conditions under which the solution based on the Maximum Entropy Principle takes a form similar to that given by SVM for binary classification tasks. These conditions are imposed on the form of the “observables” used by the MEP. It is well known that complex expressions for the observables make the direct use of the MEP very difficult, or even intractable. The established link reveals a very interesting property: binary classification based on the Maximum Entropy Principle can be solved using equivalent SVM-based optimizations regardless of the complexity of the observables used. Moreover, we give a geometrical interpretation of the formulation given by the MEP.
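As background for the comparison sketched in the abstract, the two standard solution forms in question can be written as follows (these are the generic textbook forms, not the paper’s specific derivation; the observables $f_i$ and kernel $K$ are placeholders):

```latex
% MEP: maximizing entropy subject to moment constraints on the
% observables f_i(x, y) yields the Gibbs (exponential-family) form
p(y \mid x) \;=\; \frac{1}{Z_\lambda(x)} \exp\!\Big( \sum_i \lambda_i f_i(x, y) \Big),
\qquad
Z_\lambda(x) \;=\; \sum_{y'} \exp\!\Big( \sum_i \lambda_i f_i(x, y') \Big).

% SVM: the dual optimization yields a kernel-expansion decision rule
% over the support vectors (x_j, y_j) with multipliers \alpha_j
g(x) \;=\; \operatorname{sign}\!\Big( \sum_j \alpha_j \, y_j \, K(x_j, x) + b \Big).
```

The paper’s stated contribution is to identify conditions on the observables $f_i$ under which the MEP classifier above coincides in form with the SVM decision rule.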
Keywords: maximum entropy principle, margin decision rules, support vector machines, binary classification, Bayesian inference, learning from examples