CONF Grandvalet_NIPS_2008/IDIAP
Support Vector Machines with a Reject Option
Grandvalet, Yves; Rakotomamonjy, Alain; Keshet, Joseph; Canu, Stéphane
EXTERNAL: https://publications.idiap.ch/attachments/papers/2009/Grandvalet_NIPS_2008.pdf
PUBLIC
Related documents: https://publications.idiap.ch/index.php/publications/showcite/Grandvalet_Idiap-RR-01-2009
In: Proceedings of the 22nd Annual Conference on Neural Information Processing Systems, 2008

Abstract: We consider the problem of binary classification where the classifier may abstain instead of classifying each observation. The Bayes decision rule for this setup, known as Chow’s rule, is defined by two thresholds on posterior probabilities. From simple desiderata, namely the consistency and the sparsity of the classifier, we derive the double hinge loss function, which focuses on estimating conditional probabilities only in the vicinity of the threshold points of the optimal decision rule. We show that, for suitable kernel machines, our approach is universally consistent. We cast the problem of minimizing the double hinge loss as a quadratic program akin to the standard SVM optimization problem, and propose an active set method to solve it efficiently. We finally provide preliminary experimental results illustrating the interest of our constructive approach to devising loss functions.

REPORT Grandvalet_Idiap-RR-01-2009/IDIAP
Support Vector Machines with a Reject Option
Grandvalet, Yves; Keshet, Joseph; Rakotomamonjy, Alain; Canu, Stéphane
EXTERNAL: https://publications.idiap.ch/attachments/reports/2008/Grandvalet_Idiap-RR-01-2009.pdf
PUBLIC
Idiap-RR-01-2009, Idiap, January 2009

Abstract: same as the conference version above.
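For context, the "two thresholds on posterior probabilities" mentioned in the abstract are the classical Chow thresholds. A standard statement of the rule (the symbols \eta and d below are our notation, not taken from this record): writing \eta(x) = P(Y = +1 \mid X = x) for the posterior probability and d \in (0, 1/2) for the cost of rejection (correct decisions cost 0, errors cost 1), Chow's rule is

\[
r^*(x) =
\begin{cases}
+1 & \text{if } \eta(x) > 1 - d, \\
-1 & \text{if } \eta(x) < d, \\
\text{reject} & \text{otherwise,}
\end{cases}
\]

so one abstains exactly when the fixed rejection cost d is smaller than the error probability of either hard decision, i.e. when d < \eta(x) < 1 - d.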
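As a minimal illustrative sketch of this decision rule (not the paper's double-hinge SVM: here the posterior estimates come from an arbitrary off-the-shelf probabilistic model, scikit-learn's LogisticRegression, chosen only for convenience):

import numpy as np
from sklearn.linear_model import LogisticRegression

def chow_decision(p_pos, d):
    """Chow's rule: return +1, -1, or 0 (reject), given an estimate
    of P(Y=+1|x) and a rejection cost d in (0, 1/2)."""
    if p_pos > 1.0 - d:
        return 1
    if p_pos < d:
        return -1
    return 0  # abstain: rejecting (cost d) beats either hard decision

# Toy data; any classifier producing probability estimates would do here.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
clf = LogisticRegression().fit(X, y)
p = clf.predict_proba(X)[:, list(clf.classes_).index(1)]  # P(Y=+1|x) column
decisions = np.array([chow_decision(pi, d=0.2) for pi in p])

The paper's contribution, by contrast, is a loss (the double hinge) whose minimizer supports this decision rule directly, estimating conditional probabilities only near the thresholds d and 1 - d rather than everywhere, which is what yields sparse kernel expansions.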