
</datafield>

<subfield code="a">Srinivas_INTERNATIONALCONFERENCEONMACHINELEARNING_2018/IDIAP</subfield>

</datafield>

<subfield code="a">Knowledge Transfer with Jacobian Matching</subfield>

</datafield>

<subfield code="a">Srinivas, Suraj</subfield>

</datafield>

<subfield code="a">Fleuret, Francois</subfield>

</datafield>

<subfield code="u">http://publications.idiap.ch/index.php/publications/showcite/Srinivas_Idiap-RR-04-2018</subfield>

<subfield code="z">Related documents</subfield>

</datafield>

<subfield code="a">Proceedings of the International Conference on Machine Learning</subfield>

</datafield>

</datafield>

<subfield code="u">http://proceedings.mlr.press/v80/srinivas18a.html</subfield>

</datafield>

<subfield code="a">Classical distillation methods transfer representations from a “teacher” neural network to a “student” network by matching their output activations. Recent methods also match the Jacobians, or the gradient of output activations with the input. However, this involves making some ad hoc decisions, in particular, the choice of the loss function. In this paper, we first establish an equivalence between Jacobian matching and distillation with input noise, from which we derive appropriate loss functions for Jacobian matching. We then rely on this analysis to apply Jacobian matching to transfer learning by establishing equivalence of a recent transfer learning procedure to distillation. We then show experimentally on standard image datasets that Jacobian-based penalties improve distillation, robustness to noisy inputs, and transfer learning.</subfield>

</datafield>

</record>

<subfield code="a">REPORT</subfield>

</datafield>

<subfield code="a">Srinivas_Idiap-RR-04-2018/IDIAP</subfield>

</datafield>

<subfield code="a">Knowledge Transfer with Jacobian Matching</subfield>

</datafield>

<subfield code="a">Srinivas, Suraj</subfield>

</datafield>

<subfield code="a">Fleuret, Francois</subfield>

</datafield>

<subfield code="i">EXTERNAL</subfield>

<subfield code="u">http://publications.idiap.ch/attachments/reports/2018/Srinivas_Idiap-RR-04-2018.pdf</subfield>

<subfield code="x">PUBLIC</subfield>

</datafield>

<subfield code="a">Idiap-RR-04-2018</subfield>

</datafield>

<subfield code="b">Idiap</subfield>

</datafield>

<subfield code="d">March 2018</subfield>

</datafield>

<subfield code="u">https://arxiv.org/abs/1803.00443</subfield>

</datafield>

</record>

</collection>