Triplet-Center Loss for Multi-View 3D Object Retrieval

The loss function has two terms: L_softmax, which optimizes a standard supervised classification objective, and L_tc, a novel triplet-center loss that enforces a better embedding.
The triplet loss promotes an embedding space in which the distance between the anchor A and the positive P is smaller than the distance between A and the negative N by at least a margin m.
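The margin constraint can be illustrated with a minimal NumPy sketch. It assumes squared Euclidean distances and a hinge averaged over the batch; the function name `triplet_loss` and the default margin are illustrative choices, not taken from the paper.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss, averaged over a batch of (A, P, N) embeddings.

    Each argument is an array of shape (batch, dim). The squared
    Euclidean distance is an assumption made for this sketch.
    """
    anchor = np.asarray(anchor, dtype=float)
    positive = np.asarray(positive, dtype=float)
    negative = np.asarray(negative, dtype=float)
    d_ap = np.sum((anchor - positive) ** 2, axis=1)  # distance A to P
    d_an = np.sum((anchor - negative) ** 2, axis=1)  # distance A to N
    # The loss is zero once d_an exceeds d_ap by at least the margin.
    return np.mean(np.maximum(d_ap - d_an + margin, 0.0))
```

A triplet contributes nothing once the negative is at least m farther from the anchor than the positive; otherwise it is penalized by the size of the violation.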
Softmax + triplet-center loss boosts classification performance.
t-SNE visualization of the embedding space under different loss functions. The softmax + triplet-center loss (e) creates compact clusters that are distant from one another.
Quantitative retrieval examples
  • I have first-hand experience working on a similar retrieval problem. This GitHub repository provides a TensorFlow implementation of the naive center loss. It should be easy to modify it into TCL.
  • In this paper, one assumption made for both the center loss and the proposed TCL is that every class follows a Gaussian distribution, that is, it has a single mode. This assumption is weak in complex problems where objects of the same class can follow a multi-modal distribution. In such cases, the current formulation might hurt.
  • For 3D multi-view object recognition, the softmax classification term is justified. Yet this is not always the case. If the softmax loss cannot be used, for instance because the main focus is object retrieval rather than recognition, the suggested TCL still suffers from the degenerate solution where all points are embedded into the same point. This is explicitly mentioned in the paper. I want to highlight it again because the paper also states that TCL is “very robust” to its hyper-parameter \lambda. I find this a bit misleading. If the hyper-parameter is large, the TCL will dominate the softmax loss, and this will lead to a degenerate solution.
  • I enjoyed reading the paper. The subject is well presented.
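
The notes above mention that a center-loss implementation should be easy to turn into TCL, and that \lambda balances the two terms. A minimal NumPy sketch of that idea follows, under my reading of the formulation: each sample is pulled toward its own class center and pushed at least a margin m away from the nearest other center, with a halved squared Euclidean distance. The function names, the default margin, and the default `lam` are hypothetical placeholders, and the centers are passed in as plain arrays rather than learned.

```python
import numpy as np

def triplet_center_loss(features, labels, centers, margin=5.0):
    """Sum over samples of max(d(own center) + margin - d(nearest other center), 0).

    features: (N, D), labels: (N,) ints, centers: (C, D).
    The halved squared Euclidean distance is an assumption of this sketch.
    """
    features = np.asarray(features, dtype=float)
    centers = np.asarray(centers, dtype=float)
    labels = np.asarray(labels, dtype=int)
    diff = features[:, None, :] - centers[None, :, :]
    dist = 0.5 * np.sum(diff ** 2, axis=2)       # (N, C) sample-to-center distances
    idx = np.arange(labels.shape[0])
    d_pos = dist[idx, labels]                    # distance to the sample's own center
    masked = dist.copy()
    masked[idx, labels] = np.inf                 # exclude the own-class center
    d_neg = masked.min(axis=1)                   # distance to the nearest other center
    return np.sum(np.maximum(d_pos + margin - d_neg, 0.0))

def softmax_cross_entropy(logits, labels):
    """Summed softmax cross-entropy, computed with a stable log-softmax."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=int)
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -np.sum(log_probs[np.arange(labels.shape[0]), labels])

def total_loss(logits, features, labels, centers, lam=0.1, margin=5.0):
    """Softmax term plus lam times the triplet-center term."""
    return (softmax_cross_entropy(logits, labels)
            + lam * triplet_center_loss(features, labels, centers, margin))
```

In a real training loop the centers would be trainable parameters updated together with the network; here they are fixed inputs so that the arithmetic is easy to check by hand. Raising `lam` scales the triplet-center term relative to the softmax term, which is the trade-off discussed in the note on \lambda.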



