Triplet-Center Loss for Multi-View 3D Object Retrieval

The loss function has two terms: L_softmax optimizes a regular supervised classification problem, while L_tc is the novel triplet-center loss that enforces a better embedding.
The triplet loss promotes an embedding space where the distance between the anchor A and the positive P is smaller than the distance between A and the negative N by at least a margin m.
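Concretely, the combined objective as I read it (using my own notation, so treat the exact symbols as an approximation of the paper's) is

L = L_softmax + \lambda L_tc, with
L_tc = \sum_i \max\big( D(f_i, c_{y_i}) + m - \min_{j \neq y_i} D(f_i, c_j), 0 \big),

where f_i is the embedding of sample i, c_{y_i} the center of its class, D(\cdot, \cdot) a (squared) Euclidean distance, and m the margin.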
Softmax + triplet-center loss boosts classification performance.
t-SNE visualization of the embedding space using different loss functions. The softmax + triplet-center loss (e) creates compact clusters that are distant from one another.
Qualitative retrieval examples
  • I have first-hand experience working on a similar retrieval problem. This GitHub repository provides a TensorFlow implementation of the naive center loss. It should be easy to modify it into TCL; a rough sketch of such a modification follows this list.
  • In this paper, one assumption made for both the center loss and the proposed TCL is that each class follows a Gaussian distribution, i.e., is unimodal. This assumption is weak in complex problems where objects of the same class can follow a multi-modal distribution. In such cases, the current formulation might hurt performance.
  • For 3D multi-view object recognition, the softmax classification loss is justified. Yet this is not always the case. If the softmax loss is not used, for instance because the main focus is object retrieval rather than recognition, the suggested TCL still suffers from the degenerate solution where all points are embedded into the same point. This is explicitly mentioned in the paper. I want to highlight it again because the paper also claims that TCL is “very robust” to its hyper-parameter \lambda. I find this a bit misleading: if the hyper-parameter is large, the TCL will suppress the softmax loss, and this will lead to a degenerate solution.
  • I enjoyed reading the paper. The subject is well presented.
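To make the modification suggested in the first bullet concrete, here is a minimal TensorFlow sketch of a TCL term, assuming trainable per-class centers and a margin hyper-parameter; the function name and arguments are my own and not taken from the paper or the linked repository.

```python
import tensorflow as tf

def triplet_center_loss(features, labels, centers, margin=5.0):
    """Sketch of a triplet-center loss (TCL) term.

    features: (N, D) batch of embeddings
    labels:   (N,)   integer class ids
    centers:  (C, D) trainable class centers (one per class)
    """
    # Squared Euclidean distance from every sample to every class center: (N, C)
    diff = tf.expand_dims(features, 1) - tf.expand_dims(centers, 0)
    dists = tf.reduce_sum(tf.square(diff), axis=-1)

    num_classes = tf.shape(centers)[0]
    own_mask = tf.one_hot(labels, num_classes)  # 1 at each sample's own class

    # Distance to the sample's own class center (the "positive" center)
    pos_dist = tf.reduce_sum(dists * own_mask, axis=1)

    # Distance to the nearest center of any *other* class (the "negative" center)
    neg_dist = tf.reduce_min(dists + own_mask * 1e9, axis=1)

    # Hinge: the own-center distance should be smaller than the nearest
    # other-center distance by at least the margin
    return tf.reduce_mean(tf.nn.relu(pos_dist + margin - neg_dist))

# The total objective would then be softmax cross-entropy plus lambda * TCL.
```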
