Mining on Manifolds: Metric Learning without Labels

Figure 1: A pretrained network embeds images into a manifold (feature space).
Figure 2: Identify hard positive and negative images using the Euclidean and manifold nearest neighbors.
Table 1: Recall@k and NMI on CUB-200-2011. All methods except Ours and cyclic match [30] use ground-truth labels during training.
Figure 3: Sample CUB-200-2011 anchor images (x^r), positive images from the proposed method (P^+(x^r)) and the baseline (NN^e_3(x^r)), and negative images from the proposed method (P^−(x^r)) and the baseline (X \ NN^e_3(x^r)). The baseline uses Euclidean nearest neighbors as positives and non-neighbors as negatives; NN^e_3(x^r) denotes the three Euclidean nearest neighbors of the anchor image x^r. Ground-truth positives (negatives) are framed in green (red). Labels are used only for qualitative evaluation, not during training.
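To make the pools in Figure 3 concrete, here is a minimal sketch of how hard examples could be picked by comparing the two neighbor sets. It assumes, based on my reading of the paper, that hard positives are manifold neighbors that are not Euclidean neighbors, and hard negatives are Euclidean neighbors that are not manifold neighbors. The function names `euclidean_knn` and `mine_pools`, the toy features, and the hand-written manifold neighbor list are all hypothetical; in practice the manifold neighbors would come from a graph-based similarity, which this sketch does not compute.

```python
import math


def euclidean_knn(features, anchor, k):
    """Return indices of the k Euclidean nearest neighbors of features[anchor].

    The anchor itself is excluded from its own neighbor list.
    """
    candidates = [i for i in range(len(features)) if i != anchor]
    candidates.sort(key=lambda i: math.dist(features[anchor], features[i]))
    return candidates[:k]


def mine_pools(euclid_nn, manifold_nn):
    """Split neighbors into hard positive and hard negative pools.

    Assumption: hard positives are close on the manifold but not in
    Euclidean space; hard negatives are close in Euclidean space but
    not on the manifold.
    """
    euclid, manifold = set(euclid_nn), set(manifold_nn)
    positives = [i for i in manifold_nn if i not in euclid]
    negatives = [i for i in euclid_nn if i not in manifold]
    return positives, negatives


# Toy 2-D features; index 0 is the anchor.
features = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0), (0.5, 0.5)]
euclid_nn = euclidean_knn(features, anchor=0, k=2)

# Hypothetical manifold neighbors, written by hand for illustration only.
manifold_nn = [4, 2]

positives, negatives = mine_pools(euclid_nn, manifold_nn)
print("Euclidean neighbors:", euclid_nn)
print("Hard positives:", positives)
print("Hard negatives:", negatives)
```

Images that appear in both neighbor lists are left out of both pools in this sketch, since the two similarity measures already agree on them.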
  1. [Strength] The paper is well written and tackles an interesting problem: training a network on an unlabeled image collection. The authors released their implementation on GitHub.
  2. [Strength] The paper proposes a specific approach to identify useful anchor images: selecting them by the stationary probability distribution (A). The authors provide an ablation study that compares this approach against randomly sampled anchors. The stationary probability distribution (A) approach significantly outperforms random sampling, as shown in the next table.
Impact of the choice of anchors and of the positive and negative pools on Recall@1 on CUB-200-2011 and mAP on Oxford5k. Higher Recall/mAP is better. On CUB, all images are used as anchors, while on Oxford5k anchors are selected either at random or by the stationary probability distribution (A) approach. The positive and negative pools are formed either by the baseline with Euclidean nearest neighbors (NN^e) or by the proposed hard-mining selection framework (P^+ and P^−).
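For readers unfamiliar with the stationary distribution, the sketch below shows one way to compute it for a random walk on a small graph with power iteration, and then rank nodes by it. This is an illustration under my own assumptions, not the authors' implementation: the adjacency matrix is a hand-made toy graph, the names `stationary_distribution` and `select_anchors` are hypothetical, and picking the top-scoring nodes is a simplification of whatever selection rule the paper actually applies.

```python
def stationary_distribution(adjacency, max_iters=1000, tol=1e-12):
    """Approximate the stationary distribution of a random walk by power iteration.

    `adjacency` is a square list of lists of non-negative edge weights.
    Every node is assumed to have at least one edge.
    """
    n = len(adjacency)
    # Row-normalize the adjacency matrix into a transition matrix.
    transition = []
    for row in adjacency:
        total = sum(row)
        if total == 0:
            raise ValueError("every node needs at least one edge")
        transition.append([w / total for w in row])

    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iters):
        nxt = [sum(pi[i] * transition[i][j] for i in range(n)) for j in range(n)]
        converged = max(abs(a - b) for a, b in zip(nxt, pi)) < tol
        pi = nxt
        if converged:
            break
    return pi


def select_anchors(pi, num_anchors):
    """Return the indices of the nodes with the highest stationary probability."""
    ranked = sorted(range(len(pi)), key=lambda i: pi[i], reverse=True)
    return ranked[:num_anchors]


# Toy undirected graph: a triangle (0, 1, 2) plus node 3 attached to node 2.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
pi = stationary_distribution(adjacency)
print([round(p, 3) for p in pi])
print(select_anchors(pi, 1))
```

On an undirected, connected graph like this one, the stationary probability of a node is proportional to its degree, so the best-connected node (node 2 here) ranks first. The intuition for anchors is the same: images that sit in dense regions of the graph get more probability mass.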

Ahmed Taha


I write reviews on computer vision papers. Writing tips are welcomed.