Deep learning stands on two pillars: GPUs and large datasets. Thus, deep networks suffer when trained from scratch on small datasets; this is the well-known overfitting problem. This paper [1] proposes knowledge evolution to reduce both overfitting and the burden of data collection. In the paper, knowledge evolution is supported…
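Since the teaser cuts off before the mechanism, here is a minimal sketch of the generation loop as I understand it from [1]; the mask-based weight split, the split ratio, and the reinitialization scheme below are my assumptions, not code from the paper.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: a random binary mask splits the weights into a
# "fit" hypothesis (kept across generations) and a "reset" hypothesis
# (reinitialized every generation). keep_ratio is an assumed value.
def split_masks(model, keep_ratio=0.5):
    return {name: (torch.rand_like(p) < keep_ratio).float()
            for name, p in model.named_parameters()}

def evolve(model, train_fn, num_generations=3):
    masks = split_masks(model)
    for _ in range(num_generations):
        train_fn(model)  # ordinary training on the small dataset
        # Keep the fit hypothesis; reinitialize the reset hypothesis.
        with torch.no_grad():
            for name, p in model.named_parameters():
                fresh = torch.empty_like(p)
                nn.init.normal_(fresh, std=0.01)
                p.mul_(masks[name]).add_(fresh * (1 - masks[name]))
    return model
```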


This paper [1] leverages two simple ideas to solve an important problem: batch normalization breaks down when the batch size b is small, e.g., b=2. …
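To see why small batches hurt batch normalization, here is a quick numerical illustration (mine, not the paper's): with b=2, the per-batch mean is a very noisy estimate of the population statistics that the layer is supposed to normalize with.

```python
import torch

# Illustration: batch-norm statistics at b=2 are high-variance
# estimates of the population statistics.
torch.manual_seed(0)
population = torch.randn(10_000)  # true mean ~0, std ~1

for b in (2, 256):
    means = torch.stack([population[torch.randperm(10_000)[:b]].mean()
                         for _ in range(1_000)])
    print(f"b={b:>3}: std of batch means = {means.std():.3f}")
# b=2 gives far noisier statistics, so the normalized activations
# (and hence training) fluctuate wildly from step to step.
```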


Every software engineer has used a debugger to debug their code. Yet, a neural network debugger… That’s news! This paper [1] proposes a debugger to debug and visualize attention in convolutional neural networks (CNNs).

Before describing the CNN debugger, I want to highlight a few attributes of program debuggers (e.g., gdb)…


Thanks, Kubra, for sharing your thoughts. I have never used Deep Image Prior. It is a cool idea, yet its training cost is a barrier. As you mentioned, the total time spent on training is optimized; that's correct. Yet, this training cost is incurred for every image, i.e., every image is a training sample; there is no separate inference step.

Furthermore, I had a colleague who used a deep prior for RGB-D images [1]. The training cost becomes more severe as the number of dimensions increases.

So, yes, for 2D images the cost might be manageable, but I would proceed with caution for higher dimensions (e.g., videos). Thanks again for sharing your input :)
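To make the cost argument concrete, here is the gist of the Deep Image Prior loop (a sketch with a toy network, not the code of [1]): the optimization below restarts from scratch for every single image, which is exactly the per-image cost I mean.

```python
import torch
import torch.nn as nn

# Deep Image Prior sketch: for EVERY image, a freshly initialized
# network is fit from scratch; nothing is amortized across images.
# The tiny architecture and step count here are assumed, not DIP's.
def restore(corrupted, mask, steps=3000):
    # corrupted: (1, 3, H, W); mask: 1 at known pixels, 0 elsewhere.
    net = nn.Sequential(
        nn.Conv2d(32, 64, 3, padding=1),
        nn.ReLU(),
        nn.Conv2d(64, 3, 3, padding=1),
    )
    z = torch.randn(1, 32, *corrupted.shape[-2:])  # fixed random input
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(steps):  # this loop IS the per-image "inference"
        opt.zero_grad()
        loss = ((net(z) - corrupted) ** 2 * mask).mean()
        loss.backward()
        opt.step()
    return net(z).detach()
```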

[1] Depth Completion Using a View-constrained Deep Prior


The metric learning literature assumes binary labels, where samples belong to either the same class or different classes. While this binary perspective has motivated fundamental ranking losses (e.g., contrastive and triplet loss), it has reached a point of stagnation [2]. Thus, one novel direction for metric learning is continuous (non-binary) similarity…
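For reference, the binary assumption is baked into the losses themselves. Here is a minimal triplet loss (the standard formulation, not code from [2]): the labels only say "same class" or "different class", so similarity is binary by design.

```python
import torch
import torch.nn.functional as F

# Standard triplet loss: pull the anchor toward the positive and
# away from the negative by at least `margin`. Similarity is binary:
# a sample is either a positive or a negative, nothing in between.
def triplet_loss(anchor, positive, negative, margin=0.2):
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```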


This paper [1] quantifies the financial and environmental costs (CO2 emissions) of training a deep network. It also draws attention to the inequality between academia and industry in terms of computational resources. The paper uses NLP architectures to present its case. …


This paper [1] proposes an unsupervised framework for hard training-example mining. The proposed framework has two phases. Given a collection of unlabelled images, the first phase identifies positive and negative image pairs. Then, the second phase leverages these pairs to fine-tune a pretrained network.
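Before the phase-by-phase details below, this is roughly the shape of the pipeline (my sketch; the cosine-similarity thresholds are placeholders, not the paper's actual mining criterion).

```python
import torch
import torch.nn.functional as F

# Sketch of the two-phase framework; the threshold-based mining rule
# is hypothetical, standing in for the paper's phase-one criterion.
def mine_pairs(embeddings, pos_thresh=0.9, neg_thresh=0.3):
    # Phase 1: guess positive/negative pairs from a pretrained embedding.
    e = F.normalize(embeddings, dim=1)
    sim = e @ e.T
    sim.fill_diagonal_(0.5)  # exclude trivial self-pairs
    return (sim > pos_thresh).nonzero(), (sim < neg_thresh).nonzero()

def fine_tune_step(net, images, optimizer, margin=0.2):
    # Phase 2: contrastive fine-tuning on the mined pairs.
    emb = net(images)
    pos, neg = mine_pairs(emb.detach())
    d = torch.cdist(emb, emb)
    loss = d[pos[:, 0], pos[:, 1]].mean() \
         + F.relu(margin - d[neg[:, 0], neg[:, 1]]).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```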

Phase #1:

The first phase leverages…


This paper [1] proposes a tool, L2-CAF, to visualize attention in convolutional neural networks. L2-CAF is a generic visualization tool: it can do everything CAM [3] and Grad-CAM [2] can do, but not vice versa.

Given a pre-trained CNN, an input x generates an output NT(x) — this…
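The teaser cuts off here, but the core idea, as I understand the paper, is a constrained optimization: find a unit-L2-norm spatial filter over the last feature map such that the filtered features still reproduce the original network output. The sketch below is mine; the shapes, step count, and `head` callable are assumptions.

```python
import torch

# My sketch of the L2-CAF idea: optimize a unit-norm attention filter
# f over the last conv feature map so that masking by f preserves the
# original network output NT(x).
def l2_caf(head, feat, steps=200, lr=0.1):
    # feat: last conv feature map (1, C, H, W);
    # head: the layers after it (e.g., pooling + classifier).
    target = head(feat).detach()  # original output NT(x)
    f = torch.rand(1, 1, *feat.shape[-2:], requires_grad=True)
    opt = torch.optim.Adam([f], lr=lr)
    for _ in range(steps):
        f_unit = f / f.norm()  # enforce the unit-L2-norm constraint
        loss = (head(feat * f_unit) - target).pow(2).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (f / f.norm()).detach().squeeze()  # (H, W) attention map
```

For a ResNet-style classifier, `head` would be the average pooling plus the fully connected layer; the returned map highlights where the network attends.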


Metric learning trains a feature embedding that quantifies the similarity between objects and enables retrieval. Metric learning losses can be categorized into two classes: pair-based and proxy-based. The next figure highlights the difference between the two classes. Pair-based losses pull similar samples together while pushing different samples apart (data-to-data relations)…
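To contrast with the pair-based family, here is a minimal proxy-based loss in the style of Proxy-NCA (my sketch, a standard formulation rather than code from the paper): each sample is compared against learned per-class proxies (data-to-proxy relations) instead of against other samples.

```python
import torch
import torch.nn.functional as F

# Minimal proxy-based loss (Proxy-NCA style): samples interact with
# learned class proxies, never directly with each other.
class ProxyLoss(torch.nn.Module):
    def __init__(self, num_classes, dim):
        super().__init__()
        self.proxies = torch.nn.Parameter(torch.randn(num_classes, dim))

    def forward(self, embeddings, labels):
        # Negative squared distance to each proxy acts as a logit;
        # cross-entropy pulls each sample toward its class proxy.
        d = torch.cdist(F.normalize(embeddings, dim=1),
                        F.normalize(self.proxies, dim=1)) ** 2
        return F.cross_entropy(-d, labels)
```

Because the proxies are shared across the whole dataset, a batch of B samples yields B-to-num_classes comparisons instead of the O(B^2) comparisons of pair-based losses.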


This (G)old paper [1] tackles an interesting question: Why Does Unsupervised Pre-training Help Deep Learning? The authors support their conclusions with a ton of experiments. Yet, the findings contradict a common belief about unsupervised learning. That’s why I have mixed feelings about this paper. …

Ahmed Taha

I write reviews on computer vision papers. Writing tips are welcome.
