IIRC: Incremental Implicitly-Refined Classification

Figure 1: Humans incrementally accumulate knowledge over time. They encounter new entities and discover new information about existing entities. In this process, they associate new labels with entities and refine or update their existing labels, while ensuring the accumulated knowledge is coherent.
Figure 2: IIRC setup showing how the model expands its knowledge and associates and re-associates labels over time. The top-right label is the one the model sees during training, and the bottom label (annotated as “Target”) is the one the model should predict during evaluation. The bottom-right panel for each task shows the set of classes the model is evaluated on, and the dashed lines separate different tasks.
Figure 3: Average performance on IIRC-CIFAR. Mean and standard deviation are reported.
Figure 4: Confusion matrix after training on tasks from IIRC-CIFAR. The y-axis is the groundtruth, while the x-axis is the network prediction. The y-axis lists old tasks’ classes at the top and new tasks’ classes at the bottom. The lower triangular matrix shows the percentage of older labels predicted for newly introduced classes, while the upper triangular matrix shows the percentage of newer labels predicted for older classes. Subfigure (a) is the groundtruth; subfigures (b)–(d) show the confusion matrices of different incremental learning approaches.
  • The paper’s introduction is easy to read, and the problem is well motivated. Given the complexity of the IIRC setup, the authors released the IIRC benchmark as a Python package.
  • To study the association between concepts (labels), the paper manually inspects the groundtruth confusion matrix against the learned confusion matrix. Yet, I wish the paper had proposed a metric to quantify the association between the learned concepts.
  • IIRC casts incremental/continual learning as a multi-label classification problem. The total number of classes is known and fixed; a sigmoid is applied to each class logit instead of a softmax over all logits. This naturally raises the question: how can new classes be added? It is a legitimate question for incremental/continual learning, where knowledge accumulates. I think IIRC should have learned a feature embedding instead of class logits. With an embedding, adding new classes is straightforward, since the network’s output dimension remains the same. Furthermore, it is natural to cluster features (classes) in the embedding space, and such clustering could serve as a quantitative metric to measure the association between different concepts.
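To make the wish in the second point concrete, here is one hypothetical way such an association metric could look. This is my own sketch, not something from the paper: it row-normalizes the learned and groundtruth confusion matrices and correlates their off-diagonal entries, so a high score means the model confuses classes exactly where the label hierarchy says they are related.

```python
import numpy as np

def association_score(learned_cm, groundtruth_cm):
    """Hypothetical metric: Pearson correlation between the off-diagonal
    entries of the row-normalized learned and groundtruth confusion
    matrices. A score near 1.0 means the model's confusions align with
    the label hierarchy; near 0.0 means they are unrelated."""
    def normalize(cm):
        cm = cm.astype(float)
        return cm / cm.sum(axis=1, keepdims=True)
    a, b = normalize(learned_cm), normalize(groundtruth_cm)
    mask = ~np.eye(a.shape[0], dtype=bool)  # compare off-diagonal cells only
    return np.corrcoef(a[mask], b[mask])[0, 1]

# Toy example: classes 0 and 1 are related (e.g. 'dog' and 'Dalmatian'),
# class 2 is unrelated; a perfectly aligned model scores 1.0.
gt = np.array([[8, 2, 0],
               [2, 8, 0],
               [0, 0, 10]])
print(association_score(gt, gt))  # 1.0
```

The off-diagonal restriction matters: the diagonal (correct predictions) dominates both matrices and would inflate the correlation.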
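The sigmoid-versus-softmax distinction in the last point is the crux of the multi-label formulation. A minimal numpy sketch (the class names and logit values are illustrative, not from the paper) shows why independent sigmoids let one image carry both a superclass and a subclass label, while a softmax forces exactly one:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def multilabel_predict(logits, threshold=0.5):
    """IIRC-style prediction: one independent sigmoid per class, so an
    image can be assigned both a superclass (e.g. 'dog') and a subclass
    (e.g. 'Dalmatian') at the same time."""
    return sigmoid(logits) > threshold

def softmax_predict(logits):
    """Standard single-label prediction: softmax picks exactly one class."""
    z = logits - logits.max()  # shift for numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    return int(probs.argmax())

# Hypothetical logits over four classes: [dog, Dalmatian, cat, whale]
logits = np.array([3.0, 2.5, -4.0, -6.0])
print(multilabel_predict(logits))  # [ True  True False False] -> 'dog' AND 'Dalmatian'
print(softmax_predict(logits))     # 0 -> only 'dog'
```

The cost of this design is the fixed output dimension the bullet point criticizes: every class needs its own logit up front.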
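The embedding alternative I suggest above could work as a nearest-class-mean classifier. This sketch is my own illustration of the idea, not an IIRC implementation: the network maps images to fixed-dimensional embeddings, and adding a new class only adds a new class mean, never a new output unit.

```python
import numpy as np

def class_means(embeddings, labels, num_classes):
    """Mean embedding per class. Adding a new class appends one more
    mean vector; the network's output dimension never changes."""
    return np.stack([embeddings[labels == c].mean(axis=0)
                     for c in range(num_classes)])

def nearest_mean_predict(query, means):
    """Assign the query embedding to the class with the closest mean
    (Euclidean distance)."""
    dists = np.linalg.norm(means - query, axis=1)
    return int(dists.argmin())

# Toy 2-D embeddings for two classes (stand-ins for network features).
emb = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels = np.array([0, 0, 1, 1])
means = class_means(emb, labels, 2)
print(nearest_mean_predict(np.array([1.0, 1.0]), means))  # 0
```

Distances between class means in this space would also directly quantify concept association, addressing both criticisms at once.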

I write reviews on computer vision papers. Writing tips are welcomed.

Ahmed Taha