Ahmed Taha

381 Followers

I write reviews on computer vision papers.

Pinned

L2-CAF: A Neural Network Debugger

Every software engineer has used a debugger to debug their code. Yet, a neural network debugger… That’s news! This paper [1] proposes a debugger to debug and visualize attention in convolutional neural networks (CNNs). Before describing the CNN debugger, I want to highlight a few attributes of program debuggers (e.g., gdb)…

Machine Learning

5 min read

4 days ago

Masked Autoencoders Are Scalable Vision Learners

Annotated data is a vital pillar of deep learning. Yet, annotated data is rare in certain applications (e.g., medical applications and robotics). To reduce the need for annotations, self-supervised learning aims to pre-train deep networks on unannotated data to learn useful representations. Different self-supervised learning approaches propose different objectives to train…

Deep Learning

7 min read

Feb 14

Rethinking Attention with Performers — Part II & Final

This article’s objective is to summarize the Performers [1] paper. The article highlights key details and adds some personal comments at the end. A previous article presents a hand-wavy understanding of Performers using a hashing analogy. Vanilla Transformers leverage self-attention layers defined as follows; this formula has quadratic space complexity…
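
For reference, the vanilla self-attention the excerpt refers to is the standard formulation (reproduced here for convenience, not quoted from the article):

Att(Q, K, V) = softmax(Q Kᵀ / √d) V,   where Q, K, V ∈ ℝ^(L×d)

Materializing the L×L matrix Q Kᵀ is what makes the space complexity quadratic in the sequence length L.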

Transformers

7 min read

Oct 11, 2022

Rethinking Attention with Performers — Part I

This article’s objective is to present a hand-wavy understanding of how Performers [1] work. Transformers dominate the deep-learning literature in 2022. Unfortunately, Transformers suffer from quadratic complexity in the self-attention layer. This has hindered Transformers on long input signals, i.e., large sequence length L. Large sequences are not critical in NLP applications since…

Machine Learning

7 min read

May 16, 2022

Understanding the Effective Receptive Field in Deep Convolutional Neural Networks

In deep networks, a receptive field — or field of view — is the region in the input space that affects the features of a particular layer as shown in Fig.1. The receptive field is important for understanding and diagnosing a network’s performance. …
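
As a rough illustration (not taken from the article), the theoretical receptive field of stacked convolutions follows a simple recurrence over kernel sizes and strides. The snippet below is a minimal sketch; the layer stack and the function name receptive_field are made up for the example.

# Minimal sketch: theoretical receptive field of a stack of conv/pool layers.
def receptive_field(layers):
    rf, jump = 1, 1                      # start from a single input pixel, unit step in input space
    for kernel, stride in layers:
        rf += (kernel - 1) * jump        # extra input pixels this layer can see
        jump *= stride                   # spacing between adjacent outputs, measured in input pixels
    return rf

# Hypothetical stack: three 3x3 convs (stride 1) followed by a 2x2 pool (stride 2)
print(receptive_field([(3, 1), (3, 1), (3, 1), (2, 2)]))  # -> 8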

Machine Learning

5 min read

Apr 4, 2022

Understanding Transfer Learning for Medical Imaging

Transfer learning (a.k.a. ImageNet pre-training) is a common practice in deep learning where a pre-trained network is fine-tuned on a new dataset/task. This practice is implicitly justified by feature reuse, where features learned from ImageNet are beneficial to other datasets/tasks. This paper [1] evaluates this justification on medical image datasets. The…
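
As a generic illustration of the practice described above (an assumed setup, not the paper's protocol), fine-tuning usually amounts to loading ImageNet weights and swapping the classification head; torchvision's resnet50 and the 5-class target task below are used only as an example.

import torch.nn as nn
from torchvision import models

# Assumed fine-tuning setup: ImageNet weights + a new task-specific head.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
num_classes = 5                                            # hypothetical target task
model.fc = nn.Linear(model.fc.in_features, num_classes)    # replace the 1000-way ImageNet classifier
# Train as usual on the new dataset; optionally freeze early layers
# if the goal is to rely purely on feature reuse.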

Deep Learning

6 min read

Jan 24, 2022

Sharpness-Aware Minimization for Efficiently Improving Generalization

For training a deep network, picking the right optimizer has become an important design choice. Standard optimizers (e.g., SGD and Adam) seek a minimum on the loss curve. This minimum is sought without regard for the curvature, i.e., the second derivative of the loss curve. The curvature denotes the…
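
For context, the paper's sharpness-aware objective replaces plain minimization of the loss L(w) with a min-max problem over a small neighborhood of radius ρ around the weights:

min_w  max_{‖ε‖₂ ≤ ρ}  L(w + ε)

so flat minima, whose loss stays low under small weight perturbations, are favored over sharp ones.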

Optimization

4 min read

Dec 27, 2021

Feature Embedding Regularizers: SVMax & VICReg

What is more important, a deep network’s weights or its activations? Obviously, we can derive the network’s activations from its weights. Yet, deep networks are non-linear embedding functions; this non-linear embedding is what we actually want. On top of this embedding, we either slap a linear classifier in a classification network or…

Deep Learning

7 min read

Nov 3, 2021

IIRC: Incremental Implicitly-Refined Classification

While training a deep network on multiple tasks jointly is easy, training on multiple tasks sequentially is challenging. This challenge is addressed under various names in the literature: Lifelong Learning, Incremental Learning, Continual Learning, and Never-Ending Learning. Across these formulations, the common problem is catastrophic forgetting, i.e., the network forgets older tasks. To…

Computer Vision

6 min read

Aug 10, 2021

Knowledge Evolution in Neural Networks

Deep learning stands on two pillars: GPUs and large datasets. Thus, deep networks suffer when trained from scratch on small datasets. This phenomenon is known as overfitting. This paper [1] proposes knowledge evolution to reduce both overfitting and the burden of data collection. In the paper, knowledge evolution is supported…

Knowledge Distillation

4 min read
