Ahmed Taha

Pinned

L2-CAF: A Neural Network Debugger

Every software engineer has used a debugger to debug their code. Yet, a neural network debugger… That’s news! This paper [1] proposes a debugger to debug and visualize attention in convolutional neural networks (CNNs). Before describing the CNN debugger, I want to highlight a few attributes of program debuggers (e.g., gdb)…

Machine Learning

5 min read



Jul 3

Big Transfer (BiT): General Visual Representation Learning

Pre-trained representations bring two benefits during fine-tuning: (1) improved sample efficiency, and (2) simplified hyperparameter tuning. Towards this goal, this paper [1] provides a recipe for both pre-training and fine-tuning neural networks for vision tasks. These two steps are entangled, and a good engineering recipe is essential to get the…

Deep Learning

5 min read



Jun 5

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

Standard attention suffers from quadratic complexity in the sequence length (number of tokens). To reduce this complexity, efficient attention methods have proposed sparse and/or low-rank approximations. These approximations reduce complexity to linear or near-linear with respect to the sequence length. Yet, these methods either lag in performance or achieve no…
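To make the quadratic cost concrete, here is a minimal NumPy sketch of standard (non-IO-aware) attention; it materializes the full L x L score matrix that FlashAttention avoids writing to slow memory. The function name and shapes are illustrative, not taken from the paper.

```python
import numpy as np

def standard_attention(Q, K, V):
    """Naive softmax attention. The (L, L) score matrix is what makes
    time and memory grow quadratically with the sequence length L."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # shape (L, L)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # shape (L, d)

# Doubling the number of tokens quadruples the score matrix:
# L = 2048 already means 2048 * 2048 (~4.2M) entries per head.
L, d = 2048, 64
Q = K = V = np.random.randn(L, d)
out = standard_attention(Q, K, V)
```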

Transformers

8 min read



May 9

High Resolution Images and Efficient Transformers

ResNet and ViT models achieve competitive performance, but they are not the best. For instance, DenseNets achieve superior performance to ResNets. Yet, DenseNets are less popular than ResNets. So why are ResNets and ViT models so popular in the literature? One should not attribute their success to a single factor. Yet…

High Resolution Images

5 min read



Mar 27

Masked Autoencoders Are Scalable Vision Learners

Annotated data is a vital pillar of deep learning. Yet, annotated data is rare in certain applications (e.g., medical and robotics). To reduce the number of annotations, self-supervised learning aims to pre-train deep networks on unannotated data to learn useful representations. Different self-supervised learning approaches propose different objectives to train…

Deep Learning

7 min read



Feb 14

Rethinking Attention with Performers — Part II & Final

This article’s objective is to summarize the Performers [1] paper. The article highlights key details and documents some personal comments at the end. A previous article presents a hand-wavy understanding of Performers using a hashing analogy. Vanilla Transformers leverage self-attention layers defined as follows
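For reference, the vanilla self-attention layer referred to above is usually written as follows (standard notation; the article’s exact symbols may differ), and it is this softmax over an L x L score matrix that Performers approximate with random features:

```latex
% Standard softmax self-attention (usual formulation; notation assumed, not
% quoted from the article). X is the L x d input; W_Q, W_K, W_V are learned.
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d}}\right) V,
\qquad Q = X W_Q,\quad K = X W_K,\quad V = X W_V
```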

Transformers

7 min read



Oct 11, 2022

Rethinking Attention with Performers — Part I

This article’s objective is to present a hand-wavy understanding of how Performers [1] work. Transformers dominate the deep-learning literature in 2022. Unfortunately, Transformers suffer quadratic complexity in the self-attention layer. This has hindered Transformers for long input signals, i.e., a large sequence length L. Large sequences are not critical in NLP applications since…

Machine Learning

7 min read



May 16, 2022

Understanding the Effective Receptive Field in Deep Convolutional Neural Networks

In deep networks, a receptive field — or field of view — is the region in the input…
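As a quick reference (a standard result, not quoted from the teaser), the theoretical receptive field of stacked convolutions grows with kernel sizes and strides as below; the paper’s point is that the effective receptive field actually used by the network is typically much smaller than this bound.

```latex
% Theoretical receptive field after layer l, with kernel size k_l and
% stride s_l (standard recurrence, assumed here for illustration).
r_l = r_{l-1} + (k_l - 1)\prod_{i=1}^{l-1} s_i, \qquad r_0 = 1
```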

Machine Learning

5 min read



Apr 4, 2022

Understanding Transfer Learning for Medical Imaging

Transfer learning (a.k.a. ImageNet pre-training) is a common practice in deep learning where a pre-trained network is fine-tuned on a new dataset/task. This practice is implicitly justified by feature reuse, where features learned from ImageNet are beneficial to other datasets/tasks. This paper [1] evaluates this justification on medical image datasets. The…

Deep Learning

6 min read



Jan 24, 2022

Sharpness-Aware Minimization for Efficiently Improving Generalization

For training a deep network, picking the right optimizer has become an important design choice. Standard optimizers (e.g., SGD, Adam, etc.) seek a minimum on the loss curve. This minimum is sought without regard for the curvature, i.e., the second derivative of the loss curve. The curvature denotes the…
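In compact form, the sharpness-aware objective is usually stated as a min-max problem over a neighborhood of radius rho, with the inner maximization approximated by a single scaled gradient step (standard statement of the method; notation assumed):

```latex
% Sharpness-aware minimization: minimize the worst-case loss within an
% L2 ball of radius rho around the weights w; epsilon-hat is the usual
% first-order approximation of the worst-case perturbation.
\min_{w}\; \max_{\|\epsilon\|_2 \le \rho} L(w + \epsilon),
\qquad
\hat{\epsilon}(w) = \rho\, \frac{\nabla_w L(w)}{\|\nabla_w L(w)\|_2}
```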

Optimization

4 min read


Ahmed Taha

I write reviews on computer vision papers.