Bilinear CNN Models for Fine-grained Visual Recognition

Fine-grained Visual Recognition (FGVR)

  1. FV-SIFT: Image SIFT features pooled using Fisher Vector descriptor
  2. FC-CNN[M]: M-Net CNN feature extractor followed by Fully Connected layer descriptor
  3. FC-CNN[D]: VGG-Net (Deep) CNN feature extractor followed by Fully Connected layer descriptor
  4. FV-CNN[D]: VGG-Net (Deep) CNN feature extractor followed by Fisher Vector descriptor
  5. B-CNN[D-M]: Bilinear model with VGG-Net (Deep) and M-Net CNN feature extractors, combined by outer product and followed by sum pooling
  • The paper is well organized, and the mathematical formulation is simple to understand given the required background
  • Multiple interesting concepts are introduced
  • I particularly like the detailed analysis of the experimental results and the in-depth comparison with previous work, even where the benchmark setup differs, e.g., between methods that use image part annotations and methods that do not.
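
To make item 5 in the method list more concrete, here is a minimal sketch of bilinear pooling in plain Python. It assumes each of the two feature extractors returns one feature vector per spatial location; the function names (bilinear_pool, normalize) and the toy inputs are hypothetical and are not taken from the paper's code. The signed square root and L2 normalization follow the post-processing I recall from the paper, so treat that step as an assumption rather than a reference implementation.

```python
import math

def bilinear_pool(feats_a, feats_b):
    """Sum-pool the outer product of two per-location feature lists.

    feats_a: list of L vectors of length M (stream A, one vector per location)
    feats_b: list of L vectors of length N (stream B, same locations)
    Returns a flat list of length M * N.
    """
    if len(feats_a) != len(feats_b):
        raise ValueError("both streams must cover the same locations")
    m, n = len(feats_a[0]), len(feats_b[0])
    pooled = [0.0] * (m * n)
    for a, b in zip(feats_a, feats_b):
        # Outer product at this location, accumulated over all locations.
        for i in range(m):
            for j in range(n):
                pooled[i * n + j] += a[i] * b[j]
    return pooled

def normalize(x, eps=1e-12):
    """Signed square root followed by L2 normalization (assumed post-processing)."""
    y = [math.copysign(math.sqrt(abs(v)), v) for v in x]
    norm = math.sqrt(sum(v * v for v in y))
    return [v / (norm + eps) for v in y]

# Toy input: 2 spatial locations, 2 channels from stream A, 3 from stream B.
feats_a = [[1.0, 0.0], [0.0, 2.0]]
feats_b = [[1.0, 2.0, 3.0], [1.0, 0.0, -1.0]]
descriptor = normalize(bilinear_pool(feats_a, feats_b))
```

The resulting descriptor has M * N entries and no spatial dimension, so it can be fed to a linear classifier regardless of image size. In a real model the two inputs would be convolutional feature maps from the D and M networks instead of hand-written lists.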

I write reviews on computer vision papers. Writing tips are welcomed.

Ahmed Taha
