Temporal Context Network for Activity Localization in Videos

  • The network considers temporal proposals but not spatial proposals. This probably hinders its ability to detect actions that take place in the background.
  • The authors say that classification is the more difficult problem: the network's classification accuracy is lower than its detection accuracy. The network processes untrimmed videos, which is probably the reason for these results.
  • It is not clear why a context proposal is not used for classification as it is for detection, given that the main idea of this paper is that context matters.
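
To make the "context matters" idea concrete, here is a minimal sketch of how a temporal proposal could be widened to take in the frames around it. This is only an illustration: the function name `context_window`, the `scale` parameter, and its default value are my own assumptions, not the paper's implementation.

```python
def context_window(start, end, duration, scale=2.0):
    """Widen the proposal [start, end] (in seconds) around its center.

    `scale` is a hypothetical knob: 2.0 means the returned window is twice
    as long as the proposal. The window is clipped to [0, duration].
    """
    if not (0 <= start < end <= duration):
        raise ValueError("expected 0 <= start < end <= duration")
    if scale < 1.0:
        raise ValueError("scale must be >= 1.0 so the window contains the proposal")

    center = (start + end) / 2.0
    half_length = (end - start) * scale / 2.0

    # Clip so the window never extends outside the video.
    return max(0.0, center - half_length), min(duration, center + half_length)


if __name__ == "__main__":
    # A 10-second proposal in a 60-second untrimmed video.
    print(context_window(10, 20, 60))
```

Features pooled from the wider window could then sit alongside features from the proposal itself, which is the kind of context signal the last bullet suggests might also help classification.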

I write reviews on computer vision papers. Writing tips are welcome.

Ahmed Taha
