Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment

Weakly supervised action segmentation learns to segment actions in long, untrimmed videos using only the action transcript as training labels. During training, the network has access to the video features and the ground-truth sequence of actions, but not to the frame-level boundaries between them. From this, the network learns to predict a label for every frame. For example, if a video has N frames spanning four actions, the network outputs N predictions, one action class per frame, as shown in the following figure. Please note that this setting assumes a single action per frame, i.e…
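To make the relation between the transcript and the per-frame output concrete, here is a minimal sketch. The action names, the frame counts, and the helper `frames_to_transcript` are hypothetical and are not taken from the paper. It only illustrates that a valid frame-level prediction collapses back to the ordered transcript once consecutive repeats are merged.

```python
from itertools import groupby


def frames_to_transcript(frame_labels):
    """Merge consecutive identical labels into an ordered action sequence."""
    return [label for label, _ in groupby(frame_labels)]


# Training label (hypothetical): the ordered actions only, with no boundaries.
transcript = ["take_cup", "pour_coffee", "add_milk", "stir"]

# Desired output for a toy video with N = 10 frames: one action class per frame.
# The segment lengths (2, 4, 1, 3) are made up for illustration.
frame_predictions = (
    ["take_cup"] * 2 + ["pour_coffee"] * 4 + ["add_milk"] * 1 + ["stir"] * 3
)

num_frames = len(frame_predictions)  # N predictions for N frames
matches = frames_to_transcript(frame_predictions) == transcript
print(num_frames, matches)
```

The transcript fixes the order of the actions but says nothing about where each one starts or ends; recovering those boundaries is the part the network has to learn.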

Ahmed Taha

I write reviews on computer vision papers. Writing tips are welcomed.