Thanks for your question.

Dec 14, 2021

The spectrogram mask is a binary 0/1 mask. Like any binary mask, it splits a signal (audio or image) into two pieces: the entries where the mask is 1 and the entries where it is 0. In image segmentation, for example, a binary mask separates the foreground from the background.

In this paper, the input spectrogram describes audio coming from multiple sources, e.g., a piano and a guitar. To split this signal into its parts, one can use a binary 0/1 mask. Multiplying the input spectrogram element-wise by the mask gives one output spectrogram (say, the piano), and multiplying it by the complement of the mask (1 minus the mask) gives the other (the guitar).
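To make this concrete, here is a minimal NumPy sketch of the idea. It is only an illustration, not the paper's code: the function name `split_with_binary_mask`, the toy 4x3 "spectrogram", and the hand-written mask are all assumptions I made up for the example. In practice the mask would come from the model rather than being written by hand.

```python
import numpy as np

def split_with_binary_mask(spectrogram, mask):
    """Split a spectrogram into two parts using a binary 0/1 mask.

    Hypothetical helper for illustration only.
    """
    mask = mask.astype(spectrogram.dtype)
    part_one = spectrogram * mask        # keeps entries where mask == 1
    part_two = spectrogram * (1 - mask)  # keeps entries where mask == 0
    return part_one, part_two

# Toy magnitude spectrogram: 4 frequency bins x 3 time frames (made-up values).
mixture = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [10.0, 11.0, 12.0],
])

# Hand-written mask (an assumption): the first two frequency bins go to the
# piano, the last two go to the guitar.
mask = np.array([
    [1, 1, 1],
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 0],
])

piano, guitar = split_with_binary_mask(mixture, mask)

# With a binary mask, the two outputs add back up to the input.
assert np.allclose(piano + guitar, mixture)
```

Because every entry of the mask is either 0 or 1, each time-frequency bin of the input ends up in exactly one of the two outputs.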

I hope this helps.

Thanks

Written by Ahmed Taha
