Motivation | State of the art | Proposed approach | Experiment: Singing voice separation | Discussion and summary

Semi-Supervised Adversarial Audio Source Separation applied to Singing Voice Extraction
Daniel Stoller (1), Sebastian Ewert (2), Simon Dixon (1)
(1) Centre for Digital Music, Queen Mary University London
(2) Spotify
MLSP-L8: Deep Learning III, ICASSP, 19.04.2018

Audio source separation
Task: Recover sources from mixtures
Example: Music instrument separation

Current state of the art [5, 3, 1]
Training on multitrack datasets (small ⇒ overfitting!)
Neural network
Discriminative, MSE loss

Our goal
⇒ How to also learn from unpaired mixtures and sources?
Random mixing ignores source correlations [4, 2]

Theoretical framework: Intuition
[Diagram: mixture database (unlabeled mixtures, magnitude spectrogram) → separator network → accompaniment estimates and vocal estimates (magnitude spectrograms); alongside, an accompaniment database (unlabeled accompaniment, magnitude spectrogram) and a singing voice database (unlabeled vocals, magnitude spectrogram)]

Theoretical framework: Derivation of unsupervised loss
For optimal separator: q_\phi(s^k | m) = p(s^k | m)
E_{m \sim p_{data}} q_\phi(s^k | m) = E_{m \sim p_{data}} p(s^k | m)
Overall separator output = Source distribution
Necessary condition for optimal separator
Loss: Minimise divergence between source outputs:
L_u = \sum_{k=1}^{K} D[ q_\phi^{out,k} \| p_s^k ]
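The unsupervised loss above can be illustrated with a small numerical sketch. It is not the method from the talk: the title suggests the divergence D is estimated adversarially, whereas this example swaps in a closed-form empirical 1-D Wasserstein-1 distance as a stand-in. The function names (wasserstein_1d, unsupervised_loss) and the toy Gaussian "sources" are hypothetical and chosen only for illustration.

```python
import numpy as np


def wasserstein_1d(x, y):
    """Empirical Wasserstein-1 distance between two equal-sized 1-D samples.

    Stand-in for the divergence D; an assumption, not the talk's estimator.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("samples must have the same size")
    # For equal-sized 1-D samples, W1 is the mean gap between sorted values.
    return float(np.mean(np.abs(x - y)))


def unsupervised_loss(separator_outputs, source_samples, divergence=wasserstein_1d):
    """Sample-based estimate of L_u = sum_k D[q_out_k || p_s_k]."""
    if len(separator_outputs) != len(source_samples):
        raise ValueError("need one output sample set per source")
    return sum(divergence(q, p) for q, p in zip(separator_outputs, source_samples))


rng = np.random.default_rng(0)

# Hypothetical toy data: K = 2 sources, each reduced to one scalar feature.
vocals = rng.normal(0.0, 1.0, size=1000)
accompaniment = rng.normal(3.0, 0.5, size=1000)
sources = [vocals, accompaniment]

# Separator whose overall outputs follow the source distributions.
matched_outputs = [rng.normal(0.0, 1.0, size=1000), rng.normal(3.0, 0.5, size=1000)]
# Separator that emits the same blurred distribution for both sources.
mismatched_outputs = [rng.normal(1.5, 1.0, size=1000), rng.normal(1.5, 1.0, size=1000)]

loss_matched = unsupervised_loss(matched_outputs, sources)
loss_mismatched = unsupervised_loss(mismatched_outputs, sources)
print(f"matched: {loss_matched:.3f}  mismatched: {loss_mismatched:.3f}")
```

Because the samples are compared only as distributions, no pairing between mixtures and sources is needed, which is what allows unpaired data to be used. As the slide notes, matching these distributions is a necessary condition for an optimal separator, so a low value here does not by itself show that individual mixtures are separated correctly.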
Description: