Remixing musical audio on the web using source separation

Gerard Roma, Andrew J R Simpson, Emad M Grais, Mark D Plumbley

Research in audio source separation has progressed a long way, producing systems that are able to approximate the component signals of sound mixtures. In recent years, many efforts have focused on learning time-frequency masks that can be used to filter a monophonic signal in the frequency domain. Using current web audio technologies, time-frequency masking can be implemented in a web browser in real time. This allows applying source separation techniques to arbitrary audio streams, such as internet radios, depending on cross-domain security configurations. While producing good quality separated audio from monophonic music mixtures is still challenging, current methods can be applied to remixing scenarios, where part of the signal is emphasized or de-emphasized. This paper describes a system for remixing musical audio on the web by applying time-frequency masks estimated using deep neural networks. Our example prototype , implemented in client-side Javascript, provides reasonable quality results for small modifications.

            
@inproceedings{2016_93,
  abstract = {Research in audio source separation has progressed a long way, producing systems that are able to approximate the component signals of sound mixtures. In recent years, many efforts have focused on learning time-frequency masks that can be used to filter a monophonic signal in the frequency domain. Using current web audio technologies, time-frequency masking can be implemented in a web browser in real time. This allows applying source separation techniques to arbitrary audio streams, such as internet radios, depending on cross-domain security configurations. While producing good quality separated audio from monophonic music mixtures is still challenging, current methods can be applied to remixing scenarios, where part of the signal is emphasized or de-emphasized. This paper describes a system for remixing musical audio on the web by applying time-frequency masks estimated using deep neural networks. Our example prototype , implemented in client-side Javascript, provides reasonable quality results for small modifications.},
  address = {Atlanta, GA, USA},
  author = {Roma, Gerard and Simpson, Andrew J R and Grais, Emad M and Plumbley, Mark D},
  booktitle = {Proceedings of the International Web Audio Conference},
  editor = {Freeman, Jason and Lerch, Alexander and Paradis, Matthew},
  month = {April},
  pages = {},
  publisher = {Georgia Tech},
  series = {WAC '16},
  title = {Remixing musical audio on the web using source separation},
  year = {2016},
  ISSN = {2663-5844}
}

Download PDF