Computational Intelligence for End-to-End Audio Processing


Special Issue of the IEEE Transactions on Emerging Topics in Computational Intelligence

Please download the Call for Papers here.

The paper submission deadline has been postponed till June 05, 2017.



  • Stefano Squartini - Università Politecnica delle Marche, Italy
  • Björn Schuller - Imperial College London, U.K., and University of Passau, Germany
  • Aurelio Uncini - University La Sapienza, Italy
  • Chuan-Kang Ting - National Chung Cheng University, Taiwan



Computational Audio Processing techniques have been addressed extensively by scientists and technicians in diverse application areas, such as entertainment, human-machine interfaces, security, forensics, and health. Services developed in these fields are characterized by progressively increasing complexity, interactivity, and intelligence, and the use of Computational Intelligence techniques has made it possible to achieve a remarkable degree of automation with excellent performance.

The typical methodology adopted in these tasks consists of extracting and manipulating useful information from the audio stream to drive the execution of target services. This approach is applied to different kinds of audio signals, from music to speech and from sound to acoustic data, and for each of them specific research topics can easily be identified, some of which have already reached a high level of maturity.

In the last few years, a new computational intelligence paradigm has become popular among scientists working in the field and across a large variety of research areas. It is called end-to-end learning, and it consists of omitting any hand-crafted intermediate algorithms in the solution of a given problem and learning all the needed information directly from the sampled dataset. This means that the features used as input to the parametric system to be trained (such as a Neural Network) are not selected by humans but are determined by the system itself during the learning process.

Due to its flexibility and versatility, this approach has attracted great interest in the Computational Audio Processing field, for all the types of signals mentioned above. For instance, deep neural architectures are often adopted in these contexts and fed with raw audio data in the time or frequency domain, while the supervised, weakly supervised, or unsupervised training algorithms involved in the process are responsible for finding a suitable data representation across the different abstraction layers to solve the task under study, e.g. classification, recognition, or detection.

In addition, the scientific community has shown increasing interest in developing end-to-end solutions to synthesize raw audio streams, such as speech or music. Generative Adversarial Networks and WaveNets are among the most recent and best-performing examples for this kind of problem.

It is indeed of great interest for the scientific community to understand how and to what extent novel Computational Intelligence techniques based on the emerging end-to-end learning paradigm can be efficiently employed in Digital Audio, in light of all the aforementioned aspects. Therefore, in line with the mission of the IEEE CIS Task Force in Computational Audio Processing, and building on the success of the Special Sessions recently organized at the IJCNN (International Joint Conference on Neural Networks, from 2014 to 2016) and MLSP (Machine Learning for Signal Processing, 2016) conferences, the organizers of this Special Issue of the IEEE Transactions on Emerging Topics in Computational Intelligence want to focus on the most recent advancements in the Computational Intelligence field and on their applicability to Digital Audio problems from the end-to-end learning perspective.



Topics include, but are not limited to:

  • End-to-End Learning for Digital Audio Applications
  • Audio Feature Representation Learning
  • Computational Audio Analysis from Raw Data
  • Unsupervised Feature Extraction from Audio Signals
  • Bags of Audio Words in Audio Pattern Recognition
  • End-to-End Cross-domain Audio Analysis
  • Data-learnt Audio Feature Representations and Higher Level Audio Features
  • Automatic Feature Analysis for Sound Event Classification and Recognition
  • Transfer, Weakly Supervised, and Reinforcement Learning for Audio
  • End-to-End Neural Architectures for Music Information Retrieval
  • End-to-End Computational Methods for Music/Speech Synthesis
  • Generative Modelling Techniques for Raw Audio
  • End-to-End Speech Recognition and Dialogue Systems




Electronic submissions for the IEEE Transactions on Emerging Topics in Computational Intelligence can be made here.
During the submission process, please select "SI: CAP" as the Article Type.


  • Deadline for manuscript submission: June 05, 2017

  • First Notification of Acceptance: September 05, 2017

  • Final manuscripts due: November 15, 2017

  • Publication of Special Issue: December 2017