SONICOM is a research project to revolutionise the way we interact socially within AR/VR environments and applications. Over five years, the international consortium of researchers and tech experts will leverage methods from Artificial Intelligence (AI) to design a new generation of immersive audio technologies and techniques.


BEARS (Both EARS) was started because children and young people with bilateral cochlear implants said that they can have problems communicating in noisy places, which can make socialising difficult. BEARS is an interactive virtual reality (VR) package of games that can be played on different devices. We hope the games will help people hear and communicate.

Doctoral Work

‘Did The Speaker Change?’: Temporal Tracking For Overlapping Speaker Segmentation In Multi-speaker Scenarios
Statement of the General Topic Area
Audio diarization is the process of annotating an input audio channel with information that attributes (possibly overlapping) temporal regions of signal energy to their specific sources.
Diarization identifies, for example, whether speech or non-speech is present and, for non-speech, its type, such as silence, noise or music. It might also identify the gender of the speaker, the style and structure of the content, and when a change in speaker occurs.
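As a rough illustration of what such an annotation can look like, the sketch below represents diarization output as a list of labelled time segments and finds the regions where two sources are active at once. The `Segment` class, the helper functions, the labels and the timings are all invented for this example; it does not describe the method proposed in this research.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One annotated region of the audio (hypothetical structure)."""
    start: float  # seconds from the beginning of the recording
    end: float    # seconds, exclusive
    label: str    # source of the signal energy, e.g. a speaker or "music"


def find_overlaps(segments):
    """Return (start, end, label_a, label_b) for each pair of overlapping segments."""
    ordered = sorted(segments, key=lambda s: s.start)
    found = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start >= a.end:
                break  # sorted by start, so no later segment can overlap a
            found.append((b.start, min(a.end, b.end), a.label, b.label))
    return found


def active_at(segments, t):
    """Return the sorted labels of all sources active at time t."""
    return sorted(s.label for s in segments if s.start <= t < s.end)


# Made-up annotation: music, then two speakers who briefly talk over each other.
annotation = [
    Segment(0.0, 2.0, "music"),
    Segment(2.0, 6.5, "speaker_A"),
    Segment(5.5, 9.0, "speaker_B"),
    Segment(9.0, 10.0, "silence"),
]

overlaps = find_overlaps(annotation)
labels_at_6s = active_at(annotation, 6.0)
```

Here `overlaps` holds the single region between 5.5 s and 6.5 s in which both speakers are active, which is the kind of overlapping-speaker segment that a diarization system has to attribute to more than one source.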
The cost of processing power, storage capacity and network bandwidth continues to fall while access to them increases, allowing large volumes of audio to be amassed, including broadcasts, voice mails, meetings and other “spoken documents”. There is therefore a growing need to apply automatic Human Language Technologies to allow efficient and effective searching, indexing and accessing of these information sources. In addition to the fundamental technology of speech recognition, which extracts the words being spoken, other technologies are needed to extract metadata that provides context and information beyond the words. Audio diarization, or the marking and categorising of audio sources within a spoken document, is one such technology.
Diarization can therefore help to make available material that would otherwise be too time-consuming to access, and provide new uses for information previously discarded.
This research will seek to build on existing work and improve diarization performance, initially in noisy and clipped multi-speaker environments.

Current Sponsors

Past Sponsors

Past Collaborations