upHear Voice Quality Enhancement technology adapted to smart display device
Sberbank, the largest bank in Russia and Central and Eastern Europe, has launched SberPortal, a multifunctional smart display that supports a variety of multimedia content as well as video calls. The device is equipped with a first-of-its-kind family of virtual assistants named Salute, and understands touch, gesture, and of course voice commands. To guarantee that SberPortal can always operate with the best possible voice quality when receiving user commands or making calls, the device features Fraunhofer’s upHear Voice Quality Enhancement (VQE) technology.
The development team at Sberbank and the upHear VQE team at Fraunhofer IIS jointly designed an array geometry with six microphones, tailored it to the needs of the device and adapted the upHear VQE algorithms accordingly. The flexible technology optimizes the microphone signals collected by the array in voice assistant and Voice over IP (VoIP) mode alike, providing a clean speech signal to Salute, and enabling far-field voice calls in the best possible voice quality.
In VoIP mode, the full-duplex VoIP functionality of Fraunhofer upHear VQE ensures that people in a voice call can talk to each other in optimum audio quality. This is achieved by canceling out acoustic echoes and removing reverberation and noise while keeping the perceived loudness at the same level, even when the user moves closer to or farther away from the smart display.
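The loudness leveling described above can be illustrated with a generic automatic gain control loop. The following is a minimal sketch only, not Fraunhofer's implementation, whose internals are not described here. The function name `simple_agc` and the parameters `target_rms`, `attack` and `max_gain` are hypothetical and chosen for illustration.

```python
import math


def simple_agc(samples, target_rms=0.1, attack=0.01, max_gain=20.0):
    """Scale samples so their short-term RMS level approaches target_rms.

    Generic textbook-style AGC; all parameter names are illustrative assumptions.
    """
    power = target_rms ** 2  # running estimate of the signal power
    out = []
    for s in samples:
        # Exponentially smoothed power estimate of the input signal.
        power = (1.0 - attack) * power + attack * s * s
        # Gain that would bring the estimated level to the target, capped at max_gain.
        gain = min(max_gain, target_rms / math.sqrt(power + 1e-12))
        out.append(gain * s)
    return out


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


# Two synthetic talkers: one far from the device (quiet), one close (loud).
far_talker = [0.02 * math.sin(2 * math.pi * n / 50) for n in range(2000)]
near_talker = [0.5 * math.sin(2 * math.pi * n / 50) for n in range(2000)]

# After the loop settles, both outputs should sit near the same target level.
far_level = rms(simple_agc(far_talker)[-500:])
near_level = rms(simple_agc(near_talker)[-500:])
```

A production system would typically also need voice activity detection so that the gain does not rise during pauses and amplify background noise; that is omitted here for brevity.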
In voice assistant mode, upHear VQE enables Salute assistants to accurately hear voice commands issued from anywhere in the room. The Fraunhofer technology removes interfering sounds for far-field operation and cancels out acoustic echoes caused by SberPortal’s own loudspeaker signal during playback to enable barge-in. As a result, no matter where in a room the commands are given, and even while the smart display is playing music, the keyword spotter and speech recognizer receive a clean audio signal.
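To make the idea of acoustic echo cancellation more concrete, here is a minimal single-channel sketch based on the textbook normalized least mean squares (NLMS) adaptive filter. It is an assumption-laden illustration, not the upHear VQE algorithm, which is multichannel and whose details are not given here. The names `simulate_echo` and `nlms_echo_cancel`, the three-tap echo path and all parameter values are hypothetical.

```python
import random


def simulate_echo(far_end, echo_path):
    """Convolve the loudspeaker signal with a (hypothetical) room echo path."""
    out = []
    for n in range(len(far_end)):
        out.append(sum(h * far_end[n - k]
                       for k, h in enumerate(echo_path) if n - k >= 0))
    return out


def nlms_echo_cancel(far_end, mic, taps=16, mu=0.5, eps=1e-6):
    """Subtract an adaptively estimated echo of far_end from mic (NLMS)."""
    w = [0.0] * taps  # adaptive estimate of the echo path
    x = [0.0] * taps  # most recent loudspeaker samples, newest first
    residual = []
    for ref, d in zip(far_end, mic):
        x = [ref] + x[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x))  # predicted echo sample
        e = d - y                                 # microphone minus predicted echo
        norm = eps + sum(xi * xi for xi in x)     # input power for normalization
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        residual.append(e)
    return residual


# Demo: the loudspeaker plays noise, the microphone picks up only its echo.
random.seed(0)
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = simulate_echo(far_end, [0.5, -0.3, 0.2])
residual = nlms_echo_cancel(far_end, mic)

# Compare the energy before and after cancellation once the filter has adapted.
mic_energy = sum(v * v for v in mic[-200:])
residual_energy = sum(v * v for v in residual[-200:])
```

In a real barge-in situation the microphone signal would also contain the user's voice, which is not correlated with the loudspeaker signal and therefore stays in the residual that is passed on to the keyword spotter. Practical systems additionally need double-talk handling, which this sketch leaves out.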
About Fraunhofer upHear Voice Quality Enhancement
Fraunhofer upHear VQE processes microphone signals to enable far-field full-duplex conversations in the full perceptible audio bandwidth for communication devices. It also allows far-field voice commands and barge-in during audio playback for smart assistant devices, always with outstanding audio quality. This is achieved by combining advanced multichannel acoustic echo cancellation, source localization, noise reduction, dereverberation, automatic gain control and beamforming methods. The fully integrated technology is suitable for numerous applications, including natural language understanding in mobile and smart assistant devices, as well as conferencing solutions. upHear VQE’s flexibility allows it to be used with a wide range of microphone array geometries built into mobile phones and smart assistant devices, such as smart speakers, soundbars, cameras and TVs. It can also be configured to match the available computational resources. upHear VQE is optimized for mono and stereo as well as surround and even immersive audio devices.
Header image © SberDevices