By nature we are given two sound pressure sensors or, simply, ears. The head provides spatial separation between the ears, introducing interaural delays of up to about 600 microseconds, depending on the direction of arrival. The head also creates an acoustic masking, or shadowing, effect, so that at high frequencies the sound in one ear may be louder than in the other. In addition, the pinna modifies the frequency response of each ear depending on the sound direction. The shoulders and other parts of the body also reflect sound in a direction-specific manner, giving the brain additional cues about the location of the source. Even when a sound comes directly from the front, its reflections from walls and other objects reach the two ears unequally.
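The often-quoted figure of roughly 600 microseconds can be checked with Woodworth's spherical-head approximation of the interaural time difference. The 8.75 cm head radius and 343 m/s speed of sound below are typical textbook values, not figures from this article:

```python
import math

def woodworth_itd(theta_rad, head_radius_m=0.0875, c=343.0):
    """Interaural time difference (seconds) for a source at azimuth
    theta (0 = straight ahead, pi/2 = directly to one side), using
    Woodworth's spherical-head model: ITD = (r/c) * (theta + sin(theta))."""
    return (head_radius_m / c) * (theta_rad + math.sin(theta_rad))

# maximum ITD, for a source directly to one side: roughly 0.65 ms
itd_side = woodworth_itd(math.pi / 2)
```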
So we are used to stereo (two-channel) sound that is slightly different in the left and right ears. Even when there is no real spatial or directional information, mono sound recorded by one microphone (or even two closely spaced microphones) does not seem natural. Anyone who has tried to talk in an anechoic chamber (a special room without sound reflections, used for acoustic research) will have noticed this. To make mono sound more pleasant and “live”, we want to convert it to (at least) stereo.
Is it possible? No, but...
Of course, true restoration of the spatial sound field from a single-microphone recording is not possible. However, we can introduce an artificial difference between the left and right channels that in many cases creates a feeling very similar to listening to a true stereo recording.
Many mobile phones and tablets today are capable of recording high-quality, often HD, video even in low-light conditions. This makes it possible to record concerts, family gatherings, and other events with wide, distributed sound. Unfortunately, most such recordings are made with a single built-in microphone producing mono sound, which degrades the overall video quality. Adding some artificial spaciousness can make these video clips sound more natural. Try these links:
Jet Lag Jones Freak-Child Sample:
Rammstein - Bück Dich - Live Berlin – short:
Old video and audio recordings
Sometimes we have old music or video recordings that are mono. Can we improve them? Try the following links:
Ray Charles - Hit The Road Jack (LIVE):
Louis Armstrong - Hello Dolly Live:
Speech in voice and video communication
During conference voice or video calls we often communicate with several people, and sometimes they happen to speak simultaneously. Since today’s voice communication networks are mono, all voices are combined into one channel and reproduced as mono sound. Artificial stereo helps our ears and brain perceive the result more naturally while also improving intelligibility. Listen to the following recordings while trying to understand what the people are saying.
Original mono mix of man and woman voices:
Pseudo-stereo (processed mono) mix of man and woman voices:
There is not much we can do. There may be a time-of-arrival difference between our ears and an amplitude difference due to the head masking effect at high frequencies (above 2 kilohertz). If we have no prior information, all we can do is:
Divide the original mono sound into frequency regions (bands)
Introduce different signal delays between the corresponding bands in the left and right channels
Introduce amplitude differences between the bands, matched to the delays. Generally, due to the head masking effect, we can expect sounds arriving earlier to be stronger
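These three steps can be sketched as follows. This is only an illustrative FFT-based implementation: the 200 Hz band width matches the figure given later in this article, while the delay range, the ±0.5 dB level offset, and the random per-band assignment are assumptions of this sketch, not the author's actual method:

```python
import numpy as np

def pseudo_stereo(mono, fs, band_hz=200.0, max_delay_ms=0.6):
    """Split a mono signal into frequency bands and give each band a
    different small left/right delay and level difference.
    Band width, delay range, and level offset are illustrative."""
    n = len(mono)
    spec = np.fft.rfft(mono)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    left = np.zeros_like(spec)
    right = np.zeros_like(spec)
    n_bands = int(np.ceil(freqs[-1] / band_hz))
    rng = np.random.default_rng(0)  # fixed seed for reproducibility
    for b in range(n_bands):
        mask = (freqs >= b * band_hz) & (freqs < (b + 1) * band_hz)
        tau = rng.uniform(-max_delay_ms, max_delay_ms) * 1e-3  # seconds
        # a (circular) delay in the frequency domain: exp(-j*2*pi*f*tau)
        shift = np.exp(-2j * np.pi * freqs[mask] * tau)
        left[mask] = spec[mask] * shift
        right[mask] = spec[mask] / shift
        # crude head-shadow model: the earlier channel is slightly louder
        gain = 10 ** (0.5 * np.sign(tau) / 20.0)  # +/-0.5 dB, an assumption
        left[mask] *= gain
        right[mask] /= gain
    return np.fft.irfft(left, n), np.fft.irfft(right, n)
```

A real-time implementation would use filter banks and fractional-delay filters rather than a whole-signal FFT, but the structure of the processing is the same.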
If this is so simple, why is it not popular?
There are many simple recipes for cooking; still, making food really tasty is not that easy. The ear and brain mechanisms that analyze sound are highly developed. Trying to “fool” them into hearing a mono signal as stereo can create unnatural effects that may be “interesting” at first but become annoying after some time. So the schemes differ in:
Frequency regions (bands) that are used
Delays introduced between the left and right channels in each region
Amplitude differences (if any) that are introduced
Whether the regions are constant or change with time or with the signal itself
Implementation and/or computational complexity
Besides, people are different, so, eventually, with any technique (or food) you may like it or not. A good technology must give you a simple way to reduce the artificial effect or, ultimately, to transform the signal back to the original mono without any distortion.
Obviously, we cannot disclose every detail of our approach, but here are a few:
We use bands about 200 Hz wide
Delays are not equal for different frequencies. While this seems to contradict nature (the delay between the two ears depends only on the sound direction), equal delays introduce phase uncertainty, which sounds worse in artificial stereo.
The head shadowing effect is taken into account at high frequencies.
The regions are constant
The complexity depends on the DSP/MCU core and is roughly proportional to the sampling frequency; it can be estimated at 15 MIPS at 32 kHz.
It is possible to control the effect with a single parameter (from none to maximum). With any value, summing the artificial stereo channels reproduces the exact original mono sound.
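One classic scheme with exactly this mono-compatibility property is the Lauridsen-style complementary comb filter, sketched below purely as an illustration. The 12 ms delay and the `depth` parameter are assumptions of this sketch, not the author's actual per-band algorithm:

```python
import numpy as np

def complementary_pseudo_stereo(mono, fs, delay_ms=12.0, depth=0.5):
    """Lauridsen-style pseudo-stereo:
    left = m + depth * delayed(m), right = m - depth * delayed(m).
    The added component cancels on summing, so (left + right) / 2 is the
    exact original mono signal; depth (0 = mono, 1 = maximum) is the
    single effect-strength control."""
    d = int(round(delay_ms * 1e-3 * fs))
    delayed = np.concatenate([np.zeros(d), mono[:-d]]) if d else mono
    side = depth * delayed
    return mono + side, mono - side

mono = np.random.default_rng(1).standard_normal(48000)
left, right = complementary_pseudo_stereo(mono, 48000)
# (left + right) / 2 reproduces the original mono signal
```

The same cancellation argument applies to any scheme whose left/right difference is an odd (sign-flipped) component, which is how mono compatibility without distortion can be guaranteed.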
What professionals say about the importance of binaural hearing:
Hearing with both ears, as nature intended, is called binaural hearing. It allows us to hear sounds accurately and more naturally. Especially in noisy situations, it gives us a sense of balance and direction.
For humans, two-eared hearing is just as necessary to survive in our noisy societies; to hear in traffic, in crowds and to understand speech. For many, hearing properly with both ears may mean the difference between just hearing and hearing and understanding.
Please contact us for more information, audio examples, and real-time demo availability.