
How to read sound from a bag of chips, or what a "visual microphone" is

A "visual microphone" is a technique for recovering audio from silent video. In this article we cover it along with other methods and technologies for remotely capturing and reconstructing music or speech.


Photo: m01229 / CC

Predecessor technologies


One way to record sound at a distance is with a laser. So-called laser microphones sense the vibrations that sound waves induce in a surface. For example, you can "capture" sound this way from a window pane if people are talking or music is playing in the room. An interferometer detects the movement of the surface through the change in the optical path length of the reflected beam. These deviations are then converted into an audio signal algorithmically.
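To make the interferometer step more concrete, here is a minimal Python sketch of how a measured phase shift maps to surface displacement. It assumes the beam hits the surface head-on, so a displacement d changes the round-trip optical path by 2d. The 633 nm wavelength (a helium-neon laser) and the function names are illustrative assumptions, not details of any particular device.

```python
import math

# Assumption for illustration: a helium-neon laser at 633 nm.
WAVELENGTH_M = 633e-9


def displacement_from_phase(phase_rad, wavelength_m=WAVELENGTH_M):
    """Surface displacement (m) implied by an interferometer phase shift.

    Assumes normal incidence: moving the surface by d changes the
    round-trip optical path by 2*d, i.e. the phase by 4*pi*d/wavelength.
    """
    return phase_rad * wavelength_m / (4.0 * math.pi)


def phases_to_waveform(phases_rad, wavelength_m=WAVELENGTH_M):
    """Turn a series of phase readings into a zero-mean displacement waveform."""
    displacements = [displacement_from_phase(p, wavelength_m) for p in phases_rad]
    mean = sum(displacements) / len(displacements)
    return [d - mean for d in displacements]


# A full 2*pi phase cycle corresponds to half a wavelength of surface motion.
half_wave = displacement_from_phase(2.0 * math.pi)
```

The zero-mean displacement waveform is what would then be scaled and played back as audio. A real device also has to unwrap the phase and suppress noise, which this sketch leaves out.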

Recordings available online show that laser microphones can recover sound with fairly good quality. The drawback of this approach is that the device is complicated to set up.

You can also "record sound at a distance" using low-intensity microwave radiation of the kind used in communications. Similar technologies were used at NASA to capture and recognize weak radio signals in space.

A horn antenna sends microwaves with a frequency of 30-100 GHz into the room through the wall of the building. If people are talking or music is playing inside, the sound waves cause microvibrations in light objects and materials, and the microwaves reflected from them come back amplitude-modulated. This information is then used to reconstruct the sound acting on the object. The object can even be a piece of clothing, so the method makes it possible to "intercept" even the sound of a heartbeat.

The visual microphone: a solution from MIT scientists


Scientists from MIT have proposed another way to read sound from a distance. They showed that sound can be recovered from video alone. To do this, you film an object with a high-speed camera and analyze the microscopic vibrations that sound waves cause in it.

A complex steerable pyramid is built from the video: a bank of filters that decomposes each video frame into complex-valued sub-bands of different scales and orientations, so that local motion can be measured at different points on the object under study.

The scientists developed an algorithm (and made it publicly available) that computes the strength of the sound-induced vibration at each of the selected points. The local signals are averaged into a single global signal that describes how the sound waves move the object. This signal is passed through a Butterworth high-pass filter with a cutoff of 20–100 Hz, after which the audio can be reconstructed.
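The averaging and filtering steps can be sketched in a few lines of standard-library Python. This is not the MIT team's published code. It uses a plain per-sample mean where the real algorithm may weight and align the local signals, and it implements a second-order Butterworth high-pass as a single biquad section. The 2200 Hz frame rate, the 440 Hz test tone, and all function names are assumptions chosen for the demonstration.

```python
import math


def average_local_signals(local_signals):
    """Per-sample mean of several equally long local motion signals."""
    count = len(local_signals)
    return [sum(samples) / count for samples in zip(*local_signals)]


def butterworth_highpass_coeffs(cutoff_hz, sample_rate_hz):
    """Second-order Butterworth high-pass biquad (bilinear transform).

    Choosing Q = 1/sqrt(2) is what makes the response Butterworth.
    """
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate_hz
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / math.sqrt(2.0)  # sin(w0) / (2 * Q)
    a0 = 1.0 + alpha
    b = [(1.0 + cos_w0) / (2.0 * a0), -(1.0 + cos_w0) / a0, (1.0 + cos_w0) / (2.0 * a0)]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def highpass(signal, cutoff_hz, sample_rate_hz):
    """Filter the signal sample by sample (direct form I)."""
    b, a = butterworth_highpass_coeffs(cutoff_hz, sample_rate_hz)
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in signal:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


# Hypothetical demo: two local signals carrying the same 440 Hz tone
# plus different constant offsets (standing in for slow drift).
FRAME_RATE_HZ = 2200.0
tone = [math.sin(2.0 * math.pi * 440.0 * n / FRAME_RATE_HZ) for n in range(2200)]
local_a = [t + 0.5 for t in tone]
local_b = [t - 0.1 for t in tone]

global_signal = average_local_signals([local_a, local_b])
audio = highpass(global_signal, cutoff_hz=50.0, sample_rate_hz=FRAME_RATE_HZ)
```

In this demo, once the filter has settled, the constant offset is gone from `audio` and the tone passes through almost unchanged. That is the purpose of the high-pass stage: it removes slow changes that are not sound.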

According to the lead researcher, Abe Davis, a visual microphone yields lower-quality audio than active techniques (lasers, for example), but it has advantages of its own. The system needs no additional equipment or detectors, only a high-speed video camera. And the surface from which the sound is "read" does not have to be mirror-like or smooth, as laser microphones often require.

Davis's team tried recovering sound from a paper bag, a bag of chips, and aluminum foil. These objects are light, so sound vibrations were most noticeable on them and the resulting signal was less noisy. The test objects also included a houseplant and a brick, which, according to the scientists, did better than they expected.

The team made a video demonstrating how various objects "sound".


The scientists say they plan to continue this work and explore recovering audio from any video recording, not only footage shot specifically with a high-speed camera.

Technology development


Other scientists are working to improve on the technique proposed by the MIT team. For example, last year Iranian researchers presented an algorithm that speeds up sound extraction from high-speed video and improves its quality.

Different areas of the object are affected differently. How strongly a region vibrates depends on the material the object is made of, its shape, the frequency of the incident sound, and the distance to the source. For example, when video is shot at 20 kHz (20,000 frames per second), a sound wave travels about 17 mm between two consecutive frames. Objects farther from the sound source therefore respond with a delay.
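The 17 mm figure is easy to check. The sketch below assumes a speed of sound of 343 m/s (dry air at about 20 °C), which the text does not state. The function names and the 34.3 mm example distance are ours.

```python
# Assumption: speed of sound in dry air at about 20 degrees Celsius.
SPEED_OF_SOUND_M_S = 343.0


def sound_travel_per_frame_mm(frame_rate_hz):
    """Distance in millimetres that sound covers during one frame interval."""
    return SPEED_OF_SOUND_M_S / frame_rate_hz * 1000.0


def delay_in_frames(extra_distance_mm, frame_rate_hz):
    """How many frames later a point responds if it is this much farther away."""
    return extra_distance_mm / sound_travel_per_frame_mm(frame_rate_hz)


per_frame_mm = sound_travel_per_frame_mm(20_000)
print(round(per_frame_mm, 2))  # 17.15
```

Under the same assumption, a point on the object that is 34.3 mm farther from the source responds about two frames later at this frame rate (`delay_in_frames(34.3, 20_000)`).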

All these factors cause different areas of the object to vibrate with different strengths. So when analyzing the camera images, the researchers use only the regions that contribute most to the resulting signal, that is, the least "noisy" blocks. The frequency components of these blocks are given different phase shifts so that they do not cancel one another through destructive interference.
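The idea of keeping only the most useful blocks can be illustrated with a toy Python sketch. The ranking criterion used here (signal variance as a stand-in for "contribution"), the block names, and the sample values are all assumptions for illustration. The researchers' actual selection rule and their phase alignment step are not reproduced.

```python
from statistics import pvariance


def select_strongest_blocks(block_signals, keep):
    """Names of the `keep` blocks with the highest signal variance.

    Assumption: variance is used as a simple proxy for how much a
    block contributes to the final signal.
    """
    ranked = sorted(
        block_signals,
        key=lambda name: pvariance(block_signals[name]),
        reverse=True,
    )
    return ranked[:keep]


def combine_blocks(block_signals, names):
    """Per-sample mean of the selected blocks' signals."""
    chosen = [block_signals[name] for name in names]
    return [sum(samples) / len(chosen) for samples in zip(*chosen)]


# Hypothetical per-block motion signals extracted from a video.
blocks = {
    "flat": [0.01, -0.01, 0.01, -0.01],  # barely moves, mostly noise
    "edge": [0.0, 1.0, 0.0, -1.0],       # strong response
    "texture": [0.0, 0.4, 0.0, -0.4],    # weaker but usable response
}

selected = select_strongest_blocks(blocks, keep=2)
combined = combine_blocks(blocks, selected)
```

Here the nearly static "flat" block is dropped and only the two stronger blocks contribute to the combined signal. Processing fewer blocks is also one plausible reason such a scheme runs faster.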

The Iranian researchers note that this let them improve the quality of the recovered sound and speed up image processing compared with the original MIT algorithm. They say their system can process video and recover sound in real time.

The potential of visual microphones


Overall, the technology is still experimental, and a full commercial implementation is not yet on the table. But applications in law enforcement are already being predicted: the police would be able to extract more information from surveillance cameras.

There are other options: such systems could be used to analyze how sound behaves in recording studios and concert halls in order to determine their acoustic properties. Another application is in the space industry, to study sounds in space. Incidentally, Hacker News users have already suggested that "visual microphones" will one day settle the mystery of the Moon landing once and for all.






Source: https://habr.com/ru/post/410627/