Russian-speaking developers always have something to tell: some unique experience or opinion to share. But in the video blog format, because recording is so laborious, only a few do it so far.
Under the cut, I talk about my thorny path to recording and editing video using free software, Ruby scripting, and improvised tools.
I began by studying the theory of recording video blogs from English-language YouTube videos. Among Russian-language materials, this course turned out to be quite useful (in particular, the module about video blogs and, from the module about talks, the first video about composing the frame). I also briefly looked over the popular features of proprietary video editors, to approach the choice of a free editor more consciously.
I did not risk investing in lighting: there was not enough time to study the subject and search for the best option, and a cursory look at the cheap options pointed to potential pitfalls such as flicker and poor color rendering. Daylight caused me no great difficulty; it is quite sufficient for short videos.
Existing free video editing tools suffer from a number of well-known problems: from bad UI decisions and freezes that drag editing out indefinitely, to memory leaks, crashes, and unexpected artifacts that only show up after the final render.
The problems are many, and it took time to pick a video editor and learn its bugs just to be able to cope with the basics. I eventually settled on Pitivi, simply because by then I had spent so much time searching and experimenting.
The supported way to install Pitivi requires Flatpak. For a while I avoided it, because my system has neither systemd nor PulseAudio.
It turns out systemd has not been required for a long time. As for PulseAudio, rather than installing and configuring it, it was easier to modify Flatpak. In hindsight it would probably have been better to just install PulseAudio; it is merely a bit tedious, and it was unclear whether to expect problems with sound recording on my hardware or not.
Install Pitivi, remove the PulseAudio configs, and run it:
$ sudo flatpak remote-add --if-not-exists flathub https://flathub.org/repo/flathub.flatpakrepo
$ sudo flatpak install flathub org.pitivi.Pitivi
$ sudo find {/var/lib,~/.local/share}/flatpak/runtime -type f -name '*pulseaudio*.conf' -delete
$ flatpak run --device=alsa --branch=stable --arch=x86_64 --command=pitivi org.pitivi.Pitivi
No sound. Let's try to run something simpler, for example aplay:

$ sudo find /var/lib/flatpak/app/org.pitivi.Pitivi/x86_64 -type d -path '*/files/bin' -exec cp `which aplay` {} \;
$ flatpak run --device=alsa --branch=stable --arch=x86_64 --command=aplay org.pitivi.Pitivi /dev/urandom
ALSA lib dlmisc.c:162:(snd_dlsym_verify) unable to verify version for symbol _snd_pcm_empty_open
ALSA lib dlmisc.c:283:(snd1_dlobj_cache_get) symbol _snd_pcm_empty_open is not defined inside [builtin]
aplay: main:828: audio open error: No such device or address
Probably the alsa-lib included in Flatpak was compiled with --with-versioned. A quick fix is to replace libasound.so with the system one:

$ sudo find /var/lib/flatpak -type f -name libasound.so.2.0.0 -exec cp /usr/lib64/libasound.so.2.0.0 {} \;
$ find ~/.local/share/flatpak -type f -name libasound.so.2.0.0 -exec cp /usr/lib64/libasound.so.2.0.0 {} \; # if you installed anything as a regular user
For me this was not enough:
$ flatpak run --device=alsa --branch=stable --arch=x86_64 --command=aplay org.pitivi.Pitivi /dev/urandom
ALSA lib /var/tmp/portage/media-libs/alsa-lib-1.1.6-r1/work/alsa-lib-1.1.6/src/pcm/pcm_direct.c:1943:(snd1_pcm_direct_parse_open_conf) The field ipc_gid must be a valid group (create group audio)
aplay: main:828: audio open error: Invalid argument
We also need the ALSA config:
$ sudo find /var/lib/flatpak -type d -name etc -exec cp /etc/asound.conf {} \;
$ find ~/.local/share/flatpak -type d -name etc -exec cp /etc/asound.conf {} \; # if you installed anything as a regular user
$ flatpak run --device=alsa --branch=stable --arch=x86_64 --command=aplay org.pitivi.Pitivi /dev/urandom
Finally, you can use Pitivi.
I create some of the text animation effects by screencasting full-screen pages laid out with reveal.js and animate.css. In reveal.js I add a transition sound to some slides:
<section style="font-size: 5em">
  <audio data-autoplay src="/path/to/sound.wav"></audio>
  #1
</section>
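For context, a text animation on such a slide boils down to an animate.css class on an element (a hypothetical sketch; the heading text and the bounceInDown effect are arbitrary picks, not from my actual slides):

<!-- animate.css: the base class "animated" plus one effect class -->
<section>
  <h2 class="animated bounceInDown">#1</h2>
</section>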
It turned out to be important to record the screencast at 60 FPS if the text is very large. I record the screencast like this:
#!/bin/sh

SOUND_INPUT=shared_input_loopback
CHANNELS=2
SOUND_RATE=48000
FRAMERATE=60
DRAW_MOUSE=0
VIDEO_SIZE=$(xdpyinfo | awk '/dimensions:/ { print $2; exit }')
OUTPUT="${HOME}/video/screen/$(date --rfc-3339=seconds).mp4"

ffmpeg \
  -thread_queue_size 512 \
  -video_size "${VIDEO_SIZE}" \
  -framerate "${FRAMERATE}" \
  -f x11grab \
  -draw_mouse "${DRAW_MOUSE}" \
  -i :0.0+0,0 \
  -thread_queue_size 512 \
  -f alsa \
  -ac "${CHANNELS}" \
  -i "${SOUND_INPUT}" \
  -ar "${SOUND_RATE}" \
  -vcodec libx264 -preset ultrafast -crf 18 \
  -acodec alac \
  -f ipod \
  "${OUTPUT}"
In my case, shared_input_loopback is a device from my asound.conf config.
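For reference, such a shared capture device can be defined via ALSA's dsnoop plugin, roughly like this (a minimal sketch, not my exact config; the card address, rate, and channels are placeholders):

# /etc/asound.conf (hypothetical)
pcm.shared_input_loopback {
    type dsnoop          # lets several processes capture from the same input
    ipc_key 2048         # any unique IPC key
    ipc_gid audio        # the group mentioned in the error above
    slave {
        pcm "hw:1,0"     # placeholder for the real capture device
        channels 2
        rate 48000
    }
}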
Also, this ffmpeg add-on for transitions between clips has proven to be useful.
The phone at hand was a Meizu MX4, on which I decided to use the front camera and record via Open Camera. It took some time to train myself to look into the camera and control its position in space without making typical mistakes like cutting off the head, while at the same time speaking clearly and loudly, gesticulating, and producing at least some kind of facial expression. But that was only the beginning.
What prompted me to implement automatic video cutting, and to do it right at the recording stage?
Here are the requirements I set for my future homegrown tool:
Why so much control over the devices during the recording phase? Why not just record for several hours in a row and then edit it? There are many reasons:
It is easier to record in small fragments and automate, however primitively, everything that can be automated. I chose Ruby as the main language for developing this homegrown tool. Today I would probably pick Python, but at the time I was just learning Ruby, and I like to break in languages that are new to me on weird experiments like this.
There is not much information on this topic on the net. I remembered the Stanford and Adobe research too late (which is no great loss, since I needed a far less sophisticated solution anyway).
The cutting happens in two stages: a rough pass at the recording stage and a more accurate pass at the rendering stage, with the ability to manually correct fragments that were trimmed too aggressively. The rough pass is implemented with the VAD from WebRTC, the accurate one with Google Speech (more specifically, with a modification of the autosub project, which generates subtitles for videos). I am sure better solutions exist; this just turned out to be the best of what I managed to get working quickly.
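For reference, the stock autosub produces timed subtitles in one call, roughly like this (a sketch going by autosub's documented flags; my modified version differs in the details):

# Send the clip's audio to Google Speech via autosub and save timed
# subtitles; the .srt timestamps can then drive the accurate cut points.
$ autosub -S ru -D ru -o clip.srt clip.mp4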
If you want to develop something similar with ffmpeg, stick to the principle of not trying to do too much in a single ffmpeg call. Produce intermediate files and check each step, so that you don't end up hunting for strange elusive bugs such as incorrect cuts or effects that were never applied.
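A minimal illustration of this principle (file names and timestamps are made up): each ffmpeg call does exactly one thing, and every intermediate file can be inspected before the next step:

# Step 1: trim each clip into its own intermediate file
# (-c copy cuts on keyframes; re-encode if you need frame-accurate cuts).
$ ffmpeg -i raw1.mp4 -ss 2.3 -to 9.8 -c copy cut1.mp4
$ ffmpeg -i raw2.mp4 -ss 1.0 -to 7.5 -c copy cut2.mp4
# Step 2: inspect the cuts, then concatenate them in a separate call.
$ printf "file 'cut1.mp4'\nfile 'cut2.mp4'\n" > list.txt
$ ffmpeg -f concat -safe 0 -i list.txt -c copy joined.mp4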
I launch the resulting contraption something like this:
$ bin/vlog-recorder \
    --project /path/to/project \
    --debug true \
    --sound-settings ' --device=usb_card --format=dat' # arguments for arecord

r - (RE)START recording
s - STOP and SAVE current clip
S - STOP and SAVE current clip, don't use auto trimming
d - STOP and DELETE current clip
p - PLAY last saved clip
f - FOCUS camera on center
h - show HELP
q / Ctrl+C - QUIT

[ stopped ] [ battery: 100% / 36°C ]
As for the arguments to arecord: I have to specify the device explicitly to avoid periodic glitches, which are most likely caused by ALSA's dsnoop plugin. You can also keep the log open to monitor how files are downloaded from the phone: tail -f /path/to/project/log.txt
You can quickly render everything into a single video for preview like this:
$ bin/vlog-render \
    --project /path/to/project \
    --language ru \
    --video-filters 'hqdn3d,hflip,curves=psfile=/path/to/curves.acv,vignette' \
    --speed 1.3 \
    --fps 60 \
    --preview true
The --video-filters argument takes the filters that are passed on to ffmpeg. The video opens automatically in the mpv player.
You can also reorder fragments, or throw out the remaining unnecessary duplicates, by editing the generated file /path/to/project/render.conf; the duplicates can be spotted via the recognized speech. The idea, by the way, is not new. You can also speed up individual fragments and fix unsuccessful cuts, if there are any. The next time it runs, vlog-render re-reads render.conf and applies the changes.
To prepare fragments for a video editor, pass --preview false. Besides the fragments, which end up in the output directory, it also concatenates them into a single file, output.mp4, because initially I was not sure which would be more convenient to edit: the individual fragments or one long video.
I mainly use the first option. The second came in handy in one video with bad lighting, where I used only a piece of output.mp4. For the second option the vlog-play-segments script can also be useful: it lets you quickly review all the pauses between clips in descending order of duration. This helps fine-tune render.conf more precisely and saves time later when editing that long piece of video in Pitivi.
The resulting small clips can be loaded onto the Pitivi timeline all at once: select all the imported clips and drag them there with drag-and-drop.
I did not want to hunt for a suitable phone stand, and my hands were already itching to record something. So I took a piece of cardboard that happened to be at hand and cut out a phone mount to fit my needs:
The stand is mounted on the laptop display to minimize the distance between the script and the camera.
Acceptable sound is absolutely critical. The microphone at hand was a Boya BY-M1. Although it is advertised as an omnidirectional microphone, in practice good sound is obtained only when it is used as a unidirectional one.
The microphone stand is even simpler to make: take a pomegranate juice bottle that happened to be at hand, wind some adhesive tape around it, and put the construction set together:
You can also put a towel under this construction to dampen some of the vibrations from the table and adjust the height at the same time.
In my case, that is an ASUS Xonar U3. It turned out, however, that it is not directly compatible with such a microphone: the microphone has a CTIA plug, designed for phones. The problem was solved by an adapter to separate TRS plugs for microphone and headphones, and it was not easy to find: manufacturers of such adapters rarely publish the details. In my case, a Cablexpert CCA-418W did the job.
Another problem with this card is a DC offset in the right channel when recording. It does not get in the way, since I record in mono anyway; and for software that cannot be set to mono, the good channel was redirected into the bad one using ALSA.
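That redirection can be expressed with ALSA's route plugin, roughly like this (a hypothetical sketch, not my exact config; the device address is a placeholder, and the left channel is assumed to be the good one):

# /etc/asound.conf (hypothetical)
pcm.mono_fix_capture {
    type route
    slave.pcm "hw:2,0"   # placeholder for the Xonar U3 capture device
    ttable.0.0 1         # client channel 0 <- slave channel 0 (the good one)
    ttable.1.0 1         # client channel 1 <- slave channel 0 as well
}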
This card is also afraid of overheating: keep it away from the cooler, otherwise it starts to lag and records sound in jerks.
I edit the sound in headphones (Pioneer SE-M390 in my case), at a volume higher than the one at which I usually listen to music. The algorithm goes something like this.

Extract the sound from the recorded video:

$ ffmpeg -async 1 -i input.mp4 output.flac

After processing, merge the sound back, copying the streams without re-encoding:

$ ffmpeg -i sound.flac -an -i video.mp4 -c copy output.mkv
Taking advantage of the relaxed rules, I invite you to visit the resulting YouTube channel, where I share experience on learning programming effectively and on related disciplines.
Good luck in developing programs and creating video blogs!
Update: I have translated this article for my English-language blog.
Source: https://habr.com/ru/post/437918/