What can you learn when developing an audio player for different browsers?

This story began about a year and a half ago. It is about playing music in various browsers and on the platforms they run on. The path was full of “pain and suffering”: the realization that a seemingly easy task may not be so easy, and that the “minor” details you attach no importance to at the very beginning can affect everything.

Minor details for the most curious :)
1. Loading data about each next track from the network.
2. Each audio element, whether created with new Audio() or as an <audio> tag, needs user permission: a user action on the page.

Background


Probably anyone who has ever written an audio player for browsers has run into cross-browser and cross-platform problems.

I was no exception: while working on a new MVP, I ran into various quirks of audio playback in browsers.

It all started with the need for smooth mixing (a crossfade) of two tracks during playback: that is the first feature. Our team wanted tracks to change the way they do on the radio. The second feature: each subsequent track is requested from the network.



Research


At that time, almost all of our projects used the Sound Manager 2 library.

It became clear almost immediately that playing two audio files at the same time does not work the same way everywhere on mobile devices!

In Chrome (around version 62) on PC, tracks played as expected. On mobile devices (also in Chrome), playback worked, but only while the screen was active. When the screen was locked, the next track for the current player was not played. On iOS / macOS, playback behaved the same way. More information can be found here, in the section “Single audio stream”.

So began a long journey of gathering information, bit by bit, about how browsers handle audio.

Next, I tried a solution based on Web Audio, without any libraries. Yes, this technology is designed for other purposes (synthesis, sound processing, games, etc.) rather than for simply playing tracks. But for the sake of experiment it was worth trying, since it lets you mix sounds from different sources into one sound output: speakers, headphones, a phone speaker, etc. There are people who purposefully explore the possibilities of audio playback on mobile devices using the Web Audio API.
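To make the idea of a crossfade concrete, here is a minimal sketch of the gain math only. It assumes an equal-power curve, which the article does not specify, and the name crossfadeGains is hypothetical.

```typescript
// Gains for the two tracks at a given point of the crossfade.
interface CrossfadeGains {
  outgoing: number; // gain of the track that is ending
  incoming: number; // gain of the track that is starting
}

// progress: 0 at the start of the crossfade, 1 at its end.
// Assumption: an equal-power curve, so outgoing^2 + incoming^2 stays at 1
// and the perceived loudness stays roughly constant during the mix.
function crossfadeGains(progress: number): CrossfadeGains {
  const t = Math.min(1, Math.max(0, progress)); // clamp to [0, 1]
  return {
    outgoing: Math.cos(t * 0.5 * Math.PI),
    incoming: Math.sin(t * 0.5 * Math.PI),
  };
}
```

With Web Audio, each value would typically be written to the gain.value of a GainNode placed between a source and the destination.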

After implementation, certain nuances emerged.

First, you have to wait until the entire track is loaded. On a slow Internet connection there will be noticeable pauses, because the second track may not finish loading by the time the first one ends. A full download can be avoided by using several HTML5 Audio tags as sound sources for Web Audio, but then it again becomes impossible to play two sounds at the same time.

Secondly, if you download a track over the network in fragments and decode them programmatically, the CPU load increases. On a PC this was acceptable, but on mobile devices it is critical.

Thirdly, there were problems with decoding. Fragments of mp3 / ogg / wav files arriving at the client were decoded and played without trouble. But chunks of an mp4 file serving as a container for HE-AAC could not be decoded in the browser. To some extent this also applies to the Opera browser, where MP3 playback is unstable from version to version: sometimes it plays, sometimes it reports that the format is not supported.

Fourthly, the track name on the lock screen, in the native audio player widget (on an iPad), was not displayed and did not change, including when switching between tracks. This may be because the tests used an iPad with iOS 9; there was no other device at the time.

As a result, Web Audio had to be abandoned at this stage. After all, crossfade is not well suited to browsers: ordinary music in good quality weighs quite a lot.

Having given up on the crossfade, we implemented a simple fade in and fade out, at the beginning and at the end of a track respectively.
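A fade in / fade out of this kind can be sketched as a pure function of the playback position. The linear ramp, the function name fadeVolume and the fadeSeconds setting are assumptions made for illustration, not the code used in the project.

```typescript
// Volume in [0, 1] for a track at `position` seconds: a linear fade in
// over the first `fadeSeconds` and a linear fade out over the last ones.
function fadeVolume(position: number, duration: number, fadeSeconds: number): number {
  if (duration <= 0 || fadeSeconds <= 0) return 1; // nothing to fade
  // On very short tracks, keep the two fades from overlapping.
  const fade = Math.min(fadeSeconds, duration / 2);
  const fromStart = Math.max(0, position);
  const toEnd = Math.max(0, duration - position);
  return Math.min(fromStart / fade, toEnd / fade, 1);
}
```

The result could be assigned to the volume property of an audio element on every progress update.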

The code from the previous step was slightly modified and tested. The tests brought various nuances to the surface (shown in the table). All of this was done with the Sound Manager 2 library.



We added logging of all events to determine the moment of transition between tracks and to understand at what point playback stops.
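Such logging can be sketched roughly as follows. The event names are standard HTMLMediaElement events; the helper logMediaEvents, the interfaces and the choice of which events to record are assumptions for illustration.

```typescript
// Minimal shape of an event source. A real HTMLAudioElement is expected
// to fit it, and a simple fake is enough for experiments.
interface MediaEventSource {
  addEventListener(type: string, listener: (event: { type: string }) => void): void;
}

interface LoggedMediaEvent {
  type: string; // event name, e.g. "ended"
  at: number;   // timestamp taken from `now()`
}

// Standard media events that help to see where playback stops.
const MEDIA_EVENTS_TO_LOG = [
  "loadstart", "canplay", "play", "playing", "pause",
  "waiting", "stalled", "suspend", "ended", "error",
];

// Subscribes to every event above and returns the array that collects them.
function logMediaEvents(source: MediaEventSource, now: () => number = Date.now): LoggedMediaEvent[] {
  const entries: LoggedMediaEvent[] = [];
  for (const type of MEDIA_EVENTS_TO_LOG) {
    source.addEventListener(type, (event) => {
      entries.push({ type: event.type, at: now() });
    });
  }
  return entries;
}
```

Comparing the timestamps of "ended" on one track and "playing" on the next shows how long the transition took, or that it never happened.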



Tab activation
In Safari 9+, the sound doesn’t always appear when you activate a tab.

From this, it can be assumed that JS execution in the background is throttled, or that the execution flow (events and timers) stops completely. Later it will become clear that this conclusion was only partly correct. One more nuance of playing tracks, which explains why the sound does not appear, is discussed below.

Remark
To draw a progress bar for a track, for example, it is better to use requestAnimationFrame than setInterval / setTimeout. This avoids the cumulative effect that appears when a tab is deactivated (sent to the background), temporarily suspended, and then activated again: all the postponed calculations and redraws of the progress state being performed at once.
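A rough sketch of such a redraw loop is shown below. The scheduler is passed in as a parameter so that the loop can run outside a browser; all names here (progressFraction, startProgressLoop, FrameScheduler) are hypothetical.

```typescript
// Fraction of the track already played, clamped to [0, 1].
function progressFraction(currentTime: number, duration: number): number {
  // The duration of a media element is NaN until its metadata is loaded.
  if (!isFinite(duration) || duration <= 0) return 0;
  return Math.min(1, Math.max(0, currentTime / duration));
}

type FrameScheduler = (callback: () => void) => void;

// Redraws once per scheduled frame. Each frame reads the current state,
// so after a pause in the background there is no backlog of stale updates.
function startProgressLoop(
  readState: () => { currentTime: number; duration: number },
  draw: (fraction: number) => void,
  schedule: FrameScheduler,
): () => void {
  let running = true;
  const tick = (): void => {
    if (!running) return;
    const state = readState();
    draw(progressFraction(state.currentTime, state.duration));
    schedule(tick);
  };
  schedule(tick);
  return () => {
    running = false; // the next scheduled tick becomes a no-op
  };
}
```

In a browser, the scheduler would wrap requestAnimationFrame, readState would return the audio element, and draw would update the width of the bar.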

At the same time, a question arose: what about autoplay on PCs and mobile devices?
Autoplay here means starting a track automatically when the page loads, without any user action.
In Safari, automatic playback on page load is not possible; user interaction with the page is required, just as on mobile devices. This applies to both video and audio content.

So, at that point the situation was as follows:

  1. it is impossible (or undesirable) to play two or more sounds simultaneously;
  2. pseudo “autoplay” of a track needs user permission, i.e. a first interaction; later we called it “selling a finger to the device”;
  3. in the background (background tab / locked screen), JS, depending on the browser:
    either freezes completely;
    or is throttled;
    or works the same as in an active tab;
  4. playback can be started automatically without sound, but it is unclear what for (in the case of audio content);
  5. somewhere far away a thought begins to loom: how can JS be made to keep running in the background?
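The “first interaction” from point 2 can be sketched as follows. This is only an illustration: the helper playOnFirstGesture and both interfaces are hypothetical, and the list of events that a browser counts as a user gesture is an assumption that varies between browsers.

```typescript
// Anything that can be listened to, e.g. `document`.
interface GestureTarget {
  addEventListener(type: string, listener: () => void): void;
  removeEventListener(type: string, listener: () => void): void;
}

// Anything that can be played, e.g. an audio element.
// Older browsers return undefined from play() instead of a promise.
interface PlayableElement {
  play(): Promise<void> | undefined;
}

// Calls play() exactly once, inside the first user gesture,
// and then stops listening for further gestures.
function playOnFirstGesture(
  target: GestureTarget,
  element: PlayableElement,
  gestures: string[] = ["click", "touchend", "keydown"], // assumed set of gestures
): void {
  const handler = (): void => {
    for (const type of gestures) target.removeEventListener(type, handler);
    const result = element.play();
    if (result !== undefined) {
      result.catch(() => {
        // Playback was still refused; real code would report this.
      });
    }
  };
  for (const type of gestures) target.addEventListener(type, handler);
}
```

The important part is that play() is called synchronously inside the gesture handler, not later from a timer or a network callback.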

The player's functions were then implemented with other libraries, on the assumption that one of them might already solve this task. Although many GitHub issues describing problems with playing tracks in various browsers were reviewed, there was still hope of getting to the bottom of it: why it does not work and how to make it work. As it turned out, no...

A few code examples with video demonstrations of the libraries:

  1. Sound Manager 2 - github pages, github repository, video: macOS Safari 12; iOS Safari 10 with unlocked screen
  2. Howler
    Howler v2.0.9 - github pages, github repository, video: macOS Safari 12, iOS Safari 10
    Howler v2.0.15 - github pages, github repository, video: macOS Safari 12
    Howler v2.1.1 - github pages, github repository, video: macOS Safari 12, iOS Safari 10

For macOS, the videos were recorded without sound, so watch the volume indicator (the speaker icon) on the tab.

More video examples are available in the repository.

In the interactive example for Howler v2.1.1 you can sometimes hear several sounds at the same time. This is due to the addition of a pool of audio elements unlocked by the user (this should be fixed in future versions of the library).
What is the reason these libraries do not work?

Above, I wrote: “In the background (background tab), JS either freezes completely or is throttled.” Here another point comes up: the libraries create new audio objects via new Audio(). If the objects are created dynamically, i.e. an already existing audio object is not reused, while the user is not interacting with the site (the tab is inactive or the screen is locked), some browsers may decide that audio from this element should not be played until the tab is active again or the user performs some action.

An example of a test using new Audio() is available on github pages and in the repository on github. Video: macOS Safari 12; iOS Safari 10 with unlocked screen.

It seems that no universal tool exists and some compromise solution has to be found.

So we sat down with the team to discuss what is really important in an audio player. The experiments could go on indefinitely, but we needed to move forward.

First, the important points that prevented us from achieving the desired result were identified:

  1. Safari on macOS does not play tracks in an inactive tab;
  2. music cannot be played in the background (with the screen locked) on smartphones running iOS and Android. We would like to avoid aggressively redirecting users to the mobile application (later), since previous experience shows that quite a large share of users do not want to install one;
  3. the player does not work correctly with a dynamic playlist, i.e. when it is not known in advance what the next track will be.

Next, we formulated the goals that had to be achieved:

  1. ensure that the player works in the background - in various browsers and on various platforms;
  2. allow the user to choose what to use: listen to music on the site or in a mobile application;
  3. provide the ability to use the player (or approach) in various future projects.

A new phase of searching for a solution began. At this stage no libraries were used; all research was done with plain HTML5 Audio. The bottom line is that a working variant was found using dedicated workers. iOS once again kept this solution from winning: background playback does not work there. But it did work on Android (Chrome, Opera, Safari).

An HTML5 Audio + Dedicated Workers test sample is available on github pages and in a github repository.

During initialization, the Worker requests data about the current track. The Worker also sends the main thread a signal asking for the progress state (how long the track has been playing) and, based on this data, decides when to request data about the next track from the network.
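The decision the Worker makes can be sketched as a small piece of logic that is independent of the Worker API itself. The message shape, the trackId field, the lead time and both function names are assumptions for illustration; the test sample in the repository may be organised differently.

```typescript
// Progress state that the main thread reports back to the Worker.
interface ProgressReport {
  trackId: string;     // hypothetical identifier of the current track
  currentTime: number; // seconds already played
  duration: number;    // total length in seconds
}

// True once no more than `leadSeconds` of the track remain.
function shouldRequestNextTrack(report: ProgressReport, leadSeconds: number): boolean {
  if (!isFinite(report.duration) || report.duration <= 0) return false;
  return report.duration - report.currentTime <= leadSeconds;
}

// Returns a handler for incoming reports that calls `requestNext`
// at most once per track.
function createNextTrackScheduler(
  leadSeconds: number,
  requestNext: (afterTrackId: string) => void,
): (report: ProgressReport) => void {
  let requestedFor: string | null = null;
  return (report: ProgressReport): void => {
    if (requestedFor === report.trackId) return; // already requested for this track
    if (shouldRequestNextTrack(report, leadSeconds)) {
      requestedFor = report.trackId;
      requestNext(report.trackId);
    }
  };
}
```

Inside a dedicated Worker, the returned handler would be called from onmessage with the data of each report, while a timer in the Worker posts the “report progress” signal to the main thread.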



The following example (github pages, repository on github) was also tested at that time: an HTML5 audio tag is embedded in the DOM (video: macOS Safari 12, iOS Safari 10) and its src is simply replaced when switching between tracks. Today this example works in Safari 12 on macOS. Unfortunately, there is now no way to check it in Safari 10 and 11 on macOS, but back then it did not work in the tests (autoplay policies, autoplay restrictions).

To summarize: on iOS and macOS, Safari does not treat a new audio element instance as user-activated if it was created in the background inside an event handler or callback, for example ajax, setTimeout or onended.
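The alternative to creating new elements is to keep one element and only swap its src, as in the example with the tag embedded in the DOM. A minimal sketch, with hypothetical names (attachDynamicPlaylist, ReusableAudio, NextUrlProvider) and a callback standing in for the network request:

```typescript
// The part of an audio element this sketch relies on.
interface ReusableAudio {
  src: string;
  play(): Promise<void> | undefined;
  addEventListener(type: string, listener: () => void): void;
}

// Asks for the URL of the next track (e.g. over the network)
// and passes it to `done` when it is known.
type NextUrlProvider = (done: (url: string) => void) => void;

// Wires a dynamic playlist to ONE element: no `new Audio()` per track.
function attachDynamicPlaylist(
  element: ReusableAudio,
  requestNextUrl: NextUrlProvider,
): (url: string) => void {
  const playUrl = (url: string): void => {
    element.src = url; // same element, new source
    const result = element.play();
    if (result !== undefined) {
      result.catch(() => {
        // Playback was refused or interrupted; real code would report this.
      });
    }
  };
  // When a track ends, fetch the next URL and reuse the element.
  element.addEventListener("ended", () => requestNextUrl(playUrl));
  return playUrl;
}
```

The first call to the returned function should still happen inside a user action; whether later src swaps keep playing in the background depends on the browser, as described above.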

Further, regarding iOS Safari and iOS Chrome, it turned out that tracks can be played in the background (with the screen locked) only via HLS. On iOS and macOS this format is standard and is supported by the operating system. A native implementation is also available in Android Chrome and Edge. On PCs in Chrome you can use software implementations, for example hls.js, Bitmovin Player, etc.
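Choosing between native HLS and a software implementation can be sketched as a simple check. The function chooseHlsStrategy and its types are hypothetical; canPlayType and the application/vnd.apple.mpegurl MIME type are standard.

```typescript
type HlsStrategy = "native" | "media-source" | "unsupported";

// The part of a media element used for the check.
interface CanPlayProbe {
  canPlayType(mimeType: string): string;
}

// Prefers the native implementation and falls back to a software one.
function chooseHlsStrategy(probe: CanPlayProbe, mediaSourceAvailable: boolean): HlsStrategy {
  // canPlayType answers "", "maybe" or "probably"; an empty string means "no".
  if (probe.canPlayType("application/vnd.apple.mpegurl") !== "") return "native";
  // Software players such as hls.js need Media Source Extensions.
  if (mediaSourceAvailable) return "media-source";
  return "unsupported";
}
```

With the “native” result the playlist URL is assigned to src directly. With hls.js, Hls.isSupported() can serve as the second argument, and the stream is attached to the element through loadSource and attachMedia.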

Sample code is available via the link to the github repository. It covers the simplest use case: playing a stream generated on the server, without seeking, switching to the next track, etc. Examples are given using the audio tag, the video tag, the hls.js library, and the player from Bitmovin. Node.js is required to run them.

Findings


First, unfortunately, because of the variety of browsers there is no universal solution that would let you listen to music equally well everywhere. There are limitations everywhere, but as practice shows, it is possible to live with them comfortably.

Second, it is sometimes worth checking edge cases as early as possible, for example with a native implementation. Find a minimally acceptable set of requirements and quickly test whether it works, rather than taking some library as the basis. This gives more insight into how these libraries are built inside and why certain functions work or do not work. Otherwise you can get quite far into the project before realizing that something is going wrong, and abandoning the library may then be quite expensive: a significant portion of the code will have to be rewritten.

Third, be sure to pay attention to the audience of your service: which browsers and operating systems your users come from. This is quite easy to track through various metrics and error monitoring systems. It will show which platforms and browsers are important to support and where you can save effort.

And finally


I am announcing a small contest related to playing music on iOS using HLS.

The description is available via the link to github.

Source: https://habr.com/ru/post/438952/