The network transfers polygon data (for simplicity, assume these are float arrays). The question is where it is better to produce the final JSON form of this data: on the server, sending JSON over the wire, or on the front end. Important note: the front end needs JSON specifically.

It seems better to do this on the front end, so as not to drag strings across the network, transferring float arrays instead. But I am not sure about JS performance: how efficient is it to assemble everything into strings there? A sketch of what the client-side variant might look like is below.
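A minimal sketch of the client-side variant, assuming the server sends raw float64 (lon, lat) pairs in the response body; the endpoint and the flat array layout are assumptions for illustration, not part of the original question:

```js
// Hypothetical client-side assembly: the URL and the flat
// [lon, lat, lon, lat, ...] layout are invented for this sketch.
async function fetchPolygonAsGeoJSON(url) {
  const response = await fetch(url);
  const buffer = await response.arrayBuffer();
  const coords = new Float64Array(buffer); // zero-copy view over the bytes

  // Re-group the flat array into [lon, lat] pairs for a GeoJSON ring.
  const ring = [];
  for (let i = 0; i < coords.length; i += 2) {
    ring.push([coords[i], coords[i + 1]]);
  }

  // The front end needs JSON in the end, so assemble it here.
  return {
    type: 'Feature',
    geometry: { type: 'Polygon', coordinates: [ring] },
    properties: {},
  };
}
```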


For those who are going to take part in the bounty, please note: the bounty was announced so that people would share their experience of solving such problems. Bounty description:

**During conversations with community members, it turned out that this topic is relevant and may be of interest. In this regard, I would like to receive as an answer the most detailed possible description of a problem that you actually solved, as well as the decision that was made.**

In particular, I mean that we are talking about high load, i.e. about systems that reliably sustain at least Q requests and respond within no more than a specified time interval T.

I am interested in solutions with T == 1 s and Q >= 50, preferably more than 100.

  • we use protobuf for transfer between the server and the client, and the GeoJSON is already assembled in the browser - Stranger in the Q
  • @StrangerintheQ We are also on protobuf - hedgehogues
  • take a look towards google.imtqy.com/flatbuffers - Stranger in the Q
  • Publish typical values (numbers!) of Q (as well as request/response sizes) and T for the server and the clients (average and extreme). - avp
  • A question from the series "where is it better to change the page state - on the client, or request new markup from the server" - a typical double-edged sword - yolosora

4 Answers

The question is, in fact, very complex. There are pros and cons of both solutions.

  1. If you do everything on the server, then all clients receive homogeneous data and need no additional processing. Consequently, there is less chance that a client screws something up.
  2. Extra load on the server. Sometimes moving additional processing to the client is a perfectly valid decision to reduce that load.

As an example, at my current place of work we use maps. We need point clustering, and the question of where to do it (on the server or on the client) is a very serious one. There are millions of users in the database, each with their own attributes (coordinates, city, country, gender, etc.). The whole thing has to be clustered taking those attributes into account.

Previously it was hardcoded on the server that no more than a certain number of users would be returned.


The question of clustering arose. There are two options:

  1. The server returns a heap of users, and the client applications cluster them somehow (iOS, Android, etc. each implement it on the client).
  2. Cluster on the server. Clients only zoom.

The server, in principle, scales well, so we decided to do it there. Our backend is on Node. We use supercluster, which pulls the data from the database and clusters everything; it copes with this in real time. So that nothing sags, we use Worker Threads. A minimal sketch of this setup follows below.
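To make this concrete, here is a minimal sketch (not our production code) of clustering inside a Worker Thread with supercluster; the point data, radius, zoom, and bbox values are placeholders, and a CommonJS build of supercluster is assumed:

```js
// cluster-worker.js - run with: node cluster-worker.js
// Hypothetical sketch: the point data and all numeric values are invented.
const { Worker, isMainThread, parentPort, workerData } = require('node:worker_threads');

if (isMainThread) {
  // Main thread: offload clustering so the event loop does not sag.
  const points = []; // GeoJSON Point features pulled from the database
  const worker = new Worker(__filename, {
    workerData: { points, bbox: [-180, -85, 180, 85], zoom: 3 },
  });
  worker.on('message', (clusters) => {
    // `clusters` is plain GeoJSON, ready to be serialized and sent as JSON.
    console.log(`got ${clusters.length} clusters`);
  });
} else {
  // Worker thread: build the cluster index and answer for the viewport.
  const Supercluster = require('supercluster');
  const { points, bbox, zoom } = workerData;

  const index = new Supercluster({ radius: 40, maxZoom: 16 });
  index.load(points);
  parentPort.postMessage(index.getClusters(bbox, zoom));
}
```

The point of the worker split is that building the index in `load()` is CPU-heavy, so keeping it off the main thread keeps request latency stable.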

I cannot post screenshots from our project, but it looks something like this:

[screenshot: clustered points on a map]

P.S. All our API traffic goes over JSON. If your requests are not too frequent, it is better to use JSON; the size overhead can be neglected.

  • It would be cool if someone described specific decisions that were made in specific situations - hedgehogues
  • Perhaps I will even attach a bounty to this question, because it is ambiguous and there are a huge number of variations on the topic that are important to discuss - hedgehogues
  • @hedgehogues added it to the answer. As an example, at my current place of work we use maps. We need to cluster points on the map, and the question of where to do it (on the server or on the client) is a very serious one. - Suvitruf ♦
  • I asked this question for exactly the reason you describe. We also needed to cluster points on the map; the question is where to do it - hedgehogues
  • @hedgehogues there is not much difference, I showed it on the chart: ru.meta.stackoverflow.com/a/7950/15479 In general, people are active every day) - Suvitruf ♦

It is better to always give the front end ready-to-consume data and do all the rest of the data logic on the server. The front end should be "dumb": it receives data and displays it.

This is better because the average user today sees the site not from a desktop but from a phone, and often it is not a top-end iPhone but some weak, cheap Android on which everything is slow. The less unnecessary work you do on the client, the faster the site works for users and the more satisfied your visitors are. This is especially important when we are talking not about home-page-level websites but about real high load: losing one percent of users because your site is not responsive enough can have very tangible financial consequences.

  • yes, yes, tell that to the developers of Google Earth Web, for example, or Cesium - Stranger in the Q
  • Bitcoin mining is always better to offload to the client... - Drakonoved
  • I answered the question as it was asked) It all depends on the project and the task) - Ruslan Galeev
  • Always... oh, I would not say so - hedgehogues
  • Clustering on Yandex Maps, at a minimum, is performed on the front end. So do not generalize - hedgehogues

The question here is not fully specified...

I believe that in this case architectural decisions need to be made based on the volume of transmitted (and updated over time) data.

Because it is not clear whether you can simply download everything at once, put it into IndexedDB, and then hammer away at it locally, which is a very simple solution.

Or whether you need to constantly send geo queries to the server along the lines of "show me what currently falls into such-and-such bbox", because the features run to terabytes or change constantly. A hedged sketch of this option follows below.
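If it came to the second option, the client-side request might look something like this minimal sketch; the endpoint, parameter names, and map helpers are invented for illustration:

```js
// Hypothetical sketch of the "query the server per viewport" option.
// `/api/features`, the parameter names, and the `map` helpers are assumptions.
async function loadVisibleFeatures(map) {
  const [west, south, east, north] = map.getBounds(); // assumed helper
  const query = new URLSearchParams({
    bbox: [west, south, east, north].join(','),
    zoom: String(map.getZoom()),
  });
  const response = await fetch(`/api/features?${query}`);
  return response.json(); // a GeoJSON FeatureCollection from the server
}
```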

And it is not even clear whether you are solving an optimization problem or just drawing on the map for now.

I mainly deal with computer graphics and geo-information systems.

The projects are focused on real-time display on a globe: a rocket flew, where its detachable stages fell, which measurement instruments tracked it, what trajectory they recorded, and so on.

There is a wagonload of raw data, but what actually reaches me is, one could say, a grain of sand. Our backend is a complex zoo of various software and hardware scattered around the planet; an AMQP broker handles data transfer in the system and, besides my data, of course serves many other message consumers and producers. With this distribution of load, I restore the protobuf packets to GeoJSON and the like on the fly myself, because different users need the information in different forms anyway, and nobody wants to push extra bytes over the network/satellite link; one also has to take into account that all of this is duplicated by some intricate logic whose details are unknown to me =)

Addition: it seems that in your particular case, given the criteria you specified, you cannot rely on the speed of the client device, so it is better not to perform the work there.

As for clustering and geodata: have you looked towards PostGIS? It is a database on steroids, built specifically for geospatial data and queries; a rough sketch of a bbox query through it follows below.
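For illustration, a bbox query against PostGIS from Node might look like this minimal sketch; the table and column names are assumptions:

```js
// Hypothetical sketch: the `points` table and `geom` column are invented names.
const { Pool } = require('pg');
const pool = new Pool(); // connection settings come from the environment

async function featuresInBbox(west, south, east, north) {
  // ST_MakeEnvelope builds the bbox, && uses the spatial index,
  // and ST_AsGeoJSON serializes the geometry to JSON on the database side.
  const { rows } = await pool.query(
    `SELECT id, ST_AsGeoJSON(geom) AS geometry
       FROM points
      WHERE geom && ST_MakeEnvelope($1, $2, $3, $4, 4326)`,
    [west, south, east, north]
  );
  return rows;
}
```

Pushing ST_AsGeoJSON into the query means the server never materializes an intermediate representation; whether that beats clustering in Node depends on the workload.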

  • That is exactly why I announced the bounty: so that participants would describe the problems they faced. Read my wishes for an answer in the bounty description: during conversations with community members, it turned out that this topic is relevant and may be of interest. In this regard, I would like to receive as an answer the most detailed possible description of a problem that you solved, as well as the decision that was made. - hedgehogues
  • @ Added the answer, with some ad-libbing =) - Stranger in the Q
  • Protobuf is stitched into our infrastructure. That is, we have guys who made a framework: a Go service is generated from the proto file, along with the structures for it - hedgehogues
  • That is just what is needed. Ad-libbing it is... - hedgehogues

The successful practice of Unix development says that, for communication between system components, it is desirable (wherever it is not impossible for one reason or another) to use textual data formats that humans can read and edit.

I think that in your case you should send JSON over the network (i.e., human-readable text).

  • You have probably dealt with highly specialized tasks and never with 1000 rps, to say nothing of 10,000 rps and everything that implies. JSON there will greatly complicate your life! - hedgehogues
  • I would not say the answer deserves a downvote, but it clearly approaches the problem from only one, not very broad, perspective. - hedgehogues
  • @hedgehogues, as you know, optimization should be tackled after appropriate measurements that find the bottleneck in the problem. In general, when developing, it is highly recommended to first create a workable prototype and then successively identify problem areas and improve it; a sort of iterative process. (By the way, what exactly do you call rps (rpc)?) (And regarding data transfer speed: is the data flow in your task really larger than, say, when streaming HD video? That would let the system as a whole be assessed.) - avp
  • rps - requests per second. In loaded systems the obvious question immediately arises: why use JSON when it is obviously a bottleneck, both in transfer speed and possibly in the amount of memory used? Plus, factor in the time for serialization and deserialization. Of course, there are very fast serializers nowadays, but protobuf will most likely still be faster. It has its drawbacks, of course, but for the current question they seem of little interest. RPC is something different - remote procedure call (a story related to protobuf). In my task, the system should sustain 1000 rps. (A measurement sketch follows after this thread.) - hedgehogues
  • In this case, keep in mind that the data is stored in in-memory storage on a separate node (in particular, in Redis). I have never worked with HD video and do not know the loads there. - hedgehogues
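In the spirit of the measure-first advice above, a minimal Node sketch along these lines (the payload size is invented) would give actual numbers rather than intuition for the JSON-vs-binary comparison discussed in this thread:

```js
// Hypothetical micro-benchmark: serialize 100k coordinate pairs to JSON
// and compare the serialized size against the raw binary representation.
const coords = Float64Array.from({ length: 200_000 }, () => Math.random() * 180);

console.time('JSON.stringify');
const json = JSON.stringify(Array.from(coords));
console.timeEnd('JSON.stringify');

console.log(`JSON: ${json.length} chars (~bytes), raw binary: ${coords.byteLength} bytes`);
```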