An architectural question came up: where should a parser run, on the server or in the browser? (It will be written in JS.)

P.S. The question arose because my colleagues were surprised that the parser I wrote runs in the browser, even though I am a backend programmer.

Closed because the question needs to be reformulated so that an objectively correct answer can be given. Closed by participants Dmitriy Simushev, Grundy, Alexander Petrov, HamSter, Kromster on 26 Oct '16 at 5:10.

The question gives rise to endless debate and discussion based on opinions rather than knowledge. To get an answer, rephrase your question so that it can be given an unambiguously correct answer, or delete it altogether. If the question can be reformulated according to the rules set out in the help center, edit it.

  • A parser of what? Who should run it? Who actually launches it? Where should it deliver the results? Where does it deliver them now? - Grundy
  • Probably, it all depends on what tasks it should perform. - Anton Shchyrov
  • For example, taking data from the DOM tree of a third-party resource. - Timur Musharapov
  • If such a parser runs longer than the user is willing to wait, then it should definitely run on the server and not in the browser. - ilyaplot
  • And why such a meager set of options? The parser can also be run from the console, or via cron, or even by sending an SMS to a number (I wrote one like that once). - Darth

1 answer

The perfect parser can be written in any language.

Imagine two situations:

1. Instant one-time data acquisition from a third-party resource
Suppose a user pastes a link to a picture into a message. It makes sense to download the picture on the client side, display it in that same message, and send it to the server as base64, for example.

2. Pumping out a spherical horse in a vacuum
Suppose a single job needs to collect an unbounded amount of information. In this case, I would create a queue in RabbitMQ and put a list of tasks into it. The list of tasks can be built on the client side. The queue can then be processed by several scripts that download and parse the data. It does not matter what language they are written in. It could even be client-side JS running in the browser of a visitor to your personal blog. The main thing is that it is efficient. Avoid situations where the user can close the browser after 98 hours of parsing, a second before successful completion :)
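
The queue-and-workers idea from situation 2 can be sketched roughly as follows. This is only an illustration: `TaskQueue`, `runWorker`, `drainQueue`, and the `fetchAndParse` callback are hypothetical names, and the queue here is a plain in-memory array standing in for a real broker such as RabbitMQ, which would also give you persistence and acknowledgements.

```javascript
// Hypothetical in-memory stand-in for a broker queue (not RabbitMQ itself).
class TaskQueue {
  constructor(tasks) {
    this.pending = [...tasks]; // tasks waiting to be processed, in FIFO order
    this.results = [];         // finished tasks with their parsed results
  }

  // Hand out the next task, or undefined once the queue is drained.
  take() {
    return this.pending.shift();
  }

  // Record the outcome of a finished task.
  complete(task, result) {
    this.results.push({ task, result });
  }
}

// One worker: keep taking tasks until none are left.
// fetchAndParse is an assumed callback that downloads and parses one task.
async function runWorker(queue, fetchAndParse) {
  let task;
  while ((task = queue.take()) !== undefined) {
    const result = await fetchAndParse(task);
    queue.complete(task, result);
  }
}

// Start several workers against the same queue and wait for all of them.
async function drainQueue(tasks, fetchAndParse, workerCount = 3) {
  const queue = new TaskQueue(tasks);
  const workers = Array.from({ length: workerCount }, () =>
    runWorker(queue, fetchAndParse)
  );
  await Promise.all(workers);
  return queue.results;
}
```

Because each worker only pulls the next task when it is free, a worker that disappears (for example, a closed browser tab) loses at most the task it was holding, provided the real queue re-delivers unacknowledged tasks.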

  • By the way, parsing through your visitors' client-side JS can replace proxy lists. <s>evil laughter</s> - ilyaplot
  • So, apparently, it makes sense to send the values needed for parsing straight to the server, implement the parser there, and get the computed values back as output, right? Would this be the best option for parsing a large DOM tree where the input value is always the same, the section name? - Timur Musharapov
  • If obtaining this DOM tree does not take much time, then quite likely, yes. - ilyaplot
  • And if it takes a lot of time, and the data still needs to be returned? What should be done in that case? - Timur Musharapov
  • If data acquisition + parsing > 20 seconds, I would not do it in the browser. - ilyaplot
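
The 20-second rule of thumb from the last comment could be written down as a small helper. The function name `chooseParserLocation` and the idea of passing in estimated durations are assumptions made for illustration; the thread only gives the threshold itself.

```javascript
// Threshold from the comment above: beyond 20 seconds, do not parse in the browser.
const BROWSER_BUDGET_MS = 20000;

// Hypothetical helper: decide where to run the parser from rough time estimates.
function chooseParserLocation(estimatedFetchMs, estimatedParseMs, budgetMs = BROWSER_BUDGET_MS) {
  const totalMs = estimatedFetchMs + estimatedParseMs;
  return totalMs > budgetMs ? "server" : "browser";
}
```

For example, `chooseParserLocation(15000, 10000)` returns `"server"`, while `chooseParserLocation(5000, 10000)` returns `"browser"`.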