Say I wrote a web application in PHP and it lives on my server. So there is a single copy of the code, but there are many clients.

Suppose 10 people connect to my server at the same moment. How does this single piece of code process everyone's requests at once? Yet it does seem to handle them simultaneously.

I've heard it has something to do with threads, processes, some kind of instances. Please explain it in general terms.

And also explain how the architecture of a PHP application differs from the architecture of a Node.js application. I've heard the difference has to do with threads, specifically how many of them there are.

    1 answer

    There is HTTP. It works on a "client sent a request → the server responded" scheme.



    In the case of PHP there is usually a separate web server in front of the application, and it is the web server that deals with concurrency. It accepts the request and hands it to the PHP interpreter.


    If the web server uses CGI, then on receiving a request it starts a new PHP interpreter process, passes it the request parameters, sends the data the interpreter returns back to the client, and waits for the process to finish. Parallel processing is possible because the web server can control several PHP interpreter processes at once.
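
    To make the contract concrete, here is a minimal sketch of a CGI program. It is written in Node.js only to keep every example in this answer in one language; a PHP script under CGI follows exactly the same rules: request data arrives in environment variables, the response (headers, a blank line, then the body) goes to stdout, and the process exits.

        #!/usr/bin/env node
        // Minimal CGI-style program (illustrative sketch): the web server starts
        // a fresh process per request and passes request data via environment variables.
        const method = process.env.REQUEST_METHOD || 'GET';
        const query = process.env.QUERY_STRING || '';

        // Headers first, then an empty line, then the body -- that is the whole contract.
        process.stdout.write('Content-Type: text/plain\r\n\r\n');
        process.stdout.write(`You sent a ${method} request with query "${query}"\n`);
        // The process exits here; any state it built up is gone before the next request.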

    Starting a new process every time consumes RAM, of course. And time. I recall cases where too many parallel clients led to excessive RAM consumption and poorly predictable consequences.


    If the web server uses FastCGI, then in advance, at startup, it spins up a pool of PHP interpreter processes running in a special mode. Each interpreter in the pool is either busy or free.

    After receiving a request, the web server takes a free interpreter process from the pool (marking it busy), passes it the request parameters over a socket, returns the produced response to the user, and then releases the interpreter.

    Compared to CGI, processes are no longer started and terminated on every request, which is already a win. And there is a fixed number of processes; if each of them is limited in memory, they consume a predictable amount of resources.
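
    The pool idea itself can be sketched in a few lines. This is a deliberately simplified model, again in Node.js: the worker file name worker.js is made up, and the real FastCGI binary protocol is replaced by plain IPC messages, but the busy/free bookkeeping is the same.

        // Simplified sketch of a FastCGI-style pool: workers are started once,
        // up front, and each is either busy or free. Not the real FastCGI protocol.
        const { fork } = require('node:child_process');

        const POOL_SIZE = 4;
        const pool = Array.from({ length: POOL_SIZE }, () => ({
          proc: fork('worker.js'), // hypothetical worker script, started in advance
          busy: false,
        }));

        function handle(request, done) {
          const worker = pool.find((w) => !w.busy);
          if (!worker) return done(new Error('all workers are busy')); // or queue the request
          worker.busy = true;                        // mark it busy
          worker.proc.once('message', (response) => {
            worker.busy = false;                     // release it for the next request
            done(null, response);
          });
          worker.proc.send(request);                 // pass the request parameters over IPC
        }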

    But another problem, one shared with CGI, remains: right before handling a request the application code usually performs some preparatory work: it loads libraries and settings and creates system objects. This is a fairly hefty chunk of code that produces the same results for different requests, yet those results are not kept anywhere: they are thrown away after every request. Not cool.



    On this note we leave the world of PHP, skipping the exotic ways of making PHP serve requests, and move on to more interesting topics. What was it you asked about ... Node.js! Okay.

    JavaScript has historically been strongly tied to the event loop. By itself it is not multithreaded; JS programs are usually split into small portions, between which there is nothing to do anyway and someone else can be allowed to work. That is why Node.js code very often "communicates with the outside world" asynchronously: it does not just say "do X", it says "do X, and when you get the result, report it to Y; meanwhile I'll go rest."
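
    In code, that "do X, and report the result to Y" looks like this (the file path is arbitrary, chosen only for the example):

        const fs = require('node:fs');

        // "Do X (read the file), and when the result is ready, report it to Y (the callback)."
        // While the disk is working, the event loop is free to run other code.
        fs.readFile('/etc/hostname', 'utf8', (err, data) => {
          if (err) throw err;
          console.log('got the result:', data.trim());
        });

        console.log('this line runs first -- nobody sits and waits for the file');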

    In the world of websites there are many actions worth performing asynchronously: waiting for data from the client, waiting for results from the database, waiting for a file write to finish, and so on.

    Node.js has its own built-in web server. And since it is driven directly from the interpreter, with full control over what happens before and after, you can prepare objects once, in advance, and serve requests without constructing them anew every time.
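
    A minimal sketch of that built-in server; the "config" object here is a toy stand-in for real settings, database connections and other preparatory work:

        const http = require('node:http');

        // Done once, at startup, before any request arrives.
        const config = { greeting: 'hello', startedAt: new Date().toISOString() };

        const server = http.createServer((req, res) => {
          // Runs for every request, but reuses the objects prepared above
          // instead of rebuilding them each time, as CGI/FastCGI would.
          res.writeHead(200, { 'Content-Type': 'text/plain' });
          res.end(`${config.greeting}, the server has been up since ${config.startedAt}\n`);
        });

        server.listen(8080); // the port number is arbitrary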

    Inside Node.js (simplified) an event loop spins around an event queue. The loop: { take the next event from the queue, perform the action associated with it }, over and over, and an action can add new entries to the event queue.
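
    As a toy model of that idea (not Node's actual implementation):

        // Toy event loop: take the next event from the queue, run the handler
        // attached to it; handlers may push new events onto the same queue.
        const queue = [];

        function emit(event) {
          queue.push(event);
        }

        function eventLoop() {
          while (queue.length > 0) {
            const { handler, payload } = queue.shift(); // the next event from the queue
            handler(payload);                           // the action associated with it
          }
        }

        // One handler schedules more work by enqueueing a new event.
        emit({
          payload: 'request #1',
          handler: (req) => {
            console.log('handling', req);
            emit({ payload: `result for ${req}`, handler: (r) => console.log('finishing with', r) });
          },
        });

        eventLoop();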

    When requests arrive one after another, the event queue never grows beyond one entry, and the order of execution is generally similar to ordinary sequential code (a code sketch of this flow follows the list):

    1. a request arrives (into the queue!)
    2. the event loop picks up the incoming request (out of the queue!)
      • it calls the handler f for that request
    3. handler f asks for a query to be run against the database (into the queue!)
      • and for the result to be passed to handler g
    4. the database query is running; while it waits, the event queue is empty
      • but there are pending handlers, so the process does not exit yet
    5. the event loop receives a notification that the query has completed (out of the queue!)
      • it calls handler g with the query results
    6. handler g builds the page from the query results and sends it to the client
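
    The same flow as a code sketch. The db object here is a fake stand-in for a real database driver (setTimeout imitates the time the database spends working while the event queue sits empty):

        const http = require('node:http');

        // Fake "database": answers any query after 50 ms, like step 4 above.
        const db = {
          query(sql, cb) {
            setTimeout(() => cb(null, [{ name: 'example row' }]), 50);
          },
        };

        http.createServer(function f(req, res) {                    // step 2: handler f is called
          db.query('SELECT * FROM items', function g(err, rows) {   // step 3: ask the database, g is the continuation
            if (err) { res.writeHead(500); return res.end(); }
            res.writeHead(200, { 'Content-Type': 'text/plain' });   // steps 5-6: the database answered,
            res.end(`found ${rows.length} row(s)\n`);               //           g builds and sends the page
          });
          // f returns here immediately; the process is free until the "database" answers
        }).listen(8080);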

    Now the question is: how does Node.js execute multiple requests in parallel?

    The answer is: it only looks that way. Thanks to asynchrony, the code for a request is split into separate small pieces with waiting allowed between them. If there are few requests, that waiting really is idle waiting; if there are many, something useful can be done during it.

    The above describes a typical request to a simple web application, where there are only two asynchronous actions: receiving the request from the web server and querying the database. In practice there are usually more of them.

    While the database is executing the query for the first client, a request from a second client may arrive. If the event about the second request's arrival comes before the database results for the first one, then Node.js (see the sketch after this list):

    • will first process the second request (2.II)
    • will send the second query to the database (3.II)
    • will receive the database results for the first request (5.I) and generate and write out the response to the first client
      • (assuming the database queries take roughly the same time and no new requests arrive)
    • will receive the database results for the second request (5.II) and generate and write out the response to the second client
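
    A toy timeline of this interleaving, with setTimeout standing in for the database and for the second client arriving slightly later:

        // Two "requests" arrive almost together; each waits on a fake 100 ms
        // database query, and one single-threaded process serves both of them
        // in roughly 100 ms total rather than 200.
        function handleRequest(id) {
          console.log(`request ${id}: received, query sent to the database`);
          setTimeout(() => {
            console.log(`request ${id}: database answered, response sent`);
          }, 100);
        }

        handleRequest('I');                          // first client
        setTimeout(() => handleRequest('II'), 10);   // second client, 10 ms later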

    So only one action is ever executed at a time, yet a single process can serve several clients at once. It can even afford long-lived connections from individual clients, for example not over plain HTTP but over, say, WebSockets.

    Under CGI and FastCGI, servicing a WebSocket connection in principle ties up a whole process and keeps it from doing anything else. With N processes, open N connections and bang: there is nobody left to answer requests. The Node.js web server and other event-driven web servers do not suffer from this.
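
    For instance, with the third-party ws package (npm install ws; the API shown is ws v8), one process holds any number of long-lived connections, because an idle connection costs a bit of memory rather than a blocked process:

        const { WebSocketServer } = require('ws'); // third-party package "ws"

        const wss = new WebSocketServer({ port: 8081 }); // port chosen arbitrarily

        wss.on('connection', (socket) => {
          // Each connected client is just another entry the event loop knows about.
          socket.on('message', (msg) => {
            socket.send(`echo: ${msg}`); // respond whenever this client says something
          });
        });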


    Of course, one process may not be enough. On multi-core processors it can make sense to start Node.js processes not one at a time but in whole sets. Then something at a single entry point has to scatter client requests across them: a load balancer such as nginx, HAProxy, Varnish, or something else.
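
    One way to do that without an external balancer is Node's built-in cluster module, where the primary process distributes incoming connections among the workers (nginx or HAProxy in front of several independent processes works just as well):

        const cluster = require('node:cluster');
        const http = require('node:http');
        const os = require('node:os');

        if (cluster.isPrimary) {                 // called isMaster on older Node versions
          for (let i = 0; i < os.cpus().length; i++) {
            cluster.fork();                      // one worker per CPU core
          }
        } else {
          http.createServer((req, res) => {
            res.end(`handled by worker ${process.pid}\n`);
          }).listen(8080);                       // workers share the same port
        }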


    A web server built for a language can also be less clever. For example, in the Ruby world it is customary to use the Rack interface, in which the request handler is one hefty function assembled from a heap of parts. The Unicorn web server implements this interface: having received a request, it calls the handler function and returns whatever value the function returns.

    Of course, each Unicorn process can handle only one request at a time. And if it is exposed directly (clients talk to it without intermediaries), a slow client becomes a problem: Unicorn will patiently wait for the client to accept the response, however long that takes, as long as the client appears alive.

    N processes, N slow clients, and voilà: there is nobody left to serve requests.

    That is why Unicorn is never exposed directly but always placed behind a load balancer: the balancer has its own buffer into which Unicorn can quickly dump the response and move on to the next request, while the balancer, with its efficient (asynchronous, event-driven) I/O, feeds the response even to slow clients, relieving Unicorn of that thankless work.



    ... let's see how many kilobytes I have written here already ... almost 7! Well ... I think you can see that this is quite an extensive topic.

    • Thank you so much for not being lazy and explaining it so well - Muller
    • @Muller In addition to the answer: PHP has sockets too, and with the arrival of full socket support in JS (in all modern browsers) this has stopped being an exotic way of exchanging data between client and server. And, accordingly, when working with sockets the problem that "the application code usually performs some preparatory work right before handling the request" disappears in PHP as well. - Goncharov Alexander
    • @GoncharovAlexander and how did you solve the problem of an active connection occupying an entire interpreter process? (I am genuinely asking, no subtext) - D-side
    • @D-side, I did not quite understand. For example, in the example here, habrahabr.ru/post/209864, all connections are handled in one infinite loop. If you mean parallelism, there is pcntl_fork; if you mean that an infinite loop in PHP is bad, you can periodically kill the socket (terminate the script): a new socket is created automatically and the clients reconnect. - Goncharov Alexander
    • @GoncharovAlexander read it. There it is the select system call plus a separate process for socket connections. A reasonable move, but I am inclined to think it is also exotic. - D-side