
A journey through the Haproxy documentation, or what to pay attention to when configuring it

Hello again!

Last time we talked about how we chose a tool at Ostrovok.ru for proxying a large number of requests to external services without taking anyone down in the process. That article ended with the choice of Haproxy . Today I will share the nuances I had to deal with while using this solution.



Haproxy configuration


The first difficulty was that the meaning of Haproxy's maxconn option differs depending on the context it appears in:

  • in the global section ( performance tuning );
  • on a bind line ( bind options );
  • in a frontend or defaults section.

Out of habit, I configured only the first one ( performance tuning ). Here is what the documentation says about this option:
Sets the maximum per-process number of concurrent connections to
<number>. It is equivalent to the command-line argument "-n". Proxies will
stop accepting connections when this limit is reached.

It would seem to be exactly what I needed. However, when I noticed that new connections to the proxy were not getting through immediately, I started reading the documentation more carefully, and there I found the second parameter ( bind options ):
Limits the sockets to this number of concurrent connections. Extraneous
connections will remain in the system's backlog until a connection is
released. If unspecified, the limit will be the same as the frontend's
maxconn.

So the bind-level maxconn falls back to the frontend's maxconn - next, let's look for the frontend maxconn :
Fix the maximum number of concurrent connections on a frontend
...
By default, this value is set to 2000.

Great, just what we need. Add it to the configuration:

global
    daemon
    maxconn 524288
    ...

defaults
    mode http
    maxconn 524288
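For completeness, the bind-level maxconn from the second quote can also be set per listening socket. A minimal sketch with an illustrative value (this is not from our production config):

frontend http
    # per-socket limit; extraneous connections wait in the kernel backlog
    bind *:80 maxconn 10000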

The next snag was that Haproxy is single-threaded. I am very used to the Nginx model, so this nuance has always depressed me. But there is no need to despair - Willy ( Willy Tarreau , the developer of Haproxy) knew what he was doing, so he added the nbproc option.

However, the documentation itself says:
USING MULTIPLE PROCESSES
IS HARDER TO DEBUG AND IS REALLY DISCOURAGED.
This option can cause real headaches, because anything stateful - counters and limits, statistics, stick-tables - becomes per-process.

Nevertheless, the gods granted us multi-core processors, so I wanted to use them to the maximum. In my case there were four cores across two physical CPUs. For Haproxy, I pinned one process to each of the first four cores, and it looked like this:

nbproc 4
cpu-map 1 0
cpu-map 2 1
cpu-map 3 2
cpu-map 4 3

Using cpu-map, we pin each Haproxy process to a specific CPU core. The OS scheduler no longer has to decide where to run Haproxy, thereby keeping context switches cold and the CPU cache warm.
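If you want to check that the pinning actually took effect, something along these lines should do (taskset ships with util-linux; the output below is illustrative):

for pid in $(pidof haproxy); do taskset -cp "$pid"; done
# pid 1214's current affinity list: 0
# pid 1215's current affinity list: 1
# ...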

You can never have too many buffers - but that is not our case.



Now let's look at some lower-level things.


tune.rcvbuf.client / tune.rcvbuf.server and tune.sndbuf.client / tune.sndbuf.server - the documentation says the following about them:
It should normally never be set, and the default size (0) lets the kernel
autotune this value depending on the amount of available memory.

But for me, explicit is better than implicit, so I nailed these values down in the config to be confident going forward.

And one more parameter, not related to buffers but rather important - tune.maxaccept :
Sets the maximum number of consecutive connections a process may accept in a
row before switching to other work. In single process mode, higher numbers
give better performance at high connection rates. However in multi-process
modes, keeping a bit of fairness between processes generally is better to
increase performance.

In our case the proxy receives quite a lot of requests, so I raised this value to accept more of them at once. However, as the documentation says, in multi-process mode it is worth testing that the load stays distributed between processes as evenly as possible.

All parameters together:

tune.bufsize 16384
tune.http.cookielen 63
tune.http.maxhdr 101
tune.maxaccept 256
tune.rcvbuf.client 33554432
tune.rcvbuf.server 33554432
tune.sndbuf.client 33554432
tune.sndbuf.server 33554432

One thing you can never have too many of is timeouts. Where would we be without them?



A cool story about the HTTP client in Go
Go ships with a standard HTTP client that can keep a pool of connections to servers. This leads to an interesting story in which the timeouts described above and the HTTP client's connection pool both played a part. Once, a developer complained that he was periodically getting 408 errors from the proxy. We looked at the client code and saw the following logic (sketched in code right after the list):

  • try to take a free established connection from the pool;
  • if that fails, start establishing a new connection in a goroutine;
  • check the pool again;
  • if there is now a free connection in the pool, take it and park the new one in the pool; otherwise, use the new one.
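Here is a minimal Go sketch of that logic; the pool type, its fields, and its methods are illustrative assumptions, not the real net/http internals:

package pool

import (
    "errors"
    "net"
)

// pool is an illustrative stand-in for the connection pool inside the
// client; names and structure are assumptions, not real net/http code.
type pool struct {
    addr  string
    conns chan net.Conn // buffered channel used as a simple free list
}

// tryTake returns a free connection from the pool, or nil if there is none.
func (p *pool) tryTake() net.Conn {
    select {
    case c := <-p.conns:
        return c
    default:
        return nil
    }
}

// put parks a connection in the pool, closing it if the pool is full.
func (p *pool) put(c net.Conn) {
    select {
    case p.conns <- c:
    default:
        c.Close()
    }
}

// getConn reproduces the four steps from the list above.
func (p *pool) getConn() (net.Conn, error) {
    // 1. try to take a free established connection from the pool
    if c := p.tryTake(); c != nil {
        return c, nil
    }
    // 2. if that fails, start establishing a new connection in a goroutine
    dialed := make(chan net.Conn, 1)
    go func() {
        c, err := net.Dial("tcp", p.addr)
        if err != nil {
            dialed <- nil
            return
        }
        dialed <- c
    }()
    // 3. check the pool again
    if c := p.tryTake(); c != nil {
        // 4. a free connection showed up: take it, and park the new one
        // in the pool, where it may sit idle until the server closes it
        go func() {
            if nc := <-dialed; nc != nil {
                p.put(nc)
            }
        }()
        return c, nil
    }
    // ...and if nothing was free, use the freshly dialed connection
    if c := <-dialed; c != nil {
        return c, nil
    }
    return nil, errors.New("dial failed")
}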

See the catch yet?

If the client established a new connection but did not end up using it, then after five seconds the server closes it, and that's that. The client, however, only notices this when it takes that connection from the pool and tries to use it. This is worth keeping in mind.
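If you control the client, one possible mitigation (a sketch, not necessarily the original fix) is to let the client drop idle connections before the proxy does, by keeping the transport's IdleConnTimeout below the proxy's keep-alive timeout:

package client

import (
    "net/http"
    "time"
)

// newClient keeps IdleConnTimeout below the server-side keep-alive
// timeout, so the client closes idle connections first instead of
// picking up connections the proxy has already closed.
func newClient() *http.Client {
    return &http.Client{
        Transport: &http.Transport{
            MaxIdleConnsPerHost: 32, // illustrative value
            IdleConnTimeout:     4 * time.Second,
        },
        Timeout: 10 * time.Second,
    }
}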


All timeouts together:

defaults
    mode http
    maxconn 524288
    # time allowed to establish a connection to a backend
    timeout connect 5s
    # inactivity timeouts on the client and server sides
    timeout client 10s
    timeout server 120s
    # timeouts for half-closed connections
    timeout client-fin 1s
    timeout server-fin 1s
    # time to wait for a complete HTTP request
    timeout http-request 10s
    # how long an idle keep-alive connection waits for the next request
    timeout http-keep-alive 50s

Logging. Why so hard?


As I wrote earlier, most of the time I use Nginx in my solutions, so I am spoiled by its syntax and how easy it is to modify log formats. I especially liked the killer feature of formatting logs as JSON and then parsing them with any standard library.

What do we have in Haproxy? The feature exists here too, but you can only write to syslog, and the configuration syntax is rather more convoluted.
Here is an example configuration with comments:

# send everything related to errors or events to a separate log
# (similar to error.log in nginx)
log 127.0.0.1:2514 len 8192 local1 notice emerg
# and this is something like access.log
log 127.0.0.1:2514 len 8192 local7 info

The terse single-letter variable names are a particular pain. As a result, in order to get anything interesting out of the log, you have to maintain a wall of text like this:

 log-format '{"status":"%ST","bytes_read":"%B","bytes_uploaded":"%U","hostname":"%H","method":"%HM","request_uri":"%HU","handshake_time":"%Th","request_idle_time":"%Ti","request_time":"%TR","response_time":"%Tr","timestamp":"%Ts","client_ip":"%ci","client_port":"%cp","frontend_port":"%fp","http_request":"%r","ssl_ciphers":"%sslc","ssl_version":"%sslv","date_time":"%t","http_host":"%[capture.req.hdr(0)]","http_referer":"%[capture.req.hdr(1)]","http_user_agent":"%[capture.req.hdr(2)]"}' 
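The parsing side really is trivial, though. A minimal Go sketch, with a hypothetical sample line shaped like the format above:

package main

import (
    "encoding/json"
    "fmt"
)

func main() {
    // a hypothetical log line shaped like the log-format above
    line := `{"status":"200","method":"GET","request_time":"12","http_host":"example.com"}`

    var entry map[string]string // every field in the format is a string
    if err := json.Unmarshal([]byte(line), &entry); err != nil {
        panic(err)
    }
    fmt.Println(entry["status"], entry["method"], entry["http_host"])
}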

It would seem like a small thing, but a pleasant one


Above I described the log format, but it is not quite that simple. To put certain items into it, such as:

  • the Host header;
  • the Referer header;
  • the User-Agent header;

you must first capture ( capture ) this data from the request and place it in the array of captured values.

Here is an example:

capture request header Host len 32
capture request header Referer len 128
capture request header User-Agent len 128

As a result, we can now access the elements we need as follows:
%[capture.req.hdr(N)] , where N is the order in which the capture was defined.
In the example above, the Host header has index 0 and the User-Agent header index 2.

Haproxy has a peculiarity: it resolves backend DNS names at startup and, if it cannot resolve one of them, it dies a hero's death.

In our case this is not very convenient: there are many backends, we do not manage them, and it is better to get a 503 from Haproxy for a single supplier than to have the entire proxy refuse to start because of it. The init-addr option helps us here.

A line taken straight from the documentation lets us walk through all the available methods of resolving an address and, in case of failure, simply postpone it until later and move on:

 default-server init-addr last,libc,none 
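Briefly, as the documentation explains it: last tries the address remembered in the server-state file, libc falls back to the regular libc resolver, and none allows the server to start with no address at all, leaving resolution for runtime.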

And finally, my favorite: choosing the backend.
The syntax of Haproxy's backend selection configuration is familiar to everyone:

use_backend <backend1_name> if <condition1>
use_backend <backend2_name> if <condition2>
default_backend <backend3>

But, frankly, that is not great. I had already generated descriptions of all the backends automatically (see the previous article ), so I could have generated the use_backend rules here too - no rocket science - but I did not want to. In the end, another way emerged:

capture request header Host len 32
capture request header Referer len 128
capture request header User-Agent len 128

# set the host_present flag if the request came with a Host header
acl host_present hdr(host) -m len gt 0
# cut from the header the prefix, which is identical to the backend name
use_backend %[req.hdr(host),lower,field(1,'.')] if host_present
# and if the headers did not work out, return an error
default_backend default

backend default
    mode http
    server no_server 127.0.0.1:65535
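So, for example, a request arriving with Host: booking.example.com (a hypothetical supplier name) lands in the backend named booking, while a request without a Host header falls into the default backend, where nothing listens on 127.0.0.1:65535 and Haproxy answers with an error.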

Thus, we standardized both the backend names and the URLs they are reached by.

Now let's assemble all the examples above into one file:

Full configuration version
global
    daemon
    maxconn 524288
    nbproc 4
    cpu-map 1 0
    cpu-map 2 1
    cpu-map 3 2
    cpu-map 4 3
    tune.bufsize 16384
    tune.comp.maxlevel 1
    tune.http.cookielen 63
    tune.http.maxhdr 101
    tune.maxaccept 256
    tune.rcvbuf.client 33554432
    tune.rcvbuf.server 33554432
    tune.sndbuf.client 33554432
    tune.sndbuf.server 33554432
    stats socket /run/haproxy.sock mode 600 level admin
    log /dev/stdout local0 debug

defaults
    mode http
    maxconn 524288
    timeout connect 5s
    timeout client 10s
    timeout server 120s
    timeout client-fin 1s
    timeout server-fin 1s
    timeout http-request 10s
    timeout http-keep-alive 50s
    default-server init-addr last,libc,none
    log 127.0.0.1:2514 len 8192 local1 notice emerg
    log 127.0.0.1:2514 len 8192 local7 info
    log-format '{"status":"%ST","bytes_read":"%B","bytes_uploaded":"%U","hostname":"%H","method":"%HM","request_uri":"%HU","handshake_time":"%Th","request_idle_time":"%Ti","request_time":"%TR","response_time":"%Tr","timestamp":"%Ts","client_ip":"%ci","client_port":"%cp","frontend_port":"%fp","http_request":"%r","ssl_ciphers":"%sslc","ssl_version":"%sslv","date_time":"%t","http_host":"%[capture.req.hdr(0)]","http_referer":"%[capture.req.hdr(1)]","http_user_agent":"%[capture.req.hdr(2)]"}'

frontend http
    bind *:80
    http-request del-header X-Forwarded-For
    http-request del-header X-Forwarded-Port
    http-request del-header X-Forwarded-Proto
    capture request header Host len 32
    capture request header Referer len 128
    capture request header User-Agent len 128
    acl host_present hdr(host) -m len gt 0
    use_backend %[req.hdr(host),lower,field(1,'.')] if host_present
    default_backend default

backend default
    mode http
    server no_server 127.0.0.1:65535

resolvers dns
    hold valid 1s
    timeout retry 100ms
    nameserver dns1 127.0.0.1:53


Thanks to those who read to the end. But that is not all: next time we will look at lower-level things - tuning the system that Haproxy runs on, so that Haproxy and our operating system feel comfortable together and there is enough hardware for everyone.

See you!

Source: https://habr.com/ru/post/438966/