How do I implement connection timeouts when using epoll? I need to detect inactive connections and close them.

    2 answers

    The epoll_wait man page clearly documents a timeout argument, given in milliseconds.

    So use it. If epoll_wait() returns 0, the timeout has expired. In that case, go through the connections you are polling (your epoll_event vector) and decide what to do with each one.

    In your case you apparently need to associate a data structure with each connection, something like

    struct connect {
        int sock_fd;
        time_t last;   // ???
        // ...
    };

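    and use the .data.ptr field of struct epoll_event (the array of which you pass to epoll_wait()) to hold the address of the corresponding struct connect. The value is set when the socket is registered with epoll_ctl() and comes back in the events that epoll_wait() returns.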

    Then, on each recv(), update last in the corresponding structure. When the timeout fires, compare the current time with each last.

    That is the idea, in brief.
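The approach in this answer could be sketched roughly as follows. This is only an illustration under assumptions: the names Conn, close_idle and poll_once are hypothetical, the record is called Conn rather than connect to avoid clashing with the connect() function, the list of connections is kept by the application, and setting last to 0 as a marker for a dead peer is a shortcut chosen for brevity.

```cpp
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <ctime>
#include <list>

// Hypothetical per-connection record; its address goes into epoll_event.data.ptr.
struct Conn {
    int    sock_fd;
    time_t last;     // time of the last successful recv(); 0 marks a dead peer
};

// Close every connection that has been idle for more than `limit` seconds.
// O(N) in the number of connections. Returns how many were closed.
int close_idle(int epfd, std::list<Conn*>& conns, time_t now, time_t limit) {
    int closed = 0;
    for (auto it = conns.begin(); it != conns.end();) {
        Conn* c = *it;
        if (now - c->last > limit) {
            epoll_ctl(epfd, EPOLL_CTL_DEL, c->sock_fd, nullptr);
            close(c->sock_fd);
            delete c;
            it = conns.erase(it);
            ++closed;
        } else {
            ++it;
        }
    }
    return closed;
}

// One pass of the event loop: wait up to timeout_ms, refresh `last` on
// traffic, and sweep when the wait timed out or a peer went away.
void poll_once(int epfd, std::list<Conn*>& conns, int timeout_ms, time_t limit) {
    epoll_event events[64];
    int n = epoll_wait(epfd, events, 64, timeout_ms);
    time_t now = time(nullptr);
    bool sweep = (n == 0);               // 0 means the timeout expired
    for (int i = 0; i < n; ++i) {
        Conn* c = static_cast<Conn*>(events[i].data.ptr);
        assert(c != nullptr);            // every registered fd carries its Conn
        char buf[4096];
        ssize_t r = recv(c->sock_fd, buf, sizeof buf, 0);
        if (r > 0) {
            c->last = now;               // activity: restart the idle clock
            // application-specific handling of buf[0..r) would go here
        } else if (r == 0 || (errno != EAGAIN && errno != EINTR)) {
            c->last = 0;                 // closed or failed: let the sweep reap it
            sweep = true;
        }
    }
    if (sweep) close_idle(epfd, conns, now, limit);
}
```

A caller would register each socket with epoll_ctl(EPOLL_CTL_ADD), storing the Conn pointer in ev.data.ptr, and then call poll_once in a loop. Note that on a busy server epoll_wait may rarely return 0, so the sweep may also need to be triggered by elapsed time.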

    • And somewhere around 10k sockets it will stop ever returning zero. But the idea is clear in principle. I was leaning towards a similar solution myself, except that I planned to check all connections only once every 10 seconds, which would not create undue load. Then I was shown a different solution, which I have posted in my own answer. Unfortunately, I cannot yet decide which solution is better. - mikelsv
    • @mikelsv, well, in any case you need to recalculate the timeout passed to the next epoll_wait call (after all, most returns will be non-zero). So if the recalculated timeout has reached zero, it is time to go through the events vector. - avp
    • Not at the moment: epoll_wait() is currently called with an infinite wait. I will most likely switch to calling it at 10-second intervals. For now there is no requirement for exact timeouts. - mikelsv
    • @mikelsv, if all your timeouts are the same, then (you have probably considered this already) everything can be done in O(1). Keep the connections in a self-organizing doubly linked list. New connections are appended to the tail. A connection that sees activity (epoll_wait gives us the address of its list node directly) is moved to the tail with a recalculated expiry time. The list therefore always stays ordered by expiry time. To find expired connections, it is enough to check the items at the head of the list (and delete them right away) while their expiry time is earlier than the current time. Beautiful, IMHO. - avp 6:42 pm
    • I also thought about that elegant solution, but unfortunately the server is designed so that timeouts may differ. Of course, I could put them all on the list with a one-second timeout and keep moving them to the tail, but that would be extra load, although it is an option too. Before solving this problem I need to solve another one: ru.stackoverflow.com/questions/429478 , how to tell epoll to pass the socket to a handler. - mikelsv
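The ordered-list idea from the comments above could look roughly like this. It is a minimal sketch assuming one shared timeout for all connections; TimedConn, IdleList and their members are hypothetical names, std::list stands in for a hand-written doubly linked list, and time is passed in explicitly so the logic does not depend on the clock.

```cpp
#include <cassert>
#include <ctime>
#include <list>

// Hypothetical connection record that remembers its own list position,
// so it can be moved in O(1) when epoll_wait hands back its address.
struct TimedConn {
    int    sock_fd;
    time_t deadline;                          // when the connection expires
    std::list<TimedConn*>::iterator pos;      // node in IdleList::order_
};

class IdleList {
public:
    explicit IdleList(time_t timeout) : timeout_(timeout) { assert(timeout > 0); }

    // New connections go to the tail: they expire last.
    void add(TimedConn* c, time_t now) {
        c->deadline = now + timeout_;
        c->pos = order_.insert(order_.end(), c);
    }

    // On activity: recompute the deadline and move the node to the tail.
    // splice within one list is O(1) and keeps the iterator valid.
    void touch(TimedConn* c, time_t now) {
        c->deadline = now + timeout_;
        order_.splice(order_.end(), order_, c->pos);
    }

    // Connection closed for another reason: unlink it.
    void remove(TimedConn* c) { order_.erase(c->pos); }

    // Pop expired connections from the head; stops at the first live one.
    template <class F>
    int expire(time_t now, F on_expired) {
        int n = 0;
        while (!order_.empty() && order_.front()->deadline < now) {
            TimedConn* c = order_.front();
            order_.pop_front();
            on_expired(c);                    // e.g. close(c->sock_fd)
            ++n;
        }
        return n;
    }

private:
    time_t timeout_;
    std::list<TimedConn*> order_;             // always sorted by deadline
};
```

Because every deadline is now + timeout and now never decreases, appending to the tail keeps the list sorted without any searching. With differing timeouts per connection this invariant no longer holds, which is the limitation raised in the reply.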

    A similar problem is solved in quite an interesting way here: https://github.com/zeromq/libzmq/blob/master/src/epoll.cpp#L152 and https://github.com/zeromq/libzmq/blob/master/src/poller_base.cpp#L78 .

    In short, a std::multimap is used there: when a timer needs to be set, the time at which it should fire and a pointer to the object to be notified are written into it. Before each call to epoll_wait(), the code checks whether any timer has expired. Quite an interesting and fast solution.
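A timer table along these lines might be sketched as below. This is not the libzmq code, only an illustration of the technique described: the Timers class and its methods are hypothetical, a std::function callback stands in for the pointer to the object to notify, and the clock value is passed in as milliseconds by the caller.

```cpp
#include <climits>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Hypothetical timer table keyed by absolute expiry time in milliseconds.
// A multimap allows several timers to share the same expiry time.
class Timers {
public:
    using Callback = std::function<void()>;

    // Schedule cb to fire delay_ms after now_ms. O(log N).
    void add(uint64_t now_ms, uint64_t delay_ms, Callback cb) {
        timers_.emplace(now_ms + delay_ms, std::move(cb));
    }

    // Fire every timer that is due at now_ms. Returns the number of
    // milliseconds until the next timer, or -1 if none is left; either
    // value can be passed to epoll_wait as its timeout argument.
    int run(uint64_t now_ms) {
        while (!timers_.empty()) {
            auto it = timers_.begin();            // earliest expiry
            if (it->first > now_ms) {
                uint64_t left = it->first - now_ms;
                return left > INT_MAX ? INT_MAX : static_cast<int>(left);
            }
            Callback cb = std::move(it->second);
            timers_.erase(it);                    // erase first: cb may add timers
            cb();
        }
        return -1;                                // no timers: wait indefinitely
    }

private:
    std::multimap<uint64_t, Callback> timers_;
};
```

In an event loop this would be used as `int timeout = timers.run(now_ms); int n = epoll_wait(epfd, events, max_events, timeout);`, so the wait never outlasts the nearest timer. Resetting a connection's idle timer means erasing its entry and inserting a new one, each O(log N).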

    • It all depends on how often epoll_wait is called. / Suppose you work with a large number of active connections, so you call epoll_wait often, and each time you have to check (almost always in vain) whether anyone's timeout has expired. That check itself is very fast. But updating a timer means deleting it and re-inserting it into the rb-tree (i.e. time on the order of log(N)). / In my answer the update is fast, while the scan (on timeout, in practice say once a second) takes time on the order of N. / That is all for now (I am away until Monday) - avp
    • In principle, yes, it is simpler to run through an array of elements. I was also put off by having to work with a dynamic array, and I would have had to rewrite it so that it did not call malloc() on every timer addition. - mikelsv