As far as I understand, data is transmitted over TCP as a continuous stream until the connection is terminated. If you look at the structure of a TCP segment, it does not even carry a field with the length of the data (unlike UDP, for example). So if we read from a TCP socket into a buffer, reading should continue until the connection is closed or the buffer is full.

However, real code does not behave like that: recv() on the server reads exactly as many bytes as the client sent with send().

How does recv() know that all the data has been received and that control should be returned to the calling code?

A complete minimal example on bare sockets:

Server:

    #include <iostream>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    const int BUFFER_SIZE = 1024;
    const int PORT = 12345;

    int main()
    {
        // create server socket
        int socketFd = ::socket(AF_INET, SOCK_STREAM, 0);
        if (socketFd < 0) {
            return -1;
        }
        int opt_val = 1;
        setsockopt(socketFd, SOL_SOCKET, SO_REUSEADDR, &opt_val, sizeof opt_val);

        // bind to address
        sockaddr_in socketAddress{};
        socketAddress.sin_family = AF_INET;
        socketAddress.sin_port = htons(PORT);
        socketAddress.sin_addr.s_addr = htonl(INADDR_ANY); // 32-bit value: htonl, not htons
        int rc = ::bind(socketFd, reinterpret_cast<sockaddr*>(&socketAddress), sizeof(socketAddress));
        if (rc < 0) {
            return -2;
        }

        // listen
        rc = ::listen(socketFd, SOMAXCONN);
        if (rc < 0) {
            return -3;
        }

        // accept new connection
        sockaddr_in clientAddress{};
        socklen_t clientAddressSize = sizeof(clientAddress);
        int clientSocket = ::accept(socketFd, reinterpret_cast<sockaddr*>(&clientAddress), &clientAddressSize);
        if (clientSocket < 0) {
            return -4;
        }

        // receive (one byte is reserved for the terminating '\0')
        char buffer[BUFFER_SIZE];
        ssize_t receivedBytes = ::recv(clientSocket, buffer, BUFFER_SIZE - 1, 0);
        if (receivedBytes < 0) {
            return -5;
        }
        buffer[receivedBytes] = '\0';
        std::cout << "Received " << receivedBytes << " bytes : " << buffer << std::endl;
        // 5 bytes ("hello") were read, although the buffer is not full and the connection is not closed

        ::close(clientSocket);
        ::close(socketFd);
        return 0;
    }

Client:

    #include <iostream>
    #include <string>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main()
    {
        struct sockaddr_in sa;
        int res;
        int socketFd;

        socketFd = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
        if (socketFd == -1) {
            perror("cannot create socket");
            exit(EXIT_FAILURE);
        }

        memset(&sa, 0, sizeof sa);
        sa.sin_family = AF_INET;
        sa.sin_port = htons(12345);
        res = inet_pton(AF_INET, "127.0.0.1", &sa.sin_addr);
        if (res != 1) {
            perror("inet_pton failed");
            close(socketFd);
            exit(EXIT_FAILURE);
        }

        if (connect(socketFd, (struct sockaddr *)&sa, sizeof sa) == -1) {
            perror("connect failed");
            close(socketFd);
            exit(EXIT_FAILURE);
        }

        auto buf = "hello";
        auto len = 5;
        int sentBytes = ::send(socketFd, buf, len, 0);
        std::cout << "sent " << sentBytes << " bytes: " << buf << std::endl;

        std::string tmp;
        std::getline(std::cin, tmp); // pause execution; the connection is still not closed
        return EXIT_SUCCESS;
    }
  • Your recv reads exactly the right amount of data for one reason only: there is very little data and both ends are on localhost. Use two different machines, preferably on different networks, and things will get more interesting right away. - KoVadim
  • @KoVadim and what will happen on different networks? - goldstar_labs
  • recv will read as much as it managed to get (never more than requested, of course). A quite realistic situation: one side sends data in large chunks, while the other first reads a single byte, and then however much you ask for. - KoVadim
  • @KoVadim, ah yes, sorry, I was thinking of something else :) That is why reads are usually done in a loop. - goldstar_labs

3 answers

No, recv reads at most as many bytes as the specified buffer size and does not care at all whether all the data has been received. At best, a segment with the PSH flag arrives from the client, hinting that it makes sense to hand the data to the reader now.
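Because a single recv() call may return fewer bytes than the peer sent, code that needs a known number of bytes usually calls it in a loop. A minimal sketch, assuming a blocking stream socket; `recv_exact` is a hypothetical helper name, not a library function:

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: keeps calling recv() until exactly `len` bytes have
// arrived. Returns false if the peer closes the connection or an error
// occurs before that.
bool recv_exact(int fd, void* out, std::size_t len)
{
    assert(out != nullptr || len == 0); // precondition on the caller
    char* p = static_cast<char*>(out);
    while (len > 0) {
        ssize_t n = ::recv(fd, p, len, 0);
        if (n > 0) {
            // Got between 1 and len bytes, possibly fewer than requested.
            p += n;
            len -= static_cast<std::size_t>(n);
        } else if (n == 0) {
            return false; // orderly shutdown before all bytes arrived
        } else if (errno != EINTR) {
            return false; // real error; EINTR only means "try again"
        }
    }
    return true;
}
```

With this helper the server from the question could call `recv_exact(clientSocket, buffer, 5)` and get all five bytes of "hello" even if they arrive in several pieces, but only because it already knows the length is 5.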

  • OK, my mistake. But I would still like to know the answer to the question "How does recv() know that all the data has been received and that control should be returned to the calling code?" - NikBond 5:56 pm
  • @NikBond It does not know at all. - VTT
  • OK, so a send() call on the client sends the data in a segment with the PSH flag set, and recv() on the server finishes reading after receiving that segment? I am confused by the wording "at best" and "hinting". If a segment with PSH does not arrive, will the read continue until the buffer is full? - NikBond
  • @NikBond Data can accumulate in the buffer even if PSH arrives repeatedly. - VTT
  • I.e., recv() is unpredictable? It can return when the message sent by the client has not been fully received, or when two messages sent by the client have ended up in the buffer, or it can wait until it is blue in the face for the buffer to fill up, even though all the sent messages have arrived? So using bare TCP without a higher-level protocol that tracks message length is pointless? - NikBond

How does recv() know that all the data has been received and that control should be returned to the calling code?

It doesn't. This is the application's concern. From the application's point of view, a TCP channel is like a regular file. How does an application know that a given portion of data has been fully received? There are only two ways:

  1. There is a certain character that separates records. For most text files, this character is '\n'.
  2. The length of the block is written at the beginning of the data portion. How exactly this is done depends on the specific application.
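Option 1 could be sketched roughly as follows for a blocking stream socket. `LineReader` is a hypothetical class, and the 512-byte chunk size is an arbitrary assumption; the point is only that bytes are accumulated until a separator shows up, regardless of how recv() happens to split them:

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: accumulates bytes from a stream socket and hands out
// complete '\n'-terminated records one at a time.
class LineReader {
public:
    explicit LineReader(int fd) : fd_(fd) { assert(fd >= 0); }

    // Returns true and fills `line` (without the '\n') once a full record is
    // buffered; returns false on EOF or error before a separator is seen.
    bool next(std::string& line)
    {
        for (;;) {
            std::string::size_type pos = pending_.find('\n');
            if (pos != std::string::npos) {
                line.assign(pending_, 0, pos);
                pending_.erase(0, pos + 1); // keep whatever follows for later
                return true;
            }
            char chunk[512];
            ssize_t n = ::recv(fd_, chunk, sizeof chunk, 0);
            if (n > 0) {
                pending_.append(chunk, static_cast<std::size_t>(n));
            } else if (n == 0 || errno != EINTR) {
                return false; // peer closed, or a real error
            }
            // n < 0 with EINTR: simply retry
        }
    }

private:
    int fd_;
    std::string pending_; // bytes received but not yet returned
};
```

One recv() may deliver half a record or several records at once; the internal buffer hides that from the caller.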

Both of these options apply fully to TCP connections. For example, the HTTP protocol uses option 1: each message begins with a special start line, and its header block ends with an empty line. The body is transferred in a slightly different way, but the idea is the same.
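Option 2 applied to a TCP socket might look like the sketch below. The 4-byte big-endian length prefix, the 1 MiB default limit, and all function names (`send_all`, `recv_all`, `send_frame`, `recv_frame`) are assumptions made for illustration, not something taken from the question or from any library:

```cpp
#include <arpa/inet.h> // htonl, ntohl
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

// send() may accept only part of the data, so loop until all of it is written.
static bool send_all(int fd, const void* data, std::size_t len)
{
    const char* p = static_cast<const char*>(data);
    while (len > 0) {
        ssize_t n = ::send(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR) continue; // interrupted: retry
            return false;
        }
        p += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}

// recv() may return fewer bytes than requested, so loop until `len` arrive.
static bool recv_all(int fd, void* out, std::size_t len)
{
    assert(out != nullptr || len == 0);
    char* p = static_cast<char*>(out);
    while (len > 0) {
        ssize_t n = ::recv(fd, p, len, 0);
        if (n > 0) {
            p += n;
            len -= static_cast<std::size_t>(n);
        } else if (n == 0 || errno != EINTR) {
            return false; // EOF or real error before the data was complete
        }
    }
    return true;
}

// Frame layout (an assumption of this sketch): 4-byte big-endian length,
// followed by that many payload bytes.
bool send_frame(int fd, const std::string& msg)
{
    if (msg.size() > UINT32_MAX) return false;
    std::uint32_t be_len = htonl(static_cast<std::uint32_t>(msg.size()));
    return send_all(fd, &be_len, sizeof be_len) &&
           send_all(fd, msg.data(), msg.size());
}

bool recv_frame(int fd, std::string& msg, std::uint32_t max_len = 1u << 20)
{
    std::uint32_t be_len = 0;
    if (!recv_all(fd, &be_len, sizeof be_len)) return false;
    std::uint32_t len = ntohl(be_len);
    if (len > max_len) return false; // refuse absurd lengths from the peer
    msg.resize(len);
    return len == 0 || recv_all(fd, &msg[0], len);
}
```

Two `send_frame` calls on one side are then read back as exactly two messages by two `recv_frame` calls on the other side, no matter how TCP splits or merges the bytes in between.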

    TCP behavior is unpredictable :) There is no 100% guarantee as to whether a 'user data packet' has been delivered partially or completely. Checking that is up to your program. Sometimes it makes sense to perform additional checks for data availability; here is an example of a function that reads all the available data from an asynchronous socket:

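        #include <errno.h>
        #include <sys/ioctl.h>
        #include <sys/socket.h>
        #include <sys/types.h>

        // Works like a "smart" TCP.socket.data.flush: reads and discards sz bytes,
        // or everything currently queued on the socket when sz == 0.
        void tcp_recv_empty(int sock, ssize_t sz)
        {
            unsigned char rbuf[65536]; // 1500 would be more economical :)

            if (!sz) {
                int avail = 0; // FIONREAD stores its result in an int
                if ((ioctl(sock, FIONREAD, &avail) != 0) || (avail <= 0)) {
                    return;
                }
                sz = avail;
            }
            while (sz > 0) {
                // Never ask for more than is still expected: with MSG_WAITALL a
                // blocking socket would otherwise wait for bytes that may never come.
                size_t want = ((size_t)sz < sizeof(rbuf)) ? (size_t)sz : sizeof(rbuf);
                ssize_t rsz = recv(sock, rbuf, want, MSG_WAITALL);
                if (rsz == (ssize_t)-1) {
                    if ((errno == EAGAIN) || (errno == EWOULDBLOCK) || (errno == EINTR)) {
                        continue; // retry; note that this spins on a non-blocking socket
                    }
                    return;       // real error
                }
                if (rsz == 0) {
                    return;       // peer closed the connection
                }
                sz -= rsz;
            }
        }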

    You should always rely on the return code of recv*() and on the socket state, obtained for example with a function like this:

        int net_socket_iserror(int sock)
        {
            int se;
            socklen_t sl = sizeof(int);
        # if defined(SO_ERROR)
            if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &se, &sl) < 0) {
                return errno;
            }
            return se;
        # else
            return 0;
        # endif
        }
    • TCP is designed precisely to guarantee delivery, unlike UDP. Or what do you mean: that the network card loses power, the radio wave dissipates into space, etc.? - AseN
    • But with asynchronous sockets the work goes through select or poll; in what context would you call this? After select? - vegorov
    • The key phrase is "user data packet": the sending side can interrupt the transfer at any moment for various objective reasons, and TCP does not guarantee the transfer of all the data from the point of view of the program processing it. The author must check whether all the data has arrived or only part of what was transmitted. The comment about select vs poll vs epoll is unclear; it is not directly related to the state of the socket after recv and to error checking. The selector works at a somewhat higher level, and send / recv / shutdown / close are invoked based on its results. - NewView
    • The TCP protocol guarantees data delivery (whether you read that data or not is your business, and the protocol has nothing to do with it). It is also not clear what is asynchronous about your function if it blocks in a call to recv(... MSG_WAITALL). The question about poll is this: the select / poll functions are used to track changes in the state of sockets. If select / poll returned POLLIN / POLLOUT, then you can already read from / write to the socket without blocking, so what is your function for? And why does the state of the socket not change after recv? - goldstar_labs
    • @goldstar_labs, you are missing the very essence of the wording: the key phrase "user data packet" does not mean a TCP packet :) - NewView