Good day.

After calling fork() we get two processes with identical address spaces. Long ago, BSD actually allocated the required amount of new memory and copied the parent process's data into it during the fork() call. Nowadays copy-on-write is used instead. I am interested in the following:

  1. What exactly is copied on a write? Suppose the child process calls a function, so at the very least something is written to the stack. Does the kernel copy the one modified page, the entire stack segment, all of the application's segments, or something else?
  2. How is the copying done? Byte by byte? Or can the processor copy, say, a whole page in one instruction? I've never heard of that ...
  • 1. Every page in which a byte changes. 2. I have never heard of processors that copy a page in one instruction (pages are large nowadays). - alexlz
  • To emulate such a block-transfer instruction on x86 you can use, for example, REP MOVS (information taken from here). The IBM mainframe instruction set also has a memory-to-memory block transfer instruction. - avp
  • Yeah. However - alexlz
  • It looks like MMX instructions are used for the copying. I'm still digging in this direction :) - Tim Rudnevsky

2 answers

A page is copied. The copy is performed by the same code that copies arguments from user space, and it is essentially no different from the fast implementations of memcpy. For amd64 there are three implementations (called from the page fault handler, more precisely from here):

  • The basic general-purpose algorithm, copy_user_generic_unrolled: data is moved through four 64-bit registers, 64 bytes per loop iteration.

  • The basic string-instruction variant, copy_user_generic_string, which uses rep; movsq. In effect, the whole page is copied by a single rep-prefixed movsq instruction. It is used on systems that report the rep_good flag (see flags in /proc/cpuinfo).

  • The fast string-instruction variant, copy_user_enhanced_fast_string: similar to the previous one, but it uses rep; movsb. On systems that support the erms extension (see flags in /proc/cpuinfo) this is an especially fast way to copy.
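
To make the first variant more concrete, here is a rough C sketch of an unrolled copy that moves 64 bytes per iteration through four 64-bit temporaries. This is not the kernel code (which is written in assembly); the names copy_page_unrolled and SKETCH_PAGE_SIZE are made up for illustration, and a 4 KiB page is assumed. The string variants replace the whole loop with a single rep-prefixed instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, the usual base page size on amd64. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical helper, not taken from the kernel: copies one page,
 * 64 bytes per loop iteration, as two groups of four 64-bit loads
 * followed by four 64-bit stores. Both pointers must be 8-byte aligned. */
static void copy_page_unrolled(void *dst, const void *src)
{
    assert((uintptr_t)dst % sizeof(uint64_t) == 0);
    assert((uintptr_t)src % sizeof(uint64_t) == 0);

    uint64_t *d = dst;
    const uint64_t *s = src;

    for (size_t i = 0; i < SKETCH_PAGE_SIZE / sizeof(uint64_t); i += 8) {
        /* First 32 bytes: load four words, then store them. */
        uint64_t r0 = s[i], r1 = s[i + 1], r2 = s[i + 2], r3 = s[i + 3];
        d[i] = r0; d[i + 1] = r1; d[i + 2] = r2; d[i + 3] = r3;

        /* Second 32 bytes, reusing the same four temporaries. */
        r0 = s[i + 4]; r1 = s[i + 5]; r2 = s[i + 6]; r3 = s[i + 7];
        d[i + 4] = r0; d[i + 5] = r1; d[i + 6] = r2; d[i + 7] = r3;
    }
}
```

An optimizing compiler may well turn this loop into vector instructions or a call to memcpy, so treat it as an illustration of the access pattern rather than a statement about the generated machine code.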

    The entire page is copied. How exactly? Most likely byte by byte, but you would need to look at the kernel source to be sure.

    • Most likely in machine words, that is, 4 or 8 bytes at a time. - gammaker
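
One way to observe the per-page behaviour from user space, as a sketch only: the parent fills a buffer, forks, and the child writes one byte into each page while counting its own minor page faults with getrusage(). The helper name cow_faults_in_child is made up for this example. On a typical Linux system one would expect roughly one fault per touched page, but the exact number depends on the kernel and its configuration, so the code only reports the count.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of minor page faults the child
 * took while writing one byte to each of `npages` inherited pages,
 * or -1 on error. */
static long cow_faults_in_child(size_t npages)
{
    assert(npages > 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = npages * (size_t)page;

    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    memset(buf, 0xAA, len);   /* parent touches every page before fork() */

    int fds[2];
    if (pipe(fds) != 0) {
        munmap(buf, len);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        munmap(buf, len);
        return -1;
    }

    if (pid == 0) {           /* child */
        struct rusage before, after;
        volatile unsigned char *p = buf;
        close(fds[0]);
        getrusage(RUSAGE_SELF, &before);
        for (size_t i = 0; i < npages; i++)
            p[i * (size_t)page] = 0x55;   /* first write to each shared page */
        getrusage(RUSAGE_SELF, &after);
        long delta = after.ru_minflt - before.ru_minflt;
        if (write(fds[1], &delta, sizeof delta) != (ssize_t)sizeof delta)
            _exit(1);
        _exit(0);
    }

    /* parent */
    close(fds[1]);
    long delta = -1;
    if (read(fds[0], &delta, sizeof delta) != (ssize_t)sizeof delta)
        delta = -1;
    close(fds[0]);
    waitpid(pid, NULL, 0);

    /* The child's writes went to its own copies; the parent's pages
     * must still hold the original value. */
    for (size_t i = 0; i < npages; i++)
        if (buf[i * (size_t)page] != 0xAA)
            delta = -1;

    munmap(buf, len);
    return delta;
}
```

Calling cow_faults_in_child(64) and comparing the result with 64 gives a rough idea of the granularity; it says nothing about which instructions the kernel used for the copy.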