I've searched all over Google and I can't make sense of it. I've re-read Wikipedia and just about everything else, and I still don't understand.

What prevents us from simply including .cpp files?

Well, suppose you included a file twice: let the compiler include it only once and be done with it. Since everything gets glued into one big file anyway, the first inclusion would sit above the rest and be visible to them. Let the compiler watch modification times and recompile only what has changed. Let it automatically generate header files describing the interfaces and attach them to the compiled binary, and so on.

In other words, why is the routine work of generating header files not automated instead of being dumped on the developer? After all, the compiler could easily emit an interface description file (obtained by scanning the .cpp file) right next to the compiled binary.

Can you give a situation in which there would be PROBLEMS without header files? That would be the best explanation.

  • Comments are not intended for extended discussion; this conversation has been moved to chat. - Nick Volynkin

6 answers

The problem lies in backward compatibility.

Look: any newer programming language (even Pascal, let alone Java or C#) has no need for header files. Without a doubt, C++ could also do without them. So what's the matter?

Rewind half a century, to 1972. Imagine a C compiler.

Suppose we want to design that compiler. We cannot compile the entire program at once: there simply isn't enough memory. Computers back then were small and slow. We want to compile the program piece by piece, a few functions at a time.

We immediately hit a problem: how do we compile a function f that calls another function g? We need a description of the other functions. We could, of course, read all the source files twice: first to find out which functions exist, and then a second time to compile them one by one. But that was too difficult and slow; we would parse every function definition twice and throw away the result of the first pass. Unacceptable CPU time! And if we kept the definitions of all functions in memory, we might again run out of it.

So on whom did Dennis Ritchie put the difficult problem of separating a function's description from its implementation, and of pulling in only the necessary descriptions when compiling a given function? On us, the programmers. He decided that we should help the compiler: copy the function declarations into a separate file ourselves and tell the compiler which files with declarations are needed. (That is, the first compilation pass is on us.)
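A minimal sketch of that arrangement (the file names are mine): f.cpp can be compiled without the compiler ever seeing g's body.

     // g.h - the declaration we copy out by hand
     int g(int x);

     // f.cpp - only g's declaration is needed to emit the call
     #include "g.h"
     int f(int x) { return g(x) + 1; }

     // g.cpp - compiled separately, possibly much later
     int g(int x) { return 2 * x; }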

This radically simplified the compiler, but it led to problems of its own. What happens if we forget to include the necessary header file? Answer: a compilation error. What happens if the meaning of the text in a header file changes depending on a macro? Answer: the compiler is "dumb"; it makes no attempt to detect the problem and shifts the responsibility onto us.
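To illustrate the second trap, here is a sketch (my own example) of a header whose meaning depends on a macro:

     // packet.h - the layout of Packet depends on a macro
     struct Packet {
     #ifdef WITH_CHECKSUM
         unsigned checksum;   // present only in some translation units...
     #endif
         int payload;
     };

If one .cpp file defines WITH_CHECKSUM before including this header and another does not, the two translation units silently disagree about Packet's size and field offsets. Each compiles fine on its own, the linker has nothing to check, and code that passes a Packet from one unit to the other reads the wrong bytes at run time.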

At the time the language was being developed, this was the right decision. The compiler came out practical and fast, and programmers didn't mind helping it along. And if someone made a mistake, well, he had only himself to blame.

Now wind the clock forward to 1983. Bjarne Stroustrup is creating C++. He decided to ride the wave of C's popularity and took the C compilation model, with its separate translation units and their attendant problems, straight from C. Indeed, the first versions of C++ were just a preprocessor for C! So the separate-compilation problems migrated from C into C++. Worse, new ones were added. For example, class templates look like classes but produce no object code by themselves, so one has to resort to tricks to work around the shortcomings of the separate compilation model (for example, putting the implementation in the header, plus linker tricks).
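A sketch of what that means in practice (the names are mine): a template's implementation must be visible wherever it is instantiated, so it ends up living entirely in the header.

     // box.hpp - a class template generates no object code by itself
     template <typename T>
     class Box {
         T _value;
     public:
         explicit Box(T v) : _value(v) {}
         T get() const { return _value; }   // defined in the header on purpose
     };
     // Only concrete instantiations such as Box<int> produce code, and only
     // in the translation units that actually use them.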

And then backward compatibility entered the picture. Now, in 2017, there is so much code written in the "with headers" style, and so much code depends on the various subtleties involved, that it is too late to change the paradigm: the train has all but left the station.

However, there is a project for a module system in C++, which should free programmers from this half-century-old legacy. It has not been adopted yet, and there are still difficulties at the design level (for example: if a macro is defined in a header, will it be visible once we move from headers to modules?). I hope the language developers will eventually manage to overcome the backward compatibility problem.

  • 1
    Didn't C inherit headers from assembler? - Cerbo
  • 1
    @Qwertiy: Well, because that is not intended use but rather abuse of the include system. Although yes, in C++, by tradition, for every bug in a definition there was someone who started using it, and the bug turned into a feature. Example: SFINAE :-P - VladD
  • 1
    @VladD, that covers about 90% of template usage, so you could say templates themselves are unintended use :) - Qwertiy
  • 2
    I read this answer like a chapter of a detective novel. Great answer :) - isnullxbh
  • 1
    @isnullxbh: Thank you! :) - VladD

A program is built in three stages: preprocessing, compilation, and linking.

The preprocessor, when it handles the directive #include "file", inserts the contents of file into the current file.

It follows that if you include a .cpp file into several compiled files, it will be compiled several times. And here the One Definition Rule comes into force.

While you're at it, read up on what a translation unit is.

I will give an example:

ClassA.cpp

    class ClassA {
    public:
        void someFunction() {}
    };

ClassB.cpp

 #include "ClassA.cpp" class ClassB { } 

ClassC.cpp

 #include "ClassA.cpp" class ClassC { } 

After you run the preprocessor, your ClassB.cpp and ClassC.cpp will turn into

    class ClassA {
    public:
        void someFunction() {}
    };

    class ClassB {
    };

and

    class ClassA {
    public:
        void someFunction() {}
    };

    class ClassC {
    };

And only now does compilation begin. Even if you do not compile ClassA.cpp separately, you still violate the One Definition Rule: ClassA is defined in two translation units, ClassB.cpp and ClassC.cpp.
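The failure becomes tangible as soon as the included .cpp file contains anything non-inline, for example a free function; a sketch (the file names are mine):

    // helper.cpp
    void helper() { }      // a non-inline free function

    // a.cpp
    #include "helper.cpp"

    // b.cpp
    #include "helper.cpp"

Both object files now export the symbol helper(), and the linker refuses to combine them, reporting a multiple-definition error.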

  • 1) we give the compiler a set of .cpp files 2) in some of them we include others with #include "file.cpp" 3) if an inclusion is duplicated, the file is not included a second time 4) when compiling separately, the compiler AUTOMATICALLY creates a header file containing the DESCRIPTION of the INTERFACES extracted from the .cpp file with the code. 5) Why didn't the compiler creators do this? Why force people to generate headers by hand? - Maxmaxmaximus
  • 1
    In other words, why is the routine work of generating header files not automated instead of being dumped on the developer? After all, the compiler could easily emit an interface description file (obtained by scanning the .cpp file) right next to the compiled binary. - Maxmaxmaximus
  • If I have already compiled a .cpp file, the compiler should look at the modification time and realize that the file no longer needs to be recompiled. Why isn't this done? Can you explain? I want to understand one thing: is this stupidity on the part of the language designers, or a necessity that I don't understand? - Maxmaxmaximus
  • 2
    Read again about translation units. When a .cpp file is compiled, the compiler knows nothing about other .cpp files (including whether one has already been compiled). As for the "routine work of generating header files": usually, when writing a program, you think about the interface first (you write the .h) and only then write the implementation (.cpp). - Krepver
  • "When a cpp file is compiled, the compiler knows nothing about other cpp files" - it does! It can look into its tmp folder and find the corresponding binary there by name; if the hash sum matches, there is no need to recompile. - Maxmaxmaximus

It happened for historical reasons. Using automatically generated header files requires a sophisticated build system, one that can work out the compilation order in the presence of complex dependencies between modules.

And since no such systems existed when the language was taking shape, header file generation was never added to compilers. And such a system still hasn't appeared, precisely because compilers cannot generate header files.

Perhaps the modules from the new standard will change everything.


In any case, header files will remain for the cases where importing a .cpp file is simply not applicable. For example, when there is no .cpp file at all, as when linking against a third-party library. Or when there is a cyclic dependency between modules. Or simply when extra requirements are imposed on the header file that make generating it automatically unacceptable.

  • 1
    By the way, as far as I know, the standard does not require these header files to exist in the compiler as actual files... I.e., an implementation is theoretically possible that has no real header files for the standard library. - Harry
  • @Harry that's from the series "the standard does not require a compiler, just a program that performs the same function" :) - Pavel Mayorov
  • 1
    @Harry in any case, such tricks won't work with libraries other than the standard one. - Pavel Mayorov

It is expected that C++17 will bring the first standardized version of modules, that is, support for building without header files. As I understand it, in this version modules will be implemented as a compiler extension, i.e. the feature will be optional in C++17.

However, modules are already implemented in Visual Studio 2015 Update 1 and clang, so you can try them out.
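For a taste, here is a minimal sketch using the syntax that was eventually standardized in C++20 (the 2016-era Modules TS spelling differs in details; the module and file names are mine):

    // math.ixx - the module's interface is compiled into a binary
    // description; no header file is involved
    export module math;

    export int add(int a, int b) { return a + b; }

    // main.cpp - no #include, no preprocessor text pasting
    import math;

    int main() { return add(2, 3); }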

    What prevents us from simply including .cpp files?

    The lack of information about the types available from outside.

    The compiler turns source files ( .c and .cpp ) into object files ( .obj , later stitched together by the linker into a single .exe or .dll file): "semi-finished products", "black boxes" that import and export symbols.

    A symbol, in turn, is a pair "name of a function/variable - its offset from the beginning of the object file". Imports are the object file's dependencies (where they are implemented doesn't matter; only a name match is required); exports are what this file itself implements.

    Thus, object files carry no data types or other abstractions; those exist solely in the compiler's imagination. Only machine code, data that exists as actually allocated space in RAM, and named labels.
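    You can see this directly with a symbol dump; a sketch (the exact output varies by platform and toolchain, and the mangled names below assume the Itanium ABI used by g++ and clang):

        // square.cpp
        int twice(int);                      // import: only the name reaches the object file
        int square_sum(int x) { return twice(x) * x; }

        $ g++ -c square.cpp && nm square.o
                         U _Z5twicei         # U: undefined (imported) symbol
        0000000000000000 T _Z10square_sumi   # T: defined (exported) code symbol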

    Unfortunately, knowing the name alone is not enough to form the imports (dependencies). The compiler also needs to know the structure of the types involved: to generate correct stack offsets, insert the extra constructor/destructor calls, perform implicit type conversions, validate the call, etc.

    Since this information is not in the object file (remember, there are only names there), it has to be spelled out as declarations in the source file that wants to use the external entity. The declarations can be written either in the .cpp file itself or moved into a separate .hpp file and included wherever the type description is required.
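    A sketch of the first option (the names are mine): the declarations below are precisely the "missing type information", written by hand right in the consuming file.

        // consumer.cpp
        int popcount(unsigned x);   // defined in some other translation unit
        extern int g_counter;       // likewise: declared here, defined elsewhere

        int use() {
            ++g_counter;            // the compiler now knows g_counter's type...
            return popcount(42u);   // ...and how to call popcount correctly
        }

    Moving these two declarations into a .hpp file and including it is the second option; the content is identical.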

    Why isn't the type declaration embedded into the object file that implements it? Because the same data type may be used in any number of places, and such embedding would violate the One Definition Rule. Under this rule, each entity should have exactly one source (object file), so that the linker faces no ambiguity when matching imports against exports.

    Well, suppose you included it twice: let the compiler include it only once and be done with it, since everything gets glued into one file anyway, so the first inclusion will sit above the rest and be visible to them.

    A program in C and C++ is built in two stages.

    1. The compiler turns each source file ( .c and .cpp ) into an object file ( .obj ). Each source file is compiled independently, as if it were alone in the universe and nothing else existed. At this stage the compiler does not see the other object files, yet it is already obliged to emit finished machine code. The language designers chose this approach to make parallel compilation possible.

    2. The linker takes the specified object files (no matter how or when they were produced), matches up their imports and exports, and combines them into a single executable ( .exe or .dll ) file.
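    With g++, for example, the two stages look like this (the file names are placeholders):

        g++ -c a.cpp          # stage 1: a.cpp -> a.o, compiled in isolation
        g++ -c b.cpp          # stage 1: b.cpp -> b.o, possibly in parallel
        g++ a.o b.o -o app    # stage 2: the linker matches imports to exports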

    In other words, why is the routine work of generating header files not automated instead of being dumped on the developer? After all, the compiler could easily emit an interface description file (obtained by scanning the .cpp file) right next to the compiled binary.

    Here is your .cpp file (with no non-standard headers included, just as you wanted):

        #include <iostream>
        #include <string>

        Logger::Logger() : _console(std::cout) { }

        void Logger::log(const std::string& message) {
            _console << message;
        }

    How is the compiler supposed to deduce the structure of the class Logger from this? Namely: which fields should the class have, and with which access modifiers? After all, the implementation of one class can be spread over several files, just as one file can contain implementations of some of the methods of several classes.

    You will probably say: "describe class Logger { ... }; right here, so that the .hpp can be generated from it." But then what's the difference? If we have to write that description anyway, we may as well put it in a header file ourselves.
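    For comparison, the hand-written description that the compiler cannot reconstruct might look like the sketch below; the field is my guess, which is exactly the point: nothing in the .cpp above reveals it.

        // logger.hpp
        #include <iosfwd>
        #include <string>

        class Logger {
            std::ostream& _console;   // a reference? a pointer? a wrapper? the .cpp doesn't say
        public:
            Logger();
            void log(const std::string& message);
        };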

    Can you give a situation in which there would be PROBLEMS without header files?

    • myclass.cpp

       #include <cstdlib>

       class Foo {
           int _value;
       public:
           Foo() : _value(rand()) { }
           // ... many, many more methods
           int bar() const { return _value; }
       };
    • main.cpp

       // A copy of the wall of code from myclass.cpp
       #include <iostream>

       class Foo {
           int _value;
       public:
           Foo();
           // ... many, many more methods
           int bar() const;
       };

       int main() {
           Foo a;
           std::cout << a.bar();
           return 0;
       }
    • one-more-file.cpp

       // Yet another copy
       class Foo {
           int _value;
       public:
           Foo();
           // ... many, many more methods
           int bar() const;
       };
       // ...
    • and so on for each file that uses Foo.

    Now imagine that every change to the class description requires a mass replacement in a large number of places. Forget to fix one of them, and the compiler, misled, will keep silent while the final program crashes.
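    A sketch of how it goes wrong (the new field is my invention):

        // myclass.cpp after an edit: a field was added only here
        class Foo {
            int _value;
            long _timestamp;   // new field
        public:
            Foo() : _value(rand()), _timestamp(0) { }
            // ... many, many more methods
            int bar() const { return _value; }
        };

    main.cpp still carries the old copy of Foo. Each translation unit compiles cleanly on its own, the linker merely matches names, and at run time the two files disagree about Foo's size and layout: undefined behavior with no diagnostic.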

    • 1
      "// Disallow copying and moving by declaring at least one constructor manually." Does it really forbid them? ideone.com/7RlHhH. In my opinion, in this case only the default constructor is left undefined. - bronstein87
    • Sorry, my mistake. You are absolutely right. Corrected. - ߊߚߤߘ

    Well, you included it twice; let the compiler include it only once, and that's it

    And what if I don't want it included just once?

    Suppose I generate a bunch of similar functions by means of macros, where small differences in behavior are controlled by other macros. Thus, instead of generating code, I change the values of the constants and include the same .cpp file into another .cpp file more than once. Do you want to take this possibility away from me?

    Here is an example: http://ideone.com/TlN4r0 . There are some needless complications because everything had to be crammed into one file (for ideone's sake), but the idea should be clear. The assumption is that the function is fairly large while the differences are small; in that case wrapping the entire function in a macro becomes impractical.

        #ifndef MAIN
        #define MAIN
        #include <iostream>
        using namespace std;

        #define D 1
        #include __FILE__
        #undef D
        #define D 2
        #include __FILE__
        #undef D

        int main() {
            cout << add1(10) << ' ' << add2(10) << endl;
            return 0;
        }
        #else
        #define ADD0(d) add##d
        #define ADD(d) ADD0(d)
        int ADD(D)(int x) { return x + D; }
        #undef ADD0
        #undef ADD
        #endif
    • "Do you want to take this possibility away from me?" - No, I don't want to take it away: after the preprocessor pass, the compiler checks whether the file has changed; if it has, it compiles it a second time. - Maxmaxmaximus
    • @Maxmaxmaximus, an interesting option. But it seems rather expensive to me: the dependency tree and the nesting depth can be very large, so you end up with unjustified repeated checks of the same files. Probably. - Qwertiy
    • @Maxmaxmaximus: What "second-time insertion" are you talking about? What "insertion" can there be at all if you propose eliminating header files entirely? - AnT
    • @Qwertiy "you end up with unjustified repeated checks of the same files" - No, other languages do this today, and it's a nanosecond-scale operation. There is simply a map file, generated during the first full compilation, that records all the file versions and how they correspond to one another, and all that jazz. And that's it :) The compiler doesn't have to physically walk over all the files and scan them; it just loads the map file into RAM and gets all the information from there. That is, such things are cached easily; this is how the compilers of other languages work. - Maxmaxmaximus
    • 1
      @Maxmaxmaximus, in the code given in the answer the same file is included three times (well, once it's just the file itself, plus #include __FILE__ twice). It never changes, yet all 3 compilations produce different results. So what does the last modification time have to do with anything? - Qwertiy