I've read enough theory and I think I've grasped the essence of CI: you set up an automatic build of the application so that the various errors that come with frequent integration are caught quickly. But what build are we talking about for PHP? As far as I know, there is no build step. Why run phploc and other programs that simply report the size of the project? Why run these CI servers at all? How do they help (in the context of PHP)? Do they just run commands automatically?
3 answers
Continuous integration is designed to solve many problems at once:
- Removing the human factor from repeatable processes, which is to say, making those processes repeatable and reproducible. A developer can accidentally push some version of the application with his changes to production, or leave fixed code forgotten in a separate branch; continuous integration doesn't allow this. It defines a set of rules and passes the developer's code through them, so that an error produces clear, visible negative feedback.
- The previous point means that continuous integration is incapable of forgetting things. A developer's head holds roughly the last two weeks; the integration server will keep reporting a problem for as long as the problem exists.
- Testing (which, as a rule, is where it all starts). As long as code goes to production without testing, no one can say whether it works. (A minimal sketch of such a test follows this list.)
- Turning the code base into an artifact. The code base is just text files; an artifact is a ready-to-use application. In the case of PHP the difference between the two states can be minimal and imperceptible, but if, for example, you ship a phar application, the CI server is directly involved in preparing that phar archive. The developer may build the archive several times a day, but the final version is assembled by the machine, and that is what guarantees the archive works.
- Automating the project release process down to a single button. The difference from the previous point may also not be very noticeable: an artifact is just some build of the application, while a release is a specific version of the application with specific functionality. You build an artifact and run the tests to make sure the application works, but such an artifact doesn't necessarily contain all the functionality to be shipped; it can be an intermediate version. If the manager has planned a new admin panel and an integration with some social network for the next version of the site, an intermediate artifact may contain only one of them, and only an artifact containing everything planned goes into the release. In addition (sic!), a release may consist of several artifacts. The release itself can be automatic.
- Full automation of the whole chain, from pushing code to the repository to deploying artifacts to production. In contrast to the previous point, a person is excluded from the chain entirely, and every successful build automatically becomes a release.
- Additional artifacts, such as automatically generated documentation (the notorious PHPDoc) and artifact signatures (so that the final recipient can verify that the information about how the artifacts were built corresponds to reality).
- Additional code analysis. Tests are good, but they only tell you whether the code that was tested works; they say nothing about how much of the code was tested or how good that code is. So integration can also include additional reports, for example:
- Total test coverage (e.g. 65% of the code is covered)
- Coverage reports for specific files, lines, logical branches and expressions. This is where the real meat begins, and it can look like this: you literally see which lines were covered by which tests. Moreover, such reports integrate nicely with the IDE, which lets you track these things on your own machine (leaving CI to watch the overall trend: is coverage growing or falling?)
- Human-readable test reports. When you test code, you formalize the problem before comparing the actual result with the expected one, and an error like 'Failed asserting that 42 is true' or 'Failed asserting that server returned 200' will tell you nothing. However, there are special frameworks that generate far tastier output with human descriptions of tests (what are we testing, why does this test exist, what happened during the test, which use case does it cover?), attach files to them, and measure test execution time (if one of the tests hangs, it's good to know which one, right? that's the kind of test insight I mean). Specifically, I'm describing my beloved Allure here, but besides it there must be plenty of utilities. Take the time to follow the link and imagine how much easier life becomes in a complex project if every failed test is accompanied by (sic!) a screenshot of the page that failed to perform the expected actions, the list of steps that should have been completed in the test, the specific browser in which the problem occurred, and, finally, some unfunny joke.
- Static code analysis reports. This is the next thing that lends you a hand if you build overly complex loops or write unreadable code. While analyzing the implementation of logic is an AI-complete task (after solving which programmers won't be much needed, since the system will be able to write the logic itself), analyzing correctly placed indentation, duplicated code and the number of comments is an entirely solvable one. To impress you once again I'll leave another link, and I'll return to this topic a little later.
- Automatic changelogs and reports on each team member's contribution to the project. This is a very specific and complex topic, which I won't touch here.
- Load testing and other testing that is close to real life and doesn't test the code directly (Netflix, for example, doesn't let through releases that perform worse than the previous ones, and deliberately kills servers to test the system's resilience in critical situations). If we care about the final product, why not also automate verifying that it meets our speed expectations?
- Verifying that two branches of code can be merged before the actual merge.
- Finally, continuous integration integrates easily (pun intended) with various ways of alerting developers. Watching the build process is a tedious and unnecessary affair, so continuous integration can be configured to send notifications only when changes break something; in that case the system's silence automatically means either that everything works, or that the integration server itself has simply died (which is solved by monitoring, a topic I won't go into here).
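To make the testing point concrete, here is a minimal sketch of the kind of test a CI server runs on every push. Only the PHPUnit API here is real; PriceCalculator and its method are made up for illustration:

    <?php
    // A minimal PHPUnit test: the CI server runs the whole suite on every
    // push and marks the build red as soon as any assertion fails.
    use PHPUnit\Framework\TestCase;

    final class PriceCalculatorTest extends TestCase
    {
        public function testTenPercentDiscountIsApplied(): void
        {
            $calculator = new PriceCalculator(); // hypothetical class under test
            $this->assertSame(90.0, $calculator->withDiscount(100.0, 10));
        }
    }

Running the suite as vendor/bin/phpunit --coverage-html build/coverage additionally produces the per-file, per-line coverage report mentioned above.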
The whole set of tasks (create an artifact, run the tests, run the analysis) is called a pipeline, and a single pass of the code through the pipeline is a build. A build is the fact of the code passing through continuous integration, and its result can be the artifacts and reports described above.
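To give a rough feel for what a pipeline is, here is a toy sketch: an ordered list of shell commands where the first failure stops the build and turns it red. The tool invocations assume a standard Composer setup; the runner itself is an illustration, not how any particular CI server is implemented:

    <?php
    // Toy pipeline runner: each step is a shell command, executed in order;
    // the build fails fast on the first non-zero exit code.
    $steps = [
        'composer install --no-interaction',     // reproducible dependencies
        'vendor/bin/phpcs --standard=PSR12 src', // static analysis: coding standard
        'vendor/bin/phpunit',                    // run the test suite
        'vendor/bin/phploc src',                 // collect code base metrics
    ];

    foreach ($steps as $step) {
        passthru($step, $exitCode);
        if ($exitCode !== 0) {
            fwrite(STDERR, "Build failed at step: $step\n");
            exit($exitCode); // a red build; CI notifies the team
        }
    }
    echo "Build passed: artifacts and reports are ready.\n";

A real CI server expresses the same sequence in its own configuration format and adds the surrounding machinery: checking out the code, storing artifacts and reports, and sending notifications.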
How does all this help the project? Imagine that we have a practically endless project with a huge number of developers who cannot be physically supervised. Without the pipeline described above, you have no way to look at the code and assess the state of the current code base, the technical debt, and the developers' contribution to the project. On a small project you can shoot yourself in the foot endlessly and get away with it, but on a large project that no longer works, because situations like these start to appear:
- A broken release with bugs made it to production. While developing new features, the developers touched old code and broke everything. Who is at fault? Who introduced the bug? Who missed it?
- The developers say urgent refactoring is needed. How do you assess how badly it is needed, how do you prioritize the refactoring, where are the hottest spots?
- The developers say the project isn't covered by tests and risks rolling out a broken release. How do you determine whether that's true?
- The developers say our code base is deteriorating overall. Again: are they just asking for refactoring, or is everything really going downhill?
- Other situations my imagination has run out on.
Summing all this up, CI gives you the ability to automate all the routine processes, hand the main testing tasks over to the machine, normalize the dialogue between managers, developers and the quality assurance department, and analyze the code base literally from the inside. Until you have this in place, you don't know what state the current product is in, and so you carry all the risks: shipping a broken product, a sudden need for refactoring, time wasted hunting down bugs in systems with complex logic. Of course, it's not a panacea, and bugs for which no tests were written will still make it to production, but without all this machinery you can't look inside and track the technical (and possibly not only technical) quality of the product.
Regarding your particular example with phploc: it really is useless if you only use it to count funny numbers. But indirectly it gives you a way to analyze your application:
- How big is the code base? If we hire a new developer, how much code will he have to get familiar with before he can start working?
- How many global constants do we have? Are there monkeys working for us?
- How are we doing on cyclomatic complexity? Is the logic in our code sane?
If you're the only one on the project and you don't use global constants at all, then you don't need this. But if you work in a startup where every day counts and you often have to sacrifice quality for speed, then instead of scattering TODOs across the project you can monitor the dirty parts of the code you had to leave behind with the same phploc, phpcs and phpmd. How much of this your current project needs is an open question, but you do have the means to get that very view of the code from the inside.
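For instance, a fragment like the following (made up for illustration) is exactly what those metrics surface: phploc counts the global constant, and phpmd's codesize ruleset (vendor/bin/phpmd src text codesize) flags the nested branching as growing cyclomatic complexity:

    <?php
    define('APP_MODE', 'live'); // a global constant: phploc counts these

    // Every extra branch adds a unit of cyclomatic complexity;
    // static analyzers report functions like this one long before
    // a reviewer gets tired of reading it.
    function process(array $items): void
    {
        foreach ($items as $item) {
            if ($item['type'] === 'a') {
                if (APP_MODE === 'live') {
                    // ... nested branch
                }
            } elseif ($item['type'] === 'b') {
                // ... another branch
            }
        }
    }

One number from one run means little; what CI gives you is the trend, so you can see whether the amount of such code grows or shrinks.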
To round off the impression, here is one more link, to the CI of the aforementioned Allure. It's developed by the test automation team at Yandex, and there you can watch how a build goes: tests running, trends, and automatic quality analysis via SonarQube.
This won't fit into a comment, so I'll post it as an answer.
Take the output of phploc on a small project:
    Directories                                3
    Files                                      8

    Size
      Lines of Code (LOC)                   1858
      Non-Comment Lines of Code (NCLOC)     1298 (69.86%)
An automatic report to the project manager that the programmers are actually working, not slacking off.
      Comment Lines of Code (CLOC)           560 (30.14%)
Comment coverage. There's a school of thought that good code should be self-documenting; 30% is a hell of a lot.
      Logical Lines of Code (LLOC)           289 (15.55%)
        Classes                              260 (89.97%)
          Average Class Length                37
          Average Method Length                9
The trend of these numbers shows how well the single responsibility principle is being followed.
        Functions                              5 (1.73%)
          Average Function Length              5
        Not in classes or functions           24 (8.30%)
OOP or functional style? Deviations signal a departure from the agreed guidelines.
    Complexity
      Cyclomatic Complexity / LLOC                 0.67
      Cyclomatic Complexity / Number of Methods    7.86
I won't say anything here
    Dependencies
      Global Accesses                          2
        Global Constants                       2 (100.00%)
        Global Variables                       0 (0.00%)
        Super-Global Variables                 0 (0.00%)
The more of these, the worse.
      Attribute Accesses                      48
        Non-Static                            48 (100.00%)
        Static                                 0 (0.00%)
      Method Calls                            96
        Non-Static                            91 (94.79%)
        Static                                 5 (5.21%)
I think that's clear.
I was too lazy to break down the rest.
    Structure
      Namespaces                               4
      Interfaces                               0
      Traits                                   0
      Classes                                  7
        Abstract Classes                       0 (0.00%)
        Concrete Classes                       7 (100.00%)
      Methods                                 28
        Scope
          Non-Static Methods                  28 (100.00%)
          Static Methods                       0 (0.00%)
        Visibility
          Public Methods                      10 (35.71%)
          Non-Public Methods                  18 (64.29%)
      Functions                                1
        Named Functions                        0 (0.00%)
        Anonymous Functions                    1 (100.00%)
      Constants                                1
        Global Constants                       1 (100.00%)
        Class Constants                        0 (0.00%)
If none of these checks interest you, then by all means forget about this and don't use it.
- "There's a school of thought that good code should be self-documenting; 30% is a hell of a lot" - only if those are explanatory comments. I often leave jokes in comments. Metrics are imperfect :) But that's nitpicking; on the whole the reasoning is correct. - D-side
- So is CI simply a matter of collecting statistics? No, I don't think so. Generating reports for project managers is not CI. Continuous integration is about quickly seeing that non-working code got into master and what bugs have appeared; what do these metrics give me? I don't care how many comments there are. The only useful one, as far as I understand, is test coverage, and that only if you know the tests actually check everything. - vkovalchuk88
- It's not so much that they're imperfect as that they're generally useless in terms of continuous integration. Why would I look at these numbers at all? - vkovalchuk88
There can be several servers that need to receive updates. Also, as a rule, tests are involved: they run right after the commit, check and evaluate code quality, collect metrics, generate documentation, let you immediately spot branch-merge problems, and so on.
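For a feel of the mechanics, here is a sketch of the trigger side: on every push the repository notifies the CI server, which checks out the code and starts a build. The URL is made up; real servers (Jenkins, GitLab CI and others) each define their own trigger endpoints:

    #!/usr/bin/env php
    <?php
    // Git post-receive hook (a sketch): tell the CI server that new
    // commits have arrived so it can start the pipeline.
    $context = stream_context_create(['http' => ['method' => 'POST']]);
    file_get_contents('https://ci.example.com/build/myapp', false, $context);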