How to write beautiful and readable code?

Question

1. Maximum code reduction. I want to shorten the code as much as possible, but I am not sure whether that is overkill. After all, I could declare a separate variable and assign to it an expression whose purpose is not immediately clear without a name.
2. Variable naming. Choosing a name for a variable is hard. A name that fully describes what the variable does comes to about 16 characters, while a shortened one may be confused with other variables. Context does not always help either.
3. Separation of actions. Too much happens in one part of the code, which can make it hard to understand. On the one hand, the pieces could be moved into separate functions; on the other hand, there seems to be no point, since they are used only in that one place.
4. The relevance of comments. Your code is always clear to you yourself, and a guess about how others will understand it is not always objective. So it is not always obvious whether a comment is needed or redundant.
5. Formatting. Whether to start a variable name with an upper-case or lower-case letter, and whether to capitalize the words inside it or separate them with "_". Whether to leave an empty line between if and other blocks, and between variable declarations and those blocks, or to group lines by action. Whether to put spaces inside expressions, for example if (something) or if ( something ).
6. ...

These points are only meant to illustrate the problem, because I think many have forgotten about them and many never noticed them. On the one hand, each point is easy to answer; on the other hand, I would like to have rules or a detailed analysis. So, are there any official or unwritten rules that cover this and related matters? Or is there a book dedicated to the topic?

I recommend reading "Code Complete" (published in Russian as "Perfect Code") goo.gl/jiEHE1, and "Clean Code" is required reading: goo.gl/rtF5wD
And the sequel to "Clean Code", "The Clean Coder" (published in Russian as "The Ideal Programmer"): goo.gl/yD6VJb
@Shamov, in my opinion, "The Ideal Programmer" does not contain a word about how to write good and beautiful code. It is about how to behave well and beautifully from a professional point of view.
@andreycha Strictly speaking, yes ... there are no specific instructions. But given that only a good person can write good code, the whole book is about exactly that.

Accepted Answer · 2015-04-14T19:27:34

The issue of style is actually a very serious matter.

Do not forget that you are not writing the code for the compiler. Making code that the compiler understands is easy; your goal is harder: to make it clear to a person.

Program code is a literary work like any other. You have to get your idea across to the reader, and that reader may be you in six months or a colleague who has to edit the code while you are on vacation.

Good code lives for a long time, which means it will be read, fixed, understood and explained many times. Understanding someone else's code is much harder than writing new code from scratch, so investing in clarity matters (if, of course, you wish your code a long and happy life).

Code reduction is not necessary. The code should be clear, no more and no less.

If it seems to you that something needs to be spelled out for clarity, do so, even if you have to introduce extra variables just to give a name to an intermediate result. If, on the contrary, there seems to be too much code for the simple thing it does, move that thing into a separate function and find a name that explains the meaning of the action. Keep the level of abstraction and the overall pace uniform: if one piece of code launches a space rocket, a piece right next to it that reads values from a configuration file looks ridiculous.
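
As a minimal Python sketch of naming an intermediate result (the order data, the names and the 20% tax rate are all made up for illustration, and Python's snake_case is used instead of the camelCase seen elsewhere in this answer):

```python
# Hypothetical example data: (unit_price, quantity) pairs.
ORDER_LINES = [(10.0, 3), (4.5, 2)]
TAX_RATE = 0.20  # assumed rate, used only for this example


def total_dense(lines):
    # Correct, but the reader has to decode the whole expression at once.
    return sum(p * q for p, q in lines) * (1 + TAX_RATE)


def total_named(lines):
    # Same computation; each intermediate result gets a name that explains it.
    net_amount = sum(unit_price * quantity for unit_price, quantity in lines)
    tax_amount = net_amount * TAX_RATE
    return net_amount + tax_amount
```

Both functions return the same value; the second one simply tells the reader what each step means.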

Naming variables and functions. Do not be stingy with letters! Bytes on the hard drive are cheap now. A variable should not need a comment to explain its meaning; otherwise the reader has to look at one thing (the name) while keeping another in mind (the meaning). On the other hand, do not tire the reader with unnecessary detail. Reading countOfBytesWhichDidNotPassAtLeastTwoFilters where acceptableByteCount would do is tedious. Keep a reasonable balance. If you need a loop variable, call it i or j. Follow generally accepted conventions and, where necessary, invent your own (but reasonable ones!) that others can easily understand.

Name variables correctly. The name should reflect the meaning. If you use the same variable for two different purposes, you are doing it wrong: split it into two variables, each with a single meaning. For example, do not use one variable both for the length of the string passed in and for the count of characters still to be processed, even though the counter's initial value equals the length of the string.
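
The string-length example can be sketched in Python roughly like this. The function and its vowel-counting task are hypothetical; the point is only that the length and the countdown live in two separately named variables:

```python
def count_vowels(text):
    text_length = len(text)        # never changes: the size of the input
    remaining_chars = text_length  # counts down as characters are processed
    vowel_count = 0

    while remaining_chars > 0:
        current_char = text[text_length - remaining_chars]
        if current_char.lower() in "aeiou":
            vowel_count += 1
        remaining_chars -= 1

    return vowel_count
```

If a single variable were decremented here, its name could describe either the length or the countdown, but not both.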

Try to make the code read naturally. For example, the name of an action should be a verb (not vector.Normalization(), but vector.Normalize() or vector.GetNormal()). The name of a boolean condition should read like a condition and will most likely start with is, has or the like (for example: hasChanges(), isPrime(), etc.). For God's sake, use English names, not transliterated Russian! Believe me, isZarplataComputed() looks awful. The exceptions are languages with Cyrillic syntax (1C?) or an established team style.
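
A small Python sketch of these two naming rules, assuming a made-up Vector2D class (the answer's vector.Normalize() becomes normalize() under Python's naming conventions):

```python
import math


class Vector2D:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def length(self):
        return math.hypot(self.x, self.y)

    def is_zero(self):
        # Boolean query: reads like a condition at the call site.
        return self.x == 0 and self.y == 0

    def normalize(self):
        # Action: named with a verb; scales the vector in place to length 1.
        if self.is_zero():
            raise ValueError("cannot normalize a zero vector")
        current_length = self.length()
        self.x /= current_length
        self.y /= current_length
```

At the call site, `if velocity.is_zero(): ...` and `velocity.normalize()` read almost like sentences.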

Separation of actions. Yes, it makes sense to move code into a function solely to give that fragment a proper name. Functions are not for reuse! Functions exist to break code into logical parts. If you see that your function is responsible for several different things and you cannot think of a short, precise name for it, the function does too much and needs to be split. A super-long function of 500 lines often turns into a dozen classes. And that is good.

And yes, a reader understands a function that does one simple task much more easily. A function that does too much has, besides more complex code, more complicated preconditions and postconditions (which makes it even harder to understand).

For a good split, design from the top down. Example: what does it mean to prepare meals? Suppose you decide it means preparing a French breakfast. Fine, and what does preparing a French breakfast mean? Buying croissants and making coffee. What does making coffee mean? Grinding the beans in a coffee grinder, pouring them into a cezve, adding water, putting it on the heat, and so on. This falls naturally into the procedures PrepareMeals, PrepareFrenchBreakfast, BuyCroissants, MakeCoffee. Nothing had to be invented.
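
The breakfast decomposition might look like this in Python. The four top-level names come from the answer (converted to snake_case); grind_beans, brew_in_cezve and the log list are assumptions added only so that the sketch does something observable:

```python
def prepare_meals(log):
    prepare_french_breakfast(log)


def prepare_french_breakfast(log):
    buy_croissants(log)
    make_coffee(log)


def buy_croissants(log):
    log.append("buy croissants")


def make_coffee(log):
    grind_beans(log)
    brew_in_cezve(log)


def grind_beans(log):
    log.append("grind beans")


def brew_in_cezve(log):
    log.append("brew in cezve")
```

Each function reads as a one-level-deeper answer to the question "and what does that mean?".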

The relevance of comments. Try to write code that does not need comments. Comments are very often redundant, and very often they go stale and stop reflecting reality. People change the logic of the code but forget to go back over the comments.

Don't forget that it is the code that gets executed, not the comments. So a wrong, misleading comment (for example: /* NULL cannot occur here */) is much worse than no comment at all. If you have a piece of code with a comment explaining what it does, turn the code into a function and the comment into its name. You will probably not be able to avoid comments entirely, but make sure they describe why you are doing what you are doing; what exactly you are doing should be clear from the code.
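
A hedged Python sketch of turning a "what" comment into a function name while keeping a "why" comment. The sensor scenario and all names are invented for the example:

```python
def average_before(readings):
    # drop invalid (negative) sensor readings
    valid = []
    for r in readings:
        if r >= 0:
            valid.append(r)
    return sum(valid) / len(valid)


def drop_invalid_readings(readings):
    return [reading for reading in readings if reading >= 0]


def average_after(readings):
    # Why: this (hypothetical) sensor reports a failed measurement as a
    # negative value, so such values must not drag the average down.
    valid_readings = drop_invalid_readings(readings)
    return sum(valid_readings) / len(valid_readings)
```

The comment that said what the loop does has become the name drop_invalid_readings; the remaining comment explains the reason, which the code alone cannot express.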

Formatting. Bad formatting seriously hurts the readability of code. Develop a style and stick to it. Which style you choose does not really matter, as long as it is logical and consistent. (For example, if you put a space after while, you should probably put a space after if as well.) Try, however, not to deviate from generally accepted conventions (for example, Java method names are customarily written in lowerCamelCase); otherwise you will find it hard to read other people's code.

If you work in a team, do not break the common style, even if you personally dislike it. If the team has no agreed style, propose one! Brace placement, indentation, spaces, maximum line length and the rest all matter, so that the reader is not distracted. Inconsistent formatting confuses and distracts far more than it seems, just as wrong punctuation makes a literary text hard to understand correctly.

And finally: do not try to make the code more efficient at the cost of readability. Extracting a separate method is not a problem; modern compilers have learned to inline whatever needs inlining. And they are most likely much better than you at merging different variables into one register. If a low-level optimization really does force you to sacrifice readability somewhere, give that fragment enough comments about what happens in the code and, most importantly, why the trick is needed.

etki · Answer 2 · 2015-04-14T17:33:08

The first thing I would like to say is that this is a very good question.

What you are looking for is called a coding convention (or coding standard), and these are easy to find by searching for "coding style convention" or "coding standards". There are several conventions for C/C++; which one you use generally does not matter, as long as the whole project follows the same one. The first that come to mind are the conventions from Google (C++) and from GNU and the Linux kernel (C). I have not read them, but they should fully cover the questions above.

As for comments, that is a rather controversial matter, and you have to work out your own strategy. Many believe that code should be readable by itself, without comments; people like me support that strategy but also believe that comments should spell out possible usage as thoroughly as possible (so that the IDE can readily display them).

Answer 3 · 2015-04-14T19:21:17

I write in PHP, Python and JavaScript, but I will still allow myself a couple of thoughts (they relate to your 2nd point). This point caused my team a lot of trouble once the project had grown.

We have an internal protocol (JSON over TCP) that the components of the system use to talk to each other; let's call it, say ... the TMP protocol.

More than a dozen entities are directly related to TMP: all sorts of interfaces, gateways, callbacks, "promises" waiting for a response, and so on.

NEVER give classes and variables names that do not give at least a rough idea of their purpose, for example:

 class TmpProtocol

Protocol here is 8 extra characters (TMP is already a protocol), and the class name tells you nothing at all. Compare with this:

 class TmpClientGateway

Is that better? Much. Intuitively, you want to write something like this:

 $a = new TmpClientGateway(); $a->makeRequest(...)

Now about variables. If a variable has any meaning at all to the outside world, it needs a meaningful name. For example, the following code will make your head spin and bring you to the point of nausea:

 self.deferred = defer.Deferred()

We have promised someone (no matter whom) a deferred result. But what exactly we promised is clear to nobody (and you will forget it yourself in a year). It is much better to write something like this:

 self.serverResponse = defer.Deferred()

You need to be clear with yourself about what this particular thing does and name it accordingly. Then a jumble of classes serving TMP:

 class tmpProtocol, class Handler, class tmpInterface

turns into something more understandable:

 class tmpServerRequest, class tmpServerRequestFactory, class tmpMethodDecorator

Also, if a class derives from some standard Factory, then variables holding instances of the new class should have Factory in their names and not, say, Handler.

Now about the 3rd point: do not make the same piece of code handle both a TCP connection error and, say, an "incomplete incoming data" error (even if, from above, these situations look the same to the user).

That makes the code completely opaque. It is much better to spread the functionality across two functions, even if they are almost identical, and avoid obscure if-else-elsif-elsif branching.

It is especially maddening when the else branches contain almost the same code, differing by one line (in one place the socket is closed manually, in another it is not).

It is much clearer if the first situation is handled by an errCorruptedData function (data arrived but was garbled, so close the socket!) and the second by errNoRouteToHost (the socket never opened in the first place).

I apologize for the contrived examples.
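
A rough Python sketch of the two handlers. The handler names follow the answer (in snake_case); FakeConnection and the returned strings are hypothetical stand-ins, not part of any real networking library:

```python
class FakeConnection:
    """Stand-in for a real socket wrapper; hypothetical, for illustration."""

    def __init__(self, is_open):
        self.is_open = is_open

    def close(self):
        self.is_open = False


def err_corrupted_data(connection):
    # Data arrived but is malformed: the socket is open, so close it.
    connection.close()
    return "corrupted data"


def err_no_route_to_host(connection):
    # The socket never opened, so there is nothing to close.
    return "no route to host"
```

The one-line difference between the two cases (closing the socket or not) is now visible in two short functions instead of being buried in a branch.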

Answer 4 · 2015-04-14T17:51:11

The quality of your C code depends mainly on your practical experience and degree of enlightenment. In addition, every major project has its own formatting quirks.

So I will answer the points of the question using the (no longer) secret convention that I usually follow when writing C code.

1. "Two or more, use a for" (c). I shorten code only if doing so automates the whole process. I never allow "Chinese code" (repetitive copy-paste).
2. I always use long, clear lower-case identifiers separated by _, from which the context and all additional information are immediately clear, and I never abbreviate them: very_long_clear_and_useful_variable_name. I never use Hungarian notation, which puts type information in the identifier.
3. I create exactly as many of the most appropriate abstractions as the problem requires, and no more. Where possible I also follow a strict rule of thumb: a function that is not automatically generated and that has lines longer than 80 characters, or more than 3 levels of nested control-flow statements, or more than 3 nested parentheses on one line, or more than 40 lines, or more than 3 parameters (a variadic list counts as 1), or more than 0 explicitly written numeric or string constants in its body is, with probability close to 100%, of no use to anybody in the outside world and is fit only as a teaching example or for tidy internal use.
Every time you put into production a function that breaks the rule, a star goes out in the sky, a crowd of sadistic schoolchildren dissects a hamster alive, and neurons die in the minds of the colleagues who read it.
4. In general I do not use comments in definitions; instead I force a dump to the debug log. In declarations I put only a header and a brief description of what the file does.
5. MACROS in upper case, everything else in lower case with _. I use K&R indentation. I do not use DeprecatedCamelCaseIdentifiers. I use the _t suffix to distinguish structures declared via typedef from the rest.
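
One part of the rule of thumb in point 3, no literal numbers or strings inside a function body, can be sketched as follows. The answer is about C; this is a Python illustration with made-up names and values, following the answer's convention of upper case for constants and lower_case_with_underscores for everything else:

```python
MAX_LOGIN_ATTEMPTS = 3  # hypothetical policy value
LOCKOUT_MESSAGE = "account locked"
RETRY_MESSAGE = "try again"


def login_feedback(failed_attempts):
    # No literal numbers or strings here: every value has a name.
    if failed_attempts >= MAX_LOGIN_ATTEMPTS:
        return LOCKOUT_MESSAGE
    return RETRY_MESSAGE
```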

Although, of course, no matter how many conventions you apply, C code is in general quite hard to make easy to read.

In confirmation of this, have a look at the International Obfuscated C Code Contest.

Here, for example, is a typical program from it, which simply draws a huge smiley sticking out its tongue in the terminal:

 m(f,a,s)char*s; {char c;return f&1?a!=*s++?m(f,a,s):s[11]:f&2?a!=*s++?1+m(f,a,s):1:f&4?a--? putchar(*s),m(f,a,s):a:f&8?*s?m(8,32,(c=m(1,*s++,"Arjan Kenter. \no$../.\""), m(4,m(2,*s++,"POCnWAUvBVxRsoqatKJurgXYyDQbzhLwkNjdMTGeIScHFmpliZEf"),&c),s)): 65:(m(8,34,"rgeQjPruaOnDaPeWrAaPnPrCnOrPaPnPjPrCaPrPnPrPaOrvaPndeOrAnOrPnOrP\ nOaPnPjPaOrPnPrPnPrPtPnPrAaPnBrnnsrnnBaPeOrCnPrOnCaPnOaPnPjPtPnAaPnPrPnPrCaPn\ BrAnxrAnVePrCnBjPrOnvrCnxrAnxrAnsrOnvjPrOnUrOnornnsrnnorOtCnCjPrCtPnCrnnirWtP\ nCjPrCaPnOtPrCnErAnOjPrOnvtPnnrCnNrnnRePjPrPtnrUnnrntPnbtPrAaPnCrnnOrPjPrRtPn\ CaPrWtCnKtPnOtPrBnCjPronCaPrVtPnOtOnAtnrxaPnCjPrqnnaPrtaOrsaPnCtPjPratPnnaPrA\ aPnAaPtPnnaPrvaPnnjPrKtPnWaOrWtOnnaPnWaPrCaPnntOjPrrtOnWanrOtPnCaPnBtCjPrYtOn\ UaOrPnVjPrwtnnxjPrMnBjPrTnUjP"),0);} main(){return m(0,75,"mIWltouQJGsBniKYvTxODAfbUcFzSpMwNCHEgrdLaPkyVRjXeqZh");}

It works perfectly well. What conventions could help here?

P.S. To make complex declarations like this one a little easier to understand:

char **(*(*(*x)[100])(int,char*,double ***,void(*)(int**,char[])))[50];

In fact, everything is simple and clear: x is just a pointer to an array of 100 pointers to functions that take the arguments (int, a pointer to char, a pointer to a pointer to a pointer to double, a pointer to a function that takes (a pointer to a pointer to int, an array of char) and returns void) and return a pointer to an array of 50 pointers to pointers to char

you can use a tool that translates such declarations into English

Shamov · Answer 5 · 2015-04-14T20:17:50

First of all, I must say that this question has no answer. Neither of its two parts does ...

The concept of beauty is simply not applicable to code. Code cannot be beautiful. It is not a poem and not prose. And it does not have to be beautiful. Its task is not to give someone aesthetic pleasure but to compile. Its function is purely instrumental. There are surely people who take purely aesthetic pleasure in various arrangements of loops and conditional statements, but that is rather a perversion ... there is no need to cater to such people. It should be noted, though, that the algorithm implemented in the code can be beautiful. But these things must be clearly separated. Calling code beautiful is like calling beautiful the reinforced concrete from which a beautiful building is built.

Readability is a better topic, although this part of the question has no answer either. There is a widespread but mistaken view in the industry that readability is a property of the code itself ... as if code could be given a form that makes it readable. In fact, that is not so. Code cannot be readable by itself ... apart from whoever reads it. This is a more or less obvious thought, but unfortunately few people think about it. Readability should be thought of not as a property of the code itself but as a relationship between two or more people that is somehow embodied in the code they write and read. If each of these people manages to write code that the others can read and understand more or less comfortably, then these people write readable code. But that readability is inseparable from these particular people. You cannot hand the code to other people and keep its readability at the same level. Other people will already have formed quite different patterns in their heads, so the readability of any code that does not fit those patterns will certainly drop.

The notorious coding style guides in fact approach the problem from the side of the people. Their only task is to record a set of reasonable rules that people working together agree to follow. Not because these rules are objectively good in some way, but simply because that set of rules is the most convenient one for them to share.

Point 4 captures the essence of the problem very precisely, even though it is about comments. "Your code is always clear." This phrase should be repeated mentally three times whenever you feel the urge to pass judgment on your own code, including its readability. Or you can change the phrase slightly: "Your own code is always readable."

In general, taken apart from any connection with people, code can only have properties such as correctness, compilability, efficiency, etc. Readability exists in code as something ephemeral that ties the code to specific people. To evaluate code for readability, you need to show it to other people. And not to random people, but to those who have a personal interest in reading and understanding it. There is no other way. If the code is of interest only to its author and the author can read it comfortably, then the code is already 100% readable. Nothing more needs to be done. Readability cannot be raised above 100%.

Beauty is an aesthetic concept and, of course, a subjective one, i.e. tied to a person's inner world, to the model of reality in each of our heads. Some see it in natural objects, some in paintings, some in music ... So there is nothing surprising in some people speaking about the beauty of code.

Community spirit ♦ · Answer 6 · 2015-04-14T20:35:22

Each language has its own approach. Even for C and C++, the answers to some of the points may be completely different.

1. Reducing the amount of code is sometimes a plus, because digging through a colossal codebase is often unpleasant, and measuring your projects in KLoC is a silly thing to do. But one must not overdo it. This is not code golf, you know =) The code should always be readable, even to someone seeing it for the first time!
2. Variable naming. This is a matter of taste, but it is important that a variable's name gives at least some information about what it is.
3. IMHO this is the most important point, so I will expand on it here (a little more than was asked, but oh well) with specifics for C, because I have some experience with it. Some of this applies to other languages too. I warn you in advance that everything below is purely my opinion.
In any C code, the main thing is balance.
It is important to keep a balance between the desire to make the code more modular and reusable by adding more specialized functions, and the risk of creating overly long, non-optimal and certainly unreadable chains of function calls (not all C compilers can inline, the stack is not made of rubber, and stack frames are not created instantly).
There must be a balance between what your API takes upon itself and what it leaves to the user (the code that calls the API). For example, there is an unwritten rule: if the API initializes a newly created object, memory allocation for that object should be left to the user.
There should be a balance in the number of functions, globals, function prototypes and macros in one file (and in one translation unit). On the one hand, it is not good to overstuff either the file or the translation unit; on the other hand, it is not sensible to create a separate file for a single function.
And, of course, there should be a balance in what is visible to whom and where. If someone tells you that C has no encapsulation, do not believe them: it was there long before C++ and its private modifier! =) To encapsulate a function or a global, declare it static in a separate translation unit; then it can be accessed only from within that unit. To encapsulate the members of a structure, use it as an opaque pointer: for example, declare it as an incomplete type in the code that calls the encapsulating API and define it only inside the API itself (a good example is here).
4. Comments are good. But you cannot rely on them too much. If the code is badly written, no comments will save it. Again, balance is needed: there should not be too few comments (except in the very rare cases where eloquent function and variable names and logically structured algorithms make the code readable by itself), but there should not be too many either (as the saying goes: 90% comments, 10% code ~~, but I still have no damn idea what this code does~~).
5. Style is your own business. Of course, some languages have an idiomatic style that everyone tries to follow (for example, C# with the Microsoft style, or Java with the style of the Javadoc examples). The problem is that C has styles by the dozen. Some write code as in the examples in the K&R book, some follow the Linux kernel format, and some prefer Stallman's format ... Here are a number of code examples in different styles. In general, every programmer should choose a style for themselves (personally, for example, even in C I write in something akin to the Javadoc style, with camelCase and indentation similar to BSD KNF). And, of course, every self-respecting company has its own, sometimes unique, style guide that defines HOW the code should look, so you need to be able to adapt to other styles.

You write that there is an unwritten rule: if the API initializes a newly created object, memory allocation for that object should be left to the user. Could you give references to authoritative (though the question is, authoritative for whom) sources? Perhaps it would be more correct not to oblige but to provide the option (i.e. this is about different levels of the API). / And how good the practice of hiding structure fields is, is also debatable.

Community spirit ♦ · Answer 7 · 2015-04-14T20:46:43

I will answer with general principles. They do apply to C++, but the examples will be whatever comes to mind, not necessarily in C++.

1. Code should be short. But not as short as possible.
For example, I once rewrote several screens of markup into a few lines (HTML, AngularJS) and added a comment describing what is going on there.
And here is a VB.NET example where shortening the code is awful:
X *= 10 (5 significant characters)
X &= 0 (4 significant characters and terrible run-time overhead: first the numbers X and 0 are converted to strings, then a new string is created by concatenation, then the resulting string is parsed back into a number and assigned). Code like this belongs only in code-golf puzzles and nowhere else.
Excessively compressed code can be very hard to figure out, and in C++ especially so.
2. In my view, variable names should be short. But still meaningful. And without excessive abbreviations. Yes, I can understand what s2e means in <a id=s2e>Switch to English</a>, but in my opinion such abbreviations are justified only for frequently used things and should not be used otherwise. And I dislike the common abbreviations tmp and cnt: saving 1 or 2 characters is not worth it.
3. It all depends on the situation. If there is a meaningful block that you want to move into a function, then yes. If not, then don't. If extracting it causes problems, then don't either. As an alternative, write a comment saying what the particular section of code does and put curly braces around the section so that the block is marked explicitly:
```
// Do something
{
    ТутКакойТоДлинныйКод();  // placeholder name, roughly "SomeLongCodeHere"
}
```
I do not like excessive extraction of functions for every little thing. Yes, some people like it when code can be read like text, but code is usually read not to admire it but to change something in it. And I find it very inconvenient to look for the right place when what is in front of me is almost text rather than code.
4. Comments should convey the idea behind the code or explain tricky points. It is usually not worth explaining what can be understood from the code; the only exception is when the code does something entirely non-obvious. It makes sense to explain pitfalls and limitations, and to give high-level descriptions. It makes sense to mark with a comment anything that looks like a mistake at first glance, for example assignments inside conditions and a missing break in a switch.
5. I will not comment on capital letters.
As for spacing, I almost always leave an empty line before and after blocks.
I prefer not to put curly braces: https://ru.stackoverflow.com/a/424351/178988 .
I do not put several statements on one line without a reason.
I allow the use of the comma operator.
If there is a reason to align similar lines in columns, I do so. That is the only case where I put several statements on one line.
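
The VB.NET example from point 1 can be mirrored in Python to show what the "shorter" form actually does. This is only an analogy under the assumption that X is an integer; the function names are made up:

```python
def times_ten_arithmetic(x):
    # The readable form: one multiplication.
    return x * 10


def times_ten_via_string(x):
    # Mirrors the `X &= 0` trick: number -> string, append "0", parse back.
    return int(str(x) + "0")
```

Both give the same result for integers, but the second hides a multiplication behind two conversions and a concatenation, and it stops working as soon as x is not an integer.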
