Why don't people use formal methods?

On the Software Engineering Stack Exchange, I saw the following question : “What prevents the widespread introduction of formal methods?” The question was closed as biased, and most of the answers were comments like “Too expensive !!!” or “The site is not an airplane !!!” In a sense, this is true, but it explains very little. I wrote this article to give a broader historical picture of formal methods (FM), why they are not actually used, and what we are doing to correct the situation.

Before you begin, you need to formulate some conditions. In fact, there are not many formal methods: just a few tiny groups . This means that different groups use the terms differently. In a broad sense, there are two groups of formal methods: a formal specification studies the writing of exact, unambiguous specifications, and a formal test - methods of proof. This includes both code and abstract systems. Not only do we use different terms for code and systems, we often use different tools to verify them. To confuse everything even more if someone says that he creates a formal specification, this usually means design verification. And if someone says that he does a formal verification, it usually refers to the verification of the code.

For clarity, we divide the verification for code verification (CV) and design verification (DV) and in the same way we divide the specifications for CS and DS. Such terms are not commonly used in the broad FM community. Let's start with CS and CV, then move on to DS and DV.

In addition, partial verification is possible, where only a subset of the specification is verified, or complete verification . This may be the difference between the proofs of the statements “the system never fails and does not accept the wrong password” or “the system never fails and blocks the account if you enter the wrong password three times”. Basically, we will assume that we are doing a full check.

You should also clarify the type of software that we formalize. Most people implicitly distinguish highly reliable programs, such as medical devices and airplanes. People assume that formal methods are widely used for them, and for the rest they are not needed. This is an overly optimistic view: most highly reliable software does not use formal methods. Instead, we will focus on “ordinary” software.

Finally, a disclaimer: I am not a professional historian, and although I tried to carefully check the information, there may be errors in the article. In addition, I specialize in formal specification (DS and DV), so there is more chance of error where I talk about code verification. If you see, write to me, I will correct (and again: I make money on conducting seminars on TLA + and Alloy, so I am very biased about these languages; I try to be as objective as possible, but you know: bias is bias).

Formal programming

BOM specification

Before proving the correctness of the code, you need to get the standard of truth. This means some specification of what the code should do. We must know for sure that the result conforms to the specification. It is not enough to say that the list is “sorted”: we do not know what we are sorting, what criteria we use and even what we mean by “sorting”. Instead, you can say: "The list of integers l sorted in ascending order for any two indices i and j, if i < j , then l[i] <= l[j] ".

Code specifications are divided into three main types:

The first is writing code independent operators. We write our sort function, and in a separate file we write the theorem “this returns sorted lists”. This is the oldest form of the specification, but Isabelle and ACL2 still work (ML was invented specifically to help write such specifications).
The second implements the specification in the code in the form of pre- and postconditions, statements and invariants. A postcondition can be added to the function that “the return value is a sorted list”. Assertion based specifications were originally formalized as Hoare’s logic . They first appeared in the Euclid programming language in the early 1970s (it is unclear who first started using them: Euclid or SPV , but as far as I know, Euclid was previously presented to the public). This style is also called contract programming - the most popular form of verification in the modern industry (here contracts are used as code specifications).
Finally, there is a type system. According to the Curry-Howard correspondence, any mathematical theorem or proof can be encoded as a dependent type. We define the “sorted lists” type and declare the type [Int] -> Sorted [Int] for the function.

On Let's Prove Leftpad you can see how it looks. HOL4 and Isabelle are good examples of the specifications of the “independent theorem”, SPARK and Dafny are of the specifications of the “nested statement”, and Coq and Agda are of the “dependent type”.

If you take a closer look, these three forms of code specification are compared to the three main areas of automatic validation: tests, contracts, and types. This is no coincidence. Correctness is a wide range, and formal verification is one of its extremes. As the severity (and effort) of verification decreases, we get simpler and narrower checks, be it a restriction of the state space under study, the use of weaker types, or a forced check at run time. Then any means of complete specification turns into a means of partial specification, and vice versa: many consider Cleanroom a formal verification technique, where code review goes far beyond human capabilities.

Which specifications are correct?

Verification verifies that the code conforms to the specification. The question arises: how do we know that we have the correct specification? Finding the right specification is one of the biggest problems in formal methods. This is also one of the main objections against them. But skeptics mean here not exactly what the experts on formal methods mean.

When outsiders ask “How do you get the right specifications?”, They usually think of validation , that is, specifications that do not meet the requirements of the client. If you formally prove that your code sorts the list, and the client actually wants Uber for soups (tm), you just spent a lot of time. They say that only fast iterations and short feedback cycles are able to confirm your requirements.

It is true that code verification does not validate it. But there are two problems with this argument. The first is that the stage of applying formal methods is simply transferred, but does not disappear completely. After all these quick iterations, you probably already have an idea of what the client wants. And then start the verification code. Secondly, although we do not know what exactly the client wants, but we can assume what exactly he does not want. For example, that the software accidentally fell. They do not need security holes. Everyone agrees with this: in the end, no one says that you need to skip unit tests during iterations. So at least make sure that your version control system does not delete random user data (note: do not think that I am angry or something like that).

The problem with finding the right specification is more fundamental: we often don’t know what to write there . We think of our requirements in human, not mathematical, terms. If I say: "The program should distinguish trees from birds", then what is at stake? I can explain to a person by showing a bunch of photos of trees and birds, but these are only specific examples, not a description of the idea . In fact, to translate this into a formal specification, you need to formalize human concepts, and this is a serious problem.

Do not misunderstand me. Here you can determine the relevant specifications, and experts do it all the time. But writing the relevant specifications is a skill that needs to be developed, as well as programming skills. That is why many of the recent successes of code verification are explained by a clear mapping of what we want into what we can express. For example, CompCert is a formally verified C compiler. Specification for it: “Avoid compilation errors”.

But all this has nothing to do with verification. When you have a specification, you still need to prove that the code corresponds to it.

Proof of specification

The first ever code verification tool is a Dijkstra-style method “think about why this is true,” which is primarily intended for ALGOL. For example, I can “prove” the correctness of sorting by the method of insertion as follows:

Basic option : if one item is added to the empty list, it will be the only item, so it will be sorted.
If we have a sorted list with k elements and add one element, then we insert the element so that it is after all the smaller numbers and in front of all the big numbers. This means that the list is still sorted.
By induction, sorting by the method of insertion will sort the entire list.

Obviously, in reality, the evidence will look more rigorous, but this is a general idea. Dijkstra and others used this style to prove the correctness of many algorithms, including many of the basics of parallelism. This is also the style with which Knut’s words are related: “Beware of errors in this code; I only proved that it is correct, but did not run it. ” You can easily spoil the mathematical proof so that no one will notice. By some estimates , approximately 20% of the published mathematical evidence contains errors. Peter Guttmann has an excellent essay on evidence of the performance of a ridiculous program, where tons of “verified” code immediately fall if run.

At the same time, we studied the methods of automatic proof of mathematical theorems. The first program to prove the theorem was published in 1967 . In the early 1970s, such programs began to be used to check the code on Pascal, and in the middle of the decade special formal languages appeared. The programmer formulates certain properties of the code, and then creates a testable proof that the code has these properties. The first programs for proving theorems simply helped people to verify evidence, while more complex tools could independently prove parts of the theorem.

Which leads to the next problem.

Evidence is hard to get

The evidence is hard and this is a very unpleasant job. It’s hard to quit programming and go to the circus. Surprisingly, the formal proofs of code are often more rigorous than the proofs that most mathematicians write! Mathematics is a very creative activity, where the answer to the theorem is valid only if you show it. Creativity does not fit well with formalism and computers.

Take the same example with the sorting method of the insert, where we applied induction. Any mathematician will immediately understand what induction is, how it works in general and how it works in this case. But in the program for proving theorems, everything must be strictly formalized. The same with proof by contradiction, proof through contraposition, etc. In addition, it is necessary to formalize all assumptions, even those where most mathematicians do not bother themselves with proofs. For example, that a + (b + c) = (a + b) + c . The program for testing theorems does not know a priori that this is true. You either have to prove it (difficult), or declare it as the truth according to the associative law of addition (dangerous), or buy a library of theorems from someone who has already been able to prove it (expensively). Early programs for proving theorems competed in terms of the number of embedded proof tactics and the associated libraries of theorems. One of the first widespread SPADE programs presented the full arithmetic library as the main advantage.

Next, you need to get the proof itself. You can assign it to the program or write it yourself. Usually, the automatic determination of evidence is not solvable. For extremely narrow cases, such as propositional logic or HM type checking, it is “only” NP-complete. In fact, we ourselves write most of the evidence, and the computer verifies its correctness. This means that you need to know well:

mathematics;
computer science;
the area in which you work: compilers, hardware, etc .;
nuances of your program and specialization;
nuances of the program for proving theorems that you use, which in itself is a whole specialty.

To make matters worse, computer-specific inserts put sticks in the wheels. Remember, I said that it is dangerous to assume the associative law of addition? Some languages do not comply with it. For example, in C ++ INT_MAX. ((-1) + INT_MAX) + 1 INT_MAX. ((-1) + INT_MAX) + 1 is INT_MAX. -1 + (INT_MAX + 1) INT_MAX. -1 + (INT_MAX + 1) , which is undetectable. Assuming associative addition in C ++, then your proof will be wrong, and the code will be broken. You need to either avoid this statement, or prove that there is never an overflow for your particular fragment.

You can say that indefinite addition is a mistake, but you need to use a language with unrelated integers. But most languages have specific functions that interfere with evidence. Take the following code:

 a = true; b = false; f(a); assert a;

It's always like this? Is not a fact. Maybe f will change a . Maybe it will change the parallel flow. Perhaps b assigned an alias a , so changing it will also change a (note: aliases are so obstructive in writing evidence that John C. Reynolds had to create a completely new separation logic to cope with this problem). If something is possible in your language, then you should explicitly prove that this is not happening here. This will help clean code, in another case, it can destroy the evidence, because it forces the use of recursion and functions of a higher order. By the way, the one and the other is the basis for writing good functional programs. What is good for programming is bad for proof! (Note: in this lecture, Edmund Clark listed some difficult-to-verify properties: "floating-point commas, strings, user-defined types, procedures, concurrency, generic templates, storage, libraries ...").

Formal verifiers have a dilemma: the more expressive the language, the harder it is to prove something on it. But the less expressive the language, the more difficult it is to write on it. The first working formal languages were very limited subsets of more expressive languages: ACL2 was a subset of Lisp, Euclid was a subset of Pascal, and so on. Nothing that we have discussed so far really proves real programs, these are only attempts to approach to writing evidence.

The evidence is hard. But they become easier. Researchers in this area add new heuristics, libraries of theorems, pre-tested components, and so on. Technical progress also helps: the faster the computers, the faster the search.

SMT revolution

One of the innovations in the mid-2000s was the inclusion of SMT solvers in programs to prove theorems. In a broad sense, an SMT solver can turn (some) mathematical theorems into constraint observance problems. This turns a creative task into a computational one. You may still need to supply it with intermediate problems (lemmas) as steps in a theorem, but this is better than proving everything yourself. The first SMT solvers appeared around 2004, first as academic projects. A couple of years later, Microsoft Research released Z3, a ready-made SMT solver. The big advantage of the Z3 was that it became much more comfortable to use than other SMTs, which, frankly, said almost nothing. Microsoft Research used it internally to help prove the properties of the Windows kernel, so they were not limited to the minimum UX.

SMT struck the FM community because it suddenly made many simple proofs trivial and allowed to tackle very difficult issues. Thus, people could work in more expressive languages, since now the problems of expressive statements began to be solved. Incredible progress is evident in the IronFleet project: using the best SMT solvers and advanced verification language, Microsoft was able to write 5000 lines of verified Dafny code in just 3.7 person-years! This is an incredibly fast pace: as many as four whole lines per day (note: the previous record belonged to seL4 , the developers of which wrote two lines per day in C.

The evidence is hard.

Why do you need it?

It's time to step back and ask: "What is the point?" We are trying to prove that some program meets a certain specification. Correctness is a range. There are two parts of verification: how objectively your program is “correct” and how carefully you have verified that it is correct. Obviously, the more verified the better, but the test costs time and money. If we have several limitations (performance, time to market, cost, etc.), a complete validation check is not necessarily the best option. Then the question arises, what is the minimum check we need and what it costs. In most cases, you need, for example, 90% or 95% or 99% correctness. Maybe you should spend time improving the interface, rather than checking the remaining 1%?

Then the question: "Is the verification 90/95/99% significantly cheaper than 100%?" The answer is "yes." It is quite comfortable to say that the code base, which we have tested and typed well, is basically correct except for a few fixes in production, and we even write more than four lines of code per day. In fact, the vast majority of failures in distributed systems could have been prevented with a little more complete testing. And this is just an extension of tests, not to mention fuzzing, property-based testing, or model testing. You can get a truly outstanding result with these simple tricks without needing to get full proof.

What if typing and tests do not provide sufficient verification? It's still much easier to go from 90% to 99% than from 99% to 100%. As mentioned earlier, Cleanroom is a developer practice that includes comprehensive documentation, thorough flow analysis and extensive code review. No evidence, no formal verification, not even unit testing. But a properly organized Cleanroom reduces the density of defects to less than 1 bug per 1000 lines of code in production (note: figures from Staveley's research in Toward Zero-Defect Programming > but always be skeptical and check the original sources ). Programming Cleanroom does not slow down the pace of development, and certainly goes faster than 4 lines per day. And Cleanroom itself is just one of the many methods for developing highly reliable software that lies in the middle between normal development and code verification. You do not need complete verification to write good software or even almost perfect. There are cases when it is necessary, but for most industries it is a waste of money.

However, this does not mean that formal methods are generally uneconomical. Many of the above-mentioned highly reliable methods are based on writing code specifications that you do not formally prove. With regard to verification, in the industry with profit use two general methods. First, design verification instead of code, which we will look at later. Secondly, partial verification of the code, which we will consider now.

Partial code verification

For everyday tasks, it is too expensive to do a full scan. How about partial? After all, you can benefit from the proof of some properties of individual code fragments. Instead of proving that my function always correctly sorts numbers, I can at least prove that it does not loop forever and never goes out of range. This is also very useful information. So, even the simplest evidence for C programs is a great way to eliminate a huge part of undefined behavior .

The problem is accessibility . Большинство языков спроектированы или для полной верификации, или вообще её не поддерживают. В первом случае вам не хватает многих хороших функций из более выразительных языков, а во втором случае нужен способ доказать вещи на языке, который враждебен самой концепции. По этой причине большинство исследований по частичной верификации сосредоточено на нескольких высокоприоритетных языках, таких как C и Java. Многие работают с ограниченными подмножествами языков. Например, SPARK является ограниченным подмножеством Ada, поэтому вы можете писать критичный код на SPARK и взаимодействовать с менее критичным кодом Ada. Но большинство этих языков довольно нишевые.

Чаще определённые виды верификации встраивают в основную структуру языков. Используемые в продакшне системы типизации — распространённая разновидность: вы можете не знать, что tail всегда возвращает tail, но вы точно знаете, что его тип [a] -> [a] . Есть ещё такие примеры как Rust с доказанной безопасностью памяти и Pony с доказательством безопасности по исключениям. Они немного отличаются от SPARK и Frama-C тем, что способны выполнять только частичную проверку. И ещё их разрабатывают обычно эксперты по языкам программирования, а не эксперты по формальным методам: между двумя дисциплинами много общего, но они не идентичны. Возможно, именно поэтому такие языки, как Rust и Haskell, действительно подходят для использования на практике.

Спецификация дизайна

До сих пор мы говорили только о верификации кода. Однако у формальных методов есть ещё одна сторона, которая более абстрактна и верифицирует сам дизайн, архитектуру проекта. Это настолько глубокий анализ, что является синонимом формальной спецификации : если кто-то говорит, что выполняет формальную спецификацию, скорее всего, это означает, что он определяет и верифицирует дизайн.

Как мы уже говорили, доказать все строчки кода очень, очень сложно. Но многие проблемы в продакшне возникают не из-за кода, а из-за взаимодействия компонентов системы. Если отмахнуться от деталей реализации, например, принять как данность, что система способна распознавать птиц, то нам будет проще анализировать, как деревья и птицы в качестве модулей высокого уровня вписываются в общий дизайн. В таком масштабе становится намного проще описать вещи, которые вы не смогли бы реализовать, например, среду выполнения, человеческие взаимодействия или беспощадный поток времени . В этом масштабе мы формализуем свои намерения с помощью общей системы, а не строк кода. Это гораздо ближе к человеческому уровню, где проекты и требования могут быть гораздо более неоднозначными, чем на уровне кода.

Для пример, возьмём процедуру с грубой спецификацией «если она вызывается, то делает системный вызов для сохранения данных в хранилище и обрабатывает системные ошибки». Свойства проверить сложно, но вполне понятно, как это сделать. Правильно ли сериализуются данные? Нарушаются ли наши гарантии из-за некорректного ввода? Обрабатываем ли мы все возможные способы сбоя системного вызова? Теперь сравните систему логов высокого уровня со спецификацией «все сообщения записываются в лог» и ответьте на такие вопросы:

Записываются все сообщения или все сообщения, которые поступают в систему ? Сообщения записываются один раз или гарантированно один раз?
Как отправляются сообщения? Это очередь? Канал доставляет их только один раз? Всё ли в порядке с доставкой?
Под записью в лог мы имеем в виду запись навсегда? Можно ли занести сообщение в лог, а потом удалить оттуда? Может ли оно «болтаться» между записанным и незаписанным состояниям, прежде чем будет записано окончательно?
Что делать, если сервер взорвётся в середине записи сообщения? Нужно ли журналирование?
Есть ли важные свойства носителя? Выходит ли факт «носитель потерял данные» за рамки наших требований или нет?
Что насчёт GDPR?
и так далее и тому подобное.

Без формального дизайна сложнее выразить действительно нужные требования для системы. Если вы не можете их выразить, то понятия не имеете, действительно ли дизайн соответствует требованиям или просто выглядит похоже на них, но может привести к непредсказуемым последствиям. Более формально выражая намерения и дизайн, мы можем легко проверить, что на самом деле разрабатываем то, что нужно.

Так же, как мы используем языки программирования для представления кода, мы используем языки спецификаций для представления проектов. Языки спецификаций обычно ориентированы на спецификации дизайна, а не спецификации кода, хотя некоторые языки можно использовать для обоих случаев (примечание: процесс сопоставления спецификаций дизайна со спецификациями кода называется уточнением ).В дальнейшем я буду называть языки спецификаций языками дизайна (DL), чтобы свести к минимуму путаницу (опять же, это не распространённая терминология; большинство людей используют «язык спецификаций», но я хочу четко разграничить спецификации кода и спецификации дизайна).

Вероятно, первым полным DL стал VDM примерно в 1972 году. С тех пор мы видели огромное разнообразие различных языков спецификаций. Пространство DL гораздо более разнообразно и фрагментировано, чем у языков верификации кода (CVL). Грубо говоря, люди изобрели DL как средство достижения цели, а CVL — как саму цель. Поскольку на них сильно влияют конкретные проблемные области, в DL реализованы всевозможные идеи и семантика. Вот очень, очень краткий обзор некоторых первых DL:

Tongue	Область моделирования	Средство
Z	бизнес-требования	реляционная алгебра
Promela	messages	CSP
SDL	телекоммуникации	блок-схемы
Диаграммы состояний Харела	контроллеры	автоматы
Таблицы принятия решений	решения	tables

Поскольку DL обычно создаются для решения конкретных проблем, большинство из них имеют по крайней мере два или три реальных применения. Как правило, результаты очень положительные. Практикующие специалисты говорят, что DL даёт понимание проблем и облегчает поиск решений. Долгое время главным чемпионом был Praxis (теперь Altran), который применял «исправления-по-конструкции» — комбинацию Z-конструкций и кода SPARK — для создания критических систем безопасности. Спецификации экономят время и деньги, потому что вы не обнаружите ошибки дизайна на поздней стадии проекта.

Памела Зейв экспериментировала с Alloy и обнаружила фундаментальный баг в Chord, одной из основных распределённых хэш-таблиц. AWS начала находить 35-шаговые критические ошибки , написав спецификации TLA+. По моему опыту, когда люди пытаются писать спецификации, они очень скоро становятся большими поклонниками этого дела.

Но поклонники формальных методов и люди со стороны совершенно по-разному оценивают ценность написания спецификаций. Для поклонников самым большим преимуществом является то, что сам процесс написания дизайна заставляет вас понять, что вы пишете. Когда вам нужно формально выразить то, что делает ваша система, то множество неявных ошибок внезапно становятся болезненно очевидными. Это совершенно не могут понять посторонние. Если хотите дать кому-то попробовать DL, у человека должен быть способ проверить, что дизайн действительно обладает свойствами, которые он хочет.

К счастью, это также чрезвычайно важно для многих спецификаторов, поэтому верификация дизайна — большая область исследований.

Проверка моделей

Как и в случае с кодом, мы можем проверить дизайн, написав теоремы. К счастью, у нас есть ещё один трюк: можно использовать программу проверки моделей (model checker). Вместо составления доказательства, что дизайн правильный, мы просто брутфорсим пространство возможных состояний и смотрим, существует ли в нём неправильное состояние. Если ничего не найдём, то всё в порядке (примечание: программы проверки модели также используются для верификации кода, как JMBC, но в верификации дизайна намного чаще используется проверка модели, чем верификация кода).

У проверки моделей масса преимуществ. Во-первых, не нужно писать доказательства, что экономит много времени и усилий. Во-вторых, не нужно учитьсяwrite evidence, so the entry barrier is much lower. Third, if the design is broken, checking the model will give an explicit counterexample. It is much, much easier to correct the error, especially if you need 35 steps to play it. Try to find it yourself.

There are a couple of drawbacks. Firstly, these tools are not as powerful. In particular, you may encounter an infinite (unbounded) моделью, у которой бесконечное количество разных состояний. Например, вы определяете обработчик очереди сообщений: он нормально работает со списком из десяти сообщений. Но если нужно поверить его на любом списке… ну, их бесконечное количество, поэтому в модели бесконечное количество состояний. У большинства инструментов для проверки моделей есть различные приёмы для подобных ситуаций, такие как определение классов эквивалентности или симметрии, но каждый случай отличается.

Другой большой недостаток — взрыв в пространстве состояний (state-space explosion). Представьте, что у вас три процесса, каждый из которых состоит из четырёх последовательных шагов, и они могут чередовать шаги любым способом. Если они не влияют на поведение друг друга, то получается (4*3)! / (4!)^3 = 34 650 возможных исполнений (поведений). Если у каждого процесса одно из пяти начальных состояний, то общее число поведений возрастает до 4 300 000. И проверка моделей должна убедиться, что все они ведут себя хорошо. И это при условии, что они не взаимодействуют друг с другом! Если они начнут это делать, пространство состояний станет расти ещё быстрее. Комбинаторный взрыв рассматривается как основная проблема для проверки моделей, и предстоит проделать много работы, чтобы решить эту задачу.

Но в то же время есть ещё один способ справиться со взрывом пространства состояний: бросить на него больше оборудования. Самая большая проблема для проверки модели — «всего лишь» проблема производительности, а мы очень хорошо решаем проблемы производительности. Большинство (но не все) проверок моделей легко распараллеливаются. После оптимизации модели и проверки её с небольшими параметрами можно развернуть кластер AWS и запустить его с большими параметрами.

На практике многие спецификаторы используют проверку моделей, а затем по мере необходимости переходят к программам для доказательства теорем (примечание: имейте в виду, что «много спецификаторов» — это человек десять). Ещё больше специалистов по составлению спецификаций запускают проверку моделей, а когда она достигла предела своих возможностей, переключаются на менее интенсивные формы верификации.

Проблема со спецификациями дизайна

Таким образом, верификация дизайна проще и быстрее, чем верификация кода, и продемонстрировала много впечатляющих успехов. Так почему люди ей не пользуются? Проблема с DV гораздо коварнее. Верификация кода — это техническая проблема, а верификация дизайна — проблема социальная: люди просто не видят смысла.

Во многом это следствие того, что дизайн — это не код . В большинстве DL не существует автоматического способа создания кода, а также способа взять существующий код и проверить его на соответствие дизайну (примечание: генерация кода из спецификаций называется синтезом ; для ориентировки см. работы Нади Поликарповой ; доказательство, что код соответствует спецификации (или спецификации соответствует другой спецификации) называется уточнением : по обоим направлениям идут активные исследования).

Программисты склонны не доверять артефактам программного обеспечения, которые не являются кодом или принудительно не синхронизированы с кодом. По этой же причине часто игнорируются документация, комментарии, диаграммы, вики и сообщения в коммитах.

Похоже, программисты просто не верят, что от спецификаций есть какая-то польза. По моему опыту, они предполагают, что текущих инструментов (псевдокод, диаграммы, TDD) более чем достаточно для правильного проектирования. Я не знаю, насколько распространённым является это мнение и не могу его объяснить ничем, кроме общего консерватизма.

Но точно такие жалобы есть у каждого сообщества методологии, которое я знаю: сторонники TDD жалуются, что программисты не хотят пробовать TDD, Фанаты Haskell жалуются, что программисты не думают о статической типизации и так далее.

Я слышал аргумент, что Agile не приемлет заранее продуманный дизайн, поэтому никто не хочет делать формальную спецификацию. Может быть. Но многие из тех, кого я встречал, отвергают и Agile, и FM. Ещё один аргумент заключается в том, что исторически формальные методы постоянно переоценивались и не выполняли обещанное. Это тоже возможно, но большинство людей даже не слышали о формальных методах, а уж тем более об их истории.

Просто очень трудно заставить людей волноваться о том, чего они ещё не делают, даже если признают преимущества.

Summary

Верификация кода — трудная задача. В это вовлекается всё больше людей, хотя программы для доказательства теорем и SMT-решатели становятся всё более сложными. Но всё-таки в обозримом будущем, наверное, это останется уделом специалистов.

Верификация дизайна намного проще, но для её использования требуется преодолеть культурный барьер. Думаю, ситуацию можно изменить. Двадцать лет назад автоматизированное тестирование и код-ревью были довольно экзотическими и нишевыми темами, но в конечном итоге стали мейнстримом. Опять же, контрактное программирование были нишевым и остаётся таковым.

Надеюсь, эта статья чуть лучше объясняет, почему формальные методы так редко используются. По крайней мере, это лучшее объяснение, чем обычный аргумент «веб — это же не самолёт». Не стесняйтесь кричать, если увидите какие-то очевидные ошибки.

Source: https://habr.com/ru/post/437296/