What we should build a road. Part 1

The daily route of most of us is limited to going from home to work and back. And the most difficult obstacle that can slow down our movement is traffic jams. But in our country there are a huge number of places that can be reached only by special transport.

Such transport is good if you do not need to carry large volumes of cargo or go to such difficult-to-reach areas on a regular basis. Then we should think about the construction of infrastructure, the movement of which is possible and the usual civilian transport.

And it’s good if the whole complexity of designing a future route lies in finding a ruler and a pencil in order to draw a straight line on the map connecting a pair of objects. But, alas, our reality is very different from this. What if the terrain when flying over it resembles a good piece of Swiss cheese?

For more than a year, my colleagues and I from the research lab Digital Design have been working on creating a tool that can build various communications networks automatically. For details, welcome under cat.

How it all began

Usually trips to such remote places are due to a certain financial gain. And although it is often known in advance that the development of these areas will bring a good profit, why not save money on things on which this can be done without much difficulty? What if a more competently designed route will allow us not to build a few extra kilometers of the highway?

But only the presence of roads to get some significant profit is not enough. And basically it is only brought deposits of minerals. Therefore, in addition to the roads, it is necessary to design a network of various pipelines, a network of water lines, a network of transmission lines with correctly placed electrical substations. All of this infrastructure can be described as linear objects.

And when an engineer is faced with the task of designing a communications network in an area with complex engineering and geological conditions, an incredibly complex analytical task falls on him.

How not to get to the spill area of the river? How to minimize the path on permafrost? How to save on the volume of sand pillows when laying the road through the swamp? Where to get sand for this pillow?

This is only a small fraction of the questions that the engineer asks himself during the work, and yet it is still necessary to take into account various GOST and SNiPs. And ok, if you want to connect only two objects, and if there are a couple of dozen objects?

Therefore, the design sphere is in dire need of a tool that can calculate both the cost of linear objects designed by an engineer and the ability to build them automatically.

Data

The first thing we had to face when we started the task was to search for data. What data is required? Where to get them? And if the second question can be solved with the help of the user, then for the understanding of the first question we needed a serious analyst.

And the data for the correct calculation requires a lot of real - from the most banal markings of the terrain, on which lakes and rivers are marked, various swamps, as well as karst areas, which are not recommended to be built on because of the threat of collapse. Therefore, in addition to the areas that can be seen from the satellite, geological research data are also required here.

But not all data is limited to natural objects. There are also a bunch of things that were created by man. For example, in the case of road construction, we can use the roads that were built earlier. But in the construction of pipelines using the existing infrastructure is not always possible. Since the connection of the new pipeline to the existing network may well not pass the hydraulic calculation.

In addition to the re-use of what was previously created by man, one must also take into account various restrictions on the construction of nearby objects. For example, for the construction of transmission lines, one must take into account the distance from human habitats depending on the voltage on the transmission lines themselves.

But it is not entirely correct to consider the calculation of the task only on the plane, since in addition to the flat terrain, there are areas with a large elevation difference, on which construction is also impossible.

Acquiring data on terrain is a much simpler process than data on the marking of areas and existing infrastructure. To simplify development, we use the SRTM (Shuttle Radar Topography Mission) open data that anyone can get.

So, we have the data, we know where we can build, where we cannot build, there are construction costs for different areas of the area. We have a region, it contains a set of objects that we want to connect into a single communication network. In mathematical terms, this will sound like a search for a solution to the Steiner problem.

A bit of math

It is known that each formula reduces the number of readers by half, so we have dramatically reduced this section ...

In order to optimize the route of the projected linear object, it is required to be able to calculate its construction cost. Each linear object can be represented as a polyline. $L = \ {A_i, B_i \} _ {i = 0} ^ n$ consisting of segments $AB_ {i}$ .
The construction cost of each straight segment can be represented as a function value. $C_ {AB_ {i}}$ where each segment is set parametrically:

$$ display $$ {\ displaystyle AB_i \ colon \ left \ {{\ \ {{aligned} & s = s \ left (t \ right) - \ mbox {parameterized curve}, \\ & h = h \ left (s (t ) \ right) - \ mbox {relief height function}, \\ & c = c \ left (s (t) \ right) - \ mbox {specific construction cost function at a point}, \\ end {aligned}} \ right .} $$ display $$Where $t \ in \ left [0, 1 \ right]$ - segment parametrization.

$C_ {AB_i} = \ int \ limits_ {AB_i} c \ left (s (t) \ right) \ sqrt {\ left [g_ {xy} (s (t)) + \ frac {\ partial h} {\ partial s_x} \ cdot \ frac {\ partial h} {\ partial s_y} \ right] \ dot {s} _ {x} \ dot {s} _ {y}} dt$ where $g$ - the metric tensor of the Earth's surface without relief, that is, in simple words, the function of measuring the distance on the Earth's surface.

Due to the fact that our planet has the shape of a geoid, then for measuring distances one has to use special formulas. For this, we use the Haversinus formula. In it, the shape of the planet has the form of a sphere, but this is sufficient for our purposes, since the measurement error is about 0.3% , and the speed of calculating distances in this way remains high.

Approaches to finding the best path

But before moving on to connecting a group of objects, you need to learn how to find the best path between the two. Two ways immediately come to mind:

construct a graph, and then apply the shortest path search methods;
use a solution based on optimization methods.

When implementing the first method, we have a problem associated with the method of constructing a graph to obtain the highest possible accuracy of the solution. It is necessary to find a balance between the number of vertices and edges in the graph, the quality of the solution and the time spent on its calculation.

In the second case, the main problem is local minima. It is easy to choose a case when the solution may not converge or be far from optimal. Since the solution obtained in this way is extremely dependent on the initial approximation. If we have a good initial approximation, then the result will be qualitative.

So, we have two approaches to solving this problem. The first one has rather low quality, but there is no problem of local minima. The second has a problem of local minima, but the high quality of the solution. And if you correctly combine these two approaches, then the shortcomings of each will disappear. We take the solution obtained on the graph as an initial approximation, and then apply optimization methods to it.

In this part of the article we will consider exactly the proposed solution of this problem on the graph.

Tool selection

The action plan is planned, it remains only to “code”. Since at the initial stage of writing the application, we had little knowledge of the subject area, we needed a flexible language in order to rewrite the entire code in case of some rough architectural miscalculations for a couple of ~~cans of beer~~ evenings. Also, optionally, there was a wish to have libraries for all occasions, in order not to resort to the invention of bicycles. Therefore, Python was chosen as the main programming language.

In the application, we used an impressive technological stack, which is not limited to the following:

shapely to work with geometric data, such as polygons and polylines;
geopandas for convenient work with shapely ;
scipy for storing the calculated graph and finding the shortest path on it, as well as finding the minimum spanning trees;
PyTorch to work with the cost function;
pillow to work with images.

Implementation

The module can be divided into several stages:

generation of the cost function;
generation of computational grid;
graph construction on the basis of the grid;
graph weighting;
calculation of shortest paths on a graph.

Since the edges of the graph are segments, we can weigh these edges by calculating the value of a certain integral, which is presented above. But there are a lot of edges in the graph, and, accordingly, the values of values and heights will need to be obtained for a very large number of points, which can take a lot of time.

Initially, the idea was to find the specific cost of construction by determining whether a point belongs to a polygon. But this thought was dismissed because of the high algorithmic complexity of this approach, since there can be many polygons, and each of them has a large number of vertices.

Therefore, to solve this problem, all the polygons that we have we can present in the form of a picture, which in mathematical form is a matrix. Thus, we can get the specific cost of construction at the point in O (1) , and therefore get the cost of building a certain rib with high accuracy.

Now we are ready to build a graph, but there is no uniquely correct way to build it to obtain high-precision solutions. In order to build a graph, it is required to generate a computational grid, namely, a set of points in the considered area, connected in appropriate order by segments that form cell faces. Grids can be divided into two types: uniform and non-uniform.

The uniform grid model describes the coordinates of individual points, so that the distance between the grid nodes is the same. An uneven grid is represented as a ~~chaotic~~ collection of individual points.

Using a uniform grid will not always give a qualitative result for finding the shortest path, so it is necessary to test this approach on an uneven grid. The most versatile is the triangular mesh.

The grid was generated by introducing noise with a certain coefficient into a uniform rectangular grid, and then applying Delaunay triangulation points for this set.

The effectiveness of the constructed grids was tested on various model terrain maps, which took into account the different cost of the regions and the terrain, and then were compared with the solution obtained analytically. By sorting the length of the edge of the graph and the coefficient for introducing noise into a uniform grid, optimal parameters were found for constructing the optimal path between two points.

Main problems

At first everything went well, the algorithm calculated the optimal networks, but one day there was a problem with the quality of the constructed path. The case was quite simple: it was necessary to connect two objects on a fairly long map. The algorithm did not at all want to build a route that was obviously correct.

The problem, as always, was with the computational grid. We continued our experiments with the appearance of grids and concluded that connecting the vertices into a graph, where each vertex is connected to the n- nearest neighbors, is the solution to our problems.

Networking

After we learned to find the optimal paths between two points, the next step was to solve the problem of building an optimal network of linear objects. The algorithms for constructing Steiner trees on graphs encountered in the literature are focused on more general problems, and therefore will be ineffective in our case. Our graph is highly sparse, with a small number of terminal vertices, relative to the total number of vertices in the graph, and our task is also applied. Therefore, for a number of other reasons, it was decided to develop a proprietary algorithm that uses certain heuristics to effectively build an optimal network.

The description of the algorithm for constructing an optimal network on a graph for our case is a topic for a separate publication, here we will not consider it. Since, when calculating a real task, it is necessary to take into account many additional conditions from the construction sphere that are relevant to the end user, which impose certain restrictions on the algorithm used.

Real case

So now we turn to the results of the algorithm. Our task is to connect a given set of objects with a communication network. Comparison of the results of the algorithm took place with a previously built road network. The total length of the existing network is 153.5 km versus 122.6 km from the calculated one. This gives about a 25% reduction in the length of the road network, which in the end will save about 5-15% on the cost of capital construction.

You can take a closer look at the calculation results here .

The description of the applied optimization methods will be in the next part of the article.

Source: https://habr.com/ru/post/437478/