What is better, allocate a lot of memory or a lot of pointers to memory?

Question

There is a task, you need to create a weighted graph in the form of an adjacency matrix. We need an int-array, which will simultaneously mean both the connection and its "price". Well, the whole joke is that (in real life) not all elements will be connected with each other, which means some memory will be idle => it will be smarter to create an array of pointers, and those elements that are not used = NULL, t . memory will not occupy ... But again, I occupy a lot of memory for addresses. What is better? In general, do pointers have "weight"?

Of course it has: 4 bytes on the x86 architect and 8 bytes on the x64 architect.
On the topic: write down for yourself what operations will be carried out with this graph, "approximately" describe the algorithms that you will use for this.
After that, you can "approximately" weigh what speed you need for any particular algorithm.
If the number of algorithms for which the speed will be more important will be more than for the others, you can turn a blind eye to some costs.
No - it means you have to find a compromise, for example: pack / unpack a graph on demand, or in portions
except for packing / unpacking ... I just note the relationship between "memory saving" and "work speed".
traversal of an array of indirect references in the general case is a rather slow operation, since
- And in terms of memory capacity - you can safely calculate how much more efficient it will be to store pointers: there will be no benefit to x86 without additional packaging algorithms.
Make the code as little as possible dependent on the exact memory allocation strategy, implement as it seems more convenient to you, when you come up against the need for improvements, then you will improve.

KoVadim KoVadim 85.7k five 67 128 · Accepted Answer · 2013-03-18T11:10:38

Knut in the first volume has an example of how such matrices can be stored, quite economically and with the ability to quickly index.

A structure is used to store one item.

 struct Node { int row; // это и следующее поле можно и исключить int col; // ниже будет описана замена. Node * next; Node * bottom; Data data; // а это одно (или несколько полей), где собственно хранятся данные. }

Therefore, it turns out that for each element you need two additional pointers (8-16 bytes) and, possibly, another 4-8-16 bytes to speed up the work (and possibly debugging).

Indices are also not required to be stored in int . It may well be that short may be sufficient.

Data is stored as follows. Each node stores a link to the next node in the row and the next node in the column. If there is no node in the given line / column anymore to the right / below, then the link to the topmost / left one is stored, that is, we loop.

If the data is 4 bytes, and the system is 32 bit, then already when filling less than a third, there will already be a gain.

How to work with such a matrix

We get one vector which will store pointers to lists of lines. To find an element of an index, you just need to take the necessary line in the vector, and then just go through the elements until you find the right one. If you need to calculate the amount per line - it is also not difficult - you just need to get a line and sum up the elements. Zero is not, but they do not change the picture.

But if the dimensions of the matrix are large (for example, 100,000 per 100,000), then it makes sense to simply make a list of all existing rows and columns.

Summing it all up, we get this

 struct Node { int row; int col; Node * next; Node * bottom; Data data; // а это одно (или несколько полей), где собственно хранятся данные. } std::list<Node*> rows; std::list<Node*> cols; // дальше алгоритмы словесные. Node * get(int i, int j) { // Пройтись циклом по rows до нужной строки. // Если такой строки нет, значит элемент нулевой // иначе проходим по строке (используя указатель next), ищем нужный элемент. } // Добавление элемента. void add(Node * n) { // Находим нужную строку. Если строки в списке rows нет, нужно вставить. // добавляем этот элемент в эту строку. // если строка есть, то проходим циклом, что бы найти место вставки, // поправляем два указателя // аналогичное проделываем с столбцами. } // удаление аналогично.

Maybe I modified the idea of Knut a little, so I strongly recommend that you take his book and search (I think it’s easy to find titles, the first volume).

Also in the book it is recommended to create your own pool of objects, if it is assumed that the data will be modified frequently.

Once again, this algorithm will be very effective if the number of elements is significantly (hundreds of times) less than the total capacity of the matrix.

IMHO for the specified bottom algorithms in Node is not needed.
If there are few empty lines, you can add a hash to search the line.
Or (if most of the lines are filled) generally use a vector of lines
If some elements have abnormally many links, then you can also hash the list of links (i.e., string) for them.
- Again, the question to the author: what data do you have (how many elements, average number of connections of one element)?
In general, I correctly understood that bottom is a list of (ring) all nodes with the same col ?
Knut, as I recall, does not have lists of rows and columns; he solves the problem of indexation there in another way.
next is a pointer to the next (right) element in the row , and bottom is the next (below) element in the column .

fogbit fogbit 3,366 7 eight · Answer 2 · 2013-03-18T08:21:18

Two standard representations of links in a graph are 1) a matrix, where the element (i, j) indicates whether there is a connection between the i-th and j-th vertices and 2) an array of lists, where the i-th element of the array contains a list of connected vertices. If there is no connection (i, j) in the column, then the i-th list will not contain the element j.

Both approaches have pros and cons. Plus a second approach in saving memory.

> some memory is idle => it will be smarter to create an array of pointers, and those elements that are not used will be = NULL, i.e.
memory will not occupy ... But again, I occupy a lot of memory for addresses.
An answer was given to the question “what’s better” when “some amount of memory is idle”.
If you read more carefully, you can understand that I meant 1 of 2 options.
@Djonny, if you definitely want a choice between a matrix of numbers and a matrix of pointers to numbers, then the answer is obvious: The matrix of numbers.
Moreover, if all the weights fit into 0..255 (or -128..127), then we can take the matrix of bytes.
An even smaller range of weights can be packaged into an even more compact matrix of several bits per element.
And if you express yourself more precisely, you can get answers to precisely those questions that were meant.

renegator renegator 3,848 7 7 · Answer 3 · 2013-03-18T10:17:56

This is called highly sparse matrices, considered, for example, by the Whip. There is a long book Pisanetski strictly on this topic.

cheb cheb 184 one four · Answer 4 · 2013-06-11T20:17:22

Just create a pointer to a pointer;) **

What is better, allocate a lot of memory or a lot of pointers to memory?

4 answers 4

More articles: