Given an input array of numbers, for example {3, 2, 1, 6}, determine whether it can be split into two subsets with equal sums; in this case it can: 3 + 2 + 1 = 6.

A detailed description of the solution is at http://www.geeksforgeeks.org/dynamic-programming-set-18-partition-problem/ (all code below is from that link). The dynamic-programming solution fills in a boolean table:

 int sum = 0;
 int i, j;

 // Calculate sum of all elements
 for (i = 0; i < n; i++)
     sum += arr[i];
 if (sum % 2 != 0)
     return false;

 bool part[sum/2 + 1][n + 1];

 // initialize top row as true
 for (i = 0; i <= n; i++)
     part[0][i] = true;

 // initialize leftmost column, except part[0][0], as false
 for (i = 1; i <= sum/2; i++)
     part[i][0] = false;

 // Fill the partition table in bottom-up manner
 for (i = 1; i <= sum/2; i++)
 {
     for (j = 1; j <= n; j++)
     {
         part[i][j] = part[i][j-1];
         if (i >= arr[j-1])
             part[i][j] = part[i][j] || part[i - arr[j-1]][j-1];
     }
 }

 /* // uncomment this part to print table
 for (i = 0; i <= sum/2; i++)
 {
     for (j = 0; j <= n; j++)
         printf("%4d", part[i][j]);
     printf("\n");
 }
 */

 return part[sum/2][n];

I cannot understand the logic of the filling, namely this line:

 part[i][j] = part[i][j] || part[i - arr[j-1]][j-1]; 

Please explain it in the simplest possible terms: why do we take the cell part[i - arr[j-1]][j-1] with these indices?

    2 answers

    The logic is as follows. First consider the line

     part[i][j] = part[i][j-1]; 

    If the previous column (that is, the subset at iteration j-1) can already form the sum i, then a subset with one more element certainly can too. Say i = 3, the subset at the previous iteration is {2,1}, and the subset at the current iteration is {2,1,10}. Clearly, if 3 can be made from {2,1}, it can also be made from the larger subset.

    Now the line:

     part[i][j] = part[i][j] || part[i - arr[j-1]][j-1]; 

    For now, let's ignore the || (OR) operator and look at the right-hand operand of the expression.

    arr[j-1] is the element that is added to the previous subset to get the current subset (minus one because the loop counter starts from one while array indexing starts from zero).

    The column index [j-1] is the column for the previous subset.

    Now, what is the magic of the row index [i - arr[j-1]]? A concrete example should make it clearer. Suppose arr[j-1] is the number 3 and i = 5, and we need to check whether 5 can be composed from the current subset. How can we find out? We have our current element 3, but 3 is not 5: we are still short by 2. So if 2 can be made from the previous subset, then together with our current 3 we can make 5. That is exactly what this index does: from the sum we need, we subtract the current element and check whether the remainder can be composed from the previous subset.

    Now the OR operator. We do not actually care how the number is formed. There are two ways to get it (either it was already achievable with the previous subset, or we can assemble it using our new element), and either one is fine. If at least one of the two works, we put true.

      Dynamic programming is usually understood as follows: we invent a data structure, most often an array of the required size with the necessary number of dimensions (part in your code), together with a clever algorithm for filling it iteratively via a recurrence. When the filling reaches certain indices, the value at those indices is the answer to the problem. But the filling algorithm is often unobvious and opaque, hence questions like this one.

      I suggest the following approach: recursion plus memoization. Some sources also call this dynamic programming, and I fully agree with that. For this task (and for most similar ones) it is very easy to come up with a trivial recursive algorithm; its implementation looks like this:

       bool f(int i, int s) { return (s == 0) ? true : (i < 0) ? false : f(i-1, s) || f(i-1, s - a[i]); } 

      The function (similar to the one at your link) is called with the maximum array index and half the sum of the elements, and returns true if some subset of the array elements adds up to half that sum. The problem is solved, but for large arrays the running time grows exponentially (as with naive Fibonacci). The fix is to create an array in which we store already-computed values of the function. Our function takes two arguments, so a two-dimensional array is convenient. On each call to f we first check whether the value for these arguments has already been computed (is present in our array); if so, we take it instead of descending into an exponential tree of recursive calls. Our function is pure, with no side effects, so this approach is correct. And it suddenly turns out that the part array in the code above plays exactly the same role, and ends up filled with the same values! There is no need to invent a clever iterative-recurrent filling algorithm: the magic of recursion does everything for us :) Although, if desired, that algorithm is easy to derive from the known logic of the recursive function.

      • TL;DR: it's not the same thing! I completely disagree with this interpretation. Recursion with memoization (I don't know why it is called that, but the term is accepted) and dynamic programming are different methods; their areas of application partially overlap, but that does not make them the same. The DP method is almost always more efficient in speed and memory, and it needs no stack. However, it is normally applicable only when the dependency graph (the one the recursion would traverse) has a simple topological ordering. - pavel
      • In my comments I focused on what these methods have in common, to get the message across to the author of the question. As for the differences: yes, the recursive version uses the stack (although it is not that simple; in Haskell-like languages there is no mutability and the stack partially lives on the heap, so such recursive memoized algorithms do not lead to stack overflow). On the other hand, there are tasks where the whole DP structure does not need to be filled, and memoization during recursion fills only what is actually needed. - Andrey Ivanov
      • But I repeat: I deliberately highlighted the common features of the algorithms so that in future similar tasks the OP could derive the classical DP from the naive recursive algorithm on their own. - Andrey Ivanov