There is a task. Find the largest common increasing sequence among two sequences of length n and m. I wrote an algorithm for O (n ^ 2), but I was told that there is a solution for O (n). Waiting for help! If you specify the direction of movement, it will be good too. The algorithm should be implemented in C ++ as a form factor.
5 answers
The more general problem of finding the largest common subsequence (for the case of two sequences) is solved by dynamic programming in O (N * M).
The related problem of finding the largest common substring in two lines (the alphabet is limited) is solved by suffix trees for O (N + M).
I think the truth is somewhere nearby). The idea is to use the condition of increasing the subsequence.
It is also worth looking at the suffix trees.
It seems to me that somewhere here a mistake or inaccuracy of wording has crept in. Even the task of finding the largest increasing subsequence of one given sequence is solved in the general case for O (n log n), and here it is required for O (n + m) to find the largest common such subsequence for a pair of sequences?
Perhaps, meant all the same substrings? If the original sequence increases, the solution has already been given. If not, then you can divide them into ascending sections (the answer cannot cross the border of such areas) and try to achieve the desired asymptotics (I don’t know if it will work out). You can also really apply suffix trees, more precisely, this algorithm. It seems that it is easy to modify it for increasing substrings, going only along increasing paths in a tree.
- The task of finding the largest increasing subsequence of one given sequence is solved in O (n). - Nicolas Chabanovsky ♦
- I am afraid that I don’t know such an algorithm, and I’m even sure that the best estimate for this problem that I have met is O (n log (n)). Could you give a link to the solution for O (n)? - ofitserov
- I have no links. I once solved it. I have a solution. If you ask a question, on the sort of - how to solve the problem of finding the largest increasing subsequence of one given sequence in O (n), I will definitely put out the code. However, another problem is discussed in this question. - Nicolas Chabanovsky ♦
I managed to solve only for O (n ^ 2). I think it will not work much faster. here is the code
int comlen( char *p, char *q) { int maxlen = 0; while ( (*p != '\0' || *q != '\0' ) && *p++ == *q++) ++maxlen; return maxlen; } int isInc( char *p, int len) { int i = 1; int k = 0; for (; *p != '\0' && i < len; ++i, ++k) { if (p[i]<p[k]) return 0; } return 1; } int lcis(char * str1, char * str2, int len) { int maxlen = -1; int thislen = -1; int maxi = 0, maxj = 0; for (int i = 0; i < len; ++i){ for (int j = 0; j < len; ++j) { thislen = comlen( &str1[i], &str2[j]); if (isInc(&str1[i], thislen)){ if (thislen > maxlen) { maxlen = thislen; maxi = i; maxj = j; } } } } return maxlen; }
- 2Perhaps, all the same for O (n ^ 3)? Two nested loops + a comlen call that performs a linear search. Example: 2 lines of characters 'a' of length n. - ofitserov
- A typo. Yes, in the worst case, the time will be of the order of O (n ^ 3), and on the average O (n ^ 2). You can add a loop exit condition, for example, when both lines are the same (compare the current length with the length of the min. Line). - Nicolas Chabanovsky ♦
There is an algorithm 0 (n ~ m). Something between n and m.
The meaning is:
- There are global iterators (or pointers) to:
- start in the first sequence, initially = 0
- start in the second sequence, initially = 0
- sequence length, initially = 0
- current variables
- the beginning in the first sequence, initially = 0 // is always equal to 0, unless we are in a common subsequence
- the beginning in the second sequence, initially = 0 // is always equal to 0, unless we are in a common subsequence
- sequence length, initially = 0
- there are some current pointers and loop condition
for (iterator i1 = listN.begin(), iterator i2=listM.begin(); i1!=listN.end() && i2!=listM.end(); ) { if (мы не в общей подпоследовательности) { if (*il=*i2) { то начинаем новую подпоседовательность, заодно длина текущей подпоследовательности ++; ++i1; ++i2; } else { сдвигаем тот итератор, значение по которому меньше; } } else if (мы в общей подпоследовательности) { if (*il=*i2) { длина текущей подпоследовательности ++; ++i1; ++i2; } else { общая подпоследовательность закончилась, сравниваем ее с найденной глобальной; если текущая больше глобальной, приравниваем глобальную текущей; ставим текущие начала последовательности в NULL, а текущую длину подпоследоваельности в 0; сдвигаем тот итератор, значение по которому меньше; } } }
for (iterator i1 = listN.begin(), iterator i2=listM.begin(); i1!=listN.end() && i2!=listM.end(); ) { if (мы не в общей подпоследовательности) { if (*il=*i2) { то начинаем новую подпоседовательность, заодно длина текущей подпоследовательности ++; ++i1; ++i2; } else { сдвигаем тот итератор, значение по которому меньше; } } else if (мы в общей подпоследовательности) { if (*il=*i2) { длина текущей подпоследовательности ++; ++i1; ++i2; } else { общая подпоследовательность закончилась, сравниваем ее с найденной глобальной; если текущая больше глобальной, приравниваем глобальную текущей; ставим текущие начала последовательности в NULL, а текущую длину подпоследоваельности в 0; сдвигаем тот итератор, значение по которому меньше; } } }
After looping through the global variables, the result will lie: if there was a subsequence, then both iterators are iterated, if the values are equal, or 1 of them, by which the value is greater, until the end of one of the lists is reached.
- oneFirst, it looks like an algorithm for finding a common increasing substring, rather than a subsequence (in a subsequence, the elements were not necessarily adjacent to the original sequence). Secondly, he does not seem to always find the greatest answer. Example: [9, 1, 2, 3] and [1, 2, 3, 9], he will find [9]. It seems that it works in the event that these sequences increase. - ofitserov
- From the code it is obvious that it works only for increasing sequences. You can try to remake, selecting from the global sequences the increasing parts, and then driving through it. But faster than those who asked what was offered, definitely will not. Upd .: ofitserov already wrote almost the same thing) - uramer239
I suggest paying tribute to this article: The longest increasing subsequence in O (N log N)