Good evening, friends. It is hard to phrase a question that fully captures the essence of the problem, so I will try to explain what I mean.

In short, I need to modify Text_Diff (I am not tied to this library; it is just the first one I found online) so that, in addition to comparing two versions, it can display users' edits, let a moderator delete or restore them, and build the final document from the result:

"Original text version" -> "Approved edits" -> "Final text version".

For example:

Original version

This is the “fish” text often used in print and web design. Lorem Ipsum is a standard “fish” for Latin texts from the beginning of the 16th century.


User 1's edit (approved)

This is the “fish” text often used in print and web design. Lorem Ipsum is a standard “fish” for Latin texts from the beginning of the 16th century. Addition from user 1


User 2's edit (approved)

This is not only the “fish” text but also the crustacean text, often used in print and web design. Lorem Ipsum is a standard “fish” for Latin texts from the beginning of the 16th century. Addition from user 2


User 3's edit (rejected)

This is fabulous nonsense, the “fish” text often used in print and web design. Lorem Ipsum is a standard “fish” for Latin texts from the beginning of the 16th century. Crap from user 3


Moderator version

This is fabulous nonsense, not only the “fish” text but also the crustacean text, often used in print and web design. Lorem Ipsum is a standard “fish” for Latin texts from the beginning of the 16th century. Addition from user 1 Addition from user 2 Crap from user 3


Final version

This is not only the “fish” text but also the crustacean text, often used in print and web design. Lorem Ipsum is a standard “fish” for Latin texts from the beginning of the 16th century. Addition from user 1 Addition from user 2

Note that edits are not applied one after another: each edit is made relative to the original version of the text.
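Because every edit is made against the same original, one possible direction is to diff each approved version against the original and merge the resulting operations by their position in the original. The following is a minimal Python sketch under stated assumptions, not a finished solution: it uses the standard-library `difflib` module rather than Text_Diff, splits text on whitespace, and the function name `merge_approved` and the "first approved edit wins" conflict rule are hypothetical choices made only for illustration.

```python
from difflib import SequenceMatcher


def merge_approved(original, approved_versions):
    """Merge approved edits, each diffed word by word against `original`."""
    base = original.split()
    inserts = {}   # base position -> words inserted before that position
    replaced = {}  # base index -> words that replace the word at that index

    for edited in approved_versions:
        new = edited.split()
        matcher = SequenceMatcher(a=base, b=new, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "insert":
                # Insertions at the same position are kept in approval order.
                inserts.setdefault(i1, []).extend(new[j1:j2])
            elif tag in ("replace", "delete"):
                # Assumed conflict rule: skip an edit that touches words
                # already changed by an earlier approved edit.
                if any(i in replaced for i in range(i1, i2)):
                    continue
                replaced[i1] = new[j1:j2]
                for i in range(i1 + 1, i2):
                    replaced[i] = []

    out = []
    for i, word in enumerate(base):
        out.extend(inserts.get(i, []))
        out.extend(replaced.get(i, [word]))
    out.extend(inserts.get(len(base), []))  # insertions after the last word
    return " ".join(out)


print(merge_approved("a b c d", ["a b c d e", "x b c d f"]))
# prints "x b c d e f"
```

Rejected edits are simply left out of `approved_versions`, so the original text never has to be re-diffed when a moderator changes a decision.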

My problem is with the architecture itself: I have not come up with anything more sensible than running dozens of version comparisons for each user.

So far my unusable sketch looks like this:

  1. Create a table that stores the text of each user's edit. The edit is extracted by comparing the original version of the text with the user's edited version and writing the difference to the table.
  2. Create a table holding the combined text of edits from all users, to be compared later with each user's text in order to highlight their edits. This is a dead end: it is not clear to me how to handle the order of the text, because each new revision is made relative to the original version of the document rather than on top of the previous revision.
  3. If two different users touch the same text, place their edits one after the other in the document. This is also a gap: hypothetically, a three-way comparison could be made between the previous combined version with other users' edits, the current user's version, and the original version in order to highlight repeated edits, but that does not sound great.
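Step 1 of the list above could be sketched as follows. This is a minimal illustration in Python, assuming the standard-library `difflib` module in place of Text_Diff and a whitespace-based split into words; the function name `word_ops` and the tuple layout are hypothetical. Each stored operation refers to positions in the original text, so one comparison per user is enough and the order in which users submitted their edits does not matter.

```python
from difflib import SequenceMatcher


def word_ops(original, edited):
    """Return the non-equal diff operations that turn `original` into `edited`.

    Each operation is (tag, start, end, new_words), where start and end are
    word positions in the ORIGINAL text and tag is one of
    "insert", "delete" or "replace".
    """
    a, b = original.split(), edited.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            ops.append((tag, i1, i2, b[j1:j2]))
    return ops


original = "Lorem Ipsum is a standard fish"
edited = "Lorem Ipsum is a standard fish Addition from user 1"
print(word_ops(original, edited))
# prints [('insert', 6, 6, ['Addition', 'from', 'user', '1'])]
```

A row in the edits table could then hold the user, the approval status, and these operations, for example serialized as JSON.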

As a result, there would be an enormous number of text_diff calls (and even after comparing everything with everything, the chronology problem remains), which is poor in terms of both architecture and performance. So I would like to ask for help finding the right direction or, better yet, a pointer to a ready-made library xD.

Thanks to everyone who read to the end ...

P.S. Yes, I know I ought to take a course on algorithms, study sorting algorithms, and so on, and I will certainly do so in the future. I also know that I really should know all this already, but I'm sorry ...

    1 answer

    Take the original string followed by three edited versions:

    Just text that was originally!
    Not just text that was originally!
    Just terrible text that was originally!
    Just text that was not originally!

    We compare the strings word by word, each edited version against the original, ignoring case. For example, starting with the first edit:

    1) Not -> Just
    1.1) No match: store the word "Not" in the array as $arr[1] (key 1 because it is the first word)
    2) just -> Just
    ...

    Second edit:

     1) Just -> Just
     2) terrible -> text
     2.1) No match: store the word "terrible" in the array as $arr[2] (key 2 because it is the second word)
     ...

    Third edit:

     1) Just -> Just
     2) text -> text
     3) that -> that
     4) was -> was
     5) not -> originally
     5.1) No match: store the word "not" in the array as $arr[5] (key 5 because it is the fifth word)
     ...
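The comparison walked through above can be sketched in Python (the answer's `$arr` notation suggests PHP; Python is used here only for illustration). The function name `collect_insertions` is hypothetical, words are split on whitespace, and the example strings are written without articles so that word positions line up. As in the answer, only inserted words are handled, not deletions.

```python
def collect_insertions(original, edited):
    """Map a 1-based word position in `original` to the words inserted there."""
    orig_words = original.split()
    inserted = {}
    i = 0  # index of the original word we are currently trying to match
    for word in edited.split():
        if i < len(orig_words) and word.lower() == orig_words[i].lower():
            i += 1  # words match (ignoring case): move on in the original
        else:
            # No match: store the edited word under the current position and
            # compare the next edited word with the same original word.
            inserted.setdefault(i + 1, []).append(word)
    return {position: " ".join(words) for position, words in inserted.items()}


original = "Just text that was originally!"
print(collect_insertions(original, "Not just text that was originally!"))
# prints {1: 'Not'}
print(collect_insertions(original, "Just terrible text that was originally!"))
# prints {2: 'terrible'}
print(collect_insertions(original, "Just text that was not originally!"))
# prints {5: 'not'}
```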

    Now we merge: take each $arr[m] and, since the key m tells us the word's position, insert its words into the original string at that position.

    For our case, it looks like this:

    Sort the array by key m.

    $arr[1]: emit our word at position one

    Not

    $arr[2]: emit word number one from the original message, then our changed word number two

    Not Just terrible

    $arr[5]: emit words 2, 3 and 4 from the original message, then our word number five; since nothing else is left in the array, finish with the remaining words of the original message.

    Not Just terrible text that was not originally!
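The merge step can be sketched like this. It is a minimal Python illustration with a hypothetical function name, `merge_insertions`; the per-edit dictionaries play the role of `$arr`, with 1-based word positions as keys, and the example sentence is written without articles so that word positions line up.

```python
def merge_insertions(original, edits):
    """Rebuild the text from `original` plus position-keyed insertions."""
    combined = {}
    for edit in edits:
        for position, words in edit.items():
            # Insertions that share a position are kept side by side.
            combined.setdefault(position, []).append(words)

    orig_words = original.split()
    result = []
    for position, word in enumerate(orig_words, start=1):
        result.extend(combined.get(position, []))  # inserted words come first
        result.append(word)                        # then the original word
    # Insertions keyed just past the last original word go at the end.
    result.extend(combined.get(len(orig_words) + 1, []))
    return " ".join(result)


edits = [{1: "Not"}, {2: "terrible"}, {5: "not"}]
print(merge_insertions("Just text that was originally!", edits))
# prints "Not Just terrible text that was not originally!"
```

Iterating over the original words in order makes an explicit sort by key unnecessary in this sketch.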

    Also, if two edits produce the same key m, join their values with a space.

    If the words in a comparison do not match, take the next word from the edited message and compare it with the same original word as in the previous comparison; if it still does not match, append it under the same key.

    P.S. Deleted words can be handled in a similar way, and of course case-sensitive handling should be added to this description.