There is a hierarchical data structure that is periodically updated based on uploads from an external source. Unloading can be complete or on one specific node.
Each node has a boolean attribute 'hidden', which comes in unloading. Also, the site has some content that may be empty. Also, each node has a Boolean attribute 'empty', which needs to be recalculated at the time of synchronization.
The empty attribute must be set to true if the node meets all the conditions:
- Node has empty content.
- Node has no non-empty and non-hidden descendants.
In the unloading, the descendants are guaranteed to follow their parents, so earlier the definition of empty was carried out as follows:
The program runs sequentially through the upload, each node adds to the database or updates, if it is already there. Looks in unloading, whether the content is empty. It makes a selection in its database of the immediate descendants of this node, which are not empty or hidden. Based on this, sets the value to empty.
After that, the program consistently goes up the chain of parents of this node and does the same procedure for them. This is necessary because the information about the descendants will be in the not yet worked out part of the upload, and if their list has been updated or their values are empty and hidden, then the empty value of the parent may be invalid and it should be recalculated.
I know that the algorithm does not look optimal, but I did not write it.
Now the task has come: to make the node so that it can be like an alias (link) to another node. This is necessary to implement a parallel hierarchy.
Accordingly, in the calculation of empty it is necessary to take into account that the node must be non-empty if it refers to a non-empty node. The problem is that the alias may be present in the unloading both earlier and later than the node to which it refers.
How is it best to implement such a structure taking into account the requirements described?
UPD: PosgreSQL is used as a database via sqlalchemy. Accordingly, all synchronization work is performed by a Python script.
The node table has fields: id, parent_id, link_id, hidden, empty, and several other fields that are not involved in this task.
parent_id and link_id are id from the same table.
There is also a table with content, from which for each node a number of elements are taken.