Help me figure out how to build a binary tree in Python 3 in a bracket record?

The structure of the binary tree is as follows:

< Π±Π΄ > ::= < пусто > | (< Π±Π΄ > < Π±ΡƒΠΊΠ²Π° >< Π±Π΄ >) < пусто > ::= 

And here is an example corresponding to this structure (B(C))A(D) . In this case, A is the root of the tree.

At the moment, it turned out to make a procedure that extracts all the letters from the bracket entry (of course, it looks pretty crooked), but it’s impossible to build a tree.

The code itself:

 class Node: def __init__(self, data=None, left=None, right=None): self.left = left self.right = right self.data = data class Tree: def __init__(self): self.root = None def myAdd(self, node, data, key): if key == None: node = Node(data) if key == 'left': node.left = Node(data) if key == 'right': node.right = Node(data) def addToTree(self, expression): self._addToTree(expression, self.root) # ΠΎΠ±Ρ€Π°Π±ΠΎΡ‚ΠΊΠ° строки с ΠΈΠ·Π²Π»Π΅Ρ‡Π΅Π½ΠΈΠ΅ΠΌ Π±ΡƒΠΊΠ² def _addToTree(self, expression, node, key=None): if expression: if len(expression) == 1: self.myAdd(node, expression[0], key) else: brackets = 0 index_sym = 0 expression = list(''.join(expression)) for inx, sym in enumerate(expression): if sym == '(': brackets += 1 if sym == ')': brackets -= 1 if (sym != '(') and (sym != ')'): index_sym = inx if (brackets == 0) and (sym != ')') and (sym != '('): index_sym = inx break self._addToTree(expression[index_sym], node ) self._addToTree(expression[1:index_sym], node, 'left') self._addToTree(expression[index_sym+2:-1], node, 'right') def printTree(self): if (self.root != None): self._printTree(self.root) def _printTree(self, node): if (node != None): self._printTree(node.left) print( str(node.data), end= ' ' ) self._printTree(node.right) tree = Tree() tree.addToTree('((c)b(d))a(((g)f(h))e)') tree.printTree() 

    2 answers 2

    To solve the problem, it is best to implement the process of upstream parsing. Imagine an empty stack and your string (let's call its input buffer):

     stack = [] input = '(B(C))A(D)' 

    We will read the characters one by one from the input buffer to the stack and analyze the top of the stack for "interestingness"

    Step 1. Nothing interesting.

     stack = ['('] input = 'B(C))A(D)' 

    Step 2. Nothing interesting

     stack = ['(', 'B'] input = '(C))A(D)' 

    Step 3. nothing interesting

     stack = ['(', 'B', '('] input = 'C))A(D)' 

    Step 4. Still nothing interesting

     stack = ['(', 'B', '(', 'C'] input = '))A(D)' 

    Step 5. Already something! Three characters at the top of the stack '(', 'C', ')' can be minimized to the top of the tree with the value 'C' Before convolution:

     stack = ['(', 'B', '(', 'C', ')'] input = ')A(D)' 

    After:

     stack = ['(', 'B', Node('C')] 

    Step 6. Making another transfer:

     stack = ['(', 'B', Node('C'), ')'] input = 'A(D)' 

    And turn off:

     stack = [Node('B', None, Node('C'))] 

    Step 7. Transferring:

     stack = [Node('B', None, Node('C')), 'A'] input = '(D)' 

    Step 8,9,10. We transfer:

     stack = [Node('B', None, Node('C')), 'A', '(', 'D', ')'] input = '' 

    Step 11. We turn off:

     stack = [Node('B', None, Node('C')), 'A', Node('D')] 

    Step 12. We turn off:

     stack = [Node('A', Node('B', None, Node('C')), Node('D')] 

    Step 13. The input buffer is empty. On the stack is the finished syntax tree :)

    What it would look like in the code:

     class Node: def __init__(data, left, right): self.data = data; self.left = left; self.right = right; LEFT_BRACKET = 0 RIGHT_BRACKET = 1 DATA = 2 NODE = 3 class Token: def __init__(symbol): self.value = symbol if symbol == '(': self.type = LEFT_BRACKET elif symbol == ')': self.type = RIGHT_BRACKET elif symbol.isalpha(): self.type = DATA else: self.type = NODE def try_reduce(stack): """ Набор ΠΏΡ€Π°Π²ΠΈΠ» для свёртки. МоТно Ρ‚ΠΎΠΆΠ΅ ΠΎΠΏΠΈΡΠ°Ρ‚ΡŒ ΠΊΠ°ΠΆΠ΄ΠΎΠ΅ ΠΏΡ€Π°Π²ΠΈΠ»ΠΎ классом. Бколько Ρ‚ΠΎΠΊΠ΅Π½ΠΎΠ² Π½ΡƒΠΆΠ½ΠΎ ΠΏΠΎΡΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ, с ΠΊΠ°ΠΊΠΈΠΌΠΈ Ρ‚ΠΎΠΊΠ΅Π½Π°ΠΌΠΈ ΡΡ€Π°Π²Π½ΠΈΡ‚ΡŒ ΠΈ Π½Π° ΠΊΠ°ΠΊΠΎΠΉ Ρ‚ΠΎΠΊΠ΅Π½ Π·Π°ΠΌΠ΅Π½ΠΈΡ‚ΡŒ. """ for rule in rules: result = rule.check(stack) if result: return; stack = [] inp_string = '(B(C))A(D)' while (True): if try_reduce(stack): continue if not len(a): return; stack.append(Token(a[0])) a = a[1:] 

    I will leave the implementation of the Rule class to be thought out. I apologize for any inaccuracies - I’m not really involved in the parsing

    • Thank you for such a detailed implementation with the stack, but I would like to do it without using it, but with recursion (if possible). Found a question with the implementation of the tree in Python , but there is a binary search tree. And somehow, when self.root is passed to the self.root method, this variable looks like a link, although there are no links in Python. Do not tell me why this is happening? - Simple User
    • one
      There are no links, but the values ​​are passed by reference in Python: a = []; b = a; b.append(1) a = []; b = a; b.append(1) a = []; b = a; b.append(1) look at what will be equal to a - Mikhail Lelyakin
    • That's it. We still need to tighten the python) Now it’s clear why this could be used. But why when I call the myAdd method, the data is overwritten and not saved as new objects? For example, when displaying a tree, the left part, the root and the right were displayed - Simple User

    According to the previous algorithm (in pascal) it turned out to do the following:

     class Node: def __init__(self, left=None, right=None, data=None): self.left = left self.right = right self.data = data def __str__(self): return str(self.data) class Tree: def __init__(self, expression): self.root = None self.add(expression) def add(self, expression): token_list = list(''.join(expression)) self.root = self._addToTree(token_list) def _addToTree(self, token_list): if token_list: brackets = 0 index_sym = 0 for inx, sym in enumerate(token_list): if sym == '(': brackets += 1 if sym == ')': brackets -= 1 if (sym != '(') and (sym != ')'): index_sym = inx if (brackets == 0) and (sym != ')') and (sym != '('): index_sym = inx break return Node(self._addToTree(token_list[1:index_sym]), self._addToTree(token_list[index_sym+2:-1]), token_list[index_sym]) def InOrder(self): self._InOrder(self.root) def _InOrder(self, node): if (node != None): self._InOrder(node.left) print( str(node.data), end= ' ' ) self._InOrder(node.right) def PreOrder(self): self._PreOrder(self.root) def _PreOrder(self, node): if (node != None): print( str(node.data), end= ' ' ) self._PreOrder(node.left) self._PreOrder(node.right) def PostOrder(self): self._PostOrder(self.root) def _PostOrder(self, node): if (node != None): self._PostOrder(node.left) self._PostOrder(node.right) print( str(node.data), end= ' ' ) tree = Tree('((c)b(d))a(((g)f(h))e)') print('ΠŸΡ€ΡΠΌΠΎΠΉ ΠΎΠ±Ρ…ΠΎΠ΄') tree.PreOrder() print() print('ΠšΠΎΠ½Ρ†Π΅Π²ΠΎΠΉ ΠΎΠ±Ρ…ΠΎΠ΄') tree.PostOrder() print() print('ΠžΠ±Ρ€Π°Ρ‚Π½Ρ‹ΠΉ ΠΎΠ±Ρ…ΠΎΠ΄') tree.InOrder()