While developing my own "lightweight" template engine, à la the MODx CMF, I ran into the problem of nested constructs.
Here is the input (a construct simplified for readability):
/* HTML code */ [[SNIPPET_1 :if &is=`var` &then=` /* HTML code */ [[$CHUNK_1 :if &is=`var` &then=`[[SNIPPET_2 :filter &name=`var` ]]` ]]` &else=`[[$CHUNK_2:upper]]` :filter &name=`var` ]] /* HTML code */ [[~LINK_1:abs]]
:if, :filter, :upper … are filters (modifiers), and &is, &name … are filter variables.
A filter variable (&is=`var`), as you might guess, can contain anything: from a simple string to the HTML code of a template seasoned with template variables (snippets, chunks, etc.).
The problem is how to find the closing ]] of [[SNIPPET_1]] in this case, when there are other template variables inside it. Note that [[SNIPPET_1]] has two filters applied to it, :if and :filter. This also has to be taken into account.
It would be great to parse this construct as written, that is, with line breaks allowed for readability.
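To make the difficulty concrete, here is a minimal sketch (not the project's code) of what a plain, non-recursive pattern does with nested tags. The sample strings and variable names are made up for illustration.

```php
<?php
// Hypothetical sample: the &then value of the outer tag contains another tag.
$template = '[[SNIPPET_1 :if &is=`var` &then=`[[$CHUNK_1]]` :filter &name=`var` ]]';

// Lazy match: stops at the first "]]", which belongs to the inner tag.
preg_match('/\[\[.*?\]\]/su', $template, $naive);
echo $naive[0], PHP_EOL;   // [[SNIPPET_1 :if &is=`var` &then=`[[$CHUNK_1]]

// Greedy match: runs to the last "]]" and glues sibling tags together.
$siblings = '[[A]] text [[B]]';
preg_match('/\[\[.*\]\]/su', $siblings, $greedy);
echo $greedy[0], PHP_EOL;  // [[A]] text [[B]]
```

Neither quantifier can pair an opening [[ with its own closing ]], which is why some notion of nesting depth is needed.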
Here is the regexp pattern currently used in the project:
preg_replace_callback( '/\[{2}([\$\*\@\%\~]?|\+{1,2})([\w-\.]+)\s*((?:\:[\w]+\s*(?:\s*\&[\w]*\=`(?:.[^\n]*)`)*\s*)*)\s*\]{2}/iu', function ($call) { }, $subject )
It captures separately the name of the template variable (SNIPPET_1, CHUNK_1, SNIPPET_2 …), its type ("" is a snippet, "$" is a chunk, "~" is a link ...), and the filters with their contents (:if&is=`var`&then=`[[$CHUNK_1]]` :filter&name=`var`).
Here [^\n] is a stopgap: the content of a filter variable has to be written on a single line, with no line breaks, so that the end of the filter variable can be detected. That is:
&then=`[[$CHUNK_1:if&is=`var`&then=`[[SNIPPET_2:filter&name=`var`]]`]]`
You will agree that this is not very readable.
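The three captured parts (type, name, filters) can be illustrated on flat, non-nested tags. The sketch below uses a deliberately simplified stand-in for the project's pattern; the constant FLAT_TAG and the function describeTags are hypothetical names, and nested tags are out of scope here.

```php
<?php
// Simplified stand-in (an assumption, flat tags only):
//   group 1: type prefix, group 2: name, group 3: raw filter string.
// The class is written [\w.-] because, as far as I know, PCRE2-based PHP
// versions reject a hyphen placed right after \w inside a character class.
const FLAT_TAG = '/\[\[([$*@%~]?|\+{1,2})([\w.-]+)\s*(.*?)\s*\]\]/su';

function describeTags(string $template): array
{
    preg_match_all(FLAT_TAG, $template, $matches, PREG_SET_ORDER);
    $tags = [];
    foreach ($matches as $m) {
        $tags[] = ['type' => $m[1], 'name' => $m[2], 'filters' => $m[3]];
    }
    return $tags;
}

$tags = describeTags('[[SNIPPET_2 :filter &name=`var` ]] [[$CHUNK_2:upper]] [[~LINK_1:abs]]');
print_r($tags);
// type '',  name 'SNIPPET_2', filters ':filter &name=`var`'
// type '$', name 'CHUNK_2',   filters ':upper'
// type '~', name 'LINK_1',    filters ':abs'
```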
Next, the filter string is parsed into an array: the name of each filter (if, filter …) and the variables of each filter are extracted. Regexp pattern:
preg_match_all('/\:([\w]*)((?:\s*\&[\w]*\=`(?:.[^\n]*|)`\s*)*)/iu', $call[3], $found);
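As a rough illustration of this step, the sketch below turns a flat filter string into an array. It is not the project's pattern: it assumes that values contain no backticks of their own, and parseFlatFilters is a hypothetical helper name.

```php
<?php
// Assumption: values are flat, i.e. contain no backticks of their own.
function parseFlatFilters(string $filters): array
{
    $result = [];
    // One filter: ":name" followed by any number of &key=`value` pairs.
    preg_match_all('/:(\w+)((?:\s*&\w+=`[^`]*`)*)/u', $filters, $found, PREG_SET_ORDER);
    foreach ($found as $f) {
        $vars = [];
        // Split the captured pair list into key => value.
        preg_match_all('/&(\w+)=`([^`]*)`/u', $f[2], $pairs, PREG_SET_ORDER);
        foreach ($pairs as $p) {
            $vars[$p[1]] = $p[2];
        }
        $result[] = ['filter' => $f[1], 'vars' => $vars];
    }
    return $result;
}

$parsed = parseFlatFilters(':if &is=`var` &then=`yes` :filter &name=`var`');
print_r($parsed);
// filter 'if'     with vars is => var, then => yes
// filter 'filter' with vars name => var
```

As soon as a value contains a nested tag with its own backticks, [^`]* ends too early, which is exactly the limitation described above.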
Finally, a loop runs over the filters and calls the function that corresponds to each filter name. For example, here is the pattern used inside the function for the :if filter:
preg_match_all('/\&([\w]*)\=\`((?:.[^\&]*)?(?(?=:).*?\`\]{2}(?:.[^&]*)?|(?:.[^\&\:])?))\`/iu', $subject, $found);
Shortcomings of the current template engine:
The content of a filter variable must, again, be written on one line, without line breaks;
It works without errors only up to two levels of nesting. The workaround is to create an additional (new) chunk and put the required construct in it.
To sum up: dear regular expression gurus, please share your experience on how to find the end of a construct when it contains similar constructs nested inside it.
UPDATE:
@ReinRaus, thank you for the answer. Although @VladD had already pointed me in the right direction ( http://php.net/manual/ru/regexp.reference.recursive.php ), you described the possible pitfalls of this approach.
You are right, there is a problem, because the attribute values themselves contain the ` character. However, if the backticks in such a template are replaced with something that has distinct opening and closing delimiters, for example &is={{…}}, then everything works. Here is an example:
'/\[{2}([\$\*\@\%\~]?|\+{1,2})([\w-\.]+)((?:\s*\:[\w]+\s*(?:\s*\&[\w]*\=\s*\{{2}\s*(?:[^\{\}]++|(?R))*\}{2})*)*)(?:[^\[\]]++|(?R))*\]{2}/iu'
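Stripped of its capture groups, the recursive core of such a pattern can be tried in isolation. The sketch below is a reduced illustration, not the pattern from the project: BALANCED_TAG and the sample template are made up for the example, and it only finds where each tag starts and ends.

```php
<?php
// Balanced [[...]]: runs of non-bracket characters, or a nested [[...]] via (?R).
// Limitation: a lone "[" or "]" inside a tag makes the match fail.
const BALANCED_TAG = '/\[\[(?:[^\[\]]++|(?R))*\]\]/u';

// Hypothetical multi-line template using the {{...}} value delimiters.
$template = <<<'TPL'
<p>[[SNIPPET_1 :if &is={{var}}
    &then={{ <b>[[$CHUNK_1 :if &is={{var}} &then={{[[SNIPPET_2 :filter &name={{var}} ]]}} ]]</b> }}
    &else={{[[$CHUNK_2:upper]]}}
    :filter &name={{var}} ]]</p> [[~LINK_1:abs]]
TPL;

// Top level: only the outermost tags are returned.
preg_match_all(BALANCED_TAG, $template, $m);
$outer = $m[0];
echo count($outer), PHP_EOL;   // 2
echo $outer[1], PHP_EOL;       // [[~LINK_1:abs]]

// One level down: run the same pattern on the body of the first tag.
preg_match_all(BALANCED_TAG, substr($outer[0], 2, -2), $inner);
echo $inner[0][1], PHP_EOL;    // [[$CHUNK_2:upper]]
```

Because [^\[\]] also matches line feeds, the tag may span several lines, and each nesting level can be processed by applying the same pattern to the body of the enclosing tag.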
The full pattern captures the name of the template variable ([[name]]), its type ([[$...]] is a chunk ...), and the list of filters with their contents (:if… :filter…), and does so for every template variable.
I could not come up with a regexp pattern that replaces the backticks `…` with {{…}} while taking \s into account, so the templates will have to be edited by hand. Of course, keeping the ` character would be much preferable. If you have a solution, I will be glad to read it.
The second problem is the second pattern (inside the callback function), which parses the filters themselves (each template variable, whether snippet or chunk, can have several of them).
:if &is={{var}} &then={{ /* HTML code */ [[$CHUNK_1 :if &is={{var}} &then={{[[SNIPPET_2 :filter &name={{var}} ]]}} ]]}} &else={{[[$CHUNK_2:upper]]}} :filter &name={{var}}
The problem is isolating a single filter regardless of any similar constructs nested inside it.
With the pattern above, the filters end up in $call[3]. One trick is to replace every {{…}} construct, together with its content, with something else:
'/\{{2}(?:[^\{\}]++|(?R))*\}{2}/iu'
After that, the filters can be split safely on [^\:], because the filter string takes a much simpler form:
:if &is={{var_1}} &then={{var_2}} :filter &name={{var_3}}
Is it possible to do without a replacement?
++
. - VladD 7:09
:if
? Your code will have to take this into account. - VladD