Greetings, SW community!

Recently I was tasked with implementing a kind of parser (although the term is not quite accurate) for text that can contain not only markup (Markdown is used here, for example) but also certain commands.

With markup, everything is clear: for each markup element we write something like this (I no longer remember where I found this code, but it works remarkably well):

new RegexFormatter(@"\[url=((.|\n)*?)(?:\s*)\]((.|\n)*?)\[/url(?:\s*)\]", "<a href=\"$1\" target=\"_blank\">$3</a>")); 

Then we loop over all the patterns and apply each one to the whole text (if you can suggest a more efficient way, I would be grateful for that too).
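A minimal sketch of that loop. The original `RegexFormatter` class is not shown, so the definition below is a guess: it is assumed to pair one compiled `Regex` with a replacement string. `MarkupRenderer`, `Render`, and the `[b]` rule are hypothetical names added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the RegexFormatter used above:
// one pattern plus the replacement string applied to every match.
public sealed class RegexFormatter
{
    private readonly Regex _regex;
    private readonly string _replacement;

    public RegexFormatter(string pattern, string replacement)
    {
        // Build the Regex once so it is not re-parsed for every text.
        _regex = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Compiled);
        _replacement = replacement;
    }

    public string Format(string input) => _regex.Replace(input, _replacement);
}

public static class MarkupRenderer
{
    // The formatters are created once and reused for every call to Render.
    private static readonly List<RegexFormatter> Formatters = new List<RegexFormatter>
    {
        new RegexFormatter(@"\[b\]((.|\n)*?)\[/b\]", "<b>$1</b>"),
        new RegexFormatter(@"\[url=((.|\n)*?)(?:\s*)\]((.|\n)*?)\[/url(?:\s*)\]",
                           "<a href=\"$1\" target=\"_blank\">$3</a>"),
    };

    // Applies every formatter in turn; each pass scans the whole text.
    public static string Render(string text)
    {
        foreach (var formatter in Formatters)
        {
            text = formatter.Format(text);
        }
        return text;
    }
}
```

With this sketch, `MarkupRenderer.Render("[b]Hi[/b]")` returns `<b>Hi</b>`. Each rule still makes one pass over the text, so the cost grows with the number of rules; the main saving here is that the patterns are built only once.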

But what should I do if the text can also contain commands that interact with the database or run any other logic? Suppose I am writing a console for a self-hosted site. How do I find the commands in the text, and how do I determine which command in the text maps to which piece of code in the programming language?

Of course, I could reinvent the wheel with a brute-force approach, but I would like to understand how to do it properly. Can anyone give me a hint, or at least point me to some literature?

Thanks in advance!

UPD: clarified the question a bit.

  • What do regexps have to do with it? Formatting? A parser is a state machine, and you should read up on those. - karmadro4
  • As for commands: the easiest way is to create a meta tag like [eval command]. I don't know how good the [C# eval][1] implementation is; take a look at it. In a pinch, you can always write something yourself with the help of [Reflection][2]. That way you will not need to parse individual commands and translate them into code; you can execute any command from the template. [1]: codeproject.com/Articles/13335/C-Eval-Function [2]: nnm.ru/blogs/Catone/reflection_v_net_s_primerami_na_c - ReinRaus
  • A meta tag is a good idea, thanks, but eval is evil in all its forms - Specter
  • eval looks like just the same dictionary of commands, only packaged in a convenient form - Gorets

1 answer

It is not clear what kind of "complex text" this is. Is it Chinese? Where in it can there be links, highlighting, and so on?

  1. If this is a web page, I would take a ready-made library (there are plenty of them), or there are even ready-made frameworks for the language that read the markup.
  2. If this program is some kind of special translator, then:
    1. The simple way: write a dictionary of commands, split the text into words, and compare each word against the dictionary.
    2. The slightly more complicated and perhaps more correct way: do everything with a finite automaton and make the necessary transition from each state. And of course, there is still the option of using regular expressions, but as always, I am against it =)
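The dictionary option can be sketched roughly as follows. Everything here is an assumption made for illustration: the `[cmd name arg1 arg2]` tag syntax, the `CommandProcessor` class, and the `upper` and `sum` commands do not come from the question. The tags are located with a regular expression to keep the sketch short; a hand-written finite automaton could replace that part without touching the dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class CommandProcessor
{
    // Command name -> handler. Only names registered here can be executed.
    // Both commands are hypothetical; a real handler could query a database.
    private static readonly Dictionary<string, Func<string[], string>> Commands =
        new Dictionary<string, Func<string[], string>>(StringComparer.OrdinalIgnoreCase)
        {
            ["upper"] = args => string.Join(" ", args).ToUpperInvariant(),
            ["sum"] = args =>
            {
                long total = 0;
                foreach (var arg in args)
                {
                    total += long.Parse(arg, CultureInfo.InvariantCulture);
                }
                return total.ToString(CultureInfo.InvariantCulture);
            },
        };

    // Assumed tag syntax: [cmd name arg1 arg2 ...]
    private static readonly Regex CommandTag =
        new Regex(@"\[cmd\s+(\w+)([^\]]*)\]", RegexOptions.Compiled);

    // Replaces every command tag with the text returned by its handler.
    public static string Execute(string text)
    {
        return CommandTag.Replace(text, match =>
        {
            string name = match.Groups[1].Value;
            string[] args = match.Groups[2].Value.Split(
                new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

            // Unknown commands are left in the text untouched.
            return Commands.TryGetValue(name, out var handler)
                ? handler(args)
                : match.Value;
        });
    }
}
```

For example, `CommandProcessor.Execute("Total: [cmd sum 1 2 3]")` returns `Total: 6`. Since only names registered in the dictionary can run, the text never gets to execute arbitrary code, which is the main difference from an eval-style approach.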
  • "It is not clear": your answer itself contains numbered lists, and it could also contain links and so on. But I need to allow for commands that will be executed on the server as well. Searching for matches in a dictionary is the most obvious option, and it was the first thing I thought of. - Specter