Please help. I have a list of lines. I need to walk through it and, whenever a line starts with 'ORC', build a sub-list from that line and the lines that follow it, stopping just before the next line that starts with 'ORC'. That next line then starts a new sub-list of its own. Input list:

lst = ['MSH|^~\\&|CENTRUM|', 'ORC|RE|100003883', 'OBR|1|100003883', 'OBX|1|NM|', 'OBX|1|NM|', 'ORC|RE|100003883-11469', 'OBR|2|100003883', 'OBX|1|NM|', 'OBX|1|NM|', 'OBX|1|NM|', 'ORC|RE|100003883', 'OBR|3|100003883', 'OBX|1|', 'ORC|RE|100003883', 'OBR|4|100003883', 'OBX|1|NM|277933'] 

Expected output:

 result = [['ORC|RE|100003883', 'OBR|1|100003883', 'OBX|1|NM|', 'OBX|1|NM|'], ['ORC|RE|100003883-11469', 'OBR|2|100003883', 'OBX|1|NM|', 'OBX|1|NM|', 'OBX|1|NM|'], ['ORC|RE|100003883', 'OBR|3|100003883', 'OBX|1|'], ['ORC|RE|100003883', 'OBR|4|100003883', 'OBX|1|NM|277933']] 

I suspect itertools.groupby is the way to go, but I can't work out how to apply it. Thanks.

    2 answers

    Try this:

     idx = [i for i, x in enumerate(lst) if x.startswith('ORC')]
     res = [lst[idx[i-1]:idx[i]] for i in range(1, len(idx))] + [lst[idx[-1]:]]

     In [92]: res
     Out[92]:
     [['ORC|RE|100003883', 'OBR|1|100003883', 'OBX|1|NM|', 'OBX|1|NM|'],
      ['ORC|RE|100003883-11469', 'OBR|2|100003883', 'OBX|1|NM|', 'OBX|1|NM|', 'OBX|1|NM|'],
      ['ORC|RE|100003883', 'OBR|3|100003883', 'OBX|1|'],
      ['ORC|RE|100003883', 'OBR|4|100003883', 'OBX|1|NM|277933']]

    In a single pass:

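     res = []
     tmp = []
     for x in lst:
         if x.startswith('ORC'):
             # a new ORC line closes the previous group, if there is one
             if tmp:
                 res.append(tmp)
             tmp = [x]
         elif tmp:
             # lines before the first ORC are skipped because tmp is still empty
             tmp.append(x)
     # store the last group; the check avoids [[]] when there is no ORC line at all
     if tmp:
         res.append(tmp)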

     In [122]: res
     Out[122]:
     [['ORC|RE|100003883', 'OBR|1|100003883', 'OBX|1|NM|', 'OBX|1|NM|'],
      ['ORC|RE|100003883-11469', 'OBR|2|100003883', 'OBX|1|NM|', 'OBX|1|NM|', 'OBX|1|NM|'],
      ['ORC|RE|100003883', 'OBR|3|100003883', 'OBX|1|'],
      ['ORC|RE|100003883', 'OBR|4|100003883', 'OBX|1|NM|277933']]
    • Thank you very much. It helped a lot - Serhii Yaroshevkyi

    To split a fence into sections (one post at the start of each section):

     '---|***|+++|...' -> ['|***', '|+++', '|...'] 

    you can find the indexes of the posts and then slice the sequence, walking over those indexes in (start, end) pairs:

     >>> fence = '---|***|+++|...'
     >>> post_indexes = [i for i, x in enumerate(fence) if x == '|']
     >>> [fence[s:e] for s, e in zip(post_indexes, post_indexes[1:] + [len(fence)])]
     ['|***', '|+++', '|...']

    In your case, the posts are the lines beginning with ORC, so the test becomes x.startswith('ORC') instead of x == '|'.
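As a minimal sketch, the same pairwise slicing applied to the list from the question might look like this. The name `sections` is chosen here for illustration; `post_indexes` is reused from the fence example.

```python
lst = ['MSH|^~\\&|CENTRUM|', 'ORC|RE|100003883', 'OBR|1|100003883',
       'OBX|1|NM|', 'OBX|1|NM|', 'ORC|RE|100003883-11469',
       'OBR|2|100003883', 'OBX|1|NM|', 'OBX|1|NM|', 'OBX|1|NM|',
       'ORC|RE|100003883', 'OBR|3|100003883', 'OBX|1|',
       'ORC|RE|100003883', 'OBR|4|100003883', 'OBX|1|NM|277933']

# positions of the "posts": every line that starts with ORC
post_indexes = [i for i, x in enumerate(lst) if x.startswith('ORC')]

# pair each post with the next one; the last post is paired with len(lst)
sections = [lst[s:e]
            for s, e in zip(post_indexes, post_indexes[1:] + [len(lst)])]

print(sections)
```

Anything before the first ORC line (here the MSH line) is left out, and if the list has no ORC line at all, `zip` has nothing to pair and `sections` is simply empty.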

    The task is similar to fence.split(post):

     >>> '---|***|+++|...'.split('|')
     ['---', '***', '+++', '...']
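Unlike str.split, the grouping asked for here keeps each post at the head of its section and drops whatever comes before the first post. Since the question mentions itertools.groupby, one possible way to use it is to group on a running count of ORC lines. This is only a sketch, not the method of either answer, and the names `counts` and `groups` are made up for the example.

```python
from itertools import accumulate, groupby

lst = ['MSH|^~\\&|CENTRUM|', 'ORC|RE|100003883', 'OBR|1|100003883',
       'OBX|1|NM|', 'OBX|1|NM|', 'ORC|RE|100003883-11469',
       'OBR|2|100003883', 'OBX|1|NM|', 'OBX|1|NM|', 'OBX|1|NM|',
       'ORC|RE|100003883', 'OBR|3|100003883', 'OBX|1|',
       'ORC|RE|100003883', 'OBR|4|100003883', 'OBX|1|NM|277933']

# running count of ORC lines seen so far: 0 before the first ORC,
# then 1 for the first section, 2 for the second, and so on
counts = list(accumulate(int(x.startswith('ORC')) for x in lst))

# consecutive lines with the same count belong to the same section;
# count 0 (lines before the first ORC) is skipped
groups = [[line for line, _ in pairs]
          for count, pairs in groupby(zip(lst, counts), key=lambda p: p[1])
          if count]

print(groups)
```

The count only changes on an ORC line, so every group starts with one; the plain single-pass loop is probably easier to read, but this shows that groupby can be made to fit.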