Suppose there is a string

hhhrrrraaavvvvvvvaaaa 

How can I split it into a list of runs of identical consecutive characters, like this?

 ['hhh', 'rrrr', 'aaa', 'vvvvvvv','aaaa'] 

    3 answers

     from itertools import groupby

     items = [''.join(v) for k, v in groupby('hhhrrrraaavvvvvvv')]

    • Your version with the list comprehension is cleaner; I overcomplicated mine with a loop :) - gil9red
    • @gil9red a variety of answers is good :) - Sergey Gornostaev

    Option with groupby:

     from itertools import groupby

     text = 'hhhrrrraaavvvvvvv'
     items = []
     for _, sub in groupby(text):
         items.append(''.join(sub))

     print(items)
     # ['hhh', 'rrrr', 'aaa', 'vvvvvvv']

    Option with a regular expression:

     import re

     text = 'hhhrrrraaavvvvvvv'
     # (.) captures one character, \1* matches any further repeats of it.
     # \1* rather than \1+ so that single-character runs are kept too.
     items = [m.group() for m in re.finditer(r'(.)\1*', text)]

     print(items)
     # ['hhh', 'rrrr', 'aaa', 'vvvvvvv']

    • Thank you very much. I also thought groupby could be used, but I don't have enough experience to understand how it works :) I will figure it out - prplmad
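
For anyone else unsure how groupby works: it walks the input once and yields (key, group) pairs, where each group is a lazy iterator over one run of consecutive equal items. With no key function, the key is the item itself. A small sketch, using the full string from the question (the names `runs` and `items` are only for illustration):

```python
from itertools import groupby

text = 'hhhrrrraaavvvvvvvaaaa'

# Each group is an iterator over one run of consecutive equal characters,
# so it is joined into a string right away, before moving to the next group.
runs = [(key, ''.join(group)) for key, group in groupby(text)]
print(runs)
# [('h', 'hhh'), ('r', 'rrrr'), ('a', 'aaa'), ('v', 'vvvvvvv'), ('a', 'aaaa')]

# Dropping the keys gives the list asked for in the question.
items = [run for _, run in runs]
print(items)
# ['hhh', 'rrrr', 'aaa', 'vvvvvvv', 'aaaa']
```

Note that 'a' appears twice as a key: groupby only merges neighbouring items, it does not collect all equal items across the whole string.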

    An example without groupby: find the indexes where the letter changes, insert "@" there, and then split on "@":

     # Assumes t is non-empty and does not itself contain "@".
     t = 'hhhrrrraaavvvvvvvaaaa'
     print(''.join([t[x] + "@" if t[x] != t[x+1] else t[x] for x in range(len(t)-1)] + [t[-1]]).split("@"))
     # ['hhh', 'rrrr', 'aaa', 'vvvvvvv', 'aaaa']

     # t = 'hhhrrrraaavvvvvvvaaaab'
     # ['hhh', 'rrrr', 'aaa', 'vvvvvvv', 'aaaa', 'b']

     # t = 'hhhrrrraaavvvvvvvaaaabb'
     # ['hhh', 'rrrr', 'aaa', 'vvvvvvv', 'aaaa', 'bb']

    This avoids a second pass over the sequence, at the cost of appending the last character separately.
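
The same idea can also be written out step by step. This is only a sketch, not the one-liner above: the helper name `split_runs` is made up for illustration, and it slices between the change points instead of inserting "@", so it also works for empty strings and for text that contains "@".

```python
def split_runs(t):
    """Split t into runs of identical consecutive characters (illustrative helper)."""
    if not t:
        return []
    # Step 1: find the positions where the next character differs from
    # the current one; a new run starts right after each of them.
    cuts = [x + 1 for x in range(len(t) - 1) if t[x] != t[x + 1]]
    # Step 2: add the start and end of the string as outer boundaries.
    bounds = [0] + cuts + [len(t)]
    # Step 3: slice the string between each pair of neighbouring boundaries.
    return [t[a:b] for a, b in zip(bounds, bounds[1:])]


print(split_runs('hhhrrrraaavvvvvvvaaaa'))
# ['hhh', 'rrrr', 'aaa', 'vvvvvvv', 'aaaa']
```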

    • Could you post the same algorithm, but broken down into separate steps? :) - gil9red
    • I have simplified the expression there - now "cleaner" - Eugene Dennis