Here is a solution using regular expressions:
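import re

a = """
<div class="qwerty hel_lo tuy-iy">content</div>
<div class="qwerty hel_lo tuy-iy">content</div>
<div class="qwerty hel_lo tuy-iy">content</div>
"""

# Join everything into one line so the pattern never has to span a newline
a = a.replace("\n", "")

# Lazily capture whatever sits between the double quotes of each class attribute
b = re.findall(r"class\s*?=\s*?\"(.*?)\"", a)
print(b)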
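If you need the individual class names rather than the whole attribute string, the same idea can be extended a little. The following is only a sketch: the helper name `extract_class_names` and the sample markup are mine, not from the question, and I am assuming the attribute may be quoted with either single or double quotes. It is still a regex, so it can be fooled by things like `data-class="..."` or text inside comments; for arbitrary HTML a real parser is the safer choice.

```python
import re

# Matches class="..." or class='...'; the backreference \1 requires
# the closing quote to be the same character as the opening one.
CLASS_ATTR = re.compile(r"""\bclass\s*=\s*(["'])(.*?)\1""", re.DOTALL)


def extract_class_names(html):
    """Return one list of class names per class attribute found in html.

    Hypothetical helper for illustration only.
    """
    # group(2) is the attribute value; split() breaks it on any whitespace
    return [match.group(2).split() for match in CLASS_ATTR.finditer(html)]


sample = '<div class="qwerty hel_lo tuy-iy">content</div><p class=\'note\'>x</p>'
print(extract_class_names(sample))
# prints [['qwerty', 'hel_lo', 'tuy-iy'], ['note']]
```

Because of `re.DOTALL`, this version does not need the `replace("\n", "")` step: an attribute value that wraps onto the next line is still matched.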