Selecting data from XML with priority

Question

Here is a simplified version of the XML:

    <items>
        <item>
            <eissn>12-12</eissn>
        </item>
        <item>
            <eissn>11-13</eissn>
            <issn>11-12</issn>
        </item>
        <item>
        </item>
    </items>

I need to extract the issn or eissn value from every item. If an item has neither, the #963 line should not be printed at all. If an item has both issn and eissn, issn takes priority. Expected output (the label "Статья в журнале" means "Journal article"):

    #1105: Статья в журнале
    #963: 12-12
    *****
    #1105: Статья в журнале
    #963: 11-12
    *****
    #1105: Статья в журнале
    *****

With this function, I get issn and eissn:

    import re
    from bs4 import BeautifulSoup

    def get_issn_eissn(item):
        # Returns the <issn> tag if present, otherwise <eissn>, otherwise None
        return item.find("issn") or item.find("eissn")

    # `text` holds the XML shown above
    root = BeautifulSoup(text, 'html.parser')
    for item in root.select('item'):
        issn_eissn = get_issn_eissn(item)
        issn_eissn = '#963: ' + str(issn_eissn)
        issn_eissn = re.sub(r'\<.*?\>', '', issn_eissn)  # strip the tag markup
        print(issn_eissn)

Please tell me how to set the priority and, when an item has neither tag, get only:

    #1105: Статья в журнале
    *****

Thanks in advance for your reply.

MaxU · Accepted Answer · 2018-08-18T11:57:15

A straightforward approach:

    def get_issn(item):
        x = item.find('issn') or item.find('eissn')
        return '#1105: Статья в журнале' + (f'\n#963: {x.text}' if x else '') + '\n*****'

    In [17]: for item in root.select('item'):
        ...:     print(get_issn(item))
        ...:
    #1105: Статья в журнале
    #963: 12-12
    *****
    #1105: Статья в журнале
    #963: 11-12
    *****
    #1105: Статья в журнале
    *****
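For readers without BeautifulSoup installed, the same priority rule can be sketched with the standard library's `xml.etree.ElementTree`. This is a swapped-in technique, not the code from the answer above. The names `XML_TEXT`, `HEADER`, `SEPARATOR`, `format_item`, and the `priority` parameter are assumptions made for illustration. One caveat: the `find(...) or find(...)` idiom should not be carried over as-is, because an ElementTree element without children tests as false, so this sketch uses `findtext` and checks the returned string instead.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the question (hypothetical constant name).
XML_TEXT = """
<items>
  <item><eissn>12-12</eissn></item>
  <item><eissn>11-13</eissn><issn>11-12</issn></item>
  <item></item>
</items>
"""

HEADER = "#1105: Статья в журнале"
SEPARATOR = "*****"


def format_item(item, priority=("issn", "eissn")):
    """Build the record for one <item>, using the first tag in `priority` that has text."""
    lines = [HEADER]
    for tag in priority:
        value = item.findtext(tag)  # None when the tag is absent
        if value and value.strip():
            lines.append(f"#963: {value.strip()}")
            break  # stop at the highest-priority tag that has a value
    lines.append(SEPARATOR)
    return "\n".join(lines)


root = ET.fromstring(XML_TEXT.strip())
records = [format_item(item) for item in root.iter("item")]
print("\n".join(records))
```

Passing the tag names as a tuple keeps the priority order in one place, so adding a further fallback tag would only mean extending `priority`.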

Thank you very much!!! Everything worked!!! - Ireen1985
