How to remove extra words from a string in Python

Question

I have a spreadsheet, dump.xlsx.

Based on the data in this table, I need to write a new file in .csv format. For the current dump.xlsx table, the csv file should look like this:

    directory_preview_v2,directory-service,28.0.0,   <- the trailing comma is required
    find_cli,find-cli,9.1.0,

This needs to be done in Python.

If the data only ever comes in this form, you can take a slice starting from the position of the first "_" character.
Can you give an example of the source data (4-5 lines) in text form (for example, as CSV) and the expected result?
In dump.xlsx, A1 = AWSPROD4CATALOG-PREVIEW-V2STACK_catalog_preview_v2 and B1 = registry.dp-dev.jcpcloud2.net/catalog-service:26.0.0.
The expected result, written to the file text.csv: catalog_preview_v2,catalog-service,26.0.0,
The link to the uploaded file requires email confirmation and, as a result, exposes the email address.
Personally, I would rather not reveal my email address just to help you with your question ... ;)
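The slicing idea from the comments can be sketched in plain Python for a single row. This is only an illustration: the function name `to_csv_line` is hypothetical, and it assumes the cell values contain no stray spaces and always follow the `PREFIX_name` and `registry/image:version` patterns shown in the example.

```python
def to_csv_line(name: str, image: str) -> str:
    """Build one output line from the two cell values (hypothetical helper)."""
    # Keep everything after the first "_" (empty string if there is no "_").
    short_name = name.partition("_")[2]
    # Drop the registry address: keep the part after the last "/".
    image_and_version = image.rsplit("/", 1)[-1]
    # Split "image:version" at the first ":".
    image_name, _, version = image_and_version.partition(":")
    # The output format requires a trailing comma.
    return f"{short_name},{image_name},{version},"


line = to_csv_line(
    "AWSPROD4CATALOG-PREVIEW-V2STACK_catalog_preview_v2",
    "registry.dp-dev.jcpcloud2.net/catalog-service:26.0.0",
)
print(line)
```

Using `partition` rather than `split` means only the first underscore is treated as the separator, so underscores inside the remaining name are preserved.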

Answer 1 · 2018-09-14T11:27:42

To process data in CSV, Excel, SQL (and many other formats), I prefer to use the Pandas module:

    import pandas as pd

    filename = r'D:\download\dump.xlsx'

    # parse the Excel file into a DataFrame
    df = pd.read_excel(filename, usecols="B,E")

    # get rid of `prefix_` in the 'NAME' column
    df['NAME'] = df['NAME'].str.partition('_', expand=False).str[-1]

    # parse a string "address/image:number" into two columns ['IMAGE','NUM']
    # (regex=True is required in recent Pandas versions, where it is no longer the default)
    df[['IMAGE','NUM']] = df['IMAGE'].str.replace(r'.*\/', '', regex=True).str.split(':', expand=True)

    # add an empty column, which produces the comma
    # at the end of each line in the CSV file
    df['EMPTY'] = ""

    # save the DataFrame as a CSV file without a header row or index
    df.to_csv(r'd:/temp/result.csv', header=False, index=False)

P.S. For reading Excel files, Pandas by default uses the xlrd module (newer Pandas versions use openpyxl for .xlsx files).

Result:

    In [28]: from pathlib import Path

    In [29]: print(Path(r'd:/temp/result.csv').read_text())
    directory_preview_v2,directory-service,28.0.0,
    find_cli,find-cli,9.1.0,
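The trailing comma comes from the extra empty field at the end of each row. The same idea can be tried with the standard `csv` module, without Pandas. This is a minimal sketch that writes to an in-memory buffer instead of a file; the rows are the two result lines shown above, and the variable names are only illustrative.

```python
import csv
import io

# The three real fields of each row, as in the expected result.
rows = [
    ("directory_preview_v2", "directory-service", "28.0.0"),
    ("find_cli", "find-cli", "9.1.0"),
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
for row in rows:
    # Appending an empty fourth field makes each line end with a comma.
    writer.writerow([*row, ""])

csv_output = buffer.getvalue()
print(csv_output, end="")
```

To write a real file, the buffer could be replaced with `open(path, "w", newline="")`.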

Could you please give an example for the case where the table has many columns and rows?
In that case, all the data looks almost the same as the data given above.
@IgorSamarskiy, that is why I asked for an example of the input and output data (this is better done by editing the question).
In my answer, I assumed that the data you need is in the first row.
The remaining rows will be ignored ... If you need to process all the rows, simply remove the parameter nrows=1
@IgorSamarskiy, have you tried running the code from my answer?
No, because I do not quite understand how it works if I copy it into my own code.
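To see how each step of the answer works, it can be run on a small in-memory DataFrame instead of the Excel file. This is a sketch under assumptions: the column names `NAME` and `IMAGE` are taken from the answer, the first row is the example from the question, and the second row's input values are made up for illustration.

```python
import pandas as pd

# Stand-in for pd.read_excel(...): two columns named as in the answer.
df = pd.DataFrame({
    "NAME": [
        "AWSPROD4CATALOG-PREVIEW-V2STACK_catalog_preview_v2",
        "SOMESTACK_find_cli",  # hypothetical input value
    ],
    "IMAGE": [
        "registry.dp-dev.jcpcloud2.net/catalog-service:26.0.0",
        "registry.example.com/find-cli:9.1.0",  # hypothetical input value
    ],
})

# Step 1: split at the first "_" and keep the last part of the 3-tuple.
df["NAME"] = df["NAME"].str.partition("_", expand=False).str[-1]

# Step 2: delete everything up to the last "/", then split "image:version".
df[["IMAGE", "NUM"]] = (
    df["IMAGE"].str.replace(r".*/", "", regex=True).str.split(":", expand=True)
)

# Step 3: an empty last column yields the trailing comma on every line.
df["EMPTY"] = ""

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(header=False, index=False)
print(csv_text)
```

Printing `df` after each step shows what that step changed, which may be easier to follow than reading the chained calls.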
