Plotting from a CSV file edited in Python

Question

I have many CSV files containing fractional values, in roughly the following format:

These values are the amplitudes of the signal oscillations from the board at a given moment, where each line is one step. I implemented the graphical display with Python and its modules, fairly successfully, but, being a beginner, I ran into some problems. To show the values over time, I added a column on the left that increments at every step and serves as a nominal time counter. At the top I added headings so that the columns appear in the plot as channels.

The result looks like this. But plotting the channel values against time does not work for me: the listing shows that the program does not read the column names. I use Anaconda based on Python 2.7, and Pyzo. The code is at the link; can anyone tell me what to fix in it so that the plot displays correctly? Thank you in advance for any hints or corrections; I am not very good at this yet.

What I did:
Approximately what it should look like (made with Excel tools):

Is your question how to get Excel to understand your input format?
(The problem is probably the decimal separator setting on Russian Windows: just specify that the dot is the decimal mark and that the comma is the field separator, not part of the number.)
(Then pandas and matplotlib can be used to display the time series.)
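For the pandas route, a minimal sketch of stating the separator and the decimal mark explicitly when reading. The sample data and the channel names `ch1` to `ch3` are made up for illustration, and Python 3 with a recent pandas is assumed:

```python
import io
import pandas as pd

# Made-up sample: comma-separated fields, dot as the decimal mark, no header row.
sample = io.StringIO(
    "0.125, 0.250, 0.375\n"
    "0.500, 0.625, 0.750\n"
)

# sep and decimal spell out the convention described above;
# skipinitialspace drops the blank that follows each comma.
df = pd.read_csv(sample, sep=",", decimal=".", header=None, skipinitialspace=True)
df.columns = ["ch1", "ch2", "ch3"]  # hypothetical channel names

print(df.dtypes)  # every column should be parsed as a float
```

For a file that uses a decimal comma and semicolons between fields, the same call would take `sep=";"` and `decimal=","`.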
Could you post an example of the input file in text format (CSV, TSV, etc.) somewhere, so that it can be used in code, and explain what transformations you applied to get temp.csv from minitest.csv (apart from adding column names)?
No, I used Excel simply to see what the result should look like.
The oscilloscope shows essentially the same thing, but I cannot take screenshots of it.
I want to implement the visualization in Python; I use both pandas and matplotlib, but the result is as shown above.
Transformations: I multiplied all values by 0.8 (the internal gain of my board) and added a column that counts the steps starting from zero (Time), which I want to use as X when plotting.
The values of the remaining columns go on Y. There are many input files, but they differ only in their values; the structure is always the same.
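The transformation described here (scale by the gain, count steps from zero, plot the channels against the step counter) could be sketched in pandas roughly as follows. The helper names `prepare` and `plot_channels`, the two-channel sample, and the column names are assumptions for illustration, not the asker's actual code:

```python
import io
import pandas as pd

GAIN = 0.8  # internal gain of the board, as stated above


def prepare(raw, gain=GAIN):
    """Scale every channel by `gain` and label the row counter as Time."""
    scaled = raw * gain                # returns a new frame; `raw` is left untouched
    return scaled.rename_axis("Time")  # the default index already counts 0, 1, 2, ...


def plot_channels(df, out_png):
    """Plot every column against the index (hypothetical helper)."""
    import matplotlib
    matplotlib.use("Agg")              # draw to a file, no display needed
    import matplotlib.pyplot as plt
    ax = df.plot(figsize=(12, 10))
    ax.set_xlabel("Time (steps)")
    ax.figure.savefig(out_png)
    plt.close(ax.figure)


# Made-up two-channel sample standing in for one of the input files.
raw = pd.read_csv(io.StringIO("1.0,2.0\n3.0,4.0\n5.0,6.0\n"),
                  header=None, names=["ch1", "ch2"])
prepared = prepare(raw)
print(prepared)
```

Calling `plot_channels(prepared, "channels.png")` would then draw one line per channel with the step counter on the X axis. Keeping the counter in the index rather than in a data column means it is never scaled by the gain.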
Two different input files: mediafire.com/download/iz88jcan80zfb35/… mediafire.com/download/4h7ota2v9dceuxb/…
This is how the same files look in the program I wrote for the oscilloscope:
s018.radikal.ru/i525/1609/b9/e12b15f78d70.png The only difference between the programs and the files is that the CSV files written by the oscilloscope have internal column indexing (time and channels, respectively). That is why I decided to add headings, that is, column names.
But the program does not read them and does not display them; moreover, they are not separated by commas at the end.
So is there some other data format inside my CSV, like a comment rather than a string?

Accepted Answer · 2016-09-16T08:16:22

Here's what I got:

Code:

    import pandas as pd
    import matplotlib.pyplot as plt


    def plot_df(df, out_plot_fn=None, mult_factor=0.8, cols=None,
                x_axis=None, figsize=(12, 10)):
        # work on a copy so that repeated calls do not rescale
        # or rename the caller's DataFrame
        df = df.copy()
        if x_axis:
            df = df.set_index(x_axis)
        if cols:
            df.columns = cols
        if mult_factor and mult_factor != 1:
            df *= mult_factor
        ax = df.plot.line(figsize=figsize)
        if out_plot_fn:
            plt.savefig(out_plot_fn)
        else:
            plt.show()


    fn = r'D:\temp\.data\7.csv.gz'

    df = pd.read_csv(fn, skipinitialspace=True)
    plot_df(df, out_plot_fn=r'd:/temp/bad.png')

    # skip the 7th column (Python/Pandas count from zero, so index 6),
    # otherwise the Y-axis scaling will be wrong
    # and all the plots will look too flattened
    usecols = [0, 1, 2, 3, 4, 5, 7]
    df = pd.read_csv(fn, usecols=usecols, skipinitialspace=True)
    plot_df(df, out_plot_fn=r'd:/temp/a.png')

    # rename the columns in the DF
    cols = ['ch1', 'ch2', 'ch3', 'ch4', 'ch5', 'ch6', 'ch8']
    plot_df(df, cols=cols, out_plot_fn=r'd:/temp/b.png')

    # only the first 50 rows
    plot_df(df.head(50), out_plot_fn=r'd:/temp/c.png')

Results:

a.png:

b.png:

c.png:

This is what happens if you do not drop the 7th column (8) and the last, empty one (Unnamed: 8):

I did not dare to work with data frames, but the code is clearly better structured.
Could you recommend a manual on working with data files and visualizing them?
This is the second time you have helped me out, many thanks.
One thing still bothers me: where does the 9th (unnamed) column come from?
As for visualizing ordinary data (not NumPy / Pandas / SciPy), I cannot recommend anything: I have not looked into it. Processing speed is very important to me, so I try to do everything with "vectorized" operations (Pandas / NumPy / etc.).
The Unnamed: 8 column is already present in your CSV; I don't know how you got it...
If you do decide to use Pandas, then the link from @jfs is exactly what you need.
The column appeared after my program ran and remained in every CSV; an error with some indexes.
@irvis, I seem to have forgotten to mention that my program is more convenient to use with the source file rather than with files you have already processed (it seems that Unnamed: 8 appears after pd.read_csv(k).mul(0.8).to_csv(...)).
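Without the source file the exact origin of that column cannot be confirmed, but one common way an `Unnamed: N` column arises is a separator at the end of the header row. The sketch below uses a made-up two-channel file to show the effect, and how to drop the empty column and write the scaled data back without the row index (which, if written, shows up as `Unnamed: 0` on the next read):

```python
import io
import pandas as pd

# Made-up file in which every line, header included, ends with a comma.
text = "ch1,ch2,\n0.1,0.2,\n0.3,0.4,\n"
df = pd.read_csv(io.StringIO(text))
print(list(df.columns))  # the empty trailing header field gets an "Unnamed: N" name

# Drop columns that contain no data at all.
clean = df.dropna(axis="columns", how="all")

# to_csv writes the row index by default; index=False keeps it out of the file.
out = io.StringIO()
clean.mul(0.8).to_csv(out, index=False)

reread = pd.read_csv(io.StringIO(out.getvalue()))
print(list(reread.columns))
```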
If you post an example of the source file, I can show you how to do it. P.S. The pd.read_csv() method is very flexible and powerful: unless your CSV is "broken", you can read any CSV with it (you can set column names if there are none, skip lines at the beginning and at the end, etc.).
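As a rough illustration of those options, the sketch below reads a made-up file that has no header row, one junk line at the top and one at the bottom. The file contents and the channel names are assumptions; `names`, `skiprows`, `skipfooter` and `engine` are regular `pd.read_csv` parameters:

```python
import io
import pandas as pd

text = (
    "exported by some tool\n"  # made-up leading line to skip
    "0.1,0.2,0.3\n"
    "0.4,0.5,0.6\n"
    "end of data\n"            # made-up trailing line to skip
)

df = pd.read_csv(
    io.StringIO(text),
    header=None,                  # the file carries no column names
    names=["ch1", "ch2", "ch3"],  # so supply them here
    skiprows=1,                   # lines to ignore at the beginning
    skipfooter=1,                 # lines to ignore at the end
    engine="python",              # skipfooter is not supported by the C engine
)
print(df)
```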
