Python Pandas filter rows based on the string value of an entry -


i have excel sheet (bloomberg data license output) read in with

import pandas pd raw_data = pd.read_excel('my-file.xlsx') 

there 1 column (start-of-file) , varying number rows, depending on amount of data returned.

i interested in data between 2 rows, start-of-data , end-of-data. rows in column

19                                      start-of-data 20  wts equity|0|6|wts|50545|54.440000|54.000000|5... 21  xom equity|0|6|xom|6555175|84.950000|85.300000... 22  sup equity|0|6|sup|27405|19.250000|19.200000|1... 23                                        end-of-data 

with varying number of rows, (not 20 22). how can filter rows in column data between cells, ie raw_data['start-of-file']['start-of-data' : 'end-of-data']. , use str.split('|') seperate pipe delimited data seperate columns new dataframe?

>>> import pandas pd >>> df = pd.dataframe(['abcdef', 'start-of-data', 'g|h|i', 'j|k|l', 'm|n|o', 'end-of-data', 'pqrstu', columns=['a']]) >>> df                0         abcdef 1  start-of-data 2          g|h|i 3          j|k|l 4          m|n|o 5    end-of-data 6         pqrstu >>> start, end = df[df['a'].str.contains('(start|end)-of-data')].index.tolist() >>> pd.dataframe(df[start+1:end]['a'].str.split('|').tolist(), columns=['a', 'b', 'c'])     b  c 2  g  h  3  j  k  l 4  m  n  o 

Comments

Popular posts from this blog

powershell Start-Process exit code -1073741502 when used with Credential from a windows service environment -

twig - Using Twigbridge in a Laravel 5.1 Package -

c# - LINQ join Entities from HashSet's, Join vs Dictionary vs HashSet performance -