python - Replacing Strings in Column of Dataframe with the number in the string -

June 15, 2011

I have a dataframe as follows, and I want to replace the strings in maturity with the number within them. For example, I want to replace fzcy0d with 0, and so on.

             date maturity  yield_pct currency
    0  2009-01-02   fzcy0d       4.25      aus
    1  2009-01-05   fzcy0d       4.25      aus
    2  2009-01-06   fzcy0d       4.25      aus

My code is as follows. I tried replacing these strings with numbers, but that led to the error AttributeError: 'Series' object has no attribute 'split' in the line result.maturity.replace(result['maturity'], [int(s) for s in result['maturity'].split() if s.isdigit()]). Hence I am struggling to understand how to do this.

    from pandas.io.excel import read_excel
    import pandas as pd
    import numpy as np
    import xlrd

    url = 'http://www.rba.gov.au/statistics/tables/xls/f17hist.xls'
    xls = pd.ExcelFile(url)

    # Gets rid of the information that I don't need in my dataframe
    df = xls.parse('yields', skiprows=10, index_col=None, na_values=['na'])

    df.rename(columns={'series id': 'date'}, inplace=True)

    # This line assumes you want datetime; ignore it if you don't
    #combined_data['date'] = pd.to_datetime(combined_data['date'])

    # Reshape from one column per maturity to one row per (date, maturity)
    result = pd.melt(df, id_vars=['date'])

    result['currency'] = 'aus'
    result.rename(columns={'value': 'yield_pct'}, inplace=True)
    result.rename(columns={'variable': 'maturity'}, inplace=True)

    # This is the line that raises the AttributeError: a Series has no .split()
    result.maturity.replace(result['maturity'], [int(s) for s in result['maturity'].split() if s.isdigit()])

    print(result)

You can use the vectorised str methods and pass a regex to extract the number:

    In [15]:
    df['maturity'] = df['maturity'].str.extract(r'(\d+)')
    df

    Out[15]:
             date maturity  yield_pct currency
    0  2009-01-02        0       4.25      aus
    1  2009-01-05        0       4.25      aus
    2  2009-01-06        0       4.25      aus
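For a version that runs without downloading the spreadsheet, here is a minimal sketch on a small hand-built frame. The rows are assumptions for illustration, and the labels other than fzcy0d are made up. A Series has no split method of its own, which is why the line in the question raised AttributeError; string operations go through the .str accessor instead. In recent pandas versions, passing expand=False asks str.extract to return a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Toy data; the labels other than 'fzcy0d' are hypothetical.
df = pd.DataFrame({
    'date': ['2009-01-02', '2009-01-05', '2009-01-06'],
    'maturity': ['fzcy0d', 'fzcy25d', 'fzcy100d'],
    'yield_pct': [4.25, 4.25, 4.25],
    'currency': ['aus', 'aus', 'aus'],
})

# Calling a string method directly on the Series fails ...
try:
    df['maturity'].split()
except AttributeError as exc:
    error_message = str(exc)

# ... while the .str accessor applies the regex to each element.
digits = df['maturity'].str.extract(r'(\d+)', expand=False)
print(digits.tolist())
```

Note that the extracted values are still strings at this point; only the non-digit characters have been dropped.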

You can then call astype(int) to cast the series to int:

    In [17]:
    df['maturity'] = df['maturity'].str.extract(r'(\d+)').astype(int)
    df.info()

    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 3 entries, 0 to 2
    Data columns (total 4 columns):
    date         3 non-null object
    maturity     3 non-null int32
    yield_pct    3 non-null float64
    currency     3 non-null object
    dtypes: float64(1), int32(1), object(2)
    memory usage: 108.0+ bytes
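One thing to watch: str.extract yields a missing value for any label that contains no digits, and astype(int) raises on missing values. The sketch below shows one possible way to cope, using pd.to_numeric and pandas' nullable Int64 dtype. The sample labels are made up for illustration, and this is an optional variation rather than part of the approach above.

```python
import pandas as pd

# Hypothetical labels; the last one has no digits to extract.
maturity = pd.Series(['fzcy0d', 'fzcy25d', 'unknown'])

digits = maturity.str.extract(r'(\d+)', expand=False)

# to_numeric converts the digit strings; the missing value stays missing.
numbers = pd.to_numeric(digits, errors='coerce')

# The nullable integer dtype keeps whole numbers alongside missing values.
as_int = numbers.astype('Int64')
print(as_int.tolist())
```

If every label is known to contain a number, the plain astype(int) call is simpler and sufficient.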
