python - Replacing Strings in Column of Dataframe with the number in the string


I have a dataframe as follows, and I want to replace the strings in maturity with the number contained within them. For example, I want to replace fzcy0d with 0, and so on.

             date maturity  yield_pct currency
    0  2009-01-02   fzcy0d       4.25      aus
    1  2009-01-05   fzcy0d       4.25      aus
    2  2009-01-06   fzcy0d       4.25      aus

My code is as follows. I tried replacing these strings with numbers, but it led to the error AttributeError: 'Series' object has no attribute 'split' in the line result.maturity.replace(result['maturity'], [int(s) for s in result['maturity'].split() if s.isdigit()]). Hence I am struggling to understand how to do this.

    from pandas.io.excel import read_excel
    import pandas as pd
    import numpy as np
    import xlrd

    url = 'http://www.rba.gov.au/statistics/tables/xls/f17hist.xls'
    xls = pd.ExcelFile(url)

    # Gets rid of the information that I don't need in my dataframe
    df = xls.parse('yields', skiprows=10, index_col=None, na_values=['na'])

    df.rename(columns={'series id': 'date'}, inplace=True)

    # This line assumes you want datetime; ignore it if you don't
    #combined_data['date'] = pd.to_datetime(combined_data['date'])

    # Reshape from wide to long: one row per (date, maturity) pair
    result = pd.melt(df, id_vars=['date'])

    result['currency'] = 'aus'
    result.rename(columns={'value': 'yield_pct'}, inplace=True)
    result.rename(columns={'variable': 'maturity'}, inplace=True)

    # This is the line that raises the AttributeError
    result.maturity.replace(result['maturity'], [int(s) for s in result['maturity'].split() if s.isdigit()])

    print(result)

You can use the vectorised str methods and pass a regex to extract the number:

    In [15]:
    df['maturity'] = df['maturity'].str.extract(r'(\d+)', expand=False)
    df

    Out[15]:
             date maturity  yield_pct currency
    0  2009-01-02        0       4.25      aus
    1  2009-01-05        0       4.25      aus
    2  2009-01-06        0       4.25      aus

You can then call astype(int) to cast the series to int:

    In [17]:
    df['maturity'] = df['maturity'].str.extract(r'(\d+)', expand=False).astype(int)
    df.info()

    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 3 entries, 0 to 2
    Data columns (total 4 columns):
    date         3 non-null object
    maturity     3 non-null int32
    yield_pct    3 non-null float64
    currency     3 non-null object
    dtypes: float64(1), int32(1), object(2)
    memory usage: 108.0+ bytes
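As a self-contained illustration of the approach above, here is a minimal sketch that runs without downloading the spreadsheet. The sample values (fzcy25d, fzcy100d, unknown) and the maturity_num column name are hypothetical and only there for demonstration. It also assumes you may have rows with no digits at all, where a plain astype(int) would raise, so it goes through pd.to_numeric and the nullable Int64 dtype instead.

```python
import pandas as pd

# Hypothetical sample data shaped like the frame in the question
df = pd.DataFrame({
    "date": ["2009-01-02", "2009-01-05", "2009-01-06", "2009-01-07"],
    "maturity": ["fzcy0d", "fzcy25d", "fzcy100d", "unknown"],
    "yield_pct": [4.25, 4.25, 4.25, 4.25],
    "currency": ["aus"] * 4,
})

# expand=False returns a Series; rows without any digits become NaN
digits = df["maturity"].str.extract(r"(\d+)", expand=False)

# Convert the digit strings to numbers, then to nullable integers so that
# missing values are kept as <NA> instead of raising an error
df["maturity_num"] = pd.to_numeric(digits).astype("Int64")

print(df)
```

If every row is guaranteed to contain a number, the shorter .astype(int) form shown earlier should be enough.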
