python - Replacing strings in a column of a DataFrame with the number in the string
I have a dataframe as follows, and I want to replace the strings in the maturity column with the number within them. For example, I want to replace fzcy0d with 0, and so on.
             date maturity  yield_pct currency
    0  2009-01-02   fzcy0d       4.25      aus
    1  2009-01-05   fzcy0d       4.25      aus
    2  2009-01-06   fzcy0d       4.25      aus
My code is as follows. I tried replacing these strings with numbers, but that leads to the error AttributeError: 'Series' object has no attribute 'split' in the line result.maturity.replace(result['maturity'], [int(s) for s in result['maturity'].split() if s.isdigit()]). Hence I am struggling to understand how to do this.
    from pandas.io.excel import read_excel
    import pandas as pd
    import numpy as np
    import xlrd

    url = 'http://www.rba.gov.au/statistics/tables/xls/f17hist.xls'
    xls = pd.ExcelFile(url)

    # gets rid of the information I don't need in the dataframe
    df = xls.parse('yields', skiprows=10, index_col=None, na_values=['na'])

    df.rename(columns={'series id': 'date'}, inplace=True)

    # this line assumes you want datetime; ignore it if you don't
    #combined_data['date'] = pd.to_datetime(combined_data['date'])

    result = pd.melt(df, id_vars=['date'])

    result['currency'] = 'aus'
    result.rename(columns={'value': 'yield_pct'}, inplace=True)
    result.rename(columns={'variable': 'maturity'}, inplace=True)

    # this is the line that raises the AttributeError
    result.maturity.replace(result['maturity'], [int(s) for s in result['maturity'].split() if s.isdigit()])
    print(result)
You can use the vectorised str methods and pass a regex to extract the number:
    In [15]: df['maturity'] = df['maturity'].str.extract(r'(\d+)')
             df

    Out[15]:
             date maturity  yield_pct currency
    0  2009-01-02        0       4.25      aus
    1  2009-01-05        0       4.25      aus
    2  2009-01-06        0       4.25      aus
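The error in the question arises because split is a method of Python strings, not of a pandas Series; string operations on a Series go through the .str accessor. As a minimal, self-contained sketch of the extraction step, the snippet below rebuilds the three sample rows from the question by hand (rather than downloading the spreadsheet) and passes expand=False, which is an addition here and not part of the answer above: it asks pandas to return a Series rather than a one-column DataFrame for a single capture group. Note that the extracted values are still strings at this point.

```python
import pandas as pd

# Hand-built copy of the sample rows shown in the question.
df = pd.DataFrame({
    "date": ["2009-01-02", "2009-01-05", "2009-01-06"],
    "maturity": ["fzcy0d", "fzcy0d", "fzcy0d"],
    "yield_pct": [4.25, 4.25, 4.25],
    "currency": ["aus", "aus", "aus"],
})

# The raw string avoids Python treating \d as a string escape.
# expand=False returns a Series for a single capture group.
digits = df["maturity"].str.extract(r"(\d+)", expand=False)

# The extracted values are strings such as "0", not integers yet.
df["maturity"] = digits
print(df)
```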
You can call astype(int) to cast the series to int:
    In [17]: df['maturity'] = df['maturity'].str.extract(r'(\d+)').astype(int)
             df.info()

    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 3 entries, 0 to 2
    Data columns (total 4 columns):
    date         3 non-null object
    maturity     3 non-null int32
    yield_pct    3 non-null float64
    currency     3 non-null object
    dtypes: float64(1), int32(1), object(2)
    memory usage: 108.0+ bytes
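One caveat with astype(int): if any value in the column contains no digits, str.extract yields NaN for that row and the cast to int fails. The sketch below shows one possible way to handle that case; it is not something the answer above does. The values fzcy10d and unknown are hypothetical and chosen only to illustrate a multi-digit match and a non-match. It uses pd.to_numeric with errors="coerce", then optionally converts to the nullable Int64 dtype so that missing values can sit alongside integers.

```python
import pandas as pd

# Hypothetical values: one multi-digit label and one label with no digits.
maturity = pd.Series(["fzcy0d", "fzcy10d", "unknown"])

# Rows without a match become NaN.
digits = maturity.str.extract(r"(\d+)", expand=False)

# Convert the digit strings to numbers; anything unparseable becomes NaN,
# so the result is a float column.
numbers = pd.to_numeric(digits, errors="coerce")

# Optional: the nullable integer dtype keeps integers and missing values together.
nullable = numbers.astype("Int64")
print(nullable)
```

If every row is known to contain a number, the plain astype(int) shown above is simpler.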