python - IndexError obstructing code from working with larger CSV file
I have code that sorts CSV data using groupby and plots the information. I used a small sample of the data to create the code. It ran smoothly, so I then tried running it on a huge file of data.
I am pretty new at Python, and this problem has been quite frustrating, so any suggestions on how to troubleshoot it would be helpful.
My code is stopping in this section:
import pandas as pd

df = pd.DataFrame.from_csv('mydata.csv')
# most frequent value in each group
mode = lambda ts: ts.value_counts(sort=True).index[0]
I tried selecting parts of the huge data file and they ran, but with the entire thing I am getting this error:
IndexError: index 0 is out of bounds for axis 0 with size 0
But I've looked at the two data sets side by side and the columns are the same! I noticed the big file has UTF-8 issues with accents and I am working on combing those out, but the IndexError is perplexing me.
Here is the traceback:
runfile('c:/users/jbyrusb/documents/python scripts/tests/tests/topsixcustomersexecute.py', wdir='c:/users/jbyrusb/documents/python scripts/tests/tests')
Traceback (most recent call last):
  File "<ipython-input-45-53a2a006076e>", line 1, in <module>
    runfile('c:/users/jbyrusb/documents/python scripts/tests/tests/topsixcustomersexecute.py', wdir='c:/users/jbyrusb/documents/python scripts/tests/tests')
  File "c:\users\jbyrusb\appdata\local\continuum\anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)
  File "c:\users\jbyrusb\appdata\local\continuum\anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)
  File "c:/users/jbyrusb/documents/python scripts/tests/tests/topsixcustomersexecute.py", line 23, in <module>
    df = df.groupby('companyname')[['column1','name', 'birthday', 'country', 'county']].agg(mode).T.reindex(columns=cols)
  File "c:\users\jbyrusb\appdata\local\continuum\anaconda\lib\site-packages\pandas\core\groupby.py", line 676, in agg
    return self.aggregate(func, *args, **kwargs)
  File "c:\users\jbyrusb\appdata\local\continuum\anaconda\lib\site-packages\pandas\core\groupby.py", line 2674, in aggregate
    result = self._aggregate_generic(arg, *args, **kwargs)
  File "c:\users\jbyrusb\appdata\local\continuum\anaconda\lib\site-packages\pandas\core\groupby.py", line 2722, in _aggregate_generic
    return self._aggregate_item_by_item(func, *args, **kwargs)
  File "c:\users\jbyrusb\appdata\local\continuum\anaconda\lib\site-packages\pandas\core\groupby.py", line 2751, in _aggregate_item_by_item
    colg.aggregate(func, *args, **kwargs), data)
  File "c:\users\jbyrusb\appdata\local\continuum\anaconda\lib\site-packages\pandas\core\groupby.py", line 2307, in aggregate
    result = self._aggregate_named(func_or_funcs, *args, **kwargs)
  File "c:\users\jbyrusb\appdata\local\continuum\anaconda\lib\site-packages\pandas\core\groupby.py", line 2394, in _aggregate_named
    output = func(group, *args, **kwargs)
  File "c:/users/jbyrusb/documents/python scripts/tests/tests/topsixcustomersexecute.py", line 20, in <lambda>
    mode = lambda ts: ts.value_counts(sort=True).index[0]
  File "c:\users\jbyrusb\appdata\local\continuum\anaconda\lib\site-packages\pandas\core\index.py", line 915, in __getitem__
    return getitem(key)
IndexError: index 0 is out of bounds for axis 0 with size 0
It is difficult to say without seeing the data that causes the error, but try this:
mode = (lambda ts: ts.value_counts(sort=True).index[0] if len(ts.value_counts(sort=True)) else None)
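One plausible cause, which cannot be confirmed without the actual file, is a group in which a column is entirely missing: value_counts() drops NaN by default, so it returns an empty result for such a group and index[0] then fails. The sketch below reproduces that situation on made-up data and shows one way to list the affected groups. The sample frame, the safe_mode helper, and the variable names are assumptions for illustration only; the column names are borrowed from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: company "B" has no recorded country at all.
df = pd.DataFrame({
    "companyname": ["A", "A", "A", "B", "B"],
    "country": ["US", "US", "FR", np.nan, np.nan],
})


def safe_mode(ts):
    """Return the most frequent non-null value, or None if there is none."""
    counts = ts.value_counts(sort=True)  # NaN values are dropped by default
    return counts.index[0] if len(counts) else None


# Groups where every value of the column is missing. On these groups the
# unguarded lambda would index into an empty value_counts() result.
all_missing = df["country"].isna().groupby(df["companyname"]).all()
empty_groups = all_missing[all_missing].index.tolist()

# The guarded version aggregates without raising.
result = df.groupby("companyname")["country"].agg(safe_mode)

print(empty_groups)
print(result)
```

Running the same all-missing check on each aggregated column of the large file should show whether this is what happens there. Compared with the one-line lambda, the helper also calls value_counts() only once per group.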