I generate an empty data frame as follows:
    topfields = ['desc', 'desc', 'price', 'price', 'units', 'units']
    bottomfields = ['foo', 'bar', 'mean', 'mom_2', 'mean', 'mom_2']
    resultsdf = pd.DataFrame(columns=pd.MultiIndex.from_arrays([topfields, bottomfields]))
Now I want to set the first 2 columns (those with the desc top-level value) as the index (and, as a more general challenge, all columns with the desc top-level value). I've tried several ways, none of which work.
Here's the most intuitive attempt (a failure):
    >>> test = resultsdf.set_index('desc')
    >>> test
    Out[4]:
    Empty DataFrame
    Columns: [(price, mean), (price, mom_2), (units, mean), (units, mom_2)]
    Index: []
    >>> test.index
    Out[5]: Index([], dtype='object', name='desc')
pandas correctly removes both desc columns (from "columns"), but none of these appear in the index. Instead, I have only 1 field in the index. When I try to create a row based on a MultiIndex, I get an error:
    >>> test.loc[pd.IndexSlice[0, 0], :] = 1
    Traceback (most recent call last):
    [...]
    KeyError: '[0 0] not in index'
It looks like you need set_index with a tuple:
    test = resultsdf.set_index(('desc', 'foo'))
    print(test)
    Empty DataFrame
    Columns: [(desc, bar), (price, mean), (price, mom_2), (units, mean), (units, mom_2)]
    Index: []
    print(test.index)
    Index([], dtype='object', name=('desc', 'foo'))
Or maybe, with a list of tuples:
    test = resultsdf.set_index([('desc', 'foo'), ('desc', 'bar')])
    print(test)
    Empty DataFrame
    Columns: [(price, mean), (price, mom_2), (units, mean), (units, mom_2)]
    Index: []
    print(test.index)
    MultiIndex(levels=[[], []], labels=[[], []], names=[('desc', 'foo'), ('desc', 'bar')])
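For the more general case in the question (every column whose top level is desc), one possible sketch is to build the list of tuples from the columns instead of typing it out. The name desc_cols is just an illustrative helper, not something from the original post, and this assumes the top-level label to match is the string 'desc':

```python
import pandas as pd

topfields = ['desc', 'desc', 'price', 'price', 'units', 'units']
bottomfields = ['foo', 'bar', 'mean', 'mom_2', 'mean', 'mom_2']
resultsdf = pd.DataFrame(columns=pd.MultiIndex.from_arrays([topfields, bottomfields]))

# Collect every column tuple whose top-level label is 'desc'
# (desc_cols is a hypothetical helper name used only for this sketch).
desc_cols = [col for col in resultsdf.columns if col[0] == 'desc']

# Passing the list of tuples moves all of those columns into the index,
# giving one index level per selected column.
test = resultsdf.set_index(desc_cols)

print(test.index.nlevels)
print(list(test.columns))
```

The list comprehension keeps the column order, so the index levels come out in the same order as the desc columns appear in the frame.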