I want to use .notnull() on several columns of a DataFrame to eliminate the rows that contain "NaN" values.
Let's say I have the following df:
     a    b    c
0    1    1    1
1    1  NaN    1
2    1  NaN  NaN
3  NaN    1    1

I tried to use this syntax, but it does not work. Does anyone know what I am doing wrong?
df[[df.a.notnull()],[df.b.notnull()],[df.c.notnull()]]

I get this error:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

What should I do to get the following output?

   a  b  c
0  1  1  1

Any idea?
You can first select the subset of columns with df[['a','b','c']], apply notnull, and then check whether all values in each row of the mask are True:
print (df[['a','b','c']].notnull())
       a      b      c
0   True   True   True
1   True  False   True
2   True  False  False
3  False   True   True

print (df[['a','b','c']].notnull().all(axis=1))
0     True
1    False
2    False
3    False
dtype: bool

print (df[df[['a','b','c']].notnull().all(axis=1)])
     a    b    c
0  1.0  1.0  1.0

Another solution, from ayhan's comment, is dropna:
print (df.dropna(subset=['a', 'b', 'c']))
     a    b    c
0  1.0  1.0  1.0

which is the same as:
print (df.dropna(subset=['a', 'b', 'c'], how='any'))

which means: drop every row that has at least one NaN value in those columns.
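For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the example frame and compares the approaches. The way the frame is constructed and the variable names (cols, mask, by_mask, by_and, by_dropna) are my own assumptions, not part of the original post. It also includes a variant that combines the per-column masks with &, which is probably the closest working form of the syntax in the question: the original attempt passes several separate masks as one key instead of a single boolean mask.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (construction is assumed).
df = pd.DataFrame({
    "a": [1, 1, 1, np.nan],
    "b": [1, np.nan, np.nan, 1],
    "c": [1, 1, np.nan, 1],
})

cols = ["a", "b", "c"]

# 1) Boolean mask: True only for rows where every selected column is non-null.
mask = df[cols].notnull().all(axis=1)
by_mask = df[mask]

# 2) The same mask built by hand, combining the per-column Series with `&`.
by_and = df[df.a.notnull() & df.b.notnull() & df.c.notnull()]

# 3) dropna restricted to the selected columns (how='any' is the default).
by_dropna = df.dropna(subset=cols)

print(by_mask)
# All three approaches should select the same rows.
print(by_mask.equals(by_and) and by_mask.equals(by_dropna))
```

The mask form is handy when you want to keep the boolean Series around (for counting or inverting it), while dropna is shorter if you only need the filtered frame.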