I want to use .notnull() on several columns of a dataframe to eliminate the rows which contain "NaN" values.
Let's say I have the following df:
     a    b    c
0    1    1    1
1    1  NaN    1
2    1  NaN  NaN
3  NaN    1    1
I tried to use this syntax, but it does not work. Do you know what I am doing wrong?
df[[df.a.notnull()],[df.b.notnull()],[df.c.notnull()]]
I get this error:
TypeError: 'Series' objects are mutable, thus they cannot be hashed
What should I do to get the following output?
   a  b  c
0  1  1  1
Any idea?
You can first select the subset of columns with df[['a','b','c']], then apply notnull, and finally check whether all values in each row of the mask are True:
print (df[['a','b','c']].notnull())
       a      b      c
0   True   True   True
1   True  False   True
2   True  False  False
3  False   True   True

print (df[['a','b','c']].notnull().all(1))
0     True
1    False
2    False
3    False
dtype: bool

print (df[df[['a','b','c']].notnull().all(1)])
     a    b    c
0  1.0  1.0  1.0
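For anyone who wants to reproduce the masking approach end to end, here is a minimal self-contained sketch. It rebuilds the frame from the values shown in the question. The names mask, result and chained are only illustrative and do not come from the original post. It also shows what the attempt in the question was probably aiming for: combining the per-column masks with & instead of putting them in lists.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the values in the question.
df = pd.DataFrame({
    "a": [1, 1, 1, np.nan],
    "b": [1, np.nan, np.nan, 1],
    "c": [1, 1, np.nan, 1],
})

# Row-wise mask: True only where a, b and c are all non-null.
mask = df[["a", "b", "c"]].notnull().all(axis=1)
result = df[mask]

# Equivalent filter built from per-column masks combined with &.
chained = df[df["a"].notnull() & df["b"].notnull() & df["c"].notnull()]

print(result)
```

Passing axis=1 by keyword does the same thing as the positional all(1) used above, and is easier to read.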
Another solution, from ayhan's comment, is dropna:
print (df.dropna(subset=['a', 'b', 'c']))
     a    b    c
0  1.0  1.0  1.0
This is the same as:
print (df.dropna(subset=['a', 'b', 'c'], how='any'))
which means: drop rows where at least one of the listed columns has a NaN value.
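To see what the how argument changes, here is a small sketch comparing how='any' with how='all' on the sample data from the question. The variable names dropped_any and dropped_all are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the values in the question.
df = pd.DataFrame({
    "a": [1, 1, 1, np.nan],
    "b": [1, np.nan, np.nan, 1],
    "c": [1, 1, np.nan, 1],
})

cols = ["a", "b", "c"]

# how='any' (the default): drop a row if any of the listed columns is NaN.
dropped_any = df.dropna(subset=cols, how="any")

# how='all': drop a row only if every listed column is NaN.
dropped_all = df.dropna(subset=cols, how="all")

print(dropped_any)
print(dropped_all)
```

With this data no row is NaN in all three columns, so how='all' keeps all four rows, while how='any' keeps only row 0.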