python - What is the Right Syntax When Using .notnull() in Pandas?


I want to use .notnull() on several columns of a DataFrame to eliminate the rows that contain "NaN" values.

Let's say I have the following df:

     a    b    c
0    1    1    1
1    1  NaN    1
2    1  NaN  NaN
3  NaN    1    1

I tried to use this syntax, but it does not work. Does anyone know what I am doing wrong?

df[[df.a.notnull()],[df.b.notnull()],[df.c.notnull()]] 

I get this error:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

What should I do to get the following output?

     a    b    c
0    1    1    1

Any idea?

You can first select a subset of columns with df[['a','b','c']], apply notnull, and then check whether all values in each row of the mask are True:

print (df[['a','b','c']].notnull())
       a      b      c
0   True   True   True
1   True  False   True
2   True  False  False
3  False   True   True

print (df[['a','b','c']].notnull().all(axis=1))
0     True
1    False
2    False
3    False
dtype: bool

print (df[df[['a','b','c']].notnull().all(axis=1)])
     a    b    c
0  1.0  1.0  1.0
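To try the mask approach end to end, here is a minimal sketch. The DataFrame is rebuilt by hand from the sample in the question, and the `&` variant at the end is an extra illustration of how the three separate masks from the question could be combined; the variable names `mask`, `combined`, `filtered` and `filtered_alt` are only for this example.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question (NaN marks missing values).
df = pd.DataFrame({
    'a': [1, 1, 1, np.nan],
    'b': [1, np.nan, np.nan, 1],
    'c': [1, 1, np.nan, 1],
})

# One boolean per row: True only if a, b and c are all non-null in that row.
mask = df[['a', 'b', 'c']].notnull().all(axis=1)
filtered = df[mask]

# Equivalent form: combine the per-column masks element-wise with &.
combined = df.a.notnull() & df.b.notnull() & df.c.notnull()
filtered_alt = df[combined]

print(filtered)
```

Both forms pass a single boolean Series to df[...]. The attempt in the question passes several lists of Series separated by commas, which pandas tries to treat as a key rather than as a row mask.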

Another solution, from ayhan's comment, is dropna:

print (df.dropna(subset=['a', 'b', 'c']))
     a    b    c
0  1.0  1.0  1.0

This is the same as:

print (df.dropna(subset=['a', 'b', 'c'], how='any')) 

It means: drop every row that has at least one NaN value in those columns.
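As a rough illustration of what `how` changes, the sketch below compares `how='any'` with `how='all'` on the sample data. The DataFrame is rebuilt by hand from the question, and the names `any_dropped` and `all_dropped` are made up for this example.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question.
df = pd.DataFrame({
    'a': [1, 1, 1, np.nan],
    'b': [1, np.nan, np.nan, 1],
    'c': [1, 1, np.nan, 1],
})

# how='any' (the default): drop a row if any of a, b, c is NaN.
any_dropped = df.dropna(subset=['a', 'b', 'c'], how='any')

# how='all': drop a row only if a, b and c are all NaN.
all_dropped = df.dropna(subset=['a', 'b', 'c'], how='all')

print(any_dropped)
print(all_dropped)
```

With this data, `how='any'` keeps only row 0, while `how='all'` keeps every row, because no row is missing all three values.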

