Bug: impossible to delete infinite values from DataFrame




This is my DataFrame df:



col1 col2
-0.441406 2.523047
-0.321105 1.555589
-0.412857 2.223047
-0.356610 2.513048



When I check df, I see that there are some infinite values.



np.any(np.isnan(df))
np.all(np.isfinite(df))

False
True



What is the difference between NaN and infinity? Also, how can I delete all infinite values so that np.all(np.isfinite(X)) returns True?




This is what I tried:


df = df.replace([np.inf, -np.inf], np.nan).dropna(how="all")



But the check for infinite values still gives me True.




[Screenshot of the Jupyter Notebook]



Moreover, .apply(lambda s: s[np.isfinite(s)].dropna()).count() gives the same number of rows for every column as df.shape, which suggests there are no infinite values. But in that case, why does np.all(np.isfinite(df)) return True?






1 Answer



Your question is similar to "dropping infinite values from dataframes in pandas?".
Did you try:


df.replace([np.inf, -np.inf], np.nan).dropna(subset=["col1", "col2"], how="all")



np.nan is not considered finite. You can replace np.nan with any finite number; see this code for example:




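import pandas as pd
import numpy as np

# Build the frame with float data directly so every column has a numeric dtype.
df = pd.DataFrame([[1.0, np.inf, -np.inf]], columns=list('ABC'))
print(df)

print(np.all(np.isfinite(df)))

# Replacing inf with NaN is not enough: NaN is not finite either, and
# how="all" only drops rows in which every column is NaN.
df_nan = df.replace([np.inf, -np.inf], np.nan).dropna(subset=df.columns, how="all")
print(df_nan)

print(np.all(np.isfinite(df_nan)))

# Replacing inf with a finite number (here 0) makes the check pass.
df_0 = df.replace([np.inf, -np.inf], 0).dropna(subset=df.columns, how="all")
print(df_0)

print(np.all(np.isfinite(df_0)))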



Result:


A B C
0 1.0 inf -inf
False
A B C
0 1.0 NaN NaN
False
A B C
0 1.0 0.0 0.0
True
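
To make the NaN versus infinity distinction concrete, here is a minimal sketch. The sample data is hypothetical (it is not the asker's DataFrame), and filtering on a finite mask is only one possible way to end up with an all-finite frame, assuming rows containing NaN or inf can simply be discarded:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not the DataFrame from the question.
df = pd.DataFrame({"col1": [1.0, np.nan, np.inf, 4.0],
                   "col2": [0.5, 2.0, 3.0, -np.inf]})

values = df.to_numpy()
nan_mask = np.isnan(values)        # True only where the value is NaN
inf_mask = np.isinf(values)        # True only where the value is +inf or -inf
finite_mask = np.isfinite(values)  # False for NaN and for +inf / -inf

# Keep only the rows in which every value is finite.
clean = df[finite_mask.all(axis=1)]
all_finite = bool(np.isfinite(clean.to_numpy()).all())

print(clean)
print(all_finite)
```

Because np.isfinite is False for both NaN and inf, a single mask removes both kinds of value at once, without the intermediate replace step.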





How is it different from what I posted in my question? This is exactly what I tried and it didn't work.
– ScalaBoy
Jun 28 at 15:38






Not exactly the same, because .dropna(subset=["col1", "col2"], how="all") != .dropna()
– A STEFANI
Jun 28 at 15:46
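
The difference between the two calls in the comment above can be sketched as follows. The data is hypothetical, chosen so that one row is partly infinite and one row is entirely infinite:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: row 0 is partly infinite, row 1 is entirely infinite.
df = pd.DataFrame({"col1": [1.0, np.inf, 3.0],
                   "col2": [-np.inf, np.inf, 4.0]})

with_nan = df.replace([np.inf, -np.inf], np.nan)

# how="all" drops a row only when every selected column is NaN.
dropped_all = with_nan.dropna(subset=["col1", "col2"], how="all")

# The default, how="any", drops a row as soon as one column is NaN.
dropped_any = with_nan.dropna()

print(len(dropped_all))  # rows left with how="all"
print(len(dropped_any))  # rows left with how="any"
print(np.isfinite(dropped_any.to_numpy()).all())
```

With how="all", row 0 survives and still holds a NaN, so the finite check on that result would fail. Only the default how="any" leaves a frame in which every value is finite.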







Should I mention all the columns? Can I do .dropna(subset=df.columns, how="all")?
– ScalaBoy
Jun 28 at 15:58







I added the screenshot of my Jupyter Notebook to my post.
– ScalaBoy
Jun 28 at 16:03





