Bug: impossible to delete infinite values from DataFrame

This is my DataFrame df:

     col1      col2
-0.441406  2.523047
-0.321105  1.555589
-0.412857  2.223047
-0.356610  2.513048

When I check df, I see that there are some infinite values.

np.any(np.isnan(df))
False
np.all(np.isfinite(df))
True

What is the difference between NaN and infinite? Also, how can I delete all infinite values to get True in np.all(np.isfinite(X))?
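For reference, a minimal sketch of how NumPy tells the two apart, using a made-up frame (`demo` and its values are assumptions for illustration, not the data from the question). NaN means "not a number" and infinity is a value beyond any finite number. `np.isnan` flags only NaN, `np.isinf` flags only +inf and -inf, and `np.isfinite` is True only for values that are neither. Read that way, `np.all(np.isfinite(df))` returning True would mean the frame holds no infinite and no NaN values.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame, not the asker's data.
demo = pd.DataFrame({"col1": [1.0, np.inf, np.nan],
                     "col2": [2.0, 3.0, -np.inf]})

values = demo.to_numpy()
nan_mask = np.isnan(values)        # True only where the value is NaN
inf_mask = np.isinf(values)        # True only where the value is +inf or -inf
finite_mask = np.isfinite(values)  # True where the value is neither NaN nor inf

has_nan = bool(nan_mask.any())
has_inf = bool(inf_mask.any())
all_finite = bool(finite_mask.all())

print(has_nan, has_inf, all_finite)
```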

This is what I tried:

df = df.replace([np.inf, -np.inf], np.nan).dropna(how="all")

But the check for infinite values still gives me True.

Moreover, .apply(lambda s: s[np.isfinite(s)].dropna()).count() gives me the same number of rows for every column as df.shape, which suggests there are no infinite values. But in that case, why does np.all(np.isfinite(df)) return True?

1 Answer

Your question is similar to "dropping infinite values from dataframes in pandas?". Did you try:

df.replace([np.inf, -np.inf], np.nan).dropna(subset=["col1", "col2"], how="all")

np.nan is not considered finite, so you may use any finite number in place of np.nan as the replacement value. See this code for example:

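import numpy as np
import pandas as pd

# One row holding a finite value, +inf and -inf (built as floats up front)
df = pd.DataFrame([[1.0, np.inf, -np.inf]], columns=list('ABC'))
print(df)
print(np.all(np.isfinite(df)))

# Replacing inf with NaN still leaves non-finite values in the row
df_nan = df.replace([np.inf, -np.inf], np.nan).dropna(subset=df.columns, how="all")
print(df_nan)
print(np.all(np.isfinite(df_nan)))

# Replacing inf with a finite number makes every value finite
df_0 = df.replace([np.inf, -np.inf], 0).dropna(subset=df.columns, how="all")
print(df_0)
print(np.all(np.isfinite(df_0)))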

Result:

     A    B    C
0  1.0  inf -inf
False
     A   B   C
0  1.0 NaN NaN
False
     A    B    C
0  1.0  0.0  0.0
True
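If the goal is to drop the offending rows rather than substitute a number, one possible approach is a boolean row mask built from np.isfinite. This is only a sketch: the frame below and the names `row_is_finite` and `df_finite` are made up for illustration, and it removes every row that contains inf or NaN in any column, which may be stricter than what is wanted.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 0 contains inf, row 2 contains NaN, row 1 is clean.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0],
                   "B": [np.inf, 5.0, 6.0],
                   "C": [-np.inf, 8.0, np.nan]})

# True for rows in which every value is finite (neither inf nor NaN).
row_is_finite = np.isfinite(df.to_numpy()).all(axis=1)

# Keep only the fully finite rows.
df_finite = df[row_is_finite]

print(df_finite)
print(np.isfinite(df_finite.to_numpy()).all())
```

After this filter the finiteness check on `df_finite` passes, because every remaining row was selected for being finite throughout.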

How is it different from what I posted in my question? This is exactly what I tried and it didn't work.
– ScalaBoy
Jun 28 at 15:38

Not exactly the same, because .dropna(subset=["col1", "col2"], how="all") != .dropna()
– A STEFANI
Jun 28 at 15:46

Should I mention all the columns? Can I do .dropna(subset=df.columns, how="all")?
– ScalaBoy
Jun 28 at 15:58
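A small sketch of what the `subset` and `how` arguments do, on a made-up frame (the data and the names `cleaned`, `dropped_all` and `dropped_any` are assumptions for illustration). Passing all columns as `subset` should behave like leaving `subset` out. The choice that changes the result is `how`: "all" drops a row only when every selected column is NaN, while "any" drops it when at least one is.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 1 has one inf, row 2 is NaN in both columns.
df = pd.DataFrame({"col1": [1.0, np.inf, np.nan],
                   "col2": [2.0, 3.0, np.nan]})

# Turn +inf / -inf into NaN so that dropna can see them.
cleaned = df.replace([np.inf, -np.inf], np.nan)

# how="all": a row is dropped only if every selected column is NaN.
dropped_all = cleaned.dropna(subset=list(df.columns), how="all")

# how="any": a row is dropped if at least one selected column is NaN.
dropped_any = cleaned.dropna(subset=list(df.columns), how="any")

print(dropped_all)
print(dropped_any)
```

With how="all", the row that had a single infinite value survives with a NaN in it, so a later np.isfinite check still fails on that row. With how="any", only the fully finite row remains.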

I added the screenshot of my Jupyter Notebook to my post.
– ScalaBoy
Jun 28 at 16:03
