Proper way to extend Python class
I'm looking to extend a pandas DataFrame, creating an object where all of the original DataFrame attributes/methods are intact, while making a few new attributes/methods available. I also need the ability to convert (or copy) objects that are already DataFrames to my new class. What I have seems to work, but I feel like I might have violated some fundamental convention. Is this the proper way of doing this, or should I even be doing it in the first place?
import pandas as pd
class DataFrame(pd.DataFrame):
    def __init__(self, df):
        df.__class__ = DataFrame # effectively 'cast' Pandas DataFrame as my own
The idea is that I could then initialize it directly from a pandas DataFrame, e.g.:
df = DataFrame(pd.read_csv(path))
self = df doesn't do anything. – juanpa.arrivillaga
Jun 29 at 20:33
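To illustrate the comment above, here is a minimal sketch in plain Python (no pandas needed). The classes Base and Child are hypothetical stand-ins, not anything from the question. It shows that assigning to self inside __init__ only rebinds a local name and does not replace the object being constructed:

```python
class Base:
    def __init__(self, value=None):
        self.value = value


class Child(Base):
    def __init__(self, other):
        # Rebinding the local name `self` only changes what this name refers
        # to inside __init__; the instance being constructed is untouched.
        self = other


original = Base(value=42)
child = Child(original)

print(child is original)        # False: Child(...) still returns a new object
print(hasattr(child, "value"))  # False: the new object was never initialized
```

So in the question's class, any visible effect would come from the df.__class__ assignment, not from rebinding self.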
3 Answers
I'd probably do it this way, if I had to:
import pandas as pd

class CustomDataFrame(pd.DataFrame):
    @classmethod
    def convert_dataframe(cls, df):
        df.__class__ = cls
        return df

    def foo(self):
        return "Works"

df = pd.DataFrame([1,2,3])
print(df)
#print(df.foo()) # Will throw, since .foo() is not defined on pd.DataFrame
cdf = CustomDataFrame.convert_dataframe(df)
print(cdf)
print(cdf.foo()) # "Works"
Note: This permanently changes the class of the df object you pass to convert_dataframe:
print(type(df)) # <class '__main__.CustomDataFrame'>
print(type(cdf)) # <class '__main__.CustomDataFrame'>
If you don't want this, you could copy the dataframe inside the classmethod.
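A minimal sketch of that copying variant, assuming the same CustomDataFrame class as above (the local name new_df is only for illustration):

```python
import pandas as pd


class CustomDataFrame(pd.DataFrame):
    @classmethod
    def convert_dataframe(cls, df):
        # Copy first so the caller's DataFrame keeps its original class.
        new_df = df.copy()
        new_df.__class__ = cls
        return new_df

    def foo(self):
        return "Works"


df = pd.DataFrame([1, 2, 3])
cdf = CustomDataFrame.convert_dataframe(df)

print(type(df) is pd.DataFrame)      # True: the original is left alone
print(type(cdf) is CustomDataFrame)  # True
print(cdf.foo())                     # Works
```

The trade-off is an extra copy of the data on every conversion.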
If you just want to add methods to a DataFrame, just monkey patch it before you run anything else, as below.
>>> import pandas
>>> def foo(self, x):
...     return x
...
>>> foo
<function foo at 0x00000000009FCC80>
>>> pandas.DataFrame.foo = foo
>>> bar = pandas.DataFrame()
>>> bar
Empty DataFrame
Columns: []
Index: []
>>> bar.foo(5)
5
>>>
Thanks for the response, I actually wanted to add some attributes to the dataframe that specifically interact with the methods, I've updated the question
– Tahlor
Jun 29 at 20:07
You can create an initializer method (not called __init__) that monkey patches your new attributes onto the data frame after it is created. – user25064
Jun 29 at 20:10
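One possible reading of that suggestion, as a rough sketch: the names init_extras and source_path are hypothetical, and foo here is a made-up method that uses the attached attribute.

```python
import pandas as pd


def foo(self):
    # Hypothetical method that reads the attribute attached by init_extras.
    return f"loaded from {self.source_path}"


# Monkey patch the method onto the class, as in the answer above.
pd.DataFrame.foo = foo


def init_extras(df, source_path):
    # Hypothetical initializer (deliberately not __init__): attaches a plain
    # instance attribute to an existing DataFrame and returns it.
    df.source_path = source_path
    return df


df = init_extras(pd.DataFrame([1, 2, 3]), source_path="data.csv")
print(df.foo())  # loaded from data.csv
```

Keep in mind that a plain instance attribute like this lives only on that one object, so frames derived from it (for example via df.copy()) will most likely not carry it over.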
if __name__ == '__main__':
    app = DataFrame()
    app()
event
super(DataFrame,self).__init__()
You're mixing up inheritance and composition. Your DataFrame class both "has a" and "is a" pd.DataFrame. – mypetlion
Jun 29 at 19:21
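To make the distinction in that comment concrete, here is a small sketch of the two designs kept separate. CustomDataFrame and FrameWrapper are hypothetical names; the "is a" version relies only on the fact that the pd.DataFrame constructor accepts an existing DataFrame as its data argument.

```python
import pandas as pd


class CustomDataFrame(pd.DataFrame):
    # Pure "is a": no custom __init__, so the inherited constructor is used.
    def foo(self):
        return "Works"


class FrameWrapper:
    # Pure "has a": the DataFrame is stored as an attribute instead.
    def __init__(self, df):
        self.df = df

    def foo(self):
        return "Works"


plain = pd.DataFrame({"a": [1, 2, 3]})

is_a = CustomDataFrame(plain)  # builds a new object of the subclass
has_a = FrameWrapper(plain)    # wraps the existing object

print(isinstance(is_a, pd.DataFrame))  # True
print(type(plain) is pd.DataFrame)     # True: the original is unchanged
print(has_a.df is plain)               # True
```

With the subclass, methods that return new frames may still hand back a plain pd.DataFrame; the pandas documentation on subclassing describes the _constructor property for that case.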