Proper way to extend Python class





I'm looking to extend a pandas DataFrame, creating an object where all of the original DataFrame attributes/methods are intact, while making a few new attributes/methods available. I also need the ability to convert (or copy) objects that are already DataFrames to my new class. What I have seems to work, but I feel like I might have violated some fundamental convention. Is this the proper way of doing this, or should I even be doing it in the first place?


import pandas as pd

class DataFrame(pd.DataFrame):
    def __init__(self, df):
        df.__class__ = DataFrame  # effectively 'cast' pandas DataFrame as my own



The idea is that I could then initialize it directly from a pandas DataFrame, e.g.:


df = DataFrame(pd.read_csv(path))





You're mixing up inheritance and composition. Your DataFrame class both "has a" and "is a" pd.DataFrame.
– mypetlion
Jun 29 at 19:21







self = df doesn't do anything
– juanpa.arrivillaga
Jun 29 at 20:33
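
The point in the comment above can be checked without pandas. The following is a minimal sketch using two hypothetical classes, Base and Child, that are not from the question: assigning to self inside __init__ only rebinds a local name, so the object being constructed never picks up anything from the object that was passed in.

```python
class Base:
    def greet(self):
        return "base"


class Child(Base):
    def __init__(self, other):
        # Rebinds the local name `self` only. The instance under
        # construction is left untouched and `other` is not adopted.
        self = other

    def greet(self):
        return "child"


original = Base()
original.tag = "from original"

child = Child(original)

print(type(child).__name__)     # Child
print(hasattr(child, "tag"))    # False
print(type(original).__name__)  # Base
```

This is why the class in the question has to reassign df.__class__ (or copy the data into the new instance) instead of assigning to self.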






3 Answers



I'd probably do it this way, if I had to:


import pandas as pd

class CustomDataFrame(pd.DataFrame):
    @classmethod
    def convert_dataframe(cls, df):
        df.__class__ = cls
        return df

    def foo(self):
        return "Works"


df = pd.DataFrame([1,2,3])
print(df)
#print(df.foo()) # Will throw, since .foo() is not defined on pd.DataFrame

cdf = CustomDataFrame.convert_dataframe(df)
print(cdf)
print(cdf.foo()) # "Works"



Note: This permanently changes the class of the df object you pass to convert_dataframe, because the object is modified in place rather than copied:




print(type(df)) # <class '__main__.CustomDataFrame'>
print(type(cdf)) # <class '__main__.CustomDataFrame'>



If you don't want this, you could copy the dataframe inside the classmethod.
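
A minimal sketch of that copying variant follows. The copy keyword is a hypothetical flag added for illustration and is not part of the answer above; DataFrame.copy() is the standard pandas method.

```python
import pandas as pd


class CustomDataFrame(pd.DataFrame):
    @classmethod
    def convert_dataframe(cls, df, copy=True):
        # `copy` is a hypothetical flag for this sketch. With copy=True
        # the caller's DataFrame keeps its original class.
        target = df.copy() if copy else df
        target.__class__ = cls
        return target

    def foo(self):
        return "Works"


df = pd.DataFrame({"a": [1, 2, 3]})
cdf = CustomDataFrame.convert_dataframe(df)

print(type(df).__name__)   # DataFrame
print(type(cdf).__name__)  # CustomDataFrame
print(cdf.foo())           # Works
```

The trade-off is memory: the copy duplicates the underlying data, so passing copy=False may be preferable for large frames that are no longer needed in their original form.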



If you just want to add methods to a DataFrame, monkey patch the class before you run anything else, as below.




>>> import pandas
>>> def foo(self, x):
... return x
...
>>> foo
<function foo at 0x00000000009FCC80>
>>> pandas.DataFrame.foo = foo
>>> bar = pandas.DataFrame()
>>> bar
Empty DataFrame
Columns: []
Index: []
>>> bar.foo(5)
5
>>>





Thanks for the response. I actually wanted to add some attributes to the DataFrame that specifically interact with the methods; I've updated the question.
– Tahlor
Jun 29 at 20:07





You can create an initializer method (not called __init__) that monkey patches your new attributes onto the DataFrame after it is created.
– user25064
Jun 29 at 20:10
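
For attributes that need to interact with methods, one alternative to monkey patching is the subclassing hooks that pandas itself documents: the _constructor property and the _metadata list. The following is a sketch under assumptions, not the approach proposed in the comments above; LabeledFrame, label, and describe_label are hypothetical names chosen for illustration.

```python
import pandas as pd


class LabeledFrame(pd.DataFrame):
    # Names listed in `_metadata` are stored as ordinary attributes
    # instead of being interpreted as columns.
    _metadata = ["label"]

    # Class-level default so the attribute exists before it is set.
    label = None

    @property
    def _constructor(self):
        # Tells pandas which class to build when an operation
        # returns a new frame.
        return LabeledFrame

    def describe_label(self):
        # A custom method that uses the custom attribute.
        return f"{self.label}: {len(self)} rows"


# No __init__ override is needed: the pandas constructor already
# accepts an existing DataFrame as its data argument.
ldf = LabeledFrame(pd.DataFrame({"a": [1, 2, 3]}))
ldf.label = "sales"

print(ldf.describe_label())  # sales: 3 rows

head = ldf.head(2)
print(type(head).__name__)   # LabeledFrame
```

Whether a _metadata attribute is carried over to a derived frame depends on the operation and the pandas version, so it is worth checking the specific methods you rely on.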




if __name__ == '__main__':
    app = DataFrame()
    app()



event


super(DataFrame,self).__init__()





