Operations on the amount of values using python pandas
Operations on the amount of values using python pandas
I have 2 dataframes 'transactions' and 'offsets'
offsets:
Contact Account Name
0 TODD HOWARD
1 TODD HOWARD
2 JEFF COX
3 JEFF COX
4 TODD HOWARD
5 JEFF COX
6 MIKE BALDWIN
transactions:
Contact Account Name
0 TODD HOWARD
1 TODD HOWARD
2 JEFF COX
3 JEFF COX
4 TODD HOWARD
5 JEFF COX
6 TODD HOWARD
7 MIKE BALDWIN
8 MIKE BALDWIN
9 JEFF COX
10 JC WHITE
What it want to do:
1) Is to count each unique values. For this purpose I've used:
df1 = offsets.groupby('Contact Account Name').size()
df2 = transactions.groupby('Contact Account Name').size()
And I've got
df1:
Contact Account Name
TODD HOWARD 3
JEFF COX 3
MIKE BALDWIN 1
df2:
Contact Account Name
JC WHITE 1
TODD HOWARD 4
JEFF COX 4
MIKE BALDWIN 2
2) I want to merge both dataframes. I've tried merge
but it didn't work.
merge
3) I want to create another dataframe and to calculate percentage of the offsets out of overall transactions.
What result i want to see at the end:
Contact Account Name Offset Percentage
TODD HOWARD 75
JEFF COX 75
MIKE BALDWIN 50
JC WHITE 100
Thanks in advance!
1 Answer
1
Output of aggregation is Series
, so is possible divide by div
with multiple by mul
and last reset_index
:
Series
div
mul
reset_index
df = df1.div(df2, fill_value=1).mul(100).reset_index(name='Offset Percentage')
print (df)
Contact Account Name Offset Percentage
0 JC WHITE 100.0
1 JEFF COX 75.0
2 MIKE BALDWIN 50.0
3 TODD HOWARD 75.0
Similar solution with value_counts
:
value_counts
df1 = offsets['Contact Account Name'].value_counts()
df2 = transactions['Contact Account Name'].value_counts()
df = (df1.div(df2, fill_value=1)
.mul(100)
.rename_axis('Contact Account Name')
.reset_index(name='Offset Percentage'))
print (df)
Contact Account Name Offset Percentage
0 JC WHITE 100.0
1 JEFF COX 75.0
2 MIKE BALDWIN 50.0
3 TODD HOWARD 75.0
If need join both Series together call concat
:
concat
df = pd.concat([df2, df1], axis=1, keys=('Offset Percentage','b'))
df['Offset Percentage'] = df.b.div(df['Offset Percentage'], fill_value=1).mul(100)
df = df.drop('b', 1).rename_axis('Contact Account Name').reset_index()
print (df)
Contact Account Name Offset Percentage
0 JC WHITE 100.0
1 JEFF COX 75.0
2 MIKE BALDWIN 50.0
3 TODD HOWARD 75.0
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
Comments
Post a Comment