Count difference in row data between two data frames in R
I have two data frames (a1 and a2).
The first (a1) is the original dataset. The second (a2) is the same dataset, except that data has been appended to some records. I want a count of the records that contain appended data. I don't need to view the records themselves.
What is the best way to just get a count of the number of records that are different in a2?
Please provide a Minimal, Complete, and Verifiable example. From your description it is not clear what the difference between data frames a1 and a2 is, especially what you mean by "contains data that has been appended to some records". Thank you.
– Uwe
Jun 29 at 16:35
2 Answers
OK, so first let me get this straight: you want to compare two data frames and count the rows that differ between them.
Using dplyr
> library(dplyr)
> a1
  a b
1 1 a
2 2 b
3 3 c
4 4 d
5 5 e
> a2
  a b
1 1 a
2 2 b
3 3 c
> df <- setdiff(a1, a2)
> df
  a b
1 4 d
2 5 e
> nrow(df)
[1] 2
Is this what you are looking for?
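Note that setdiff(a1, a2) returns the rows of a1 that are missing from a2; for the question as asked (records in a2 that differ from a1) the arguments would go the other way round. As a minimal sketch without dplyr, assuming a "different record" means a row of a2 with no exact match anywhere in a1, the same count can be obtained in base R. The columns id and note and the sample values are hypothetical, made up only for illustration:

```r
# Hypothetical data: a2 is a1 with text appended to the notes of records 2 and 4
a1 <- data.frame(id = 1:5,
                 note = c("a", "b", "c", "d", "e"),
                 stringsAsFactors = FALSE)
a2 <- a1
a2$note[c(2, 4)] <- paste(a2$note[c(2, 4)], "appended")

# Collapse every row into a single string key.
# Assumption: the separator "\r" never occurs in the data itself.
key1 <- do.call(paste, c(a1, sep = "\r"))
key2 <- do.call(paste, c(a2, sep = "\r"))

# Count the rows of a2 whose key does not appear in a1
n_changed <- sum(!key2 %in% key1)
n_changed
```

With this sample data, n_changed is 2. Converting rows to strings is a shortcut that can be fooled by values that print identically but have different types, so treat it as a quick check rather than a strict comparison.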
Use anti_join from dplyr: anti_join(a2, a1) returns the records that are in a2 but not in a1, and tally() counts those rows.
library(dplyr)
a2 %>%
  anti_join(a1) %>%  # with no `by` given, joins on all columns the two data frames share
  tally()
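To make the anti_join approach reproducible, here is a small self-contained sketch. The data frames, the column names id and note, and the appended text are all assumptions made for illustration, since the question does not show its data:

```r
library(dplyr)

# Hypothetical data: a2 is a1 with text appended to the notes of records 2 and 4
a1 <- data.frame(id = 1:5,
                 note = c("a", "b", "c", "d", "e"),
                 stringsAsFactors = FALSE)
a2 <- a1
a2$note[c(2, 4)] <- paste(a2$note[c(2, 4)], "appended")

# Keep the rows of a2 that have no exact match in a1,
# matching on every column named in `by`
changed <- anti_join(a2, a1, by = c("id", "note"))

# The number of changed records
n_changed <- nrow(changed)
n_changed
```

With this sample data, n_changed is 2. Naming the columns in `by` avoids the message dplyr prints when it has to guess the join columns. One difference from setdiff worth keeping in mind: anti_join keeps duplicate rows of a2, while dplyr's setdiff returns distinct rows, so the two counts can differ when a2 contains repeated records.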
It's not clear what you're asking. A simple example would help. See stackoverflow.com/questions/5963269/…
– Ryan
Jun 29 at 16:08