Reorder not working in ggplot with my current data frame
I'm currently trying to make my own graphical timeline like the one at the bottom of this page. I scraped the table from that link using the rvest package and cleaned it up.
Here is my code:
library(tidyverse)
library(rvest)
library(ggthemes)
library(lubridate)
URL <- "https://en.wikipedia.org/wiki/List_of_Justices_of_the_Supreme_Court_of_the_United_States"
justices <- URL %>%
  read_html() %>%
  html_node("table.wikitable") %>%
  html_table(fill = TRUE) %>%
  data.frame()
# Removes weird row at bottom of the table
n <- nrow(justices)
justices <- justices[1:(n - 1), ]
# Separating the information I want
justices <- justices %>%
  # sep is a regular expression, so the literal "(" must be escaped
  separate(Justice.2, into = c("name", "year"), sep = "\\(") %>%
  separate(Tenure, into = c("start", "end"), sep = "\n–") %>%
  separate(end, into = c("end", "reason"), sep = "\\(") %>%
  select(name, start, end)
# Removes wikipedia tags in start column
justices$start <- gsub('\\[e\\]$|\\[m\\]|\\[j\\]$', '', justices$start)
justices$start <- mdy(justices$start)
# This will replace incumbencies with NA
justices$end <- mdy(justices$end)
# Incumbent judges are still around!
justices[is.na(justices)] <- today()
justices$start <- as.Date(justices$start, format = "%m/%d/%Y")
justices$end <- as.Date(justices$end, format = "%m/%d/%Y")
justices %>%
  ggplot(aes(reorder(x = name, X = start))) +
  geom_segment(aes(xend = name,
                   yend = start,
                   y = end)) +
  coord_flip() +
  scale_y_date(date_breaks = "20 years", date_labels = "%Y") +
  theme(axis.title = element_blank()) +
  theme_fivethirtyeight() +
  NULL
This is the output from ggplot (I'm not worried about aesthetics yet; I know it looks terrible!):
The goal for this plot is to order the judges chronologically by start date, so the judge with the earliest start date should be at the bottom and the judge with the most recent should be at the top. As you can see, there are multiple instances where this rule is broken.
Instead of sorting chronologically, it simply lists the judges in the order they appear in the data frame, which is also the order Wikipedia has them in.
Therefore, a line segment above another segment should always start further right than the one below it.
My understanding of reorder is that it will take the X = start from geom_segment and sort that and list the names in that order.
The only help I could find for this problem is to convert the dates to a factor and then order them that way; however, I then get the error:
Error: Invalid input: date_trans works with objects of class Date only.
Thank you for your help!
I thought this too, but if you look at the bottom portion of the plot there is that group of three small lines that are entirely out of order, even if it were year. However now that I'm looking at it I wonder if my geom_segment code is also wrong...
– Joe Stoica
Jun 29 at 22:10
Oh, you're right. It's putting Thomas Johnson before John Rutledge.
– Anonymous coward
Jun 29 at 22:12
Okay, so it looks like the lines that belong to the judges who had more than one appointment are being placed in the center of where they are actually supposed to be too.
– Joe Stoica
Jun 29 at 22:23
I would love to see it!
– Joe Stoica
Jun 29 at 22:28
2 Answers
You can make the name column a factor and use forcats::fct_reorder to reorder names based on start date. fct_reorder can take a function that's used for ordering start; you can use min() to order by the earliest start date for each justice. That way, judges with multiple start dates will be sorted according to the earliest one. Only a two-line change: add a mutate at the beginning of the pipe, and remove the reorder inside aes.
justices %>%
  mutate(name = as.factor(name) %>% fct_reorder(start, min)) %>%
  ggplot(aes(x = name)) +
  geom_segment(aes(xend = name,
                   yend = start,
                   y = end)) +
  coord_flip() +
  scale_y_date(date_breaks = "20 years", date_labels = "%Y") +
  theme(axis.title = element_blank()) +
  theme_fivethirtyeight()
Created on 2018-06-29 by the reprex package (v0.2.0).
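To see why the summary function matters, here is a minimal sketch on a made-up data frame (the toy object and its dates are illustrative assumptions, not the scraped table). Base reorder() defaults to FUN = mean, so a name with two start dates is placed by their average, which would put it in the middle of where its two terms belong. fct_reorder() with min places it by the earliest date instead.

```r
library(forcats)

# Hypothetical toy data: "A" has two start dates, like a justice appointed twice.
toy <- data.frame(
  name  = c("A", "B", "C", "A"),
  start = as.Date(c("1789-09-26", "1790-02-02", "1789-10-19", "1795-08-12")),
  stringsAsFactors = FALSE
)

# Base reorder() uses FUN = mean by default, so "A" is ranked by the
# average of its two dates and ends up last.
by_mean <- reorder(toy$name, as.numeric(toy$start))

# fct_reorder() with min ranks "A" by its earliest start date, so it comes first.
by_min <- fct_reorder(factor(toy$name), toy$start, .fun = min)

levels(by_mean)  # "C" "B" "A"
levels(by_min)   # "A" "C" "B"
```

The factor levels are what ggplot uses to lay out a discrete axis, so changing the level order is enough to change the order of the segments.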
Perfect, you're a lifesaver!
– Joe Stoica
Jun 30 at 3:21
I would make this a comment, but I couldn't fit it.
This was an attempt I gave up on. It looks like it actually does fix the problem, but it broke several other aspects of the formatting and I've run out of time to fix it back.
justices <- justices[order(justices$start, decreasing = TRUE), ]
any(diff(justices$start) > 0) # FALSE, i.e. it works
justices$id <- nrow(justices):1
ggplot(data = justices, mapping = aes(x = start, y = id)) +
  scale_x_date(date_breaks = "20 years", date_labels = "%Y") +
  scale_y_discrete(breaks = justices$id, labels = justices$name) +
  geom_segment(aes(xend = end, y = id, yend = id), size = 5) +
  theme(axis.title = element_blank()) +
  theme_fivethirtyeight()
Please also refer to this thread. GL!
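One possible reason for the formatting trouble in this attempt is that id is numeric, while scale_y_discrete() is meant for discrete values. The sketch below, on made-up data (the toy object, its names and its dates are assumptions for illustration), keeps the sort-then-number idea but labels a continuous axis with the names instead. It has not been checked against the real scraped table.

```r
library(ggplot2)

# Hypothetical toy data standing in for the scraped justices table.
toy <- data.frame(
  name  = c("A", "B", "C"),
  start = as.Date(c("1789-10-19", "1790-02-15", "1790-02-02")),
  end   = as.Date(c("1795-06-29", "1791-03-05", "1810-09-13")),
  stringsAsFactors = FALSE
)

# Sort by start date and give each row a numeric position (1 = earliest).
toy <- toy[order(toy$start), ]
toy$id <- seq_len(nrow(toy))

# Plot one horizontal segment per row; the numeric y positions are
# labelled with the names through a continuous scale.
p <- ggplot(toy, aes(x = start, y = id)) +
  geom_segment(aes(xend = end, yend = id)) +
  scale_y_continuous(breaks = toy$id, labels = toy$name) +
  scale_x_date(date_labels = "%Y")
```

Because each row gets its own id, a person with two terms would occupy two separate rows here, unlike the factor-based approach where both segments share one axis position.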
I think it is ordering them by year, not the full date, then by name. Seems the trick is getting reorder to use the full date.
– Anonymous coward
Jun 29 at 22:06