Data

Week 30 of TidyTuesday uses data from the-numbers.com. It records movies along with their production costs and gross earnings.

Plotly

The plotly library creates interactive graphs. Graphs can be built directly with plot_ly() or by converting an existing ggplot2 graph with ggplotly().

The graph below shows the median production cost of movies by genre and year. Genres can be hidden or shown interactively by clicking their entries in the legend.


Code

Loading and tidying data

library(tidyverse)
library(ggplot2)
library(scales)
library(plotly)

# Read data
horror_movie <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2018-10-23/movie_profit.csv")

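# Tidy dates and calculate return
# The unnamed row-index column is called X1 by older readr versions and ...1 by readr 2.0 and later,
# so drop whichever one is present
horror_movie <- horror_movie %>% select(-any_of(c("X1", "...1"))) %>%
  mutate(release_date = lubridate::mdy(release_date),   # parse "month/day/year" text into dates
         release_year = lubridate::year(release_date),
         return = worldwide_gross - production_budget)  # negative when a film loses money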
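
To see what the date parsing and return calculation do, here is a minimal sketch on two made-up rows. The titles and figures are invented for illustration and are not taken from the dataset.

```r
library(tidyverse)

# Two hypothetical rows mimicking the columns used above
toy <- tibble(
  movie = c("Film A", "Film B"),
  release_date = c("10/31/2008", "6/5/2015"),
  production_budget = c(5e6, 2e7),
  worldwide_gross = c(1.2e7, 1.5e7)
)

toy <- toy %>%
  mutate(release_date = lubridate::mdy(release_date),   # "month/day/year" text -> Date
         release_year = lubridate::year(release_date),  # pull the year out of the Date
         return = worldwide_gross - production_budget)  # Film B ends up with a negative return

toy
```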

# Aggregate and summarise data by year and genre
movie_agg <- horror_movie %>% 
  group_by(release_year, genre) %>%
  summarise(avg_return = mean(return), 
            median_return = median(return),
            avg_production = mean(production_budget), 
            median_production = median(production_budget),
            avg_gross = mean(worldwide_gross), 
            median_gross = median(worldwide_gross),
            .groups = "drop")  # return an ungrouped data frame
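
The summary keeps both a mean and a median for each measure, and the two can differ a lot when one film in a group has an unusually large budget. A small sketch with invented numbers (the toy data frame is hypothetical) shows the effect:

```r
library(tidyverse)

# Hypothetical budgets: three films of one genre released in the same year
toy <- tibble(
  release_year = c(2010, 2010, 2010),
  genre = c("Horror", "Horror", "Horror"),
  production_budget = c(1e6, 2e6, 3e7)
)

toy_agg <- toy %>%
  group_by(release_year, genre) %>%
  summarise(avg_production = mean(production_budget),       # pulled up by the 30M film
            median_production = median(production_budget),  # the middle value, 2M
            .groups = "drop")

toy_agg
```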

ggplot graph

# ggplot of median production cost versus year by genre
yearly_cost_plot <- movie_agg %>% ggplot(aes(release_year, median_production, colour = genre)) +
  geom_line() +
  scale_y_continuous(labels = dollar_format(scale = 1e-6, suffix = "M")) +  # show y-axis labels in millions of dollars
  ylab(NULL) + xlab("Year") + 
  labs(colour = NULL) +  # remove legend title
  ggtitle("Median Production Cost")

yearly_cost_plot

plotly graph

After creating a ggplot graph, the ggplotly function will create a plotly version of the ggplot.

# Create plotly graph
yearly_cost_plotly <- ggplotly(yearly_cost_plot) %>% 
  layout(margin = list(l = 45))  # add extra space on left for axis labels

shiny::div(yearly_cost_plotly, align = "center")  # need the div function to centre the graph on the webpage
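
For comparison, a similar line chart can also be built without ggplot2 by calling plot_ly() directly. The sketch below is only an illustration: toy_agg is a hypothetical stand-in for movie_agg with invented values, so that the example runs on its own.

```r
library(plotly)

# Hypothetical stand-in for movie_agg (values are made up)
toy_agg <- data.frame(
  release_year = c(2000, 2001, 2000, 2001),
  genre = c("Horror", "Horror", "Comedy", "Comedy"),
  median_production = c(5e6, 8e6, 2e7, 2.5e7)
)

# One line per genre, coloured by genre
direct_plot <- plot_ly(toy_agg, x = ~release_year, y = ~median_production,
                       color = ~genre, type = "scatter", mode = "lines") %>%
  layout(title = "Median Production Cost",
         xaxis = list(title = "Year"),
         yaxis = list(title = "", tickprefix = "$"))

direct_plot
```

Going through ggplotly() is convenient when a ggplot already exists, while plot_ly() gives more direct control over plotly's own layout options.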