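I'd like to add group labels above the bars of a ggplot bar chart. This feature exists in other data visualizations, such as phylogenetic trees (in ggtree), but I haven't found a way to do it in ggplot.

I tried playing around with geom_text and geom_label, but haven't had any success. Perhaps another package supports this? I've attached example code that should be fully reproducible. I'd like the rating variable to sit above the bars of the continents listed, with each label spanning multiple continents.

Any help would be much appreciated. Thank you!

Please excuse all the comments; I'm writing a teaching tutorial.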

#load necessary packages
library(tidyverse)
library(stringr)
library(hrbrthemes)
library(scales)

#load data
covid<- read_csv("https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/owid-covid-data.csv", na = ".")  

#this makes a new dataframe (total_cases) that only has the latest COVID cases count and location data
total_cases <- covid %>% filter(date == "2021-05-23") %>% 
  group_by(location, total_cases) %>% 
  summarize()

#get number for world total cases. 
world <- total_cases %>%
  filter(location == "World") %>%
  select(total_cases)

#make new column that has the proportion of total world cases (number was total on that day)
total_cases$prop_total <- total_cases$total_cases/world$total_cases

#this specifies what the continents are so we can keep only those rows with dplyr
continents <- c("North America", "South America", "Antarctica", "Asia", "Europe", "Africa", "Australia")

#Using dplyr, we're keeping total_cases only for the continents
contin_cases <- total_cases %>%
  filter(location %in% continents)

#Loading a colorblind accessible palette
cbbPalette <- c("#000000", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")

#Add a column that rates proportion of cases categorically.   
contin_cases <- contin_cases %>% 
  mutate(rating = case_when(prop_total <= 0.1 ~ 'low',
                            prop_total <= 0.2 ~ 'medium',
                            prop_total <= 1 ~ 'high'))

#Plotting it on a bar chart. 
plot1 <- ggplot(contin_cases, 
           aes(x = reorder(location, prop_total),
               y = prop_total,
               fill = location)) +
  geom_bar(stat="identity", color="white") +
  ylim(0, 1) +
  geom_text(aes(y = prop_total,
                label = round(prop_total, 4)),
            vjust = -1.5) +
  scale_fill_manual(name = "Continent", 
                    values = cbbPalette) +
  labs(title = "Proportion of total COVID-19 Cases Per Continent", 
       caption ="Figure 1. Asia leads total COVID case count as of May 23rd, 2021. No data exists in this dataset for Antarctica.") +
  ylab("Proportion of total cases") +
  xlab("") + #this makes x-axis blank
  theme_classic() +
  theme(plot.caption = element_text(hjust = 0, face = "italic"))

plot1

Here is something similar to what I'm trying to achieve:

[Image: bar chart showing total COVID cases by continent as of May 2021]

Recommended answer

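One way to achieve the desired result is via geom_segment. To do this, I first prepare a dataset containing the start and end positions of the segments that will be placed on top of the bars, one per rating group. Basically, this involves converting the discrete locations to numerics.

After that, adding the segments and labels is straightforward.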

library(tidyverse)
library(hrbrthemes)
library(scales)

# Loading a colorblind accessible palette
cbbPalette <- c("#000000", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")

width <- .45 # Half of default width of bars
df_segment <- contin_cases %>% 
  ungroup() %>% 
  # Convert location to numerics
  mutate(loc_num = as.numeric(fct_reorder(location, prop_total))) %>%
  group_by(rating) %>% 
  summarise(x = min(loc_num) - width, xend = max(loc_num) + width,
            y = max(prop_total) * 1.5, yend = max(prop_total) * 1.5)

ggplot(
  contin_cases,
  aes(
    x = reorder(location, prop_total),
    y = prop_total,
    fill = location
  )
) +
  geom_bar(stat = "identity", color = "white") +
  ylim(0, 1) +
  # One horizontal segment per rating group; max(y) puts every segment at the same height
  geom_segment(data = df_segment, aes(x = x, xend = xend, y = max(y), yend = max(yend), 
                                      color = rating, group = rating), 
               inherit.aes = FALSE, show.legend = FALSE) +
  # Rating label centered horizontally above its segment
  geom_text(data = df_segment, aes(x = .5 * (x + xend), y = max(y), label = str_to_title(rating), color = rating), 
            vjust = -.5, inherit.aes = FALSE, show.legend = FALSE) +
  geom_text(aes(
    y = prop_total,
    label = round(prop_total, 4)
  ),
  vjust = -1.5
  ) +
  scale_fill_manual(
    name = "Continent",
    values = cbbPalette
  ) +
  labs(
    title = "Proportion of total COVID-19 Cases Per Continent",
    caption = "Figure 1. Asia leads total COVID case count as of May 23rd, 2021. No data exists in this dataset for Antarctica."
  ) +
  ylab("Proportion of total cases") +
  xlab("") + # this makes x-axis blank
  theme_classic() +
  theme(
    plot.caption = element_text(hjust = 0, face = "italic")
  )
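
To make the "convert discrete locations to numerics" step more concrete, here is a minimal base-R sketch of the same idea. It assumes a toy data frame with values rounded from the DATA block, and the names toy, x_start and x_end are only illustrative. It is not the code used for the plot above, just a way to inspect where each segment would start and end.

```r
# Toy data mirroring the columns used in the answer (values rounded, for illustration only)
toy <- data.frame(
  location = c("Africa", "Asia", "Australia", "Europe", "North America", "South America"),
  prop_total = c(0.028, 0.294, 0.0002, 0.280, 0.232, 0.166),
  rating = c("low", "high", "low", "high", "high", "medium")
)

# reorder() orders the factor levels by prop_total; as.numeric() then returns
# each bar's position on the discrete x axis (1 = leftmost bar)
toy$loc_num <- as.numeric(reorder(toy$location, toy$prop_total))

# Half of the default bar width (0.9), so a segment reaches the outer bar edges
width <- 0.45

# Per rating group: left edge of the first bar and right edge of the last bar
x_start <- tapply(toy$loc_num, toy$rating, min) - width
x_end <- tapply(toy$loc_num, toy$rating, max) + width

print(data.frame(rating = names(x_start),
                 x = as.vector(x_start),
                 xend = as.vector(x_end)))
```

Because the bars are sorted by prop_total and the rating is derived from prop_total, the bars in each rating group are adjacent, which is why taking the minimum and maximum position per group is enough here.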

DATA

contin_cases <- structure(list(location = c(
  "Africa", "Asia", "Australia", "Europe",
  "North America", "South America"
), total_cases = c(
  4756650, 49204489,
  30019, 46811325, 38790782, 27740153
), prop_total = c(
  0.0284197291646085,
  0.293983843894959, 0.000179355607369132, 0.2796853202015, 0.231764691226676,
  0.165740097599109
), rating = c(
  "low", "high", "low", "high",
  "high", "medium"
)), class = c(
  "grouped_df", "tbl_df", "tbl",
  "data.frame"
), row.names = c(NA, -6L), groups = structure(list(
  location = c(
    "Africa", "Asia", "Australia", "Europe", "North America",
    "South America"
  ), .rows = structure(list(
    1L, 2L, 3L, 4L,
    5L, 6L
  ), ptype = integer(0), class = c(
    "vctrs_list_of",
    "vctrs_vctr", "list"
  ))
), row.names = c(NA, -6L), class = c(
  "tbl_df",
  "tbl", "data.frame"
), .drop = TRUE))
