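R version 4.3.1, library(ggplot2) version 3.4.2

I want to use a ggplot2 bar chart to show the percentage frequency of events per concentration class. I have five concentration classes (= c1, with the options "1_", "2_", "3_", "4_", "5_"), and an event can occur "early" (= c2, dichotomous, "yes" or "no") or "late" (= c3, dichotomous, "yes" or "no").

To prevent empty concentration classes from being dropped from the bar chart, I set "scale_x_discrete(drop = FALSE)" in the ggplot call.

If events occur in one, three, four or all five concentration classes, the bar chart is drawn as I expect: no bar is drawn for concentrations without events, but all five concentration classes are shown on the x-axis.

If events occur in only two concentration classes, two wide bars next to each other are drawn instead of two thin bars at their respective concentrations.

What is the reason for this problem, and how can I solve it? Thank you very much in advance for any advice.

This is the code I used: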

# load packages (needed for %>%, mutate(), count() and ggplot())
library(dplyr)
library(ggplot2)

# build data.frame
c1 <- c("1_", "1_", "2_", "2_", "3_", "4_", "4_", "4_", "5_", "5_")
c2 <- c("yes", "yes", "no", "no", "no", "no", "no", "no", "no", "no")
# use exactly ONE of the following c3 definitions; a later assignment overwrites an earlier one
# c3 <- c("no", "no", "yes", "no", "yes", "yes", "no", "no", "no", "yes") # for five bars
# c3 <- c("no", "no", "yes", "no", "yes", "no", "no", "no", "no", "yes") # for four bars
# c3 <- c("no", "no", "yes", "no", "no", "no", "no", "no", "no", "yes") # for three bars
c3 <- c("no", "no", "no", "no", "no", "no", "no", "no", "no", "yes") # for two bars --> THIS is where the error occurs
# c3 <- c("no", "no", "no", "no", "no", "no", "no", "no", "no", "no") # for one bar

dataf <- data.frame(c1, c2, c3)
dataf

# compilation of data + addition of column with event status + reduction of table to relevant columns
bp11 <- dataf %>% mutate(c.Compilation = case_when(c2 == "yes" ~ "EarlyEvent",
                                                   c3 == "yes" ~ "LateEvent")) %>%
  select(c1, c.Compilation)
bp11

# insert number (n) and percentage share (prop)
bp22 <- bp11 %>% group_by(c1) %>% count(c.Compilation) %>% mutate(prop = n / sum(n) * 100)

# removal of NA
bp33 <- bp22[complete.cases(bp22$c.Compilation), ]

# specify factor level order "Groups"
bp33$c.Compilation = factor(bp33$c.Compilation, levels = c("LateEvent", "EarlyEvent"))

# specify factor level order "Concentration"
bp33$c1 = factor(bp33$c1, levels = c("1_", "2_", "3_", "4_", "5_"))

# calculate n of each concentration group
calcN <- table(bp11$c1)
calcN

# draw stacked barplot
ggplot(bp33,
       aes(x = c1, y = prop, fill = c.Compilation)) +
  geom_bar(stat = "identity",
           position = "stack") +
  labs(x = "Concentration (mg/L)",
       y = "Event (%)",
       fill = "Groups",
       title = "Css") +
  theme(axis.text = element_text(size = 15, face = "bold"),
        axis.title = element_text(size = 15, face = "bold"),
        legend.text = element_text(size = 15),
        legend.title = element_text(size = 15),
        plot.title = element_text(size = 20, hjust = 0.5)) +
  theme(legend.position = c(.18, .84)) +
  scale_x_discrete(breaks = c("1_", "2_", "3_", "4_", "5_"),
                   drop = FALSE) +
  scale_y_continuous(limits = c(-1, 101), breaks = seq(5, 100, by = 5)) +
  annotate("text", x = c(0.9, 1.15), y = -1, label = c("n = ", calcN[c("1_")]), size = 5) +  
  annotate("text", x = c(1.9, 2.15), y = -1, label = c("n = ", calcN[c("2_")]), size = 5) +
  annotate("text", x = c(2.9, 3.15), y = -1, label = c("n = ", calcN[c("3_")]), size = 5) +
  annotate("text", x = c(3.9, 4.15), y = -1, label = c("n = ", calcN[c("4_")]), size = 5) +
  annotate("text", x = c(4.9, 5.15), y = -1, label = c("n = ", calcN[c("5_")]), size = 5) +
  scale_fill_grey(start = 0.6, end = 0)

This is what I get when I run the code:

[Figure: five bars in the bar chart]

[Figure: four bars in the bar chart]

[Figure: three bars in the bar chart]

[Figure: two bars in the bar chart, this is where the error occurs]

[Figure: one bar in the bar chart]

Recommended answer

To me this looks like a bug (after some testing, it only occurs with stat = "identity" and when no two of the plotted categories are adjacent).
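
One plausible explanation for the "no adjacent categories" condition, offered as an assumption and not something confirmed in this answer: when no width is supplied, the bar width appears to be derived from the smallest gap between the x positions that actually occur in the data. The base-R sketch below only mimics that idea. `smallest_gap()` is a hypothetical helper, not a ggplot2 function, and the factor 0.9 is assumed to be the default share of that gap.

```r
# Hypothetical helper: smallest distance between distinct x positions.
smallest_gap <- function(x) {
  ux <- sort(unique(x))
  if (length(ux) < 2) return(1)  # a single position falls back to 1
  min(diff(ux))
}

# The discrete levels "1_" .. "5_" sit at positions 1 .. 5 on the x-axis.
two_bars   <- c(1, 5)     # events only in "1_" and "5_"
three_bars <- c(1, 2, 5)  # "1_" and "2_" are adjacent

# Assumed default: bar width = 0.9 * smallest gap.
width_two   <- 0.9 * smallest_gap(two_bars)    # gap of 4 gives very wide bars
width_three <- 0.9 * smallest_gap(three_bars)  # gap of 1 gives the usual width
```

If that assumption holds, passing an explicit width such as geom_col(width = 0.9) might be another way to sidestep the problem, though this answer only proposes completing the data.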

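NOTE: The issue has been reported and is already fixed in the development version of ggplot2 (see below).

One workaround is to complete the dataset manually, e.g. using tidyr::complete: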

library(ggplot2)
library(dplyr, warn.conflicts = FALSE)
library(tidyr)

bp33_complete <- bp33 |>
  ungroup() |>
  complete(c1, c.Compilation, fill = list(prop = 0))

ggplot(
  bp33_complete,
  aes(x = c1, y = prop, fill = c.Compilation)
) +
  geom_col() +
  scale_x_discrete(
    breaks = c("1_", "2_", "3_", "4_", "5_"),
    drop = FALSE
  ) +
  scale_y_continuous(
    breaks = seq(5, 100, 5),
    limits = c(-1, 101)
  ) +
  labs(
    x = "Concentration (mg/L)",
    y = "Event (%)",
    fill = "Groups",
    title = "Css"
  ) +
  theme(
    axis.text = element_text(size = 15, face = "bold"),
    axis.title = element_text(size = 15, face = "bold"),
    legend.text = element_text(size = 15),
    legend.title = element_text(size = 15),
    plot.title = element_text(size = 20, hjust = 0.5)
  ) +
  theme(legend.position = c(.18, .84)) +
  geom_text(
    aes(
      label = after_stat(paste0("n = ", count)),
      y = after_stat(-1),
      fill = NULL
    ),
    data = bp11,
    stat = "count",
    size = 5
  ) +
  scale_fill_grey(start = 0.6, end = 0)

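What complete() does in the workaround can be seen on a small toy table. This is a minimal sketch with made-up values: `toy` and `grp` are illustrative names, not objects from the question. Because both columns are factors, complete() expands to every combination of their levels and fills the missing `prop` values with 0.

```r
library(tidyr)

# Toy data: only two of the five concentration classes contain an event.
toy <- data.frame(
  c1   = factor(c("1_", "5_"), levels = c("1_", "2_", "3_", "4_", "5_")),
  grp  = factor(c("EarlyEvent", "LateEvent"),
                levels = c("LateEvent", "EarlyEvent")),
  prop = c(100, 50)
)

# Add a row for every missing c1/grp combination, with prop set to 0.
toy_complete <- complete(toy, c1, grp, fill = list(prop = 0))

nrow(toy_complete)  # 5 classes x 2 groups = 10 rows
```

With every class present in the data, each class has a neighbour, so the condition described at the top of this answer (no adjacent categories) no longer applies.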

DATA

c1 <- c("1_", "1_", "2_", "2_", "3_", "4_", "4_", "4_", "5_", "5_")
c2 <- c("yes", "yes", "no", "no", "no", "no", "no", "no", "no", "no")
# for two bars --> THIS is where the error occurs
c3 <- c("no", "no", "no", "no", "no", "no", "no", "no", "no", "yes") 

dataf <- data.frame(c1, c2, c3)

Update: The issue has been reported here and here, and it appears to be fixed in the development version of ggplot2:

With ggplot2 3.4.4:

library(ggplot2)

packageVersion("ggplot2")
#> [1] '3.4.4'

dat <- data.frame(
  x = factor(c("A", "E"), levels = LETTERS[1:5])
)

ggplot(dat, aes(x, y = 1)) +
  geom_bar(stat = "identity") + # or geom_col
  scale_x_discrete(drop = FALSE)

Created on 2023-12-30 with reprex v2.0.2

With the development version:

library(ggplot2)

packageVersion("ggplot2")
#> [1] '3.4.4.9000'

dat <- data.frame(
  x = factor(c("A", "E"), levels = LETTERS[1:5])
)

ggplot(dat, aes(x, y = 1)) +
  geom_bar(stat = "identity") + # or geom_col
  scale_x_discrete(drop = FALSE)
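
To decide in a script whether the workaround is still needed, the installed version can be compared against the development version shown above. This is a minimal sketch, assuming that every version from 3.4.4.9000 onward contains the fix (the reprex only demonstrates it for that one development build). `needs_workaround()` is a hypothetical helper.

```r
# Hypothetical helper: TRUE if the given ggplot2 version predates the
# development build (3.4.4.9000) in which the bars were drawn correctly.
# Assumption: all later versions keep the fix.
needs_workaround <- function(version = utils::packageVersion("ggplot2")) {
  as.package_version(version) < as.package_version("3.4.4.9000")
}

needs_workaround("3.4.2")       # version used in the question: TRUE
needs_workaround("3.4.4.9000")  # development version from the reprex: FALSE
```

Called without an argument, the helper checks the installed ggplot2, so the tidyr::complete step could be applied only when it returns TRUE.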
