I have a data frame, df1, as shown below.

df1 <- structure(list(Sequence = c(
  "ABC>EFGHI", "ABC>NOPQ", "ABC>JKLM",
  "ABC>RSTUV", "ABC>EFGHI>NOPQ", "ABC>NOPQ>EFGHI", "ABC>NOPQ>RSTUV",
  "ABC>EFGHI>RSTUV", "TD2>EFGHI>JKLM", "ABC>JKLM>EFGHI", "ABC>EFGHI>NOPQ>RSTUV",
  "ABC>NOPQ>EFGHI>RSTUV", "ABC>JKLM>NOPQ", "ABC>NOPQ>JKLM", "ABC>JKLM>RSTUV",
  "ABC>JKLM>NOPQ>RSTUV", "ABC>EFGHI>JKLM>RSTUV", "ABC>JKLM>EFGHI>RSTUV",
  "ABC>NOPQ>JKLM>RSTUV"
), Proportion = c(
  21.05, 8.4, 5.35, 4.36,
  2.87, 2.48, 1.52, 1.27, 1.04, 0.94, 0.66, 0.53, 0.44, 0.36, 0.31,
  0.11, 0.07, 0.06, 0.06
), Order = c(
  1, 3, 2, 4, 6, 11, 13, 7,
  5, 8, 15, 18, 9, 12, 10, 17, 14, 16, 19
), `ABC, NOPQ and JKLM` = c(
  NA,
  NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
  NA, 75
), `ABC and EFGHI` = c(
  NA, NA, NA, NA, NA, NA, NA, NA, NA,
  NA, NA, NA, NA, NA, NA, NA, 66, NA, NA
), `NOPQ and RSTUV` = c(
  NA,
  NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 70, NA,
  NA, NA
), RSTUV = c(
  NA, NA, NA, 71, NA, NA, 76, 71, NA, NA, 75,
  78, NA, NA, 61, NA, 70, 78, 77
), NOPQ = c(
  NA, 66, NA, NA, 66,
  65, 74, NA, NA, NA, 73, 74, 60, 60, NA, NA, NA, NA, NA
), JKLM = c(
  NA,
  NA, 51, NA, NA, NA, NA, NA, 56, 52, NA, NA, 56, 62, 59, 68, 67,
  71, NA
), EFGHI = c(
  59, NA, NA, NA, 63, 67, NA, 69, 54, 54, 72,
  76, NA, NA, NA, NA, NA, 76, NA
), ABC = c(
  56, 63, 48, 69, 61,
  63, 72, 66, 53, 50, 71, 73, 54, 58, 58, 66, NA, 68, NA
)), row.names = c(
  NA,
  -19L
), class = "data.frame")


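library(tidyverse) # loads dplyr, tidyr, forcats, stringr and ggplot2

df1_long <- df1 %>%
  # Reshape every column after Sequence, Proportion and Order into long form
  pivot_longer(-c(1:3), names_to = "event") %>%
  filter(!is.na(value)) %>%
  arrange(Order, value) %>%
  mutate(Sequence = fct_inorder(Sequence)) %>%
  arrange(desc(Proportion))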

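Here is my code and the resulting plot. How can I fix the overlapping labels inside the plot (for example 'ABC, NOPQ and JKLM' with 'RSTUV', and also 'ABC and EFGHI' with 'JKLM')?

I have tried ggrepel with geom_text_repel(aes(label = event)), but without success.

Also, how can I increase the space between the lines inside the plot so that it looks less busy?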

df1_long %>%
  ggplot(aes(value, Sequence, color = event)) +
  geom_path(
    aes(group = Sequence),
    linewidth = 1.0,
    arrow = arrow(length = unit(5, "pt"))
  ) +
  geom_point() +
  geom_label(aes(label = event),
    vjust = 1, fill = NA, label.size = 0,
    label.padding = unit(8, "pt"),
    color = "black"
  ) +
  geom_label(aes(label = value),
    vjust = 0, fill = NA, label.size = 0,
    label.padding = unit(8, "pt"),
    color = "black"
  ) +
  geom_text(
    data = df1,
    aes(label = scales::percent(Proportion, scale = 1, accuracy = 0.01)),
    x = 87,
    color = "black", hjust = "inward"
  ) +
  # A second scale_x_continuous() would replace the first, so set both options in one call
  scale_x_continuous(breaks = c(45, 85, 15), expand = c(0.05, 0, 0.05, 5)) +
  scale_color_brewer(type = "qual", palette = 8) +
  guides(color = "none") +
  theme_bw() +
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  ) +
  labs(
    x = "Age",
    y = "Sequence"
  )


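Recommended answer

Two suggestions: change the "and"s to "+"s, and reduce the label size. That gets you part of the way there; then use ggrepel to deal with the last few overlaps: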

df1_long %>%
  # Shorten the labels, e.g. "ABC and EFGHI" becomes "ABC+EFGHI"
  mutate(event = str_replace_all(event, " and ", "+")) %>%
  ggplot(aes(value, Sequence, color = event)) +
  geom_path(
    aes(group = Sequence),
    linewidth = 1.0,
    arrow = arrow(length = unit(5, "pt"))
  ) +
  geom_point() +
  ggrepel::geom_text_repel(aes(label = event),
    vjust = 2, color = "black", angle = 0, size = 3,
    direction = "both", force_pull = 1
  ) + ...
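
To see what the relabelling step does in isolation, here is a minimal sketch on three of the column names from df1. It uses base R's gsub() as a stand-in for str_replace_all() so that it runs without any packages; the vector names `events` and `short_events` are illustrative and not part of the original code.

```r
# Three of the event names from df1 (the vector name `events` is illustrative).
events <- c("ABC, NOPQ and JKLM", "ABC and EFGHI", "RSTUV")

# Replace the literal " and " with "+"; fixed = TRUE treats the pattern as plain text.
short_events <- gsub(" and ", "+", events, fixed = TRUE)

print(short_events)
# Characters saved per label: each " and " (5 characters) becomes "+" (1 character).
print(nchar(events) - nchar(short_events))
```

Shorter labels leave ggrepel less text to move around, which is presumably why this step helps before repelling.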


One more note: geom_text is much faster than geom_label, so unless you actually want the label box, use geom_text.
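
As a rough sketch of that swap, the value labels from the question could be drawn with geom_text() instead of geom_label(). The data frame `toy` below is a hypothetical two-row stand-in for df1_long (values taken from the "ABC>EFGHI" row of df1), and the vjust and size settings are assumptions chosen for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for df1_long: the two points of the "ABC>EFGHI" sequence.
toy <- data.frame(
  value = c(56, 59),
  Sequence = "ABC>EFGHI",
  event = c("ABC", "EFGHI")
)

p <- ggplot(toy, aes(value, Sequence)) +
  geom_point() +
  # geom_text() draws only the text, so fill, label.size and label.padding are not needed
  geom_text(aes(label = value), vjust = -1, color = "black", size = 3)

print(p)
```

Since the labels in the question already use fill = NA and label.size = 0, the box is invisible anyway, so dropping it should not change how the plot looks.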
