Hi there, I've just started working with violin plots in R and I'm fairly happy with the results but, for some reason, despite trying various alternatives, I cannot change the order of the plots on the x-axis. See below for an example.

Basically, what I have here is a series of violin plots for eight populations, showing their variant statistics. I would like them ordered as follows: AFR, EUR, MENA, SAS, CEA, SIB, OCE and AME, which reflects the decreasing total number of variants found in each group.

This is the code I'm using:

library(dplyr)
library(readxl)
library(tibble)
library(ggplot2)
library(hrbrthemes)
library(introdataviz)

variants_dist <- read_excel("path/to/file.xlsm", 10)
df_var = variants_dist %>% group_by(population_ID) %>% summarise(num=n())

### PLOT THE DATA
variants_dist %>%
  left_join(df_var) %>%
  mutate(pop_count = paste0(population_ID, "\n", "n=", num)) %>%
  ggplot(aes(x=pop_count, y=snps, fill=population_ID)) +
  geom_violin(position="dodge", trim=FALSE) +
  geom_boxplot(width=0.07, color="black", alpha=0.6) +
  scale_fill_manual(values=c(EUR="dodgerblue2", MENA="mediumvioletred", SIB="darkkhaki", CEA="firebrick2", AFR="olivedrab2", OCE="powderblue", SAS="darksalmon", AME="plum2")) +
  #scale_x_discrete(limits = c("AFR", "EUR", "MENA", "SAS", "CEA", "SIB", "OCE", "AME")) +
  theme_bw() +
  theme(
    legend.position="none",
  ) +
  xlab("")

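I followed one of the suggested tutorials to get this result, but unfortunately some basic things, such as using factor to specify the desired sequence as levels (which is what I usually do to change the order), do not seem to work. I commented out a line that sets the x scale to discrete and overrides the theme_bw() option, which I found here, but I'm not necessarily inclined to use it.

I suspect the problem might be the initial left_join(df_var) %>%, but if so, I still don't know how to fix it. Any help is much appreciated, thanks!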

dput output:

structure(list(samples = c("abh100 - number of:", "abh107 - number of:", "ALB212 - number of:", "Ale14 - number of:", "Ale20 - number of:", "Ale22 - number of:", "Ale32 - number of:", "altai363p - number of:", "armenia293 - number of:", "Armenian222 - number of:", "AV-21 - number of:", "Ayodo_430C - number of:", "Ayodo_502C - number of:", "Ayodo_81S - number of:", "B11 - number of:", "B17 - number of:", "Bishkek28439 - number of:", "Bishkek28440 - number of:", "Bu16 - number of:", "Bu5 - number of:", "BulgarianB4 - number of:", "BulgarianC1 - number of:", "ch113 - number of:", "CHI-007 - number of:", "CHI-034 - number of:", "DNK05 - number of:", "DNK07 - number of:", "DNK11 - number of:", "Dus16 - number of:", "Dus22 - number of:", "Esk29 - number of:", "Est375 - number of:", "Est400 - number of:", "HG00126 - number of:", "HG00128 - number of:"), population_ID = c("MENA", "MENA", "EUR", "SIB", "SIB", "SIB", "SIB", "SIB", "EUR", "EUR", "EUR", "AFR", "AFR", "AFR", "SAS", "SAS", "SIB", "SIB", "CEA", "CEA", "EUR", "EUR", "EUR", "CEA", "CEA", "AFR", "AFR", "AFR", "OCE", "OCE", "SIB", "EUR", "EUR", "EUR", "EUR"), snps = c(4847876, 4820146, 4875942, 4848405, 4846958, 4893150, 4886498, 4778500, 4868602, 4861225, 5513106, 5726596, 5766508, 5372587, 4974419, 4894272, 4870208, 4913870, 4923787, 4925207, 4840414, 4798908, 4891562, 4953420, 4881495, 5605004, 5703805, 5643221, 4831148, 4829405, 4688483, 4783761, 4778239, 4774887, 4811481)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -35L))

EDIT for @stefan

variants_dist <- variants_dist %>%
  mutate(population_ID=factor(population_ID, levels=c("AFR", "EUR", "MENA", "SAS", "CEA", "SIB", "OCE", "AME")))

variants_dist %>% arrange(population_ID) -> pop_sort

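Then I changed x=pop_count to x=forcats::fct_inorder(pop_count).

Is this what you meant in your comment?

Recommended answer

This seems to work. Because no levels are supplied to fct, they are computed from the unique values in order of first appearance, and the rows have already been arranged in the desired order.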
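
To see that behaviour in isolation, here is a minimal sketch on a made-up vector of labels (the labels and counts are hypothetical, not taken from the question's data). It assumes forcats 1.0.0 or later, where fct() is available, and contrasts it with base factor(), which sorts the unique values alphabetically.

```r
library(forcats)

# Toy labels, already in the desired (non-alphabetical) order
labels <- c("EUR\nn=2", "EUR\nn=2", "AFR\nn=1", "SAS\nn=1")

# Base factor() sorts the unique values, so AFR comes first
default_levels <- levels(factor(labels))

# fct() keeps the order of first appearance, so EUR stays first
appearance_levels <- levels(fct(labels))

# Base R equivalent without forcats: pass the unique values as levels
base_levels <- levels(factor(labels, levels = unique(labels)))

print(default_levels)
print(appearance_levels)
```

This is why the arrange() step matters in the code below: fct() only preserves whatever row order it is given.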

df_var = variants_dist %>% group_by(population_ID) %>% summarise(num=n())

### PLOT THE DATA
variants_dist %>%
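  left_join(df_var, by = "population_ID") %>%
  # Sort the rows into the desired population order
  arrange(factor(population_ID, levels = c("AFR", "EUR", "MENA", "SAS", "CEA", "SIB", "OCE", "AME"))) %>%
  mutate(pop_count = paste0(population_ID, "\n", "n=", num)) %>%
  # fct() takes its levels in order of first appearance, so the row order above is kept
  mutate(pop_count = forcats::fct(pop_count)) %>%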
  ggplot(aes(x=pop_count, y=snps, fill=population_ID)) +
  geom_violin(position="dodge", trim=FALSE) +
  geom_boxplot(width=0.07, color="black", alpha=0.6) +
  scale_fill_manual(values=c(EUR="dodgerblue2", MENA="mediumvioletred", SIB="darkkhaki", CEA="firebrick2", AFR="olivedrab2", OCE="powderblue", SAS="darksalmon", AME="plum2")) +
  theme_bw() +
  theme(
    legend.position="none",
  ) +
  xlab("")

Created on 2024-03-19 with reprex v2.1.0
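
As a possible alternative, the ordering and the axis text can be kept separate: map x to population_ID as a factor with explicit levels, and supply the "ID plus count" text through the labels argument of scale_x_discrete(). The sketch below is not part of the original answer; it uses a small made-up data frame (toy, with hypothetical values) and only three populations, so the names toy, pop_order and axis_labels are assumptions for illustration.

```r
library(dplyr)
library(ggplot2)

# Small made-up data set (hypothetical values, for illustration only)
toy <- data.frame(
  population_ID = c("EUR", "EUR", "EUR", "AFR", "AFR", "AFR", "SAS", "SAS", "SAS"),
  snps = c(4.80, 4.85, 4.90, 5.60, 5.70, 5.75, 4.90, 4.95, 5.00)
)

# Desired left-to-right order on the x-axis
pop_order <- c("AFR", "EUR", "SAS")

# Named vector mapping each population to its "ID\nn=count" label
counts <- count(toy, population_ID)
axis_labels <- setNames(
  paste0(counts$population_ID, "\nn=", counts$n),
  counts$population_ID
)

p <- toy %>%
  # The factor levels control the order of the violins
  mutate(population_ID = factor(population_ID, levels = pop_order)) %>%
  ggplot(aes(x = population_ID, y = snps, fill = population_ID)) +
  geom_violin(trim = FALSE) +
  geom_boxplot(width = 0.07, color = "black", alpha = 0.6) +
  # The named vector only changes the displayed text, not the order
  scale_x_discrete(labels = axis_labels) +
  theme_bw() +
  theme(legend.position = "none") +
  xlab("")

print(p)
```

With this layout, the factor levels of population_ID drive the order directly, which is the behaviour the question expected from factor in the first place.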
