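My data comes from an RNA-seq analysis. I am trying to plot CPM against Frequency, just to put the zero-CPM counts for different genes in context.

I have provided a small dataset called dat. From this dataset, I would like to do something like the following.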


# Create an empty list to store ggplot objects
histogram_plots <- list()

# Iterate through the elements
for (element in goi) {
  # Filter data for the specific element
  element_data <- dat[dat$Gene == element, ]
  
  # Create a ggplot histogram for the element
  plot <- ggplot(element_data, aes(x = CPM, y = after_stat(Frequency / sum(Frequency)))) +
    geom_histogram(fill = "cornflowerblue", color = "white", binwidth = 5) +
    labs(title = paste("Frequency Distribution of", element),
         y = "Percent",
         x = "CPM") +
    scale_y_continuous(labels = scales::percent_format()) +
    theme_minimal()
  
  # Add the ggplot object to the list
  histogram_plots[[element]] <- plot
  
  # Save the ggplot object as an image file (adjust the filename and format accordingly)
  ggsave(paste0("histogram_", gsub("-", "_", element), ".png"), plot, width = 8, height = 6)
}

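Error in `geom_histogram()`:
! Problem while mapping stat to aesthetics.
ℹ Error occurred in the 1st layer.
Caused by error:
! object 'Frequency' not found
Run `rlang::last_trace()` to see where the error occurred.

It throws an error saying that Frequency cannot be found, even though it clearly exists in the data. I am not sure whether this is related to NSE or something along those lines.

Data (dat):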


dat <- structure(list(Gene = c("hsa-miR-139-5p", "hsa-miR-139-5p", "hsa-miR-139-5p", 
"hsa-miR-139-5p", "hsa-miR-139-5p", "hsa-miR-139-5p", "hsa-miR-139-5p", 
"hsa-miR-139-5p", "hsa-miR-139-5p", "hsa-miR-139-5p", "hsa-miR-139-5p", 
"hsa-miR-139-5p", "hsa-miR-139-5p", "hsa-miR-139-5p", "hsa-miR-139-5p", 
"hsa-miR-139-5p", "hsa-miR-139-5p", "hsa-miR-139-5p", "hsa-miR-139-5p", 
"hsa-miR-139-5p", "hsa-miR-144-3p", "hsa-miR-144-3p", "hsa-miR-144-3p", 
"hsa-miR-144-3p", "hsa-miR-144-3p", "hsa-miR-144-3p", "hsa-miR-144-3p", 
"hsa-miR-144-3p", "hsa-miR-144-3p", "hsa-miR-144-3p", "hsa-miR-144-3p", 
"hsa-miR-144-3p", "hsa-miR-144-3p", "hsa-miR-144-3p", "hsa-miR-144-3p", 
"hsa-miR-144-3p", "hsa-miR-144-3p", "hsa-miR-144-3p", "hsa-miR-144-3p", 
"hsa-miR-144-3p", "hsa-miR-15a-5p", "hsa-miR-15a-5p", "hsa-miR-15a-5p", 
"hsa-miR-15a-5p", "hsa-miR-15a-5p", "hsa-miR-15a-5p", "hsa-miR-15a-5p", 
"hsa-miR-15a-5p", "hsa-miR-15a-5p", "hsa-miR-15a-5p", "hsa-miR-15a-5p", 
"hsa-miR-15a-5p", "hsa-miR-15a-5p", "hsa-miR-15a-5p", "hsa-miR-15a-5p", 
"hsa-miR-15a-5p", "hsa-miR-15a-5p", "hsa-miR-15a-5p", "hsa-miR-15a-5p", 
"hsa-miR-15a-5p", "hsa-miR-28-3p", "hsa-miR-28-3p", "hsa-miR-28-3p", 
"hsa-miR-28-3p", "hsa-miR-28-3p", "hsa-miR-28-3p", "hsa-miR-28-3p", 
"hsa-miR-28-3p", "hsa-miR-28-3p", "hsa-miR-28-3p", "hsa-miR-28-3p", 
"hsa-miR-28-3p", "hsa-miR-28-3p", "hsa-miR-28-3p", "hsa-miR-28-3p", 
"hsa-miR-28-3p", "hsa-miR-28-3p", "hsa-miR-28-3p", "hsa-miR-28-3p", 
"hsa-miR-28-3p", "hsa-miR-92a-3p", "hsa-miR-92a-3p", "hsa-miR-92a-3p", 
"hsa-miR-92a-3p", "hsa-miR-92a-3p", "hsa-miR-92a-3p", "hsa-miR-92a-3p", 
"hsa-miR-92a-3p", "hsa-miR-92a-3p", "hsa-miR-92a-3p", "hsa-miR-92a-3p", 
"hsa-miR-92a-3p", "hsa-miR-92a-3p", "hsa-miR-92a-3p", "hsa-miR-92a-3p", 
"hsa-miR-92a-3p", "hsa-miR-92a-3p", "hsa-miR-92a-3p", "hsa-miR-92a-3p", 
"hsa-miR-92a-3p"), CPM = c(140, 840, 820, 1740, 100, 460, 40, 
940, 760, 340, 620, 680, 380, 1360, 660, 1440, 180, 360, 720, 
1100, 2000, 0, 4300, 8600, 1200, 9100, 3100, 7300, 5100, 4100, 
3200, 9400, 6600, 2600, 2200, 7800, 2400, 10100, 6900, 8500, 
2600, 450, 2800, 2750, 0, 1400, 1550, 650, 1800, 1300, 3200, 
1750, 2300, 2500, 300, 8050, 600, 1950, 3150, 750, 140, 580, 
380, 500, 570, 740, 30, 490, 410, 250, 340, 540, 550, 260, 350, 
470, 680, 100, 180, 530, 3700, 100, 2900, 2100, 2800, 1700, 6900, 
2200, 2600, 2150, 1300, 300, 350, 3200, 1500, 1650, 2700, 2550, 
1850, 3600), Frequency = c(7, 2, 3, 2, 2, 4, 1, 2, 2, 7, 2, 1, 
5, 1, 3, 1, 5, 1, 3, 2, 3, 21, 8, 1, 5, 1, 4, 2, 4, 3, 2, 1, 
1, 2, 4, 2, 7, 1, 2, 2, 2, 3, 2, 1, 30, 7, 2, 6, 5, 6, 1, 4, 
3, 2, 3, 1, 4, 8, 1, 5, 1, 2, 5, 1, 1, 1, 1, 1, 2, 3, 1, 4, 2, 
1, 4, 1, 1, 4, 3, 2, 1, 3, 1, 3, 1, 4, 1, 2, 1, 1, 3, 3, 9, 1, 
5, 6, 1, 1, 4, 1)), row.names = c(NA, -100L), class = "data.frame")


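Recommended answer

The problem is that inside after_stat() you no longer have access to your original data. The stat transforms the original data first: for example, geom_histogram() uses stat = "bin" by default, which bins the variable mapped to x and computes the count in each bin. These counts are stored in a column named count. after_stat() lets you work with this transformed data once the stat has been applied, but inside after_stat() you can only access columns of the transformed data. So after_stat(count / sum(count)) works, whereas after_stat(Frequency / sum(Frequency)) fails because Frequency is not one of the computed columns.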
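To make the point above concrete, here is a minimal sketch that inspects what the stat actually produces. The data frame toy is a hypothetical stand-in (not the question's dat), and layer_data() is used only to look at the computed columns; treat it as an illustration under those assumptions rather than a fix for the original code.

```r
library(ggplot2)

# Hypothetical toy data with an extra Frequency column, standing in for dat
toy <- data.frame(
  CPM = c(0, 0, 3, 7, 12, 12, 18),
  Frequency = c(5, 1, 2, 3, 1, 4, 2)
)

# Only computed variables such as `count` may appear inside after_stat()
p <- ggplot(toy, aes(x = CPM, y = after_stat(count / sum(count)))) +
  geom_histogram(binwidth = 5)

# layer_data() returns the data after stat = "bin" has run.
# Its columns are what after_stat() can see; Frequency is not among them.
computed <- layer_data(p)
names(computed)
```

The column names printed at the end should include count and density but not Frequency, which is why the original mapping could not find it.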

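Overall, though, your code does not make much sense to me. Since your data already contains the Frequency for each CPM value of each gene, my guess is that you simply want a bar chart of these frequencies, which is easily achieved with geom_col. Also note that when creating ggplots in a loop, it is better to use lapply to avoid any issues caused by lazy evaluation: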

library(ggplot2)

goi <- unique(dat$Gene)
names(goi) <- goi

# Iterate through the elements
histogram_plots <- lapply(goi, \(element) {
  # Filter data for the specific element
  element_data <- dat[dat$Gene == element, ]

  # Create a ggplot bar chart (geom_col) for the element
  plot <- ggplot(element_data, aes(x = CPM, y = Frequency / sum(Frequency))) +
    geom_col(fill = "cornflowerblue", color = "white") +
    labs(
      title = paste("Frequency Distribution of", element),
      y = "Percent",
      x = "CPM"
    ) +
    scale_y_continuous(labels = scales::percent_format()) +
    theme_minimal()

  # Save the ggplot object as an image file (adjust the filename and format accordingly)
  # ggsave(paste0("histogram_", gsub("-", "_", element), ".png"), plot, width = 8, height = 6)

  plot
})

histogram_plots[1:2]
#> $`hsa-miR-139-5p`

#> 
#> $`hsa-miR-144-3p`
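
The remark about lazy evaluation can be illustrated with a small sketch. It assumes a hypothetical data frame toy and a loop variable i that is referenced inside aes(); expressions in aes() are evaluated only when the plot is built, so plots created in a for loop all see the final value of i, while each lapply call keeps its own i.

```r
library(ggplot2)

# Hypothetical toy data, used only for this illustration
toy <- data.frame(x = 1:3, y = c(1, 2, 3))

# for loop: every aes() expression looks up `i` in the same environment
loop_plots <- list()
for (i in 1:2) {
  loop_plots[[i]] <- ggplot(toy, aes(x = x, y = y * i)) + geom_point()
}

# lapply: each call of the anonymous function has its own `i`
lapply_plots <- lapply(1:2, \(i) {
  ggplot(toy, aes(x = x, y = y * i)) + geom_point()
})

# The first loop plot is built with i = 2 (the last loop value),
# while the first lapply plot is built with i = 1
loop_y <- layer_data(loop_plots[[1]])$y
lapply_y <- layer_data(lapply_plots[[1]])$y
```

Comparing loop_y with lapply_y shows the difference: only the lapply version keeps the value of i that was current when the plot was created.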
