df <- data.frame(
  Type = c("A", "B", "C", "D", "E", "F", "G"),
  Value = c("UT", "30", "45", "50", "62", "70", "72"),
  Efficiency = c(70, 72, 80, 88, 90, 92, 98)
)
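Take a step back and consider how the dataset is organized. Columns in a data frame are vectors of a single type, so including "UT" in that column makes it a character column: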
dplyr::glimpse(df)
Rows: 7
Columns: 3
$ Type <chr> "A", "B", "C", "D", "E", "F", "G"
$ Value <chr> "UT", "30", "45", "50", "62", "70", "72"
$ Efficiency <dbl> 70, 72, 80, 88, 90, 92, 98
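To see the coercion in isolation, here is a minimal base R sketch. The vector names `numeric_only` and `mixed` are made up for illustration and are not part of your data:

```r
# Hypothetical vectors, for illustration only
numeric_only <- c(30, 45, 50)
mixed <- c("UT", 30, 45, 50)

class(numeric_only)  # "numeric"
class(mixed)         # "character": one string coerces the whole vector

# Converting back turns the non-numeric entry into NA (normally with a warning)
converted <- suppressWarnings(as.numeric(mixed))
is.na(converted)
```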
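Your `Value` column appears to come from numeric data, but you have the pesky `UT`. In cases like this, I think it is best to make the data frame more verbose so that it describes the situation more accurately: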
# New column to give context to the value column
df$Value_type <- ifelse(df$Value == "UT", "UT", NA)
df$Value[df$Value == "UT"] <- NA
df$Value <- as.numeric(df$Value)
# Categories for the Value column
df <- df |>
  dplyr::mutate(Value_cat = dplyr::case_when(
    30 <= Value & Value < 50 ~ "30-50",
    50 <= Value & Value < 70 ~ "50-70",
    70 <= Value & Value < 90 ~ "70-90",
    Value_type == "UT" ~ "UT",
    .default = NA
  ))
# Set factor levels so that any plots have desired order
df$Value_cat <- factor(df$Value_cat, levels = c(
  "UT", "30-50", "50-70", "70-90"
))
dplyr::glimpse(df)
Rows: 7
Columns: 5
$ Type <chr> "A", "B", "C", "D", "E", "F", "G"
$ Value <dbl> NA, 30, 45, 50, 62, 70, 72
$ Efficiency <dbl> 70, 72, 80, 88, 90, 92, 98
$ Value_type <chr> "UT", NA, NA, NA, NA, NA, NA
$ Value_cat <fct> UT, 30-50, 30-50, 50-70, 50-70, 70-90, 70-90
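Now we are closer to the fabled "tidy" dataset: each row is an individual observation, and each observation has its characteristics well defined in separate variables/columns. You also have convenient access to your `Value` measurements as a numeric vector. Whenever you need to write any kind of logic about numeric properties, it is generally a good idea to avoid converting the numbers to strings.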
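A small sketch of why this matters, using made-up values: comparisons on character vectors work character by character rather than numerically, so range logic like the `case_when()` call above can quietly give wrong answers when applied to strings.

```r
# Hypothetical values, for illustration only
as_text <- c("100", "30", "45")
as_number <- as.numeric(as_text)

# Character comparison: "100" counts as less than "50" because "1" sorts before "5"
as_text < "50"
sort(as_text)

# Numeric comparison behaves as expected
as_number < 50
sort(as_number)
```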
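Because you have a "UT" category, your x-axis is categorical, not numeric. I would suggest a bar chart using the "bins" created from these categories, rather than a histogram where you set a bin width, since histograms are typically used for continuous numeric data.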
library(ggplot2)
# Since your `fill` is not 1:1 with your x-axis, you can set
# the position of the bars in the position argument
ggplot(data = df, mapping = aes(x = Value_cat, y = Efficiency)) +
  geom_col(aes(fill = Type), position = position_dodge())
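As a side note, the same bins could probably also be built with base R's `cut()` instead of `case_when()`. This is only a sketch of that alternative, not what the code above does; the standalone vector `value` copies the numeric `Value` column, and it assumes that every `NA` stands for "UT", which holds for this example only.

```r
# Same numbers as the Value column after conversion (NA is the former "UT")
value <- c(NA, 30, 45, 50, 62, 70, 72)

# right = FALSE gives the intervals [30, 50), [50, 70), [70, 90)
value_cat <- cut(
  value,
  breaks = c(30, 50, 70, 90),
  labels = c("30-50", "50-70", "70-90"),
  right = FALSE
)

# Assumption: every NA means "UT". Rebuild the factor with "UT" as first level.
value_cat <- factor(
  ifelse(is.na(value), "UT", as.character(value_cat)),
  levels = c("UT", "30-50", "50-70", "70-90")
)
value_cat
```

If your real data can contain missing values that are not "UT", keeping a separate `Value_type` column as shown earlier is the safer choice.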