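  • Excel image: the Skill stream top row spans the columns (Employer Sponsored, State/Territory Nominated, Regional, Business Innovation, Global Talent, Skilled Independent, Distinguished Talent, Skilled Regional, November Onshore (should be removed))

  • The Family stream top row spans the columns (Partner, Parent, Child, Other Family)

How can I turn Skill stream and Family stream into a single column, with the subcategories alongside it?

I need to convert this to columns in R. Here is a link to download my data.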

Recommended answer

# load libraries
library(readxl)
library(tidyverse)
library(zoo)

file_path <- "~/Downloads/migration_trends_statistical_package_2021_22.xlsx"

# Pull out the two header rows. On this sheet that means skipping the first
# 5 rows; I don't know why the offset is 5. It will likely differ from sheet
# to sheet, so adjust skip until you get the same two rows.
# The mutate() strips digits from the header text (presumably footnote
# markers) and trims the surrounding whitespace.
categories <- read_excel(path = file_path, sheet = "1.1", col_names = FALSE, skip = 5, n_max = 2) %>% 
                mutate(across(everything(), ~ str_trim(str_remove_all(., "\\d+"))))

# we then turn it into a cleaned long dataframe, from a messy wide one
categories <- data.frame(
    category = na.locf(as.character(categories[1,])), # fill in the blanks with the last non-NA value
    name = as.character(categories[2,])
)

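# read the data rows using the second header row as column names,
# reshape to long format, then attach the stream category to each name
read_excel(path = file_path, sheet = "1.1", skip = 7, col_names = categories[,2]) %>%
  pivot_longer(-Year, values_transform = list(value = as.numeric)) %>%
  left_join(categories, by = "name")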

# A tibble: 468 × 4
   Year    name                                  value category    
   <chr>   <chr>                                 <dbl> <chr>       
 1 2012–13 "Employer Sponsored"                  47740 Skill stream
 2 2012–13 "State/Territory Nominated"           21637 Skill stream
 3 2012–13 "Regional"                               NA Skill stream
 4 2012–13 "Business Innovation and Investment"   7010 Skill stream
 5 2012–13 "Global Talent (Independent)"            NA Skill stream
 6 2012–13 "Skilled Independent"                 44251 Skill stream
 7 2012–13 "Distinguished\r\n Talent"              200 Skill stream
 8 2012–13 "Skilled Regional"                     8132 Skill stream
 9 2012–13 "November\r\nOnshore"                    NA Skill stream
10 2012–13 "Skill stream total"                 128973 Skill stream
# ℹ 458 more rows
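
The same idea can be tried without the Excel file. The sketch below is only an illustration: the header vectors and numbers are made up, and it assumes the top-left header cell is blank, which is why `na.rm = FALSE` is passed to `na.locf()` so the filled vector keeps its full length. It also squishes the embedded line breaks in the names and drops "November Onshore", which the question says should be removed.

```r
library(dplyr)
library(tidyr)
library(stringr)
library(zoo)

# Hypothetical stand-ins for the two header rows (not the real sheet contents).
# The stream name appears only above the first column of each merged block.
top_row    <- c(NA, "Skill stream", NA, NA, "Family stream", NA)
bottom_row <- c("Year", "Employer Sponsored", "Regional",
                "November\r\nOnshore", "Partner", "Parent")

# Lookup table: carry each stream name forward over the blanks beneath it.
# na.rm = FALSE keeps the leading NA so both columns have the same length.
categories <- data.frame(
  category = na.locf(top_row, na.rm = FALSE),
  name     = bottom_row
)

# Made-up wide data using the second header row as column names.
wide <- data.frame(
  c("2012-13", "2013-14"),
  c(10, 12), c(NA, 3), c(NA_real_, NA_real_), c(20, 22), c(5, 6)
)
names(wide) <- bottom_row

long <- wide %>%
  pivot_longer(-Year) %>%                  # one row per Year/name pair
  left_join(categories, by = "name") %>%   # attach the stream to each name
  mutate(name = str_squish(name)) %>%      # collapse "\r\n" into a single space
  filter(name != "November Onshore")       # drop the unwanted column

print(long)
```

The join runs before `str_squish()` so that the names still match the lookup table exactly; cleaning them first would require cleaning `categories$name` the same way.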
