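You don't need rvest at all. The page provides a download button that returns the search results as a csv. The download uses a basic URL-encoded GET syntax, which allows you to create a simple little API: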
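get_clin_trials_data <- function(terms, n = 1000) {
  # Join the search terms with AND, then percent-encode them for the query string
  terms <- URLencode(paste(terms, collapse = " AND "))
  # Request the csv export directly; down_count caps the number of rows returned
  df <- read.csv(paste0(
    "https://clinicaltrials.gov/ct2/results/download_fields",
    "?down_count=", n, "&down_flds=shown&down_fmt=csv",
    "&term=", terms, "&flds=a&flds=b&flds=y"))
  dplyr::as_tibble(df)
}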
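This allows you to pass in a vector of search terms and the maximum number of results to return. There is no need for the kind of complex parsing that web scraping would require.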
get_clin_trials_data(c("nivolumab", "Overall Survival"), n = 10)
#> # A tibble: 10 x 8
#> Rank Title Status Study.Results Conditions Interventions Locations URL
#> <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1 A Study ~ Compl~ No Results A~ Hepatocel~ "" "Bristol~ http~
#> 2 2 Nivoluma~ Activ~ No Results A~ Glioblast~ "Drug: Nivol~ "Duke Un~ http~
#> 3 3 Nivoluma~ Unkno~ No Results A~ Melanoma "Biological:~ "CHU d'A~ http~
#> 4 4 Study of~ Compl~ Has Results Advanced ~ "Biological:~ "Highlan~ http~
#> 5 5 A Study ~ Unkno~ No Results A~ Brain Met~ "Drug: Fotem~ "Medical~ http~
#> 6 6 Trial of~ Compl~ Has Results Squamous ~ "Drug: Nivol~ "Stanfor~ http~
#> 7 7 Nivoluma~ Compl~ No Results A~ MGMT-unme~ "Drug: Nivol~ "New Yor~ http~
#> 8 8 Study of~ Compl~ Has Results Squamous ~ "Biological:~ "Mayo Cl~ http~
#> 9 9 Study of~ Compl~ Has Results Non-Squam~ "Biological:~ "Mayo Cl~ http~
#> 10 10 An Open-~ Unkno~ No Results A~ Squamous-~ "Drug: Nivol~ "IRCCS -~ http~
Created on 2022-06-21 by the reprex package (v2.0.1)
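
If you want to check the request before downloading anything, it may help to build the URL separately. The following is a minimal sketch, not part of the function above: `build_clin_trials_url` is a hypothetical helper name, and passing `reserved = TRUE` to `URLencode` is my own addition, on the assumption that you may want characters such as `&` inside a search term to be escaped rather than read as a parameter separator.

```r
# Hypothetical helper: builds the download URL without fetching it,
# so the encoded query string can be inspected first.
build_clin_trials_url <- function(terms, n = 1000) {
  # Join the terms with AND, then percent-encode the whole expression.
  # reserved = TRUE (an assumption, see above) also escapes characters
  # such as & and = that would otherwise alter the query string.
  query <- utils::URLencode(paste(terms, collapse = " AND "), reserved = TRUE)
  paste0(
    "https://clinicaltrials.gov/ct2/results/download_fields",
    "?down_count=", n, "&down_flds=shown&down_fmt=csv",
    "&term=", query, "&flds=a&flds=b&flds=y"
  )
}

url <- build_clin_trials_url(c("nivolumab", "Overall Survival"), n = 10)
print(url)
```

The spaces in the joined search expression come out as `%20`, so you can paste the printed URL into a browser to see what the server returns before handing it to `read.csv`.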