I have two data.tables:
X = data.table(names = c("a", "a", "b", "b", "c", "c"), years = c("2001", "2002", "2001", "2002", "2001", "2002"), val.1 = 1:6, key = c("names", "years"))
X
| names | years | val.1 |
| -------- | -------- | -------- |
| a | 2001 | 1 |
| a | 2002 | 2 |
| b | 2001 | 3 |
| b | 2002 | 4 |
| c | 2001 | 5 |
| c | 2002 | 6 |
and
X.update = data.table(names = c("a", "b", "b", "c", "d", "d", "d"), years = c("2003", "2002", "2003", "2003", "2001", "2002", "2003"), val.1 = 11:17, key = c("names", "years"))
X.update
| names | years | val.1 |
| -------- | -------- | -------- |
| a | 2003 | 11 |
| b | 2002 | 12 |
| b | 2003 | 13 |
| c | 2003 | 14 |
| d | 2001 | 15 |
| d | 2002 | 16 |
| d | 2003 | 17 |
The task looks natural to me: X.update supersedes the old values (val.1) wherever the key c("names", "years") matches, and adds new entries everywhere else.
Here that means:
- new rows for 2003 for every name
- a new d row for every year (basically, d was added)
- a correction for b in 2002
X.Final
| names | years | val.1 |
| -------- | -------- | -------- |
| a | 2001 | 1 |
| a | 2002 | 2 |
| a | 2003 | 11 | <-added for a for 2003
| b | 2001 | 3 | # edited, R2Evans and Sam are right
| b | 2002 | 12 | <-corrected for b for 2002
| b | 2003 | 13 | <-added for b for 2003
| c | 2001 | 5 |
| c | 2002 | 6 |
| c | 2003 | 14 | <-added for c for 2003
| d | 2001 | 15 | <-added
| d | 2002 | 16 | <-added
| d | 2003 | 17 | <-added
Since I need to apply this to tables with 100,000 rows, I am looking for an idiomatic (= fast) data.table solution.
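To make the intended behaviour concrete, here is a minimal sketch of one way it could be expressed. It assumes that every row of X.update should win over a row of X with the same (names, years) key, and it uses an anti-join followed by rbind. This is only an illustration of the desired result, not necessarily the fastest approach.

```r
library(data.table)

X <- data.table(names = c("a", "a", "b", "b", "c", "c"),
                years = c("2001", "2002", "2001", "2002", "2001", "2002"),
                val.1 = 1:6, key = c("names", "years"))
X.update <- data.table(names = c("a", "b", "b", "c", "d", "d", "d"),
                       years = c("2003", "2002", "2003", "2003", "2001", "2002", "2003"),
                       val.1 = 11:17, key = c("names", "years"))

# Anti-join: keep only the rows of X whose (names, years) key
# does NOT occur in X.update, i.e. the rows that are not superseded.
kept <- X[!X.update, on = c("names", "years")]

# Append every row of X.update (corrections and new entries alike),
# then restore the key so the result is sorted by names and years.
X.Final <- rbind(kept, X.update)
setkey(X.Final, names, years)
X.Final
```

The anti-join drops the old b/2002 row before the update rows are appended, so each key appears exactly once in X.Final.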