Edit 2019: This question was asked prior to changes in data.table in November 2016; see the accepted answer below for both the current and previous methods.
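I have a data.table with about 2.5 million rows. There are two columns. I want to remove any rows that are duplicated in both columns. Previously, for a data.frame, I would have done this: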
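For illustration, here is a minimal sketch of the usual base R idiom for a data.frame. It assumes a hypothetical data.frame named `df` with the columns V1 and V2, and it may not be exactly the call originally used:

```r
# Hypothetical data.frame with the same rows as the example further down
df <- data.frame(
  V1 = c("A", "A", "A", "A", "B", "C", "C", "E", "G", "A"),
  V2 = c("B", "C", "D", "B", "A", "D", "D", "F", "G", "B"),
  stringsAsFactors = FALSE
)

# unique() on a data.frame compares whole rows and keeps the first occurrence
df_unique <- unique(df[, c("V1", "V2")])

# Equivalent form: drop the rows that duplicated() flags
df_unique2 <- df[!duplicated(df[, c("V1", "V2")]), ]
```

Both forms compare complete rows, so a row is dropped only when V1 and V2 both match an earlier row.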
Any suggestions?
Cheers
Example
>dt
V1 V2
[1,] A B
[2,] A C
[3,] A D
[4,] A B
[5,] B A
[6,] C D
[7,] C D
[8,] E F
[9,] G G
[10,] A B
In the above data.table, where V2 is the table key, only rows 4, 7, and 10 would be removed.
> dput(dt)
structure(list(V1 = c("B", "A", "A", "A", "A", "A", "C", "C",
"E", "G"), V2 = c("A", "B", "B", "B", "C", "D", "D", "D", "F",
"G")), .Names = c("V1", "V2"), row.names = c(NA, -10L), class = c("data.table",
"data.frame"), .internal.selfref = <pointer: 0x7fb4c4804578>, sorted = "V2")
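
One possible way to get that result is sketched below. It assumes a data.table release whose `unique()` and `duplicated()` methods accept a `by` argument. The table is rebuilt from the dput() output above, and the names `dt_unique` and `dt_unique2` are only illustrative:

```r
library(data.table)

# Rebuild the example table from the dput() output above
# (the .internal.selfref pointer cannot be recreated, so it is omitted)
dt <- data.table(
  V1 = c("B", "A", "A", "A", "A", "A", "C", "C", "E", "G"),
  V2 = c("A", "B", "B", "B", "C", "D", "D", "D", "F", "G")
)
setkey(dt, V2)  # same key as in the dput() output

# Name both columns explicitly so the key does not decide what counts as a duplicate
dt_unique <- unique(dt, by = c("V1", "V2"))

# Equivalent filter built on duplicated()
dt_unique2 <- dt[!duplicated(dt, by = c("V1", "V2"))]
```

Passing `by` explicitly keeps the call independent of the key. In releases after the November 2016 change mentioned in the edit note, a plain `unique(dt)` should already compare all columns by default, but this is worth checking against the installed version.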