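My question is very similar to:
However, the solution proposed there does not work for me, because a value can appear twice in the same row (once in from and once in to), and I only want to count the rows in which the value appears. I have come up with a solution, but it seems too long: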
> toy_data = data.table(from=c("A","A","A","C","E","E"), to=c("B","C","A","D","F","E"))
> toy_data
   from to
1:    A  B
2:    A  C
3:    A  A
4:    C  D
5:    E  F
6:    E  E
> #get a table with intra-link count
> A = data.table(table(unlist(toy_data[from==to,from ])))
> A
   V1 N
1:  A 1
2:  E 1
> #get a table with total count
> B = data.table(table(unlist(toy_data[,c(from,to)])))
> B
   V1 N
1:  A 4
2:  B 1
3:  C 2
4:  D 1
5:  E 3
6:  F 1
>
> # concatenate changing sign
> table = rbind(B,A[,.(V1,-N)],use.names=FALSE)
> # groupby and subtract
> table[,sum(N),by=V1]
   V1 V1
1:  A  3
2:  B  1
3:  C  2
4:  D  1
5:  E  2
6:  F  1
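Is there some function that does this job in fewer lines? I thought that, as I would in Python, I could concatenate from and to and then use match(), but I can't find the right syntax.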
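For illustration only, the "concatenate from and to" idea could perhaps be sketched as below. This is a minimal sketch, not a confirmed answer: it assumes the data.table package is loaded, and the name `counts` is hypothetical. It stacks the two columns but takes `to` only where it differs from `from`, so a row such as A, A contributes the value once.

```r
library(data.table)

# same toy data as in the question
toy_data <- data.table(from = c("A", "A", "A", "C", "E", "E"),
                       to   = c("B", "C", "A", "D", "F", "E"))

# stack from with those to values that differ from from,
# so each row contributes a given value at most once,
# then count occurrences per value (keyby also sorts by V1)
counts <- toy_data[, .(V1 = c(from, to[to != from]))][, .N, keyby = V1]

print(counts)
```

On the toy data this should reproduce the per-value row counts from the longer approach above (A 3, B 1, C 2, D 1, E 2, F 1).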
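Edit: I know that this works: A=length(toy_data[from=="A"|to=="A",from]), but I would like to avoid looping over the various "A", "B", ... (and I would not know how to format the output that way).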
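As a rough sketch of how the expression from the edit could be applied to every value without writing the loop by hand, and how the result could be collected into a table: the names `vals` and `row_counts` are hypothetical, and `vapply` still iterates internally, so this is mainly a formatting convenience under those assumptions.

```r
library(data.table)

# same toy data as in the question
toy_data <- data.table(from = c("A", "A", "A", "C", "E", "E"),
                       to   = c("B", "C", "A", "D", "F", "E"))

# every distinct value that appears in either column
vals <- sort(unique(c(toy_data$from, toy_data$to)))

# for each value, count the rows where it appears in from or in to
row_counts <- data.table(
  V1 = vals,
  N  = vapply(vals,
              function(v) toy_data[from == v | to == v, .N],
              integer(1), USE.NAMES = FALSE)
)

print(row_counts)
```

Because the condition `from == v | to == v` is evaluated per row, a row where both columns hold the same value is still counted only once.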