Start_Year End_Year Opp1 Opp2 Duration
1500 1501 ['A','B'] ['C','D'] 1
1500 1510 ['P','Q','R'] ['X','Y'] 10
1520 1520 ['A','X'] ['C'] 0
... .... ........ ..... ..
1809 1820 ['M'] ['F','H','Z'] 11
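My dataset (a CSV file) records armed wars between different entities (countries, states, and factions, denoted by capital letters such as A, B, P, Q and given as lists in the Opp1 (opponent) and Opp2 columns). Start_Year and End_Year are the years in which a war started and ended. The Duration column was created by subtracting Start_Year from End_Year.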
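I want to replicate the rows whose Duration is greater than 0 by a factor of the war's duration: if the duration is 6 years, replicate the row 6 times, decreasing Duration by 1 and increasing Start_Year by 1 in each successive copy while keeping the values in the other columns the same. (If the duration is 1 year, the row should appear 2 times, so that after the last step every war's Duration reaches 0. In other words, a war of duration n ends up as n + 1 rows.)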
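I don't know how to go about something like this, as I am a beginner in data science and analytics. Please forgive me for not showing any attempted code here.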
Expected output:
Start_Year End_Year Opp1 Opp2 Duration
1500 1501 ['A','B'] ['C','D'] 1
1501 1501 ['A','B'] ['C','D'] 0
1500 1510 ['P','Q','R'] ['X','Y'] 10
1501 1510 ['P','Q','R'] ['X','Y'] 9
1502 1510 ['P','Q','R'] ['X','Y'] 8
1503 1510 ['P','Q','R'] ['X','Y'] 7
1504 1510 ['P','Q','R'] ['X','Y'] 6
1505 1510 ['P','Q','R'] ['X','Y'] 5
.... .... ............. ........ ..
1510 1510 ['P','Q','R'] ['X','Y'] 0
1520 1520 ['A','X'] ['C'] 0
... .... ........ ..... ..
1809 1820 ['M'] ['F','H','Z'] 11
1810 1820 ['M'] ['F','H','Z'] 10
.... .... ..... .............. ..
1820 1820 ['M'] ['F','H','Z'] 0
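One possible way to get from the input table to the expected output above, as a minimal sketch. It assumes the data is loaded into a pandas DataFrame with the column names shown; the function name `expand_wars` and the small `wars` frame are hypothetical and only serve as illustration.

```python
import pandas as pd


def expand_wars(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per year of each war (hypothetical helper)."""
    # Work on a clean 0..n-1 index so label-based repetition is unambiguous.
    df = df.reset_index(drop=True)
    # Repeat each row Duration + 1 times; a 0-year war stays as a single row.
    out = df.loc[df.index.repeat(df["Duration"] + 1)].copy()
    # Offset 0, 1, 2, ... within the copies of each original row.
    offset = out.groupby(level=0).cumcount().to_numpy()
    out["Start_Year"] = out["Start_Year"] + offset
    # Recompute Duration so it counts down to 0 in the last copy.
    out["Duration"] = out["End_Year"] - out["Start_Year"]
    return out.reset_index(drop=True)


# Illustrative data: the first three rows of the table in the question.
wars = pd.DataFrame({
    "Start_Year": [1500, 1500, 1520],
    "End_Year": [1501, 1510, 1520],
    "Opp1": [["A", "B"], ["P", "Q", "R"], ["A", "X"]],
    "Opp2": [["C", "D"], ["X", "Y"], ["C"]],
})
wars["Duration"] = wars["End_Year"] - wars["Start_Year"]

expanded = expand_wars(wars)
print(expanded)
```

Repeating the index and then adding a per-group counter avoids an explicit Python loop over rows. Whether that matters depends on the size of the real dataset.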
Edit 1: