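I'm trying to dedupe an array of JSON objects that have the same content and author but slightly different timestamps (i.e. within 1 second of each other). I'd like to keep a count of the removed duplicates in a new field called duplicates. For example, consider the following, where entries 2, 3, and 5 are copies of the same message and should be collapsed into one: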
const myObject = [
{content: 'content1', date: '1980-08-01 12:12:40.000', author: 'Person1'},
{content: 'content2', date: '1980-08-01 12:12:40.900', author: 'Person2'},
{content: 'content2', date: '1980-08-01 12:12:41.100', author: 'Person2'},
{content: 'content3', date: '1980-08-01 12:12:41.000', author: 'Person1'},
{content: 'content2', date: '1980-08-01 12:12:41.400', author: 'Person2'},
{content: 'content4', date: '1980-08-01 12:12:45.100', author: 'Person2'},
]
This should be transformed into:
deduped = [
{content: 'content1', date: '1980-08-01 12:12:40.000', author: 'Person1', duplicates: 0},
{content: 'content2', date: '1980-08-01 12:12:40.900', author: 'Person2', duplicates: 2},
{content: 'content3', date: '1980-08-01 12:12:41.000', author: 'Person1', duplicates: 0},
{content: 'content4', date: '1980-08-01 12:12:45.100', author: 'Person2', duplicates: 0},
]
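The problem I'm running into is the datetime. Sorting by datetime and then reducing is error-prone when a non-duplicate message falls between two duplicates. Comparing the string values of the datetimes is also error-prone, because two messages can be only a fraction of a second apart and still show different seconds, depending on where they fall relative to the second boundary.
Using lodash's _.uniqWith, I can dedupe on the combination of identical content, identical author, and an actual time delta, but I'm missing the duplicates field...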
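To make the string-comparison pitfall concrete, here is a minimal sketch using two timestamps from the sample data. The helper toMillis is a hypothetical name, and treating the timestamps as UTC is an assumption made only so the parse is unambiguous.

```javascript
// Hypothetical helper: turn 'YYYY-MM-DD HH:mm:ss.SSS' into epoch milliseconds.
// Assumption: the timestamps are treated as UTC.
const toMillis = (s) => Date.parse(s.replace(' ', 'T') + 'Z');

const a = '1980-08-01 12:12:40.900';
const b = '1980-08-01 12:12:41.100';

// Comparing the strings truncated to whole seconds says the two differ...
const sameSecond = a.slice(0, 19) === b.slice(0, 19);

// ...although the numeric gap is only a fraction of a second.
const deltaMs = Math.abs(toMillis(a) - toMillis(b));

console.log(sameSecond, deltaMs); // false 200
```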
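const _ = require('lodash');
const dedupedButNoCount = _.uniqWith(myObject, (item1, item2) =>
  item1.content === item2.content &&
  item1.author === item2.author &&
  // Math.abs, because the comparator may see the two items in either order;
  // 1000 ms matches the "within 1 second" rule described above.
  Math.abs(new Date(item1.date).getTime() - new Date(item2.date).getTime()) < 1000
);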
Any suggestions on how to dedupe an array of objects whose datetimes are similar but not identical?
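One possible direction, as a rough sketch rather than a definitive answer: walk the array once in its original order, and for each message look for an already-kept message with the same content and author whose timestamp is within the window. If one exists, bump its counter; otherwise keep the message with duplicates set to 0. The names dedupeWithCount, parseTimestamp, and WINDOW_MS are hypothetical, the 1000 ms window is taken from the "within 1 second" rule, and treating the timestamps as UTC is an assumption.

```javascript
const WINDOW_MS = 1000; // assumed threshold: "within 1 second"

// Hypothetical helper; assumes 'YYYY-MM-DD HH:mm:ss.SSS' interpreted as UTC.
const parseTimestamp = (s) => Date.parse(s.replace(' ', 'T') + 'Z');

function dedupeWithCount(messages, windowMs = WINDOW_MS) {
  const kept = [];
  for (const msg of messages) {
    const t = parseTimestamp(msg.date);
    // Compare against the first kept copy, not against the previous duplicate.
    const match = kept.find((k) =>
      k.content === msg.content &&
      k.author === msg.author &&
      Math.abs(parseTimestamp(k.date) - t) < windowMs);
    if (match) {
      match.duplicates += 1; // fold this message into the kept one
    } else {
      kept.push({ ...msg, duplicates: 0 }); // first sighting: keep a copy
    }
  }
  return kept;
}

const messages = [
  {content: 'content1', date: '1980-08-01 12:12:40.000', author: 'Person1'},
  {content: 'content2', date: '1980-08-01 12:12:40.900', author: 'Person2'},
  {content: 'content2', date: '1980-08-01 12:12:41.100', author: 'Person2'},
  {content: 'content3', date: '1980-08-01 12:12:41.000', author: 'Person1'},
  {content: 'content2', date: '1980-08-01 12:12:41.400', author: 'Person2'},
  {content: 'content4', date: '1980-08-01 12:12:45.100', author: 'Person2'},
];

const result = dedupeWithCount(messages);
console.log(result.map((m) => `${m.content}:${m.duplicates}`).join(' '));
// content1:0 content2:2 content3:0 content4:0
```

Because every message is compared against the first kept copy, no sorting is needed and an unrelated message sitting between two duplicates does not break the grouping. The trade-off is a linear scan of the kept list per message, which is fine for small arrays but would want a lookup keyed by content and author for large ones.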