I have documents that each contain an array of objects, and I need to do the following:

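  • Get the total points across all members arrays. In this example, the two documents have 5500 points in total.
  • Next, get the total points of each document's members (2000 for Josh, 3500 for Carl), then compute that as a percentage of the overall total. The formula is (member points / total points) * 100, for example (2000 / 5500) * 100.
  • Finally, return the _id, the leader, the members' points, and the percentage.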

It should look like this:

{
  _id: '00001',
  leader: "Josh",
  memberpoints: 2000,
  percentage: 36.3636
},
{
  _id: '00002',
  leader: "Carl",
  memberpoints: 3500,
  percentage: 63.6363
}

These are the documents:

[{
  "_id": { "$oid": "00001" },
  "leader": "Josh",
  "members": [
    {
      "name": "Person A",
      "points": 500
    },
    {
      "name": "Person B",
      "points": 500
    },
    {
      "name": "Person C",
      "points": 1000
    }]
  },
  {
  "_id": { "$oid": "00002" },
  "leader": "Carl",
  "members": [
    {
      "name": "Person D",
      "points": 1000
    },
    {
      "name": "Person E",
      "points": 1000
    },
    {
      "name": "Person F",
      "points": 1500
    }]
}]
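
As a cross-check on the expected numbers, here is a minimal plain JavaScript sketch (not MongoDB code) that applies the formula above to the sample documents. The helper name memberPoints and the variables docs, total, and expected are made up for this illustration, and the _id values are simplified to plain strings.

```javascript
// Sample documents from the question, reduced to the fields used here.
// _id is simplified to a plain string for this illustration.
const docs = [
  {
    _id: "00001",
    leader: "Josh",
    members: [
      { name: "Person A", points: 500 },
      { name: "Person B", points: 500 },
      { name: "Person C", points: 1000 },
    ],
  },
  {
    _id: "00002",
    leader: "Carl",
    members: [
      { name: "Person D", points: 1000 },
      { name: "Person E", points: 1000 },
      { name: "Person F", points: 1500 },
    ],
  },
];

// Sum the points of one document's members (hypothetical helper).
const memberPoints = (doc) =>
  doc.members.reduce((sum, member) => sum + member.points, 0);

// Grand total across every document.
const total = docs.reduce((sum, doc) => sum + memberPoints(doc), 0);

// (member points / total points) * 100 for each document.
const expected = docs.map((doc) => ({
  _id: doc._id,
  leader: doc.leader,
  memberpoints: memberPoints(doc),
  percentage: (memberPoints(doc) / total) * 100,
}));

console.log(total);
console.log(expected);
```

The percentages printed here are unrounded, so they carry more decimal places than the four shown in the desired output.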

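However, I get the wrong percentage, because it seems to pick up only one object from the members array. It may be related to the $set stage of my aggregation pipeline, specifically [{ $divide: ["$members.points", "$total"] }, 100], but I'm not sure.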

Wrong percentage, should be 36.3636

{
  _id: '00001',
  leader: "Josh",
  memberpoints: 2000,
  percentage: 18.1818
}

Here is my MongoDB aggregation pipeline:

db.users.aggregate([
{$unwind: "$members"}, 
{$setWindowFields: { output: { total: { $sum: "$members.points" }}}},
{$set: { percentage: { $multiply: [{ $divide: ["$members.points", "$total"] }, 100] }}},
{$group: { _id: "$_id", leader: {$first: "$leader"}, memberpoints: {$sum: "$members.points"}, percentage: {$first: "$percentage"}}}])

Recommended answer

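Your approach is almost correct; the $set and $group stages are just in the wrong order. Here is a pipeline that produces the desired output, with some comments: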

db.foo.aggregate([
    {$unwind: "$members"},

    // Great way to get the sum of everything in one pass without having
    // $group get in the way:
    {$setWindowFields: { output: { total: { $sum: "$members.points" }}}},

    // Basically home free.  Now re-group, summing the points.  The trick
    // here is to NOT lose the total amount; we will need it after we $sum.
    // We will call it 'percentage' here but only as a placeholder; we
    // will overwrite it with the REAL percentage in the next stage:
    {$group: {_id: "$_id",
              leader: {$first: "$leader"},
              memberpoints: {$sum: "$members.points"},
              percentage: {$first: "$total"}  // not really pct yet.
             }},

    // Now turn it into REAL pct.  The overwrite trick means we do not
    // have to unset a "temporary" total value.
    {$addFields: {
        percentage: {$multiply: [100, {$divide: ['$memberpoints', '$percentage']}]}
    }}
]);
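
To see why the stage order matters, here is a small plain JavaScript simulation (not MongoDB code) of what each ordering computes for Josh's document. The variable names are made up for the illustration, and the total of 5500 is taken from the question. With the percentage computed before the $group, every unwound row holds a single member's share, so $first can only ever return one of those shares; the 18.1818 reported in the question is consistent with a single 1000-point member.

```javascript
// Grand total from the question, and Josh's member points after $unwind.
const total = 5500;
const joshMemberPoints = [500, 500, 1000];

// Original order ($set before $group): the division runs once per
// unwound row, so each row carries only that one member's share.
const perMemberShare = joshMemberPoints.map((points) => (points / total) * 100);

// Corrected order ($group before the division): sum the member points
// first, then divide the sum by the grand total.
const memberpoints = joshMemberPoints.reduce((sum, points) => sum + points, 0);
const percentage = (memberpoints / total) * 100;

console.log(perMemberShare); // one share per member, none of them the document's share
console.log(memberpoints, percentage);
```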

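Here is a fancier version that avoids the $unwind and the re-$group by letting $reduce sum the member points. Eliminating the $unwind/$group stages is not to be underestimated: on a sample set of 1,000,000 documents, the $unwind/$group pipeline above ran in 13128 ms (average), while the more compact version below needed only 6020 ms, roughly twice as fast: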

db.foo.aggregate([

    {$setWindowFields: {
        output: {
          total: { $sum: {$reduce: {
                         input: "$members",
                         initialValue: 0,
                         in: {$add: [ "$$value", "$$this.points"]}
                     }}
                 }
          }
    }}

    ,{$project: {
        _id: true,
        leader: true,
        memberpoints: {$sum:'$members.points'},
        percentage: {$multiply:[100,{$divide: [{$sum:'$members.points'},'$total']}]}
    }}

]);
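
Both pipelines return the percentage at full floating-point precision, whereas the desired output in the question shows four decimal places. If that matters, one option is an extra final stage using $trunc with a place argument. This is only a sketch: the name truncateStage is hypothetical, and truncation rather than rounding is an assumption inferred from the 63.6363 in the desired output (use $round instead if rounding is wanted). The truncTo helper is plain JavaScript and only mirrors the arithmetic outside MongoDB.

```javascript
// Hypothetical extra stage, to be appended as the last element of either
// pipeline array. Assumes truncation to 4 decimal places is wanted.
const truncateStage = {
  $set: { percentage: { $trunc: ["$percentage", 4] } },
};

// Plain JavaScript equivalent of the truncation, for illustration only.
const truncTo = (value, places) =>
  Math.trunc(value * 10 ** places) / 10 ** places;

console.log(JSON.stringify(truncateStage));
console.log(truncTo((2000 / 5500) * 100, 4));
console.log(truncTo((3500 / 5500) * 100, 4));
```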
