MongoDB: Aggregate $lookup with C#

Posted on May 26

I have the following MongoDB query working:

db.Entity.aggregate(
    [
        {
            "$match":{"Id": "12345"}
        },
        {
            "$lookup": {
                "from": "OtherCollection",
                "localField": "otherCollectionId",
                "foreignField": "Id",
                "as": "ent"
            }
        },
        { 
            "$project": { 
                "Name": 1,
                "Date": 1,
                "OtherObject": { "$arrayElemAt": [ "$ent", 0 ] } 
            }
        },
        { 
            "$sort": { 
                "OtherObject.Profile.Name": 1
            } 
        }
    ]
)

This retrieves a list of objects, each joined with its matching object from another collection.

Does anyone know how I can do this in C#, either with LINQ or by using this exact string?

I tried the code below, but the QueryDocument and MongoCursor types cannot be found. I assume they have been deprecated?

BsonDocument document = MongoDB.Bson.Serialization.BsonSerializer.Deserialize<BsonDocument>("{ name : value }");
QueryDocument queryDoc = new QueryDocument(document);
MongoCursor toReturn = _connectionCollection.Find(queryDoc);

Recommended answer

Setup

Basically we have two collections here:

entities

{ "_id" : ObjectId("5b08ceb40a8a7614c70a5710"), "name" : "A" }
{ "_id" : ObjectId("5b08ceb40a8a7614c70a5711"), "name" : "B" }

And others

{
        "_id" : ObjectId("5b08cef10a8a7614c70a5712"),
        "entity" : ObjectId("5b08ceb40a8a7614c70a5710"),
        "name" : "Sub-A"
}
{
        "_id" : ObjectId("5b08cefd0a8a7614c70a5713"),
        "entity" : ObjectId("5b08ceb40a8a7614c70a5711"),
        "name" : "Sub-B"
}

And a couple of classes to bind them to, just as very basic examples:

public class Entity
{
  public ObjectId id;
  public string name { get; set; }
}

public class Other
{
  public ObjectId id;
  public ObjectId entity { get; set; }
  public string name { get; set; }
}

public class EntityWithOthers
{
  public ObjectId id;
  public string name { get; set; }
  public IEnumerable<Other> others;
}

public class EntityWithOther
{
  public ObjectId id;
  public string name { get; set; }
  public Other others;
}

Queries

Fluent Interface

var listNames = new[] { "A", "B" };

var query = entities.Aggregate()
    .Match(p => listNames.Contains(p.name))
    .Lookup(
      foreignCollection: others,
      localField: e => e.id,
      foreignField: f => f.entity,
      @as: (EntityWithOthers eo) => eo.others
    )
    .Project(p => new { p.id, p.name, other = p.others.First() } )
    .Sort(new BsonDocument("other.name",-1))
    .ToList();

Request sent to the server:

[
  { "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
  { "$lookup" : { 
    "from" : "others",
    "localField" : "_id",
    "foreignField" : "entity",
    "as" : "others"
  } }, 
  { "$project" : { 
    "id" : "$_id",
    "name" : "$name",
    "other" : { "$arrayElemAt" : [ "$others", 0 ] },
    "_id" : 0
  } },
  { "$sort" : { "other.name" : -1 } }
]

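This is probably the easiest form to understand, since the fluent interface is basically the same as the general BSON structure. The $lookup stage takes all the same arguments, and the $arrayElemAt operator is represented by First(). For the $sort you can simply supply a BSON document or any other valid expression.

The alternative is the newer expressive form of $lookup with a sub-pipeline statement, available in MongoDB 3.6 and above.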

BsonArray subpipeline = new BsonArray();

subpipeline.Add(
  new BsonDocument("$match",new BsonDocument(
    "$expr", new BsonDocument(
      "$eq", new BsonArray { "$$entity", "$entity" }  
    )
  ))
);

var lookup = new BsonDocument("$lookup",
  new BsonDocument("from", "others")
    .Add("let", new BsonDocument("entity", "$_id"))
    .Add("pipeline", subpipeline)
    .Add("as","others")
);

var query = entities.Aggregate()
  .Match(p => listNames.Contains(p.name))
  .AppendStage<EntityWithOthers>(lookup)
  .Unwind<EntityWithOthers, EntityWithOther>(p => p.others)
  .SortByDescending(p => p.others.name)
  .ToList();

Request sent to the server:

[ 
  { "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
  { "$lookup" : {
    "from" : "others",
    "let" : { "entity" : "$_id" },
    "pipeline" : [
      { "$match" : { "$expr" : { "$eq" : [ "$$entity", "$entity" ] } } }
    ],
    "as" : "others"
  } },
  { "$unwind" : "$others" },
  { "$sort" : { "others.name" : -1 } }
]

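The fluent "Builder" does not support this syntax directly yet, nor do LINQ expressions support the $expr operator, but you can still construct the stage using BsonDocument and BsonArray or other valid expressions. Here we also "type" the $unwind result so that the $sort can be applied with an expression rather than with a BsonDocument as shown earlier.

Aside from other uses, a primary task of a "sub-pipeline" is to reduce the documents returned in the target array of $lookup. Also, the $unwind here serves the purpose of being "merged" into the $lookup statement on server execution, so this is typically more efficient than just grabbing the first element of the resulting array.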
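To make the "reduce the returned documents" point concrete, here is a minimal sketch that extends the sub-pipeline with a $sort and a $limit so the joined array never holds more than a fixed number of documents. The class name SubPipelineLookup, the method Build and the limit parameter are hypothetical names introduced for illustration; only the BsonDocument and BsonArray calls already used above are assumed.

```csharp
using MongoDB.Bson;

// Hypothetical helper, not part of the driver.
public static class SubPipelineLookup
{
    // Builds a $lookup stage whose sub-pipeline keeps at most `limit`
    // joined documents per parent, ordered by name descending.
    public static BsonDocument Build(int limit)
    {
        var subpipeline = new BsonArray
        {
            // Same correlation condition as the example above.
            new BsonDocument("$match", new BsonDocument(
                "$expr", new BsonDocument(
                    "$eq", new BsonArray { "$$entity", "$entity" }))),
            // Extra stages that shrink the joined array on the server.
            new BsonDocument("$sort", new BsonDocument("name", -1)),
            new BsonDocument("$limit", limit)
        };

        return new BsonDocument("$lookup",
            new BsonDocument("from", "others")
                .Add("let", new BsonDocument("entity", "$_id"))
                .Add("pipeline", subpipeline)
                .Add("as", "others"));
    }
}
```

The returned document can then be passed to the pipeline in the same way as the hand-built stage, for example with .AppendStage<EntityWithOthers>(SubPipelineLookup.Build(1)).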

Queryable GroupJoin

var query = entities.AsQueryable()
    .Where(p => listNames.Contains(p.name))
    .GroupJoin(
      others.AsQueryable(),
      p => p.id,
      o => o.entity,
      (p, o) => new { p.id, p.name, other = o.First() }
    )
    .OrderByDescending(p => p.other.name);

Request sent to the server:

[ 
  { "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
  { "$lookup" : {
    "from" : "others",
    "localField" : "_id",
    "foreignField" : "entity",
    "as" : "o"
  } },
  { "$project" : {
    "id" : "$_id",
    "name" : "$name",
    "other" : { "$arrayElemAt" : [ "$o", 0 ] },
    "_id" : 0
  } },
  { "$sort" : { "other.name" : -1 } }
]

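This is almost identical, just using a different interface, and it produces a slightly different BSON statement, really only because of the simplified naming in the functional statements. It does bring up another possibility, which is simply using the $unwind produced by SelectMany():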

var query = entities.AsQueryable()
  .Where(p => listNames.Contains(p.name))
  .GroupJoin(
    others.AsQueryable(),
    p => p.id,
    o => o.entity,
    (p, o) => new { p.id, p.name, other = o }
  )
  .SelectMany(p => p.other, (p, other) => new { p.id, p.name, other })
  .OrderByDescending(p => p.other.name);

Request sent to the server:

[
  { "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
  { "$lookup" : {
    "from" : "others",
    "localField" : "_id",
    "foreignField" : "entity",
    "as" : "o"
  }},
  { "$project" : {
    "id" : "$_id",
    "name" : "$name",
    "other" : "$o",
    "_id" : 0
  } },
  { "$unwind" : "$other" },
  { "$project" : {
    "id" : "$id",
    "name" : "$name",
    "other" : "$other",
    "_id" : 0
  }},
  { "$sort" : { "other.name" : -1 } }
]

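Normally, placing a $unwind directly after $lookup is an "optimized pattern" for the aggregation framework. However, the .NET driver spoils this in this combination by forcing a $project in between, rather than using the implied naming on the "as". If not for that, this would actually be better than $arrayElemAt when you know you have "one" related result. If you want the $unwind "coalescence", you are better off using the fluent interface, or a different form as demonstrated later.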
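As a sketch of the fluent route mentioned above, the following combines the classic Lookup() from the first example with the typed Unwind() from the second, so that no $project should end up between the two stages. It reuses the classes from the Setup section; the class name LookupUnwindExample and the method Run are hypothetical, and the pipeline in the comment is what I would expect the driver to emit rather than captured output.

```csharp
using System.Collections.Generic;
using System.Linq;
using MongoDB.Bson;
using MongoDB.Driver;

// Same shapes as the classes in the Setup section.
public class Entity { public ObjectId id; public string name { get; set; } }
public class Other
{
    public ObjectId id;
    public ObjectId entity { get; set; }
    public string name { get; set; }
}
public class EntityWithOthers
{
    public ObjectId id;
    public string name { get; set; }
    public IEnumerable<Other> others;
}
public class EntityWithOther
{
    public ObjectId id;
    public string name { get; set; }
    public Other others;
}

// Hypothetical wrapper for illustration.
public static class LookupUnwindExample
{
    public static List<EntityWithOther> Run(
        IMongoCollection<Entity> entities, IMongoCollection<Other> others)
    {
        var listNames = new[] { "A", "B" };

        // Expected stages: $match, $lookup, $unwind "$others", $sort.
        return entities.Aggregate()
            .Match(p => listNames.Contains(p.name))
            .Lookup(
                foreignCollection: others,
                localField: e => e.id,
                foreignField: f => f.entity,
                @as: (EntityWithOthers eo) => eo.others)
            .Unwind<EntityWithOthers, EntityWithOther>(p => p.others)
            .SortByDescending(p => p.others.name)
            .ToList();
    }
}
```

Because the $unwind follows the $lookup directly and targets its "as" field, the server is free to merge the two stages.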

Queryable Natural

var query = from p in entities.AsQueryable()
            where listNames.Contains(p.name) 
            join o in others.AsQueryable() on p.id equals o.entity into joined
            select new { p.id, p.name, other = joined.First() }
            into p
            orderby p.other.name descending
            select p;

Request sent to the server:

[
  { "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
  { "$lookup" : {
    "from" : "others",
    "localField" : "_id",
    "foreignField" : "entity",
    "as" : "joined"
  } },
  { "$project" : {
    "id" : "$_id",
    "name" : "$name",
    "other" : { "$arrayElemAt" : [ "$joined", 0 ] },
    "_id" : 0
  } },
  { "$sort" : { "other.name" : -1 } }
]

All of this is pretty familiar and really just comes down to functional naming. The same goes for the $unwind option:

var query = from p in entities.AsQueryable()
            where listNames.Contains(p.name) 
            join o in others.AsQueryable() on p.id equals o.entity into joined
            from sub_o in joined.DefaultIfEmpty()
            select new { p.id, p.name, other = sub_o }
            into p
            orderby p.other.name descending
            select p;

Request sent to the server:

[ 
  { "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
  { "$lookup" : {
    "from" : "others",
    "localField" : "_id",
    "foreignField" : "entity",
    "as" : "joined"
  } },
  { "$unwind" : { 
    "path" : "$joined", "preserveNullAndEmptyArrays" : true
  } }, 
  { "$project" : { 
    "id" : "$_id",
    "name" : "$name",
    "other" : "$joined",
    "_id" : 0
  } }, 
  { "$sort" : { "other.name" : -1 } }
]

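This one actually uses the "optimized coalescence" form. The translator still insists on adding a $project, since we need the intermediate select in order to make the statement valid.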

Summary

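So there are quite a few ways to arrive at essentially the same query statement with exactly the same results. While you "could" parse the JSON into BsonDocument form and feed it to the fluent Aggregate() command, it is generally better to use the natural builders or the LINQ interfaces, since they map easily onto the same statement.

The options with $unwind are shown largely because, even with a "singular" match, the "coalescence" form is actually far more optimal than using $arrayElemAt to take the "first" array element. This becomes even more important given things like the BSON limit, where the $lookup target array could push the parent document past 16MB without further filtering. There is another post, "Aggregate $lookup Total size of documents in matching pipeline exceeds maximum document size", where I discuss how to avoid hitting that limit by using such options or the other Lookup() syntax, which is currently available only in the fluent interface.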
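For completeness, here is a minimal sketch of the "parse the JSON" route, which is the closest match to the "exact string" asked for in the question. The connection string, the database name "test" and the class name RawPipelineExample are assumptions for illustration; the collection names follow the Setup section.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;

// Hypothetical wrapper for illustration.
public static class RawPipelineExample
{
    public static void Main()
    {
        // Assumption: a local server and a database named "test".
        var client = new MongoClient("mongodb://localhost:27017");
        var database = client.GetDatabase("test");
        var entities = database.GetCollection<BsonDocument>("entities");

        // Each stage is parsed from its JSON text. A BsonDocument[]
        // converts implicitly to a PipelineDefinition.
        PipelineDefinition<BsonDocument, BsonDocument> pipeline = new[]
        {
            BsonDocument.Parse(
                "{ '$match': { 'name': { '$in': [ 'A', 'B' ] } } }"),
            BsonDocument.Parse(
                @"{ '$lookup': {
                      'from': 'others',
                      'localField': '_id',
                      'foreignField': 'entity',
                      'as': 'others' } }"),
            BsonDocument.Parse("{ '$unwind': '$others' }"),
            BsonDocument.Parse("{ '$sort': { 'others.name': -1 } }")
        };

        // Results come back untyped, one BsonDocument per joined pair.
        var results = entities.Aggregate(pipeline).ToList();

        foreach (var doc in results)
        {
            Console.WriteLine(doc.ToJson());
        }
    }
}
```

The trade-off is that nothing here is checked at compile time: a misspelled field name only shows up as an empty or wrong result at run time, which is one reason to prefer the builders or LINQ.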
