Summary:

  • 包含10k个项目的测试数据,每个项目5个注释,每个注释100k个字符的唯一文本
  • 与原生SQL相比,实体框架核心运行查询的速度最多慢10倍
  • 全文搜索的列是字符串类型
  • 我不使用延迟加载
  • try 了全文列的不同方法
  • 有时速度要快10倍
  • 我总是等到全文索引完成(状态为0)

我运行一个半复杂的查询,在我的数据库上使用实体框架core(7.0.5)执行全文搜索:

BaseEntity型号:

public class BaseEntity
{
    // Main data
    [Key]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    public string SearchKeywords { get; set; }

    // Audit data
    [ForeignKey("TeamId")]
    public virtual Team Team { get; set; }
    public string TeamId { get; set; }

    public DateTime? CreatedDate { get; set; }
    [ForeignKey("CreatedByUserId")]
    public virtual ApplicationUser CreatedByUser { get; set; }
    public string CreatedByUserId { get; set; }

    //...
}

Project型号班级:

[Table("Projects")]
public class Project : BaseEntity
{
    // Main data
    public string ProjectName { get; set; }
    public string ProjectDescription { get; set; }
    public string ProjectLocation { get; set; }
    public DateTime? ProjectStart { get; set; }
    public DateTime? ProjectEnd { get; set; }
    //...

    // Computed columns (see ApplicationDbContext)
    [DatabaseGenerated(DatabaseGeneratedOption.Computed)]
    public string MainProperty { get; set; }

    [DatabaseGenerated(DatabaseGeneratedOption.Computed)]
    public string FullText { get; set; }

    // Foreign data
    [ForeignKey("ProjectStatusId")]
    public virtual DropDown ProjectStatus { get; set; }
    public string ProjectStatusId { get; set; }

    [ForeignKey("CustomerId")]
    public virtual Customer Customer { get; set; }
    public string CustomerId { get; set; }

    [ForeignKey("CustomerContactId")]
    public virtual CustomerContact CustomerContact { get; set; }
    public string CustomerContactId { get; set; }

    [ForeignKey("AccountOwnerId")]
    public virtual ApplicationUser AccountOwner { get; set; }
    public string AccountOwnerId { get; set; }

    // Foreign data collections
    public virtual IEnumerable<Comment> Comments { get; set; }
    public virtual IEnumerable<UploadedFile> UploadedFiles { get; set; }
    public virtual IEnumerable<ProjectCandidate> ProjectCandidates { get; set; }
}

OnModelCreating:

protected override void OnModelCreating(ModelBuilder builder)
{
    // ...
    builder.Entity<Project>().Property(p => p.FullText).HasComputedColumnSql("CONCAT([SearchKeywords], ' ', [ProjectName], ' ', [ProjectDescription], ' ', [ProjectLocation])");
    // ...
}

GetProjects方法:

private void GetProjects(ApplicationDbContext context, string FullTextFilter)
{
    var projects = context.Projects.Where(x => x.TeamId == "85216daa-a05f-4c0a-92c4-0920f8349370").AsNoTracking();

    projects = projects.Where(x =>
         EF.Functions.Contains(x.FullText, FullTextFilter) ||
         EF.Functions.Contains(x.Customer.FullText, FullTextFilter) ||
         EF.Functions.Contains(x.CustomerContact.FullText, FullTextFilter) ||
         EF.Functions.Contains(x.AccountOwner.FullText, FullTextFilter) ||
         x.UploadedFiles.Any(c => EF.Functions.Contains(c.FullText, FullTextFilter)) ||
         x.Comments.Any(c => EF.Functions.Contains(c.FullText, FullTextFilter))
         );

    projects = projects.OrderBy(x => x.ProjectName);

    projects = projects.Skip(0);
    projects = projects.Take(20);

    var projectsDto = projects.Select(x => new ProjectOverviewDto
    {
         Id = x.Id,
         ProjectName = x.ProjectName,
         ProjectDescription = x.ProjectDescription,
         ProjectLocation = x.ProjectLocation,
         
         ProjectStatus = new DropDownMinimalDto { Name = x.ProjectStatus.Name },
         Customer = new CustomerMinimalDto { CompanyName = x.Customer.CompanyName },
         CustomerContact = new CustomerContactMinimalDto { LastName = x.CustomerContact.LastName },
         AccountOwner = new ApplicationUserMinimalDto { LastName = x.AccountOwner.LastName }
    });

    var projectsList = projectsDto.ToList();
}

我像这样运行此函数(仅限测试):

GetProjects(context, "\"*def*\"");

执行时间变化很大,我已经得到了一些幸运的时刻,大约500毫秒,但大多数时间需要10,000毫秒.我的一些其他测试耗时长达20,000毫秒.

生成的SQL如下所示(在SQL Server事件探查器中可以看到):

exec sp_executesql N'SELECT [t].[Id], [t].[ProjectName], [t].[ProjectDescription], [t].[ProjectLocation], [d].[Name], [t].[CompanyName], [t].[LastName], [t].[LastName0] AS [LastName]
FROM (
    SELECT [p].[Id], [p].[ProjectDescription], [p].[ProjectLocation], [p].[ProjectName], [p].[ProjectStatusId], [c].[CompanyName], [c0].[LastName], [a].[LastName] AS [LastName0]
    FROM [Projects] AS [p]
    LEFT JOIN [Customers] AS [c] ON [p].[CustomerId] = [c].[Id]
    LEFT JOIN [CustomerContacts] AS [c0] ON [p].[CustomerContactId] = [c0].[Id]
    LEFT JOIN [AspNetUsers] AS [a] ON [p].[AccountOwnerId] = [a].[Id]
    WHERE [p].[TeamId] = N''85216daa-a05f-4c0a-92c4-0920f8349370'' AND (CONTAINS([p].[FullText], @__FullTextFilter_1) OR CONTAINS([c].[FullText], @__FullTextFilter_1) OR CONTAINS([c0].[FullText], @__FullTextFilter_1) OR CONTAINS([a].[FullText], @__FullTextFilter_1) OR EXISTS (
        SELECT 1
        FROM [UploadedFiles] AS [u]
        WHERE [p].[Id] = [u].[ProjectId] AND CONTAINS([u].[FullText], @__FullTextFilter_1)) OR EXISTS (
        SELECT 1
        FROM [Comments] AS [c1]
        WHERE [p].[Id] = [c1].[ProjectId] AND CONTAINS([c1].[FullText], @__FullTextFilter_1)))
    ORDER BY [p].[ProjectName]
    OFFSET @__p_2 ROWS FETCH NEXT @__p_3 ROWS ONLY
) AS [t]
LEFT JOIN [DropDowns] AS [d] ON [t].[ProjectStatusId] = [d].[Id]
ORDER BY [t].[ProjectName]',N'@__FullTextFilter_1 nvarchar(4000),@__p_2 int,@__p_3 int',@__FullTextFilter_1=N'"*def*"',@__p_2=0,@__p_3=20

当我执行这段SQL代码时,执行时间大约为460ms(与我使用旧的SqlDataReader和生成的SQL查询时相同.我不明白为什么同样的查询在EF中运行得这么慢.我已经测试过了:

  • 全文列作为全文索引的计算列(如上面的代码)
  • 全文列作为普通字符串列(我复制了该列中的所有文本,因此其内容与计算列相同)
  • 单个列上的全文索引(如SearchKeyword、ProjectName、ProjectDescription等...)
  • EF.Functions.FreeText个而不是EF.Functions.Contains
  • 通过services.AddDbContext<ApplicationDbContext>(options => options.UseLazyLoadingProxies(false).UseSqlServer("xxx"));禁用了延迟加载(但在我的测试用例中,无论是启用还是禁用,数据中都没有代理)
  • 等待一段时间的"热身"

你知道这里出了什么问题吗?我知道EF会增加一些开销,但这太多了.

推荐答案

谢谢你为我指明了正确的方向.Parameter Sniffing是这里的关键词.

我测试了上面提到的RecompileExtension,它给了我一个很好的结果(大多数情况下大约600毫秒),但有时仍然大约是2000毫秒而不是500毫秒.

在我的例子中,禁用整个数据库的参数嗅探(目前)效果最好:

ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SNIFFING = OFF;

现在,我所有的Free Text EF查询运行时间约为500ms,几乎是原生SQL查询的100%.

Csharp相关问答推荐

Plotly.NET访问互联网时出现异常

找不到网址:https://localhost:7002/Category/Add?区域= Admin.为什么我的URL是这样生成的?area = Admin而不是/Admin/

如何使用XmlSerializer反序列化字符串数组?

. net依赖注入如何避免服务类中的新

如何将Kafka消息时间戳转换为C#中的日期和时间格式?

从Blob存储中提取tar.gz文件并将提取结果上载到另一个Blob存储

WeakReference未被垃圾收集

在命名管道上使用GRPC ASP.NET核心时如何配置命名管道权限

由于POST中的应用程序/JWT,出现不支持的内容类型异常

取决于您的数据量的多个嵌套循环

如何在用户在线时限制令牌生成?

在被Interactive Server切换后,Blazor SSR页面无法正确加载JS

在使用AWS SDK for.NET时,如何判断S3是否已验证我的对象的校验和?

为什么Swashbakle/Swagger在参数中包含变量名?

Visual Studio如何使用当前的框架?

如何根据分割文本的块数来计算文本的大小?

这是T自身的布尔表达式是什么意思?

异步等待,如何在Windows窗体中使用它们?

如何提高C#中比较大 struct 的性能?

初始化具有EntityFrameworkCore和不同架构的数据库时引发System.MissingMethodExcept