Summary:个
- 包含10k个项目的测试数据,每个项目5个注释,每个注释100k个字符的唯一文本
- 与原生SQL相比,实体框架核心运行查询的速度最多慢10倍
- 全文搜索的列是字符串类型
- 我不使用延迟加载
- try 了全文列的不同方法
- 有时速度要快10倍
- 我总是等到全文索引完成(状态为0)
我运行一个半复杂的查询,在我的数据库上使用实体框架core(7.0.5)执行全文搜索:
BaseEntity
型号:
public class BaseEntity
{
// Main data
[Key]
public string Id { get; set; } = Guid.NewGuid().ToString();
public string SearchKeywords { get; set; }
// Audit data
[ForeignKey("TeamId")]
public virtual Team Team { get; set; }
public string TeamId { get; set; }
public DateTime? CreatedDate { get; set; }
[ForeignKey("CreatedByUserId")]
public virtual ApplicationUser CreatedByUser { get; set; }
public string CreatedByUserId { get; set; }
//...
}
Project
型号班级:
[Table("Projects")]
public class Project : BaseEntity
{
// Main data
public string ProjectName { get; set; }
public string ProjectDescription { get; set; }
public string ProjectLocation { get; set; }
public DateTime? ProjectStart { get; set; }
public DateTime? ProjectEnd { get; set; }
//...
// Computed columns (see ApplicationDbContext)
[DatabaseGenerated(DatabaseGeneratedOption.Computed)]
public string MainProperty { get; set; }
[DatabaseGenerated(DatabaseGeneratedOption.Computed)]
public string FullText { get; set; }
// Foreign data
[ForeignKey("ProjectStatusId")]
public virtual DropDown ProjectStatus { get; set; }
public string ProjectStatusId { get; set; }
[ForeignKey("CustomerId")]
public virtual Customer Customer { get; set; }
public string CustomerId { get; set; }
[ForeignKey("CustomerContactId")]
public virtual CustomerContact CustomerContact { get; set; }
public string CustomerContactId { get; set; }
[ForeignKey("AccountOwnerId")]
public virtual ApplicationUser AccountOwner { get; set; }
public string AccountOwnerId { get; set; }
// Foreign data collections
public virtual IEnumerable<Comment> Comments { get; set; }
public virtual IEnumerable<UploadedFile> UploadedFiles { get; set; }
public virtual IEnumerable<ProjectCandidate> ProjectCandidates { get; set; }
}
OnModelCreating
:
protected override void OnModelCreating(ModelBuilder builder)
{
// ...
builder.Entity<Project>().Property(p => p.FullText).HasComputedColumnSql("CONCAT([SearchKeywords], ' ', [ProjectName], ' ', [ProjectDescription], ' ', [ProjectLocation])");
// ...
}
GetProjects
方法:
private void GetProjects(ApplicationDbContext context, string FullTextFilter)
{
var projects = context.Projects.Where(x => x.TeamId == "85216daa-a05f-4c0a-92c4-0920f8349370").AsNoTracking();
projects = projects.Where(x =>
EF.Functions.Contains(x.FullText, FullTextFilter) ||
EF.Functions.Contains(x.Customer.FullText, FullTextFilter) ||
EF.Functions.Contains(x.CustomerContact.FullText, FullTextFilter) ||
EF.Functions.Contains(x.AccountOwner.FullText, FullTextFilter) ||
x.UploadedFiles.Any(c => EF.Functions.Contains(c.FullText, FullTextFilter)) ||
x.Comments.Any(c => EF.Functions.Contains(c.FullText, FullTextFilter))
);
projects = projects.OrderBy(x => x.ProjectName);
projects = projects.Skip(0);
projects = projects.Take(20);
var projectsDto = projects.Select(x => new ProjectOverviewDto
{
Id = x.Id,
ProjectName = x.ProjectName,
ProjectDescription = x.ProjectDescription,
ProjectLocation = x.ProjectLocation,
ProjectStatus = new DropDownMinimalDto { Name = x.ProjectStatus.Name },
Customer = new CustomerMinimalDto { CompanyName = x.Customer.CompanyName },
CustomerContact = new CustomerContactMinimalDto { LastName = x.CustomerContact.LastName },
AccountOwner = new ApplicationUserMinimalDto { LastName = x.AccountOwner.LastName }
});
var projectsList = projectsDto.ToList();
}
我像这样运行此函数(仅限测试):
GetProjects(context, "\"*def*\"");
执行时间变化很大,我已经得到了一些幸运的时刻,大约500毫秒,但大多数时间需要10,000毫秒.我的一些其他测试耗时长达20,000毫秒.
生成的SQL如下所示(在SQL Server事件探查器中可以看到):
exec sp_executesql N'SELECT [t].[Id], [t].[ProjectName], [t].[ProjectDescription], [t].[ProjectLocation], [d].[Name], [t].[CompanyName], [t].[LastName], [t].[LastName0] AS [LastName]
FROM (
SELECT [p].[Id], [p].[ProjectDescription], [p].[ProjectLocation], [p].[ProjectName], [p].[ProjectStatusId], [c].[CompanyName], [c0].[LastName], [a].[LastName] AS [LastName0]
FROM [Projects] AS [p]
LEFT JOIN [Customers] AS [c] ON [p].[CustomerId] = [c].[Id]
LEFT JOIN [CustomerContacts] AS [c0] ON [p].[CustomerContactId] = [c0].[Id]
LEFT JOIN [AspNetUsers] AS [a] ON [p].[AccountOwnerId] = [a].[Id]
WHERE [p].[TeamId] = N''85216daa-a05f-4c0a-92c4-0920f8349370'' AND (CONTAINS([p].[FullText], @__FullTextFilter_1) OR CONTAINS([c].[FullText], @__FullTextFilter_1) OR CONTAINS([c0].[FullText], @__FullTextFilter_1) OR CONTAINS([a].[FullText], @__FullTextFilter_1) OR EXISTS (
SELECT 1
FROM [UploadedFiles] AS [u]
WHERE [p].[Id] = [u].[ProjectId] AND CONTAINS([u].[FullText], @__FullTextFilter_1)) OR EXISTS (
SELECT 1
FROM [Comments] AS [c1]
WHERE [p].[Id] = [c1].[ProjectId] AND CONTAINS([c1].[FullText], @__FullTextFilter_1)))
ORDER BY [p].[ProjectName]
OFFSET @__p_2 ROWS FETCH NEXT @__p_3 ROWS ONLY
) AS [t]
LEFT JOIN [DropDowns] AS [d] ON [t].[ProjectStatusId] = [d].[Id]
ORDER BY [t].[ProjectName]',N'@__FullTextFilter_1 nvarchar(4000),@__p_2 int,@__p_3 int',@__FullTextFilter_1=N'"*def*"',@__p_2=0,@__p_3=20
当我执行这段SQL代码时,执行时间大约为460ms(与我使用旧的SqlDataReader
和生成的SQL查询时相同.我不明白为什么同样的查询在EF中运行得这么慢.我已经测试过了:
- 全文列作为全文索引的计算列(如上面的代码)
- 全文列作为普通字符串列(我复制了该列中的所有文本,因此其内容与计算列相同)
- 单个列上的全文索引(如SearchKeyword、ProjectName、ProjectDescription等...)
-
EF.Functions.FreeText
个而不是EF.Functions.Contains
个 - 通过
services.AddDbContext<ApplicationDbContext>(options => options.UseLazyLoadingProxies(false).UseSqlServer("xxx"));
禁用了延迟加载(但在我的测试用例中,无论是启用还是禁用,数据中都没有代理) - 等待一段时间的"热身"
你知道这里出了什么问题吗?我知道EF会增加一些开销,但这太多了.