I have a basic background class in an otherwise empty ASP.NET Core 8 Minimal API project.

The application startup is just:

builder.Services.AddHttpClient();
builder.Services.AddHostedService<SteamAppListDumpService>();

The background class saves snapshots of a Steam API endpoint. It is all basic stuff, as follows:

public class SteamAppListDumpService : BackgroundService
{
    static TimeSpan RepeatDelay = TimeSpan.FromMinutes(30);
    private readonly IHttpClientFactory _httpClientFactory;

    private string GetSteamKey() => "...";

    private string GetAppListUrl(int? lastAppId = null)
    {
        return $"https://api.steampowered.com/IStoreService/GetAppList/v1/?key={GetSteamKey()}" +
            (lastAppId.HasValue ? $"&last_appid={lastAppId}" : "");
    }

    public SteamAppListDumpService(IHttpClientFactory httpClientFactory)
    {
        _httpClientFactory = httpClientFactory;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            await DumpAppList();
            await Task.Delay(RepeatDelay, stoppingToken);
        }
    }

    public record SteamApiGetAppListApp(int appid, string name, int last_modified, int price_change_number);
    public record SteamApiGetAppListResponse(List<SteamApiGetAppListApp> apps, bool have_more_results, int last_appid);
    public record SteamApiGetAppListOuterResponse(SteamApiGetAppListResponse response);

    protected async Task DumpAppList()
    {
        try
        {
            var httpClient = _httpClientFactory.CreateClient();
            var appList = new List<SteamApiGetAppListApp>();
            int? lastAppId = null;
            do
            {
                using var response = await httpClient.GetAsync(GetAppListUrl(lastAppId));
                if (!response.IsSuccessStatusCode) throw new Exception($"API Returned Invalid Status Code: {response.StatusCode}");

                var responseString = await response.Content.ReadAsStringAsync();
                var responseObject = JsonSerializer.Deserialize<SteamApiGetAppListOuterResponse>(responseString)!.response;
                appList.AddRange(responseObject.apps);
                lastAppId = responseObject.have_more_results ? responseObject.last_appid : null;

            } while (lastAppId != null);

            var contentBytes = JsonSerializer.SerializeToUtf8Bytes(appList);
            using var output = File.OpenWrite(Path.Combine(Config.DumpDataPath, DateTime.UtcNow.ToString("yyyy-MM-dd__HH-mm-ss") + ".json.gz"));
            using var gz = new GZipStream(output, CompressionMode.Compress);
            gz.Write(contentBytes, 0, contentBytes.Length);
        }
        catch (Exception ex)
        {
            Trace.TraceError("skipped...");
        }
    }
}

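The API returns about 16 MB of data in total, which is then compressed and saved as a 4 MB file every 30 minutes. Between runs, once the garbage collector has run, I would expect memory consumption to drop to almost nothing, but instead it grows over time. For example, it has been running on my PC for 2 hours and is consuming 700 MB of memory. On my server it has been running for 24 hours and is now taking up 2.5 GB of memory.

As far as I can tell, all the streams are being disposed, and the HttpClient is created using the recommended IHttpClientFactory. Does anyone know why this basic functionality consumes so much memory even after garbage collection? I tried looking at it in the VS managed memory dump but could not find much that was useful. Does this point to a memory leak in one of the classes (i.e. HttpClient / SerializeToUtf8Bytes), or am I missing something?

The responseString and contentBytes are usually around 2 MB in size.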

Recommended answer

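Any time you allocate a contiguous block of memory of 85,000 bytes or more, it goes on the large object heap (LOH). Unlike the regular heap, the LOH is not compacted unless you do so manually[1], so it can grow due to fragmentation and end up looking like a memory leak.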
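
To see the threshold in action, here is a minimal console sketch (not part of the original code; the variable names are illustrative). It relies on GC.GetGeneration reporting large-object-heap allocations as generation 2, and on the assumption that index 3 of GCMemoryInfo.GenerationInfo corresponds to the LOH (after gen0 to gen2) on .NET 5 and later. Treat the numbers it prints as a rough indication only.

```csharp
using System;

// A small array is allocated on the regular (small object) heap,
// while an array of 85,000 bytes or more goes straight to the LOH.
byte[] small = new byte[1_000];
byte[] large = new byte[2 * 1024 * 1024]; // roughly the size of responseString / contentBytes

Console.WriteLine($"small array generation: {GC.GetGeneration(small)}");
Console.WriteLine($"large array generation: {GC.GetGeneration(large)}");

// Force a collection so that GetGCMemoryInfo() has data from a completed GC.
GC.Collect();

GCMemoryInfo info = GC.GetGCMemoryInfo();
// Assumption: index 3 is the large object heap (0-2 are gen0-gen2).
GCGenerationInfo loh = info.GenerationInfo[3];
Console.WriteLine($"LOH size after last GC:          {loh.SizeAfterBytes:N0} bytes");
Console.WriteLine($"LOH fragmentation after last GC: {loh.FragmentationAfterBytes:N0} bytes");

// Keep both arrays reachable until the end of the demo.
GC.KeepAlive(small);
GC.KeepAlive(large);
```

Logging the same two LOH figures after each DumpAppList() run in your service would give a rough idea of whether the large object heap is what keeps growing.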

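Since your responseString and contentBytes are usually around 2 MB each, I suggest rewriting your code to eliminate them. Instead, stream asynchronously from the server and to the JSON file using the relevant built-in APIs, like so: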

const int BufferSize = 16384;
const bool UseAsyncFileStreams = true; //https://learn.microsoft.com/en-us/dotnet/api/system.io.filestream.-ctor?view=net-5.0#System_IO_FileStream__ctor_System_String_System_IO_FileMode_System_IO_FileAccess_System_IO_FileShare_System_Int32_System_Boolean_

protected async Task DumpAppList()
{
    try
    {
        var httpClient = _httpClientFactory.CreateClient();
        var appList = new List<SteamApiGetAppListApp>();
        int? lastAppId = null;
        do
        {
            // Get the SteamApiGetAppListOuterResponse directly from JSON using HttpClientJsonExtensions.GetFromJsonAsync() without the intermediate string.
            // https://learn.microsoft.com/en-us/dotnet/api/system.net.http.json.httpclientjsonextensions.getfromjsonasync
            // If you need customized error handling see 
            // https://stackoverflow.com/questions/65383186/using-httpclient-getfromjsonasync-how-to-handle-httprequestexception-based-on
            var responseObject = (await httpClient.GetFromJsonAsync<SteamApiGetAppListOuterResponse>(GetAppListUrl(lastAppId)))
                !.response;
            appList.AddRange(responseObject.apps);
            lastAppId = responseObject.have_more_results ? responseObject.last_appid : null;

        } while (lastAppId != null);

        await using var output = new FileStream(Path.Combine(Config.DumpDataPath, DateTime.UtcNow.ToString("yyyy-MM-dd__HH-mm-ss") + ".json.gz"),
                                                FileMode.Create, FileAccess.Write, FileShare.None, bufferSize: BufferSize, useAsync: UseAsyncFileStreams);
        await using var gz = new GZipStream(output, CompressionMode.Compress);
        // See https://faithlife.codes/blog/2012/06/always-wrap-gzipstream-with-bufferedstream/ for a discussion of buffer sizes vs compression ratios.
        await using var buffer = new BufferedStream(gz, BufferSize);
        // Serialize directly to the buffered, compressed output stream without the intermediate in-memory array.
        await JsonSerializer.SerializeAsync(buffer, appList);
    }
    catch (Exception ex)
    {
        Trace.TraceError("skipped...");
    }
}

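Notes:

  • GZipStream does not buffer its input, so streaming data to it incrementally may result in a worse compression ratio. However, as discussed by Bradley Grainger in Always wrap GZipStream with BufferedStream, buffering the incremental writes with a buffer of 8K or larger effectively eliminates the problem.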

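  • According to the docs, the useAsync argument to the FileStream constructor

    Specifies whether to use asynchronous I/O or synchronous I/O. However, note that the underlying operating system might not support asynchronous I/O, so when specifying true, the handle might be opened synchronously depending on the platform. When opened asynchronously, the BeginRead(Byte[], Int32, Int32, AsyncCallback, Object) and BeginWrite(Byte[], Int32, Int32, AsyncCallback, Object) methods perform better on large reads or writes, but they might be much slower for small reads or writes. If the application is designed to take advantage of asynchronous I/O, set the useAsync parameter to true. Using asynchronous I/O correctly can speed up applications by as much as a factor of 10, but using it without redesigning the application for asynchronous I/O can decrease performance by as much as a factor of 10.

    Thus you may need to test whether, in practice, you get better performance with UseAsyncFileStreams equal to true or false. You may also need to tune the buffer size for the best performance and compression ratio; just make sure the buffer always stays smaller than 85,000 bytes.

  • If you think large object heap fragmentation may be a problem, see the MSFT article The large object heap on Windows systems: A debugger for suggestions on how to investigate further.

  • Since your DumpAppList() method only runs once every half hour, you could try compacting the large object heap manually after each run to see whether that helps: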

     protected override async Task ExecuteAsync(CancellationToken stoppingToken)
     {
         while (!stoppingToken.IsCancellationRequested)
         {
             await DumpAppList();
             GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
             GC.Collect();           
    
             await Task.Delay(RepeatDelay, stoppingToken);
         }
     }
    
  • You might want to pass the CancellationToken stoppingToken into DumpAppList().
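
As a rough illustration of the last note, here is a self-contained sketch of the write half of DumpAppList() with a cancellation token threaded through. GzipJson.WriteAsync is a hypothetical helper name, not something from the code above, and the demo writes a tiny list to a temporary file instead of the Steam data. On the download side, GetFromJsonAsync has an overload that accepts a CancellationToken in the same way.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Demo: in the real service the token would be the stoppingToken from ExecuteAsync.
using var cts = new CancellationTokenSource();
string path = Path.Combine(Path.GetTempPath(), Guid.NewGuid() + ".json.gz");
await GzipJson.WriteAsync(path, new List<int> { 1, 2, 3 }, cts.Token);
Console.WriteLine($"Wrote {new FileInfo(path).Length} bytes to {path}");

// Hypothetical helper, shown only to illustrate where the token goes.
static class GzipJson
{
    const int BufferSize = 16384; // kept well below the 85,000-byte LOH threshold

    public static async Task WriteAsync<T>(string path, T value, CancellationToken cancellationToken)
    {
        await using var output = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None,
                                                bufferSize: BufferSize, useAsync: true);
        await using var gz = new GZipStream(output, CompressionMode.Compress);
        await using var buffer = new BufferedStream(gz, BufferSize);
        // Serialization stops with an OperationCanceledException if the token is cancelled.
        await JsonSerializer.SerializeAsync(buffer, value, cancellationToken: cancellationToken);
    }
}
```

Note that with the catch (Exception ex) block from the answer still in place, a cancellation during shutdown would be caught and traced like any other error, so you may want to let OperationCanceledException propagate instead.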


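[1] Note that, in Memory management and garbage collection (GC) in ASP.NET Core: Large object heap, MSFT writes:

In containers using .NET Core 3.0 and later, the LOH is automatically compacted.

Thus my statement about when LOH compaction happens may be out of date on certain platforms.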
