My CSV file looks like this:

Metals:,E10
Al,0.1906
Ca,0.1132
Co,0.01951
Cu,0.5824
Cu,0.02383
Fe,0.03828
K,0.09577
Li,0.03024
Mg,0.007145
Na,0.1833
Ni,0.3236
Pb,0.0005787
Ti,0.4931
Tl,0.001887
Zn,0.07644

GLot,id,Slot,Scribe,Diameter,MPD,SResistivity,SThickness,TTV,LTV,Warp,Bow,S_U_A,Ep,Epi_L,Epi_Layer,Epi_Layer_2,EThick,E2thick,E2Dope,E2DopeT,E2DopeMax,E2DopeMin
31075046-001,XFB-LE00674.CP10023+001-12,1,22C1285,149.98,0,0.0217,334.71,1.91,1.03,5.35,-0.91,99.590582,1.0,1.0E18,9.8,1.12,9.9,9.6,9926193600000000,4.5574,10834500800000000,9551876800000000

My code looks like this:

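using System;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using CsvHelper;
using CsvHelper.Configuration;
using CsvHelper.Configuration.Attributes;

namespace CsvHelperTest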
{
    class CsvHelperTester
    {
        static void Main(string[] args)
        {
            var csvConfig = new CsvConfiguration(CultureInfo.InvariantCulture)
            {
                HasHeaderRecord = false,
                HeaderValidated = null,
                IgnoreBlankLines = true,
                MissingFieldFound = null,
                AllowComments = true,
                Comment = ';',
                Delimiter = ",",
                TrimOptions = TrimOptions.Trim, 
                PrepareHeaderForMatch = header => Regex.Replace(header.Header, ",", "\n"),
            };

            using (var streamReader = new StreamReader("C:\\Users\\eyoung\\Desktop\\parse test files\\XFB-1C2002A_62152_CoA.csv"))
            {
                using (var csvReader = new CsvReader(streamReader, csvConfig))
                {
                    for (var i = 0; i < 1; i++)
                    {
                        csvReader.Read();
                    }

                    var records = csvReader.GetRecords<EpiDataNames>().ToList();

                    var table = records[0];

                    records.RemoveAt(0);

                    var columns = records;

                    using (var writer = new CsvWriter(Console.Out, CultureInfo.InvariantCulture))
                    {
                        //writer.WriteField(records[0].Type);
                        //writer.NextRecord();

                        //records.RemoveAt(0);
                        //foreach (var item in records.Select(r => r.Type))
                        //{
                        //    writer.WriteField(item);
                        //}
                        //writer.NextRecord();
                        //foreach (var item in records.Select(r => r.Value))
                        //{
                        //    writer.WriteField(item);
                        //}
                        //writer.NextRecord(); 
                    }
                }
            }
        }

        public class EpiDataNames
        {
            [Index(0)]
            public string Type { get; set; }
            [Index(1)]
            public string Value { get; set; }
        }
    }
}

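This works great, because it splits the first block of data into two columns, "Type" and "Value". However, the problem appears as soon as the second block of data shows up. Is there a way to read only the first block of data? When I tried to skip those trailing header lines, it behaved strangely and dropped the first block of data.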

for (var i = 0; i < 1; i++)
{
    csvReader.Read(); //this skips the first line of data
}

for (var i = 0; i > 18; i++)
{
    csvReader.Read(); //I thought this would skip the last lines of data, but it doesn't.
}

The problem with the second header block is that it is read as

Type Value
GLot Id

when it should be

Type Value
Glot 31075046-001

Any ideas? I'm fairly lost at this point. I should also mention that I have no control over this CSV file and cannot edit it beforehand.

Recommended answer

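Your text file consists of two separate CSV tables, each with its own header row, separated by a blank line. CsvHelper can read a file like this, but you need to read it manually, row by row, and keep track of where one table ends and a new one begins. Then, whenever a new table starts, you need some heuristic to determine which type of table it is.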

The method CsvExtensions.ReadTwoTableCsv() below is one way to do this:

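// Usings needed by the snippets in this answer.
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text;
using CsvHelper;
using CsvHelper.Configuration;
using CsvHelper.Configuration.Attributes;

public static class CsvExtensions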
{
    public static void ReadTwoTableCsv<TRecord1, TRecord2>(TextReader reader, 
                                                           ClassMap<TRecord1> map1, out List<TRecord1> list1,
                                                           ClassMap<TRecord2> map2, out List<TRecord2> list2)
    {
        (List<TRecord1> l1, List<TRecord2> l2) = (new(), new());
        ReadMultiTableCsv(reader,
                          (map1, HeaderMatchesFirstMember, (map, csv) => l1.Add(csv.GetRecord<TRecord1>())),
                          (map2, HeaderMatchesFirstMember, (map, csv) => l2.Add(csv.GetRecord<TRecord2>())));
        (list1, list2) = (l1, l2);
    }
                                         
    static bool HeaderMatchesFirstMember(ClassMap map, CsvReader reader)
    {
        var firstMember = map.MemberMaps.Where(p => !p.Data.Ignore && p.Data.IsNameSet).SelectMany(p => p.Data.Names).FirstOrDefault();
        return Enumerable.Range(0, reader.Parser.Count)
            .Any(i => string.Equals(firstMember, reader.Parser[i], StringComparison.OrdinalIgnoreCase));
    }
    
    enum ReadState
    {
        Initial,
        Header,
        Data,
        UnknownData,
    }       
    
    public static void ReadMultiTableCsv(TextReader reader, 
                                         params (ClassMap map, Func<ClassMap, CsvReader, bool> isMatch, Action<ClassMap, CsvReader> readRecord) [] maps)
    {
        CsvConfiguration config = new(CultureInfo.InvariantCulture)
        {
            // These options are required to make ReadMultiTableCsv() work correctly:
            HasHeaderRecord = true,   // Headers are required, and are used to determine the table type,
            IgnoreBlankLines = false, // A blank line is used to delimit CSV sections, so we can't ignore it.
            // Other options as required by your application:
            HeaderValidated = null,
            MissingFieldFound = null,
            AllowComments = true,
            Comment = ';',
            Delimiter = ",",
            TrimOptions = TrimOptions.Trim, 
            PrepareHeaderForMatch = header => header.Header.ToLowerInvariant(),
        };
        
        using (var csv = new CsvReader(reader, config))
        {
            (int currentMap, ReadState state) = (-1, ReadState.Initial);
            int currentCount = -1;
            while (csv.Read())
            {
                if (csv.Parser.Count < 1 || csv.Parser.Count != currentCount)
                {
                    // Blank line or change in the number of columns
                    if (currentMap != -1)
                        csv.Context.UnregisterClassMap();
                    (currentMap, state) = (-1, ReadState.Initial);
                }

                currentCount = csv.Parser.Count;
                if (currentCount < 1 || state == ReadState.UnknownData)
                {
                    // Do nothing
                }
                else if (state == ReadState.Initial)
                {
                    var newMap = maps.Select((map, index) => (map, index)).Where(p => p.map.isMatch(p.map.map, csv)).Select(p => p.index).SingleOrDefault(-1);
                    if (newMap >= 0)
                    {
                        csv.Context.RegisterClassMap(maps[newMap].map);
                        csv.ReadHeader();
                        (currentMap, state) = (newMap, ReadState.Data);
                    }
                    else
                    {
                        (currentMap, state) = (-1, ReadState.UnknownData);
                    }
                }
                else if (state == ReadState.Data)
                {
                    maps[currentMap].readRecord(maps[currentMap].map, csv);
                }
                else
                {
                    throw new InvalidOperationException("Unexpected state");
                }
            }
        }           
    }
} 

Then, if your two data models look like this:

public class EpiDataNames
{
    [Name("Metals:")]
    public string Type { get; set; }
    [Name("E10")]
    public string Value { get; set; }
}

class EpiDataNamesMap : ClassMap<EpiDataNames>
{
    public EpiDataNamesMap() : this(new CsvConfiguration(CultureInfo.InvariantCulture)) {}
    public EpiDataNamesMap(CsvConfiguration config) => AutoMap(config);
}

public class Model2
{
    // Auto generated by https://toolslick.com/generation/code/class-from-csv
    public string GLot { get; set; }
    public string Id { get; set; }
    public int Slot { get; set; }
    public string Scribe { get; set; }
    public double Diameter { get; set; }
    public int MPD { get; set; }
    public double SResistivity { get; set; }
    public double SThickness { get; set; }
    public double TTV { get; set; }
    public double LTV { get; set; }
    public double Warp { get; set; }
    public double Bow { get; set; }
    public double SUA { get; set; }
    public double Ep { get; set; }
    public string EpiL { get; set; }
    public double EpiLayer { get; set; }
    public double EpiLayer2 { get; set; }
    public double EThick { get; set; }
    public double E2thick { get; set; }
    public long E2Dope { get; set; }
    public double E2DopeT { get; set; }
    public long E2DopeMax { get; set; }
    public long E2DopeMin { get; set; }
}

public class Model2ClassMap : ClassMap<Model2>
{
    // Auto generated by https://toolslick.com/generation/code/class-from-csv
    public Model2ClassMap()
    {
        Map(m => m.GLot).Name("GLot");
        Map(m => m.Id).Name("id");
        Map(m => m.Slot).Name("Slot");
        Map(m => m.Scribe).Name("Scribe");
        Map(m => m.Diameter).Name("Diameter");
        Map(m => m.MPD).Name("MPD");
        Map(m => m.SResistivity).Name("SResistivity");
        Map(m => m.SThickness).Name("SThickness");
        Map(m => m.TTV).Name("TTV");
        Map(m => m.LTV).Name("LTV");
        Map(m => m.Warp).Name("Warp");
        Map(m => m.Bow).Name("Bow");
        Map(m => m.SUA).Name("S_U_A");
        Map(m => m.Ep).Name("Ep");
        Map(m => m.EpiL).Name("Epi_L");
        Map(m => m.EpiLayer).Name("Epi_Layer");
        Map(m => m.EpiLayer2).Name("Epi_Layer_2");
        Map(m => m.EThick).Name("EThick");
        Map(m => m.E2thick).Name("E2thick");
        Map(m => m.E2Dope).Name("E2Dope");
        Map(m => m.E2DopeT).Name("E2DopeT");
        Map(m => m.E2DopeMax).Name("E2DopeMax");
        Map(m => m.E2DopeMin).Name("E2DopeMin");
    }
}

you will be able to read the CSV file into a List<EpiDataNames> and a List<Model2> as follows:

using var textReader = new StreamReader(fileName, Encoding.UTF8);
CsvExtensions.ReadTwoTableCsv(textReader, 
                              new EpiDataNamesMap(), out var list1, 
                              new Model2ClassMap(), out var list2);
        

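Notes:

  • The algorithm I use to determine the correct model (EpiDataNames or Model2) from the current header row, HeaderMatchesFirstMember(ClassMap map, CsvReader reader), is very crude. I check whether the column name of the first mapped model member ("Metals:" or "GLot") appears in the current list of headers. This works because the two names are different. If the names were the same, say "Id" and "Id", a smarter algorithm would be needed.

  • I auto-generated the second data model using https://toolslick.com/generation/code/class-from-csv.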
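As a rough illustration of what a less crude check might look like, the sketch below requires every expected column name to be present in the header row instead of only the first one. HeaderMatching and HeaderContainsAll are hypothetical names that are not part of CsvHelper or of the code above. The sketch assumes you collect the header fields and the mapped names yourself, for example from reader.Parser[i] and map.MemberMaps in the same way HeaderMatchesFirstMember does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of CsvHelper.
public static class HeaderMatching
{
    // Returns true when every expected column name occurs in the header row.
    // The comparison ignores case and surrounding whitespace; extra header
    // columns are tolerated. An empty list of expected names never matches.
    public static bool HeaderContainsAll(IEnumerable<string> headerRow,
                                         IEnumerable<string> expectedNames)
    {
        var present = new HashSet<string>(headerRow.Select(h => h.Trim()),
                                          StringComparer.OrdinalIgnoreCase);
        var expected = expectedNames.Select(n => n.Trim()).ToList();
        return expected.Count > 0 && expected.All(present.Contains);
    }
}
```

Two tables that share an "Id" column would still be told apart this way, provided they differ in at least one other mapped column. If two tables could have identical header rows, no header-based check can distinguish them, and the order of the tables in the file would have to be used instead.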

Demo fiddle here.
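If only the first table is needed, as the question asks, a simpler alternative may be to split the file on blank lines before handing anything to CsvHelper. The following is a minimal sketch using only the standard library. CsvBlocks and SplitOnBlankLines are hypothetical names, and the sketch assumes that no quoted CSV field contains an embedded blank line, which holds for the sample file shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of CsvHelper.
public static class CsvBlocks
{
    // Splits the input into blocks of consecutive non-blank lines.
    // Assumes no quoted field spans a blank line.
    public static List<List<string>> SplitOnBlankLines(TextReader reader)
    {
        var blocks = new List<List<string>>();
        List<string> current = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (string.IsNullOrWhiteSpace(line))
            {
                current = null; // A blank line closes the current block.
                continue;
            }
            if (current == null)
            {
                // First non-blank line after a gap starts a new block.
                current = new List<string>();
                blocks.Add(current);
            }
            current.Add(line);
        }
        return blocks;
    }
}
```

For the file in the question, the first block returned would hold the "Metals:,E10" lines and the second the "GLot,..." lines. The first block could then be joined with newlines, wrapped in a StringReader, and passed to a CsvReader so that GetRecords<EpiDataNames>() sees only that table.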
