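My end goal is to combine the data from several .csv files into one dataframe in a notebook. I have been trying to get each piece working before putting them together, but I can't get past just getting the file names. There are also non-csv files in the folders that I want to ignore.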

My file structure is as follows. The .csv files are the ones I want:

directory: E:\Grad School\Research\Pearl_River\Data_Collection\Previous_work\CRMS_Data

| -Full_Accretion
| -Full_Accretion\Full_Accretion.csv
| -Full_Accretion\RESTORE_disclaimer.txt
| -Full_Discrete_Hydrographic
| -Full_Discrete_Hydrographic\Full_Accretion.csv
| -Full_Discrete_Hydrographic\RESTORE_disclaimer.txt
| -Full_Marsh_Vegetation
| -Full_Marsh_Vegetation\Full_Accretion.csv
| -Full_Marsh_Vegetation\RESTORE_disclaimer.txt
(plus more but that doesn't really matter)

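I have read many of the questions about glob returning an empty list, and I have tried many iterations of the code. I verified that the files exist, that the spelling is correct, and that the path is correct. I tried raw string literals as well as escape characters. It only ever returns an empty list.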

Here is the latest iteration:

#Combine all the CRMS data into one dataframe
import os
from glob import glob
from pathlib import Path

dfs = []
fdir = r'E:\Grad School\Research\Pearl_River\Data_Collection\Previous_work\CRMS_Data'
ftype = '*.csv'
all_files = [os.path.basename(i) for i in glob(r'E:\Grad School\Research\Pearl_River\Data_Collection\Previous_work\CRMS_Data\*.csv')]

#Get file names
#for path, subdir, files in os.walk(fdir):
#    for file in glob(os.path.join(fdir, ftype)):
#        all_files.append(file)
print(all_files)

#Get data
#for file in all_files:
#    data = pd.read_csv(file, index_col=None)
#    dfs.append(data)

#Add data to dataframe
#df = pd.concat(dfs)
#df.head(5)

The commented-out parts are other things I tried. os.getcwd() returns 'C:\USERS\w*\OneDrive-the University of Southern Missisippi\Research\Python', but I am not trying to access the working directory.

These did not work either. Same result: an empty list.

os.chdir(r'E:\Grad School\Research\Pearl_River\Data_Collection\Previous_work\CRMS_Data')
all_files = [f for file in glob('*/.csv', recursive=True)]

os.chdir(r'E:\Grad School\Research\Pearl_River\Data_Collection\Previous_work\CRMS_Data')
all_files = [f for file in glob(r'*\.csv', recursive=True)]

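I have tried a lot of different things and have been staring at this for too long. The commented-out loop also returns an empty list, even with various combinations of r'.csv', r'*.csv', and r'/.csv' in fdir and ftype.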

So then lastly, I put it into Spyder (through Anaconda) so I could use the debugger and I noticed, for the first loop that is commented out, the following:
On the first pass of the outer loop, it sees the subfolders and puts those in subdir and files is blank.
Then it moves into the first subfolder, 'Full_Accretion', and also shows the files in files=[].
There is no file variable listed though and that is the one that is supposed to be appended to the list.
So I changed it to this:

for path, subdir, files in os.walk(fdir):
    for file in files:
        all_files.append(file)

That gave me file names, but all of the file names, not just the CSVs. I added *.csv to the fdir name and it gave an empty list again.

I have not used glob much in the past so it's very likely user error. What am I missing? Thanks! (Any missing imports not directly related to this, such as pandas, are in the cells above this one.)

Recommended answer

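@bhlsing gave me the missing piece. When I used a one-liner, the result either lacked the full path or, with the loop, ran too many times and produced duplicates. I figured it out; here is what finally worked: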

import os
from glob import glob
import pandas as pd


all_files = []
fdir = r'E:\Grad School\Research\Pearl_River\Data_Collection\Previous_work\CRMS_Data'

fnames = [os.path.basename(i)
          for i in glob(r'E:\Grad School\Research\Pearl_River\Data_Collection\Previous_work\CRMS_Data\*\*.csv')
          ]

#Get file names
for fname in fnames:
    filename = os.path.join(fdir, fname)
    all_files.append(filename)
print(all_files)

This may not be Pythonic; I am self-taught and still learning. Thanks!
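One thing worth checking in the code above: os.path.basename strips the subfolder, so joining the bare name back onto fdir yields a path like CRMS_Data\Full_Accretion.csv rather than CRMS_Data\Full_Accretion\Full_Accretion.csv. Since glob already returns the full matching paths, they can be kept as they are. Below is a minimal sketch of that idea. It runs against a throwaway temporary folder rather than the real E:\ path, and the folder and file names it creates are hypothetical stand-ins for the layout in the question.

```python
import os
import tempfile
from glob import glob

# Build a small temporary tree that mimics the layout in the question.
# The names here are hypothetical stand-ins for the real CRMS_Data folder.
root = tempfile.mkdtemp()
for folder in ("Full_Accretion", "Full_Marsh_Vegetation"):
    os.makedirs(os.path.join(root, folder))
    for name in (folder + ".csv", "RESTORE_disclaimer.txt"):
        with open(os.path.join(root, folder, name), "w") as fh:
            fh.write("placeholder\n")

# No .csv file sits directly in the top folder, so this matches nothing.
top_level = glob(os.path.join(root, "*.csv"))

# '*' stands for the subfolder and '*.csv' for the file.
# glob returns the full paths, so no basename/join step is needed.
one_level = sorted(glob(os.path.join(root, "*", "*.csv")))

# '**' together with recursive=True matches .csv files at any depth.
any_depth = sorted(glob(os.path.join(root, "**", "*.csv"), recursive=True))

# os.walk alternative: filter the names it yields by extension,
# using 'path' (the folder currently being visited) to build the full path.
walked = sorted(
    os.path.join(path, name)
    for path, subdirs, files in os.walk(root)
    for name in files
    if name.lower().endswith(".csv")
)

print(top_level)
print([os.path.relpath(p, root) for p in one_level])
```

Each path in one_level could then be passed to pd.read_csv and the results combined with pd.concat, as in the commented-out code in the question. The os.walk version also suggests why calling glob inside the walk loop produced duplicates: the same glob pattern runs once for every folder visited.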
