I'm trying to write a partitioned Parquet dataset to S3, and it works with AWS Wrangler:
import asyncio
import awswrangler as wr

basename_template = 'part.'
partitioning = ['cust_id', 'file_name', 'added_year', 'added_month', 'added_date']
loop = asyncio.get_event_loop()
s3_path = "s3://customer-data-lake/main/parquet_data"
# Run the blocking write in the default executor (this is inside an async function)
await loop.run_in_executor(None, lambda: wr.s3.to_parquet(
    df=batch.to_pandas(),
    path=s3_path,
    dataset=True,
    max_rows_by_file=MAX_ROWS_PER_FILE,
    use_threads=True,
    partition_cols=partitioning,
    mode='append',
    boto3_session=s3_session,
    filename_prefix=basename_template
))
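For reference, the layout I expect from `partition_cols` is Hive-style `column=value` prefixes under the base path. Below is a minimal sketch of that expectation. `expected_partition_prefix` is a hypothetical helper, not part of awswrangler, and the row values are made up for illustration:

```python
from typing import Mapping, Sequence


def expected_partition_prefix(base_path: str,
                              partition_cols: Sequence[str],
                              row: Mapping[str, object]) -> str:
    """Build the Hive-style prefix (col=value/...) for one row's partition values."""
    parts = [f"{col}={row[col]}" for col in partition_cols]
    return "/".join([base_path.rstrip("/"), *parts]) + "/"


partitioning = ['cust_id', 'file_name', 'added_year', 'added_month', 'added_date']
# Hypothetical row values, only for illustration.
row = {'cust_id': 1, 'file_name': 'CSV_2', 'added_year': 2023,
       'added_month': 5, 'added_date': 17}
prefix = expected_partition_prefix(
    "s3://customer-data-lake/main/parquet_data", partitioning, row)
print(prefix)
```

Each distinct combination of partition values should end up under its own prefix like this one.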
Then I tried switching to lakeFS by changing the endpoint to my lakeFS endpoint:
wr.config.s3_endpoint_url = lakefsEndPoint
After that, partitioning suddenly stopped working. Everything just gets appended to the same partition.
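One way to compare the two layouts without relying on screenshots is to list the object keys from each endpoint and count them per partition value. This is a rough sketch that assumes the keys have already been listed. The helper names and the sample keys are hypothetical:

```python
from collections import Counter
from typing import Dict, Iterable


def partition_values(key: str) -> Dict[str, str]:
    """Extract the col=value segments from a single object key."""
    values: Dict[str, str] = {}
    for segment in key.split("/"):
        if "=" in segment:
            col, _, value = segment.partition("=")
            values[col] = value
    return values


def count_by_partition(keys: Iterable[str], col: str) -> Counter:
    """Count how many objects fall under each value of one partition column."""
    return Counter(partition_values(k).get(col, "<missing>") for k in keys)


# Hypothetical keys, shaped like the dataset above.
base = "main/parquet_data/cust_id=1"
tail = "added_year=2023/added_month=5/added_date=17"
keys = [
    f"{base}/file_name=CSV_1/{tail}/part.0001.snappy.parquet",
    f"{base}/file_name=CSV_1/{tail}/part.0002.snappy.parquet",
    f"{base}/file_name=CSV_2/{tail}/part.0003.snappy.parquet",
]
counts = count_by_partition(keys, "file_name")
print(counts)
```

If every key from the lakeFS endpoint lands under a single `file_name` value while the S3 listing shows several, that confirms the difference is in where the files are written rather than in how they are displayed.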
This image shows the original S3 layout.
And this one is after I switched to lakeFS.
It just appends to CSV_1. What am I doing wrong here?