Python SageMaker Pipeline: TypeError: join() argument must be str, bytes, or os.PathLike object, not 'ParameterString'

Posted on August 21

I am writing a SageMaker pipeline, and I want to pass the name of the preprocessing script into the pipeline as an input.

data_preprocessing_script = ParameterString(
    name="DataPreProcessingScript",
    default_value="",
)

I have the following code for preprocessing:

    sklearn_processor = SKLearnProcessor(
        framework_version=sklearn_framework_version,
        instance_type=processing_instance_type,
        instance_count=processing_instance_count,
        base_job_name=f"PreProcess-{base_job_prefix}",
        volume_size_in_gb=10,
        sagemaker_session=pipeline_session,
        role=role,
        tags=tags
    )

    step_args = sklearn_processor.run(
        code=os.path.join(BASE_DIR, data_preprocessing_script),
        inputs=[
            ProcessingInput(input_name="raw_input_data",

With os.path.join(BASE_DIR, data_preprocessing_script), the pipeline fails with:

TypeError: join() argument must be str, bytes, or os.PathLike object, not 'ParameterString'

If I pass a literal path and file name (for example preprocessing.py) instead of data_preprocessing_script (the ParameterString), it works fine:

code=os.path.join(BASE_DIR, 'preprocessing/xxx_preprocessing.py')

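How can I parameterize the SageMaker pipeline so that it accepts the preprocessing script file name as an input instead of a hardcoded file name?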

import os

from sagemaker.workflow.steps import ProcessingStep
from sagemaker.processing import ScriptProcessor

BASE_DIR = "/home/ec2-user/SageMaker/pipelines/project"
DATA_PREPROCESSING_SCRIPT = "your_script_name.py"  # Replace with your script's name

script_path = os.path.join(BASE_DIR, DATA_PREPROCESSING_SCRIPT)
if not os.path.exists(script_path):
    raise ValueError(f"Script not found at {script_path}")

# Define the script processor
script_processor = ScriptProcessor(
    image_uri="your_container_image_uri",
    command=["python3"],
    instance_type="ml.m5.xlarge",
    instance_count=1,
    role="your_sagemaker_execution_role_arn",
)

# Define the processing step
step_process = ProcessingStep(
    name="DataPreprocessing",
    processor=script_processor,
    inputs=[...],  # Define your inputs
    outputs=[...],  # Define your outputs
    code=script_path,
)
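The answer above avoids the error by using an ordinary Python string instead of a ParameterString. The likely reason this matters is that a pipeline parameter is only a placeholder whose value is filled in when an execution starts, while os.path.join runs earlier, when the pipeline definition is being built. The sketch below reproduces that difference with the standard library only. PipelineParameterStandIn, can_join, resolve_script_path and build_run_kwargs are hypothetical names made up for this illustration and are not part of the SageMaker SDK.

```python
import os


class PipelineParameterStandIn:
    """Hypothetical stand-in for a pipeline parameter such as ParameterString.

    It only stores a name and a default value. It is not a str and does not
    implement __fspath__, so os.path.join cannot treat it as a path.
    """

    def __init__(self, name, default_value=""):
        self.name = name
        self.default_value = default_value


def can_join(base_dir, value):
    """Return True if os.path.join accepts `value`, False if it raises TypeError."""
    try:
        os.path.join(base_dir, value)
    except TypeError:
        return False
    return True


def resolve_script_path(base_dir, script_name):
    """Build the script path from a plain string known at definition time."""
    if not isinstance(script_name, (str, os.PathLike)):
        raise TypeError(
            "script_name must be a plain path string, got "
            f"{type(script_name).__name__}"
        )
    return os.path.join(base_dir, script_name)


def build_run_kwargs(base_dir, script_name):
    """Hypothetical helper: collect the `code` argument for a processor.run(...) call."""
    return {"code": resolve_script_path(base_dir, script_name)}
```

Under this assumption, the script name becomes a normal argument of whatever function builds the pipeline (for example a keyword argument on a project-specific pipeline factory), and the pipeline definition is rebuilt whenever a different script is needed. Values that must change per execution without rebuilding the definition are a separate case that this sketch does not cover.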

