def minimax(
        board: Board, 
        depth: int, 
        max_depth: int, 
        is_black: bool
    ) -> tuple[Score, Move]:
    """
    Finds the best move for the input board state.
    Note that you are black.

    Parameters
    ----------
    board: 2D list of lists. Contains characters "B", "W", and "_",
    representing black pawn, white pawn, and empty cell, respectively.

    depth: int, the depth to search for the best move. When this is equal
    to `max_depth`, you should get the evaluation of the position using
    the provided heuristic function.

    max_depth: int, the maximum depth for cutoff.

    is_black: bool. True when finding the best move for black, False
    otherwise.

    Returns
    -------
    A tuple (evaluation, ((src_row, src_col), (dst_row, dst_col))):
    evaluation: the best score that black can achieve after this move.
    src_row, src_col: position of the pawn to move.
    dst_row, dst_col: position to move the pawn to.
    """
    def max_value(board, depth):
        # Stop when the game is over or the depth limit is reached.
        if utils.is_game_over(board) or depth == max_depth:
            return evaluate(board), None  # Score, with no move to report
        v = float('-inf')
        best_action = None
        for action in generate_valid_moves(board):
            # Apply the move on a copy of the board.
            next_board = utils.state_change(board, action[0], action[1], False)
            # Flip the board so the opponent's pawns become black, since
            # every helper is written from black's point of view.
            next_board = utils.invert_board(next_board, False)
            score, _ = max_value(next_board, depth + 1)
            if score > v:
                v = score
                best_action = action  # Update best action
        return v, best_action

    return max_value(board, depth)

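I have tried to implement a minimax algorithm that finds the best move for black or white up to a maximum depth. When I run it with my test cases, it only returns the first action instead of the action with the highest evaluation among all actions. I can't find the problem in my code.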

def minimax(
        board: Board, 
        depth: int, 
        max_depth: int, 
        is_black: bool
    ) -> tuple[Score, Move]:
    """
    Finds the best move for the input board state.
    Note that you are black.

    Parameters
    ----------
    board: 2D list of lists. Contains characters "B", "W", and "_",
    representing black pawn, white pawn, and empty cell, respectively.

    depth: int, the depth to search for the best move. When this is equal
    to `max_depth`, you should get the evaluation of the position using
    the provided heuristic function.

    max_depth: int, the maximum depth for cutoff.

    is_black: bool. True when finding the best move for black, False
    otherwise.

    Returns
    -------
    A tuple (evaluation, ((src_row, src_col), (dst_row, dst_col))):
    evaluation: the best score that black can achieve after this move.
    src_row, src_col: position of the pawn to move.
    dst_row, dst_col: position to move the pawn to.
    """
    if depth == max_depth or utils.is_game_over(board):
        return evaluate(board), None

    # Determine the best move and its evaluation
    if is_black:
        best_evaluation = float('-inf')
        best_move = None
        for action in generate_valid_moves(board):
            new_board = utils.state_change(board, action[0], action[1], in_place=False)
            opponent_evaluation, _ = minimax(new_board, depth + 1, max_depth, False)
            if opponent_evaluation > best_evaluation:
                best_evaluation = opponent_evaluation
                best_move = (action[0], action[1])
        return best_evaluation, best_move
    else:
        best_evaluation = float('inf')
        best_move = None
        for action in generate_valid_moves(utils.invert_board(board, in_place=False)):
            new_board = utils.state_change(utils.invert_board(board, in_place=False), action[0], action[1], in_place=False)
            opponent_evaluation, _ = minimax(utils.invert_board(new_board, in_place=False), depth + 1, max_depth, True)
            if opponent_evaluation < best_evaluation:
                best_evaluation = opponent_evaluation
                best_move = (action[0], action[1])  # Coordinates refer to the inverted board
        return best_evaluation, best_move

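I just tried a different approach, where black and white are switched on every turn by inverting the board inside the function. It basically passes the public test cases, but it fails with this message: "Make sure `evaluate` is called with the correct input, especially for white", and I don't understand why. I think my evaluation handles both white and black correctly, because I invert the board back after white's turn.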

Recommended answer

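Assuming the functions you call are all correct (like invert_board, ...), there is still this problem:

Although you invert the board, you don't invert the score. What is good for white is bad for black, and vice versa, so you should negate the score that comes back from the recursive call.

I would change this: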

score, _ = max_value(next_board, depth + 1)

to this:

score = -max_value(next_board, depth + 1)[0]
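To see the effect of that negation in isolation, here is a minimal sketch on a toy game rather than on your pawn game. Everything in it is an assumption made for illustration: the game (players alternately take 1 to 3 stones, and whoever takes the last stone wins), and the names `evaluate`, `valid_moves` and `negamax`. Like your inverted board, the state is always seen from the point of view of the player who is about to move.

```python
def evaluate(stones):
    # Hypothetical evaluation, from the point of view of the player to move:
    # no stones left means the opponent took the last one, so the mover lost.
    return -1 if stones == 0 else 0


def valid_moves(stones):
    # Hypothetical move generator: take 1, 2 or 3 stones.
    return range(1, min(3, stones) + 1)


def negamax(stones, depth, max_depth):
    if depth == max_depth or stones == 0:
        return evaluate(stones), None
    best_score, best_move = float("-inf"), None
    for take in valid_moves(stones):
        # The recursive call scores the position for the opponent,
        # so the sign is flipped to get the score for the current player.
        score = -negamax(stones - take, depth + 1, max_depth)[0]
        if score > best_score:
            best_score, best_move = score, take
    return best_score, best_move


print(negamax(6, 0, 10))  # (1, 2): taking 2 leaves the opponent with 4 stones
```

Without the minus sign every move ends up with the same score of -1 in this toy game, so the strict `>` comparison keeps the first move it sees. That matches the symptom of always getting the first action back.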

Some unrelated comments:

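  • I would let depth count down instead of up. That way you can compare it with 0 (for the base case) and this function does not need access to max_depth. The initial caller of minimax should then just pass max_depth as the depth argument, and the max_depth parameter can be dropped.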

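  • invert_board may have quite a negative impact on running time. Once you get this working, you could improve performance by making all your functions (like evaluate) aware of whose turn it is, and have them return a value from that player's perspective.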

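  • Related to the previous point: the minimax function has an is_black parameter. I hope you have dealt with this information in code that you didn't share, i.e. that you invert the board when is_black is False before calling your max_value function.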

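  • state_change may also have a negative impact on running time. If so, you could improve on it by mutating the board (so without copying it) and then writing a function that undoes the last move. You would call that function right after the recursive call returns.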

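  • The base-case if statement could test its two conditions in the opposite order: that saves you a call to is_game_over whenever the maximum depth has been reached.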

  • Related to this: is_game_over could also return the evaluation score when the game is over. So it would return either a score (indicating the game is over) or None (indicating the game is not over). That saves a separate call to evaluate, which would probably have to do the same game-over checks again.

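  • Consider implementing alpha-beta pruning: it can reduce the number of states that need to be analysed, while still guaranteeing that the same score is returned.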
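
Several of these comments (counting depth down, letting the game-over check return the score, and alpha-beta pruning) can be combined. The following is only a sketch on a toy game, not your pawn game: players alternately take 1 to 3 stones and whoever takes the last stone wins. The helpers `game_over_score` and `heuristic` are hypothetical stand-ins for is_game_over and evaluate, and scores are always from the point of view of the player to move.

```python
import math


def game_over_score(stones):
    # Hypothetical helper: the final score for the player to move,
    # or None when the game is still going on.
    return -1 if stones == 0 else None


def heuristic(stones):
    # Hypothetical heuristic for unfinished positions (neutral here).
    return 0


def negamax(stones, depth):
    # Plain search, kept only to compare against the pruned version.
    final = game_over_score(stones)
    if final is not None:
        return final, None
    if depth == 0:  # depth counts down, so no max_depth is needed here
        return heuristic(stones), None
    best_score, best_move = -math.inf, None
    for take in range(1, min(3, stones) + 1):
        score = -negamax(stones - take, depth - 1)[0]
        if score > best_score:
            best_score, best_move = score, take
    return best_score, best_move


def alphabeta(stones, depth, alpha=-math.inf, beta=math.inf):
    final = game_over_score(stones)
    if final is not None:
        return final, None
    if depth == 0:
        return heuristic(stones), None
    best_score, best_move = -math.inf, None
    for take in range(1, min(3, stones) + 1):
        # The window is negated and swapped for the opponent.
        score = -alphabeta(stones - take, depth - 1, -beta, -alpha)[0]
        if score > best_score:
            best_score, best_move = score, take
        alpha = max(alpha, best_score)
        if alpha >= beta:
            break  # the opponent would never allow this line: prune
    return best_score, best_move


print(alphabeta(6, 12))  # (1, 2), the same result as negamax(6, 12)
```

The initial caller passes the search limit as `depth` and leaves `alpha` and `beta` at their defaults. Undoing moves on a mutated board is not shown here, because the toy state is a plain integer.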
