Python: Combining groups of columns in a Polars DataFrame into single columns

Posted on February 12

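I have a Polars DataFrame with columns a_0, a_1, a_2, b_0, b_1, b_2. I want to convert it into a longer, narrower DataFrame (three times as many rows, but only two columns, a and b), so that a contains a_0[0], a_1[0], a_2[0], a_0[1], a_1[1], a_2[1], ... and b likewise contains b_0[0], b_1[0], b_2[0], b_0[1], b_1[1], b_2[1], .... How can I do this?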

Recommended answer

You can use concat_list() to join the columns of each group into a single list column, then use explode() to turn the list elements into rows.

Let's take a simple DataFrame as an example:

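import polars as pl

# One row holding the values 0..5; orient="row" makes the row layout explicit
df = pl.DataFrame(
    data=[list(range(6))],
    schema=[f"a_{i}" for i in range(3)] + [f"b_{i}" for i in range(3)],
    orient="row",
)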

┌─────┬─────┬─────┬─────┬─────┬─────┐
│ a_0 ┆ a_1 ┆ a_2 ┆ b_0 ┆ b_1 ┆ b_2 │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╪═════╪═════╪═════╡
│ 0   ┆ 1   ┆ 2   ┆ 3   ┆ 4   ┆ 5   │
└─────┴─────┴─────┴─────┴─────┴─────┘

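Now you can reshape it. First, concatenate the columns of each group into a list column, naming each list column as it should appear in the final result: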

import polars.selectors as cs

df.select(
    pl.concat_list(cs.starts_with(x)).alias(x) for x in ['a','b']
)

┌───────────┬───────────┐
│ a         ┆ b         │
│ ---       ┆ ---       │
│ list[i64] ┆ list[i64] │
╞═══════════╪═══════════╡
│ [0, 1, 2] ┆ [3, 4, 5] │
└───────────┴───────────┘

Now, explode the lists into rows:

df.select(
    pl.concat_list(cs.starts_with(x)).alias(x) for x in ['a','b']
).explode(pl.all())

┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 0   ┆ 3   │
│ 1   ┆ 4   │
│ 2   ┆ 5   │
└─────┴─────┘
