>>> n = 3
>>> x = range(n ** 2)
>>> xn = list(zip(*[iter(x)] * n))

In PEP 618, the author gives this example of how zip can be used to chunk data into equal-sized groups.

How does it work?

I think it relies on an implementation detail of zip: the list [iter(x)] * n holds n references to the same iterator, so when zip takes one element from each entry of the list, it actually gets the first n elements of x, because the state of iter(x) advances every time an element is taken.

I believe this because the following code replicates the above behavior:

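n = 3
x = range(n ** 2)
xn = [iter(x)] * n  # n references to the same iterator

res = []

while True:
    try:
        col = []
        for element in xn:
            # each next() call advances the one shared iterator
            col.append(next(element))
        res.append(col)
    except StopIteration:
        # iterator exhausted; a partially filled col is discarded
        break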

However, I would like to make sure that this is indeed the case and that this is a reliable behavior that can be used to chunk elements of an iterable.

Recommended answer

It's not really specific to zip, but you basically have that right. In effect, it's zipping 3 references to the same iterator, causing it to round-robin between them. During each iteration, one more element is consumed from the iterator.

Effectively, it's the same as doing this:

>>> n = 3
>>> x = range(n ** 2)
>>> a = b = c = iter(x)
>>> list(zip(a, b, c))
[(0, 1, 2), (3, 4, 5), (6, 7, 8)]
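
To check that the list really holds one iterator rather than three, you can compare object identity. This is only an illustrative sketch; the names shared, independent, chunks, and repeated are made up here and do not come from PEP 618:

```python
n = 3
x = range(n ** 2)

# List repetition copies the reference, not the iterator itself,
# so every slot in the list is the very same object.
shared = [iter(x)] * n
print(all(it is shared[0] for it in shared))  # True

# Each next() call advances that one iterator, so every tuple
# zip builds is made of n consecutive items.
chunks = list(zip(*shared))
print(chunks)  # [(0, 1, 2), (3, 4, 5), (6, 7, 8)]

# For contrast: n independent iterators each start from the beginning.
independent = [iter(x) for _ in range(n)]
repeated = list(zip(*independent))
print(repeated[:2])  # [(0, 0, 0), (1, 1, 1)]
```

The chunking only happens because of the shared state; with separate iterators, zip just pairs each element with itself.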

Note that it only produces equal-sized groups and may drop trailing elements (that part is a characteristic of zip, which stops as soon as its shortest iterable is exhausted; you could use itertools.zip_longest if you want to keep the leftover elements). Here the last element, 15, is dropped:

>>> n = 4
>>> x = range(n ** 2)
>>> a = b = c = iter(x)
>>> list(zip(a, b, c))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14)]
