Are dictionaries ordered in Python 3.6+?
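They are insertion ordered[1]. As of Python 3.6, for the CPython implementation of Python, dictionaries remember the order of items inserted. This is considered an implementation detail in Python 3.6; you need to use `OrderedDict` if you want insertion ordering that is guaranteed across other implementations of Python (and other ordered behavior[1]).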
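As of Python 3.7, this is no longer an implementation detail and instead becomes a language feature. From a python-dev message by GvR:
Make it so. "Dict keeps insertion order" is the ruling. Thanks!
This simply means that you can depend on it. Other implementations of Python must also offer an insertion ordered dictionary if they wish to be a conforming implementation of Python 3.7.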
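As a quick illustration of what "insertion ordered" means in practice, here is a minimal sketch that should behave as described on Python 3.7+ (and on CPython 3.6). The variable names are only for illustration; the last two lines show the order-sensitive equality that `OrderedDict` provides and `dict` does not.

```python
from collections import OrderedDict

d = {}
d['timmy'] = 'red'
d['barry'] = 'green'
d['guido'] = 'blue'

# Iteration follows the order in which keys were first inserted.
keys_in_order = list(d)

# Re-assigning an existing key keeps its original position.
d['timmy'] = 'purple'
after_update = list(d)

# Deleting a key and inserting it again moves it to the end.
del d['timmy']
d['timmy'] = 'red'
after_reinsert = list(d)

# Plain dicts compare equal regardless of order; OrderedDicts do not.
plain_equal = {'a': 1, 'b': 2} == {'b': 2, 'a': 1}
ordered_equal = (OrderedDict([('a', 1), ('b', 2)])
                 == OrderedDict([('b', 2), ('a', 1)]))

print(keys_in_order, after_update, after_reinsert, plain_equal, ordered_equal)
```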
How does the Python 3.6 dictionary implementation perform better[2] than the older one while preserving element order?
Essentially, by keeping two arrays.
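In the previous implementation, a sparse array of type `PyDictKeyEntry` and size `dk_size` had to be allocated; unfortunately, this also resulted in a lot of empty space, since that array was not allowed to be more than 2/3 * `dk_size` full for performance reasons (and each empty slot still had the size of a `PyDictKeyEntry`!).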
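This is not the case now, since only the required entries are stored (those that have been inserted), and a separate sparse array of type `intX_t` (`X` depending on the dict size), kept no more than 2/3 * `dk_size` full, is maintained alongside them. The empty slots changed from type `PyDictKeyEntry` to `intX_t`.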
So, obviously, creating a sparse array of type `PyDictKeyEntry` is much more memory demanding than a sparse array for storing `int`s.
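To put rough numbers on that difference, here is a back-of-the-envelope sketch. The sizes are assumptions rather than something stated above: a 64-bit build where each entry holds three 8-byte fields, and a table small enough for one-byte indices. The function names are hypothetical.

```python
# Assumptions (not from the answer): 64-bit build, so one entry is three
# 8-byte fields (hash, key pointer, value pointer); small tables can use
# one-byte indices (int8_t).
ENTRY_SIZE = 3 * 8   # bytes per [keyhash, key, value] entry
INDEX_SIZE = 1       # bytes per slot in the sparse index array


def old_layout_bytes(table_size):
    # Every slot of the sparse table is a full entry, whether used or not.
    return table_size * ENTRY_SIZE


def new_layout_bytes(table_size, n_items):
    # Sparse array of small ints plus a dense array of the stored entries.
    return table_size * INDEX_SIZE + n_items * ENTRY_SIZE


old = old_layout_bytes(8)      # 8 slots, 3 of them used
new = new_layout_bytes(8, 3)   # 8 one-byte indices + 3 dense entries
print(old, new)                # 192 80
```

This counts only the entries actually stored; a real implementation may reserve extra room in the dense array, so the actual saving can be smaller than this estimate.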
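You can see the full conversation on Python-Dev regarding this feature if interested; it is a good read.
In the original proposal made by Raymond Hettinger, a visualization of the data structures used can be seen, which captures the gist of the idea.
For example, the dictionary: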
d = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'}
is currently stored as [keyhash, key, value]:
entries = [['--', '--', '--'],
[-8522787127447073495, 'barry', 'green'],
['--', '--', '--'],
['--', '--', '--'],
['--', '--', '--'],
[-9092791511155847987, 'timmy', 'red'],
['--', '--', '--'],
[-6480567542315338377, 'guido', 'blue']]
Instead, the data should be organized as follows:
indices = [None, 1, None, None, None, 0, None, 2]
entries = [[-9092791511155847987, 'timmy', 'red'],
[-8522787127447073495, 'barry', 'green'],
[-6480567542315338377, 'guido', 'blue']]
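As you can now see visually, in the original layout a lot of space is essentially empty in order to reduce collisions and make look-ups faster. With the new approach, you reduce the memory required by moving the sparseness to where it is really needed: the indices.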
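The two-array layout can be modelled in a few lines of Python. This is only a toy sketch of the idea, not CPython's actual code: the class name `CompactDict` is hypothetical, it uses linear probing for brevity, and it omits deletion and resizing.

```python
class CompactDict:
    """Toy model of the indices/entries layout (illustrative only)."""

    def __init__(self, size=8):
        self.indices = [None] * size  # sparse: slot -> position in entries
        self.entries = []             # dense: [hash, key, value], insertion order

    def _find_slot(self, key, h):
        # Linear probing is an assumption made to keep the sketch short.
        size = len(self.indices)
        slot = h % size
        while True:
            pos = self.indices[slot]
            if pos is None:
                return slot  # free slot: key is not present
            entry = self.entries[pos]
            if entry[0] == h and entry[1] == key:
                return slot  # slot already pointing at this key
            slot = (slot + 1) % size

    def __setitem__(self, key, value):
        h = hash(key)
        slot = self._find_slot(key, h)
        pos = self.indices[slot]
        if pos is not None:
            self.entries[pos][2] = value  # update in place, order unchanged
            return
        # Keep the sparse array at most 2/3 full; a real dict would resize.
        if len(self.entries) >= (2 * len(self.indices)) // 3:
            raise RuntimeError("toy table is full; resizing is omitted")
        self.indices[slot] = len(self.entries)
        self.entries.append([h, key, value])

    def __getitem__(self, key):
        pos = self.indices[self._find_slot(key, hash(key))]
        if pos is None:
            raise KeyError(key)
        return self.entries[pos][2]

    def __iter__(self):
        # Walking the dense array yields keys in insertion order.
        return (entry[1] for entry in self.entries)


d = CompactDict()
d['timmy'] = 'red'
d['barry'] = 'green'
d['guido'] = 'blue'
print(list(d))    # ['timmy', 'barry', 'guido']
print(d.indices)  # three slots hold 0, 1, 2; the exact slots depend on the hashes
```

Ordering falls out of the design for free: look-ups go through the sparse `indices`, while iteration simply walks the dense `entries` list.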
[1]: I say "insertion ordered" and not "ordered" since, with the existence of OrderedDict, "ordered" suggests further behavior that the `dict` object *doesn't provide*. OrderedDicts are reversible, provide order-sensitive methods and, mainly, provide order-sensitive equality tests (`==`, `!=`). `dict`s currently don't offer any of those behaviors/methods.
[2]: The new dictionary implementation performs better **memory wise** by being designed more compactly; that's the main benefit here. Speed wise, the difference isn't so drastic: there are places where the new dict might introduce slight regressions (key look-ups, for example), while in others (iteration and resizing come to mind) a performance boost should be present.
Overall, the performance of the dictionary, especially in real-life situations, improves due to the compactness introduced.