Your time and space complexity are both Θ(n). While it is sometimes useful, for clarity, to include terms in the time or space complexity that don't change the asymptotic value (the classic example is string searching algorithms), it doesn't make much sense here.
While your stated O(n + m^2) time complexity is technically correct, since Big-O notation gives an upper bound, you can show that O(n) is also an upper bound, because the dictionary has size at most n and we iterate over each key only once: n items are read from the input, the dictionary is looped over at most n times, and n items are appended to the output list.
If desired, you can compute the 'auxiliary' space required, i.e. the amount of space needed excluding the input and output arrays; here, that is Θ(m). However, you should note that this kind of analysis is quite uncommon: by convention, unless otherwise specified, space complexity analysis includes the size of the output.
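The original code isn't reproduced in full here, so as a hedged sketch (the function name is illustrative, and this simplified version stores plain counts rather than count-minus-one), the pattern being analyzed looks roughly like:

```python
from collections import Counter

def expand_counts(input_list):
    # First loop: reads n items; the dictionary holds m <= n keys,
    # so the auxiliary space is Theta(m).
    counts = Counter(input_list)
    output_list = []
    # Second loop: across all keys, exactly n items are appended in total,
    # so the output space is Theta(n).
    for k, v in counts.items():
        output_list.extend([k] * v)
    return output_list

print(expand_counts([1, 1, 1, 1, 1, 1, 1, 2, 2, 2]))
# → [1, 1, 1, 1, 1, 1, 1, 2, 2, 2]
```

Counting n reads, at most n dictionary operations, and n appends gives the Θ(n) total above.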
To address the common confusion about why the second loop is still linear time even with many duplicate values, let's look at an example. The loop in question is:
for k, v in d.items():
    if v > 0:
        for i in range(v):
            output_list.append(k)
Say that our input list is [1, 1, 1, 1, 1, 1, 1, 2, 2, 2] (ten elements in total: seven 1s and three 2s). Then our dictionary.items() (holding each element's count, minus one) looks like: [(key: 1, value: 6), (key: 2, value: 2)] (it isn't actually stored internally as a Python list of tuples, but these are the complete contents of the items view object).
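These contents can be reproduced directly (a small check, assuming collections.Counter as the counting mechanism; since dictionaries preserve insertion order, the items come out in first-encounter order):

```python
from collections import Counter

input_list = [1, 1, 1, 1, 1, 1, 1, 2, 2, 2]
# Each element's count, minus one, as described above.
d = {k: c - 1 for k, c in Counter(input_list).items()}
print(list(d.items()))  # → [(1, 6), (2, 2)]
```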
Let's walk through the second loop's operation, line by line:
for k, v in [(key: 1, value: 6), (key: 2, value: 2)]:
    # On our first iteration, so k = 1, v = 6.
    if 6 > 0:                     # 1 operation
        for i in range(6):        # 6 operations
            output_list.append(1) # 6 operations

# For the (k, v) pair of (1, 6), the inner loop has run 6 times.
# Every time the inner loop runs, the length of output_list
# increases by 1.
# Second iteration of the outer loop:
for k, v in [(key: 1, value: 6), (key: 2, value: 2)]:
    # On our second iteration, so k = 2, v = 2.
    if 2 > 0:                     # 1 operation
        for i in range(2):        # 2 operations
            output_list.append(2) # 2 operations

# For the (k, v) pair of (2, 2), the inner loop has run 2 times.
# In total: 8 inner-loop iterations, and output_list has len 8.
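The walkthrough above can be instrumented to confirm the count (starting from an empty output_list, as the walkthrough does):

```python
d = {1: 6, 2: 2}  # the example dictionary from above
output_list = []
inner_iterations = 0
for k, v in d.items():
    if v > 0:
        for i in range(v):
            output_list.append(k)
            inner_iterations += 1  # one append per inner-loop iteration

print(inner_iterations)  # → 8
print(len(output_list))  # → 8
```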
In informal complexity analysis, the 'standard' rule of thumb is that the runtime of a doubly nested loop is usually quadratic. This is because we count the total number of inner-loop iterations of, for example,

for i in range(n):
    for j in range(n):

as

(n inner-loops per outer-loop) * (n outer-loops) = n^2 inner-loops
This assumption shouldn't be applied when the number of inner-loop iterations varies dramatically based on the state of the outer loop. In our example, the number of inner-loop iterations is v, which depends on the outer loop.
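To see how far off the rule of thumb can be here, we can total the actual inner-loop iterations for inputs with very different duplicate structure (a small sketch; the helper name is made up, and the dictionary is built with count-minus-one values as in the example):

```python
from collections import Counter

def inner_iteration_total(input_list):
    # Dictionary as above: each element's count, minus one.
    d = {k: c - 1 for k, c in Counter(input_list).items()}
    # The total inner-loop work is the sum of the v's: exactly n - m,
    # regardless of how the duplicates are distributed.
    return sum(v for v in d.values() if v > 0)

print(inner_iteration_total([1] * 10))                        # → 9  (n=10, m=1)
print(inner_iteration_total([1, 1, 2, 2, 3, 3, 4, 4, 5, 5]))  # → 5  (n=10, m=5)
```

In both cases the total stays below n, nowhere near the m * n a naive nested-loop estimate would suggest.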
To find the runtime here, we need a different way to count how many inner-loop iterations occur. Fortunately, we have one: on each inner-loop iteration, we append one element to output_list.
Since the final length of output_list is n, we know that the inner loop has executed at most n times (technically, it has executed exactly n-m times, since output_list already has size m after the earlier dictionary-initializing loop has terminated). Instead of incorrectly multiplying this by m, the number of outer-loop iterations, we should instead add the inner- and outer-loop iteration counts, for a runtime of Θ(n+m), which is Θ(n).
Addendum: Comments have correctly pointed out that Python dictionaries don't have an O(1) amortized worst-case lookup/insert time guarantee, so the first loop is, in the worst case, Ω(m*n). While Python uses pseudo-random probing on an open-addressing table, this only ensures good 'average' performance. Thanks to Kelly Bundy for the highly informative discussion and corrections.

Unfortunately, while O(1) amortized worst-case lookup/insert hashing is possible, for example with cuckoo hashing, in practice it is significantly slower on average than what's currently used in most standard libraries, and that is unlikely to change in the near future.