我正在对三维ndarray(KxMxN)进行行缩减,即取一列的所有值,并使用Reduce函数产生标量值;最终,KxMxN矩阵将变成KxN阶的二维ndarray.
我想要解决的问题是:
- 取一个3D矩阵并将其分成16个子矩阵(或任何大于2的偶数个分裂,16是最佳的);
- 通过从16块中 Select 一半的子矩阵(8)并将这8个子矩阵连接到一个矩阵中来生成所有组合;因此每个组合都有一个矩阵;
- 为新矩阵的每一列,为所有组合计算一个标量值,最好是一次计算一个标量值.
还有更多的实现细节,我将在此过程中解释.
3-D ndarray是浮点数.
在下面的例子中,njit
和numpy
是我目前所能得到的最好的结果.我想知道从任何Angular 看,是否还有进一步改进的空间.
cupy
(GPU并行化)、dask
(CPU并行化)或numba
并行化都未能击败下面的(我的用例显然太微不足道了,无法利用GPU的能力,而且我只有8G GPU).这些工具很有可能被用在更高级的方式上,这是我不知道的.
from numba import njit, guvectorize, float64, int64
from math import sqrt
import numba as nb
import numpy as np
import itertools
# Create a 2D ndarray
m = np.random.rand(800,100)
# Reshape it into a list of sub-matrices
mr = m.reshape(16,50,100)
# Create an indices matrix from combinatorics
# a typical one for me "select 8 from 16", 12870 combinations
# I do have a custom combination generator, but this is not what I wanted to optimise and itertools really has done a decent job already.
x = np.array( list(itertools.combinations(np.arange(16),8)) )
# Now we are going to select 8 sub-matrices from `mr` and reshape them to become one bigger sub-matrix; we do this in list comprehension.
# This is the matrix we are going to reduce.
# Bottleneck 1: This line takes the longest and I'd hope to improve on this line, but I am not sure there's much we could do here.
m3d = np.array([mr[idx_arr].reshape(400,100) for idx_arr in x])
# We create different versions of the same reduce function.
# Bottleneck 2: The reduce function is another place I'd want to improve on.
# col - column values
# days - trading days in a year
# rf - risk free rate
# njit version with instance function `mean`, `std`, and python `sqrt`
@njit
def nb_sr(col, days, rf):
mean = (col.mean() * days) - rf
std = col.std() * sqrt(days)
return mean / std
# njit version with numpy
@njit
def nb_sr_np(col, days, rf):
mean = (np.mean(col) * days) -rf
std = np.std(col) * np.sqrt(days)
return mean / std
# guvectorize with numpy
@guvectorize([(float64[:],int64,float64,float64[:])], '(n),(),()->()', nopython=True)
def gu_sr_np(col,days,rf,res):
mean = (np.mean(col) * days) - rf
std = np.std(col) * np.sqrt(days)
res[0] = mean / std
# We wrap them such that they can be applied on 2-D matrix with list comprehension.
# Bottleneck 3: I was thinking to probably vectorize this wrapper, but the closest I can get is list comprehension, which isn't really vectorization.
def nb_sr_wrapper(m2d):
return [nb_sr(r, 252, .25) for r in m2d.T]
def nb_sr_np_wrapper(m2d):
return [nb_sr_np(r, 252, .25) for r in m2d.T]
def gu_sr_np_wrapper(m2d):
return [gu_sr_np(r, 252, .25) for r in m2d.T]
# Finally! here's our performance benchmarking step.
%timeit np.array( [nb_sr_wrapper(m) for m in m3d.T] )
# output: 4.26 s ± 3.67 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit np.array( [nb_sr_np_wrapper(m) for m in m3d.T] )
# output: 4.33 s ± 26.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit np.array( [gu_sr_np_wrapper(m) for m in m3d.T] )
# output: 6.06 s ± 11.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)