我假设"绝对误差和"足以代替SSE作为解决方案的质量衡量标准.通过这样做,模型可以很容易地线性化,我认为使用MIP解算器比使用二次求解器更容易处理.请注意,assume以下的两个示例都有一个good(在本例中实际上是精确的)解.如果矩阵基本上是"噪声"的,而您正在寻找最佳匹配,那么几乎可以肯定的是,解算器将更加费力,因为问题可能更退化.
在看解决方案之前,我认为你的A
以上的建造有问题.做一个小的m x n矩阵,看看它是不是你想要的.我得到了很多重复的值,都是正值,所以完全有可能,对于一个b
向量,x向量is the optimal solution的"全零"应该是~0的平均值.我改变了一些关于位置/负值和 struct 的东西,以获得你想要的密度.
如果你把我的cvxpy
shell 换回SSE,我很好奇你会得到什么样的性能与线性相比.我找不到合适的解算器来解决这个问题,因为我不经常使用cvxpy
.
对于这种可以很容易地用矩阵格式表示的问题,cvxpy
是很自然的.为了在解题之前达到相同的点,我在pyomo
中跳过了几个圈,但它工作得很好.在我8岁的时候.旧机器,pyomo
需要大约200秒来构建最大规模的问题(如您在结果中所见)
我下面的cvxpy
解决方案对于较小的实例很好,但对于20k x 20k左右的矩阵来说是"不可行的",所以可能有一些内在的东西我不明白,因为它显然是可行的.也许更了解cvxpy
的人可以修修补补,看看发生了什么.
小型矩阵的CVXPY:
import cvxpy as cp
import numpy as np
from time import time
#I tried to simulate my inputs for you.
#Here the problem manifests itself already at m, n = 1600, 4950.
np.random.seed(0)
m, n= 5_000, 5_000 # 20_000, 20_000 <--- barfs
A_base = (np.random.randint(-1000, 1000, (m, n)))
marks = np.random.binomial(n=1, p=(n*m)**0.5/(m*n), size=(m, n))
A = np.where(marks, A_base, 0)
true_x = np.random.binomial(n=1, p=0.6, size=n)
b = A @ true_x
tic = time()
x = cp.Variable(n, integer=True)
objective = cp.Minimize(cp.sum(cp.abs(A @ x - b)))
constraints = []
constraints.extend([x >= 0.0, x <= 1])
prob = cp.Problem(objective,constraints)
res=prob.solve() #solver=cp.SCIP,verbose=True,scip_params={"limits/time": 50})
toc = time()
print("Status: ", prob.status)
print("The optimal value is", prob.value)
print("A solution x is")
print(x.value)
print("Non-zero ",sum([i for i in x.value if i is not None ]))
print(f'solver time: {toc-tic : 0.1f} sec.')
CVXPY output:
Status: optimal
The optimal value is 0.0
A solution x is
[ 1. 1. 1. ... -0. 1. -0.]
Non-zero 1907.0
solver time: 3.5 sec.
[Finished in 5.5s]
全尺寸矩阵的PYOMO
import numpy as np
import pyomo.environ as pyo
from time import time
import sys
np.random.seed(0)
m, n = 28_680, 62_500
A_base = (np.random.randint(-1000, 1000, (m, n)))
marks = np.random.binomial(n=1, p=(n*m)**0.5/(m*n), size=(m, n))
A = np.where(marks, A_base, 0)
true_x = np.random.binomial(n=1, p=0.6, size=n)
b = A @ true_x
# b = np.random.randint(-100, 100, n)
# print(b)
tic = time()
model = pyo.ConcreteModel()
model.N = pyo.Set(initialize=range(n))
model.M = pyo.Set(initialize=range(m))
# initializing x only because some values of x[n] may end up unconstrained and this prevents "None" in result
model.x = pyo.Var(model.N, within=pyo.Binary, initialize=0)
model.row_err = pyo.Var(model.M)
model.row_abs_err = pyo.Var(model.M)
# constrain the rowsum/error appropriately
@model.Constraint(model.M)
def rowsum(model, m):
return model.row_err[m] == sum(A[m,n] * model.x[n] for n in model.N if A[m,n]) - b[m]
# constrain the abs error
@model.Constraint(model.M)
def abs_1(model, m):
return model.row_abs_err[m] >= model.row_err[m]
@model.Constraint(model.M)
def abs_2(model, m):
return model.row_abs_err[m] >= -model.row_err[m]
# minimize sum of absolute row errors
model.obj = pyo.Objective(expr=sum(model.row_abs_err[m] for m in model.M))
toc = time()
print(f'starting solver... at time: {toc-tic: 0.1f} sec.')
solver = pyo.SolverFactory('cbc')
res = solver.solve(model)
toc = time()
print(f'total solve time: {toc-tic: 0.1f} sec')
print(res)
# model.x.display()
x = np.array([pyo.value(model.x[n]) for n in model.N], dtype=int)
print(f'verifying solution: {np.sum(A@x - b)}')
PYOMO output:
starting solver... at time: 228.2 sec.
solver time: 384.8 sec
Problem:
- Name: unknown
Lower bound: 0.0
Upper bound: 0.0
Number of objectives: 1
Number of constraints: 40941
Number of variables: 50488
Number of binary variables: 30839
Number of integer variables: 30839
Number of nonzeros: 20949
Sense: minimize
Solver:
- Status: ok
User time: -1.0
System time: 123.18
Wallclock time: 152.91
Termination condition: optimal
Termination message: Model was solved to optimality (subject to tolerances), and an optimal solution is available.
Statistics:
Branch and bound:
Number of bounded subproblems: 21
Number of created subproblems: 21
Black box:
Number of iterations: 2668
Error rc: 0
Time: 153.19930219650269
Solution:
- number of solutions: 0
number of solutions displayed: 0
verifying solution: 0