Python: How to set up a SciPy interpolator to preserve data as accurately as possible

Posted on October 19

This is an (x, y) plot of a car's position, sampled every 0.1 seconds. There are about 500 points in total.

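I have read other solutions about interpolating with SciPy (here and here), but it seems that by default SciPy interpolates at evenly spaced intervals. Here is my current code: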

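import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from scipy.interpolate import interp1d


def reduce_dataset(x_list, y_list, num_interpolation_points):
    # Stack the coordinates into an (N, 2) array of points.
    points = np.array([x_list, y_list]).T
    # Cumulative arc length along the path, normalized to [0, 1].
    distance = np.cumsum(np.sqrt(np.sum(np.diff(points, axis=0)**2, axis=1)))
    distance = np.insert(distance, 0, 0) / distance[-1]
    # Interpolate both coordinates as functions of normalized arc length.
    interpolator = interp1d(distance, points, kind='quadratic', axis=0)
    # Evaluate at evenly spaced arc-length values.
    new_xs, new_ys = interpolator(np.linspace(0, 1, num_interpolation_points)).T.tolist()
    return new_xs, new_ys


# xs and ys hold the original position samples.
xs, ys = reduce_dataset(xs, ys, 50)
colors = cm.rainbow(np.linspace(0, 1, len(ys)))
# Color each point by its order along the path.
for x, y, c in zip(xs, ys, colors):
    plt.scatter(x, y, color=c)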

It produces the following output:

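This is decent, but I would like to set up the interpolator so that it tries to place more points where the data is hardest to interpolate linearly, and fewer points in regions that can easily be reconstructed with an interpolated line.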

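Note that in the second image, the last point seems to suddenly "jump" away from the previous one. The middle section seems somewhat redundant, since many of those points lie on a perfectly straight line. For something that will be reconstructed as accurately as possible using linear interpolation, this is not the most efficient use of 50 points.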
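One way to make "most efficient use of 50 points" measurable is to rebuild the full path from the reduced points with linear interpolation and record the largest deviation from the original samples. The sketch below is only an illustration: the helper names `arc_length_parameter` and `max_linear_reconstruction_error` are hypothetical, and it assumes consecutive samples are distinct so that the arc-length parameter is strictly increasing.

```python
import numpy as np


def arc_length_parameter(points):
    """Normalized cumulative arc length along an (N, 2) array of points."""
    segment_lengths = np.linalg.norm(np.diff(points, axis=0), axis=1)
    distance = np.concatenate([[0.0], np.cumsum(segment_lengths)])
    return distance / distance[-1]


def max_linear_reconstruction_error(points, keep_idx):
    """Largest distance between the original points and their piecewise-linear
    reconstruction from the subset points[keep_idx] (keep_idx must be sorted)."""
    points = np.asarray(points, dtype=float)
    keep_idx = np.asarray(keep_idx)
    t = arc_length_parameter(points)
    # Rebuild x and y at every original parameter value from the kept points.
    rebuilt_x = np.interp(t, t[keep_idx], points[keep_idx, 0])
    rebuilt_y = np.interp(t, t[keep_idx], points[keep_idx, 1])
    deviation = np.hypot(rebuilt_x - points[:, 0], rebuilt_y - points[:, 1])
    return float(deviation.max())
```

With a metric like this, two candidate sets of 50 indices can be compared directly: the set with the smaller maximum deviation reconstructs the path better under linear interpolation.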

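I made the following by hand, but I am looking for something like it, where the algorithm is smart enough to place points very densely wherever the data changes nonlinearly: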

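This way, the data can be interpolated with higher accuracy. The large gaps between points in this figure can be interpolated very accurately with a simple line, while the dense clusters need more frequent sampling. I have read the interpolator docs on SciPy, but I cannot find any generator or setting that does this.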

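I have also tried "linear" and "cubic" interpolation, but it still seems to sample at evenly spaced intervals rather than grouping the points where they are most needed.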
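A note on why changing `kind` makes no difference here: an `interp1d` object is evaluated wherever it is asked to be, so the even spacing comes from the `np.linspace(0, 1, ...)` query array rather than from the interpolator. The minimal sketch below uses made-up data (a parabola) and a hand-picked, purely illustrative `clustered_t` array to show the same interpolator returning evenly spaced or clustered samples.

```python
import numpy as np
from scipy.interpolate import interp1d

# Made-up path for illustration: parameter t in [0, 1], points (t, t**2).
t = np.linspace(0.0, 1.0, 11)
points = np.column_stack([t, t**2])
interpolator = interp1d(t, points, kind="linear", axis=0)

# Evenly spaced query values give evenly spaced samples...
uniform_t = np.linspace(0.0, 1.0, 5)
uniform_pts = interpolator(uniform_t)

# ...while any other increasing array inside [0, 1] clusters them instead.
clustered_t = np.array([0.0, 0.6, 0.8, 0.9, 1.0])
clustered_pts = interpolator(clustered_t)
```

So the real question becomes how to choose the query values (or which original samples to keep), not which interpolation kind to use.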

Is this something SciPy can do, or should I use something like an SKLearn ML algorithm for a job like this?
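For what it is worth, the behavior described above can be approximated with plain NumPy and no machine learning. The sketch below is a greedy selection in the spirit of Ramer-Douglas-Peucker line simplification, adapted to a fixed point budget: start with the two endpoints and repeatedly keep the original sample that the current piecewise-linear reconstruction misses by the most. It is not a built-in SciPy feature; the function name `select_points_greedy` is hypothetical, and the sketch assumes the path is parameterized by cumulative arc length with distinct consecutive samples.

```python
import numpy as np


def select_points_greedy(x, y, num_points):
    """Return sorted indices of up to num_points samples chosen so that linear
    interpolation between them tracks the full path closely (greedy heuristic)."""
    points = np.column_stack([x, y]).astype(float)
    segment_lengths = np.linalg.norm(np.diff(points, axis=0), axis=1)
    t = np.concatenate([[0.0], np.cumsum(segment_lengths)])

    keep = [0, len(points) - 1]  # always keep both endpoints
    while len(keep) < min(num_points, len(points)):
        # Reconstruct the whole path from the points kept so far.
        rebuilt_x = np.interp(t, t[keep], points[keep, 0])
        rebuilt_y = np.interp(t, t[keep], points[keep, 1])
        error = np.hypot(rebuilt_x - points[:, 0], rebuilt_y - points[:, 1])
        error[keep] = 0.0  # never pick an already-kept sample again
        worst = int(np.argmax(error))
        if error[worst] == 0.0:
            break  # the reconstruction is already exact
        keep.append(worst)
        keep.sort()
    return np.array(keep)
```

Called as `idx = select_points_greedy(xs, ys, 50)` on the original samples, `np.asarray(xs)[idx]` and `np.asarray(ys)[idx]` would give the reduced data set. Straight stretches end up with few points and sharp bends with many, at the cost of one pass over the data per selected point.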

import numpy as np
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt

# Generate fake data
x = np.linspace(1, 3, 1000)
y = (x - 2)**3

# interpolation
interpolator = interp1d(x, y)

# different xnews
N = 20
xnew_linspace = np.linspace(x.min(), x.max(), N)  # linearly spaced
xnew_logspace = np.logspace(np.log10(x.min()), np.log10(x.max()), N)  # log spaced

# spacing based on curvature
gradient = np.gradient(y, x)
second_gradient = np.gradient(gradient, x)
curvature = np.abs(second_gradient) / (1 + gradient**2)**(3 / 2)

idx = np.round(np.linspace(0, len(curvature) - 1, N)).astype(int)
epsilon = 1e-1

a = (0.99 * x.max() - x.min()) / np.sum(1 / (curvature[idx] + epsilon))
xnew_curvature = np.insert(x.min() + np.cumsum(a / (curvature[idx] + epsilon)), 0, x.min())

fig, axarr = plt.subplots(2, 2, layout='constrained', sharex=True, sharey=True)

axarr[0, 0].plot(x, y)
for ax, xnew in zip(axarr.flatten()[1:], [xnew_linspace, xnew_logspace, xnew_curvature]):
    ax.plot(xnew, interpolator(xnew), '.--')

axarr[0, 0].set_title('base signal')
axarr[0, 1].set_title('linearly spaced')
axarr[1, 0].set_title('log spaced')
axarr[1, 1].set_title('curvature based spaced')

plt.savefig('test_interp1d.png', dpi=400)
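Read as math, the curvature-based spacing in the code above appears to work as follows (this is an interpretation of the code, not something taken from the SciPy documentation). The curvature of the graph $y = f(x)$ is estimated at $N$ evenly spaced sample positions $x_i$, and each step is made inversely proportional to the local curvature, with $\varepsilon$ keeping the steps finite where the curve is nearly straight:

```latex
\kappa_i = \frac{|y''(x_i)|}{\left(1 + y'(x_i)^2\right)^{3/2}}, \qquad
\Delta x_i = \frac{a}{\kappa_i + \varepsilon}, \qquad
a = \frac{0.99\, x_{\max} - x_{\min}}{\sum_{j=1}^{N} \dfrac{1}{\kappa_j + \varepsilon}}
```

The new sample positions are $x_{\min}$ followed by the running sums $x_{\min} + \sum_{i \le k} \Delta x_i$, so there are $N + 1$ of them and the last one lands at $0.99\, x_{\max}$, just inside the interpolation range. Note that this form of the curvature assumes $y$ is a function of $x$; a path that doubles back in $x$, as a vehicle track may, would need a parametric version of the same idea.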
