Python: how to trim and match columns to get common time values and their corresponding values

Posted on November 21

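The devices (two oscilloscopes) give data with different numbers of points, and both are triggered at the same instant (0 in the time column). (I had no chance to find identical settings.)

What I want to achieve is to align the times with their corresponding values (interpolating where a sample is missing; the data here does not change quickly).

How can I trim and match the data across the two DataFrames?

As an example, I prepared two data sets: one from Scope 1 (more data points) and one from Scope 2 (fewer data points). This is only an example; in reality I get 20K samples from Scope 1 and 10K from the other.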

import pandas as pd

# Scope 1: faster scope, more data points
scope1Data = pd.DataFrame({
    'TIME': [-1, -0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1],
    'V': [-0.841470985, -0.78332691, -0.717356091, -0.644217687, -0.564642473, -0.479425539, -0.389418342, -0.295520207, -0.198669331, -0.099833417, 0, 0.099833417, 0.198669331, 0.295520207, 0.389418342, 0.479425539, 0.564642473, 0.644217687, 0.717356091, 0.78332691, 0.841470985],
})

# Scope 2: slower scope, fewer data points
scope2Data = pd.DataFrame({
    'TIME': [-1.05, -0.9, -0.75, -0.6, -0.45, -0.3, -0.15, 0, 0.15, 0.3, 0.45, 0.6, 0.75, 0.9, 1.05],
    'I': [-0.887362369, -0.946300088, -0.983985947, -0.999573603, -0.992712991, -0.963558185, -0.91276394, -0.841470985, -0.751280405, -0.644217687, -0.522687229, -0.389418342, -0.247403959, -0.099833417, 0.049979169],
})

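Preferably, the result starts from time zero (which is somewhere in the middle) and matches the Scope 1 times to the Scope 2 times. Missing values may be extrapolated, or I could change the number of data points from the faster scope (Scope 1). Extra values from the faster scope may be dropped. Put simply, the time data from -1.05 to 1.05 would be enough, and the rest may be cut.

Also, the TIME column from scope2 is no longer needed.

I am not expecting a complete answer, though the more the better; just naming the procedure would be enough.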

The desired output format could be:

 combinedData = pd.DataFrame({
 'TIME': [-0.9, -0.75, -0.6, -0.45, -0.3, -0.15, 0, 0.15, 0.3, 0.45, 0.6, 0.75, 0.9],
 'V': [...],  # corresponding values
 'I': [...],  # corresponding values, interpolated if not available from the data
 })

Recommended answer

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# The data
scope1Data = pd.DataFrame({
    'TIME': [-1, -0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1],
    'V': [-0.841470985, -0.78332691, -0.717356091, -0.644217687, -0.564642473, -0.479425539, -0.389418342, -0.295520207, -0.198669331, -0.099833417, 0, 0.099833417, 0.198669331, 0.295520207, 0.389418342, 0.479425539, 0.564642473, 0.644217687, 0.717356091, 0.78332691, 0.841470985],
})

scope2Data = pd.DataFrame({
    'TIME': [-1.05, -0.9, -0.75, -0.6, -0.45, -0.3, -0.15, 0, 0.15, 0.3, 0.45, 0.6, 0.75, 0.9, 1.05],
    'I': [-0.887362369, -0.946300088, -0.983985947, -0.999573603, -0.992712991, -0.963558185, -0.91276394, -0.841470985, -0.751280405, -0.644217687, -0.522687229, -0.389418342, -0.247403959, -0.099833417, 0.049979169],
})

# Interpolate scope2 data onto the time axis of scope1
scope2_interpolated = np.interp(x=scope1Data['TIME'],
                                xp=scope2Data['TIME'],
                                fp=scope2Data['I'])

# Put it in a dataframe
scope2_interpolated = pd.DataFrame({
    'TIME': scope1Data['TIME'],
    'I_scope2_interp': scope2_interpolated,
})

# New dataframe that has both scope1 and the interpolated scope2
combined_data = pd.merge(scope1Data, scope2_interpolated, on='TIME')

# Optionally rename columns
combined_data = combined_data.rename(columns={'TIME': 'TIME_scope1', 'V': 'V_scope1'})

# As a sanity check, visualise the results.
# Interpolated should overlay with the original scope2 data
f, ax1 = plt.subplots(figsize=(7, 3))
ax1.plot(combined_data.TIME_scope1, combined_data.V_scope1, linewidth=2, label='V [scope1]')
ax1.set_xlabel('time (ms)')
ax1.set_ylabel('V (mV)')
ax1.legend()
ax1.grid(axis='x', which='both')
ax1.set_title('Original and interpolated scope data')

ax2 = ax1.twinx()
ax2.plot(scope2Data.TIME, scope2Data.I, label='original I [scope2]',
         color='tab:red', marker='d', linestyle='')
ax2.plot(combined_data.TIME_scope1, combined_data.I_scope2_interp,
         color='tab:red', linestyle=':', label='interpolated I [scope2]')
ax2.set_ylabel('I (mA)')
ax2.legend(loc='lower right')

# Rather than trimming the data, you can trim the
# plot limits if you only want to see a smaller range
x_limits = [-1.2, 1.2]
ax1.set_xlim(x_limits)
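The answer's code interpolates Scope 2 onto Scope 1's time axis. If the layout requested in the question is preferred instead (Scope 2's time axis, trimmed to the range both scopes cover), a minimal sketch could look like the following. It assumes linear interpolation is adequate because the signals change slowly, and that both TIME columns are sorted in increasing order, as `np.interp` expects. The names `t_min`, `t_max` and `in_overlap` are illustrative and not part of the original post.

```python
import numpy as np
import pandas as pd

# Sample data from the question
scope1Data = pd.DataFrame({
    'TIME': [-1, -0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0,
             0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1],
    'V': [-0.841470985, -0.78332691, -0.717356091, -0.644217687, -0.564642473,
          -0.479425539, -0.389418342, -0.295520207, -0.198669331, -0.099833417, 0,
          0.099833417, 0.198669331, 0.295520207, 0.389418342, 0.479425539,
          0.564642473, 0.644217687, 0.717356091, 0.78332691, 0.841470985],
})
scope2Data = pd.DataFrame({
    'TIME': [-1.05, -0.9, -0.75, -0.6, -0.45, -0.3, -0.15, 0,
             0.15, 0.3, 0.45, 0.6, 0.75, 0.9, 1.05],
    'I': [-0.887362369, -0.946300088, -0.983985947, -0.999573603, -0.992712991,
          -0.963558185, -0.91276394, -0.841470985, -0.751280405, -0.644217687,
          -0.522687229, -0.389418342, -0.247403959, -0.099833417, 0.049979169],
})

# Time range covered by both scopes (illustrative names)
t_min = max(scope1Data['TIME'].min(), scope2Data['TIME'].min())
t_max = min(scope1Data['TIME'].max(), scope2Data['TIME'].max())

# Trim: keep only the Scope 2 samples that fall inside the common range
in_overlap = scope2Data['TIME'].between(t_min, t_max)
combinedData = scope2Data.loc[in_overlap, ['TIME', 'I']].reset_index(drop=True)

# Match: linearly interpolate V from Scope 1 onto the remaining Scope 2 times
combinedData['V'] = np.interp(combinedData['TIME'],
                              scope1Data['TIME'],
                              scope1Data['V'])

# Column order as requested in the question
combinedData = combinedData[['TIME', 'V', 'I']]
print(combinedData)
```

Trimming before interpolating avoids extrapolation altogether: outside the sampled range `np.interp` simply repeats the first or last value, which would hide the fact that no Scope 1 data exists there.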
