Suppose we have tick data like the following:

,timestamp,close,security_code,volume,bid_volume,ask_volume
2024-04-02 01:00:00.128123+00:00,2024-04-02 01:00:00.128123+00:00,18465.5,NQ,1,0,1
2024-04-02 01:00:00.128123+00:00,2024-04-02 01:00:00.128123+00:00,18465.5,NQ,1,0,1
2024-04-02 01:00:03.782064+00:00,2024-04-02 01:00:03.782064+00:00,18465.25,NQ,1,0,1
2024-04-02 01:00:04.112603+00:00,2024-04-02 01:00:04.112603+00:00,18465.0,NQ,1,0,1
2024-04-02 01:00:04.112603+00:00,2024-04-02 01:00:04.112603+00:00,18465.0,NQ,1,0,1
2024-04-02 01:00:04.112603+00:00,2024-04-02 01:00:04.112603+00:00,18464.75,NQ,1,0,1
2024-04-02 01:00:04.112603+00:00,2024-04-02 01:00:04.112603+00:00,18464.75,NQ,1,0,1
2024-04-02 01:00:05.759876+00:00,2024-04-02 01:00:05.759876+00:00,18464.5,NQ,1,0,1
2024-04-02 01:00:06.273686+00:00,2024-04-02 01:00:06.273686+00:00,18464.75,NQ,5,5,0

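The running (current) high, the running low, and the most frequent price, the point of control (POC), can then be computed as follows: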

import pandas as pd, matplotlib.pyplot as plt
from collections import defaultdict

df = pd.read_csv("csv/nq_out_daily.csv")
df.drop('Unnamed: 0', inplace=True, axis=1)

df['timestamp'] = pd.to_datetime(df['timestamp'])
df["timestamp"] = df['timestamp'].dt.strftime('%d-%m-%Y %H:%M:%S')

summary = {"high": [], "low": [], "poc": []}

# Running tick count per price level, used to find the most frequent price (POC).
dist = defaultdict(int)
current_high = current_low = None
for tick in df.close:
    # Track the running high and low seen so far.
    current_high = tick if (current_high is None or tick > current_high) else current_high
    current_low = tick if (current_low is None or tick < current_low) else current_low
    dist[tick] += 1

    summary["high"].append(current_high)
    summary["low"].append(current_low)
    # POC so far: the price level with the highest tick count.
    summary["poc"].append(max(dist, key=dist.get))


# plot the summary
fig = plt.figure()
x = range(len(summary["high"]))

plt.scatter(x, summary["high"], s=1)
plt.scatter(x, summary["low"], s=1)
plt.scatter(x, summary["poc"], s=1)

plt.legend(['high', 'low', 'poc'])
plt.savefig("distribution.png")
plt.close(fig)

This yields the following figure for today's data:

[Figure: plotted high, low and POC]

Now, how can I compute the first standard deviation above and below the POC?

Recommended answer

You don't need to iterate over your DataFrame to generate the data you want. You can simply use an expanding window, with mode from scipy.stats to find the most common value:

import matplotlib.pyplot as plt
from scipy.stats import mode

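# Expanding (cumulative) window: each row aggregates every tick up to and including it.
summary = df['close'].expanding().agg({'high': 'max', 'low': 'min', 'poc': lambda s: mode(s)[0]})
# Expanding standard deviation of the POC series itself.
summary['std'] = summary['poc'].expanding().std()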

Output for the sample data:

      high       low       poc       std
0  18465.5  18465.50  18465.50       NaN
1  18465.5  18465.50  18465.50  0.000000
2  18465.5  18465.25  18465.50  0.000000
3  18465.5  18465.00  18465.50  0.000000
4  18465.5  18465.00  18465.00  0.223607
5  18465.5  18464.75  18465.00  0.258199
6  18465.5  18464.75  18464.75  0.322749
7  18465.5  18464.50  18464.75  0.347183
8  18465.5  18464.50  18464.75  0.356000
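To make the expanding calculation concrete, here is a dependency-free sketch that recomputes the `poc` and `std` columns for the nine sample prices and then derives bands one standard deviation above and below the POC. The band definition (`poc ± std`) and the names `expanding_summary`, `upper` and `lower` are assumptions for illustration, not part of the original answer. Ties are resolved to the smallest price, which is what the `poc` column in the table above reflects.

```python
from collections import Counter
from statistics import stdev

# The nine "close" values from the sample data.
closes = [18465.5, 18465.5, 18465.25, 18465.0, 18465.0,
          18464.75, 18464.75, 18464.5, 18464.75]


def expanding_summary(prices):
    """Expanding high/low/POC, std of the POC series, and assumed bands."""
    counts = Counter()
    pocs = []
    rows = []
    for price in prices:
        counts[price] += 1
        top = max(counts.values())
        # Smallest price among those with the highest count (tie rule).
        poc = min(p for p, c in counts.items() if c == top)
        pocs.append(poc)
        # Sample standard deviation (ddof=1) needs at least two values.
        std = stdev(pocs) if len(pocs) > 1 else None
        rows.append({
            "high": max(counts),   # largest price seen so far
            "low": min(counts),    # smallest price seen so far
            "poc": poc,
            "std": std,
            # Assumed band definition: one std above/below the POC.
            "upper": None if std is None else poc + std,
            "lower": None if std is None else poc - std,
        })
    return rows


rows = expanding_summary(closes)
for row in rows:
    print(row)
```

Note that `std` here, as in the answer, measures how much the POC itself has moved over time. If the spread of traded prices around the POC is what you need, the deviation would have to be computed from the prices instead.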

Plotting:

for col in ['high', 'low', 'poc']:
    plt.scatter(summary.index, summary[col].values, s=1)
plt.legend(['high', 'low', 'poc'])
