Python: calculating the standard deviation of a distribution

Posted on April 3

Suppose we have tick data like the following:

,timestamp,close,security_code,volume,bid_volume,ask_volume
2024-04-02 01:00:00.128123+00:00,2024-04-02 01:00:00.128123+00:00,18465.5,NQ,1,0,1
2024-04-02 01:00:00.128123+00:00,2024-04-02 01:00:00.128123+00:00,18465.5,NQ,1,0,1
2024-04-02 01:00:03.782064+00:00,2024-04-02 01:00:03.782064+00:00,18465.25,NQ,1,0,1
2024-04-02 01:00:04.112603+00:00,2024-04-02 01:00:04.112603+00:00,18465.0,NQ,1,0,1
2024-04-02 01:00:04.112603+00:00,2024-04-02 01:00:04.112603+00:00,18465.0,NQ,1,0,1
2024-04-02 01:00:04.112603+00:00,2024-04-02 01:00:04.112603+00:00,18464.75,NQ,1,0,1
2024-04-02 01:00:04.112603+00:00,2024-04-02 01:00:04.112603+00:00,18464.75,NQ,1,0,1
2024-04-02 01:00:05.759876+00:00,2024-04-02 01:00:05.759876+00:00,18464.5,NQ,1,0,1
2024-04-02 01:00:06.273686+00:00,2024-04-02 01:00:06.273686+00:00,18464.75,NQ,5,5,0

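The running high, the running low, and the point of control (POC, the most frequently traded price so far) can then be computed as follows: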

import pandas as pd
import matplotlib.pyplot as plt
from collections import defaultdict

df = pd.read_csv("csv/nq_out_daily.csv")
df.drop('Unnamed: 0', inplace=True, axis=1)

df['timestamp'] = pd.to_datetime(df['timestamp'])
df["timestamp"] = df['timestamp'].dt.strftime('%d-%m-%Y %H:%M:%S')

summary = {"high": [], "low": [], "poc": []}

dist = defaultdict(float)
current_high = current_low = None
# Only the traded price is needed here; count every tick once.
for tick in df.close:

    current_high = tick if (current_high is None or tick > current_high) else current_high
    current_low = tick if (current_low is None or tick < current_low) else current_low
    dist[tick] += 1

    summary["high"].append(current_high)
    summary["low"].append(current_low)
    summary["poc"].append(max(dist, key=dist.get))


# plot the summary
fig = plt.figure()
x = range(len(summary["high"]))

plt.scatter(x, summary["high"], s=1)
plt.scatter(x, summary["low"], s=1)
plt.scatter(x, summary["poc"], s=1)

plt.legend(['high', 'low', 'poc'])
plt.savefig("distribution.png")
plt.close(fig)

This yields the following figure for today's data.
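For reference, the same running statistics can be sketched without tracking the high and low by hand. This is a minimal sketch that assumes only the nine `close` prices from the sample listing (typed in directly rather than read from the CSV). It uses `cummax` and `cummin` for the running high and low, and keeps the loop's tie-breaking for the POC, where the price seen first wins.

```python
import pandas as pd
from collections import Counter

# The nine 'close' prices from the sample listing, in order.
close = pd.Series([18465.5, 18465.5, 18465.25, 18465.0, 18465.0,
                   18464.75, 18464.75, 18464.5, 18464.75])

# Running high and low as cumulative max/min.
running_high = close.cummax()
running_low = close.cummin()

# Running POC: the price with the most ticks so far.
# On ties, max() returns the price that was inserted first.
counts = Counter()
poc_values = []
for price in close:
    counts[price] += 1
    poc_values.append(max(counts, key=counts.get))
running_poc = pd.Series(poc_values, index=close.index)

summary = pd.DataFrame({"high": running_high,
                        "low": running_low,
                        "poc": running_poc})
print(summary)
```

The tie-breaking matters on short samples: with this rule the POC stays at the first price until another price strictly overtakes it.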

Now, how can I compute the first standard deviation above and below the POC?

import matplotlib.pyplot as plt
from scipy.stats import mode

summary = df['close'].expanding().agg({'high' : 'max', 'low' : 'min', 'poc': lambda s:mode(s)[0]})
summary['std'] = summary['poc'].expanding().std()

      high       low       poc       std
0  18465.5  18465.50  18465.50       NaN
1  18465.5  18465.50  18465.50  0.000000
2  18465.5  18465.25  18465.50  0.000000
3  18465.5  18465.00  18465.50  0.000000
4  18465.5  18465.00  18465.00  0.223607
5  18465.5  18464.75  18465.00  0.258199
6  18465.5  18464.75  18464.75  0.322749
7  18465.5  18464.50  18464.75  0.347183
8  18465.5  18464.50  18464.75  0.356000
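Note that the `std` column above is the expanding standard deviation of the POC series itself. If what is wanted is instead a band one standard deviation of the traded prices above and below the POC, one possible reading of the question, it could be sketched as below. This is an assumption about the intent, not part of the answer above. The `upper` and `lower` column names are made up for illustration, and the sample prices are typed in from the listing.

```python
import pandas as pd

# The nine 'close' prices from the sample listing, in order.
close = pd.Series([18465.5, 18465.5, 18465.25, 18465.0, 18465.0,
                   18464.75, 18464.75, 18464.5, 18464.75])

# Running POC: Series.mode() returns the modes sorted ascending,
# so ties are broken toward the smallest price.
poc = close.expanding().apply(lambda s: s.mode().iloc[0])

# Running sample standard deviation (ddof=1) of all prices seen so far;
# the first value is NaN because one observation has no spread.
std = close.expanding().std()

bands = pd.DataFrame({
    "poc": poc,
    "std": std,
    "upper": poc + std,  # hypothetical name: POC plus one std
    "lower": poc - std,  # hypothetical name: POC minus one std
})
print(bands)
```

Each tick is weighted equally here. Weighting by the `volume` column would be a different calculation and is not shown.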
