Introduction
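I am using Python pandas to backtest my own strategies on locally stored market data. Since I want to backtest these strategies quickly and the data is large (7+ million rows), I am trying to vectorize all operations. For the entry signal determination this is already the case, and it works quite well. As exit criteria for each entry, take profit and stop loss price thresholds are used. That is, given the following DataFrame with a datetime index: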
import pandas as pd
from pandas import Timestamp
import numpy as np
df = pd.DataFrame({
'open': {Timestamp('2021-01-03 22:11:00'): 1.22319, Timestamp('2021-01-03 22:12:00'): 1.22315, Timestamp('2021-01-03 22:15:00'): 1.22324, Timestamp('2021-01-03 22:16:00'): 1.22355, Timestamp('2021-01-03 22:17:00'): 1.22357},
'high': {Timestamp('2021-01-03 22:11:00'): 1.22319, Timestamp('2021-01-03 22:12:00'): 1.22318, Timestamp('2021-01-03 22:15:00'): 1.22358, Timestamp('2021-01-03 22:16:00'): 1.2236, Timestamp('2021-01-03 22:17:00'): 1.22361},
'low': {Timestamp('2021-01-03 22:11:00'): 1.22317, Timestamp('2021-01-03 22:12:00'): 1.22315, Timestamp('2021-01-03 22:15:00'): 1.22324, Timestamp('2021-01-03 22:16:00'): 1.22352, Timestamp('2021-01-03 22:17:00'): 1.22355},
'close': {Timestamp('2021-01-03 22:11:00'): 1.22317, Timestamp('2021-01-03 22:12:00'): 1.22315, Timestamp('2021-01-03 22:15:00'): 1.22358, Timestamp('2021-01-03 22:16:00'): 1.22352, Timestamp('2021-01-03 22:17:00'): 1.22356},
'longEntrySignal': {Timestamp('2021-01-03 22:11:00'): False, Timestamp('2021-01-03 22:12:00'): False, Timestamp('2021-01-03 22:15:00'): True, Timestamp('2021-01-03 22:16:00'): False, Timestamp('2021-01-03 22:17:00'): False},
'longEntry': {Timestamp('2021-01-03 22:11:00'): False, Timestamp('2021-01-03 22:12:00'): False, Timestamp('2021-01-03 22:15:00'): False, Timestamp('2021-01-03 22:16:00'): True, Timestamp('2021-01-03 22:17:00'): False},
'longEntryPrice': {Timestamp('2021-01-03 22:11:00'): np.nan, Timestamp('2021-01-03 22:12:00'): np.nan, Timestamp('2021-01-03 22:15:00'): np.nan, Timestamp('2021-01-03 22:16:00'): 1.22355, Timestamp('2021-01-03 22:17:00'): np.nan},
'longTpPrice': {Timestamp('2021-01-03 22:11:00'): np.nan, Timestamp('2021-01-03 22:12:00'): np.nan, Timestamp('2021-01-03 22:15:00'): np.nan, Timestamp('2021-01-03 22:16:00'): 1.2243451663854852, Timestamp('2021-01-03 22:17:00'): np.nan},
'longSlPrice': {Timestamp('2021-01-03 22:11:00'): np.nan, Timestamp('2021-01-03 22:12:00'): np.nan, Timestamp('2021-01-03 22:15:00'): np.nan, Timestamp('2021-01-03 22:16:00'): 1.2227548336145146, Timestamp('2021-01-03 22:17:00'): np.nan}})
print(df)
                        open     high      low    close  longEntrySignal  longEntry  longEntryPrice  longTpPrice  longSlPrice
2021-01-03 22:11:00  1.22319  1.22319  1.22317  1.22317            False      False             NaN          NaN          NaN
2021-01-03 22:12:00  1.22315  1.22318  1.22315  1.22315            False      False             NaN          NaN          NaN
2021-01-03 22:15:00  1.22324  1.22358  1.22324  1.22358             True      False             NaN          NaN          NaN
2021-01-03 22:16:00  1.22355  1.22360  1.22352  1.22352            False       True         1.22355     1.224345     1.222755
2021-01-03 22:17:00  1.22357  1.22361  1.22355  1.22356            False      False             NaN          NaN          NaN
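longEntrySignal indicates with True and False whether a signal was given to open a long position within the next candle. longEntry marks the candle in which the position is opened with True, and longEntryPrice holds that candle's open as the entry price. longTpPrice and longSlPrice are the respective prices at which the open position should be closed, depending on whether the take profit or the stop loss criterion is reached.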
Desired output
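Depending on the thresholds chosen for take profit and stop loss, there may well be multiple open positions at once, each with a different entry point (time) and different take profit and stop loss thresholds.
In any case, my question is how to calculate the closing of a position after its entry. The minimum information needed to validate the strategy's performance afterwards is the exitPrice and exitTime columns.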
                        open     high      low    close  longEntrySignal  longEntry  longEntryPrice  longTpPrice  longSlPrice  exitPrice             exitTime
2021-01-03 22:11:00  1.22319  1.22319  1.22317  1.22317            False      False             NaN          NaN          NaN        NaN                  NaN
2021-01-03 22:12:00  1.22315  1.22318  1.22315  1.22315            False      False             NaN          NaN          NaN        NaN                  NaN
2021-01-03 22:15:00  1.22324  1.22358  1.22324  1.22358             True      False             NaN          NaN          NaN        NaN                  NaN
2021-01-03 22:16:00  1.22355  1.22360  1.22352  1.22352            False       True         1.22355     1.224345     1.222755   1.224345  2021-01-03 22:29:00
2021-01-03 22:17:00  1.22357  1.22361  1.22355  1.22356            False      False             NaN          NaN          NaN        NaN                  NaN
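exitPrice will be the respective take profit (longTpPrice) or stop loss (longSlPrice) threshold. If both thresholds are reached within the same row (candle), longSlPrice should be used. exitTime will then be the time of the candle in which the exit occurs.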
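To make the tie-break rule concrete, here is a minimal per-candle sketch. The function name candle_exit_price and the sample prices are made up for illustration; it only classifies a single candle against fixed thresholds and does not search forward in time.

```python
import numpy as np


def candle_exit_price(high, low, tp, sl):
    """Exit price triggered inside each candle, or NaN if neither level is touched."""
    high = np.asarray(high, dtype=float)
    low = np.asarray(low, dtype=float)
    sl_hit = low <= sl   # stop loss touched within the candle
    tp_hit = high >= tp  # take profit touched within the candle
    # np.select uses the first matching condition, so the stop loss wins ties.
    return np.select([sl_hit, tp_hit], [sl, tp], default=np.nan)


# Hypothetical candles: no hit, take profit only, both levels hit.
prices = candle_exit_price(
    high=[1.2240, 1.2245, 1.2245],
    low=[1.2236, 1.2230, 1.2226],
    tp=1.2243,
    sl=1.2227,
)
```

Listing the stop loss condition first is what encodes the pessimistic assumption that, within one candle, the stop loss is hit before the take profit.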
Current approach
Currently, I reduce df to the rows with long entries and then use the apply() function to calculate the exits:
entryDf = df[df['longEntry']].copy()
# x.name is the row label (the entry time); x.index would be the column labels
entryDf[['exitPrice', 'exitTime']] = entryDf.apply(lambda x: getLongExit(exitDf=df[['high', 'low']], entryPrice=x['longEntryPrice'], entryTime=x.name, takeProfit=x['longTpPrice'], stopLoss=x['longSlPrice']), axis=1, result_type='expand')
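getLongExit then essentially uses .loc, .idxmax() and .idxmin() to determine whether, and if so when, the take profit and/or stop loss threshold is reached, and compares the results to decide which one occurred earlier. It returns the corresponding exitPrice and exitTime.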
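For reference, a function of this kind could look roughly like the following. This is a hypothetical stand-in, not the actual getLongExit: the name get_long_exit_sketch, its reduced parameter list, and the sample candles are assumptions, it only uses .loc and .idxmax(), and it assumes the entry candle itself can already trigger an exit.

```python
import numpy as np
import pandas as pd


def get_long_exit_sketch(exit_df, entry_time, take_profit, stop_loss):
    """Hypothetical stand-in for getLongExit; returns (exitPrice, exitTime)."""
    window = exit_df.loc[entry_time:]  # candles from the entry onwards
    tp_hit = window["high"] >= take_profit
    sl_hit = window["low"] <= stop_loss
    # idxmax() returns the label of the first True; guard with any(), because
    # on an all-False Series it would return the first label instead.
    tp_time = tp_hit.idxmax() if tp_hit.any() else pd.NaT
    sl_time = sl_hit.idxmax() if sl_hit.any() else pd.NaT
    if pd.isna(tp_time) and pd.isna(sl_time):
        return np.nan, pd.NaT  # position never closed in the data
    # The stop loss wins when it comes first or falls in the same candle.
    if pd.isna(tp_time) or (not pd.isna(sl_time) and sl_time <= tp_time):
        return stop_loss, sl_time
    return take_profit, tp_time


# Hypothetical candles: only the take profit is reached, in the last candle.
candles = pd.DataFrame(
    {"high": [1.2236, 1.2240, 1.2245], "low": [1.2235, 1.2236, 1.2230]},
    index=pd.to_datetime(
        ["2021-01-03 22:16", "2021-01-03 22:17", "2021-01-03 22:18"]
    ),
)
price, time = get_long_exit_sketch(
    candles, candles.index[0], take_profit=1.2243, stop_loss=1.2227
)
```

Each call scans every candle after the entry, so calling it once per entry through apply() scales with the number of entries times the number of candles, which is where the time goes on large data.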
Based on the determined exitPrice, information such as overall gain, win rate, etc. can easily be calculated.
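As a small illustration of that last step, assuming gain is measured as the plain price difference of a long trade: the first row below reuses the entry and exit prices from the example above, while the other two rows are made up.

```python
import pandas as pd

# First row: values from the example above; remaining rows are hypothetical.
trades = pd.DataFrame({
    "longEntryPrice": [1.22355, 1.22400, 1.22300],
    "exitPrice": [1.224345, 1.22320, 1.22380],
})

# Gain of a long trade as exit minus entry (ignores spread, fees, position size).
trades["gain"] = trades["exitPrice"] - trades["longEntryPrice"]

overall_gain = trades["gain"].sum()     # sum of price differences
win_rate = (trades["gain"] > 0).mean()  # share of trades closed in profit
```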
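I hope I was able to explain my problem clearly. Using .ffill() or a similar function would surely be faster, but I could not get it to work. I look forward to your suggestions. Thank you!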
Edit/Update according to @Andrej's solution
I implemented your solution as suggested, but it results in the following error:
Traceback (most recent call last):
File "/Users/maxwitt/PycharmProjects/ForexStrategies/strategy1.py", line 288, in <module>
get_long_exit(
File "/Users/maxwitt/PycharmProjects/ForexStrategies/venv/lib/python3.10/site-packages/numba/core/dispatcher.py", line 468, in _compile_for_args
error_rewrite(e, 'typing')
File "/Users/maxwitt/PycharmProjects/ForexStrategies/venv/lib/python3.10/site-packages/numba/core/dispatcher.py", line 409, in error_rewrite
raise e.with_traceback(None)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<built-in function setitem>) found for signature:
>>> setitem(array(float64, 1d, C), int64, datetime64[ns])
There are 16 candidate implementations:
- Of which 16 did not match due to:
Overload of function 'setitem': File: <numerous>: Line N/A.
With argument(s): '(array(float64, 1d, C), int64, datetime64[ns])':
No match.
During: typing of setitem at /Users/maxwitt/PycharmProjects/ForexStrategies/strategy1.py (167)
File "strategy1.py", line 167:
def get_long_exit(index, high_vals, low_vals, tp_prices, sl_prices, out_exit_price, out_indices):
<source elided>
out_exit_price[idx1] = sl_entry
out_indices[idx1] = index[idx2]
^
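Reading the message, the failing statement writes a datetime64[ns] value (index[idx2]) into an array(float64, 1d, C), so out_indices seems to have been allocated as a float array. One possible workaround, sketched here under assumptions, is to pass the index as int64 nanoseconds, allocate out_indices as int64, and convert back afterwards. The function find_long_exits is a hypothetical plain-loop version, not the elided source; the sample data is made up, and the Numba decorator is left out so that the snippet runs without Numba installed.

```python
import numpy as np
import pandas as pd


def find_long_exits(index_ns, high_vals, low_vals, tp_prices, sl_prices,
                    out_exit_price, out_indices):
    """Hypothetical exit search that only touches float64 and int64 arrays."""
    n = len(high_vals)
    for i in range(n):
        if np.isnan(tp_prices[i]):
            continue  # no entry on this candle
        for j in range(i, n):  # assumption: the entry candle can trigger an exit
            if low_vals[j] <= sl_prices[i]:  # stop loss checked first
                out_exit_price[i] = sl_prices[i]
                out_indices[i] = index_ns[j]
                break
            if high_vals[j] >= tp_prices[i]:
                out_exit_price[i] = tp_prices[i]
                out_indices[i] = index_ns[j]
                break


# Hypothetical sample data with a single entry on the first candle.
idx = pd.to_datetime(["2021-01-03 22:16", "2021-01-03 22:17", "2021-01-03 22:18"])
high = np.array([1.2236, 1.2240, 1.2245])
low = np.array([1.2235, 1.2236, 1.2230])
tp = np.array([1.2243, np.nan, np.nan])
sl = np.array([1.2227, np.nan, np.nan])

# Timestamps as int64 nanoseconds since the epoch.
index_ns = idx.values.astype("datetime64[ns]").astype(np.int64)

out_price = np.full(len(idx), np.nan)
# The minimum int64 value is the integer representation of NaT.
out_ns = np.full(len(idx), np.iinfo(np.int64).min, dtype=np.int64)

find_long_exits(index_ns, high, low, tp, sl, out_price, out_ns)
exit_time = pd.DatetimeIndex(out_ns.view("datetime64[ns]"))
```

Since every array involved is then plain float64 or int64, the setitem that failed in the traceback no longer mixes a float array with a datetime value.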