Target Engineering - Magnitude¶
Magnitude targets focus on predicting the amplitude of future price movements rather than their direction. These targets are key components for building regression models aimed at estimating how large a return or volatility could be over a given horizon. In this notebook, we will engineer and extract magnitude-based targets using the quantreo package to enhance trading signals, optimize risk-adjusted returns, and support advanced portfolio management techniques.
# Import the Target Engineering Package from Quantreo
import quantreo.target_engineering as te
# To display the graphics
import matplotlib.pyplot as plt
plt.style.use("seaborn-v0_8")
# Import a dataset to test the functions and create new ones easily
from quantreo.datasets import load_generated_ohlcv
df = load_generated_ohlcv()
df = df.loc["2016"]
# Show the data
df
| open | high | low | close | volume | |
|---|---|---|---|---|---|
| time | |||||
| 2016-01-04 00:00:00 | 104.944241 | 105.312073 | 104.929735 | 105.232289 | 576.805768 |
| 2016-01-04 04:00:00 | 105.233361 | 105.252139 | 105.047564 | 105.149357 | 485.696723 |
| 2016-01-04 08:00:00 | 105.159851 | 105.384745 | 105.141110 | 105.330306 | 403.969745 |
| 2016-01-04 12:00:00 | 105.330306 | 105.505799 | 104.894155 | 104.923404 | 1436.917324 |
| 2016-01-04 16:00:00 | 104.914147 | 105.023293 | 104.913252 | 105.014347 | 1177.672605 |
| ... | ... | ... | ... | ... | ... |
| 2016-12-30 04:00:00 | 103.632257 | 103.711884 | 103.495896 | 103.564574 | 563.932484 |
| 2016-12-30 08:00:00 | 103.564574 | 103.629321 | 103.555581 | 103.616731 | 697.707475 |
| 2016-12-30 12:00:00 | 103.615791 | 103.628165 | 103.496810 | 103.515847 | 1768.926665 |
| 2016-12-30 16:00:00 | 103.515847 | 103.692243 | 103.400370 | 103.422193 | 921.437054 |
| 2016-12-30 20:00:00 | 103.420285 | 103.429955 | 103.195921 | 103.226868 | 764.291608 |
1548 rows × 5 columns
Future Returns¶
The future_returns function computes the amplitude of the future return for each observation.
- You can choose between log-returns or simple returns.
- It’s particularly useful for regression models or for further feature engineering.
df["label"] = te.magnitude.future_returns(df, close_col='close', window_size=10)
df["label"]
time
2016-01-04 00:00:00 0.001054
2016-01-04 04:00:00 0.002075
2016-01-04 08:00:00 0.000521
2016-01-04 12:00:00 0.005400
2016-01-04 16:00:00 0.004218
...
2016-12-30 04:00:00 NaN
2016-12-30 08:00:00 NaN
2016-12-30 12:00:00 NaN
2016-12-30 16:00:00 NaN
2016-12-30 20:00:00 NaN
Name: label, Length: 1548, dtype: float64
plt.figure(figsize=(15,6))
plt.plot(df["label"])
plt.title("Future Returns Target", size=15)
plt.show()
Future Volatility¶
The future_volatility function estimates future market volatility using several common models:
close_to_close,
parkinson,
rogers_satchell,
yang_zhang.
df["label"] = te.magnitude.future_volatility(df, method='yang_zhang', window_size=10, open_col="open", high_col="high", low_col="low", close_col="close")
df["label"]
time
2016-01-04 00:00:00 NaN
2016-01-04 04:00:00 NaN
2016-01-04 08:00:00 NaN
2016-01-04 12:00:00 NaN
2016-01-04 16:00:00 NaN
...
2016-12-30 04:00:00 0.001797
2016-12-30 08:00:00 0.001842
2016-12-30 12:00:00 0.001724
2016-12-30 16:00:00 0.001629
2016-12-30 20:00:00 0.001601
Name: label, Length: 1548, dtype: float64
plt.figure(figsize=(15,6))
plt.plot(df["label"])
plt.title("Future Volatility Target", size=15)
plt.show()
Continuous Barrier Labeling¶
The continuous_barrier_labeling function estimates the exact time (in hours) it takes for the price to hit either a Take Profit (TP) or a Stop Loss (SL) level, starting from each index.
It belongs to the family of event-based targets and provides continuous labels that reflect the speed of a movement rather than just its direction.
This labeling approach is useful for timing analysis, position sizing, and training models that incorporate the notion of time-to-event.
Warning "⏱ Time-based requirement"¶
Unlike other targets, this method requires:
- A DatetimeIndex (named
'time'). - Two timestamp columns:
high_timeandlow_time, indicating when the high and low of the candle occurred (not the end of the bar).
These columns are essential to compute the more accurate label possible without using the ticks.
# Import a dataset to test the next functions more easily
from quantreo.datasets import load_generated_ohlcv_with_time
df = load_generated_ohlcv_with_time()
df = df.loc["2016"]
# Show the data
df
| open | high | low | close | volume | low_time | high_time | |
|---|---|---|---|---|---|---|---|
| time | |||||||
| 2016-01-04 00:00:00 | 104.944241 | 105.312073 | 104.929735 | 105.232289 | 576.805768 | 2016-01-04 03:21:00 | 2016-01-04 00:23:00 |
| 2016-01-04 04:00:00 | 105.233361 | 105.252139 | 105.047564 | 105.149357 | 485.696723 | 2016-01-04 04:15:00 | 2016-01-04 07:48:00 |
| 2016-01-04 08:00:00 | 105.159851 | 105.384745 | 105.141110 | 105.330306 | 403.969745 | 2016-01-04 09:07:00 | 2016-01-04 10:21:00 |
| 2016-01-04 12:00:00 | 105.330306 | 105.505799 | 104.894155 | 104.923404 | 1436.917324 | 2016-01-04 15:50:00 | 2016-01-04 12:01:00 |
| 2016-01-04 16:00:00 | 104.914147 | 105.023293 | 104.913252 | 105.014347 | 1177.672605 | 2016-01-04 18:22:00 | 2016-01-04 16:09:00 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2016-12-30 04:00:00 | 103.632257 | 103.711884 | 103.495896 | 103.564574 | 563.932484 | 2016-12-30 05:34:00 | 2016-12-30 04:00:00 |
| 2016-12-30 08:00:00 | 103.564574 | 103.629321 | 103.555581 | 103.616731 | 697.707475 | 2016-12-30 10:00:00 | 2016-12-30 11:59:00 |
| 2016-12-30 12:00:00 | 103.615791 | 103.628165 | 103.496810 | 103.515847 | 1768.926665 | 2016-12-30 14:57:00 | 2016-12-30 12:00:00 |
| 2016-12-30 16:00:00 | 103.515847 | 103.692243 | 103.400370 | 103.422193 | 921.437054 | 2016-12-30 16:36:00 | 2016-12-30 17:43:00 |
| 2016-12-30 20:00:00 | 103.420285 | 103.429955 | 103.195921 | 103.226868 | 764.291608 | 2016-12-30 21:02:00 | 2016-12-30 20:00:00 |
1548 rows × 7 columns
df["label"] = te.magnitude.continuous_barrier_labeling(df, open_col="open", high_col="high", low_col="low", high_time_col="high_time",
low_time_col="low_time", tp=0.015, sl=-0.015, buy=True)
100%|███████████████████████████████████| 1548/1548 [00:00<00:00, 509677.57it/s]
plt.figure(figsize=(15,6))
plt.plot(df["label"])
plt.title("Continuous Barrier Labeling", size=15)
plt.show()