Data Aggregation - Bar Metrics
This notebook demonstrates how to compute and visualize custom bar-level metrics from raw tick data using the Quantreo library.
Let's explore how to create and extend your own metrics, all computed directly from the tick data inside each bar.
# Import the Data Aggregation Package from Quantreo
import quantreo.data_aggregation as da
# Import a dataset to test the functions and create new ones easily
from quantreo.datasets import load_generated_ticks
df = load_generated_ticks()
# Show the data
df
| datetime | price | volume |
|---|---|---|
| 2023-03-03 13:36:36 | 114.806983 | 3 |
| 2023-03-03 13:36:37 | 114.806983 | 1 |
| 2023-03-03 13:36:38 | 114.806983 | 1 |
| 2023-03-03 13:36:39 | 114.799521 | 1 |
| 2023-03-03 13:36:40 | 114.799521 | 1 |
| ... | ... | ... |
| 2023-03-15 03:23:11 | 118.686705 | 1 |
| 2023-03-15 03:23:12 | 118.686705 | 1 |
| 2023-03-15 03:23:13 | 118.686705 | 3 |
| 2023-03-15 03:23:14 | 118.686705 | 1 |
| 2023-03-15 03:23:15 | 118.672084 | 1 |
1000000 rows × 2 columns
Apply Additional Metrics
The additional_metrics parameter lets you enrich any bar with custom columns.
It must be a list of tuples, where each tuple follows this exact structure:
(
    function,                             # A callable applied to the bar's internal data
    "price" | "volume" | "price_volume",  # Data source passed to the function
    ["col_name1", "col_name2", ...]       # Names of the output columns
)
🔍 Component Breakdown
- function: A Python function that takes a NumPy array (or a pair of arrays, prices and volumes, if "price_volume") and returns either a float or a tuple of floats.
- "price" / "volume" / "price_volume": Specifies which internal tick data will be passed to the function (price, volume, or both).
- ["output_col_name"]: The names of the column(s) added to the resulting DataFrame, one name per value returned by the function.
✅ These metrics are computed independently for each bar, using only the ticks that belong to that bar.
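To make the tuple contract concrete, here is a minimal sketch of how such a specification could be applied to the ticks of a single bar. The helper `apply_metrics` is hypothetical and written only for illustration; it is not Quantreo's internal implementation, and it assumes that "price_volume" functions receive prices and volumes as two positional arguments, as the examples in this notebook do.

```python
import numpy as np

def apply_metrics(prices, volumes, additional_metrics):
    """Hypothetical helper: apply (function, source, column_names) specs to one bar."""
    row = {}
    for func, source, col_names in additional_metrics:
        if source == "price":
            result = func(prices)
        elif source == "volume":
            result = func(volumes)
        elif source == "price_volume":
            result = func(prices, volumes)
        else:
            raise ValueError(f"Unknown source: {source!r}")
        # Treat a scalar result as a 1-tuple so both cases share one code path
        values = result if isinstance(result, tuple) else (result,)
        if len(values) != len(col_names):
            raise ValueError("Number of outputs must match number of column names")
        row.update(zip(col_names, values))
    return row

# Ticks belonging to one (toy) bar
prices = np.array([100.0, 101.0, 99.5, 100.5])
volumes = np.array([2.0, 5.0, 1.0, 3.0])

specs = [
    (lambda p: float(p.max() - p.min()), "price", ["price_range"]),
    (lambda p, v: (float(v.max()), float(p[np.argmax(v)])),
     "price_volume", ["max_vol", "price_max_vol"]),
]
row = apply_metrics(prices, volumes, specs)
print(row)
```

Each specification contributes as many columns as its function returns values, which is why the length of the column-name list has to match the function's output.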
time_bars = da.bar_building.ticks_to_time_bars(
    df, resample_factor="4H", col_price="price", col_volume="volume",
    additional_metrics=[
        (da.bar_metrics.kurtosis, "price", ["kurtosis"]),
        (da.bar_metrics.max_traded_volume, "price_volume", ["max_vol", "price_max_vol"]),
    ],
)
time_bars
| time | open | high | low | close | volume | number_ticks | high_time | low_time | skewness | kurtosis | poc | poc_distance | max_vol | price_max_vol |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2023-03-03 12:00:00 | 114.806983 | 114.821643 | 114.519924 | 114.640622 | 15573.0 | 8604 | 2023-03-03 13:39:22 | 2023-03-03 15:42:52 | 0.079447 | -1.114048 | 114.716041 | 0.65 | 1111.0 | 114.650636 |
| 2023-03-03 16:00:00 | 114.640622 | 115.063405 | 114.577370 | 114.681267 | 24813.0 | 14400 | 2023-03-03 19:10:23 | 2023-03-03 16:54:07 | 0.503873 | -1.256384 | 114.650276 | 0.15 | 936.0 | 114.957178 |
| 2023-03-03 20:00:00 | 114.681267 | 114.896731 | 114.411137 | 114.859736 | 24455.0 | 14400 | 2023-03-03 23:58:56 | 2023-03-03 20:54:14 | -0.029107 | -0.319391 | 114.629654 | 0.45 | 478.0 | 114.702053 |
| 2023-03-04 00:00:00 | 114.859736 | 115.313589 | 114.747424 | 115.149874 | 26909.0 | 14400 | 2023-03-04 01:54:15 | 2023-03-04 00:36:35 | -0.042187 | -0.900129 | 115.115431 | 0.65 | 1330.0 | 115.284433 |
| 2023-03-04 04:00:00 | 115.157326 | 115.382338 | 115.057242 | 115.288936 | 26443.0 | 14400 | 2023-03-04 05:27:52 | 2023-03-04 07:39:06 | -0.617723 | -0.124820 | 115.268555 | 0.65 | 1011.0 | 115.330631 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2023-03-14 08:00:00 | 118.248409 | 118.263260 | 117.510691 | 117.664435 | 28025.0 | 14400 | 2023-03-14 08:00:06 | 2023-03-14 11:20:01 | 0.191413 | -0.906341 | 117.623576 | 0.15 | 1401.0 | 117.636825 |
| 2023-03-14 12:00:00 | 117.664435 | 117.997454 | 117.635474 | 117.843508 | 25745.0 | 14400 | 2023-03-14 15:15:01 | 2023-03-14 12:06:39 | 0.061016 | 0.139561 | 117.834563 | 0.55 | 562.0 | 117.878979 |
| 2023-03-14 16:00:00 | 117.843508 | 117.970998 | 117.649717 | 117.910534 | 24350.0 | 14400 | 2023-03-14 19:53:27 | 2023-03-14 16:39:29 | 0.364729 | -0.470253 | 117.762165 | 0.35 | 316.0 | 117.783954 |
| 2023-03-14 20:00:00 | 117.910534 | 118.255959 | 117.811763 | 118.226044 | 28402.0 | 14400 | 2023-03-14 23:58:51 | 2023-03-14 21:11:00 | 0.725995 | -0.699551 | 117.922812 | 0.25 | 936.0 | 117.932291 |
| 2023-03-15 00:00:00 | 118.226044 | 118.722931 | 118.175237 | 118.672084 | 21583.0 | 12196 | 2023-03-15 03:14:41 | 2023-03-15 01:11:08 | 0.954973 | 0.185180 | 118.366930 | 0.35 | 454.0 | 118.625803 |
70 rows × 14 columns
tick_bars = da.bar_building.ticks_to_tick_bars(
    df, tick_per_bar=10_000, col_price="price", col_volume="volume",
    additional_metrics=[
        (da.bar_metrics.skewness, "price", ["skewness"]),
        (lambda px, v: da.bar_metrics.volume_profile_features(px, v, n_bins=10),
         "price_volume", ["poc", "poc_distance"]),
    ],
)
tick_bars
| time | open | high | low | close | volume | number_ticks | duration_minutes | high_time | low_time | skewness | poc | poc_distance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2023-03-03 13:36:36 | 114.806983 | 114.821643 | 114.519924 | 114.661382 | 17732.0 | 10000 | 166.65 | 2023-03-03 13:39:22 | 2023-03-03 15:42:52 | 0.130231 | 114.685869 | 0.55 |
| 2023-03-03 16:23:16 | 114.661382 | 115.055958 | 114.577370 | 115.048522 | 17867.0 | 10000 | 166.65 | 2023-03-03 19:09:50 | 2023-03-03 16:54:07 | 0.541432 | 114.649159 | 0.15 |
| 2023-03-03 19:09:56 | 115.048522 | 115.063405 | 114.411137 | 114.666460 | 16616.0 | 10000 | 166.65 | 2023-03-03 19:10:23 | 2023-03-03 20:54:14 | 0.810642 | 114.639430 | 0.35 |
| 2023-03-03 21:56:36 | 114.666460 | 114.928161 | 114.564990 | 114.873724 | 16559.0 | 10000 | 166.65 | 2023-03-04 00:21:44 | 2023-03-03 23:27:27 | 0.041901 | 114.692100 | 0.35 |
| 2023-03-04 00:43:16 | 114.873724 | 115.313589 | 114.845063 | 114.969048 | 19984.0 | 10000 | 166.65 | 2023-03-04 01:54:15 | 2023-03-04 00:47:55 | -0.010750 | 115.102752 | 0.55 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2023-03-14 13:29:56 | 117.833813 | 117.997454 | 117.701412 | 117.730518 | 18495.0 | 10000 | 166.65 | 2023-03-14 15:15:01 | 2023-03-14 16:14:55 | 0.822873 | 117.775423 | 0.25 |
| 2023-03-14 16:16:36 | 117.723186 | 117.925650 | 117.649717 | 117.868810 | 16815.0 | 10000 | 166.65 | 2023-03-14 18:52:12 | 2023-03-14 16:39:29 | 0.252527 | 117.718700 | 0.25 |
| 2023-03-14 19:03:16 | 117.868810 | 118.007704 | 117.718963 | 117.919602 | 18407.0 | 10000 | 166.65 | 2023-03-14 21:37:33 | 2023-03-14 19:20:44 | -0.640993 | 117.935518 | 0.75 |
| 2023-03-14 21:49:56 | 117.919602 | 118.378553 | 117.874878 | 118.341288 | 19526.0 | 10000 | 166.65 | 2023-03-15 00:28:01 | 2023-03-14 22:04:57 | 0.089221 | 118.101532 | 0.45 |
| 2023-03-15 00:36:36 | 118.334049 | 118.722931 | 118.175237 | 118.672084 | 17692.0 | 10000 | 166.65 | 2023-03-15 03:14:41 | 2023-03-15 01:11:08 | 0.708433 | 118.366930 | 0.35 |
100 rows × 12 columns
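The output above has 100 bars of 10,000 ticks each, one per consecutive block of the 1,000,000 input ticks. As a rough illustration of what "computed independently for each bar" means for tick bars, the sketch below slices a price array into fixed-size blocks and applies a metric to each block. Both functions are hypothetical stand-ins: the skewness shown is the biased sample formula, which may differ from the estimator used by `da.bar_metrics.skewness`, and dropping an incomplete trailing block is an assumption of this sketch rather than documented library behavior.

```python
import numpy as np

def skewness(x: np.ndarray) -> float:
    """Biased sample skewness: third central moment over variance ** 1.5."""
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    if m2 == 0:
        return 0.0  # constant prices: define skewness as 0 for this sketch
    return float(np.mean(centered ** 3) / m2 ** 1.5)

def metric_per_tick_bar(prices: np.ndarray, ticks_per_bar: int, func) -> list:
    """Apply func to each consecutive block of ticks_per_bar prices."""
    n_bars = len(prices) // ticks_per_bar  # assumption: ignore a partial last block
    return [
        func(prices[i * ticks_per_bar:(i + 1) * ticks_per_bar])
        for i in range(n_bars)
    ]

prices = np.array([1.0, 2.0, 3.0, 1.0, 1.0, 4.0, 9.0])
per_bar = metric_per_tick_bar(prices, ticks_per_bar=3, func=skewness)
print(per_bar)  # two values, one per complete block of three ticks
```

The first block is symmetric around its mean, so its skewness is zero, while the second block has a single large value and is therefore positively skewed.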
Create New Metrics
Sometimes, standard OHLCV data isn’t enough.
You might want to extract advanced metrics from the raw ticks inside each bar, such as skewness, volume profile peaks, or volatility spikes.
That’s exactly what additional_metrics is for. It lets you plug in your own logic and enrich every bar with custom, computed features.
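As a small illustration of the pattern, here is a sketch of a volume-weighted average price (VWAP) metric that uses the "price_volume" source. The function `vwap` and the column name "vwap" are hypothetical and not part of Quantreo. It is written in plain NumPy; the examples in this notebook decorate their functions with Numba's `@njit`, and whether that is required is not stated here, so treat the undecorated version as an assumption.

```python
import numpy as np

def vwap(prices: np.ndarray, volumes: np.ndarray) -> float:
    """Volume-weighted average price of the ticks inside one bar."""
    total_volume = volumes.sum()
    if total_volume == 0:
        return 0.0  # empty or zero-volume bar: same 0.0 fallback as the other examples
    return float((prices * volumes).sum() / total_volume)

# Check the function on a tiny array before using it on real bars
prices = np.array([10.0, 11.0, 12.0])
volumes = np.array([1.0, 2.0, 1.0])
bar_vwap = vwap(prices, volumes)  # (10*1 + 11*2 + 12*1) / 4
print(bar_vwap)

# The specification tuple that would be placed in additional_metrics
vwap_spec = (vwap, "price_volume", ["vwap"])
```

Testing a metric on a handful of hand-picked ticks first makes it easier to spot mistakes than inspecting the aggregated bar table afterwards.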
from numba import njit
import numpy as np
from typing import Tuple
# Example of additional metric functions
@njit
def median_volume(x: np.ndarray) -> float:
    """
    Compute the median of a 1D array (e.g., volumes within a bar).

    Parameters
    ----------
    x : np.ndarray
        Input 1D array of numerical values.

    Returns
    -------
    float
        Median value of the input array.
    """
    n = len(x)
    if n == 0:
        return 0.0
    sorted_x = np.sort(x)
    mid = n // 2
    if n % 2 == 0:
        return 0.5 * (sorted_x[mid - 1] + sorted_x[mid])
    else:
        return sorted_x[mid]

@njit
def min_max(x: np.ndarray) -> tuple:
    """
    Compute the minimum and maximum of a 1D array.

    Parameters
    ----------
    x : np.ndarray
        Input 1D array of numerical values.

    Returns
    -------
    tuple
        A tuple (min_value, max_value) of the array.
    """
    n = len(x)
    if n == 0:
        return (0.0, 0.0)
    min_val = x[0]
    max_val = x[0]
    for i in range(1, n):
        if x[i] < min_val:
            min_val = x[i]
        elif x[i] > max_val:
            max_val = x[i]
    return (min_val, max_val)

@njit
def max_traded_volume(prices: np.ndarray, volumes: np.ndarray) -> Tuple[float, float]:
    """
    Return the maximum traded volume and the associated price.

    Parameters
    ----------
    prices : np.ndarray
        1D array of price values corresponding to each tick.
    volumes : np.ndarray
        1D array of traded volume at each tick.

    Returns
    -------
    Tuple[float, float]
        - max_volume : Highest volume exchanged on a single tick.
        - price_at_max_volume : Price level at which the maximum volume occurred.
    """
    n = len(volumes)
    if n == 0:
        return 0.0, 0.0
    max_idx = 0
    max_vol = volumes[0]
    for i in range(1, n):
        if volumes[i] > max_vol:
            max_vol = volumes[i]
            max_idx = i
    return max_vol, prices[max_idx]

time_bars = da.bar_building.ticks_to_time_bars(
    df, resample_factor="4H", col_price="price", col_volume="volume",
    additional_metrics=[
        (median_volume, "volume", ["volume_median"]),
        (min_max, "price", ["low_price", "high_price"]),
        (max_traded_volume, "price_volume", ["max_vol", "price_max_vol"]),
    ],
)
time_bars
| time | open | high | low | close | volume | number_ticks | high_time | low_time | volume_median | low_price | high_price | max_vol | price_max_vol |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2023-03-03 12:00:00 | 114.806983 | 114.821643 | 114.519924 | 114.640622 | 15573.0 | 8604 | 2023-03-03 13:39:22 | 2023-03-03 15:42:52 | 1.0 | 114.519924 | 114.821643 | 1111.0 | 114.650636 |
| 2023-03-03 16:00:00 | 114.640622 | 115.063405 | 114.577370 | 114.681267 | 24813.0 | 14400 | 2023-03-03 19:10:23 | 2023-03-03 16:54:07 | 1.0 | 114.577370 | 115.063405 | 936.0 | 114.957178 |
| 2023-03-03 20:00:00 | 114.681267 | 114.896731 | 114.411137 | 114.859736 | 24455.0 | 14400 | 2023-03-03 23:58:56 | 2023-03-03 20:54:14 | 1.0 | 114.411137 | 114.896731 | 478.0 | 114.702053 |
| 2023-03-04 00:00:00 | 114.859736 | 115.313589 | 114.747424 | 115.149874 | 26909.0 | 14400 | 2023-03-04 01:54:15 | 2023-03-04 00:36:35 | 1.0 | 114.747424 | 115.313589 | 1330.0 | 115.284433 |
| 2023-03-04 04:00:00 | 115.157326 | 115.382338 | 115.057242 | 115.288936 | 26443.0 | 14400 | 2023-03-04 05:27:52 | 2023-03-04 07:39:06 | 1.0 | 115.057242 | 115.382338 | 1011.0 | 115.330631 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2023-03-14 08:00:00 | 118.248409 | 118.263260 | 117.510691 | 117.664435 | 28025.0 | 14400 | 2023-03-14 08:00:06 | 2023-03-14 11:20:01 | 1.0 | 117.510691 | 118.263260 | 1401.0 | 117.636825 |
| 2023-03-14 12:00:00 | 117.664435 | 117.997454 | 117.635474 | 117.843508 | 25745.0 | 14400 | 2023-03-14 15:15:01 | 2023-03-14 12:06:39 | 1.0 | 117.635474 | 117.997454 | 562.0 | 117.878979 |
| 2023-03-14 16:00:00 | 117.843508 | 117.970998 | 117.649717 | 117.910534 | 24350.0 | 14400 | 2023-03-14 19:53:27 | 2023-03-14 16:39:29 | 1.0 | 117.649717 | 117.970998 | 316.0 | 117.783954 |
| 2023-03-14 20:00:00 | 117.910534 | 118.255959 | 117.811763 | 118.226044 | 28402.0 | 14400 | 2023-03-14 23:58:51 | 2023-03-14 21:11:00 | 1.0 | 117.811763 | 118.255959 | 936.0 | 117.932291 |
| 2023-03-15 00:00:00 | 118.226044 | 118.722931 | 118.175237 | 118.672084 | 21583.0 | 12196 | 2023-03-15 03:14:41 | 2023-03-15 01:11:08 | 1.0 | 118.175237 | 118.722931 | 454.0 | 118.625803 |
70 rows × 13 columns