Productivity

Annotations

  • Whenever you write a function, the meaning of each argument can be specified using an annotation next to the argument. VBT now offers a rich set of in-house annotations tailored to specific tasks. For example, whether an argument is a parameter can be specified directly in the function rather than in the parameterized decorator.
Test a cross-validation function with annotations
>>> @vbt.cv_split(
...     splitter="from_rolling", 
...     splitter_kwargs=dict(length=365, split=0.5, set_labels=["train", "test"]),
...     execute_kwargs=dict(show_progress=True),
...     parameterized_kwargs=dict(random_subset=100),
... )
... def sma_crossover_cv(
...     data: vbt.Takeable,  # (1)!
...     fast_period: vbt.Param(condition="x < slow_period"),  # (2)!
...     slow_period: vbt.Param,  # (3)!
...     metric
... ) -> vbt.MergeFunc("concat"):
...     fast_sma = data.run("sma", fast_period, hide_params=True)
...     slow_sma = data.run("sma", slow_period, hide_params=True)
...     entries = fast_sma.real_crossed_above(slow_sma)
...     exits = fast_sma.real_crossed_below(slow_sma)
...     pf = vbt.PF.from_signals(data, entries, exits, direction="both")
...     return pf.deep_getattr(metric)

>>> sma_crossover_cv(
...     vbt.YFData.pull("BTC-USD", start="4 years ago"),
...     np.arange(20, 50),
...     np.arange(20, 50),
...     "trades.expectancy"
... )
split  set    fast_period  slow_period
0      train  22           33             26.351841
       test   21           34             35.788733
1      train  21           46             24.114027
       test   21           39              2.261432
2      train  30           44             29.635233
       test   30           38              1.909916
3      train  20           49             -7.038924
       test   20           44             -1.366734
4      train  28           44              2.144805
       test   29           38             -4.945776
5      train  35           47             -8.877875
       test   34           37              2.792217
6      train  29           41              8.816846
       test   28           43             36.008302
dtype: float64
  1. Mark data as "takeable", so that each split receives only its own slice of the data
  2. Mark fast_period as a parameter whose values must be less than those of slow_period
  3. Mark slow_period as a regular parameter

DataFrame product

  • Several parameterized indicators can produce DataFrames with different shapes and columns, which makes a Cartesian product of them tricky since they often share common column levels (such as "symbol") that shouldn't be combined with each other. There's now a method to cross-join multiple DataFrames block-wise.
Enter when SMA goes above WMA, exit when EMA goes below WMA
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"], missing_index="drop")
>>> sma = data.run("sma", timeperiod=[10, 20], unpack=True)
>>> ema = data.run("ema", timeperiod=[30, 40], unpack=True)
>>> wma = data.run("wma", timeperiod=[50, 60], unpack=True)
>>> sma, ema, wma = sma.vbt.x(ema, wma)  # (1)!
>>> entries = sma.vbt.crossed_above(wma)
>>> exits = ema.vbt.crossed_below(wma)

>>> entries.columns
MultiIndex([(10, 30, 50, 'BTC-USD'),
            (10, 30, 50, 'ETH-USD'),
            (10, 30, 60, 'BTC-USD'),
            (10, 30, 60, 'ETH-USD'),
            (10, 40, 50, 'BTC-USD'),
            (10, 40, 50, 'ETH-USD'),
            (10, 40, 60, 'BTC-USD'),
            (10, 40, 60, 'ETH-USD'),
            (20, 30, 50, 'BTC-USD'),
            (20, 30, 50, 'ETH-USD'),
            (20, 30, 60, 'BTC-USD'),
            (20, 30, 60, 'ETH-USD'),
            (20, 40, 50, 'BTC-USD'),
            (20, 40, 50, 'ETH-USD'),
            (20, 40, 60, 'BTC-USD'),
            (20, 40, 60, 'ETH-USD')],
           names=['sma_timeperiod', 'ema_timeperiod', 'wma_timeperiod', 'symbol'])
  1. Build a Cartesian product of three DataFrames while keeping the column level "symbol" untouched. The same can be achieved with vbt.pd_acc.cross(sma, ema, wma).

Compression

  • Serialized VBT objects may sometimes take a lot of disk space. With this update, there's now support for a variety of compression algorithms to make files as light as possible! 🪶
Save data without and with compression
>>> data = vbt.RandomOHLCData.pull("RAND", start="2022", end="2023", timeframe="1 minute")

>>> file_path = data.save()
>>> print(vbt.file_size(file_path))
21.0 MB

>>> file_path = data.save(compression="blosc")
>>> print(vbt.file_size(file_path))
13.3 MB
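The space savings come from redundancy in serialized numeric data. A generic illustration with the standard library (pickle plus gzip; this is not the VBT API, just the underlying idea):

```python
import gzip
import pickle

import numpy as np

# Quantized prices (cents stored as int64) contain many repeated zero
# bytes, which is exactly the redundancy a compressor exploits
rng = np.random.default_rng(42)
close_cents = (1_000_000 + rng.standard_normal(100_000).cumsum() * 100).astype(np.int64)

raw = pickle.dumps({"close": close_cents})
compressed = gzip.compress(raw)

print(f"raw: {len(raw):,} bytes, gzip: {len(compressed):,} bytes")
```

Real OHLC data compresses similarly well, which is why algorithms like blosc can cut file sizes by a third or more.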

Faster loading

  • If your pipeline doesn't need accessors, Plotly graphs, or most other optional functionality, you can disable the auto-import feature entirely to bring the loading time of VBT down to under a second ⏳
Define importing settings in vbt.ini
[importing]
auto_import = False
Measure the loading time
>>> import time
>>> start = time.time()
>>> from vectorbtpro import *
>>> end = time.time()
>>> end - start
0.580937910079956

Configuration files

  • VectorBT PRO extends configparser to define its own configuration format that lets the user save, introspect, modify, and load back any complex in-house object. The main advantages of this format are readability and round-tripping: any object can be encoded and then decoded back without information loss. The main features include nested structures, references, parsing of literals, as well as evaluation of arbitrary Python expressions. Additionally, you can now create a configuration file for VBT and put it into the working directory - it will be used to update the default settings whenever the package is imported!
Define global settings in vbt.ini
[plotting]
default_theme = dark

[portfolio]
init_cash = 5000

[data.custom.binance.client_config]
api_key = YOUR_API_KEY
api_secret = YOUR_API_SECRET

[data.custom.ccxt.exchanges.binance.exchange_config]
apiKey = &data.custom.binance.client_config.api_key
secret = &data.custom.binance.client_config.api_secret
Verify that the settings have been loaded correctly
>>> from vectorbtpro import *

>>> vbt.settings.portfolio["init_cash"]
5000
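The mechanics under the hood can be sketched with the stdlib configparser that VBT extends. The sketch below covers only sections and key/value pairs; references (the `&` syntax above), literal parsing, and expression evaluation are VBT additions not shown here:

```python
import configparser

# A minimal subset of the vbt.ini example above, parsed with the
# stdlib base class alone
ini_text = """
[plotting]
default_theme = dark

[portfolio]
init_cash = 5000
"""

parser = configparser.ConfigParser()
parser.read_string(ini_text)

# Values are stored as strings; typed getters do the conversion
init_cash = parser.getint("portfolio", "init_cash")
print(parser["plotting"]["default_theme"], init_cash)
```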

Serialization

  • Just like machine learning models, every native VBT object can be serialized and saved to a binary file - it's never been easier to share data and insights! Another benefit: only the actual content of each object is serialized, not its class definition, so a loaded object always uses the most up-to-date class definition. There's also special logic that can help you "reconstruct" objects if VBT has introduced breaking API changes 🏗
Backtest each month of data and save the results for later
>>> data = vbt.YFData.pull("BTC-USD", start="2022-01-01", end="2022-06-01")

>>> def backtest_month(close):
...     return vbt.PF.from_random_signals(close, n=10)

>>> month_pfs = data.close.resample("MS").apply(backtest_month)
>>> month_pfs
Date
2022-01-01 00:00:00+00:00    Portfolio(\n    wrapper=ArrayWrapper(\n       ...
2022-02-01 00:00:00+00:00    Portfolio(\n    wrapper=ArrayWrapper(\n       ...
2022-03-01 00:00:00+00:00    Portfolio(\n    wrapper=ArrayWrapper(\n       ...
2022-04-01 00:00:00+00:00    Portfolio(\n    wrapper=ArrayWrapper(\n       ...
2022-05-01 00:00:00+00:00    Portfolio(\n    wrapper=ArrayWrapper(\n       ...
Freq: MS, Name: Close, dtype: object

>>> vbt.save(month_pfs, "month_pfs")  # (1)!

>>> month_pfs = vbt.load("month_pfs")  # (2)!
>>> month_pfs.apply(lambda pf: pf.total_return)
Date
2022-01-01 00:00:00+00:00   -0.048924
2022-02-01 00:00:00+00:00    0.168370
2022-03-01 00:00:00+00:00    0.016087
2022-04-01 00:00:00+00:00   -0.120525
2022-05-01 00:00:00+00:00    0.110751
Freq: MS, Name: Close, dtype: float64
  1. Save to disk
  2. Load from disk later

Data parsing

  • Tired of passing open, high, low, and close as separate time series? Portfolio class methods have been extended to take a data instance instead of close and extract the contained OHLC data automatically - a small but time-saving feature!
Run the example above using the new approach
>>> data = vbt.YFData.pull("BTC-USD", start="2020-01", end="2020-03")
>>> pf = vbt.PF.from_random_signals(data, n=10)

Index dictionaries

Manually constructing arrays and setting their data with Pandas is often painful. Thankfully, there is new functionality that provides much-needed help! Any broadcastable argument can become an index dictionary, which contains instructions on where to set values in the array and does the filling job for you. It knows exactly which axis has to be modified and doesn't create a full array unless necessary - with much love to RAM ❤

1) Accumulate daily and exit on Sunday vs 2) accumulate weekly and exit on month end
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"])
>>> tile = pd.Index(["daily", "weekly"], name="strategy")  # (1)!
>>> pf = vbt.PF.from_orders(
...     data.close,
...     size=vbt.index_dict({  # (2)!
...         vbt.idx(
...             vbt.pointidx(every="D"), 
...             vbt.colidx("daily", level="strategy")): 100,  # (3)!
...         vbt.idx(
...             vbt.pointidx(every="W-SUN"), 
...             vbt.colidx("daily", level="strategy")): -np.inf,  # (4)!
...         vbt.idx(
...             vbt.pointidx(every="W-MON"), 
...             vbt.colidx("weekly", level="strategy")): 100,
...         vbt.idx(
...             vbt.pointidx(every="M"), 
...             vbt.colidx("weekly", level="strategy")): -np.inf,
...     }),
...     size_type="value",
...     direction="longonly",
...     init_cash="auto",
...     broadcast_kwargs=dict(tile=tile)
... )
>>> pf.sharpe_ratio
strategy  symbol 
daily     BTC-USD    0.702259
          ETH-USD    0.782296
weekly    BTC-USD    0.838895
          ETH-USD    0.524215
Name: sharpe_ratio, dtype: float64
  1. To represent two strategies, you need to tile the same data twice. For this, create a parameter with the strategy names and pass it as tile to the broadcaster, which will tile the columns of each array (such as price) twice.
  2. Index dictionary contains index instructions as keys and data as values to be set. Keys can be anything from row indices and labels, to custom indexer classes such as PointIdxr.
  3. Find the indices of the rows that correspond to the beginning of each day and the index of the column "daily", and set each element under those indices to 100 (= accumulate)
  4. Find the indices of the rows that correspond to Sunday. If any value under those indices has already been set with any previous instruction, it will be overridden.
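The end result of those instructions can be spelled out in plain Pandas, using weekday masks as a simplified stand-in for the point indexers above (a sketch of the filled array, not of VBT's lazy mechanism, which avoids materializing it when possible):

```python
import numpy as np
import pandas as pd

# Two weeks of daily rows, two strategy columns, initially unset
index = pd.date_range("2023-01-02", periods=14, freq="D")
size = pd.DataFrame(np.nan, index=index, columns=["daily", "weekly"])

size["daily"] = 100.0                               # accumulate every day
size.loc[index.dayofweek == 6, "daily"] = -np.inf   # exit on Sunday (overrides)
size.loc[index.dayofweek == 0, "weekly"] = 100.0    # accumulate on Monday

print(size.head(7))
```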

Slicing

  • Similarly to selecting columns, each VBT object is now capable of slicing rows, using the exact same mechanism as in Pandas 🔊 This makes it super easy to analyze and plot any subset of simulated data, without the need for re-simulation!
Analyze multiple date ranges of the same portfolio
>>> data = vbt.YFData.pull("BTC-USD")
>>> pf = vbt.PF.from_holding(data, freq="d")

>>> pf.sharpe_ratio
1.116727709477293

>>> pf.loc[:"2020"].sharpe_ratio  # (1)!
1.2699801554196481

>>> pf.loc["2021": "2021"].sharpe_ratio  # (2)!
0.9825161170278687

>>> pf.loc["2022":].sharpe_ratio  # (3)!
-1.0423271337174647
  1. Get the Sharpe during the year 2020 and before
  2. Get the Sharpe during the year 2021
  3. Get the Sharpe during the year 2022 and after

Column stacking

  • Complex VBT objects of the same type can be easily stacked along columns. For instance, you can combine multiple totally unrelated trading strategies into the same portfolio for analysis. Under the hood, the final object is still represented as a monolithic multi-dimensional structure, which can be processed even faster than the merged objects could be separately.
Analyze two trading strategies separately and then jointly
>>> def strategy1(data):
...     fast_ma = vbt.MA.run(data.close, 50, short_name="fast_ma")
...     slow_ma = vbt.MA.run(data.close, 200, short_name="slow_ma")
...     entries = fast_ma.ma_crossed_above(slow_ma)
...     exits = fast_ma.ma_crossed_below(slow_ma)
...     return vbt.PF.from_signals(
...         data.close, 
...         entries, 
...         exits, 
...         size=100,
...         size_type="value",
...         init_cash="auto"
...     )

>>> def strategy2(data):
...     bbands = vbt.BBANDS.run(data.close, window=14)
...     entries = bbands.close_crossed_below(bbands.lower)
...     exits = bbands.close_crossed_above(bbands.upper)
...     return vbt.PF.from_signals(
...         data.close, 
...         entries, 
...         exits, 
...         init_cash=200
...     )

>>> data1 = vbt.BinanceData.pull("BTCUSDT")
>>> pf1 = strategy1(data1)  # (1)!
>>> pf1.sharpe_ratio
0.9100317671866922

>>> data2 = vbt.BinanceData.pull("ETHUSDT")
>>> pf2 = strategy2(data2)  # (2)!
>>> pf2.sharpe_ratio
-0.11596286232734827

>>> pf_sep = vbt.PF.column_stack((pf1, pf2))  # (3)!
>>> pf_sep.sharpe_ratio
0    0.910032
1   -0.115963
Name: sharpe_ratio, dtype: float64

>>> pf_join = vbt.PF.column_stack((pf1, pf2), group_by=True)  # (4)!
>>> pf_join.sharpe_ratio
0.42820898354646514
  1. Analyze the first strategy in a separate portfolio
  2. Analyze the second strategy in a separate portfolio
  3. Analyze both strategies in the same portfolio separately
  4. Analyze both strategies in the same portfolio jointly

Row stacking

  • Complex VBT objects of the same type can be easily stacked along rows. For instance, you can append new data to an existing portfolio, or even concatenate in-sample portfolios with their out-of-sample counterparts 🧎
Analyze two date ranges separately and then jointly
>>> def strategy(data, start=None, end=None):
...     fast_ma = vbt.MA.run(data.close, 50, short_name="fast_ma")
...     slow_ma = vbt.MA.run(data.close, 200, short_name="slow_ma")
...     entries = fast_ma.ma_crossed_above(slow_ma)
...     exits = fast_ma.ma_crossed_below(slow_ma)
...     return vbt.PF.from_signals(
...         data.close[start:end], 
...         entries[start:end], 
...         exits[start:end], 
...         size=100,
...         size_type="value",
...         init_cash="auto"
...     )

>>> data = vbt.BinanceData.pull("BTCUSDT")

>>> pf_whole = strategy(data)  # (1)!
>>> pf_whole.sharpe_ratio
0.9100317671866922

>>> pf_sub1 = strategy(data, end="2019-12-31")  # (2)!
>>> pf_sub1.sharpe_ratio
0.7810397448678937

>>> pf_sub2 = strategy(data, start="2020-01-01")  # (3)!
>>> pf_sub2.sharpe_ratio
1.070339534746574

>>> pf_join = vbt.PF.row_stack((pf_sub1, pf_sub2))  # (4)!
>>> pf_join.sharpe_ratio
0.9100317671866922
  1. Analyze the entire range
  2. Analyze the first date range
  3. Analyze the second date range
  4. Join both date ranges and analyze as a whole

Index alignment

  • Each Pandas array is no longer required to have the same index: the indexes of all arrays that should broadcast against each other are aligned automatically, as long as they share the same data type.
Predict ETH price with BTC price using linear regression
>>> btc_data = vbt.YFData.pull("BTC-USD")
>>> btc_data.wrapper.shape
(2817, 7)

>>> eth_data = vbt.YFData.pull("ETH-USD")  # (1)!
>>> eth_data.wrapper.shape
(1668, 7)

>>> ols = vbt.OLS.run(  # (2)!
...     btc_data.close,
...     eth_data.close
... )
>>> ols.pred
Date
2014-09-17 00:00:00+00:00            NaN
2014-09-18 00:00:00+00:00            NaN
2014-09-19 00:00:00+00:00            NaN
2014-09-20 00:00:00+00:00            NaN
2014-09-21 00:00:00+00:00            NaN
...                                  ...
2022-05-30 00:00:00+00:00    2109.769242
2022-05-31 00:00:00+00:00    2028.856767
2022-06-01 00:00:00+00:00    1911.555689
2022-06-02 00:00:00+00:00    1930.169725
2022-06-03 00:00:00+00:00    1882.573170
Freq: D, Name: Close, Length: 2817, dtype: float64
  1. ETH-USD history is shorter than BTC-USD history
  2. This now works! Just make sure that all arrays share the same timeframe and timezone.
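Plain Pandas already shows the alignment behavior being generalized here: arithmetic between two differently-covered series aligns them on the union of their indexes, filling the gaps with NaN instead of raising (a minimal sketch mirroring the BTC/ETH history mismatch above):

```python
import pandas as pd

# Two series with different date coverage, like BTC-USD vs ETH-USD
long_idx = pd.date_range("2020-01-01", periods=6, freq="D")
short_idx = pd.date_range("2020-01-03", periods=4, freq="D")
btc = pd.Series([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], index=long_idx)
eth = pd.Series([1.0, 2.0, 3.0, 4.0], index=short_idx)

# Alignment happens automatically; rows missing from either side become NaN
ratio = btc / eth
print(ratio)
```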

Numba datetime

  • Numba has no support for datetime indexes (or any other Pandas objects), nor any built-in functions for working with datetime. So how do you connect data to time? VBT closes this gap by implementing a collection of functions that extract various pieces of information from each timestamp, such as the current time or the day of the week, for example to determine whether a bar falls within trading hours.

Tutorial

Learn more in the Signal development tutorial.

Plot the percentage change from the start of the month to now
>>> @njit
... def month_start_pct_change_nb(arr, index):
...     out = np.full(arr.shape, np.nan)
...     for col in range(arr.shape[1]):
...         for i in range(arr.shape[0]):
...             if i == 0 or vbt.dt_nb.month_nb(index[i - 1]) != vbt.dt_nb.month_nb(index[i]):
...                 month_start_value = arr[i, col]
...             else:
...                 out[i, col] = (arr[i, col] - month_start_value) / month_start_value
...     return out

>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"], start="2022", end="2023")
>>> pct_change = month_start_pct_change_nb(
...     vbt.to_2d_array(data.close), 
...     data.index.vbt.to_ns()  # (1)!
... )
>>> pct_change = data.symbol_wrapper.wrap(pct_change)
>>> pct_change.vbt.plot().show()
  1. Convert the datetime index to the nanosecond format
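The month-boundary test that month_nb performs on nanosecond integers can be sketched with NumPy alone (a plain-NumPy illustration, not the VBT implementation):

```python
import numpy as np

# A tiny index spanning a month boundary
index = np.array(
    ["2022-01-30", "2022-01-31", "2022-02-01", "2022-02-02"],
    dtype="datetime64[ns]",
)
ns = index.astype(np.int64)  # nanosecond timestamps, as Numba would see them

# Round-trip the integers back to datetimes and truncate to month precision
months = ns.astype("datetime64[ns]").astype("datetime64[M]")

# A bar starts a new month if its month differs from the previous bar's
is_month_start = np.empty(len(months), dtype=bool)
is_month_start[0] = True
is_month_start[1:] = months[1:] != months[:-1]
print(is_month_start)  # [ True False  True False]
```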

Periods ago

  • Instead of writing Numba functions, values at different bars can also be compared in a vectorized manner with Pandas. The problem is that there are no built-in functions to easily shift values based on timedeltas, nor are there rolling functions to check whether an event happened during a period of time in the past. This gap is closed by various new accessor methods.

Tutorial

Learn more in the Signal development tutorial.

Check whether the price dropped for 5 consecutive bars
>>> data = vbt.YFData.pull("BTC-USD", start="2022-05", end="2022-08")
>>> mask = (data.close < data.close.vbt.ago(1)).vbt.all_ago(5)
>>> fig = data.plot(plot_volume=False)
>>> mask.vbt.signals.ranges.plot_shapes(
...     plot_close=False, 
...     fig=fig, 
...     shape_kwargs=dict(fillcolor="orangered")
... )
>>> fig.show()
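On a fixed-frequency series, the same check can be expressed with plain Pandas shift and rolling operations (a sketch of the idea behind ago and all_ago, not the VBT implementation, which also handles timedelta-based offsets):

```python
import pandas as pd

# A toy close series with one stretch of five consecutive drops
close = pd.Series([10, 9, 8, 7, 6, 5, 6, 5], dtype=float)

dropped = close < close.shift(1)               # dropped vs the previous bar
five_in_a_row = dropped.rolling(5).sum() == 5  # all of the last 5 bars dropped
print(five_in_a_row.tolist())
```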

Safe resampling

  • Look-ahead bias is an ever-present threat when working with array data, especially across multiple time frames. Using Pandas alone is strongly discouraged: Pandas isn't aware that in financial data timestamps usually denote a bar's opening time while events can happen at any point during the bar, and thus falsely assumes that timestamps mark the exact time of an event. In VBT, there is an entire collection of functions and classes for resampling and analyzing data in a safe way!

Tutorial

Learn more in the MTF analysis tutorial.

Calculate SMA on multiple time frames and display on the same chart
>>> def mtf_sma(close, close_freq, target_freq, timeperiod=5):
...     target_close = close.vbt.realign_closing(target_freq)  # (1)!
...     target_sma = vbt.talib("SMA").run(target_close, timeperiod=timeperiod).real  # (2)!
...     target_sma = target_sma.rename(f"SMA ({target_freq})")
...     return target_sma.vbt.realign_closing(close.index, freq=close_freq)  # (3)!

>>> data = vbt.YFData.pull("BTC-USD", start="2020", end="2023")
>>> fig = mtf_sma(data.close, "D", "D").vbt.plot()
>>> mtf_sma(data.close, "D", "W-MON").vbt.plot(fig=fig)
>>> mtf_sma(data.close, "D", "MS").vbt.plot(fig=fig)
>>> fig.show()
  1. Resample the source frequency to the target frequency. Close happens at the end of the bar, thus resample as a "closing event".
  2. Calculate SMA on the target frequency
  3. Resample the target frequency back to the source frequency to be able to display multiple time frames on the same chart. Since close contains gaps, we cannot resample to close_freq because it may result in unaligned series - resample directly to the index of close instead.
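The "closing event" logic in step 1 can be sketched with plain Pandas: a weekly close is only known at the end of its week, so when mapped back to daily bars it must be stamped on the week's last day and forward-filled from there (a simplified sketch of the realign_closing idea, not the VBT implementation):

```python
import pandas as pd

# Two weeks of daily closes, Monday 2023-01-02 through Sunday 2023-01-15
daily = pd.Series(
    range(1, 15),
    index=pd.date_range("2023-01-02", periods=14, freq="D"),
    dtype=float,
)

# Weekly closes are stamped on Sundays, i.e. when they actually become known
weekly_close = daily.resample("W-SUN").last()

# Forward-fill back onto the daily index; days before the first weekly
# close stay NaN instead of leaking future information
safe_daily = weekly_close.reindex(daily.index, method="ffill")
print(safe_daily.head(8).tolist())
```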

Resamplable objects

  • Not only can you resample time series, but also complex VBT objects! Under the hood, each object comprises a bunch of array-like attributes, so resampling simply means aggregating all the accompanying information in one go. This is very convenient when you want to simulate at a higher frequency for the best accuracy, and then analyze at a lower frequency for the best speed.

Tutorial

Learn more in the MTF analysis tutorial.

Plot the monthly return heatmap of a random portfolio
>>> import calendar

>>> data = vbt.YFData.pull("BTC-USD", start="2018", end="2023")
>>> pf = vbt.PF.from_random_signals(data, n=100, direction="both")
>>> mo_returns = pf.resample("MS").returns  # (1)!
>>> mo_return_matrix = pd.Series(
...     mo_returns.values, 
...     index=pd.MultiIndex.from_arrays([
...         mo_returns.index.year,
...         mo_returns.index.month
...     ], names=["year", "month"])
... ).unstack("month")
>>> mo_return_matrix.columns = mo_return_matrix.columns.map(lambda x: calendar.month_abbr[x])
>>> mo_return_matrix.vbt.heatmap(
...     is_x_category=True,
...     trace_kwargs=dict(zmid=0, colorscale="Spectral")
... ).show()
  1. Resample the entire portfolio to the monthly frequency and compute the returns
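Numerically, resampling returns means compounding rather than summing within each target bar, which is why it has to be handled by the object itself instead of a naive aggregation (a plain-Pandas sketch of the arithmetic):

```python
import pandas as pd

# Four daily returns straddling a month boundary
daily_returns = pd.Series(
    [0.10, -0.05, 0.02, 0.03],
    index=pd.to_datetime(["2022-01-30", "2022-01-31", "2022-02-01", "2022-02-02"]),
)

# Growth factors compound within each month: (1 + r1)(1 + r2) - 1
monthly_returns = (1 + daily_returns).resample("MS").prod() - 1
print(monthly_returns)
```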

Formatting engine

  • VectorBT PRO is a very extensive library that defines thousands of classes, functions, and objects. When working with any of them, you may want to "see through" the object to better understand its attributes and contents. Thankfully, there is a new formatting engine that can accurately format any in-house object as a human-readable string. Did you know that the API documentation is partially powered by this engine? 😉
Introspect a data instance
>>> data = vbt.YFData.pull("BTC-USD", start="2020", end="2021")

>>> vbt.pprint(data)  # (1)!
YFData(
    wrapper=ArrayWrapper(...),
    data=symbol_dict({
        'BTC-USD': <pandas.core.frame.DataFrame object at 0x7f7f1fbc6cd0 with shape (366, 7)>
    }),
    single_key=True,
    classes=symbol_dict(),
    fetch_kwargs=symbol_dict({
        'BTC-USD': dict(
            start='2020',
            end='2021'
        )
    }),
    returned_kwargs=symbol_dict({
        'BTC-USD': dict()
    }),
    last_index=symbol_dict({
        'BTC-USD': Timestamp('2020-12-31 00:00:00+0000', tz='UTC')
    }),
    tz_localize=datetime.timezone.utc,
    tz_convert='UTC',
    missing_index='nan',
    missing_columns='raise'
)

>>> vbt.pdir(data)  # (2)!
                                            type                                             path
attr                                                                                                     
align_columns                        classmethod                       vectorbtpro.data.base.Data
align_index                          classmethod                       vectorbtpro.data.base.Data
build_feature_config_doc             classmethod                       vectorbtpro.data.base.Data
...                                          ...                                              ...
vwap                                    property                       vectorbtpro.data.base.Data
wrapper                                 property               vectorbtpro.base.wrapping.Wrapping
xs                                      function          vectorbtpro.base.indexing.PandasIndexer

>>> vbt.phelp(data.get)  # (3)!
YFData.get(
    columns=None,
    symbols=None,
    **kwargs
):
    Get one or more columns of one or more symbols of data.
  1. Like Python's print, but pretty-prints the contents of any VBT object
  2. Like Python's dir, but pretty-prints the attributes of a class, object, or module
  3. Like Python's help, but pretty-prints the signature and docstring of a function

Meta methods

  • Many methods such as rolling apply are now available in two flavors: regular (instance methods) and meta (class methods). Regular methods are bound to a single array and do not have to take metadata anymore, while meta methods are not bound to any array and act as micro-pipelines with their own broadcasting and templating logic. Here, VBT closes one of the key limitations of Pandas - the inability to apply a function on multiple arrays at once.
Compute the rolling z-score on one array and the rolling correlation coefficient on two arrays
>>> @njit
... def zscore_nb(x):  # (1)!
...     return (x[-1] - np.mean(x)) / np.std(x)

>>> data = vbt.YFData.pull("BTC-USD", start="2020", end="2021")
>>> data.close.rolling(14).apply(zscore_nb, raw=True)  # (2)!
Date
2020-01-01 00:00:00+00:00         NaN
                                  ...
2020-12-27 00:00:00+00:00    1.543527
2020-12-28 00:00:00+00:00    1.734715
2020-12-29 00:00:00+00:00    1.755125
2020-12-30 00:00:00+00:00    2.107147
2020-12-31 00:00:00+00:00    1.781800
Freq: D, Name: Close, Length: 366, dtype: float64

>>> data.close.vbt.rolling_apply(14, zscore_nb)  # (3)!
2020-01-01 00:00:00+00:00         NaN
                                  ...
2020-12-27 00:00:00+00:00    1.543527
2020-12-28 00:00:00+00:00    1.734715
2020-12-29 00:00:00+00:00    1.755125
2020-12-30 00:00:00+00:00    2.107147
2020-12-31 00:00:00+00:00    1.781800
Freq: D, Name: Close, Length: 366, dtype: float64

>>> @njit
... def corr_meta_nb(from_i, to_i, col, a, b):  # (4)!
...     a_window = a[from_i:to_i, col]
...     b_window = b[from_i:to_i, col]
...     return np.corrcoef(a_window, b_window)[1, 0]

>>> data2 = vbt.YFData.pull(["ETH-USD", "XRP-USD"], start="2020", end="2021")
>>> vbt.pd_acc.rolling_apply(  # (5)!
...     14, 
...     corr_meta_nb, 
...     vbt.Rep("a"),
...     vbt.Rep("b"),
...     broadcast_named_args=dict(a=data.close, b=data2.close)
... )
symbol                      ETH-USD   XRP-USD
Date                                         
2020-01-01 00:00:00+00:00       NaN       NaN
...                             ...       ...
2020-12-27 00:00:00+00:00  0.636862 -0.511303
2020-12-28 00:00:00+00:00  0.674514 -0.622894
2020-12-29 00:00:00+00:00  0.712531 -0.773791
2020-12-30 00:00:00+00:00  0.839355 -0.772295
2020-12-31 00:00:00+00:00  0.878897 -0.764446

[366 rows x 2 columns]
  1. Access to the window only
  2. Using Pandas
  3. Using the regular method, which accepts the same function as Pandas
  4. Access to one to multiple whole arrays
  5. Using the meta method, which accepts metadata and variable arguments
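The essence of the meta flavor is that the looper owns the windowing while the callback receives window bounds plus whole arrays, so it can consume as many arrays as it likes (a generic Python sketch of the pattern, not the VBT implementation, which also broadcasts and compiles with Numba):

```python
import numpy as np

def meta_rolling_apply(window, func, *arrs):
    """Call func(from_i, to_i, *arrs) for each full window."""
    n = arrs[0].shape[0]
    out = np.full(n, np.nan)
    for i in range(window - 1, n):
        out[i] = func(i - window + 1, i + 1, *arrs)
    return out

def corr_meta(from_i, to_i, a, b):
    # The callback slices the whole arrays itself - the key meta idea
    return np.corrcoef(a[from_i:to_i], b[from_i:to_i])[1, 0]

a = np.arange(10, dtype=float)
b = a * 2 + 1  # perfectly linearly related to a
result = meta_rolling_apply(3, corr_meta, a, b)
print(result)
```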

Array expressions

  • When combining multiple arrays, they often need to be properly aligned and broadcasted before the actual operation. Using Pandas alone won't do the trick because Pandas is too strict in this regard. Luckily, VBT has an accessor class method that can take a regular Python expression, identify all the variable names, extract the corresponding arrays from the current context, broadcast them, and only then evaluate the actual expression (also using NumExpr!) ⌨
Evaluate a multiline array expression based on a Bollinger Bands indicator
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"])

>>> low = data.low
>>> high = data.high
>>> bb = vbt.talib("BBANDS").run(data.close)
>>> upperband = bb.upperband
>>> lowerband = bb.lowerband
>>> bandwidth = (bb.upperband - bb.lowerband) / bb.middleband
>>> up_th = vbt.Param([0.3, 0.4]) 
>>> low_th = vbt.Param([0.1, 0.2])

>>> expr = """
... narrow_bands = bandwidth < low_th
... above_upperband = high > upperband
... wide_bands = bandwidth > up_th
... below_lowerband = low < lowerband
... (narrow_bands & above_upperband) | (wide_bands & below_lowerband)
... """
>>> mask = vbt.pd_acc.eval(expr)
>>> mask.sum()
low_th  up_th  symbol 
0.1     0.3    BTC-USD    344
               ETH-USD    171
        0.4    BTC-USD    334
               ETH-USD    158
0.2     0.3    BTC-USD    444
               ETH-USD    253
        0.4    BTC-USD    434
               ETH-USD    240
dtype: int64
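The building block underneath is string-expression evaluation over named arrays, which plain Pandas already offers via pd.eval; VBT layers multiline expressions, broadcasting, and parameter handling on top (a minimal single-expression sketch):

```python
import pandas as pd

# Toy bandwidth values and fixed thresholds (vbt.Param would turn these
# thresholds into full parameter grids)
bandwidth = pd.Series([0.05, 0.15, 0.35, 0.45])
low_th = 0.1
up_th = 0.4

# pd.eval resolves the variable names from the surrounding scope
mask = pd.eval("(bandwidth < low_th) | (bandwidth > up_th)")
print(mask.tolist())
```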

Resource management

  • New profiling tools to measure the execution time and memory usage of any code block 🧰
Profile getting the Sharpe ratio of a random portfolio
>>> data = vbt.YFData.pull("BTC-USD")

>>> with (
...     vbt.Timer() as timer, 
...     vbt.MemTracer() as mem_tracer
... ):
...     print(vbt.PF.from_random_signals(data.close, n=100).sharpe_ratio)
0.33111243921865163

>>> print(timer.elapsed())
74.15 milliseconds

>>> print(mem_tracer.peak_usage())
459.7 kB
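Conceptually, Timer and MemTracer wrap what the standard library exposes through time.perf_counter and tracemalloc (a stdlib sketch of the same measurements, not the VBT implementation):

```python
import time
import tracemalloc

tracemalloc.start()
start = time.perf_counter()

payload = [i * i for i in range(100_000)]  # the code block under test

elapsed = time.perf_counter() - start
_, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
tracemalloc.stop()

print(f"elapsed: {elapsed * 1000:.2f} ms, peak: {peak / 1024:.1f} kB")
```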

Templates

It's super easy to extend classes, but VBT revolves around functions, so how do we enhance them or change their workflow? The easiest way is to introduce a tiny function (i.e., a callback) that can be provided by the user and called by the main function at some point. But this would require the main function to know which arguments to pass to the callback and what to do with the outputs. Here's a better idea: allow most arguments of the main function to become callbacks, and then execute them to reveal the actual values. Such arguments are called "templates" and the process is called "substitution". Templates are especially useful when some arguments (such as arrays) should be constructed only once all the required information is available, for example, once other arrays have been broadcast. Each substitution opportunity also has its own identifier, so you can control when a template should be substituted. In VBT, templates are first-class citizens and are integrated into most functions for unmatched flexibility! 🧞

Design a template-enhanced resampling functionality
>>> def resample_apply(index, by, apply_func, *args, template_context={}, **kwargs):
...     grouper = index.vbt.get_grouper(by)  # (1)!
...     results = {}
...     with vbt.get_pbar() as pbar:
...         for group, group_idxs in grouper:  # (2)!
...             group_index = index[group_idxs]
...             context = {"group": group, "group_index": group_index, **template_context}  # (3)!
...             final_apply_func = vbt.substitute_templates(apply_func, context, eval_id="apply_func")  # (4)!
...             final_args = vbt.substitute_templates(args, context, eval_id="args")
...             final_kwargs = vbt.substitute_templates(kwargs, context, eval_id="kwargs")
...             results[group] = final_apply_func(*final_args, **final_kwargs)
...             pbar.update(1)
...     return pd.Series(results)

>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"], missing_index="drop")
>>> resample_apply(
...     data.index, "Y", 
...     lambda x, y: x.corr(y),  # (5)!
...     vbt.RepEval("btc_close[group_index]"),  # (6)!
...     vbt.RepEval("eth_close[group_index]"),
...     template_context=dict(
...         btc_close=data.get("Close", "BTC-USD"),  # (7)!
...         eth_close=data.get("Close", "ETH-USD")
...     )
... )
  1. Build a grouper. Accepts both group-by and resample instructions.
  2. Iterate over groups in the grouper. Each group consists of the label (such as 2017-01-01 00:00:00+00:00) and row indices corresponding to this label.
  3. Populate a new context with the information on the current group and user-provided external information
  4. Substitute the function and arguments using the newly-populated context
  5. Simple function to compute the correlation coefficient of two arrays
  6. Define both arguments as expression templates where we select the data corresponding to each group. All variables in these expressions will be automatically recognized and replaced by the current context. Once evaluated, the templates will be substituted by their outputs.
  7. Here we can specify additional information our templates depend upon

Group 7/7

2017    0.808930
2018    0.897112
2019    0.753659
2020    0.940741
2021    0.553255
2022    0.975911
2023    0.974914
Freq: A-DEC, dtype: float64
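The core of the substitution mechanism fits in a few lines: arguments that are templates get resolved against a context right before the call (a generic sketch; the real engine additionally supports expression evaluation, nested containers, and per-argument eval ids as shown above):

```python
class Rep:
    """A placeholder that will be replaced by context[key] at call time."""

    def __init__(self, key):
        self.key = key

def substitute(args, context):
    # Resolve template placeholders, pass everything else through untouched
    return [context[a.key] if isinstance(a, Rep) else a for a in args]

def pipeline(func, *args, context):
    # The pipeline substitutes templates only once the context is known
    return func(*substitute(args, context))

result = pipeline(
    lambda x, y: x + y,
    Rep("a"),  # deferred: resolved against the context below
    10,
    context={"a": 32},
)
print(result)  # 42
```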