Productivity¶
>>> import numpy as np
>>> import pandas as pd
>>> from numba import njit
>>> import vectorbtpro as vbt
DataFrame product¶
- Several parameterized indicators can produce DataFrames with different shapes and columns, which makes a Cartesian product of them tricky since they often share common column levels (such as "symbol") that shouldn't be combined with each other. There's now a method to cross-join multiple DataFrames block-wise.
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"], missing_index="drop")
>>> sma = data.run("sma", timeperiod=[10, 20], unpack=True)
>>> ema = data.run("ema", timeperiod=[30, 40], unpack=True)
>>> wma = data.run("wma", timeperiod=[50, 60], unpack=True)
>>> sma, ema, wma = sma.vbt.x(ema, wma) # (1)!
>>> entries = sma.vbt.crossed_above(wma)
>>> exits = ema.vbt.crossed_below(wma)
>>> entries.columns
MultiIndex([(10, 30, 50, 'BTC-USD'),
(10, 30, 50, 'ETH-USD'),
(10, 30, 60, 'BTC-USD'),
(10, 30, 60, 'ETH-USD'),
(10, 40, 50, 'BTC-USD'),
(10, 40, 50, 'ETH-USD'),
(10, 40, 60, 'BTC-USD'),
(10, 40, 60, 'ETH-USD'),
(20, 30, 50, 'BTC-USD'),
(20, 30, 50, 'ETH-USD'),
(20, 30, 60, 'BTC-USD'),
(20, 30, 60, 'ETH-USD'),
(20, 40, 50, 'BTC-USD'),
(20, 40, 50, 'ETH-USD'),
(20, 40, 60, 'BTC-USD'),
(20, 40, 60, 'ETH-USD')],
names=['sma_timeperiod', 'ema_timeperiod', 'wma_timeperiod', 'symbol'])
- Build a Cartesian product of three DataFrames while keeping the column level "symbol" untouched. The same can be achieved with vbt.pd_acc.cross(sma, ema, wma).
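For intuition, the block-wise cross join can be sketched in plain pandas: the parameter levels of the inputs are combined in a Cartesian product while the shared "symbol" level is kept paired. The helper below is a hypothetical stand-in (cross_blockwise is not vbt's implementation) and assumes the parameter level comes first and the block level last:

```python
import itertools

import numpy as np
import pandas as pd

# Hypothetical sketch of a block-wise cross join (not vbt's implementation):
# parameter levels are crossed, the shared block level ("symbol") is paired.
def cross_blockwise(df1, df2, block_level="symbol"):
    syms = df1.columns.get_level_values(block_level).unique()
    p1 = df1.columns.get_level_values(0).unique()
    p2 = df2.columns.get_level_values(0).unique()
    combos = [(a, b, s) for a, b in itertools.product(p1, p2) for s in syms]
    cols = pd.MultiIndex.from_tuples(
        combos, names=[df1.columns.names[0], df2.columns.names[0], block_level])
    out1 = pd.concat([df1[(a, s)] for a, _, s in combos], axis=1)
    out2 = pd.concat([df2[(b, s)] for _, b, s in combos], axis=1)
    out1.columns = out2.columns = cols
    return out1, out2

index = pd.date_range("2020-01-01", periods=3)
sma = pd.DataFrame(
    np.arange(12).reshape(3, 4), index=index,
    columns=pd.MultiIndex.from_product(
        [[10, 20], ["BTC-USD", "ETH-USD"]], names=["sma_timeperiod", "symbol"]))
ema = pd.DataFrame(
    np.arange(12).reshape(3, 4) * 10, index=index,
    columns=pd.MultiIndex.from_product(
        [[30, 40], ["BTC-USD", "ETH-USD"]], names=["ema_timeperiod", "symbol"]))
sma_x, ema_x = cross_blockwise(sma, ema)
```

Each output has 2 x 2 x 2 = 8 columns: every combination of the two parameter levels, repeated once per symbol.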
Index records¶
- How do you backtest time and asset-anchored queries such as "Order X units of asset Y on date Z"? Normally, you'd have to construct full arrays and set the information manually. But now there's a less stressful way: thanks to preparers and the redesigned smart indexing, you can provide all the information in a compressed record format! Under the hood, the record array is translated into a set of index dictionaries - one per argument.
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"], missing_index="drop")
>>> records = [
... dict(date="2022", symbol="BTC-USD", long_entry=True), # (1)!
... dict(date="2022", symbol="ETH-USD", short_entry=True),
... dict(row=-1, exit=True),
... ]
>>> pf = vbt.PF.from_signals(data, records=records) # (2)!
>>> pf.orders.records_readable
Order Id Column Signal Index Creation Index
0 0 BTC-USD 2022-01-01 00:00:00+00:00 2022-01-01 00:00:00+00:00 \
1 1 BTC-USD 2023-04-25 00:00:00+00:00 2023-04-25 00:00:00+00:00
2 0 ETH-USD 2022-01-01 00:00:00+00:00 2022-01-01 00:00:00+00:00
3 1 ETH-USD 2023-04-25 00:00:00+00:00 2023-04-25 00:00:00+00:00
Fill Index Size Price Fees Side Type
0 2022-01-01 00:00:00+00:00 0.002097 47686.812500 0.0 Buy Market \
1 2023-04-25 00:00:00+00:00 0.002097 27534.675781 0.0 Sell Market
2 2022-01-01 00:00:00+00:00 0.026527 3769.697021 0.0 Sell Market
3 2023-04-25 00:00:00+00:00 0.026527 1834.759644 0.0 Buy Market
Stop Type
0 None
1 None
2 None
3 None
- Every broadcastable argument is supported. Rows can be provided via "row", "index", "date", or "datetime". Columns can be provided via "col", "column", or "symbol". If a row isn't provided, the entire column is set; if a column isn't provided, the entire row is set. If neither is provided, the entire array is set. Rows and columns can be provided as integer positions, labels, datetimes, or even complex indexers!
- Arguments not used in records can still be provided as usual. Arguments used in records can also be provided to be used as defaults.
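Conceptually, each record selects a (row, column) region of a target array and writes a value there. Below is a rough plain-pandas sketch of that translation (records_to_array is a hypothetical helper, not vbt's actual logic, and it only handles the "date", "row", and "symbol" keys):

```python
import numpy as np
import pandas as pd

# Toy record-to-array translation (hypothetical, not vbt's implementation):
# each record sets a region of a boolean target array to True.
index = pd.date_range("2021-12-30", periods=5, tz="UTC")
columns = pd.Index(["BTC-USD", "ETH-USD"], name="symbol")

def records_to_array(records, field):
    arr = pd.DataFrame(False, index=index, columns=columns)
    for r in records:
        if not r.get(field, False):
            continue  # this record doesn't set this field
        if "date" in r:
            # first bar at or after the given date
            row_sel = index[index.searchsorted(pd.Timestamp(r["date"], tz="UTC"))]
        elif "row" in r:
            row_sel = index[r["row"]]
        else:
            row_sel = slice(None)  # no row given: set the entire column
        col_sel = r.get("symbol", slice(None))  # no column given: entire row
        arr.loc[row_sel, col_sel] = True
    return arr

records = [
    dict(date="2022", symbol="BTC-USD", long_entry=True),
    dict(date="2022", symbol="ETH-USD", short_entry=True),
    dict(row=-1, exit=True),
]
long_entries = records_to_array(records, "long_entry")
exits = records_to_array(records, "exit")
```

The first record becomes a single True at (2022-01-01, BTC-USD) in the long-entry array, while the last record marks the final row of every column in the exit array.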
Compression¶
- Serialized vectorbtpro objects may sometimes take a lot of disk space. With this update, there's now support for a variety of compression algorithms to make files as light as possible!
>>> data = vbt.RandomOHLCData.pull("RAND", start="2022", end="2023", freq="1 minute")
>>> file_path = data.save()
>>> print(vbt.file_size(file_path))
21.0 MB
>>> file_path = data.save(compression="blosc")
>>> print(vbt.file_size(file_path))
13.3 MB
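The same idea can be sketched with the standard library alone: serialize with pickle, compress the bytes, and compare sizes (illustrative only; vbt's save() wraps this kind of workflow and offers additional codecs such as blosc):

```python
import gzip
import pickle

import numpy as np

# Illustrative only: compare serialized sizes with and without compression
arr = np.arange(100_000)  # regular data compresses very well
raw = pickle.dumps(arr)
compressed = gzip.compress(raw)
restored = pickle.loads(gzip.decompress(compressed))
```

Round-tripping through gzip.decompress restores the exact bytes, so the loaded object is identical to the original.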
Faster loading¶
- If your pipeline doesn't need accessors, Plotly graphs, and most other optional functionality, you can disable the auto-import feature entirely to bring the loading time of vectorbtpro down to under a second
>>> import time
>>> start = time.time()
>>> import vectorbtpro as vbt
>>> end = time.time()
>>> end - start
0.580937910079956
Configuration files¶
- VectorBT PRO extends configparser to define its own configuration format that lets the user save, introspect, modify, and load back any complex in-house object. The main advantages of this format are readability and round-tripping: any object can be encoded and then decoded back without information loss. The main features include nested structures, references, parsing of literals, as well as evaluation of arbitrary Python expressions. Additionally, you can now create a configuration file for vectorbtpro and put it into the working directory - it will be used to update the default settings whenever the package is imported!
[plotting]
default_theme = dark
[portfolio]
init_cash = 5000
[data.custom.binance.client_config]
api_key = YOUR_API_KEY
api_secret = YOUR_API_SECRET
[data.custom.ccxt.exchanges.binance.exchange_config]
apiKey = &data.custom.binance.client_config.api_key
secret = &data.custom.binance.client_config.api_secret
>>> import vectorbtpro as vbt
>>> vbt.settings.portfolio["init_cash"]
5000
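The &-reference syntax shown above can be approximated with the standard configparser plus a tiny resolver (a hypothetical sketch, not vbt's actual parser, which additionally handles literals and Python expressions):

```python
import configparser

# Minimal resolver for &-style references (hypothetical sketch).
# Dotted section names are plain configparser sections.
CONFIG = """
[portfolio]
init_cash = 5000

[data.custom.binance.client_config]
api_key = YOUR_API_KEY

[data.custom.ccxt.exchanges.binance.exchange_config]
apiKey = &data.custom.binance.client_config.api_key
"""

parser = configparser.ConfigParser()
parser.optionxform = str  # preserve option-name case (e.g., apiKey)
parser.read_string(CONFIG)

def resolve(parser, value):
    # Follow "&section.path.option" references until a plain value remains
    if not value.startswith("&"):
        return value
    section, option = value[1:].rsplit(".", 1)
    return resolve(parser, parser.get(section, option))

api_key = resolve(parser, parser.get(
    "data.custom.ccxt.exchanges.binance.exchange_config", "apiKey"))
```

Resolving the reference hands the ccxt section the same API key that was defined once under the binance client config.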
Serialization¶
- Just like machine learning models, every native vectorbtpro object can be serialized and saved to a binary file - it's never been easier to share data and insights! Another benefit is that only the actual content of each object is serialized, not its class definition, so the loaded object always uses the most up-to-date class definition. There's also special logic that can help you "reconstruct" objects if vectorbtpro has introduced breaking API changes
>>> data = vbt.YFData.pull("BTC-USD", start="2022-01-01", end="2022-06-01")
>>> def backtest_month(close):
... return vbt.PF.from_random_signals(close, n=10)
>>> month_pfs = data.close.resample("MS").apply(backtest_month)
>>> month_pfs
Date
2022-01-01 00:00:00+00:00 Portfolio(\n wrapper=ArrayWrapper(\n ...
2022-02-01 00:00:00+00:00 Portfolio(\n wrapper=ArrayWrapper(\n ...
2022-03-01 00:00:00+00:00 Portfolio(\n wrapper=ArrayWrapper(\n ...
2022-04-01 00:00:00+00:00 Portfolio(\n wrapper=ArrayWrapper(\n ...
2022-05-01 00:00:00+00:00 Portfolio(\n wrapper=ArrayWrapper(\n ...
Freq: MS, Name: Close, dtype: object
>>> vbt.save(month_pfs, "month_pfs") # (1)!
>>> month_pfs = vbt.load("month_pfs") # (2)!
>>> month_pfs.apply(lambda pf: pf.total_return)
Date
2022-01-01 00:00:00+00:00 -0.048924
2022-02-01 00:00:00+00:00 0.168370
2022-03-01 00:00:00+00:00 0.016087
2022-04-01 00:00:00+00:00 -0.120525
2022-05-01 00:00:00+00:00 0.110751
Freq: MS, Name: Close, dtype: float64
- Save to disk
- Load from disk later
Data parsing¶
- Tired of passing open, high, low, and close as separate time series? Portfolio class methods have been extended to take a data instance instead of close and extract the contained OHLC data automatically - a small but time-saving feature!
>>> data = vbt.YFData.pull("BTC-USD", start="2020-01", end="2020-03")
>>> pf = vbt.PF.from_random_signals(data, n=10)
Index dictionaries¶
Manually constructing arrays and setting their data with Pandas is often painful. Fortunately, there's new functionality that provides much-needed help! Any broadcastable argument can become an index dictionary, which contains instructions on where to set values in the array and does the filling job for you. It knows exactly which axis has to be modified and doesn't create a full array unless necessary - with much love to RAM
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"])
>>> tile = pd.Index(["daily", "weekly"], name="strategy") # (1)!
>>> pf = vbt.PF.from_orders(
... data.close,
... size=vbt.index_dict({ # (2)!
... vbt.idx(
... vbt.pointidx(every="D"),
... vbt.colidx("daily", level="strategy")): 100, # (3)!
... vbt.idx(
... vbt.pointidx(every="W-SUN"),
... vbt.colidx("daily", level="strategy")): -np.inf, # (4)!
... vbt.idx(
... vbt.pointidx(every="W-MON"),
... vbt.colidx("weekly", level="strategy")): 100,
... vbt.idx(
... vbt.pointidx(every="M"),
... vbt.colidx("weekly", level="strategy")): -np.inf,
... }),
... size_type="value",
... direction="longonly",
... init_cash="auto",
... broadcast_kwargs=dict(tile=tile)
... )
>>> pf.sharpe_ratio
strategy symbol
daily BTC-USD 0.702259
ETH-USD 0.782296
weekly BTC-USD 0.838895
ETH-USD 0.524215
Name: sharpe_ratio, dtype: float64
- To represent two strategies, you need to tile the same data twice. For this, create a parameter with strategy names and pass it as "tile" to the broadcaster so that it tiles the columns of each array (such as price) two times.
- An index dictionary contains index instructions as keys and the data to be set as values. Keys can be anything from row indices and labels to custom indexer classes such as PointIdxr.
- Find the indices of the rows that correspond to the beginning of each day and the index of the column "daily", and set each element under those indices to 100 (= accumulate)
- Find the indices of the rows that correspond to Sunday. If any value under those indices has already been set with any previous instruction, it will be overridden.
Slicing¶
- Similarly to selecting columns, each vectorbtpro object is now capable of slicing rows, using the exact same mechanism as in Pandas. This makes it super easy to analyze and plot any subset of the simulated data, without re-simulation!
>>> data = vbt.YFData.pull("BTC-USD")
>>> pf = vbt.PF.from_holding(data, freq="d")
>>> pf.sharpe_ratio
1.116727709477293
>>> pf.loc[:"2020"].sharpe_ratio # (1)!
1.2699801554196481
>>> pf.loc["2021": "2021"].sharpe_ratio # (2)!
0.9825161170278687
>>> pf.loc["2022":].sharpe_ratio # (3)!
-1.0423271337174647
- Get the Sharpe during the year 2020 and before
- Get the Sharpe during the year 2021
- Get the Sharpe during the year 2022 and after
Column stacking¶
- Complex vectorbtpro objects of the same type can be easily stacked along columns. For instance, you can combine multiple totally unrelated trading strategies into the same portfolio for analysis. Under the hood, the final object is still represented as a monolithic multi-dimensional structure, which can be processed even faster than the individual objects separately
>>> def strategy1(data):
... fast_ma = vbt.MA.run(data.close, 50, short_name="fast_ma")
... slow_ma = vbt.MA.run(data.close, 200, short_name="slow_ma")
... entries = fast_ma.ma_crossed_above(slow_ma)
... exits = fast_ma.ma_crossed_below(slow_ma)
... return vbt.PF.from_signals(
... data.close,
... entries,
... exits,
... size=100,
... size_type="value",
... init_cash="auto"
... )
>>> def strategy2(data):
... bbands = vbt.BBANDS.run(data.close, window=14)
... entries = bbands.close_crossed_below(bbands.lower)
... exits = bbands.close_crossed_above(bbands.upper)
... return vbt.PF.from_signals(
... data.close,
... entries,
... exits,
... init_cash=200
... )
>>> data1 = vbt.BinanceData.pull("BTCUSDT")
>>> pf1 = strategy1(data1) # (1)!
>>> pf1.sharpe_ratio
0.9100317671866922
>>> data2 = vbt.BinanceData.pull("ETHUSDT")
>>> pf2 = strategy2(data2) # (2)!
>>> pf2.sharpe_ratio
-0.11596286232734827
>>> pf_sep = vbt.PF.column_stack((pf1, pf2)) # (3)!
>>> pf_sep.sharpe_ratio
0 0.910032
1 -0.115963
Name: sharpe_ratio, dtype: float64
>>> pf_join = vbt.PF.column_stack((pf1, pf2), group_by=True) # (4)!
>>> pf_join.sharpe_ratio
0.42820898354646514
- Analyze the first strategy in a separate portfolio
- Analyze the second strategy in a separate portfolio
- Analyze both strategies in the same portfolio separately
- Analyze both strategies in the same portfolio jointly
Row stacking¶
- Complex vectorbtpro objects of the same type can be easily stacked along rows. For instance, you can append new data to an existing portfolio, or even concatenate in-sample portfolios with their out-of-sample counterparts
>>> def strategy(data, start=None, end=None):
... fast_ma = vbt.MA.run(data.close, 50, short_name="fast_ma")
... slow_ma = vbt.MA.run(data.close, 200, short_name="slow_ma")
... entries = fast_ma.ma_crossed_above(slow_ma)
... exits = fast_ma.ma_crossed_below(slow_ma)
... return vbt.PF.from_signals(
... data.close[start:end],
... entries[start:end],
... exits[start:end],
... size=100,
... size_type="value",
... init_cash="auto"
... )
>>> data = vbt.BinanceData.pull("BTCUSDT")
>>> pf_whole = strategy(data) # (1)!
>>> pf_whole.sharpe_ratio
0.9100317671866922
>>> pf_sub1 = strategy(data, end="2019-12-31") # (2)!
>>> pf_sub1.sharpe_ratio
0.7810397448678937
>>> pf_sub2 = strategy(data, start="2020-01-01") # (3)!
>>> pf_sub2.sharpe_ratio
1.070339534746574
>>> pf_join = vbt.PF.row_stack((pf_sub1, pf_sub2)) # (4)!
>>> pf_join.sharpe_ratio
0.9100317671866922
- Analyze the entire range
- Analyze the first date range
- Analyze the second date range
- Join both date ranges and analyze as a whole
Index alignment¶
- Pandas arrays are no longer required to share the same index: the indexes of all arrays that should broadcast against each other are aligned automatically, as long as they have the same data type.
>>> btc_data = vbt.YFData.pull("BTC-USD")
>>> btc_data.wrapper.shape
(2817, 7)
>>> eth_data = vbt.YFData.pull("ETH-USD") # (1)!
>>> eth_data.wrapper.shape
(1668, 7)
>>> ols = vbt.OLS.run( # (2)!
... btc_data.close,
... eth_data.close
... )
>>> ols.pred
Date
2014-09-17 00:00:00+00:00 NaN
2014-09-18 00:00:00+00:00 NaN
2014-09-19 00:00:00+00:00 NaN
2014-09-20 00:00:00+00:00 NaN
2014-09-21 00:00:00+00:00 NaN
... ...
2022-05-30 00:00:00+00:00 2109.769242
2022-05-31 00:00:00+00:00 2028.856767
2022-06-01 00:00:00+00:00 1911.555689
2022-06-02 00:00:00+00:00 1930.169725
2022-06-03 00:00:00+00:00 1882.573170
Freq: D, Name: Close, Length: 2817, dtype: float64
- ETH-USD history is shorter than BTC-USD history
- This now works! Just make sure that all arrays share the same timeframe and timezone.
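What alignment means here can be sketched with pandas' own align() (illustrative; vbt's broadcaster performs this step for you): the shorter history is reindexed onto the union index, with the missing head becoming NaN.

```python
import numpy as np
import pandas as pd

# Sketch of index alignment: a shorter series gains a NaN head so that
# both series share the longer index (hypothetical toy data).
btc_index = pd.date_range("2020-01-01", periods=10, tz="UTC")
eth_index = btc_index[4:]  # the second history starts later
btc_close = pd.Series(np.arange(10.0), index=btc_index)
eth_close = pd.Series(np.arange(6.0), index=eth_index)

btc_aligned, eth_aligned = btc_close.align(eth_close)
```

After alignment both series have 10 rows, which is why the OLS output above starts with NaN rows where ETH-USD has no data yet.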
Numba datetime¶
- Numba has no support for datetime indexes (or any other Pandas objects), and there are no built-in Numba functions for working with datetime either. So, how do you connect data to time? vectorbtpro closes this gap by implementing a collection of functions that extract various information from each timestamp, such as the current time and day of the week, to determine whether a bar happens during trading hours.
Tutorial
Learn more in the Signal development tutorial.
>>> @njit
... def month_start_pct_change_nb(arr, index):
... out = np.full(arr.shape, np.nan)
... for col in range(arr.shape[1]):
... for i in range(arr.shape[0]):
... if i == 0 or vbt.dt_nb.month_nb(index[i - 1]) != vbt.dt_nb.month_nb(index[i]):
... month_start_value = arr[i, col]
... else:
... out[i, col] = (arr[i, col] - month_start_value) / month_start_value
... return out
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"], start="2022", end="2023")
>>> pct_change = month_start_pct_change_nb(
... vbt.to_2d_array(data.close),
... data.index.vbt.to_ns() # (1)!
... )
>>> pct_change = data.symbol_wrapper.wrap(pct_change)
>>> pct_change.vbt.plot().show()
- Convert the datetime index to the nanosecond format
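For intuition, a helper like month_nb can be approximated in plain NumPy by converting a nanosecond timestamp to month precision (month_from_ns is a hypothetical stand-in; the real helpers are Numba-compiled):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the kind of helper vbt.dt_nb provides:
# derive the calendar month from a nanosecond timestamp without
# touching any Pandas objects.
def month_from_ns(ts_ns):
    # Months elapsed since 1970-01, mapped to 1..12
    months = np.datetime64(int(ts_ns), "ns").astype("datetime64[M]").astype(int)
    return int(months % 12 + 1)

jan_2022 = pd.Timestamp("2022-01-03", tz="UTC").value  # nanoseconds since epoch
```

Comparing such month numbers for consecutive timestamps is exactly how the function above detects the start of a new month.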
Periods ago¶
- Instead of writing Numba functions, comparing values at different bars can also be done in a vectorized manner with Pandas. The problem is that there are no built-in functions to easily shift values based on timedeltas, nor are there rolling functions to check whether an event happened during a period of time in the past. This gap is closed by various new accessor methods.
Tutorial
Learn more in the Signal development tutorial.
>>> data = vbt.YFData.pull("BTC-USD", start="2022-05", end="2022-08")
>>> mask = (data.close < data.close.vbt.ago(1)).vbt.all_ago(5)
>>> fig = data.plot(plot_volume=False)
>>> mask.vbt.signals.ranges.plot_shapes(
... plot_close=False,
... fig=fig,
... shape_kwargs=dict(fillcolor="orangered")
... )
>>> fig.show()
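On a regular, gap-free index, the two accessor calls above roughly correspond to a shift(1) comparison followed by a rolling all-True check - which is exactly the pandas approximation that breaks down once the index has gaps:

```python
import pandas as pd

# Plain-pandas approximation of the accessor calls (toy data; only valid
# on a gap-free index, which is the limitation the accessors remove).
index = pd.date_range("2022-05-01", periods=8, tz="UTC")
close = pd.Series([10, 9, 8, 7, 6, 5, 7, 8], index=index, dtype=float)

down_bar = close < close.shift(1)              # ~ close < close.vbt.ago(1)
down_streak = down_bar.rolling(5).sum().eq(5)  # ~ .vbt.all_ago(5): 5 down bars in a row
```

Here only the sixth bar completes a streak of five consecutive down bars.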
Safe resampling¶
- Look-ahead bias is an ongoing threat when working with array data, especially on multiple time frames. Using Pandas alone is strongly discouraged because it isn't aware that financial data mainly involves bars, where timestamps are opening times and events can happen at any time between bars, and thus falsely assumes that timestamps denote the exact time of an event. In vectorbtpro, there's an entire collection of functions and classes for resampling and analyzing data in a safe way!
Tutorial
Learn more in the MTF analysis tutorial.
>>> def mtf_sma(close, close_freq, target_freq, timeperiod=5):
... target_close = close.vbt.resample_closing(target_freq) # (1)!
... target_sma = vbt.talib("SMA").run(target_close, timeperiod=timeperiod).real # (2)!
... target_sma = target_sma.rename(f"SMA ({target_freq})")
... return target_sma.vbt.resample_closing(close.index, freq=close_freq) # (3)!
>>> data = vbt.YFData.pull("BTC-USD", start="2020", end="2023")
>>> fig = mtf_sma(data.close, "D", "D").vbt.plot()
>>> mtf_sma(data.close, "D", "W-MON").vbt.plot(fig=fig)
>>> mtf_sma(data.close, "D", "MS").vbt.plot(fig=fig)
>>> fig.show()
- Resample the source frequency to the target frequency. Close happens at the end of the bar, thus resample as a "closing event".
- Calculate SMA on the target frequency
- Resample the target frequency back to the source frequency to be able to display multiple time frames on the same chart. Since "close" contains gaps, we cannot resample to "close_freq" because it may result in unaligned series - resample directly to the index of "close" instead.
Resamplable objects¶
- Not only can you resample time series, but also complex vectorbtpro objects! Under the hood, each object comprises a bunch of array-like attributes, so resampling here simply means aggregating all the accompanying information in one go. This is very convenient when you want to simulate at a higher frequency for the best accuracy, and then analyze at a lower frequency for the best speed.
Tutorial
Learn more in the MTF analysis tutorial.
>>> import calendar
>>> data = vbt.YFData.pull("BTC-USD", start="2018", end="2023")
>>> pf = vbt.PF.from_random_signals(data, n=100, direction="both")
>>> mo_returns = pf.resample("MS").returns # (1)!
>>> mo_return_matrix = pd.Series(
... mo_returns.values,
... index=pd.MultiIndex.from_arrays([
... mo_returns.index.year,
... mo_returns.index.month
... ], names=["year", "month"])
... ).unstack("month")
>>> mo_return_matrix.columns = mo_return_matrix.columns.map(lambda x: calendar.month_abbr[x])
>>> mo_return_matrix.vbt.heatmap(
... is_x_category=True,
... trace_kwargs=dict(zmid=0, colorscale="Spectral")
... ).show()
- Resample the entire portfolio to the monthly frequency and compute the returns
Formatting engine¶
- VectorBT PRO is a very extensive library that defines thousands of classes, functions, and objects. Thus, when working with any of them, you may want to "see through" the object to gain a better understanding of its attributes and contents. Fortunately, there's a new formatting engine that can accurately format any in-house object as a human-readable string. Did you know that the API documentation is partially powered by this engine?
>>> data = vbt.YFData.pull("BTC-USD", start="2020", end="2021")
>>> vbt.pprint(data) # (1)!
YFData(
wrapper=ArrayWrapper(...),
data=symbol_dict({
'BTC-USD': <pandas.core.frame.DataFrame object at 0x7f7f1fbc6cd0 with shape (366, 7)>
}),
single_key=True,
classes=symbol_dict(),
fetch_kwargs=symbol_dict({
'BTC-USD': dict(
start='2020',
end='2021'
)
}),
returned_kwargs=symbol_dict({
'BTC-USD': dict()
}),
last_index=symbol_dict({
'BTC-USD': Timestamp('2020-12-31 00:00:00+0000', tz='UTC')
}),
tz_localize=datetime.timezone.utc,
tz_convert='UTC',
missing_index='nan',
missing_columns='raise'
)
>>> vbt.pdir(data) # (2)!
type path
attr
align_columns classmethod vectorbtpro.data.base.Data
align_index classmethod vectorbtpro.data.base.Data
build_feature_config_doc classmethod vectorbtpro.data.base.Data
... ... ...
vwap property vectorbtpro.data.base.Data
wrapper property vectorbtpro.base.wrapping.Wrapping
xs function vectorbtpro.base.indexing.PandasIndexer
>>> vbt.phelp(data.get) # (3)!
YFData.get(
columns=None,
symbols=None,
**kwargs
):
Get one or more columns of one or more symbols of data.
- Just like Python's print command, but pretty-prints the contents of any vectorbtpro object
- Just like Python's dir command, but pretty-prints the attributes of a class, object, or module
- Just like Python's help command, but pretty-prints the signature and docstring of a function
Meta methods¶
- Many methods such as rolling apply are now available in two flavors: regular (instance methods) and meta (class methods). Regular methods are bound to a single array and do not have to take metadata anymore, while meta methods are not bound to any array and act as micro-pipelines with their own broadcasting and templating logic. Here, vectorbtpro closes one of the key limitations of Pandas - the inability to apply a function on multiple arrays at once.
>>> @njit
... def zscore_nb(x): # (1)!
... return (x[-1] - np.mean(x)) / np.std(x)
>>> data = vbt.YFData.pull("BTC-USD", start="2020", end="2021")
>>> data.close.rolling(14).apply(zscore_nb, raw=True) # (2)!
Date
2020-01-01 00:00:00+00:00 NaN
...
2020-12-27 00:00:00+00:00 1.543527
2020-12-28 00:00:00+00:00 1.734715
2020-12-29 00:00:00+00:00 1.755125
2020-12-30 00:00:00+00:00 2.107147
2020-12-31 00:00:00+00:00 1.781800
Freq: D, Name: Close, Length: 366, dtype: float64
>>> data.close.vbt.rolling_apply(14, zscore_nb) # (3)!
2020-01-01 00:00:00+00:00 NaN
...
2020-12-27 00:00:00+00:00 1.543527
2020-12-28 00:00:00+00:00 1.734715
2020-12-29 00:00:00+00:00 1.755125
2020-12-30 00:00:00+00:00 2.107147
2020-12-31 00:00:00+00:00 1.781800
Freq: D, Name: Close, Length: 366, dtype: float64
>>> @njit
... def corr_meta_nb(from_i, to_i, col, a, b): # (4)!
... a_window = a[from_i:to_i, col]
... b_window = b[from_i:to_i, col]
... return np.corrcoef(a_window, b_window)[1, 0]
>>> data2 = vbt.YFData.pull(["ETH-USD", "XRP-USD"], start="2020", end="2021")
>>> vbt.pd_acc.rolling_apply( # (5)!
... 14,
... corr_meta_nb,
... vbt.Rep("a"),
... vbt.Rep("b"),
... broadcast_named_args=dict(a=data.close, b=data2.close)
... )
symbol ETH-USD XRP-USD
Date
2020-01-01 00:00:00+00:00 NaN NaN
... ... ...
2020-12-27 00:00:00+00:00 0.636862 -0.511303
2020-12-28 00:00:00+00:00 0.674514 -0.622894
2020-12-29 00:00:00+00:00 0.712531 -0.773791
2020-12-30 00:00:00+00:00 0.839355 -0.772295
2020-12-31 00:00:00+00:00 0.878897 -0.764446
[366 rows x 2 columns]
- Access to the window only
- Using Pandas
- Using the regular method, which accepts the same function as pandas
- Access to one to multiple whole arrays
- Using the meta method, which accepts metadata and variable arguments
Array expressions¶
- When combining multiple arrays, they often need to be properly aligned and broadcasted before the actual operation. Using Pandas alone won't do the trick because Pandas is too strict in this regard. Luckily, vectorbtpro has an accessor class method that can take a regular Python expression, identify all the variable names, extract the corresponding arrays from the current context, broadcast them, and only then evaluate the actual expression (also using NumExpr!)
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"])
>>> low = data.low
>>> high = data.high
>>> bb = vbt.talib("BBANDS").run(data.close)
>>> upperband = bb.upperband
>>> lowerband = bb.lowerband
>>> bandwidth = (bb.upperband - bb.lowerband) / bb.middleband
>>> up_th = vbt.Param([0.3, 0.4])
>>> low_th = vbt.Param([0.1, 0.2])
>>> expr = """
... narrow_bands = bandwidth < low_th
... above_upperband = high > upperband
... wide_bands = bandwidth > up_th
... below_lowerband = low < lowerband
... (narrow_bands & above_upperband) | (wide_bands & below_lowerband)
... """
>>> mask = vbt.pd_acc.eval(expr)
>>> mask.sum()
low_th up_th symbol
0.1 0.3 BTC-USD 344
ETH-USD 171
0.4 BTC-USD 334
ETH-USD 158
0.2 0.3 BTC-USD 444
ETH-USD 253
0.4 BTC-USD 434
ETH-USD 240
dtype: int64
Resource management¶
- New profiling tools to measure the execution time and memory usage of any code block
>>> data = vbt.YFData.pull("BTC-USD")
>>> with (
... vbt.Timer() as timer,
... vbt.MemTracer() as mem_tracer
... ):
... print(vbt.PF.from_random_signals(data.close, n=100).sharpe_ratio)
0.33111243921865163
>>> print(timer.elapsed())
74.15 milliseconds
>>> print(mem_tracer.peak_usage())
459.7 kB
Templates¶
It's super easy to extend classes, but vectorbtpro revolves around functions, so how do we enhance them or change their workflow? The easiest way is to introduce a tiny function (i.e., a callback) that can be provided by the user and called by the main function at some point. But this would require the main function to know which arguments to pass to the callback and what to do with the outputs. Here's a better idea: allow most arguments of the main function to become callbacks, and then execute them to reveal the actual values. Such arguments are called "templates", and such a process is called "substitution". Templates are especially useful when some arguments (such as arrays) should be constructed only once all the required information is available - for example, once other arrays have been broadcast. Also, each substitution opportunity has its own identifier so that you can control when a template should be substituted. In vectorbtpro, templates are first-class citizens and are integrated into most functions for unmatched flexibility!
>>> def resample_apply(index, by, apply_func, *args, template_context={}, **kwargs):
... grouper = index.vbt.get_grouper(by) # (1)!
... results = {}
... with vbt.get_pbar() as pbar:
... for group, group_idxs in grouper: # (2)!
... group_index = index[group_idxs]
... context = {"group": group, "group_index": group_index, **template_context} # (3)!
... final_apply_func = vbt.substitute_templates(apply_func, context, sub_id="apply_func") # (4)!
... final_args = vbt.substitute_templates(args, context, sub_id="args")
... final_kwargs = vbt.substitute_templates(kwargs, context, sub_id="kwargs")
... results[group] = final_apply_func(*final_args, **final_kwargs)
... pbar.update(1)
... return pd.Series(results)
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"], missing_index="drop")
>>> resample_apply(
... data.index, "Y",
... lambda x, y: x.corr(y), # (5)!
... vbt.RepEval("btc_close[group_index]"), # (6)!
... vbt.RepEval("eth_close[group_index]"),
... template_context=dict(
... btc_close=data.get("Close", "BTC-USD"), # (7)!
... eth_close=data.get("Close", "ETH-USD")
... )
... )
- Build a grouper. Accepts both group-by and resample instructions.
- Iterate over groups in the grouper. Each group consists of a label (such as 2017-01-01 00:00:00+00:00) and the row indices corresponding to that label.
- Populate a new context with the information on the current group and user-provided external information
- Substitute the function and arguments using the newly-populated context
- Simple function to compute the correlation coefficient of two arrays
- Define both arguments as expression templates where we select the data corresponding to each group. All variables in these expressions will be automatically recognized and replaced by the current context. Once evaluated, the templates will be substituted by their outputs.
- Here we can specify additional information our templates depend upon
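The substitution mechanism can be boiled down to a few lines of plain Python (a deliberately minimal, hypothetical sketch; vbt's Rep and substitute_templates support many more template types, containers, and sub_id-based filtering):

```python
# Miniature template substitution (hypothetical sketch, not vbt's code):
# a template defers evaluation until a context is available, and a
# recursive substitute() walks containers to reveal the actual values.
class Rep:
    """A placeholder that defers to a context key at substitution time."""
    def __init__(self, key):
        self.key = key

def substitute(obj, context):
    # Recursively walk containers and replace each Rep by its context value
    if isinstance(obj, Rep):
        return context[obj.key]
    if isinstance(obj, (list, tuple)):
        return type(obj)(substitute(o, context) for o in obj)
    if isinstance(obj, dict):
        return {k: substitute(v, context) for k, v in obj.items()}
    return obj

args = (Rep("a"), 10, dict(scale=Rep("b")))
final_args = substitute(args, {"a": 1, "b": 2.5})
```

This is the same shape of workflow as in resample_apply above: arguments are declared once with placeholders and resolved anew for every group.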