Productivity¶
>>> import numpy as np
>>> import pandas as pd
>>> from numba import njit
>>> import vectorbtpro as vbt
DataFrame product¶
- Several parameterized indicators can produce DataFrames with different shapes and columns, which makes a Cartesian product of them tricky since they often share common column levels (such as "symbol") that shouldn't be combined with each other. There's now a method to cross-join multiple DataFrames block-wise.
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"], missing_index="drop")
>>> sma = data.run("sma", timeperiod=[10, 20], unpack=True)
>>> ema = data.run("ema", timeperiod=[30, 40], unpack=True)
>>> wma = data.run("wma", timeperiod=[50, 60], unpack=True)
>>> sma, ema, wma = sma.vbt.x(ema, wma) # (1)!
>>> entries = sma.vbt.crossed_above(wma)
>>> exits = ema.vbt.crossed_below(wma)
>>> entries.columns
MultiIndex([(10, 30, 50, 'BTC-USD'),
(10, 30, 50, 'ETH-USD'),
(10, 30, 60, 'BTC-USD'),
(10, 30, 60, 'ETH-USD'),
(10, 40, 50, 'BTC-USD'),
(10, 40, 50, 'ETH-USD'),
(10, 40, 60, 'BTC-USD'),
(10, 40, 60, 'ETH-USD'),
(20, 30, 50, 'BTC-USD'),
(20, 30, 50, 'ETH-USD'),
(20, 30, 60, 'BTC-USD'),
(20, 30, 60, 'ETH-USD'),
(20, 40, 50, 'BTC-USD'),
(20, 40, 50, 'ETH-USD'),
(20, 40, 60, 'BTC-USD'),
(20, 40, 60, 'ETH-USD')],
names=['sma_timeperiod', 'ema_timeperiod', 'wma_timeperiod', 'symbol'])
- Build a Cartesian product of three DataFrames while keeping the column level "symbol" untouched. The same can be achieved with vbt.pd_acc.cross(sma, ema, wma).
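For intuition, the block-wise cross join can be sketched in plain pandas: the parameter levels of the inputs are combined in a Cartesian product while the shared "symbol" level is kept paired. The helper below is a hypothetical stand-in (cross_blockwise is not vbt's implementation) and assumes the parameter level comes first and the block level last:

```python
import itertools

import numpy as np
import pandas as pd

# Hypothetical sketch of a block-wise cross join (not vbt's implementation):
# parameter levels are crossed, the shared block level ("symbol") is paired.
def cross_blockwise(df1, df2, block_level="symbol"):
    syms = df1.columns.get_level_values(block_level).unique()
    p1 = df1.columns.get_level_values(0).unique()
    p2 = df2.columns.get_level_values(0).unique()
    combos = [(a, b, s) for a, b in itertools.product(p1, p2) for s in syms]
    cols = pd.MultiIndex.from_tuples(
        combos, names=[df1.columns.names[0], df2.columns.names[0], block_level])
    out1 = pd.concat([df1[(a, s)] for a, _, s in combos], axis=1)
    out2 = pd.concat([df2[(b, s)] for _, b, s in combos], axis=1)
    out1.columns = out2.columns = cols
    return out1, out2

index = pd.date_range("2020-01-01", periods=3)
sma = pd.DataFrame(
    np.arange(12).reshape(3, 4), index=index,
    columns=pd.MultiIndex.from_product(
        [[10, 20], ["BTC-USD", "ETH-USD"]], names=["sma_timeperiod", "symbol"]))
ema = pd.DataFrame(
    np.arange(12).reshape(3, 4) * 10, index=index,
    columns=pd.MultiIndex.from_product(
        [[30, 40], ["BTC-USD", "ETH-USD"]], names=["ema_timeperiod", "symbol"]))
sma_x, ema_x = cross_blockwise(sma, ema)
```

Each output has 2 x 2 x 2 = 8 columns: every combination of the two parameter levels, repeated once per symbol.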
Index records¶
- How do you backtest time and asset-anchored queries such as "Order X units of asset Y on date Z"? Normally, you'd have to construct full arrays and set the information manually. But now there's a less stressful way: thanks to preparers and the redesigned smart indexing, you can provide all the information in a compressed record format! Under the hood, the record array is translated into a set of index dictionaries - one per argument.
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"], missing_index="drop")
>>> records = [
... dict(date="2022", symbol="BTC-USD", long_entry=True), # (1)!
... dict(date="2022", symbol="ETH-USD", short_entry=True),
... dict(row=-1, exit=True),
... ]
>>> pf = vbt.PF.from_signals(data, records=records) # (2)!
>>> pf.orders.records_readable
Order Id Column Signal Index Creation Index
0 0 BTC-USD 2022-01-01 00:00:00+00:00 2022-01-01 00:00:00+00:00 \
1 1 BTC-USD 2023-04-25 00:00:00+00:00 2023-04-25 00:00:00+00:00
2 0 ETH-USD 2022-01-01 00:00:00+00:00 2022-01-01 00:00:00+00:00
3 1 ETH-USD 2023-04-25 00:00:00+00:00 2023-04-25 00:00:00+00:00
Fill Index Size Price Fees Side Type
0 2022-01-01 00:00:00+00:00 0.002097 47686.812500 0.0 Buy Market \
1 2023-04-25 00:00:00+00:00 0.002097 27534.675781 0.0 Sell Market
2 2022-01-01 00:00:00+00:00 0.026527 3769.697021 0.0 Sell Market
3 2023-04-25 00:00:00+00:00 0.026527 1834.759644 0.0 Buy Market
Stop Type
0 None
1 None
2 None
3 None
- Every broadcastable argument is supported. Rows can be provided via "row", "index", "date", or "datetime". Columns can be provided via "col", "column", or "symbol". If a row isn't provided, the entire column is set; if a column isn't provided, the entire row is set. If neither is provided, the entire array is set. Rows and columns can be provided as integer positions, labels, datetimes, or even complex indexers!
- Arguments not used in records can still be provided as usual. Arguments used in records can also be provided to be used as defaults.
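Conceptually, each record selects a (row, column) region of a target array and writes a value there. Below is a rough plain-pandas sketch of that translation (records_to_array is a hypothetical helper, not vbt's actual logic, and it only handles the "date", "row", and "symbol" keys):

```python
import numpy as np
import pandas as pd

# Toy record-to-array translation (hypothetical, not vbt's implementation):
# each record sets a region of a boolean target array to True.
index = pd.date_range("2021-12-30", periods=5, tz="UTC")
columns = pd.Index(["BTC-USD", "ETH-USD"], name="symbol")

def records_to_array(records, field):
    arr = pd.DataFrame(False, index=index, columns=columns)
    for r in records:
        if not r.get(field, False):
            continue  # this record doesn't set this field
        if "date" in r:
            # first bar at or after the given date
            row_sel = index[index.searchsorted(pd.Timestamp(r["date"], tz="UTC"))]
        elif "row" in r:
            row_sel = index[r["row"]]
        else:
            row_sel = slice(None)  # no row given: set the entire column
        col_sel = r.get("symbol", slice(None))  # no column given: entire row
        arr.loc[row_sel, col_sel] = True
    return arr

records = [
    dict(date="2022", symbol="BTC-USD", long_entry=True),
    dict(date="2022", symbol="ETH-USD", short_entry=True),
    dict(row=-1, exit=True),
]
long_entries = records_to_array(records, "long_entry")
exits = records_to_array(records, "exit")
```

The first record becomes a single True at (2022-01-01, BTC-USD) in the long-entry array, while the last record marks the final row of every column in the exit array.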
Compression¶
- Serialized vectorbtpro objects may sometimes take a lot of disk space. With this update, there's now support for a variety of compression algorithms to make files as light as possible!
>>> data = vbt.RandomOHLCData.pull("RAND", start="2022", end="2023", freq="1 minute")
>>> file_path = data.save()
>>> print(vbt.file_size(file_path))
21.0 MB
>>> file_path = data.save(compression="blosc")
>>> print(vbt.file_size(file_path))
13.3 MB
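The same idea can be sketched with the standard library alone: serialize with pickle, compress the bytes, and compare sizes (illustrative only; vbt's save() wraps this kind of workflow and offers additional codecs such as blosc):

```python
import gzip
import pickle

import numpy as np

# Illustrative only: compare serialized sizes with and without compression
arr = np.arange(100_000)  # regular data compresses very well
raw = pickle.dumps(arr)
compressed = gzip.compress(raw)
restored = pickle.loads(gzip.decompress(compressed))
```

Round-tripping through gzip.decompress restores the exact bytes, so the loaded object is identical to the original.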
Faster loading¶
- If your pipeline doesn't need accessors, Plotly graphs, and most other optional functionality, you can disable the auto-import feature entirely to bring the loading time of vectorbtpro down to under a second
>>> import time
>>> start = time.time()
>>> import vectorbtpro as vbt
>>> end = time.time()
>>> end - start
0.580937910079956
Configuration files¶
- VectorBT PRO extends configparser to define its own configuration format that lets the user save, introspect, modify, and load back any complex in-house object. The main advantages of this format are readability and round-tripping: any object can be encoded and then decoded back without information loss. The main features include nested structures, references, parsing of literals, as well as evaluation of arbitrary Python expressions. Additionally, you can now create a configuration file for vectorbtpro and put it into the working directory - it will be used to update the default settings whenever the package is imported!
[plotting]
default_theme = dark
[portfolio]
init_cash = 5000
[data.custom.binance.client_config]
api_key = YOUR_API_KEY
api_secret = YOUR_API_SECRET
[data.custom.ccxt.exchanges.binance.exchange_config]
apiKey = &data.custom.binance.client_config.api_key
secret = &data.custom.binance.client_config.api_secret
>>> import vectorbtpro as vbt
>>> vbt.settings.portfolio["init_cash"]
5000
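The &-reference syntax shown above can be approximated with the standard configparser plus a tiny resolver (a hypothetical sketch, not vbt's actual parser, which additionally handles literals and Python expressions):

```python
import configparser

# Minimal resolver for &-style references (hypothetical sketch).
# Dotted section names are plain configparser sections.
CONFIG = """
[portfolio]
init_cash = 5000

[data.custom.binance.client_config]
api_key = YOUR_API_KEY

[data.custom.ccxt.exchanges.binance.exchange_config]
apiKey = &data.custom.binance.client_config.api_key
"""

parser = configparser.ConfigParser()
parser.optionxform = str  # preserve option-name case (e.g., apiKey)
parser.read_string(CONFIG)

def resolve(parser, value):
    # Follow "&section.path.option" references until a plain value remains
    if not value.startswith("&"):
        return value
    section, option = value[1:].rsplit(".", 1)
    return resolve(parser, parser.get(section, option))

api_key = resolve(parser, parser.get(
    "data.custom.ccxt.exchanges.binance.exchange_config", "apiKey"))
```

Resolving the reference hands the ccxt section the same API key that was defined once under the binance client config.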
Serialization¶
- Just like machine learning models, every native vectorbtpro object can be serialized and saved to a binary file - it's never been easier to share data and insights! Another benefit is that only the actual content of each object is serialized, not its class definition, so the loaded object always uses the most up-to-date class definition. There's also special logic that can help you "reconstruct" objects if vectorbtpro has introduced breaking API changes
>>> data = vbt.YFData.pull("BTC-USD", start="2022-01-01", end="2022-06-01")
>>> def backtest_month(close):
... return vbt.PF.from_random_signals(close, n=10)
>>> month_pfs = data.close.resample("MS").apply(backtest_month)
>>> month_pfs
Date
2022-01-01 00:00:00+00:00 Portfolio(\n wrapper=ArrayWrapper(\n ...
2022-02-01 00:00:00+00:00 Portfolio(\n wrapper=ArrayWrapper(\n ...
2022-03-01 00:00:00+00:00 Portfolio(\n wrapper=ArrayWrapper(\n ...
2022-04-01 00:00:00+00:00 Portfolio(\n wrapper=ArrayWrapper(\n ...
2022-05-01 00:00:00+00:00 Portfolio(\n wrapper=ArrayWrapper(\n ...
Freq: MS, Name: Close, dtype: object
>>> vbt.save(month_pfs, "month_pfs") # (1)!
>>> month_pfs = vbt.load("month_pfs") # (2)!
>>> month_pfs.apply(lambda pf: pf.total_return)
Date
2022-01-01 00:00:00+00:00 -0.048924
2022-02-01 00:00:00+00:00 0.168370
2022-03-01 00:00:00+00:00 0.016087
2022-04-01 00:00:00+00:00 -0.120525
2022-05-01 00:00:00+00:00 0.110751
Freq: MS, Name: Close, dtype: float64
- Save to disk
- Load from disk later
Data parsing¶
- Tired of passing open, high, low, and close as separate time series? Portfolio class methods have been extended to take a data instance instead of close and extract the contained OHLC data automatically - a small but time-saving feature!
>>> data = vbt.YFData.pull("BTC-USD", start="2020-01", end="2020-03")
>>> pf = vbt.PF.from_random_signals(data, n=10)
Index dictionaries¶
Manually constructing arrays and setting their data with Pandas is often painful. Fortunately, there's new functionality that provides much-needed help! Any broadcastable argument can become an index dictionary, which contains instructions on where to set values in the array and does the filling job for you. It knows exactly which axis has to be modified and doesn't create a full array unless necessary - with much love to RAM
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"])
>>> tile = pd.Index(["daily", "weekly"], name="strategy") # (1)!
>>> pf = vbt.PF.from_orders(
... data.close,
... size=vbt.index_dict({ # (2)!
... vbt.idx(
... vbt.pointidx(every="D"),
... vbt.colidx("daily", level="strategy")): 100, # (3)!
... vbt.idx(
... vbt.pointidx(every="W-SUN"),
... vbt.colidx("daily", level="strategy")): -np.inf, # (4)!
... vbt.idx(
... vbt.pointidx(every="W-MON"),
... vbt.colidx("weekly", level="strategy")): 100,
... vbt.idx(
... vbt.pointidx(every="M"),
... vbt.colidx("weekly", level="strategy")): -np.inf,
... }),
... size_type="value",
... direction="longonly",
... init_cash="auto",
... broadcast_kwargs=dict(tile=tile)
... )
>>> pf.sharpe_ratio
strategy symbol
daily BTC-USD 0.702259
ETH-USD 0.782296
weekly BTC-USD 0.838895
ETH-USD 0.524215
Name: sharpe_ratio, dtype: float64
- To represent two strategies, you need to tile the same data twice. For this, create a parameter with strategy names and pass it as "tile" to the broadcaster so that it tiles the columns of each array (such as price) two times.
- An index dictionary contains index instructions as keys and the data to be set as values. Keys can be anything from row indices and labels to custom indexer classes such as PointIdxr.
- Find the indices of the rows that correspond to the beginning of each day and the index of the column "daily", and set each element under those indices to 100 (= accumulate)
- Find the indices of the rows that correspond to Sunday. If any value under those indices has already been set with any previous instruction, it will be overridden.
Slicing¶
- Similarly to selecting columns, each vectorbtpro object is now capable of slicing rows, using the exact same mechanism as in Pandas. This makes it super easy to analyze and plot any subset of the simulated data, without re-simulation!
>>> data = vbt.YFData.pull("BTC-USD")
>>> pf = vbt.PF.from_holding(data, freq="d")
>>> pf.sharpe_ratio
1.116727709477293
>>> pf.loc[:"2020"].sharpe_ratio # (1)!
1.2699801554196481
>>> pf.loc["2021": "2021"].sharpe_ratio # (2)!
0.9825161170278687
>>> pf.loc["2022":].sharpe_ratio # (3)!
-1.0423271337174647
- Get the Sharpe during the year 2020 and before
- Get the Sharpe during the year 2021
- Get the Sharpe during the year 2022 and after
Column stacking¶
- Complex vectorbtpro objects of the same type can be easily stacked along columns. For instance, you can combine multiple totally unrelated trading strategies into the same portfolio for analysis. Under the hood, the final object is still represented as a monolithic multi-dimensional structure, which can be processed even faster than the individual objects separately
>>> def strategy1(data):
... fast_ma = vbt.MA.run(data.close, 50, short_name="fast_ma")
... slow_ma = vbt.MA.run(data.close, 200, short_name="slow_ma")
... entries = fast_ma.ma_crossed_above(slow_ma)
... exits = fast_ma.ma_crossed_below(slow_ma)
... return vbt.PF.from_signals(
... data.close,
... entries,
... exits,
... size=100,
... size_type="value",
... init_cash="auto"
... )
>>> def strategy2(data):
... bbands = vbt.BBANDS.run(data.close, window=14)
... entries = bbands.close_crossed_below(bbands.lower)
... exits = bbands.close_crossed_above(bbands.upper)
... return vbt.PF.from_signals(
... data.close,
... entries,
... exits,
... init_cash=200
... )
>>> data1 = vbt.BinanceData.pull("BTCUSDT")
>>> pf1 = strategy1(data1) # (1)!
>>> pf1.sharpe_ratio
0.9100317671866922
>>> data2 = vbt.BinanceData.pull("ETHUSDT")
>>> pf2 = strategy2(data2) # (2)!
>>> pf2.sharpe_ratio
-0.11596286232734827
>>> pf_sep = vbt.PF.column_stack((pf1, pf2)) # (3)!
>>> pf_sep.sharpe_ratio
0 0.910032
1 -0.115963
Name: sharpe_ratio, dtype: float64
>>> pf_join = vbt.PF.column_stack((pf1, pf2), group_by=True) # (4)!
>>> pf_join.sharpe_ratio
0.42820898354646514
- Analyze the first strategy in a separate portfolio
- Analyze the second strategy in a separate portfolio
- Analyze both strategies in the same portfolio separately
- Analyze both strategies in the same portfolio jointly
Row stacking¶
- Complex vectorbtpro objects of the same type can be easily stacked along rows. For instance, you can append new data to an existing portfolio, or even concatenate in-sample portfolios with their out-of-sample counterparts
>>> def strategy(data, start=None, end=None):
... fast_ma = vbt.MA.run(data.close, 50, short_name="fast_ma")
... slow_ma = vbt.MA.run(data.close, 200, short_name="slow_ma")
... entries = fast_ma.ma_crossed_above(slow_ma)
... exits = fast_ma.ma_crossed_below(slow_ma)
... return vbt.PF.from_signals(
... data.close[start:end],
... entries[start:end],
... exits[start:end],
... size=100,
... size_type="value",
... init_cash="auto"
... )
>>> data = vbt.BinanceData.pull("BTCUSDT")
>>> pf_whole = strategy(data) # (1)!
>>> pf_whole.sharpe_ratio
0.9100317671866922
>>> pf_sub1 = strategy(data, end="2019-12-31") # (2)!
>>> pf_sub1.sharpe_ratio
0.7810397448678937
>>> pf_sub2 = strategy(data, start="2020-01-01") # (3)!
>>> pf_sub2.sharpe_ratio
1.070339534746574
>>> pf_join = vbt.PF.row_stack((pf_sub1, pf_sub2)) # (4)!
>>> pf_join.sharpe_ratio
0.9100317671866922
- Analyze the entire range
- Analyze the first date range
- Analyze the second date range
- Join both date ranges and analyze as a whole
Index alignment¶
- Pandas arrays are no longer required to share the same index: the indexes of all arrays that should broadcast against each other are aligned automatically, as long as they have the same data type.
>>> btc_data = vbt.YFData.pull("BTC-USD")
>>> btc_data.wrapper.shape
(2817, 7)
>>> eth_data = vbt.YFData.pull("ETH-USD") # (1)!
>>> eth_data.wrapper.shape
(1668, 7)
>>> ols = vbt.OLS.run( # (2)!
... btc_data.close,
... eth_data.close
... )
>>> ols.pred
Date
2014-09-17 00:00:00+00:00 NaN
2014-09-18 00:00:00+00:00 NaN
2014-09-19 00:00:00+00:00 NaN
2014-09-20 00:00:00+00:00 NaN
2014-09-21 00:00:00+00:00 NaN
... ...
2022-05-30 00:00:00+00:00 2109.769242
2022-05-31 00:00:00+00:00 2028.856767
2022-06-01 00:00:00+00:00 1911.555689
2022-06-02 00:00:00+00:00 1930.169725
2022-06-03 00:00:00+00:00 1882.573170
Freq: D, Name: Close, Length: 2817, dtype: float64
- ETH-USD history is shorter than BTC-USD history
- This now works! Just make sure that all arrays share the same timeframe and timezone.
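What alignment means here can be sketched with pandas' own align() (illustrative; vbt's broadcaster performs this step for you): the shorter history is reindexed onto the union index, with the missing head becoming NaN.

```python
import numpy as np
import pandas as pd

# Sketch of index alignment: a shorter series gains a NaN head so that
# both series share the longer index (hypothetical toy data).
btc_index = pd.date_range("2020-01-01", periods=10, tz="UTC")
eth_index = btc_index[4:]  # the second history starts later
btc_close = pd.Series(np.arange(10.0), index=btc_index)
eth_close = pd.Series(np.arange(6.0), index=eth_index)

btc_aligned, eth_aligned = btc_close.align(eth_close)
```

After alignment both series have 10 rows, which is why the OLS output above starts with NaN rows where ETH-USD has no data yet.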
Numba datetime¶
- Numba has no support for datetime indexes (or any other Pandas objects), and there are no built-in Numba functions for working with datetime either. So, how do you connect data to time? vectorbtpro closes this gap by implementing a collection of functions that extract various information from each timestamp, such as the current time and day of the week, to determine whether a bar happens during trading hours.
Tutorial
Learn more in the Signal development tutorial.
>>> @njit
... def month_start_pct_change_nb(arr, index):
... out = np.full(arr.shape, np.nan)
... for col in range(arr.shape[1]):
... for i in range(arr.shape[0]):
... if i == 0 or vbt.dt_nb.month_nb(index[i - 1]) != vbt.dt_nb.month_nb(index[i]):
... month_start_value = arr[i, col]
... else:
... out[i, col] = (arr[i, col] - month_start_value) / month_start_value
... return out
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"], start="2022", end="2023")
>>> pct_change = month_start_pct_change_nb(
... vbt.to_2d_array(data.close),
... data.index.vbt.to_ns() # (1)!
... )
>>> pct_change = data.symbol_wrapper.wrap(pct_change)
>>> pct_change.vbt.plot().show()
- Convert the datetime index to the nanosecond format
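For intuition, a helper like month_nb can be approximated in plain NumPy by converting a nanosecond timestamp to month precision (month_from_ns is a hypothetical stand-in; the real helpers are Numba-compiled):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the kind of helper vbt.dt_nb provides:
# derive the calendar month from a nanosecond timestamp without
# touching any Pandas objects.
def month_from_ns(ts_ns):
    # Months elapsed since 1970-01, mapped to 1..12
    months = np.datetime64(int(ts_ns), "ns").astype("datetime64[M]").astype(int)
    return int(months % 12 + 1)

jan_2022 = pd.Timestamp("2022-01-03", tz="UTC").value  # nanoseconds since epoch
```

Comparing such month numbers for consecutive timestamps is exactly how the function above detects the start of a new month.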
Periods ago¶
- Instead of writing Numba functions, comparing values at different bars can also be done in a vectorized manner with Pandas. The problem is that there are no built-in functions to easily shift values based on timedeltas, nor are there rolling functions to check whether an event happened during a period of time in the past. This gap is closed by various new accessor methods.
Tutorial
Learn more in the Signal development tutorial.
>>> data = vbt.YFData.pull("BTC-USD", start="2022-05", end="2022-08")
>>> mask = (data.close < data.close.vbt.ago(1)).vbt.all_ago(5)
>>> fig = data.plot(plot_volume=False)
>>> mask.vbt.signals.ranges.plot_shapes(
... plot_close=False,
... fig=fig,
... shape_kwargs=dict(fillcolor="orangered")
... )
>>> fig.show()
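On a regular, gap-free index, the two accessor calls above roughly correspond to a shift(1) comparison followed by a rolling all-True check - which is exactly the pandas approximation that breaks down once the index has gaps:

```python
import pandas as pd

# Plain-pandas approximation of the accessor calls (toy data; only valid
# on a gap-free index, which is the limitation the accessors remove).
index = pd.date_range("2022-05-01", periods=8, tz="UTC")
close = pd.Series([10, 9, 8, 7, 6, 5, 7, 8], index=index, dtype=float)

down_bar = close < close.shift(1)              # ~ close < close.vbt.ago(1)
down_streak = down_bar.rolling(5).sum().eq(5)  # ~ .vbt.all_ago(5): 5 down bars in a row
```

Here only the sixth bar completes a streak of five consecutive down bars.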
Safe resampling¶
- Look-ahead bias is an ongoing threat when working with array data, especially on multiple time frames. Using Pandas alone is strongly discouraged because it isn't aware that financial data mainly involves bars, where timestamps are opening times and events can happen at any time between bars, and thus falsely assumes that timestamps denote the exact time of an event. In vectorbtpro, there's an entire collection of functions and classes for resampling and analyzing data in a safe way!
Tutorial
Learn more in the MTF analysis tutorial.
>>> def mtf_sma(close, close_freq, target_freq, timeperiod=5):
... target_close = close.vbt.resample_closing(target_freq) # (1)!
... target_sma = vbt.talib("SMA").run(target_close, timeperiod=timeperiod).real # (2)!
... target_sma = target_sma.rename(f"SMA ({target_freq})")
... return target_sma.vbt.resample_closing(close.index, freq=close_freq) # (3)!
>>> data = vbt.YFData.pull("BTC-USD", start="2020", end="2023")
>>> fig = mtf_sma(data.close, "D", "D").vbt.plot()
>>> mtf_sma(data.close, "D", "W-MON").vbt.plot(fig=fig)
>>> mtf_sma(data.close, "D", "MS").vbt.plot(fig=fig)
>>> fig.show()
- Resample the source frequency to the target frequency. Close happens at the end of the bar, thus resample as a "closing event".
- Calculate SMA on the target frequency
- Resample the target frequency back to the source frequency to be able to display multiple time frames on the same chart. Since "close" contains gaps, we cannot resample to "close_freq" because it may result in unaligned series - resample directly to the index of "close" instead.
Resamplable objects¶
- Not only can you resample time series, but also complex vectorbtpro objects! Under the hood, each object comprises a bunch of array-like attributes, so resampling here simply means aggregating all the accompanying information in one go. This is very convenient when you want to simulate at a higher frequency for the best accuracy, and then analyze at a lower frequency for the best speed.
Tutorial
Learn more in the MTF analysis tutorial.
>>> import calendar
>>> data = vbt.YFData.pull("BTC-USD", start="2018", end="2023")
>>> pf = vbt.PF.from_random_signals(data, n=100, direction="both")
>>> mo_returns = pf.resample("MS").returns # (1)!
>>> mo_return_matrix = pd.Series(
... mo_returns.values,
... index=pd.MultiIndex.from_arrays([
... mo_returns.index.year,
... mo_returns.index.month
... ], names=["year", "month"])
... ).unstack("month")
>>> mo_return_matrix.columns = mo_return_matrix.columns.map(lambda x: calendar.month_abbr[x])
>>> mo_return_matrix.vbt.heatmap(
... is_x_category=True,
... trace_kwargs=dict(zmid=0, colorscale="Spectral")
... ).show()
- Resample the entire portfolio to the monthly frequency and compute the returns
Formatting engine¶
- VectorBT PRO is a very extensive library that defines thousands of classes, functions, and objects. Thus, when working with any of them, you may want to "see through" the object to gain a better understanding of its attributes and contents. Fortunately, there's a new formatting engine that can accurately format any in-house object as a human-readable string. Did you know that the API documentation is partially powered by this engine?
>>> data = vbt.YFData.pull("BTC-USD", start="2020", end="2021")
>>> vbt.pprint(data) # (1)!
YFData(
wrapper=ArrayWrapper(...),
data=symbol_dict({
'BTC-USD': <pandas.core.frame.DataFrame object at 0x7f7f1fbc6cd0 with shape (366, 7)>
}),
single_key=True,
classes=symbol_dict(),
fetch_kwargs=symbol_dict({
'BTC-USD': dict(
start='2020',
end='2021'
)
}),
returned_kwargs=symbol_dict({
'BTC-USD': dict()
}),
last_index=symbol_dict({
'BTC-USD': Timestamp('2020-12-31 00:00:00+0000', tz='UTC')
}),
tz_localize=datetime.timezone.utc,
tz_convert='UTC',
missing_index='nan',
missing_columns='raise'
)
>>> vbt.pdir(data) # (2)!
type path
attr
align_columns classmethod vectorbtpro.data.base.Data
align_index classmethod vectorbtpro.data.base.Data
build_feature_config_doc classmethod vectorbtpro.data.base.Data
... ... ...
vwap property vectorbtpro.data.base.Data
wrapper property vectorbtpro.base.wrapping.Wrapping
xs function vectorbtpro.base.indexing.PandasIndexer
>>> vbt.phelp(data.get) # (3)!
YFData.get(
columns=None,
symbols=None,
**kwargs
):
Get one or more columns of one or more symbols of data.
- Just like Python's print command, but pretty-prints the contents of any vectorbtpro object
- Just like Python's dir command, but pretty-prints the attributes of a class, object, or module
- Just like Python's help command, but pretty-prints the signature and docstring of a function
Meta methods¶
- Many methods such as rolling apply are now available in two flavors: regular (instance methods) and meta (class methods). Regular methods are bound to a single array and do not have to take metadata anymore, while meta methods are not bound to any array and act as micro-pipelines with their own broadcasting and templating logic. Here, vectorbtpro closes one of the key limitations of Pandas - the inability to apply a function on multiple arrays at once.
>>> @njit
... def zscore_nb(x): # (1)!
... return (x[-1] - np.mean(x)) / np.std(x)
>>> data = vbt.YFData.pull("BTC-USD", start="2020", end="2021")
>>> data.close.rolling(14).apply(zscore_nb, raw=True) # (2)!
Date
2020-01-01 00:00:00+00:00 NaN
...
2020-12-27 00:00:00+00:00 1.543527
2020-12-28 00:00:00+00:00 1.734715
2020-12-29 00:00:00+00:00 1.755125
2020-12-30 00:00:00+00:00 2.107147
2020-12-31 00:00:00+00:00 1.781800
Freq: D, Name: Close, Length: 366, dtype: float64
>>> data.close.vbt.rolling_apply(14, zscore_nb) # (3)!
2020-01-01 00:00:00+00:00 NaN
...
2020-12-27 00:00:00+00:00 1.543527
2020-12-28 00:00:00+00:00 1.734715
2020-12-29 00:00:00+00:00 1.755125
2020-12-30 00:00:00+00:00 2.107147
2020-12-31 00:00:00+00:00 1.781800
Freq: D, Name: Close, Length: 366, dtype: float64
>>> @njit
... def corr_meta_nb(from_i, to_i, col, a, b): # (4)!
... a_window = a[from_i:to_i, col]
... b_window = b[from_i:to_i, col]
... return np.corrcoef(a_window, b_window)[1, 0]
>>> data2 = vbt.YFData.pull(["ETH-USD", "XRP-USD"], start="2020", end="2021")
>>> vbt.pd_acc.rolling_apply( # (5)!
... 14,
... corr_meta_nb,
... vbt.Rep("a"),
... vbt.Rep("b"),
... broadcast_named_args=dict(a=data.close, b=data2.close)
... )
symbol ETH-USD XRP-USD
Date
2020-01-01 00:00:00+00:00 NaN NaN
... ... ...
2020-12-27 00:00:00+00:00 0.636862 -0.511303
2020-12-28 00:00:00+00:00 0.674514 -0.622894
2020-12-29 00:00:00+00:00 0.712531 -0.773791
2020-12-30 00:00:00+00:00 0.839355 -0.772295
2020-12-31 00:00:00+00:00 0.878897 -0.764446
[366 rows x 2 columns]
- Access to the window only
- Using Pandas
- Using the regular method, which accepts the same function as pandas
- Access to one to multiple whole arrays
- Using the meta method, which accepts metadata and variable arguments
Array expressions¶
- When combining multiple arrays, they often need to be properly aligned and broadcasted before the actual operation. Using Pandas alone won't do the trick because Pandas is too strict in this regard. Luckily, vectorbtpro has an accessor class method that can take a regular Python expression, identify all the variable names, extract the corresponding arrays from the current context, broadcast them, and only then evaluate the actual expression (also using NumExpr!)
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"])
>>> low = data.low
>>> high = data.high
>>> bb = vbt.talib("BBANDS").run(data.close)
>>> upperband = bb.upperband
>>> lowerband = bb.lowerband
>>> bandwidth = (bb.upperband - bb.lowerband) / bb.middleband
>>> up_th = vbt.Param([0.3, 0.4])
>>> low_th = vbt.Param([0.1, 0.2])
>>> expr = """
... narrow_bands = bandwidth < low_th
... above_upperband = high > upperband
... wide_bands = bandwidth > up_th
... below_lowerband = low < lowerband
... (narrow_bands & above_upperband) | (wide_bands & below_lowerband)
... """
>>> mask = vbt.pd_acc.eval(expr)
>>> mask.sum()
low_th up_th symbol
0.1 0.3 BTC-USD 344
ETH-USD 171
0.4 BTC-USD 334
ETH-USD 158
0.2 0.3 BTC-USD 444
ETH-USD 253
0.4 BTC-USD 434
ETH-USD 240
dtype: int64
Resource management¶
- New profiling tools to measure the execution time and memory usage of any code block
>>> data = vbt.YFData.pull("BTC-USD")
>>> with (
... vbt.Timer() as timer,
... vbt.MemTracer() as mem_tracer
... ):
... print(vbt.PF.from_random_signals(data.close, n=100).sharpe_ratio)
0.33111243921865163
>>> print(timer.elapsed())
74.15 milliseconds
>>> print(mem_tracer.peak_usage())
459.7 kB
Templates¶
It's super easy to extend classes, but vectorbtpro revolves around functions, so how do we enhance them or change their workflow? The easiest way is to introduce a tiny function (i.e., a callback) that can be provided by the user and called by the main function at some point. But this would require the main function to know which arguments to pass to the callback and what to do with the outputs. Here's a better idea: allow most arguments of the main function to become callbacks, and then execute them to reveal the actual values. Such arguments are called "templates", and such a process is called "substitution". Templates are especially useful when some arguments (such as arrays) should be constructed only once all the required information is available - for example, once other arrays have been broadcast. Also, each substitution opportunity has its own identifier so that you can control when a template should be substituted. In vectorbtpro, templates are first-class citizens and are integrated into most functions for unmatched flexibility!
>>> def resample_apply(index, by, apply_func, *args, template_context={}, **kwargs):
... grouper = index.vbt.get_grouper(by) # (1)!
... results = {}
... with vbt.get_pbar() as pbar:
... for group, group_idxs in grouper: # (2)!
... group_index = index[group_idxs]
... context = {"group": group, "group_index": group_index, **template_context} # (3)!
... final_apply_func = vbt.substitute_templates(apply_func, context, sub_id="apply_func") # (4)!
... final_args = vbt.substitute_templates(args, context, sub_id="args")
... final_kwargs = vbt.substitute_templates(kwargs, context, sub_id="kwargs")
... results[group] = final_apply_func(*final_args, **final_kwargs)
... pbar.update(1)
... return pd.Series(results)
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"], missing_index="drop")
>>> resample_apply(
... data.index, "Y",
... lambda x, y: x.corr(y), # (5)!
... vbt.RepEval("btc_close[group_index]"), # (6)!
... vbt.RepEval("eth_close[group_index]"),
... template_context=dict(
... btc_close=data.get("Close", "BTC-USD"), # (7)!
... eth_close=data.get("Close", "ETH-USD")
... )
... )
- Build a grouper. Accepts both group-by and resample instructions.
- Iterate over groups in the grouper. Each group consists of a label (such as 2017-01-01 00:00:00+00:00) and the row indices corresponding to that label.
- Populate a new context with the information on the current group and user-provided external information
- Substitute the function and arguments using the newly-populated context
- Simple function to compute the correlation coefficient of two arrays
- Define both arguments as expression templates where we select the data corresponding to each group. All variables in these expressions will be automatically recognized and replaced by the current context. Once evaluated, the templates will be substituted by their outputs.
- Here we can specify additional information our templates depend upon
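The substitution mechanism can be boiled down to a few lines of plain Python (a deliberately minimal, hypothetical sketch; vbt's Rep and substitute_templates support many more template types, containers, and sub_id-based filtering):

```python
# Miniature template substitution (hypothetical sketch, not vbt's code):
# a template defers evaluation until a context is available, and a
# recursive substitute() walks containers to reveal the actual values.
class Rep:
    """A placeholder that defers to a context key at substitution time."""
    def __init__(self, key):
        self.key = key

def substitute(obj, context):
    # Recursively walk containers and replace each Rep by its context value
    if isinstance(obj, Rep):
        return context[obj.key]
    if isinstance(obj, (list, tuple)):
        return type(obj)(substitute(o, context) for o in obj)
    if isinstance(obj, dict):
        return {k: substitute(v, context) for k, v in obj.items()}
    return obj

args = (Rep("a"), 10, dict(scale=Rep("b")))
final_args = substitute(args, {"a": 1, "b": 2.5})
```

This is the same shape of workflow as in resample_apply above: arguments are declared once with placeholders and resolved anew for every group.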