Productivity¶
Source refactorer¶
- Use the source refactorer to automatically enhance any Python code, whether it is a full package, a Python object, or a raw string. If the source is too large to process in one go, it is intelligently split into clean AST-based chunks; each chunk is refactored by an LLM, and the results are merged back into polished code. You can choose to return the result, update the source in place, or copy it to your clipboard. For full transparency, you can also preview the diff directly in your browser.
>>> env["GITHUB_TOKEN"] = "<YOUR_GITHUB_TOKEN>"
>>> env["OPENAI_API_KEY"] = "<YOUR_OPENAI_API_KEY>"
>>> source = """
... @njit
... def signal_func_nb(c, entries, exits):
...     is_entry = vbt.pf_nb.select_nb(c, entries)
...     is_exit = vbt.pf_nb.select_nb(c, exits)
...
...     # TODO: Enter only if no other asset is active
...
...     return is_entry, is_exit, False, False
... """ # (1)!
>>> new_source = vbt.refactor_source(
... source,
... attach_knowledge=True, # (2)!
... model="o3-mini",
... reasoning_effort="high",
... system_as_user=True,
... show_diff=True, # (3)!
... )
>>> print(new_source)
@njit
def signal_func_nb(c, entries, exits):
    is_entry = vbt.pf_nb.select_nb(c, entries)
    is_exit = vbt.pf_nb.select_nb(c, exits)
    for col in range(c.from_col, c.to_col):
        if col != c.col and c.last_position[col] != 0:
            is_entry = False
            break
    return is_entry, is_exit, False, False
- Use TODO or FIXME comments to provide custom instructions.
- Attach relevant knowledge from the website and Discord.
- Show the produced changes in a browser.
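The AST-based chunking step can be illustrated with the standard library alone. This is a minimal sketch of the idea, not VBT's actual implementation: it splits a module into top-level definition chunks that could each be refactored separately.

```python
import ast

def split_into_chunks(source: str) -> list:
    """Split a module into top-level chunks using the AST."""
    tree = ast.parse(source)
    # get_source_segment recovers the exact source text of each node
    return [ast.get_source_segment(source, node) for node in tree.body]

source = """
def f(x):
    return x + 1

class C:
    pass
"""
chunks = split_into_chunks(source)  # one chunk per top-level definition
```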
Quick search & chat¶
- In addition to embeddings, which require prior generation, VBT now supports BM25 for fast, fully offline lexical search. This is perfect for quickly finding something specific.
>>> env["GITHUB_TOKEN"] = "<YOUR_GITHUB_TOKEN>"
>>> vbt.quick_search("UserWarning: Symbols have mismatching index")
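For intuition, here is a minimal Okapi BM25 scorer in pure Python. VBT ships its own implementation; this sketch only demonstrates the standard ranking formula that makes fully offline lexical search possible.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with the BM25 ranking function."""
    tokenized = [doc.lower().split() for doc in docs]
    N = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / N
    # Document frequency of each term
    df = Counter()
    for d in tokenized:
        df.update(set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((N - df[term] + 0.5) / (df[term] + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            )
        scores.append(score)
    return scores

docs = [
    "symbols have mismatching index",
    "how to align index of symbols",
    "plotting candlestick charts",
]
scores = bm25_scores("mismatching index", docs)
```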
ChatVBT¶
- Similar to SearchVBT, ChatVBT takes search results and forwards them to an LLM for completion. This allows you to interact seamlessly with the entire VBT knowledge base, receiving detailed and context-aware responses.
Info
The first time you run this command, it may take up to 15 minutes to prepare and embed documents. However, most of the preparation steps are cached and stored, so future searches will be much faster and will not require repeating the process.
>>> env["GITHUB_TOKEN"] = "<YOUR_GITHUB_TOKEN>"
>>> env["OPENAI_API_KEY"] = "<YOUR_OPENAI_API_KEY>"
>>> vbt.settings.set(
... "knowledge.chat.completions_configs.openai.model",
... "o3-mini" # (1)!
... )
>>> vbt.settings.set(
... "knowledge.chat.completions_configs.openai.reasoning_effort",
... "high" # (2)!
... )
>>> vbt.settings.set(
... "knowledge.chat.completions_configs.openai.system_as_user",
... True # (3)!
... )
>>> vbt.chat("How to rebalance weekly?", formatter="html")
- Discover more models. Note that the availability of this model depends on your tier. If your tier does not support o3-mini, try another model, such as o1-mini.
- Do not forget to update `openai` to the latest version.
- Some models do not support system prompts.
>>> env["GITHUB_TOKEN"] = "<YOUR_GITHUB_TOKEN>"
>>> env["OPENAI_API_KEY"] = "<YOUR_OPENAI_API_KEY>" # (1)!
>>> env["OPENROUTER_API_KEY"] = "<YOUR_OPENROUTER_API_KEY>"
>>> vbt.settings.set(
... "knowledge.chat.completions_configs.openai.base_url",
... "https://openrouter.ai/api/v1"
... )
>>> vbt.settings.set(
... "knowledge.chat.completions_configs.openai.api_key",
... env["OPENROUTER_API_KEY"]
... )
>>> vbt.settings.set(
... "knowledge.chat.completions_configs.openai.model",
... "deepseek/deepseek-r1" # (2)!
... )
>>> vbt.chat("How to rebalance weekly?", formatter="html")
- Required for embeddings.
- Discover more models.
>>> env["GITHUB_TOKEN"] = "<YOUR_GITHUB_TOKEN>"
>>> env["OPENAI_API_KEY"] = "<YOUR_OPENAI_API_KEY>" # (1)!
>>> env["ANTHROPIC_API_KEY"] = "<YOUR_ANTHROPIC_API_KEY>"
>>> vbt.settings.set(
... "knowledge.chat.completions",
... "litellm"
... )
>>> vbt.settings.set(
... "knowledge.chat.completions_configs.litellm.model",
... "anthropic/claude-3-5-sonnet-20241022" # (2)!
... )
>>> vbt.chat("How to rebalance weekly?", formatter="html")
- Required for embeddings.
- Discover more models.
Note
Make sure you have the hardware required to run local models. The Hugging Face extension for LlamaIndex is required for both embeddings and LLMs; please ensure it is installed.
>>> env["GITHUB_TOKEN"] = "<YOUR_GITHUB_TOKEN>"
>>> vbt.settings.set(
... "knowledge.chat.embeddings",
... "llama_index"
... )
>>> vbt.settings.set(
... "knowledge.chat.embeddings_configs.llama_index.embedding",
... "huggingface"
... )
>>> vbt.settings.set(
... "knowledge.chat.embeddings_configs.llama_index.model_name",
... "BAAI/bge-small-en-v1.5" # (1)!
... )
>>> vbt.settings.set(
... "knowledge.chat.completions",
... "llama_index"
... )
>>> vbt.settings.set(
... "knowledge.chat.completions_configs.llama_index.llm",
... "huggingface"
... )
>>> vbt.settings.set(
... "knowledge.chat.completions_configs.llama_index.model_name",
... "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B" # (2)!
... )
>>> vbt.settings.set(
... "knowledge.chat.completions_configs.llama_index.tokenizer_name",
... "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
... )
>>> vbt.settings.set(
... "knowledge.chat.rank_kwargs.dataset_id",
... "local"
... )
>>> vbt.chat("How to rebalance weekly?", formatter="html")
- Discover more embedding models.
- Discover more DeepSeek models.
SearchVBT¶
- Want to find specific information using natural language on the website or Discord? VBT provides a powerful smart search feature called SearchVBT. Enter your query and it will generate an HTML page with well-structured search results. Behind the scenes, SearchVBT uses a RAG pipeline to embed, rank, and retrieve only the most relevant documents from VBT, ensuring precise and efficient search results.
Info
The first time you run this command, it may take up to 15 minutes to prepare and embed documents. However, most preparation steps are cached and stored, so future searches will be significantly faster without needing to repeat the process.
>>> env["GITHUB_TOKEN"] = "<YOUR_GITHUB_TOKEN>"
>>> env["OPENAI_API_KEY"] = "<YOUR_API_KEY>"
>>> vbt.search("How to run indicator expressions?")
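The embed, rank, and retrieve core of such a RAG pipeline can be sketched in a few lines of NumPy. The 3-dimensional vectors below are toy stand-ins for real model embeddings.

```python
import numpy as np

def retrieve_top_k(query_emb, doc_embs, k=2):
    """Rank documents by cosine similarity and return the top-k indices."""
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    sims = d @ q  # cosine similarity of each document to the query
    return np.argsort(sims)[::-1][:k]

# Toy "embeddings" standing in for the output of a real embedding model
doc_embs = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
])
query_emb = np.array([1.0, 0.05, 0.0])
top = retrieve_top_k(query_emb, doc_embs, k=2)
```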
Self-aware classes¶
- Each VBT class offers methods to explore its features, including its API, associated documentation, Discord messages, and code examples. You can even interact with it directly via an LLM!
>>> env["GITHUB_TOKEN"] = "<YOUR_GITHUB_TOKEN>"
>>> env["OPENAI_API_KEY"] = "<YOUR_API_KEY>"
>>> vbt.PortfolioOptimizer.find_assets().get("link")
['https://vectorbt.pro/pvt_xxxxxxxx/api/portfolio/pfopt/base/#vectorbtpro.portfolio.pfopt.base',
'https://vectorbt.pro/pvt_xxxxxxxx/api/generic/analyzable/#vectorbtpro.generic.analyzable.Analyzable',
'https://vectorbt.pro/pvt_xxxxxxxx/api/base/wrapping/#vectorbtpro.base.wrapping.Wrapping',
...
'https://vectorbt.pro/pvt_xxxxxxxx/features/optimization/#riskfolio-lib',
'https://vectorbt.pro/pvt_xxxxxxxx/features/optimization/#portfolio-optimization',
'https://vectorbt.pro/pvt_xxxxxxxx/features/optimization/#pyportfolioopt',
...
'https://discord.com/channels/x/918629995415502888/1064943203071045753',
'https://discord.com/channels/x/918629995415502888/1067718833646874634',
'https://discord.com/channels/x/918629995415502888/1067718855734075403',
...]
>>> vbt.PortfolioOptimizer.chat("How to rebalance weekly?", formatter="html")
Knowledge assets¶
- Each release now includes valuable knowledge assets—JSON files containing private website content and the complete "vectorbt.pro" Discord history. These assets can be used with LLMs and services like Cursor. In addition, VBT offers a palette of classes for working with these assets, providing functions such as converting to Markdown and HTML files, browsing the website offline, performing targeted searches, interacting with LLMs, and much more!
>>> env["GITHUB_TOKEN"] = "<YOUR_GITHUB_TOKEN>"
>>> pages_asset = vbt.PagesAsset.pull() # (1)!
>>> messages_asset = vbt.MessagesAsset.pull()
>>> vbt_asset = pages_asset + messages_asset
>>> code = vbt_asset.find_code("def signal_func_nb", return_type="item")
>>> code.print_sample() # (2)!
- The first pull will download the assets, while subsequent pulls will use the cached versions.
- Print a random code snippet.
link: https://discord.com/channels/x/918630948248125512/1251081573147742298
block: https://discord.com/channels/x/918630948248125512/1251081573147742298
thread: https://discord.com/channels/x/918630948248125512/1250844139952541837
reference: https://discord.com/channels/x/918630948248125512/1250844139952541837
replies:
- https://discord.com/channels/x/918630948248125512/1251083513336299610
channel: support
timestamp: '2024-06-14 07:51:31'
author: '@polakowo'
content: Something like this
mentions:
- '@fei'
attachments:
- file_name: Screenshot_2024-06-13_at_20.29.45-B4517.png
content: |-
Here's the text extracted from the image:
```python
@njit
def signal_func_nb(c, entries, exits, wait):
    is_entry = vbt.pf_nb.select_nb(c, entries)
    if is_entry:
        return True, False, False, False
    is_exit = vbt.pf_nb.select_nb(c, exits)
    if is_exit:
        if vbt.pf_nb.in_position_nb(c):
            last_order = vbt.pf_nb.get_last_order_nb(c)
            if c.index[c.i] - c.index[last_order["idx"]] >= wait:
                return False, True, False, False
    return False, False, False, False

pf = vbt.PF.from_random_signals(
    "BTC-USD",
    n=100,
    signal_func_nb=signal_func_nb,
    signal_args=(
        vbt.Rep("entries"),
        vbt.Rep("exits"),
        vbt.dt.to_ns(vbt.timedelta("1000 days"))
    )
)
```
reactions: 0
Iterated decorator¶
- Thinking about parallelizing a for-loop? No need to hesitate—VBT has a decorator for that.
>>> import calendar
>>> @vbt.iterated(over_arg="year", merge_func="column_stack", engine="pathos") # (1)!
... @vbt.iterated(over_arg="month", merge_func="concat") # (2)!
... def get_year_month_sharpe(data, year, month): # (3)!
...     mask = (data.index.year == year) & (data.index.month == month)
...     if not mask.any():
...         return np.nan
...     year_returns = data.loc[mask].returns
...     return year_returns.vbt.returns.sharpe_ratio()
>>> years = data.index.year.unique().sort_values().rename("year")
>>> months = data.index.month.unique().sort_values().rename("month")
>>> sharpe_matrix = get_year_month_sharpe(
... data,
... years,
... {calendar.month_abbr[month]: month for month in months}, # (4)!
... )
>>> sharpe_matrix.transpose().vbt.heatmap(
... trace_kwargs=dict(colorscale="RdBu", zmid=0),
... yaxis=dict(autorange="reversed")
... ).show()
- Iterate over years (in parallel).
- Iterate over months (sequentially).
- The function is called for each combination of year and month.
- Map month numbers to names and pass them as a dict. VBT will extract the keys and use them as labels.
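Conceptually, such a decorator calls the wrapped function once per value of the iterated argument and merges the results. A minimal pure-Python sketch, ignoring the parallel engines, labels, and positional arguments the real decorator supports (the iterated argument must be passed by keyword here):

```python
import functools

def iterated(over_arg, merge_func=list):
    """Call the wrapped function once per value of `over_arg`, then merge."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            values = kwargs.pop(over_arg)
            results = [func(*args, **{over_arg: value}, **kwargs) for value in values]
            return merge_func(results)
        return wrapper
    return decorator

@iterated("n", merge_func=sum)
def square(n):
    return n * n

total = square(n=[1, 2, 3])  # 1 + 4 + 9
```

Stacking two such decorators, as in the example above, yields nested iteration over both arguments.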
Tasks¶
- Testing multiple parameter combinations usually involves the `@vbt.parameterized` decorator. But what if you want to test entirely uncorrelated configurations, or even different functions? The latest addition to VBT lets you execute any sequence of unrelated tests in parallel by assigning each test to a task.
>>> data = vbt.YFData.pull("BTC-USD")
>>> task1 = vbt.Task( # (1)!
... vbt.PF.from_random_signals,
... data,
... n=100, seed=42,
... sl_stop=vbt.Param(np.arange(1, 51) / 100)
... )
>>> task2 = vbt.Task(
... vbt.PF.from_random_signals,
... data,
... n=100, seed=42,
... tsl_stop=vbt.Param(np.arange(1, 51) / 100)
... )
>>> task3 = vbt.Task(
... vbt.PF.from_random_signals,
... data,
... n=100, seed=42,
... tp_stop=vbt.Param(np.arange(1, 51) / 100)
... )
>>> pf1, pf2, pf3 = vbt.execute([task1, task2, task3], engine="pathos") # (2)!
>>> fig = pf1.trades.expectancy.rename("SL").vbt.plot()
>>> pf2.trades.expectancy.rename("TSL").vbt.plot(fig=fig)
>>> pf3.trades.expectancy.rename("TP").vbt.plot(fig=fig)
>>> fig.show()
- A task consists of a function and the arguments you want to pass to that function. Just creating a task does not execute the function!
- Execute all three tasks using multiprocessing.
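The pattern itself, bundling a call without executing it and then running many bundles concurrently, is easy to sketch with the standard library. The names `Task` and `execute` mirror the VBT API, but the implementation below is illustrative only (threads instead of multiprocessing):

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

@dataclass
class Task:
    """A function plus its arguments; nothing runs until execute() is called."""
    func: callable
    args: tuple = ()
    kwargs: dict = field(default_factory=dict)

def execute(tasks, max_workers=4):
    """Run unrelated tasks concurrently and return results in task order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(t.func, *t.args, **t.kwargs) for t in tasks]
        return [f.result() for f in futures]

tasks = [
    Task(pow, (2, 10)),
    Task(sum, ([1, 2, 3],)),
    Task(max, ((5, 9, 4),)),
]
results = execute(tasks)
```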
Nested progress bars¶
- Progress bars are now aware of each other. When a new progress bar starts, it checks whether another progress bar with the same identifier has already finished its task. If so, the new progress bar will close itself and delegate its progress to the existing one.
>>> symbols = ["BTC-USD", "ETH-USD"]
>>> fast_windows = range(5, 105, 5)
>>> slow_windows = range(5, 105, 5)
>>> sharpe_ratios = dict()
>>> with vbt.ProgressBar(total=len(symbols), bar_id="pbar1") as pbar1: # (1)!
...     for symbol in symbols:
...         pbar1.set_description(dict(symbol=symbol), refresh=True)
...         data = vbt.YFData.pull(symbol)
...
...         with vbt.ProgressBar(total=len(fast_windows), bar_id="pbar2") as pbar2: # (2)!
...             for fast_window in fast_windows:
...                 pbar2.set_description(dict(fast_window=fast_window), refresh=True)
...
...                 with vbt.ProgressBar(total=len(slow_windows), bar_id="pbar3") as pbar3: # (3)!
...                     for slow_window in slow_windows:
...                         if fast_window < slow_window:
...                             pbar3.set_description(dict(slow_window=slow_window), refresh=True)
...                             fast_sma = data.run("talib_func:sma", fast_window)
...                             slow_sma = data.run("talib_func:sma", slow_window)
...                             entries = fast_sma.vbt.crossed_above(slow_sma)
...                             exits = fast_sma.vbt.crossed_below(slow_sma)
...                             pf = vbt.PF.from_signals(data, entries, exits)
...                             sharpe_ratios[(symbol, fast_window, slow_window)] = pf.sharpe_ratio
...                         pbar3.update()
...
...             pbar2.update()
...
...         pbar1.update()
- Track iteration over symbols.
- Track iteration over fast windows.
- Track iteration over slow windows.
>>> sharpe_ratios = pd.Series(sharpe_ratios)
>>> sharpe_ratios.index.names = ["symbol", "fast_window", "slow_window"]
>>> sharpe_ratios
symbol fast_window slow_window
BTC-USD 5 10 1.063616
15 1.218345
20 1.273154
25 1.365664
30 1.394469
...
ETH-USD 80 90 0.582995
95 0.617568
85 90 0.701215
95 0.616037
90 95 0.566650
Length: 342, dtype: float64
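The delegation mechanism can be pictured as a registry keyed by `bar_id`: the first bar created under an id owns the display, and any later bar with the same id closes itself and forwards its updates. This is a conceptual toy, not VBT's implementation:

```python
class ProgressBar:
    """Toy sketch of bar_id-based delegation between progress bars.

    The first bar created under a given id becomes the "display" bar; any
    later bar with the same id forwards its progress updates to it.
    """
    _registry = {}  # bar_id -> first bar created under that id

    def __init__(self, total, bar_id=None):
        self.total = total
        self.bar_id = bar_id
        self.n = 0  # units of progress completed
        self.delegate = ProgressBar._registry.get(bar_id) if bar_id else None
        if bar_id is not None and self.delegate is None:
            ProgressBar._registry[bar_id] = self

    def __enter__(self):
        return self

    def update(self, n=1):
        # Forward the update if another bar already owns this id
        (self.delegate or self).n += n

    def __exit__(self, *exc):
        pass

# Two bars sharing an id: the second forwards its progress to the first
with ProgressBar(10, bar_id="windows") as first:
    pass
with ProgressBar(10, bar_id="windows") as second:
    second.update(3)
```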
Annotations¶
- When writing a function, you can specify the meaning of each argument using an annotation immediately next to the argument. VBT now provides a rich set of in-house annotations tailored to specific tasks. For example, whether an argument is a parameter can be specified directly in the function instead of in the parameterized decorator.
>>> @vbt.cv_split(
... splitter="from_rolling",
... splitter_kwargs=dict(length=365, split=0.5, set_labels=["train", "test"]),
... parameterized_kwargs=dict(random_subset=100),
... )
... def sma_crossover_cv(
...     data: vbt.Takeable, # (1)!
...     fast_period: vbt.Param(condition="x < slow_period"), # (2)!
...     slow_period: vbt.Param, # (3)!
...     metric
... ) -> vbt.MergeFunc("concat"):
...     fast_sma = data.run("sma", fast_period, hide_params=True)
...     slow_sma = data.run("sma", slow_period, hide_params=True)
...     entries = fast_sma.real_crossed_above(slow_sma)
...     exits = fast_sma.real_crossed_below(slow_sma)
...     pf = vbt.PF.from_signals(data, entries, exits, direction="both")
...     return pf.deep_getattr(metric)
>>> sma_crossover_cv(
... vbt.YFData.pull("BTC-USD", start="4 years ago"),
... np.arange(20, 50),
... np.arange(20, 50),
... "trades.expectancy"
... )
split set fast_period slow_period
0 train 22 33 26.351841
test 21 34 35.788733
1 train 21 46 24.114027
test 21 39 2.261432
2 train 30 44 29.635233
test 30 38 1.909916
3 train 20 49 -7.038924
test 20 44 -1.366734
4 train 28 44 2.144805
test 29 38 -4.945776
5 train 35 47 -8.877875
test 34 37 2.792217
6 train 29 41 8.816846
test 28 43 36.008302
dtype: float64
- The passed argument must be takeable (for selecting subsets by split).
- Parameter with a condition requiring it to be less than the slow period.
- Parameter for the slow period.
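To see how annotation-driven parameters can work in principle, here is a toy `Param` marker and an expander that builds the grid of combinations from a function's signature. All names are illustrative, not VBT internals:

```python
import inspect
import itertools

class Param:
    """Marker annotation: the argument holds parameter values to combine."""

def expand_params(func, *args):
    """Call `func` for each combination of its Param-annotated arguments."""
    sig = inspect.signature(func)
    names = list(sig.parameters)
    param_names = [n for n, p in sig.parameters.items() if p.annotation is Param]
    fixed = {n: a for n, a in zip(names, args) if n not in param_names}
    grids = [args[names.index(n)] for n in param_names]
    results = {}
    for combo in itertools.product(*grids):
        results[combo] = func(**fixed, **dict(zip(param_names, combo)))
    return results

def spread(x, fast: Param, slow: Param):
    return x * (slow - fast)

out = expand_params(spread, 2, [1, 2], [10, 20])  # 2 x 2 = 4 combinations
```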
DataFrame product¶
- Several parameterized indicators can produce DataFrames with different shapes and columns, which makes creating a Cartesian product tricky because they often share common column levels (such as "symbol") that should not be combined. There is now a method to cross-join multiple DataFrames block-wise.
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"], missing_index="drop")
>>> sma = data.run("sma", timeperiod=[10, 20], unpack=True)
>>> ema = data.run("ema", timeperiod=[30, 40], unpack=True)
>>> wma = data.run("wma", timeperiod=[50, 60], unpack=True)
>>> sma, ema, wma = sma.vbt.x(ema, wma) # (1)!
>>> entries = sma.vbt.crossed_above(wma)
>>> exits = ema.vbt.crossed_below(wma)
>>> entries.columns
MultiIndex([(10, 30, 50, 'BTC-USD'),
(10, 30, 50, 'ETH-USD'),
(10, 30, 60, 'BTC-USD'),
(10, 30, 60, 'ETH-USD'),
(10, 40, 50, 'BTC-USD'),
(10, 40, 50, 'ETH-USD'),
(10, 40, 60, 'BTC-USD'),
(10, 40, 60, 'ETH-USD'),
(20, 30, 50, 'BTC-USD'),
(20, 30, 50, 'ETH-USD'),
(20, 30, 60, 'BTC-USD'),
(20, 30, 60, 'ETH-USD'),
(20, 40, 50, 'BTC-USD'),
(20, 40, 50, 'ETH-USD'),
(20, 40, 60, 'BTC-USD'),
(20, 40, 60, 'ETH-USD')],
names=['sma_timeperiod', 'ema_timeperiod', 'wma_timeperiod', 'symbol'])
- Build a Cartesian product of three DataFrames while keeping the column level "symbol" untouched. This can also be done with `vbt.pd_acc.cross(sma, ema, wma)`.
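The column logic behind such a block-wise product can be sketched with plain pandas: cross the indicator-specific levels while repeating, rather than crossing, the shared "symbol" level. This simplified sketch assumes each index has exactly one non-shared level:

```python
import itertools

import pandas as pd

def cross_columns(*indexes, keep="symbol"):
    """Cross the non-shared column levels, repeating the shared `keep` level."""
    keeps = indexes[0].get_level_values(keep).unique()
    own_levels = []
    names = []
    for idx in indexes:
        # Assumes a single non-shared level per index
        name = [n for n in idx.names if n != keep][0]
        own_levels.append(idx.get_level_values(name).unique())
        names.append(name)
    tuples = [
        combo + (sym,)
        for combo in itertools.product(*own_levels)
        for sym in keeps
    ]
    return pd.MultiIndex.from_tuples(tuples, names=names + [keep])

sma_cols = pd.MultiIndex.from_product(
    [[10, 20], ["BTC-USD", "ETH-USD"]], names=["sma_timeperiod", "symbol"])
ema_cols = pd.MultiIndex.from_product(
    [[30, 40], ["BTC-USD", "ETH-USD"]], names=["ema_timeperiod", "symbol"])
crossed = cross_columns(sma_cols, ema_cols)
```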
Compression¶
- Serialized VBT objects can sometimes use a lot of disk space. With this update, VBT now supports a variety of compression algorithms to make files as light as possible!
>>> data = vbt.RandomOHLCData.pull("RAND", start="2022", end="2023", timeframe="1 minute")
>>> file_path = data.save()
>>> print(vbt.file_size(file_path))
21.0 MB
>>> file_path = data.save(compression="blosc")
>>> print(vbt.file_size(file_path))
13.3 MB
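The underlying idea is the standard serialize-then-compress pattern. Here it is with the standard library (gzip here; VBT additionally supports algorithms such as blosc):

```python
import gzip
import pickle

import numpy as np

# A repetitive payload, standing in for serialized OHLC data, compresses well
payload = {"close": np.zeros(100_000), "symbol": "RAND"}

raw = pickle.dumps(payload)
compressed = gzip.compress(raw)

# Round-trip: decompress, then unpickle
restored = pickle.loads(gzip.decompress(compressed))
```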
Faster loading¶
- If your pipeline does not need accessors, Plotly graphs, or most other optional features, you can disable the auto-import feature entirely to reduce VBT's loading time to under a second.
>>> start = utc_time()
>>> from vectorbtpro import *
>>> end = utc_time()
>>> end - start
0.580937910079956
Configuration files¶
- VBT extends `configparser` to define its own configuration format, which allows users to save, introspect, modify, and load any complex in-house object. The main advantages of this format are readability and round-tripping: any object can be encoded and then decoded back without loss of information. Its main features include nested structures, references, literal parsing, and evaluation of arbitrary Python expressions. Additionally, you can now create a configuration file for VBT and place it in the working directory; it will be used to update the default settings whenever the package is imported.
[plotting]
default_theme = dark
[portfolio]
init_cash = 5000
[data.custom.binance.client_config]
api_key = YOUR_API_KEY
api_secret = YOUR_API_SECRET
[data.custom.ccxt.exchanges.binance.exchange_config]
apiKey = &data.custom.binance.client_config.api_key
secret = &data.custom.binance.client_config.api_secret
>>> from vectorbtpro import *
>>> vbt.settings.portfolio["init_cash"]
5000
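The `&section.option` syntax above goes beyond stock `configparser`. A toy resolver shows how such cross-section references can be expanded; note that `configparser` lowercases option names by default, so `apiKey` is stored as `apikey`:

```python
import configparser

INI = """
[plotting]
default_theme = dark

[binance]
api_key = YOUR_API_KEY

[ccxt]
apiKey = &binance.api_key
"""

def resolve(parser):
    """Expand `&section.option` references across sections (a sketch)."""
    out = {}
    for section in parser.sections():
        out[section] = {}
        for key, value in parser.items(section):
            if value.startswith("&"):
                ref_section, ref_key = value[1:].rsplit(".", 1)
                value = parser.get(ref_section, ref_key)
            out[section][key] = value
    return out

parser = configparser.ConfigParser()
parser.read_string(INI)
config = resolve(parser)
```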
Serialization¶
- Just like machine learning models, every native VBT object can be serialized and saved to a binary file. It has never been easier to share data and insights! Another benefit is that only the actual content of each object is serialized, not its class definition, so a loaded object always uses the most up-to-date class definition. There is also special logic to help you "reconstruct" objects if VBT introduces any breaking API changes.
>>> data = vbt.YFData.pull("BTC-USD", start="2022-01-01", end="2022-06-01")
>>> def backtest_month(close):
...     return vbt.PF.from_random_signals(close, n=10)
>>> month_pfs = data.close.resample(vbt.offset("M")).apply(backtest_month)
>>> month_pfs
Date
2022-01-01 00:00:00+00:00 Portfolio(\n wrapper=ArrayWrapper(\n ...
2022-02-01 00:00:00+00:00 Portfolio(\n wrapper=ArrayWrapper(\n ...
2022-03-01 00:00:00+00:00 Portfolio(\n wrapper=ArrayWrapper(\n ...
2022-04-01 00:00:00+00:00 Portfolio(\n wrapper=ArrayWrapper(\n ...
2022-05-01 00:00:00+00:00 Portfolio(\n wrapper=ArrayWrapper(\n ...
Freq: MS, Name: Close, dtype: object
>>> vbt.save(month_pfs, "month_pfs") # (1)!
>>> month_pfs = vbt.load("month_pfs") # (2)!
>>> month_pfs.apply(lambda pf: pf.total_return)
Date
2022-01-01 00:00:00+00:00 -0.048924
2022-02-01 00:00:00+00:00 0.168370
2022-03-01 00:00:00+00:00 0.016087
2022-04-01 00:00:00+00:00 -0.120525
2022-05-01 00:00:00+00:00 0.110751
Freq: MS, Name: Close, dtype: float64
- Save to disk.
- Load from disk later.
Data parsing¶
- Tired of passing open, high, low, and close as separate time series? Portfolio class methods now accept a data instance instead of just close and automatically extract the contained OHLC data. This small but handy feature saves you time!
>>> data = vbt.YFData.pull("BTC-USD", start="2020-01", end="2020-03")
>>> pf = vbt.PF.from_random_signals(data, n=10)
Index dictionaries¶
- Manually creating arrays and setting their data with Pandas can often be challenging. Luckily, there is now a feature that offers much-needed assistance! Any broadcastable argument can become an index dictionary, which contains instructions on where to set values in the array and fills them in for you. It knows exactly which axis needs to be updated and does not create a full array unless necessary, saving RAM.
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"])
>>> tile = pd.Index(["daily", "weekly"], name="strategy") # (1)!
>>> pf = vbt.PF.from_orders(
...     data.close,
...     size=vbt.index_dict({ # (2)!
...         vbt.idx(
...             vbt.pointidx(every="day"),
...             vbt.colidx("daily", level="strategy")): 100, # (3)!
...         vbt.idx(
...             vbt.pointidx(every="sunday"),
...             vbt.colidx("daily", level="strategy")): -np.inf, # (4)!
...         vbt.idx(
...             vbt.pointidx(every="monday"),
...             vbt.colidx("weekly", level="strategy")): 100,
...         vbt.idx(
...             vbt.pointidx(every="monthend"),
...             vbt.colidx("weekly", level="strategy")): -np.inf,
...     }),
...     size_type="value",
...     direction="longonly",
...     init_cash="auto",
...     broadcast_kwargs=dict(tile=tile)
... )
>>> pf.sharpe_ratio
strategy symbol
daily BTC-USD 0.702259
ETH-USD 0.782296
weekly BTC-USD 0.838895
ETH-USD 0.524215
Name: sharpe_ratio, dtype: float64
- To represent two strategies, you need to tile the same data twice. Create a parameter with strategy names and pass it as `tile` to the broadcaster so it tiles the columns of each array (such as price) twice.
- The index dictionary includes index instructions as keys and data as values to set. Keys can be row indices, labels, or custom indexer classes such as PointIdxr.
- Find the indices of the rows for the start of each day and the column index of "daily", then set each element at those indices to 100 (= accumulate).
- Find the indices of the rows that correspond to Sunday. If any value at those indices has already been set by a previous instruction, it will be overridden.
Slicing¶
- Similar to selecting columns, each VBT object can now slice rows using the same mechanism as in Pandas. This makes it easy to analyze and plot any subset of the simulated data without needing to re-simulate!
>>> data = vbt.YFData.pull("BTC-USD")
>>> pf = vbt.PF.from_holding(data, freq="d")
>>> pf.sharpe_ratio
1.116727709477293
>>> pf.loc[:"2020"].sharpe_ratio # (1)!
1.2699801554196481
>>> pf.loc["2021": "2021"].sharpe_ratio # (2)!
0.9825161170278687
>>> pf.loc["2022":].sharpe_ratio # (3)!
-1.0423271337174647
- Get the Sharpe ratio during the year 2020 and before.
- Get the Sharpe ratio during the year 2021.
- Get the Sharpe ratio during the year 2022 and after.
Column stacking¶
- Complex VBT objects of the same type can be easily stacked along columns. For example, you can combine multiple unrelated trading strategies into one portfolio for analysis. Under the hood, the final object is still represented as a monolithic multi-dimensional structure that can be processed even faster than separate merged objects
>>> def strategy1(data):
...     fast_ma = vbt.MA.run(data.close, 50, short_name="fast_ma")
...     slow_ma = vbt.MA.run(data.close, 200, short_name="slow_ma")
...     entries = fast_ma.ma_crossed_above(slow_ma)
...     exits = fast_ma.ma_crossed_below(slow_ma)
...     return vbt.PF.from_signals(
...         data.close,
...         entries,
...         exits,
...         size=100,
...         size_type="value",
...         init_cash="auto"
...     )
>>> def strategy2(data):
...     bbands = vbt.BBANDS.run(data.close, window=14)
...     entries = bbands.close_crossed_below(bbands.lower)
...     exits = bbands.close_crossed_above(bbands.upper)
...     return vbt.PF.from_signals(
...         data.close,
...         entries,
...         exits,
...         init_cash=200
...     )
>>> data1 = vbt.BinanceData.pull("BTCUSDT")
>>> pf1 = strategy1(data1) # (1)!
>>> pf1.sharpe_ratio
0.9100317671866922
>>> data2 = vbt.BinanceData.pull("ETHUSDT")
>>> pf2 = strategy2(data2) # (2)!
>>> pf2.sharpe_ratio
-0.11596286232734827
>>> pf_sep = vbt.PF.column_stack((pf1, pf2)) # (3)!
>>> pf_sep.sharpe_ratio
0 0.910032
1 -0.115963
Name: sharpe_ratio, dtype: float64
>>> pf_join = vbt.PF.column_stack((pf1, pf2), group_by=True) # (4)!
>>> pf_join.sharpe_ratio
0.42820898354646514
- Analyze the first strategy in its own portfolio.
- Analyze the second strategy in its own portfolio.
- Analyze both strategies separately in the same portfolio.
- Analyze both strategies jointly in the same portfolio.
Row stacking¶
- Complex VBT objects of the same type can be easily stacked along rows. For example, you can append new data to an existing portfolio, or concatenate in-sample portfolios with their out-of-sample counterparts
>>> def strategy(data, start=None, end=None):
...     fast_ma = vbt.MA.run(data.close, 50, short_name="fast_ma")
...     slow_ma = vbt.MA.run(data.close, 200, short_name="slow_ma")
...     entries = fast_ma.ma_crossed_above(slow_ma)
...     exits = fast_ma.ma_crossed_below(slow_ma)
...     return vbt.PF.from_signals(
...         data.close[start:end],
...         entries[start:end],
...         exits[start:end],
...         size=100,
...         size_type="value",
...         init_cash="auto"
...     )
>>> data = vbt.BinanceData.pull("BTCUSDT")
>>> pf_whole = strategy(data) # (1)!
>>> pf_whole.sharpe_ratio
0.9100317671866922
>>> pf_sub1 = strategy(data, end="2019-12-31") # (2)!
>>> pf_sub1.sharpe_ratio
0.7810397448678937
>>> pf_sub2 = strategy(data, start="2020-01-01") # (3)!
>>> pf_sub2.sharpe_ratio
1.070339534746574
>>> pf_join = vbt.PF.row_stack((pf_sub1, pf_sub2)) # (4)!
>>> pf_join.sharpe_ratio
0.9100317671866922
- Analyze the entire range.
- Analyze the first date range.
- Analyze the second date range.
- Combine both date ranges and analyze them together.
Index alignment¶
- There is no longer a limitation requiring each Pandas array to have the same index. Indexes of all arrays that should broadcast against each other are automatically aligned, as long as they have the same data type.
>>> btc_data = vbt.YFData.pull("BTC-USD")
>>> btc_data.wrapper.shape
(2817, 7)
>>> eth_data = vbt.YFData.pull("ETH-USD") # (1)!
>>> eth_data.wrapper.shape
(1668, 7)
>>> ols = vbt.OLS.run( # (2)!
... btc_data.close,
... eth_data.close
... )
>>> ols.pred
Date
2014-09-17 00:00:00+00:00 NaN
2014-09-18 00:00:00+00:00 NaN
2014-09-19 00:00:00+00:00 NaN
2014-09-20 00:00:00+00:00 NaN
2014-09-21 00:00:00+00:00 NaN
... ...
2022-05-30 00:00:00+00:00 2109.769242
2022-05-31 00:00:00+00:00 2028.856767
2022-06-01 00:00:00+00:00 1911.555689
2022-06-02 00:00:00+00:00 1930.169725
2022-06-03 00:00:00+00:00 1882.573170
Freq: D, Name: Close, Length: 2817, dtype: float64
- ETH-USD history is shorter than BTC-USD history.
- This now works! Make sure all arrays share the same timeframe and timezone.
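In plain pandas, the analogous step is explicit alignment before any element-wise operation; the aligned arrays share one index, with missing positions filled with NaN:

```python
import pandas as pd

btc = pd.Series([1.0, 2.0, 3.0, 4.0],
                index=pd.date_range("2020-01-01", periods=4))
eth = pd.Series([10.0, 20.0],
                index=pd.date_range("2020-01-03", periods=2))

# Align both series on the union of their indexes before dividing
btc_aligned, eth_aligned = btc.align(eth)
ratio = btc_aligned / eth_aligned  # NaN where the shorter history is missing
```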
Numba datetime¶
- Numba does not support datetime indexes (or any other Pandas objects), and there are no built-in Numba functions for working with datetime either. So how do you connect data to time? VBT addresses this gap by implementing a collection of functions that extract various information from each timestamp, such as the current time or the day of the week, for example to determine whether a bar falls within trading hours.
Tutorial
Learn more in the Signal development tutorial.
>>> @njit
... def month_start_pct_change_nb(arr, index):
...     out = np.full(arr.shape, np.nan)
...     for col in range(arr.shape[1]):
...         for i in range(arr.shape[0]):
...             if i == 0 or vbt.dt_nb.month_nb(index[i - 1]) != vbt.dt_nb.month_nb(index[i]):
...                 month_start_value = arr[i, col]
...             else:
...                 out[i, col] = (arr[i, col] - month_start_value) / month_start_value
...     return out
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"], start="2022", end="2023")
>>> pct_change = month_start_pct_change_nb(
... vbt.to_2d_array(data.close),
... data.index.vbt.to_ns() # (1)!
... )
>>> pct_change = data.symbol_wrapper.wrap(pct_change)
>>> pct_change.vbt.plot().show()
- Convert the datetime index to nanosecond format.
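The kind of timestamp arithmetic involved can be shown without any Pandas objects, which is exactly what makes it usable inside Numba. The helper below is a plain-NumPy stand-in for `vbt.dt_nb.month_nb`: it recovers the month from a nanosecond UNIX timestamp.

```python
import numpy as np
import pandas as pd

def month_nb(ts_ns):
    """Return the month (1-12) of a nanosecond UNIX timestamp."""
    # Truncate the timestamp to month precision, counted from 1970-01
    months_since_epoch = np.datetime64(int(ts_ns), "ns").astype("datetime64[M]").astype(int)
    return months_since_epoch % 12 + 1

index = pd.date_range("2022-01-30", periods=4, tz="UTC")  # crosses Jan -> Feb
ns = index.asi8  # the index as nanosecond integers, as passed to Numba code
months = [month_nb(t) for t in ns]
```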
Periods ago¶
- Instead of writing Numba functions, comparing values at different bars can also be done in a vectorized way with Pandas. The problem is that there are no built-in functions to easily shift values based on timedeltas, nor are there rolling functions to check whether an event happened during a past period. This gap is filled by various new accessor methods.
Tutorial
Learn more in the Signal development tutorial.
>>> data = vbt.YFData.pull("BTC-USD", start="2022-05", end="2022-08")
>>> mask = (data.close < data.close.vbt.ago(1)).vbt.all_ago(5)
>>> fig = data.plot(plot_volume=False)
>>> mask.vbt.signals.ranges.plot_shapes(
... plot_close=False,
... fig=fig,
... shape_kwargs=dict(fillcolor="orangered")
... )
>>> fig.show()
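For fixed bar counts, the plain-pandas equivalents of "the value one bar ago" and "true on all of the last N bars" look like this; VBT's `ago` and `all_ago` additionally accept timedeltas:

```python
import pandas as pd

close = pd.Series([5, 4, 3, 2, 1, 2, 1], dtype=float)

# "Value one bar ago": shift by one period
down = close < close.shift(1)

# "True on all of the last 3 bars": a rolling minimum over 0/1 values
down_streak = down.astype(int).rolling(3).min().fillna(0).astype(bool)
```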
Safe resampling¶
- Look-ahead bias is an ongoing risk when working with array data, especially on multiple time frames. Using Pandas alone is strongly discouraged because it does not recognize that financial data mainly involves bars where timestamps are the opening times, and events may occur at any time between bars. Pandas thus incorrectly assumes that timestamps indicate the exact time of an event. In VBT, there is a complete collection of functions and classes for safely resampling and analyzing data!
Tutorial
Learn more in the MTF analysis tutorial.
>>> def mtf_sma(close, close_freq, target_freq, timeperiod=5):
...     target_close = close.vbt.realign_closing(target_freq) # (1)!
...     target_sma = vbt.talib("SMA").run(target_close, timeperiod=timeperiod).real # (2)!
...     target_sma = target_sma.rename(f"SMA ({target_freq})")
...     return target_sma.vbt.realign_closing(close.index, freq=close_freq) # (3)!
>>> data = vbt.YFData.pull("BTC-USD", start="2020", end="2023")
>>> fig = mtf_sma(data.close, "D", "daily").vbt.plot()
>>> mtf_sma(data.close, "D", "weekly").vbt.plot(fig=fig)
>>> mtf_sma(data.close, "D", "monthly").vbt.plot(fig=fig)
>>> fig.show()
- Resample the source frequency to the target frequency. Since Close occurs at the end of the bar, resample it as a "closing event".
- Calculate the SMA on the target frequency.
- Resample the target frequency back to the source frequency to show multiple time frames on the same chart. Because `close` contains gaps, you cannot simply resample to `close_freq`, as this might produce unaligned series. Instead, resample directly to the index of `close`.
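The core safety rule is that a closing value only becomes known at the end of its bar. In plain pandas, that maps to right-closed, right-labeled resampling followed by a forward-fill onto the source index, which is the principle behind `realign_closing`:

```python
import pandas as pd

close = pd.Series(
    range(1, 8),
    index=pd.date_range("2020-01-01", periods=7, freq="D"),
    dtype=float,
)

# A closing value is only known at the END of its bar, so close and label
# the resampling window on the right to avoid look-ahead
weekly_close = close.resample("W", label="right", closed="right").last()

# Bring the weekly series back to the daily index: each day may only see
# the last weekly value that was already complete
realigned = weekly_close.reindex(close.index, method="ffill")
```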
Resamplable objects¶
- You can resample not only time series, but also complex VBT objects! Under the hood, each object is made up of a collection of array-like attributes, so resampling means aggregating all the related information together. This is especially helpful if you want to simulate at a higher frequency for maximum accuracy and then analyze at a lower frequency for better speed.
Tutorial
Learn more in the MTF analysis tutorial.
>>> import calendar
>>> data = vbt.YFData.pull("BTC-USD", start="2018", end="2023")
>>> pf = vbt.PF.from_random_signals(data, n=100, direction="both")
>>> mo_returns = pf.resample("M").returns # (1)!
>>> mo_return_matrix = pd.Series(
...     mo_returns.values,
...     index=pd.MultiIndex.from_arrays([
...         mo_returns.index.year,
...         mo_returns.index.month
...     ], names=["year", "month"])
... ).unstack("month")
>>> mo_return_matrix.columns = mo_return_matrix.columns.map(lambda x: calendar.month_abbr[x])
>>> mo_return_matrix.vbt.heatmap(
... is_x_category=True,
... trace_kwargs=dict(zmid=0, colorscale="Spectral")
... ).show()
- Resample the entire portfolio to monthly frequency and calculate the returns.
Formatting engine¶
- VBT is a comprehensive library that defines thousands of classes, functions, and objects. When working with these, you may want to "look inside" an object to better understand its attributes and contents. Fortunately, there is a formatting engine that can accurately format any in-house object as a human-readable string. Did you know the API documentation is partly powered by this engine?
>>> data = vbt.YFData.pull("BTC-USD", start="2020", end="2021")
>>> vbt.pprint(data) # (1)!
YFData(
wrapper=ArrayWrapper(...),
data=symbol_dict({
'BTC-USD': <pandas.core.frame.DataFrame object at 0x7f7f1fbc6cd0 with shape (366, 7)>
}),
single_key=True,
classes=symbol_dict(),
fetch_kwargs=symbol_dict({
'BTC-USD': dict(
start='2020',
end='2021'
)
}),
returned_kwargs=symbol_dict({
'BTC-USD': dict()
}),
last_index=symbol_dict({
'BTC-USD': Timestamp('2020-12-31 00:00:00+0000', tz='UTC')
}),
tz_localize=datetime.timezone.utc,
tz_convert='UTC',
missing_index='nan',
missing_columns='raise'
)
>>> vbt.pdir(data) # (2)!
type path
attr
align_columns classmethod vectorbtpro.data.base.Data
align_index classmethod vectorbtpro.data.base.Data
build_feature_config_doc classmethod vectorbtpro.data.base.Data
... ... ...
vwap property vectorbtpro.data.base.Data
wrapper property vectorbtpro.base.wrapping.Wrapping
xs function vectorbtpro.base.indexing.PandasIndexer
>>> vbt.phelp(data.get) # (3)!
YFData.get(
columns=None,
symbols=None,
**kwargs
):
Get one or more columns of one or more symbols of data.
- Similar to Python's `print` built-in, pretty-prints the contents of any VBT object.
- Similar to Python's `dir` built-in, pretty-prints the attributes of a class, object, or module.
- Similar to Python's `help` built-in, pretty-prints the signature and docstring of a function.
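The core trick behind a formatter like this is recursion with summarization: containers are expanded level by level, while "heavy" objects (such as the DataFrame in the `YFData` output above) are collapsed into a one-line summary. A minimal sketch of that idea, not the actual VBT implementation:

```python
def pformat(obj, indent=0):
    """Minimal recursive formatter in the spirit of vbt.pprint:
    dicts are expanded with indentation, while large sequences are
    summarized by type and length instead of dumped in full."""
    pad = "    " * indent
    if isinstance(obj, dict):
        items = ",\n".join(
            f"{pad}    {k!r}: {pformat(v, indent + 1).lstrip()}"
            for k, v in obj.items()
        )
        return f"{pad}dict(\n{items}\n{pad})"
    if isinstance(obj, (list, tuple)) and len(obj) > 3:
        return f"{pad}<{type(obj).__name__} of length {len(obj)}>"
    return pad + repr(obj)

print(pformat({"a": 1, "b": list(range(100))}))
```

VBT's engine additionally knows about its own types (`symbol_dict`, wrapped arrays, and so on), which is why its output names those classes directly.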
Meta methods¶
- Many methods, such as rolling apply, now come in two versions: regular (instance methods) and meta (class methods). Regular methods are bound to a single array and do not need metadata, while meta methods are not tied to any array and act as micro-pipelines with their own broadcasting and templating logic. Here, VBT solves one of the main Pandas limitations: the inability to apply a function to multiple arrays at once.
>>> @njit
... def zscore_nb(x): # (1)!
... return (x[-1] - np.mean(x)) / np.std(x)
>>> data = vbt.YFData.pull("BTC-USD", start="2020", end="2021")
>>> data.close.rolling(14).apply(zscore_nb, raw=True) # (2)!
Date
2020-01-01 00:00:00+00:00 NaN
...
2020-12-27 00:00:00+00:00 1.543527
2020-12-28 00:00:00+00:00 1.734715
2020-12-29 00:00:00+00:00 1.755125
2020-12-30 00:00:00+00:00 2.107147
2020-12-31 00:00:00+00:00 1.781800
Freq: D, Name: Close, Length: 366, dtype: float64
>>> data.close.vbt.rolling_apply(14, zscore_nb) # (3)!
2020-01-01 00:00:00+00:00 NaN
...
2020-12-27 00:00:00+00:00 1.543527
2020-12-28 00:00:00+00:00 1.734715
2020-12-29 00:00:00+00:00 1.755125
2020-12-30 00:00:00+00:00 2.107147
2020-12-31 00:00:00+00:00 1.781800
Freq: D, Name: Close, Length: 366, dtype: float64
>>> @njit
... def corr_meta_nb(from_i, to_i, col, a, b): # (4)!
... a_window = a[from_i:to_i, col]
... b_window = b[from_i:to_i, col]
... return np.corrcoef(a_window, b_window)[1, 0]
>>> data2 = vbt.YFData.pull(["ETH-USD", "XRP-USD"], start="2020", end="2021")
>>> vbt.pd_acc.rolling_apply( # (5)!
... 14,
... corr_meta_nb,
... vbt.Rep("a"),
... vbt.Rep("b"),
... broadcast_named_args=dict(a=data.close, b=data2.close)
... )
symbol ETH-USD XRP-USD
Date
2020-01-01 00:00:00+00:00 NaN NaN
... ... ...
2020-12-27 00:00:00+00:00 0.636862 -0.511303
2020-12-28 00:00:00+00:00 0.674514 -0.622894
2020-12-29 00:00:00+00:00 0.712531 -0.773791
2020-12-30 00:00:00+00:00 0.839355 -0.772295
2020-12-31 00:00:00+00:00 0.878897 -0.764446
[366 rows x 2 columns]
- Provides access to the window only.
- Using Pandas.
- Using the regular method, which accepts the same function as Pandas.
- Provides access to one or more entire arrays.
- Using the meta method, which accepts metadata and variable arguments.
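To make the regular-vs-meta distinction concrete, here is a minimal pure-NumPy sketch of a meta rolling apply (no Numba, no broadcasting): the user function receives the window bounds and column plus the full arrays, mirroring the `corr_meta_nb` signature above. `rolling_meta_apply` is a hypothetical helper for illustration, not the VBT API:

```python
import numpy as np

def rolling_meta_apply(window, meta_func, *arrs):
    """Sketch of a meta rolling apply: iterate over columns and windows,
    passing window bounds plus all full arrays to the user function."""
    arrs = [a.reshape(-1, 1) if a.ndim == 1 else a for a in map(np.asarray, arrs)]
    n_rows, n_cols = arrs[0].shape
    out = np.full((n_rows, n_cols), np.nan)  # leading rows stay NaN
    for col in range(n_cols):
        for i in range(window - 1, n_rows):
            out[i, col] = meta_func(i - window + 1, i + 1, col, *arrs)
    return out

def corr_meta(from_i, to_i, col, a, b):
    # Same shape of callback as corr_meta_nb above, minus @njit
    a_win = a[from_i:to_i, col]
    b_win = b[from_i:to_i, col]
    return np.corrcoef(a_win, b_win)[1, 0]

a = np.arange(20.0)
b = a * 2 + 1  # perfectly linearly related to a
res = rolling_meta_apply(5, corr_meta, a, b)
```

Because the callback sees the whole arrays rather than one pre-sliced window, it can combine as many inputs as it likes, which is exactly the Pandas limitation the meta methods work around.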
Array expressions¶
- When combining multiple arrays, they often need to be aligned and broadcast before the operation itself. Pandas alone often falls short because it can be too strict. Fortunately, VBT includes an accessor class method that can take a regular Python expression, identify all variable names, extract the arrays from the current context, broadcast them, and then evaluate the expression (with support for NumExpr!)
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"])
>>> low = data.low
>>> high = data.high
>>> bb = vbt.talib("BBANDS").run(data.close)
>>> upperband = bb.upperband
>>> lowerband = bb.lowerband
>>> bandwidth = (bb.upperband - bb.lowerband) / bb.middleband
>>> up_th = vbt.Param([0.3, 0.4])
>>> low_th = vbt.Param([0.1, 0.2])
>>> expr = """
... narrow_bands = bandwidth < low_th
... above_upperband = high > upperband
... wide_bands = bandwidth > up_th
... below_lowerband = low < lowerband
... (narrow_bands & above_upperband) | (wide_bands & below_lowerband)
... """
>>> mask = vbt.pd_acc.eval(expr)
>>> mask.sum()
low_th up_th symbol
0.1 0.3 BTC-USD 344
ETH-USD 171
0.4 BTC-USD 334
ETH-USD 158
0.2 0.3 BTC-USD 444
ETH-USD 253
0.4 BTC-USD 434
ETH-USD 240
dtype: int64
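The mechanics behind such an expression evaluator can be sketched in a few lines: extract the variable names from the compiled expression, pull the matching arrays from a context, broadcast them to a common shape, and evaluate. This uses plain `eval` rather than NumExpr and skips VBT's parameter handling, so treat it as an illustration only:

```python
import numpy as np

def eval_expr(expr, context):
    """Sketch of the idea behind vbt.pd_acc.eval: find the names the
    expression references, broadcast the matching arrays, evaluate."""
    names = compile(expr, "<expr>", "eval").co_names
    arrays = np.broadcast_arrays(*(np.asarray(context[n]) for n in names))
    local_ns = dict(zip(names, arrays))
    return eval(expr, {"__builtins__": {}}, local_ns)

mask = eval_expr(
    "(bandwidth < low_th) | (high > upperband)",
    dict(
        bandwidth=np.array([0.05, 0.5, 0.2]),
        low_th=0.1,  # scalar, broadcast to the array shape
        high=np.array([10.0, 11.0, 12.0]),
        upperband=np.array([11.0, 11.0, 11.0]),
    ),
)
```

The real method goes further: it aligns Pandas indexes, expands `vbt.Param` values into a product of columns, and can hand the final expression to NumExpr for speed.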
Resource management¶
- New profiling tools help you measure the execution time and memory usage of any code block
>>> data = vbt.YFData.pull("BTC-USD")
>>> with (
... vbt.Timer() as timer,
... vbt.MemTracer() as mem_tracer
... ):
... print(vbt.PF.from_random_signals(data.close, n=100).sharpe_ratio)
0.33111243921865163
>>> print(timer.elapsed())
74.15 milliseconds
>>> print(mem_tracer.peak_usage())
459.7 kB
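If you want the same measurements without VBT, the standard library already provides the building blocks. A rough stdlib equivalent of the `Timer`/`MemTracer` pair, built on `time.perf_counter` and `tracemalloc` (an assumption about comparable behavior, not the VBT implementation):

```python
import time
import tracemalloc
from contextlib import contextmanager

@contextmanager
def timer_and_mem():
    """Measure wall time and peak traced memory of a code block."""
    tracemalloc.start()
    start = time.perf_counter()
    stats = {}
    try:
        yield stats
    finally:
        stats["elapsed"] = time.perf_counter() - start
        _, stats["peak_bytes"] = tracemalloc.get_traced_memory()
        tracemalloc.stop()

with timer_and_mem() as stats:
    data = [i ** 2 for i in range(100_000)]

print(f"{stats['elapsed'] * 1e3:.2f} ms, {stats['peak_bytes'] / 1e3:.1f} kB")
```

Note that `tracemalloc` only tracks allocations made through Python's allocator, so memory allocated inside compiled extensions may not show up in the peak.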
Templates¶
- It is easy to extend classes, but since VBT revolves around functions, how do we enhance them or change their workflow? The easiest way is to introduce a small function (i.e., callback) that the user can provide and that the main function calls at some point. However, this would require the main function to know what arguments to pass to the callback and how to handle its outputs. Here is a better idea: allow most arguments of the main function to become callbacks, then execute those to obtain their actual values. These arguments are called "templates" and this process is known as "substitution". Templates are especially useful when some arguments (such as arrays) should be built only once all required information is available, for example, when other arrays have already been broadcast. Each substitution opportunity has its own identifier so you can control when a template should be substituted. In VBT, templates are first-class citizens and are integrated into most functions for unmatched flexibility!
>>> def resample_apply(index, by, apply_func, *args, template_context={}, **kwargs):
... grouper = index.vbt.get_grouper(by) # (1)!
... results = {}
... with vbt.ProgressBar() as pbar:
... for group, group_idxs in grouper: # (2)!
... group_index = index[group_idxs]
... context = {"group": group, "group_index": group_index, **template_context} # (3)!
... final_apply_func = vbt.substitute_templates(apply_func, context, eval_id="apply_func") # (4)!
... final_args = vbt.substitute_templates(args, context, eval_id="args")
... final_kwargs = vbt.substitute_templates(kwargs, context, eval_id="kwargs")
... results[group] = final_apply_func(*final_args, **final_kwargs)
... pbar.update()
... return pd.Series(results)
>>> data = vbt.YFData.pull(["BTC-USD", "ETH-USD"], missing_index="drop")
>>> resample_apply(
... data.index, "Y",
... lambda x, y: x.corr(y), # (5)!
... vbt.RepEval("btc_close[group_index]"), # (6)!
... vbt.RepEval("eth_close[group_index]"),
... template_context=dict(
... btc_close=data.get("Close", "BTC-USD"), # (7)!
... eth_close=data.get("Close", "ETH-USD")
... )
... )
2017 0.808930
2018 0.897112
2019 0.753659
2020 0.940741
2021 0.553255
2022 0.975911
2023 0.974914
Freq: A-DEC, dtype: float64
- Builds a grouper. Accepts both group-by and resample instructions.
- Iterates over groups in the grouper. Each group contains a label (such as `2017-01-01 00:00:00+00:00`) and the row indices corresponding to this label.
- Creates a new context with information about the current group and any external information provided by the user.
- Substitutes the function and arguments using the newly populated context.
- Simple function to compute the correlation coefficient between two arrays.
- Defines both arguments as expression templates where data is selected for each group. All variables in these expressions will be automatically recognized and replaced by the current context. After evaluation, the templates will be replaced by their outputs.
- Specifies any additional information your templates depend on.
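The substitution step itself boils down to a recursive walk over the arguments that swaps placeholders for context values. A bare-bones sketch of that mechanism (no eval IDs, no expression templates, and `Rep`/`substitute` here are simplified stand-ins, not the real VBT classes):

```python
class Rep:
    """Placeholder replaced by a context value (analogous to vbt.Rep)."""
    def __init__(self, key):
        self.key = key

def substitute(obj, context):
    """Recursively walk nested args/kwargs, replacing Rep placeholders
    with their context values; everything else passes through as-is."""
    if isinstance(obj, Rep):
        return context[obj.key]
    if isinstance(obj, (list, tuple)):
        return type(obj)(substitute(o, context) for o in obj)
    if isinstance(obj, dict):
        return {k: substitute(v, context) for k, v in obj.items()}
    return obj

args = (Rep("a"), [1, Rep("b")], {"c": Rep("a")})
resolved = substitute(args, {"a": 10, "b": 20})
```

The `RepEval` templates in the example above take this one step further by evaluating a whole expression against the context instead of looking up a single key.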
And many more...¶
- Look forward to more killer features being added every week!