Data Formats#

Market Data Files#

Market data is organized as one file per ticker. The filename must match the ticker symbol exactly:

  • AAPL.csv or AAPL.parquet

  • SPY.csv or SPY.parquet

This applies to portfolio tickers and the benchmark ticker.

Required columns#

Each file must contain a date column and at least three price columns. The engine uses them for three distinct roles:

Role

What the engine uses it for

Date

Trading dates for the time series

Commission price

Share count for commission fee calculation (typically unadjusted VWAP)

Trade execution price

Price at which shares are bought and sold (typically split/dividend-adjusted VWAP)

Mark-to-market price

Daily portfolio valuation between rebalancing dates (typically split/dividend-adjusted close)

Column names are fully configurable — you map them to these roles in the configuration file. Additional columns in the file are ignored.

Note

All three price roles can point to the same column if your data only has one price. The three-role separation is a recommendation that reflects how institutional trading works, not a hard constraint.

Example CSV#

date,vwap,vwap_adjusted,close_adjusted
2020-01-02,74.06,74.06,73.87
2020-01-03,74.34,74.34,73.07
2020-01-06,73.19,73.19,73.75

In this example the mappings would be:

  • commission_price_columnvwap

  • trade_execution_price_columnvwap_adjusted

  • mark_to_market_price_columnclose_adjusted

  • date_columndate

Supported formats#

Format

File naming

Notes

CSV

{TICKER}.csv

Headers required; dates as YYYY-MM-DD

Parquet

{TICKER}.parquet

Efficient for large datasets

Portfolio Files#

Portfolio files define asset allocation weights at each rebalancing date. The engine auto-detects horizontal or vertical orientation based on the first column name.

Horizontal format (Ticker × Dates)#

First column must be named ``Ticker``. Subsequent columns are rebalancing dates; values are decimal weights.

Ticker  | 2020-01-02 | 2022-01-03 | 2024-01-02
AAPL    | 0.40       | 0.35       | 0.30
MSFT    | 0.35       | 0.35       | 0.40
GOOGL   | 0.25       | 0.30       | 0.30

Vertical format (Dates × Tickers)#

First column must be named ``date``. Subsequent columns are ticker symbols.

date       | AAPL | MSFT | GOOGL
2020-01-02 | 0.40 | 0.35 | 0.25
2022-01-03 | 0.35 | 0.35 | 0.30
2024-01-02 | 0.30 | 0.40 | 0.30

Rules#

  • Weights are decimals and must sum to 1.0 or less per rebalancing date — the remainder is held as cash

  • Rebalancing dates must exist within the market data date range

  • No null values allowed

  • Supported file formats: .csv, .xlsx