Portfolio Helpers
Portfolio Helper Functions Module.
This module provides pure utility functions for portfolio operations. These functions perform data transformations without side effects and do not depend on service state or configuration.
All functions are stateless and can be used independently.
- class PortfolioProcessingResult(weights_dict: dict[Timestamp, dict[str, Scalar]], start_date_backtest: Timestamp, end_date_backtest: Timestamp)
Bases: object
Immutable container for portfolio processing pipeline output.
- Variables:
weights_dict (dict[pd.Timestamp, dict[str, pa.Scalar]]) – Processed weights dictionary, filtered by date range and with tickers that have zero weight on every date removed.
start_date_backtest (pd.Timestamp) – The earliest rebalancing date in the processed weights.
end_date_backtest (pd.Timestamp) – The latest rebalancing date in the processed weights.
- weights_dict: dict[Timestamp, dict[str, Scalar]]
- start_date_backtest: Timestamp
- end_date_backtest: Timestamp
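The source describes this container as immutable but does not say how that is enforced. A minimal sketch of one common way to get that behavior, assuming a frozen dataclass; `ProcessingResultSketch` is a hypothetical stand-in, and plain `datetime` and `float` replace `pd.Timestamp` and `pa.Scalar` so the block runs without dependencies:

```python
from dataclasses import FrozenInstanceError, dataclass
from datetime import datetime


@dataclass(frozen=True)
class ProcessingResultSketch:
    """Hypothetical stand-in for PortfolioProcessingResult (not the real class)."""

    weights_dict: dict[datetime, dict[str, float]]
    start_date_backtest: datetime
    end_date_backtest: datetime


weights = {
    datetime(2020, 6, 1): {"AAPL": 0.5},
    datetime(2020, 1, 1): {"AAPL": 0.6},
}
# The start and end dates are the earliest and latest rebalancing dates.
result = ProcessingResultSketch(weights, min(weights), max(weights))

# A frozen dataclass rejects attribute reassignment after construction.
try:
    result.start_date_backtest = datetime(2021, 1, 1)
    reassigned = True
except FrozenInstanceError:
    reassigned = False
```

Freezing only blocks reassignment of the three fields; the dictionary held in `weights_dict` is still a mutable object, so callers of such a sketch should treat it as read-only.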
- class PortfolioLoadResult(portfolio_entity: PortfolioEntity, portfolio_table: Table | None, portfolio_dict: dict[Timestamp, dict[str, Scalar]] | None, start_date_backtest: Timestamp | None, end_date_backtest: Timestamp | None)
Bases: object
Result object encapsulating portfolio loading and processing outcomes.
This class provides a clean interface for accessing portfolio data through getters instead of tuple unpacking, improving code readability and maintainability.
- Variables:
_portfolio_entity (PortfolioEntity) – The validated portfolio entity.
_portfolio_table (Optional[pa.Table]) – Raw table data (None when the portfolio was loaded from a dictionary).
_portfolio_dict (Optional[dict[pd.Timestamp, dict[str, pa.Scalar]]]) – Processed weights dictionary.
_start_date_backtest (Optional[pd.Timestamp]) – Start date for backtesting.
_end_date_backtest (Optional[pd.Timestamp]) – End date for backtesting.
- get_portfolio_entity() PortfolioEntity
Return the validated portfolio entity.
- Returns:
The portfolio entity containing tickers, date columns, and weight data.
- Return type:
PortfolioEntity
- get_portfolio_table() Table | None
Return the raw portfolio table.
- Returns:
Raw PyArrow table from file loading, or None when the portfolio was loaded from a dictionary.
- Return type:
pa.Table | None
- get_portfolio_dict() dict[Timestamp, dict[str, Scalar]]
Return the processed portfolio weights dictionary.
- Returns:
Mapping from rebalancing dates to ticker-weight dictionaries.
- Return type:
dict[pd.Timestamp, dict[str, pa.Scalar]]
- get_start_date_backtest() Timestamp
Return the backtest start date.
- Returns:
The earliest rebalancing date after processing.
- Return type:
pd.Timestamp
- get_end_date_backtest() Timestamp
Return the backtest end date.
- Returns:
The latest rebalancing date after processing.
- Return type:
pd.Timestamp
- property portfolio_entity: PortfolioEntity
Backwards compatibility property for portfolio entity.
- property portfolio_table: Table | None
Backwards compatibility property for portfolio table.
- property portfolio_dict: dict[Timestamp, dict[str, Scalar]] | None
Backwards compatibility property for processed weights dictionary.
- property start_date_backtest: Timestamp | None
Backwards compatibility property for backtest start date.
- property end_date_backtest: Timestamp | None
Backwards compatibility property for backtest end date.
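The getters and the backwards-compatibility properties expose the same data. A minimal sketch of how such properties could delegate to the getters, assuming a reduced, hypothetical `LoadResultSketch` with only two of the five fields; the real class may be implemented differently:

```python
from datetime import datetime


class LoadResultSketch:
    """Hypothetical, reduced stand-in for PortfolioLoadResult."""

    def __init__(self, portfolio_dict, start_date_backtest):
        self._portfolio_dict = portfolio_dict
        self._start_date_backtest = start_date_backtest

    def get_portfolio_dict(self):
        return self._portfolio_dict

    def get_start_date_backtest(self):
        return self._start_date_backtest

    @property
    def portfolio_dict(self):
        # Attribute-style access for older callers, delegating to the getter.
        return self.get_portfolio_dict()

    @property
    def start_date_backtest(self):
        return self.get_start_date_backtest()


loaded = LoadResultSketch(
    {datetime(2020, 1, 1): {"AAPL": 1.0}},
    datetime(2020, 1, 1),
)

# Both access styles return the same object.
same_dict = loaded.portfolio_dict is loaded.get_portfolio_dict()
```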
- extract_tickers_from_weights_dict(weights_dict: dict[Timestamp, dict[str, Scalar]]) list[str]
Extract tickers that have a strictly positive weight at least once.
This is a pure function that analyzes the weights dictionary and returns tickers that are active (have positive weight) at any point in time.
- Parameters:
weights_dict (dict[pd.Timestamp, dict[str, pa.Scalar]]) – Mapping from rebalancing date to “ticker -> weight”.
- Returns:
Sorted list of active tickers (alphabetical order).
- Return type:
list[str]
Notes
Tickers that never appear with positive weights are logged and excluded.
Example
>>> weights = {
...     pd.Timestamp('2020-01-01'): {'AAPL': 0.6, 'MSFT': 0.0},
...     pd.Timestamp('2020-01-02'): {'AAPL': 0.5, 'MSFT': 0.5}
... }
>>> extract_tickers_from_weights_dict(weights)
['AAPL', 'MSFT']
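The selection rule (strictly positive weight on at least one date, result sorted alphabetically) can be sketched as follows. This is an illustration under assumptions, not the module's implementation: `extract_active_tickers` is a hypothetical name, weights are plain floats rather than `pa.Scalar`, and the logging mentioned in the notes is omitted.

```python
from datetime import datetime


def extract_active_tickers(weights_dict):
    """Return the sorted tickers whose weight is > 0 on at least one date."""
    active = set()
    for weights in weights_dict.values():
        for ticker, weight in weights.items():
            if weight > 0:
                active.add(ticker)
    return sorted(active)


weights = {
    datetime(2020, 1, 1): {"AAPL": 0.6, "MSFT": 0.0, "NVDA": 0.0},
    datetime(2020, 1, 2): {"AAPL": 0.5, "MSFT": 0.5, "NVDA": 0.0},
}
# MSFT is kept because it is positive on the second date;
# NVDA is dropped because it is never positive.
tickers = extract_active_tickers(weights)
```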
- filter_weights_by_date_range(weights_dict: dict[Timestamp, dict[str, Scalar | float]], start_date: Timestamp | date, end_date: Timestamp | date) dict[Timestamp, dict[str, Scalar | float]]
Filter weights dictionary to include only dates within the specified range (inclusive).
This is a pure function that filters dates without modifying the original dictionary. All dates are normalized to midnight for consistent comparisons.
- Parameters:
weights_dict (dict[pd.Timestamp, dict[str, pa.Scalar | float]]) – Mapping from rebalancing date to “ticker -> weight”.
start_date (pd.Timestamp | date) – Inclusive lower bound.
end_date (pd.Timestamp | date) – Inclusive upper bound.
- Returns:
Filtered mapping with normalized dates.
- Return type:
dict[pd.Timestamp, dict[str, pa.Scalar | float]]
- Raises:
PortfolioEntityError – If no dates survive filtering. Includes diagnostic information.
Example
>>> weights = {
...     pd.Timestamp('2020-01-01'): {'AAPL': 0.6},
...     pd.Timestamp('2020-06-01'): {'AAPL': 0.5}
... }
>>> filtered = filter_weights_by_date_range(
...     weights,
...     pd.Timestamp('2020-01-01'),
...     pd.Timestamp('2020-03-31')
... )
>>> len(filtered)
1
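The inclusive, midnight-normalized comparison described above can be sketched with the standard library alone. The names `filter_by_date_range` and `_to_midnight` are hypothetical, `datetime` stands in for `pd.Timestamp`, and `ValueError` stands in for `PortfolioEntityError`, whose constructor is not shown in this document:

```python
from datetime import date, datetime


def _to_midnight(value):
    """Normalize a date or datetime to a datetime at 00:00:00."""
    if isinstance(value, datetime):  # check datetime first: it subclasses date
        return value.replace(hour=0, minute=0, second=0, microsecond=0)
    return datetime(value.year, value.month, value.day)


def filter_by_date_range(weights_dict, start_date, end_date):
    """Keep only dates in [start_date, end_date]; the input is left untouched."""
    start, end = _to_midnight(start_date), _to_midnight(end_date)
    filtered = {}
    for day, weights in weights_dict.items():
        normalized = _to_midnight(day)
        if start <= normalized <= end:
            filtered[normalized] = dict(weights)  # copy the inner mapping
    if not filtered:
        # Stand-in for PortfolioEntityError, with some diagnostic context.
        raise ValueError(f"no rebalancing dates between {start} and {end}")
    return filtered


weights = {
    datetime(2020, 1, 1, 9, 30): {"AAPL": 0.6},
    datetime(2020, 6, 1): {"AAPL": 0.5},
}
filtered = filter_by_date_range(weights, date(2020, 1, 1), date(2020, 3, 31))
```

Normalizing both the keys and the bounds is what makes the lower bound inclusive here: the 09:30 entry on the start date would otherwise be compared against a bare date.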
- filter_zero_weight_tickers(weights_dict: dict[Timestamp, dict[str, Scalar]]) dict[Timestamp, dict[str, Scalar]]
Filter out tickers that have zero weight across all rebalancing dates.
This is a pure function that does not mutate the original mapping and returns a new dictionary containing only tickers that are active (i.e., strictly positive weight at least once).
- Parameters:
weights_dict (dict[pd.Timestamp, dict[str, pa.Scalar]]) – Mapping from rebalancing date to an inner mapping ticker -> weight (pa.Scalar).
- Returns:
New dictionary that excludes tickers that consistently have weight 0.
- Return type:
dict[pd.Timestamp, dict[str, pa.Scalar]]
Notes
A ticker is considered active if it has weight > 0 for at least one date.
Logs the removed tickers for traceability.
Example
>>> weights = {
...     pd.Timestamp('2020-01-01'): {'AAPL': 0.6, 'MSFT': 0.0, 'GOOGL': 0.4},
...     pd.Timestamp('2020-01-02'): {'AAPL': 0.5, 'MSFT': 0.0, 'GOOGL': 0.5}
... }
>>> filtered = filter_zero_weight_tickers(weights)
>>> 'MSFT' in filtered[pd.Timestamp('2020-01-01')]
False
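A minimal sketch of the non-mutating filter, assuming plain float weights instead of `pa.Scalar` and omitting the logging of removed tickers; `drop_zero_weight_tickers` is a hypothetical name, not the module's function:

```python
from datetime import datetime


def drop_zero_weight_tickers(weights_dict):
    """Return a new mapping without tickers that are never strictly positive."""
    # First pass: collect every ticker with weight > 0 on at least one date.
    active = {
        ticker
        for weights in weights_dict.values()
        for ticker, weight in weights.items()
        if weight > 0
    }
    # Second pass: rebuild each inner mapping, keeping active tickers only.
    return {
        day: {t: w for t, w in weights.items() if t in active}
        for day, weights in weights_dict.items()
    }


weights = {
    datetime(2020, 1, 1): {"AAPL": 0.6, "MSFT": 0.0, "GOOGL": 0.4},
    datetime(2020, 1, 2): {"AAPL": 0.5, "MSFT": 0.0, "GOOGL": 0.5},
}
filtered = drop_zero_weight_tickers(weights)
```

Building new outer and inner dictionaries, rather than deleting keys in place, is what keeps the original mapping unchanged.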
- get_start_date_backtest(weights_dict: dict[Timestamp, dict[str, Scalar]]) Timestamp
Determine the earliest rebalancing date from a weights dictionary.
This is a pure function that finds the minimum date without side effects.
- Parameters:
weights_dict (dict[pd.Timestamp, dict[str, pa.Scalar]]) – Mapping from rebalancing date to “ticker -> weight”.
- Returns:
The earliest date available.
- Return type:
pd.Timestamp
- Raises:
PortfolioEntityError – If the mapping is empty or None.
Examples
>>> weights = {
...     pd.Timestamp('2020-06-01'): {'AAPL': 0.5},
...     pd.Timestamp('2020-01-01'): {'AAPL': 0.6}
... }
>>> get_start_date_backtest(weights)
Timestamp('2020-01-01 00:00:00')
- get_end_date_backtest(weights_dict: dict[Timestamp, dict[str, Scalar]]) Timestamp
Determine the latest rebalancing date from a weights dictionary.
Pure helper that returns max(weights_dict.keys()) after a non-empty check. Used by the pipeline to derive the effective end date of the backtest window when the user-provided end_date is "auto".
- Parameters:
weights_dict (dict[pandas.Timestamp, dict[str, pyarrow.Scalar]]) – Mapping from rebalancing date to {ticker: weight}. Must be non-empty.
- Returns:
The latest rebalancing date present in the mapping.
- Return type:
pandas.Timestamp
- Raises:
PortfolioEntityError – If the mapping is empty or None.
Examples
>>> weights = {
...     pd.Timestamp('2020-06-01'): {'AAPL': 0.5},
...     pd.Timestamp('2020-01-01'): {'AAPL': 0.6},
... }
>>> get_end_date_backtest(weights)
Timestamp('2020-06-01 00:00:00')
Notes
The function does not assume the mapping is ordered; it relies on max over the keys, so insertion order does not matter.
See also
get_start_date_backtest – Companion helper returning the earliest date.
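A minimal sketch of how a caller might resolve an "auto" end date with this helper. Both function names are hypothetical, `datetime` stands in for `pandas.Timestamp`, and `ValueError` stands in for `PortfolioEntityError`; how the real pipeline wires this up is not shown in this document.

```python
from datetime import datetime


def latest_rebalancing_date(weights_dict):
    """Return the latest key after a non-empty check."""
    if not weights_dict:  # covers both None and an empty mapping
        raise ValueError("weights_dict is empty or None")
    return max(weights_dict.keys())


def resolve_end_date(weights_dict, end_date):
    """Use the latest rebalancing date when end_date is the string 'auto'."""
    if end_date == "auto":
        return latest_rebalancing_date(weights_dict)
    return end_date


# Keys are deliberately inserted out of chronological order.
weights = {
    datetime(2020, 6, 1): {"AAPL": 0.5},
    datetime(2020, 1, 1): {"AAPL": 0.6},
}
auto_end = resolve_end_date(weights, "auto")
explicit_end = resolve_end_date(weights, datetime(2020, 3, 31))
```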