Portfolio Helpers

Portfolio Helper Functions Module.

This module provides pure utility functions for portfolio operations. These functions perform data transformations without side effects and do not depend on service state or configuration.

All functions are stateless and can be used independently.

class PortfolioProcessingResult(weights_dict: dict[Timestamp, dict[str, Scalar]], start_date_backtest: Timestamp, end_date_backtest: Timestamp)

Bases: object

Immutable container for portfolio processing pipeline output.

Variables:
  • weights_dict (dict[pd.Timestamp, dict[str, pa.Scalar]]) – Processed weights dictionary filtered by date range and without zero-weight-only tickers.

  • start_date_backtest (pd.Timestamp) – The earliest rebalancing date in the processed weights.

  • end_date_backtest (pd.Timestamp) – The latest rebalancing date in the processed weights.

weights_dict: dict[Timestamp, dict[str, Scalar]]
start_date_backtest: Timestamp
end_date_backtest: Timestamp
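
One common way to build an immutable container of this shape is a frozen dataclass. The sketch below is an assumption about the approach, not the module's actual definition: `ProcessingResultSketch` and `build_result` are hypothetical names, and `datetime` stands in for `pd.Timestamp` (which subclasses it) so the example needs only the standard library.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime


@dataclass(frozen=True)
class ProcessingResultSketch:
    """Hypothetical stand-in for PortfolioProcessingResult."""

    weights_dict: dict
    start_date_backtest: datetime
    end_date_backtest: datetime


def build_result(weights_dict: dict) -> ProcessingResultSketch:
    # Derive both bounds from the keys; insertion order is irrelevant.
    return ProcessingResultSketch(
        weights_dict=weights_dict,
        start_date_backtest=min(weights_dict),
        end_date_backtest=max(weights_dict),
    )


weights = {
    datetime(2020, 6, 1): {"AAPL": 0.5},
    datetime(2020, 1, 1): {"AAPL": 0.6},
}
result = build_result(weights)
```

Note that freezing only blocks rebinding of the three attributes. The nested dictionaries remain mutable, so callers that need deep immutability would have to copy or wrap them.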
class PortfolioLoadResult(portfolio_entity: PortfolioEntity, portfolio_table: Table | None, portfolio_dict: dict[Timestamp, dict[str, Scalar]] | None, start_date_backtest: Timestamp | None, end_date_backtest: Timestamp | None)

Bases: object

Result object encapsulating portfolio loading and processing outcomes.

This class provides a clean interface for accessing portfolio data through getters instead of tuple unpacking, improving code readability and maintainability.

Variables:
  • _portfolio_entity (PortfolioEntity) – The validated portfolio entity.

  • _portfolio_table (Optional[pa.Table]) – Raw table data (None for dict loading).

  • _portfolio_dict (Optional[dict[pd.Timestamp, dict[str, pa.Scalar]]]) – Processed weights dictionary.

  • _start_date_backtest (Optional[pd.Timestamp]) – Start date for backtesting.

  • _end_date_backtest (Optional[pd.Timestamp]) – End date for backtesting.

get_portfolio_entity() PortfolioEntity

Return the validated portfolio entity.

Returns:

The portfolio entity containing tickers, date columns, and weight data.

Return type:

PortfolioEntity

get_portfolio_table() Table | None

Return the raw portfolio table.

Returns:

Raw PyArrow table from file loading, or None when the portfolio was loaded from a dictionary.

Return type:

pa.Table | None

get_portfolio_dict() dict[Timestamp, dict[str, Scalar]]

Return the processed portfolio weights dictionary.

Returns:

Mapping from rebalancing dates to ticker-weight dictionaries.

Return type:

dict[pd.Timestamp, dict[str, pa.Scalar]]

get_start_date_backtest() Timestamp

Return the backtest start date.

Returns:

The earliest rebalancing date after processing.

Return type:

pd.Timestamp

get_end_date_backtest() Timestamp

Return the backtest end date.

Returns:

The latest rebalancing date after processing.

Return type:

pd.Timestamp

property portfolio_entity: PortfolioEntity

Backwards compatibility property for portfolio entity.

property portfolio_table: Table | None

Backwards compatibility property for portfolio table.

property portfolio_dict: dict[Timestamp, dict[str, Scalar]] | None

Backwards compatibility property for processed weights dictionary.

property start_date_backtest: Timestamp | None

Backwards compatibility property for backtest start date.

property end_date_backtest: Timestamp | None

Backwards compatibility property for backtest end date.
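
The getter-plus-property layout described above can be sketched as follows. This is an illustration under stated assumptions, not the class's real source: `LoadResultSketch` is a hypothetical name, only two of the five fields are shown, and `datetime` stands in for `pd.Timestamp`.

```python
from datetime import datetime
from typing import Optional


class LoadResultSketch:
    """Hypothetical, reduced stand-in for PortfolioLoadResult."""

    def __init__(
        self,
        portfolio_dict: Optional[dict],
        start_date_backtest: Optional[datetime],
    ) -> None:
        self._portfolio_dict = portfolio_dict
        self._start_date_backtest = start_date_backtest

    # Getter-style access.
    def get_portfolio_dict(self) -> Optional[dict]:
        return self._portfolio_dict

    def get_start_date_backtest(self) -> Optional[datetime]:
        return self._start_date_backtest

    # Property-style access, kept for callers written against attributes.
    @property
    def portfolio_dict(self) -> Optional[dict]:
        return self.get_portfolio_dict()

    @property
    def start_date_backtest(self) -> Optional[datetime]:
        return self.get_start_date_backtest()


loaded = LoadResultSketch(
    portfolio_dict={datetime(2020, 1, 1): {"AAPL": 1.0}},
    start_date_backtest=datetime(2020, 1, 1),
)
```

Having the properties delegate to the getters keeps a single code path, so both access styles return the same objects.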

extract_tickers_from_weights_dict(weights_dict: dict[Timestamp, dict[str, Scalar]]) list[str]

Extract tickers that have a strictly positive weight at least once.

This is a pure function that analyzes the weights dictionary and returns tickers that are active (have positive weight) at any point in time.

Parameters:

weights_dict (dict[pd.Timestamp, dict[str, pa.Scalar]]) – Mapping from rebalancing date to “ticker -> weight”.

Returns:

Sorted list of active tickers (alphabetical order).

Return type:

list[str]

Notes

Tickers that never appear with positive weights are logged and excluded.

Example

>>> weights = {
...     pd.Timestamp('2020-01-01'): {'AAPL': 0.6, 'MSFT': 0.0},
...     pd.Timestamp('2020-01-02'): {'AAPL': 0.5, 'MSFT': 0.5}
... }
>>> extract_tickers_from_weights_dict(weights)
['AAPL', 'MSFT']
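
For readers who want to see the rule spelled out, here is a minimal sketch of "strictly positive at least once, returned sorted". The names `extract_active_tickers` and `_as_float` are hypothetical. The helper accepts plain floats as well as objects exposing `as_py()` (as `pyarrow` scalars do), so the example runs without third-party packages, and `datetime` stands in for `pd.Timestamp`.

```python
from datetime import datetime


def _as_float(weight) -> float:
    # pyarrow scalars expose as_py(); plain numbers are used directly.
    as_py = getattr(weight, "as_py", None)
    return float(as_py()) if callable(as_py) else float(weight)


def extract_active_tickers(weights_dict: dict) -> list:
    # A ticker is active if any date gives it a strictly positive weight.
    active = {
        ticker
        for day_weights in weights_dict.values()
        for ticker, weight in day_weights.items()
        if _as_float(weight) > 0
    }
    return sorted(active)


weights = {
    datetime(2020, 1, 1): {"AAPL": 0.6, "MSFT": 0.0, "TSLA": 0.0},
    datetime(2020, 1, 2): {"AAPL": 0.5, "MSFT": 0.5, "TSLA": 0.0},
}
tickers = extract_active_tickers(weights)  # ['AAPL', 'MSFT']
```

`TSLA` is dropped because its weight is zero on every date, while `MSFT` is kept because it becomes positive on the second date.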
filter_weights_by_date_range(weights_dict: dict[Timestamp, dict[str, Scalar | float]], start_date: Timestamp | date, end_date: Timestamp | date) dict[Timestamp, dict[str, Scalar | float]]

Filter weights dictionary to include only dates within the specified range (inclusive).

This is a pure function that filters dates without modifying the original dictionary. All dates are normalized to midnight for consistent comparisons.

Parameters:
  • weights_dict (dict[pd.Timestamp, dict[str, pa.Scalar | float]]) – Mapping from rebalancing date to “ticker -> weight”.

  • start_date (pd.Timestamp | date) – Inclusive lower bound.

  • end_date (pd.Timestamp | date) – Inclusive upper bound.

Returns:

Filtered mapping with normalized dates.

Return type:

dict[pd.Timestamp, dict[str, pa.Scalar | float]]

Raises:

PortfolioEntityError – If no dates fall within the range. The error message includes diagnostic information.

Example

>>> weights = {
...     pd.Timestamp('2020-01-01'): {'AAPL': 0.6},
...     pd.Timestamp('2020-06-01'): {'AAPL': 0.5}
... }
>>> filtered = filter_weights_by_date_range(
...     weights,
...     pd.Timestamp('2020-01-01'),
...     pd.Timestamp('2020-03-31')
... )
>>> len(filtered)
1
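
The midnight normalization matters when a rebalancing timestamp carries an intraday time: without it, a key such as 2020-03-31 16:00 would fall outside an upper bound of 2020-03-31. The sketch below illustrates the idea under several assumptions: `filter_by_range` and `_to_midnight` are hypothetical names, `datetime` stands in for `pd.Timestamp`, values are timezone-naive, and `ValueError` stands in for `PortfolioEntityError`.

```python
from datetime import date, datetime, time


def _to_midnight(value) -> datetime:
    # datetime is a subclass of date, so reduce it to its calendar day first.
    day = value.date() if isinstance(value, datetime) else value
    return datetime.combine(day, time.min)


def filter_by_range(weights_dict: dict, start_date, end_date) -> dict:
    lower, upper = _to_midnight(start_date), _to_midnight(end_date)
    filtered = {}
    for day, day_weights in weights_dict.items():
        key = _to_midnight(day)
        if lower <= key <= upper:  # both bounds are inclusive
            filtered[key] = dict(day_weights)  # copy; the input is untouched
    if not filtered:
        raise ValueError(
            f"no rebalancing dates between {lower:%Y-%m-%d} and {upper:%Y-%m-%d}"
        )
    return filtered


weights = {
    datetime(2020, 1, 1, 9, 30): {"AAPL": 0.6},
    datetime(2020, 3, 31, 16, 0): {"AAPL": 0.55},
    datetime(2020, 6, 1): {"AAPL": 0.5},
}
filtered = filter_by_range(weights, date(2020, 1, 1), date(2020, 3, 31))
```

Here `filtered` holds two entries, keyed by midnight on 2020-01-01 and 2020-03-31, and the original `weights` mapping keeps its three intraday keys.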
filter_zero_weight_tickers(weights_dict: dict[Timestamp, dict[str, Scalar]]) dict[Timestamp, dict[str, Scalar]]

Filter out tickers that have zero weight across all rebalancing dates.

This is a pure function that does not mutate the original mapping and returns a new dictionary containing only tickers that are active (i.e., strictly positive weight at least once).

Parameters:

weights_dict (dict[pd.Timestamp, dict[str, pa.Scalar]]) – Mapping from rebalancing date to an inner mapping ticker -> weight (pa.Scalar).

Returns:

New dictionary that excludes tickers whose weight is 0 on every rebalancing date.

Return type:

dict[pd.Timestamp, dict[str, pa.Scalar]]

Notes

  • A ticker is considered active if it has weight > 0 for at least one date.

  • Logs the removed tickers for traceability.

Example

>>> weights = {
...     pd.Timestamp('2020-01-01'): {'AAPL': 0.6, 'MSFT': 0.0, 'GOOGL': 0.4},
...     pd.Timestamp('2020-01-02'): {'AAPL': 0.5, 'MSFT': 0.0, 'GOOGL': 0.5}
... }
>>> filtered = filter_zero_weight_tickers(weights)
>>> 'MSFT' in filtered[pd.Timestamp('2020-01-01')]
False
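
A non-mutating version of this filter can be sketched in two passes: collect the active tickers, then rebuild each inner mapping. The names `drop_inactive_tickers` and `_as_float` are hypothetical, the logging mentioned in the notes is omitted, and `datetime` and plain floats stand in for `pd.Timestamp` and `pa.Scalar`.

```python
from datetime import datetime


def _as_float(weight) -> float:
    # pyarrow scalars expose as_py(); plain numbers are used directly.
    as_py = getattr(weight, "as_py", None)
    return float(as_py()) if callable(as_py) else float(weight)


def drop_inactive_tickers(weights_dict: dict) -> dict:
    # Pass 1: tickers with a strictly positive weight on at least one date.
    active = {
        ticker
        for day_weights in weights_dict.values()
        for ticker, weight in day_weights.items()
        if _as_float(weight) > 0
    }
    # Pass 2: build new inner dicts so the caller's mapping is not modified.
    return {
        day: {t: w for t, w in day_weights.items() if t in active}
        for day, day_weights in weights_dict.items()
    }


weights = {
    datetime(2020, 1, 1): {"AAPL": 0.6, "MSFT": 0.0, "GOOGL": 0.4},
    datetime(2020, 1, 2): {"AAPL": 0.5, "MSFT": 0.0, "GOOGL": 0.5},
}
filtered = drop_inactive_tickers(weights)
```

A ticker that is zero on some dates but positive on others keeps all of its entries, including the zero ones, because the decision is made per ticker rather than per cell.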
get_start_date_backtest(weights_dict: dict[Timestamp, dict[str, Scalar]]) Timestamp

Determine the earliest rebalancing date from a weights dictionary.

This is a pure function that finds the minimum date without side effects.

Parameters:

weights_dict (dict[pd.Timestamp, dict[str, pa.Scalar]]) – Mapping from rebalancing date to “ticker -> weight”.

Returns:

The earliest date available.

Return type:

pd.Timestamp

Raises:

PortfolioEntityError – If the mapping is empty or None.

Examples

>>> weights = {
...     pd.Timestamp('2020-06-01'): {'AAPL': 0.5},
...     pd.Timestamp('2020-01-01'): {'AAPL': 0.6}
... }
>>> get_start_date_backtest(weights)
Timestamp('2020-01-01 00:00:00')
get_end_date_backtest(weights_dict: dict[Timestamp, dict[str, Scalar]]) Timestamp

Determine the latest rebalancing date from a weights dictionary.

Pure helper that returns max(weights_dict.keys()) after a non-empty check. Used by the pipeline to derive the effective end date of the backtest window when the user-provided end_date is "auto".

Parameters:

weights_dict (dict[pandas.Timestamp, dict[str, pyarrow.Scalar]]) – Mapping from rebalancing date to {ticker: weight}. Must be non-empty.

Returns:

The latest rebalancing date present in the mapping.

Return type:

pandas.Timestamp

Raises:

PortfolioEntityError – If the mapping is empty or None.

Examples

>>> weights = {
...     pd.Timestamp('2020-06-01'): {'AAPL': 0.5},
...     pd.Timestamp('2020-01-01'): {'AAPL': 0.6},
... }
>>> get_end_date_backtest(weights)
Timestamp('2020-06-01 00:00:00')

Notes

The function does not assume the mapping is ordered; it relies on max over the keys so insertion order does not matter.

See also

get_start_date_backtest

Companion helper returning the earliest date.
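
Both date helpers reduce to a non-empty check followed by `min` or `max` over the keys. The sketch below combines them for brevity. `date_bounds` is a hypothetical name, `ValueError` stands in for `PortfolioEntityError`, and `datetime` stands in for `pd.Timestamp`.

```python
from datetime import datetime


def date_bounds(weights_dict) -> tuple:
    # "not weights_dict" is true for both None and an empty mapping.
    if not weights_dict:
        raise ValueError("weights mapping is empty or None")
    # Iterating a dict yields its keys, so min/max compare the dates.
    return min(weights_dict), max(weights_dict)


weights = {
    datetime(2020, 6, 1): {"AAPL": 0.5},
    datetime(2020, 1, 1): {"AAPL": 0.6},
}
start, end = date_bounds(weights)
```

Because the bounds come from `min` and `max` rather than from the first and last inserted keys, the unordered input above still yields 2020-01-01 as the start and 2020-06-01 as the end.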