Welcome to Factor Pricing Model Universe documentation!

Factor Pricing Model Universe

A package for building universes for factor pricing models. For further details, please refer to the documentation.

Installation

Install this via pip (or your favourite package manager):

pip install factor-pricing-model-universe

Usage

The library contains the pipelines used to build the universe. You can run the pipelines interactively in a Jupyter notebook.

from fpm_universe import pipeline

Alternatively, for scheduled runs, you can create a configuration and run the command line entry point to create the universe.

Configuration

The configuration is in YAML format and contains the following inputs:

Name	Description
`output_filename`	Output filename
`intermediate_directory`	Intermediate directory to export the pipeline outputs
`start_datetime`	Start datetime of the universe
`last_datetime`	Last datetime of the universe
`frequency`	Frequency of the universe. For further details, please see the “Offset aliases” in pandas documentation
`pipeline`	List of pipelines to filter the universe
`data`	Defines the data used by the pipelines, or referenced by the YAML tag `!data`
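
To illustrate how `start_datetime`, `last_datetime`, and `frequency` might translate into the dates of the universe, here is a minimal sketch using pandas `date_range`. The values are hypothetical, and whether the library builds its index exactly this way (for example, with both endpoints inclusive) is an assumption.

```python
import pandas as pd

# Hypothetical values mirroring the configuration keys above.
start_datetime = "2022-11-17"
last_datetime = "2022-11-22"
frequency = "B"  # pandas offset alias for business days

# Assumption: the universe index runs from start to last, both inclusive.
index = pd.date_range(start=start_datetime, end=last_datetime, freq=frequency)

# The weekend of 2022-11-19 and 2022-11-20 is skipped by the "B" alias.
dates = [d.strftime("%Y-%m-%d") for d in index]
print(dates)
```

Other offset aliases, such as `"D"` for calendar days, can be swapped in the same way.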

Each pipeline returns a pandas dataframe indicating whether each instrument is included in the universe on the specified date / time. For example, the pipeline returns the following dataframe

+------------+--------+-------+
|    date    |  AAPL  | GOOGL |
+------------+--------+-------+
| 2022-11-17 |  True  | False |
+------------+--------+-------+
| 2022-11-18 |  True  |  True |
+------------+--------+-------+

and it indicates AAPL is included in the universe on both 2022-11-17 and 2022-11-18 while GOOGL only on 2022-11-18.
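
The dataframe above can be reproduced in pandas as a quick sketch. The second dataframe (`liquidity`) is hypothetical, and combining pipeline outputs with a logical AND is an assumption about how a list of filters might be applied, not a statement of the library's behaviour.

```python
import pandas as pd

dates = pd.DatetimeIndex(["2022-11-17", "2022-11-18"], name="date")

# The example pipeline output from the table above.
validity = pd.DataFrame({"AAPL": [True, True], "GOOGL": [False, True]}, index=dates)

# Hypothetical output of a second pipeline over the same dates and symbols.
liquidity = pd.DataFrame({"AAPL": [True, False], "GOOGL": [True, True]}, index=dates)

# Assumption: an instrument stays in the universe only if every pipeline keeps it.
universe = validity & liquidity

def members_on(frame: pd.DataFrame, date: str) -> list:
    """Return the symbols flagged True on the given date."""
    row = frame.loc[pd.Timestamp(date)]
    return row[row].index.tolist()

print(members_on(validity, "2022-11-17"))
print(members_on(universe, "2022-11-18"))
```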

By default, the pipeline functions are imported from module fpm_universe.pipeline.

Each data entry defines either a method to retrieve data from a source or an operation on the source data. The return type of a data entry is unconstrained: it can be a JSON-like dict, a list, a pandas series, or even a pandas dataframe.

In the configuration, each data entry can be referenced by the YAML tag !data. It is loaded lazily, only when another data entry or a pipeline refers to it.
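
The lazy loading described above can be pictured with a small pure-Python sketch. The names `DataRef` and `LazyDataStore` are hypothetical and are not part of the library; the sketch only shows the idea of resolving a reference on first use and caching the result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataRef:
    """Hypothetical stand-in for a `!data <name>` reference."""
    name: str


class LazyDataStore:
    """Evaluates a data entry only when it is first requested."""

    def __init__(self, definitions):
        self._definitions = definitions  # name -> (function, parameters)
        self._cache = {}
        self.load_order = []  # records which entries were actually loaded

    def get(self, name):
        if name not in self._cache:
            function, parameters = self._definitions[name]
            # Resolve references to other data entries before calling.
            resolved = {
                key: self.get(value.name) if isinstance(value, DataRef) else value
                for key, value in parameters.items()
            }
            self._cache[name] = function(**resolved)
            self.load_order.append(name)
        return self._cache[name]


definitions = {
    "symbols": (lambda: ["AAPL", "GOOGL"], {}),
    "initial_validity": (
        lambda symbol: {s: True for s in symbol},
        {"symbol": DataRef("symbols")},
    ),
    "unused": (lambda: "never loaded", {}),
}

store = LazyDataStore(definitions)
result = store.get("initial_validity")
print(store.load_order)
```

Because nothing refers to `unused`, it is never evaluated; `symbols` is loaded only because `initial_validity` depends on it.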

Command

The entry point factor-pricing-model-universe generates the universe from the given configuration and writes it to the configured destination. Parameters passed on the command line are substituted into the configuration at run time.

The arguments of the entry point are:

Argument	Description
`-c, --config TEXT`	Required. Configuration file path.
`-p, --parameter TEXT`	Parameters to be formatted in the configuration.

For example, given the configuration as follows,

output_filename: "{output_directory}/{date}.parquet"
intermediate_directory: "{output_directory}/{date}"
start_datetime: "2015-01-01"
last_datetime: "{date}"
frequency: "B"
pipeline:
  - name: range_validity
    function: range_validity
    parameters:
      values: !data initial_validity
data:
  symbols:
    function: jq_compile
    parameters:
      json_filename: "{data_directory}/index/sp500/default/{date}.json"
      pattern: "[.[] | .tickers[]] | sort | unique | .[]"
  initial_validity:
    function: jq_compile
    parameters:
      json_filename: "{data_directory}/listings/{date}.json"
      pattern: ".[] | {{ symbol: .symbol, valid_start_datetime: .ipoDate, valid_last_datetime: .delistingDate }}"
      includes:
        symbol: !data symbols

and run the following command

factor-pricing-model-universe \
  --config <path> \
  --parameter output_directory=$HOME/output \
  --parameter data_directory=$HOME/data \
  --parameter date=2022-10-20

the universe dataframe is written to $HOME/output/2022-10-20.parquet (formatted with the parameters output_directory and date).
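
The substitution in the example above can be sketched with Python's `str.format`. That the library uses `str.format` is an assumption, suggested by the doubled braces `{{ }}` in the jq pattern, which `str.format` turns into literal braces. The helper `parse_parameters` and the path `/home/user` are hypothetical.

```python
def parse_parameters(items):
    """Split `key=value` strings, as they might arrive from `--parameter`."""
    parameters = {}
    for item in items:
        # Split on the first "=" only, so values may contain "=" themselves.
        key, value = item.split("=", 1)
        parameters[key] = value
    return parameters


parameters = parse_parameters([
    "output_directory=/home/user/output",
    "data_directory=/home/user/data",
    "date=2022-10-20",
])

# Placeholders in single braces are filled from the parameters.
output_filename = "{output_directory}/{date}.parquet".format(**parameters)

# Doubled braces are an escape and come out as literal single braces.
pattern = ".[] | {{ symbol: .symbol }}".format(**parameters)

print(output_filename)
print(pattern)
```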

More details…

Installation & Usage

Generate Universe

Project Info