API reference

Inputs

Set parameters and options

inputs.parameters_and_options.compute_agricultural_rent(rent, scale_fact, interest_rate, param, options)

Convert agricultural land price into theoretical annual housing rent.

The conversion leverages the zero profit condition for formal private developers in equilibrium.

Parameters

rent (float) – Parametric agricultural land price at baseline year (2011)
scale_fact (float) – (Calibrated) scale factor for the construction function of formal private developers
interest_rate (float) – Real interest rate for the overall economy, corresponding to an average over past years
param (dict) – Dictionary of default parameters
options (dict) – Dictionary of default options

Returns

agricultural_rent – Theoretical agricultural (annual) rent (corresponds to opportunity cost of non-urbanized land)

Return type

float64
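
The zero-profit conversion can be illustrated with a minimal sketch. It assumes a Cobb-Douglas construction function (housing per unit of land h = scale_fact * k**coeff_b, with k the capital per unit of land) and treats the land price as the per-period cost of land. The function names, the coeff_b and depreciation_rate arguments, and their default values are hypothetical, and the formula used in the package may differ.

```python
def agricultural_rent_sketch(land_cost, scale_fact, interest_rate,
                             coeff_b=0.25, depreciation_rate=0.025):
    """Housing rent at which a developer's maximized profit equals the land cost.

    Assumption: housing per unit of land is scale_fact * k**coeff_b, and the
    user cost of capital is interest_rate + depreciation_rate.
    """
    coeff_a = 1.0 - coeff_b
    user_cost = interest_rate + depreciation_rate
    return (land_cost ** coeff_a * user_cost ** coeff_b
            / (scale_fact * coeff_a ** coeff_a * coeff_b ** coeff_b))


def max_profit_per_unit_land(rent, scale_fact, interest_rate,
                             coeff_b=0.25, depreciation_rate=0.025):
    """Developer profit (before paying for land) at the optimal capital level."""
    coeff_a = 1.0 - coeff_b
    user_cost = interest_rate + depreciation_rate
    # First-order condition: coeff_b * rent * scale_fact * k**(coeff_b - 1) = user_cost
    k_star = (coeff_b * rent * scale_fact / user_cost) ** (1.0 / coeff_a)
    return rent * scale_fact * k_star ** coeff_b - user_cost * k_star
```

Under these assumptions, plugging the rent returned by agricultural_rent_sketch into max_profit_per_unit_land gives back the land cost, which is the zero-profit condition: below that rent, urbanizing the land is not profitable.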

inputs.parameters_and_options.import_construction_parameters(param, grid, housing_types_sp, dwelling_size_sp, mitchells_plain_grid_baseline, grid_formal_density_HFA, coeff_land, interest_rate, options)

Update default parameters with construction parameters.

Import set of numerical construction-related parameters used in the model. They depend on pre-loaded data and are therefore imported as part of a separate function.

Parameters

param (dict) – Dictionary of default parameters
grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
housing_types_sp (DataFrame) – Table yielding, for each Small Place (1,046), the number of informal backyards, informal settlements, and total number of dwelling units at baseline year (2011), as well as its x and y (centroid) coordinates
dwelling_size_sp (Series) – Average dwelling size (in m²) in each Small Place (1,046) at baseline year (2011)
mitchells_plain_grid_baseline (ndarray(uint8)) – Dummy coding for belonging to Mitchells Plain neighbourhood at the grid-cell (24,014) level
grid_formal_density_HFA (ndarray(float64)) – Population density (per m²) in formal private housing at baseline year (2011) at the grid-cell (24,014) level
coeff_land (ndarray(float64, ndim=2)) – Table yielding, for each grid cell (24,014), the percentage of land area available for construction in each housing type (4) respectively. In the order: formal private housing, informal backyards, informal settlements, formal subsidized housing.
interest_rate (float64) – Real interest rate for the overall economy, corresponding to an average over past years
options (dict) – Dictionary of default options

Returns

param (dict) – Updated dictionary of default parameters
minimum_housing_supply (ndarray(float64)) – Minimum housing supply (in m²) for each grid cell (24,014), allowing for an ad hoc correction of low values in Mitchells Plain
agricultural_rent (float64) – Annual housing rent below which it is not profitable for formal private developers to urbanize (agricultural) land: endogenously limits urban sprawl

inputs.parameters_and_options.import_options()

Import default options.

Import set of numerical values coding for options used in the model. They fall into the following groups: structural assumptions regarding agents’ behaviour, assumptions about different land uses, and options governing the flood data used, the re-processing of input data, the calibration process, math corrections relative to the original code, and the scenarios used for time-moving exogenous variables.

Returns: options – Dictionary of default options
Return type: dict

inputs.parameters_and_options.import_param(path_precalc_inp, options)

Import default parameters.

Import set of numerical parameters used in the model. Some parameters are the output of a calibration process: this is the case for construction function parameters, incomes and the associated gravity parameter, utility function parameters, and the disamenity index for informal housing. Other parameters are defined ad hoc, based on existing empirical evidence.

Parameters

path_precalc_inp (str) – Path for precalculated input data (calibrated parameters)
options (dict) – Dictionary of default options

Returns

param – Dictionary of default parameters

Return type

dict

Import data

inputs.data.compute_fraction_capital_destroyed(d, type_flood, damage_function, housing_type, options)

Compute expected fraction of capital destroyed by floods.

To go from discrete to continuous estimates of flood damages, we integrate linearly between the available return periods (each understood as the inverse of an annual event probability). This function does so for a given flood type, housing type, and damage function.

Parameters

d (dict) – Dictionary of data frames yielding flood maps (maximum flood depth + fraction of flood-prone area for each grid cell) for each available return period, for a given flood type
type_flood (str) – Code for flood type considered (FU for fluvial undefended, FD for fluvial defended, P for pluvial, C for coastal)
damage_function (interp1d) – Linear interpolation for fraction of capital destroyed over maximum flood depth (in m) in a given area, for a given capital type
housing_type (str) – Housing type considered (to apply some ad hoc corrections). Should be set to “formal”, “subsidized”, “backyard”, or “informal”.
options (dict) – Dictionary of default options

Returns

Expected fraction of capital destroyed

Return type

float64
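
The integration over return periods can be sketched as a trapezoidal rule over annual exceedance probabilities. This is a simplified illustration for a single grid cell: the function name is hypothetical, and the tails beyond the rarest and the most frequent available events, as well as the ad hoc corrections per housing type, are ignored.

```python
def expected_annual_fraction(return_periods, damage_fractions):
    """Integrate damages linearly between the available return periods.

    return_periods: return periods in years (e.g. 10, 100)
    damage_fractions: fraction of capital destroyed for each return period
    """
    # A return period T corresponds to an annual event probability of 1 / T.
    points = sorted((1.0 / rp, dmg)
                    for rp, dmg in zip(return_periods, damage_fractions))
    total = 0.0
    # Trapezoid between each pair of consecutive probabilities.
    for (p0, d0), (p1, d1) in zip(points, points[1:]):
        total += 0.5 * (d0 + d1) * (p1 - p0)
    return total
```

For example, with no damage for the 10-year event and half of the capital destroyed by the 100-year event, the expected annual fraction destroyed is 0.5 * 0.5 * (0.1 - 0.01) = 0.0225.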

inputs.data.convert_income_distribution(income_distribution, grid, path_data, data_sp)

Convert SP data for income distribution into grid dimensions.

This is used for validation as Small-Area-Level estimates are not available for population distribution across income groups.

Parameters

income_distribution (ndarray(uint16, ndim=2)) – Exogenous number of households in each Small Place (1,046) for each income group in the model (4)
grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
path_data (str) – Path towards data used in the model
data_sp (DataFrame) – Table yielding, for each Small Place (1,046), the average dwelling size (in m²), the average land price and annual income level (in rands), the size of unconstrained area for construction (in m²), the total area (in km²), the distance to the city centre (in km), whether or not the location belongs to Mitchells Plain, and the SP code

Returns

income_grid – Exogenous number of households in each grid cell (24,014) for each income group in the model (4)

Return type

ndarray(float64, ndim=2)

inputs.data.gen_small_areas_to_grid(grid, grid_intersect, small_area_data, small_area_code, unit)

Convert SAL/SP data to grid dimensions.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
grid_intersect (DataFrame) – Table yielding the intersection areas between grid cells (24,014) and Small Areas (5,339) or Small Places (1,046)
small_area_data (Series) – Number of dwellings (or other variable) in each chosen geographic unit for a given housing type
small_area_code (Series) – Numerical code associated with each chosen geographic unit
unit (str) – Code defining which geographic unit should be used: must be set to either “SP” or “SAL”

Returns

grid_data – Number of dwellings (or other variable) in each grid cell (24,014) for a given housing type

Return type

ndarray(float64)
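
A minimal sketch of such a conversion, assuming counts are allocated to grid cells in proportion to intersection areas. The data layout (a list of (cell index, unit code, intersection area) tuples) and the function name are hypothetical simplifications of the grid_intersect table, and the package may weight the data differently.

```python
from collections import defaultdict


def small_areas_to_grid(intersections, small_area_values, n_cells):
    """Spread each unit's value over grid cells, weighted by intersection area.

    intersections: iterable of (cell_index, unit_code, intersection_area)
    small_area_values: dict mapping unit_code to the value to distribute
    n_cells: number of grid cells
    """
    # Total intersected area of each geographic unit.
    area_total = defaultdict(float)
    for _, code, area in intersections:
        area_total[code] += area

    grid_values = [0.0] * n_cells
    for cell, code, area in intersections:
        grid_values[cell] += small_area_values[code] * area / area_total[code]
    return grid_values
```

With this weighting, the total over grid cells equals the total over the original units, as long as every unit intersects at least one cell.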

inputs.data.import_amenities(path_precalc_inp, options)

Import calibrated amenity index.

Parameters

path_precalc_inp (str) – Path for precalculated input data (calibrated parameters)
options (dict) – Dictionary of default options

Returns

amenities – Normalized amenity index (relative to the mean) for each grid cell (24,014)

Return type

ndarray(float64)

inputs.data.import_coeff_land(spline_land_constraints, spline_land_backyard, spline_land_informal, spline_land_RDP, param, t)

Update land availability ratios for a given year.

This function updates the land availability regression splines to account for non-constructible land (such as roads, open spaces, etc.). This is done by reweighting the estimates with an ad hoc parameter ratio.

Parameters

spline_land_constraints (interp1d) – Linear interpolation for the grid-level overall land availability (in %)
spline_land_backyard (interp1d) – Linear interpolation for the grid-level land availability (in %) for informal backyards over the years
spline_land_informal (interp1d) – Linear interpolation for the grid-level land availability (in %) for informal settlements over the years
spline_land_RDP (interp1d) – Linear interpolation for the grid-level land availability (in %) for formal subsidized housing over the years
param (dict) – Dictionary of default parameters
t (int) – Year (relative to baseline year set at 0) for which we want to run the function

Returns

coeff_land – Updated land availability for each grid cell (24,014) and each housing type (4: formal private, informal backyards, informal settlements, formal subsidized)

Return type

ndarray(float64, ndim=2)
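
The update can be sketched as evaluating each interpolation at year t and rescaling the result by a ratio. The sketch below uses np.interp in place of the interp1d objects, and the dictionary keys, the max_land_use ratios, and the function name are hypothetical. The first row simply reuses the overall land availability series; the package may net out the other housing types, which is not specified here.

```python
import numpy as np

# Row order follows the text: formal private, informal backyards,
# informal settlements, formal subsidized.
SERIES_ORDER = ("constraints", "backyard", "informal", "RDP")


def coeff_land_sketch(years, availability, max_land_use, t):
    """Return a (4, n_cells) array of land availability at year t.

    years: increasing sequence of years (baseline year set at 0)
    availability: dict of (n_years, n_cells) arrays of land shares
    max_land_use: dict of ad hoc ratios (hypothetical values) per series
    """
    rows = []
    for name in SERIES_ORDER:
        series = np.asarray(availability[name], dtype=float)
        # Linear interpolation over the years, one grid cell at a time.
        at_t = np.array([np.interp(t, years, series[:, j])
                         for j in range(series.shape[1])])
        # Reweight to account for non-constructible land (roads, open spaces).
        rows.append(at_t * max_land_use[name])
    return np.vstack(rows)
```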

inputs.data.import_full_floods_data(options, param, path_folder)

Compute expected fraction of capital destroyed by floods across space.

This function applies theoretical formulas to flood maps to get the theoretical expected fraction of capital destroyed across space (should households choose to live there). To do so, it leverages the import_init_floods_data and compute_fraction_capital_destroyed functions. Note that we consider the maximum flood depth across flood maps when they overlap each other, as there might be some double counting between pluvial and fluvial flood risks, and floods just spill over across space instead of piling up (bath-tub model).

Parameters

options (dict) – Dictionary of default options
param (dict) – Dictionary of default parameters
path_folder (str) – Path towards the root data folder

Returns

fraction_capital_destroyed (DataFrame) – Data frame of expected fractions of capital destroyed, for housing structures and contents in different housing types, in each grid cell (24,014)
structural_damages_small_houses (interp1d) – Linear interpolation for fraction of capital destroyed (small house structures) over maximum flood depth (in m) in a given area, from de Villiers et al., 2007
structural_damages_medium_houses (interp1d) – Linear interpolation for fraction of capital destroyed (medium house structures) over maximum flood depth (in m) in a given area, from de Villiers et al., 2007
structural_damages_large_houses (interp1d) – Linear interpolation for fraction of capital destroyed (large house structures) over maximum flood depth (in m) in a given area, from de Villiers et al., 2007
content_damages (interp1d) – Linear interpolation for fraction of capital destroyed (house contents) over maximum flood depth (in m) in a given area, from de Villiers et al., 2007
structural_damages_type1 (interp1d) – Linear interpolation for fraction of capital destroyed (type-1 house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (non-engineered buildings)
structural_damages_type2 (interp1d) – Linear interpolation for fraction of capital destroyed (type-2 house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (wooden buildings)
structural_damages_type3a (interp1d) – Linear interpolation for fraction of capital destroyed (type-3a house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (one-floor unreinforced masonry/concrete buildings)
structural_damages_type3b (interp1d) – Linear interpolation for fraction of capital destroyed (type-3b house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (two-floor unreinforced masonry/concrete buildings)
structural_damages_type4a (interp1d) – Linear interpolation for fraction of capital destroyed (type-4a house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (one-floor reinforced masonry/concrete and steel buildings)
structural_damages_type4b (interp1d) – Linear interpolation for fraction of capital destroyed (type-4b house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (two-floor reinforced masonry/concrete and steel buildings)
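
The treatment of overlapping flood maps described for import_full_floods_data can be illustrated by taking, in each grid cell, the maximum depth across flood types rather than the sum. This is a minimal sketch with a hypothetical function name and plain lists standing in for the flood maps.

```python
def combine_flood_depths(*depth_maps):
    """Cell-by-cell maximum flood depth across several flood maps.

    Each argument is a sequence of maximum flood depths (in m), one per grid
    cell, for one flood type. Depths are not added up, in line with the
    bath-tub assumption that floods spill over instead of piling up.
    """
    return [max(depths) for depths in zip(*depth_maps)]
```

For instance, a cell with 0.5 m of fluvial flooding and 0.3 m of pluvial flooding is assigned a depth of 0.5 m, not 0.8 m.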

inputs.data.import_grid(path_data)

Import standard geographic grid from the City of Cape Town.

Parameters: path_data (str) – Path towards data used in the model
Returns: grid – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
Return type: DataFrame

inputs.data.import_households_data(path_precalc_inp)

Import geographic data with class distributions for households.

Parameters

path_precalc_inp (str) – Path for precalculated input data (calibrated parameters)

Returns

data_rdp (DataFrame) – Table yielding, for each grid cell (24,014), the associated cumulative count of cells with some formal subsidized housing, and the associated area (in m²) dedicated to such housing
housing_types_sp (DataFrame) – Table yielding, for each Small Place (1,046), the number of informal backyards, of informal settlements, and total dwelling units, as well as their (centroid) x and y coordinates
data_sp (DataFrame) – Table yielding, for each Small Place (1,046), the average dwelling size (in m²), the average land price and annual income level (in rands), the size of unconstrained area for construction (in m²), the total area (in km²), the distance to the city centre (in km), whether or not the location belongs to Mitchells Plain, and the SP code
mitchells_plain_grid_baseline (ndarray(uint8)) – Dummy coding for belonging to Mitchells Plain neighbourhood at the grid-cell (24,014) level
grid_formal_density_HFA (ndarray(float64)) – Population density (per m²) in formal private housing at baseline year (2011) at the grid-cell (24,014) level
threshold_income_distribution (ndarray(int32)) – Annual income level (in rands) above which a household is taken as being part of one of the 4 income groups in the model
income_distribution (ndarray(uint16, ndim=2)) – Exogenous number of households in each Small Place (1,046) for each income group in the model (4)
cape_town_limits (ndarray(uint8)) – Dummy indicating whether or not a Small Place (1,046) belongs to the metropolitan area of the City of Cape Town

inputs.data.import_housing_limit(grid, param)

Return maximum allowed housing supply in and out of historic city radius.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
param (dict) – Dictionary of default parameters

Returns

housing_limit – Maximum housing supply (in m² per km²) in each grid cell (24,014)

Return type

Series
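
A minimal sketch of a limit that differs inside and outside the historic city radius, keyed on the distance to the city centre. The function name, the argument names, and the numerical values in the usage note are hypothetical stand-ins for entries of the param dictionary.

```python
def housing_limit_sketch(dist_to_centre_km, historic_radius_km,
                         limit_inside, limit_outside):
    """Maximum housing supply per grid cell, by distance to the city centre.

    Cells within historic_radius_km of the centre get limit_inside; all
    other cells get limit_outside (both in m² per km²).
    """
    return [limit_inside if dist <= historic_radius_km else limit_outside
            for dist in dist_to_centre_km]
```

For example, housing_limit_sketch([2.0, 40.0], 6.0, 2.0e6, 1.0e6) assigns the first limit to the cell 2 km from the centre and the second limit to the cell 40 km away.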

inputs.data.import_hypothesis_housing_type()

Define housing market accessibility hypotheses.

Returns: income_class_by_housing_type – Set of dummies coding for housing market access (across 4 housing submarkets) for each income group (4, from poorest to richest)
Return type: DataFrame

inputs.data.import_income_classes_data(param, path_data)

Import population and average income per income class in the model.

Parameters

param (dict) – Dictionary of default parameters
path_data (str) – Path towards data used in the model

Returns

mean_income (float64) – Average median income across total population
households_per_income_class (ndarray(float64)) – Exogenous total number of households per income group (excluding people out of employment, for 4 groups)
average_income (ndarray(float64)) – Average median income for each income group in the model (4)
income_mult (ndarray(float64)) – Ratio of income-group-specific average income over global average income
income_baseline (DataFrame) – Table summarizing, for each income group in the data (12, including people out of employment), the number of households living in each endogenous housing type (3), their total number at baseline year (2011) and in retrospect (2001), as well as the distribution of their average income (at baseline year)
households_per_income_and_housing (ndarray(float64, ndim=2)) – Exogenous number of households per income group (4, from poorest to richest) in each endogenous housing type (3: formal private, informal backyards, informal settlements), from SP-level data
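
The relation between mean_income, average_income, and income_mult can be sketched as follows. The sketch assumes that the global average is weighted by the number of households in each income group, which is not stated explicitly above, and the function name is hypothetical.

```python
def income_multipliers(households_per_income_class, average_income):
    """Return the global mean income and the group-specific multipliers.

    Assumption: the global mean is a household-weighted average of the
    income-group averages.
    """
    total_households = sum(households_per_income_class)
    mean_income = sum(
        n * y for n, y in zip(households_per_income_class, average_income)
    ) / total_households
    # Ratio of income-group-specific average income over global average income.
    income_mult = [y / mean_income for y in average_income]
    return mean_income, income_mult
```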

inputs.data.import_init_floods_data(options, param, path_folder)

Import raw flood data and damage functions.

More specifically, damage functions (taken from the literature) map a given maximum flood depth to a fraction of capital destroyed, depending on the type of capital considered. We focus here on housing structures (whose value is determined endogenously) and housing contents that are prone to flood destruction (calibrated ad hoc). We will later associate building material types with the housing types considered in the model. Also note that fluvial flood maps are available in both a defended version (supposedly accounting for existing protection infrastructure) and an undefended version.

Parameters

options (dict) – Dictionary of default options
param (dict) – Dictionary of default parameters
path_folder (str) – Path towards the root data folder

Returns

structural_damages_small_houses (interp1d) – Linear interpolation for fraction of capital destroyed (small house structures) over maximum flood depth (in m) in a given area, from de Villiers et al., 2007
structural_damages_medium_houses (interp1d) – Linear interpolation for fraction of capital destroyed (medium house structures) over maximum flood depth (in m) in a given area, from de Villiers et al., 2007
structural_damages_large_houses (interp1d) – Linear interpolation for fraction of capital destroyed (large house structures) over maximum flood depth (in m) in a given area, from de Villiers et al., 2007
content_damages (interp1d) – Linear interpolation for fraction of capital destroyed (house contents) over maximum flood depth (in m) in a given area, from de Villiers et al., 2007
structural_damages_type1 (interp1d) – Linear interpolation for fraction of capital destroyed (type-1 house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (non-engineered buildings)
structural_damages_type2 (interp1d) – Linear interpolation for fraction of capital destroyed (type-2 house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (wooden buildings)
structural_damages_type3a (interp1d) – Linear interpolation for fraction of capital destroyed (type-3a house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (one-floor unreinforced masonry/concrete buildings)
structural_damages_type3b (interp1d) – Linear interpolation for fraction of capital destroyed (type-3b house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (two-floor unreinforced masonry/concrete buildings)
structural_damages_type4a (interp1d) – Linear interpolation for fraction of capital destroyed (type-4a house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (one-floor reinforced masonry/concrete and steel buildings)
structural_damages_type4b (interp1d) – Linear interpolation for fraction of capital destroyed (type-4b house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (two-floor reinforced masonry/concrete and steel buildings)
d_fluvial (dict) – Dictionary of data frames yielding fluvial flood maps (maximum flood depth + fraction of flood-prone area for each grid cell) for each available return period
d_pluvial (dict) – Dictionary of data frames yielding pluvial flood maps (maximum flood depth + fraction of flood-prone area for each grid cell) for each available return period
d_coastal (dict) – Dictionary of data frames yielding coastal flood maps (maximum flood depth + fraction of flood-prone area for each grid cell) for each available return period
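
A damage function of this kind can be sketched as a piecewise-linear interpolation from flood depth to fraction of capital destroyed. The sketch uses np.interp rather than interp1d, and the depth/damage pairs are made up for illustration: they are not values from the cited studies. Scaling the result by the flood-prone share of each cell is an assumption about how the two columns of a flood map are combined.

```python
import numpy as np

# Illustrative points only (hypothetical, not taken from the literature).
DEPTHS_M = [0.0, 1.0, 2.0]
FRACTION_DESTROYED = [0.0, 0.2, 0.6]


def damage_fraction(max_depth_m):
    """Fraction of capital destroyed at a given maximum flood depth (in m)."""
    # np.interp holds the end values constant outside the depth range.
    return np.interp(max_depth_m, DEPTHS_M, FRACTION_DESTROYED)


def cell_damage(max_depth_m, prone_area_share):
    """Damage fraction scaled by the flood-prone share of each grid cell."""
    return damage_fraction(max_depth_m) * np.asarray(prone_area_share, dtype=float)
```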

inputs.data.import_land_use(grid, options, param, data_rdp, housing_types, housing_type_data, path_data, path_folder)

Return linear regression spline estimates for housing building paths.

This function imports scenarios about land use availability for several housing types, then processes them to obtain a percentage of land available for construction in each housing type, for each grid cell over the years. It first sets the evolution of the total number of formal subsidized dwellings, then does the same at the grid-cell level, before defining land availability over time for some housing types: first formal subsidized housing, then informal backyards and informal settlements. It also defines the evolution of unconstrained land, with and without an urban edge.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
options (dict) – Dictionary of default options
param (dict) – Dictionary of default parameters
data_rdp (DataFrame) – Table yielding, for each grid cell (24,014), the associated cumulative count of cells with some formal subsidized housing, and the associated area (in m²) dedicated to such housing
housing_types (DataFrame) – Table yielding, for 4 different housing types (informal settlements, formal backyards, informal backyards, and formal private housing), the number of households in each grid cell (24,014), from SAL data. Note that the notion of formal backyards corresponds to backyards with a concrete structure (as opposed to informal “shacks”), and not to backyards located within the premises of formal private homes. In any case, we abstract from both of those definitions for the sake of simplicity and as they account for a marginal share of overall backyarding.
housing_type_data (ndarray(int64)) – Exogenous number of households per housing type (4: formal private, informal backyards, informal settlements, formal subsidized), from Small-Area-Level data
path_data (str) – Path towards data used in the model
path_folder (str) – Path towards the root data folder

Returns

spline_RDP (interp1d) – Linear interpolation for the total number of formal subsidized dwellings over the years (baseline year set at 0)
spline_estimate_RDP (interp1d) – Linear interpolation for the grid-level number of formal subsidized dwellings over the years (baseline year set at 0)
spline_land_RDP (interp1d) – Linear interpolation for the grid-level land availability (in %) for formal subsidized housing over the years (baseline year set at 0)
spline_land_backyard (interp1d) – Linear interpolation for the grid-level land availability (in %) for informal backyards over the years (baseline year set at 0)
spline_land_informal (interp1d) – Linear interpolation for the grid-level land availability (in %) for informal settlements over the years (baseline year set at 0)
spline_land_constraints (interp1d) – Linear interpolation for the grid-level overall land availability (in %) over the years (baseline year set at 0)
number_properties_RDP (ndarray(float64)) – Number of formal subsidized dwellings per grid cell (24,014) at baseline year (2011)

inputs.data.import_macro_data(param, path_scenarios, path_folder)

Import interest rate and population per housing type.

Parameters

param (dict) – Dictionary of default parameters
path_scenarios (str) – Path towards raw scenarios used for time-moving exogenous variables
path_folder (str) – Path towards the root data folder

Returns

interest_rate (float64) – Interest rate for the overall economy, corresponding to an average over past years
population (int64) – Total number of households in the city (from Small-Area-Level data)
housing_type_data (ndarray(int64)) – Exogenous number of households per housing type (4: formal private, informal backyards, informal settlements, formal subsidized), from Small-Area-Level data
total_RDP (int) – Number of households living in formal subsidized housing (from SAL data)

inputs.data.import_sal_data(grid, path_folder, path_data, housing_type_data)

Import SAL data for population density by housing type.

This is used for validation as Small-Area-Level estimates are more precise than Small-Place level estimates. However, they only yield the distribution across housing types, and not income groups.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
path_folder (str) – Path towards the root data folder
path_data (str) – Path towards data used in the model
housing_type_data (ndarray(int64)) – Exogenous number of households per housing type (4: formal private, informal backyards, informal settlements, formal subsidized), from Small-Area-Level data

Returns

housing_types_grid_sal – Table yielding the number of dwellings for 4 housing types (formal private, informal backyards, formal backyards, and informal settlements) for each grid cell (24,014), from SAL estimates

Return type

DataFrame

inputs.data.import_transport_data(grid, param, yearTraffic, households_per_income_class, average_income, spline_inflation, spline_fuel, spline_population_income_distribution, spline_income_distribution, path_precalc_inp, path_precalc_transp, dim, options)

Run commuting choice model.

This function runs the theoretical commuting choice model to recover key transport-related intermediate outputs. More specifically, it imports transport costs (from data) and (calibrated) incomes (see calibration.sub.compute_income), then computes the modal shares for each commuting pair, the probability distribution of such commuting pairs, the expected income net of commuting costs per residential location, and the associated average incomes.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
param (dict) – Dictionary of default parameters
yearTraffic (int) – Year (relative to baseline year set at 0) for which we want to run the function
households_per_income_class (ndarray(float64)) – Exogenous total number of households per income group (excluding people out of employment, for 4 groups)
average_income (ndarray(float64)) – Average median income for each income group in the model (4)
spline_inflation (interp1d) – Linear interpolation for inflation rate (in base 100 relative to baseline year) over the years (baseline year set at 0)
spline_fuel (interp1d) – Linear interpolation for fuel price (in rands per km) over the years (baseline year set at 0)
spline_population_income_distribution (interp1d) – Linear interpolation for total population per income group in the data (12) over the years (baseline year set at 0)
spline_income_distribution (interp1d) – Linear interpolation for median annual income (in rands) per income group in the data (12) over the years (baseline year set at 0)
path_precalc_inp (str) – Path for precalculated input data (calibrated parameters)
path_precalc_transp (str) – Path for precalculated transport inputs (intermediate outputs from commuting choice model)
dim (str) – Geographic level of analysis at which we want to run the commuting choice model: should be set to “GRID” or “SP”
options (dict) – Dictionary of default options

Returns

incomeNetOfCommuting (ndarray(float64, ndim=2)) – Expected annual income net of commuting costs (in rands, for one household), for each geographic unit, by income group (4)
modalShares (ndarray(float64, ndim=4)) – Share (from 0 to 1) of each transport mode for each income group (4) and each commuting pair (185 selected job centers + chosen geographic units for residential locations)
ODflows (ndarray(float64, ndim=3)) – Probability to work in a given selected job center (185, out of a wider pool of transport zones), for a given income group (4) and a given residential location (depending on geographic unit chosen)
averageIncome (ndarray(float64, ndim=2)) – Average annual income for each geographic unit and each income group (4), for one household
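
The last step of the commuting choice model, the expected income net of commuting costs, can be illustrated for a single residential location and income group as a probability-weighted average over job centers. This is a simplified sketch with hypothetical names: it takes the commuting probabilities as given and does not reproduce how the package derives them or the modal shares.

```python
def expected_income_net_of_commuting(job_center_probs, incomes, commuting_costs):
    """Expected annual income net of commuting costs for one household.

    job_center_probs: probability of working in each job center (sums to 1)
    incomes: annual income earned in each job center (in rands)
    commuting_costs: annual commuting cost to each job center (in rands)
    """
    return sum(prob * (income - cost)
               for prob, income, cost
               in zip(job_center_probs, incomes, commuting_costs))
```

For example, with probabilities 0.25 and 0.75, incomes of 100 and 200, and commuting costs of 20 and 40, the expected net income is 0.25 * 80 + 0.75 * 160 = 140.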

Equilibrium

Compute equilibrium

equilibrium.compute_equilibrium.compute_equilibrium(fraction_capital_destroyed, amenities, param, housing_limit, population, households_per_income_class, total_RDP, coeff_land, income_net_of_commuting_costs, grid, options, agricultural_rent, interest_rate, number_properties_RDP, average_income, mean_income, income_class_by_housing_type, minimum_housing_supply, construction_param, income_baseline)

Run the static equilibrium algorithm.

This function runs the algorithm described in the technical documentation. It starts from arbitrary utility levels and leverages optimality conditions on supply and demand to recover key output variables (detailed in equilibrium.sub.compute_outputs). It then updates utility levels to minimize the error between the simulated and target numbers of households per income group. Note that the whole process abstracts from formal subsidized housing (fully exogenous in the model), which is added to the final outputs at the end of the function.

Parameters

fraction_capital_destroyed (DataFrame) – Data frame of expected fractions of capital destroyed, for housing structures and contents in different housing types, in each grid cell (24,014)
amenities (ndarray(float64)) – Normalized amenity index (relative to the mean) for each grid cell (24,014)
param (dict) – Dictionary of default parameters
housing_limit (Series) – Maximum housing supply (in m² per km²) in each grid cell (24,014)
population (int64) – Total number of households in the city (from Small-Area-Level data)
households_per_income_class (ndarray(float64)) – Exogenous total number of households per income group (excluding people out of employment, for 4 groups)
total_RDP (int) – Number of households living in formal subsidized housing (from SAL data)
coeff_land (ndarray(float64, ndim=2)) – Updated land availability for each grid cell (24,014) and each housing type (4: formal private, informal backyards, informal settlements, formal subsidized)
income_net_of_commuting_costs (ndarray(float64, ndim=2)) – Expected annual income net of commuting costs (in rands, for one household), for each geographic unit, by income group (4)
grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
options (dict) – Dictionary of default options
agricultural_rent (float64) – Annual housing rent below which it is not profitable for formal private developers to urbanize (agricultural) land: endogenously limits urban sprawl
interest_rate (float64) – Real interest rate for the overall economy, corresponding to an average over past years
number_properties_RDP (ndarray(float64)) – Number of formal subsidized dwellings per grid cell (24,014) at baseline year (2011)
average_income (ndarray(float64)) – Average median income for each income group in the model (4)
mean_income (float64) – Average median income across total population
income_class_by_housing_type (DataFrame) – Set of dummies coding for housing market access (across 4 housing submarkets) for each income group (4, from poorest to richest)
minimum_housing_supply (ndarray(float64)) – Minimum housing supply (in m²) for each grid cell (24,014), allowing for an ad hoc correction of low values in Mitchells Plain
construction_param (ndarray(float64)) – (Calibrated) scale factor for the construction function of formal private developers
income_baseline (DataFrame) – Table summarizing, for each income group in the data (12, including people out of employment), the number of households living in each endogenous housing type (3), their total number at baseline year (2011) and in retrospect (2001), as well as the distribution of their average income (at baseline year)

Returns

initial_state_utility (ndarray(float64)) – Utility levels for each income group (4) at baseline year (2011)
initial_state_error (ndarray(float64)) – Ratio (in %) of simulated number of households per income group over target population per income group at baseline year (2011)
initial_state_simulated_jobs (ndarray(float64, ndim=2)) – Total number of households in each income group (4) in each endogenous housing type (3: formal private, informal backyards, informal settlements) at baseline year (2011)
initial_state_households_housing_types (ndarray(float64, ndim=2)) – Number of households per grid cell in each housing type (4) at baseline year (2011)
initial_state_household_centers (ndarray(float64, ndim=2)) – Number of households per grid cell in each income group (4) at baseline year (2011)
initial_state_households (ndarray(float64, ndim=3)) – Number of households per grid cell in each income group (4) and each housing type (4) at baseline year (2011)
initial_state_dwelling_size (ndarray(float64, ndim=2)) – Average dwelling size (in m²) per grid cell in each housing type (4) at baseline year (2011)
initial_state_housing_supply (ndarray(float64, ndim=2)) – Housing supply per unit of available land (in m² per km²) for each housing type (4) in each grid cell at baseline year (2011)
initial_state_rent (ndarray(float64, ndim=2)) – Average annual rent (in rands) per grid cell for each housing type (4) at baseline year (2011)
initial_state_rent_matrix (ndarray(float64, ndim=3)) – Average annual willingness to pay (in rands) per grid cell for each income group (4) and each endogenous housing type (3) at baseline year (2011)
initial_state_capital_land (ndarray(float64, ndim=2)) – Value (in rands) of the housing capital stock per unit of available land (in km²) for each endogenous housing type (3) per grid cell at baseline year (2011)
initial_state_average_income (ndarray(float64)) – Not an output of the model per se: it is simply the average median income for each income group in the model (4), which may change over time
initial_state_limit_city (list) – Contains an ndarray(bool, ndim=3) of indicator dummies for having strictly more than one household per housing type and income group in each grid cell

Run simulations

equilibrium.run_simulations.run_simulation(t, options, param, grid, initial_state_utility, initial_state_error, initial_state_households, initial_state_households_housing_types, initial_state_housing_supply, initial_state_household_centers, initial_state_average_income, initial_state_rent, initial_state_dwelling_size, initial_state_capital_land, fraction_capital_destroyed, amenities, housing_limit, spline_estimate_RDP, spline_land_constraints, spline_land_backyard, spline_land_RDP, spline_land_informal, income_class_by_housing_type, precalculated_transport, spline_RDP, spline_agricultural_price, spline_interest_rate, spline_population_income_distribution, spline_inflation, spline_income_distribution, spline_population, spline_income, spline_minimum_housing_supply, spline_fuel, income_baseline)[source]

Run simulations over several years according to exogenous scenarios.

After accounting for the changes in time-moving exogenous variables each period, the function returns key equilibrium values (computed with equilibrium.compute_equilibrium) under a housing supply constraint described in equilibrium.functions_dynamic and the technical documentation. This makes it possible to account for inertia in formal private developers’ actions and to return more realistic simulated outputs.

Parameters

t (int) – Year for which we want to run the function
options (dict) – Dictionary of default options
param (dict) – Dictionary of default parameters
grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
initial_state_utility (ndarray(float64)) – Utility levels for each income group (4) at baseline year (2011)
initial_state_error (ndarray(float64)) – Ratio (in %) of simulated number of households per income group over target population per income group at baseline year (2011)
initial_state_households (ndarray(float64, ndim=3)) – Number of households per grid cell in each income group (4) and each housing type (4) at baseline year (2011)
initial_state_households_housing_types (ndarray(float64, ndim=2)) – Number of households per grid cell in each housing type (4) at baseline year (2011)
initial_state_housing_supply (ndarray(float64, ndim=2)) – Housing supply per unit of available land (in m² per km²) for each housing type (4) in each grid cell at baseline year (2011)
initial_state_household_centers (ndarray(float64, ndim=2)) – Number of households per grid cell in each income group (4) at baseline year (2011)
initial_state_average_income (ndarray(float64)) – Not an output of the model per se: it is simply the average median income for each income group in the model (4), which may change over time
initial_state_rent (ndarray(float64, ndim=2)) – Average annual rent (in rands) per grid cell for each housing type (4) at baseline year (2011)
initial_state_dwelling_size (ndarray(float64, ndim=2)) – Average dwelling size (in m²) per grid cell in each housing type (4) at baseline year (2011)
fraction_capital_destroyed (DataFrame) – Data frame of expected fractions of capital destroyed, for housing structures and contents in different housing types, in each grid cell (24,014)
amenities (ndarray(float64)) – Normalized amenity index (relative to the mean) for each grid cell (24,014)
housing_limit (Series) – Maximum housing supply (in m² per km²) in each grid cell (24,014)
spline_estimate_RDP (interp1d) – Linear interpolation for the grid-level number of formal subsidized dwellings over the years (baseline year set at 0)
spline_land_constraints (interp1d) – Linear interpolation for the grid-level overall land availability (in %) over the years (baseline year set at 0)
spline_land_backyard (interp1d) – Linear interpolation for the grid-level land availability (in %) for informal backyards over the years (baseline year set at 0)
spline_land_RDP (interp1d) – Linear interpolation for the grid-level land availability (in %) for formal subsidized housing over the years (baseline year set at 0)
spline_land_informal (interp1d) – Linear interpolation for the grid-level land availability (in %) for informal settlements over the years (baseline year set at 0)
income_class_by_housing_type (DataFrame) – Set of dummies coding for housing market access (across 4 housing submarkets) for each income group (4, from poorest to richest)
precalculated_transport (str) – Path for precalculated transport inputs (intermediate outputs from commuting choice model)
spline_RDP (interp1d) – Linear interpolation for the total number of formal subsidized dwellings over the years (baseline year set at 0)
spline_agricultural_price (interp1d) – Linear interpolation for the agricultural land price (in rands) over the years (baseline year set at 0)
spline_interest_rate (interp1d) – Linear interpolation for the interest rate (in %) over the years
spline_population_income_distribution (interp1d) – Linear interpolation for total population per income group in the data (12) over the years (baseline year set at 0)
spline_inflation (interp1d) – Linear interpolation for inflation rate (in base 100 relative to baseline year) over the years (baseline year set at 0)
spline_income_distribution (interp1d) – Linear interpolation for median annual income (in rands) per income group in the data (12) over the years (baseline year set at 0)
spline_population (interp1d) – Linear interpolation for total population over the years (baseline year set at 0)
spline_income (interp1d) – Linear interpolation for overall average (annual) income over the years (baseline year set at 0), used to avoid money illusion in future simulations when computing the housing supply (see equilibrium.run_simulations)
spline_minimum_housing_supply (interp1d) – Linear interpolation for minimum housing supply (in m²) over the years (baseline year set at 0)
spline_fuel (interp1d) – Linear interpolation for fuel price (in rands per km) over the years (baseline year set at 0)
income_baseline (DataFrame) – Table summarizing, for each income group in the data (12, including people out of employment), the number of households living in each endogenous housing type (3), their total number at baseline year (2011) in retrospect (2001), as well as the distribution of their average income (at baseline year)

Returns

simulation_households_center (ndarray(float64, ndim=3)) – Number of households per grid cell (24,014) in each income group (4) over all simulation years (30)
simulation_households_housing_type (ndarray(float64, ndim=3)) – Number of households per grid cell (24,014) in each housing type (4) over all simulation years (30)
simulation_dwelling_size (ndarray(float64, ndim=3)) – Average dwelling size (in m²) per grid cell (24,014) in each housing type (4) over all simulation years (30)
simulation_rent (ndarray(float64, ndim=3)) – Average annual rent (in rands) per grid cell (24,014) for each housing type (4) over all simulation years (30)
simulation_households (ndarray(float64, ndim=4)) – Number of households per grid cell (24,014) in each income group (4) and each housing type (4) over all simulation years (30)
simulation_error (ndarray(float64, ndim=2)) – Ratio (in %) of simulated number of households per income group over target population per income group over all simulation years (30)
simulation_housing_supply (ndarray(float64, ndim=3)) – Housing supply per unit of available land (in m² per km²) for each housing type (4) in each grid cell (24,014) over all simulation years (30)
simulation_utility (ndarray(float64, ndim=2)) – Utility levels for each income group (4) over all simulation years (30)
simulation_deriv_housing (ndarray(float64, ndim=2)) – Difference between simulated next and current period values for housing supply per unit of available land (in m² per km²) per grid cell (24,014), for all simulation years (30).
simulation_T (ndarray(float64)) – Years (relative to baseline set at 0) used for the simulations

Dynamic functions

equilibrium.functions_dynamic.compute_average_income(spline_population_income_distribution, spline_income_distribution, param, t)[source]

Compute average income and population per income group for a given year.

This makes it possible to update the relative distributions used to compute the equilibrium in subsequent periods (see equilibrium.compute_equilibrium).

Parameters

spline_population_income_distribution (interp1d) – Linear interpolation for total population per income group in the data (12) over the years (baseline year set at 0)
spline_income_distribution (interp1d) – Linear interpolation for median annual income (in rands) per income group in the data (12) over the years (baseline year set at 0)
param (dict) – Dictionary of default parameters
t (int) – Year for which we want to run the function

Returns

avg_income_group (ndarray(float64)) – Average median income for each income group in the model (4)
total_group (ndarray(float64)) – Exogenous total number of households per income group (excluding people out of employment, for 4 groups)
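
As a rough illustration of the aggregation step, the sketch below collapses 12 data groups into 4 model groups using a population-weighted mean. The mapping GROUP_OF, the helper name aggregate_income_groups, and the input values are all hypothetical; in the model, the 12-group inputs would come from evaluating the two splines at year t.

```python
import numpy as np

# Hypothetical mapping from the 12 data groups to the 4 model groups
# (0 marks households out of employment, which are excluded).
GROUP_OF = np.array([0, 1, 1, 1, 1, 2, 3, 3, 4, 4, 4, 4])


def aggregate_income_groups(pop_12, income_12, group_of=GROUP_OF):
    """Collapse 12 data groups into 4 model groups (illustrative only)."""
    pop_12 = np.asarray(pop_12, dtype=float)
    income_12 = np.asarray(income_12, dtype=float)
    total_group = np.zeros(4)
    avg_income_group = np.zeros(4)
    for j in range(4):
        members = group_of == j + 1
        total_group[j] = pop_12[members].sum()
        # Population-weighted mean of the group-level median incomes
        avg_income_group[j] = (
            (pop_12[members] * income_12[members]).sum() / total_group[j]
        )
    return avg_income_group, total_group


# Made-up values standing in for the splines evaluated at year t
pop_t = np.full(12, 100.0)
income_t = np.arange(12) * 1000.0
avg_income_group, total_group = aggregate_income_groups(pop_t, income_t)
```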

equilibrium.functions_dynamic.evolution_housing_supply(housing_limit, param, t1, t0, housing_supply_1, housing_supply_0)[source]

Yield dynamic housing supply with time inertia and capital depreciation.

We consider that formal private developers anticipate the unconstrained equilibrium value of their housing supply in future periods. Only if it is bigger than current values do we allow them to build more housing. In all cases, housing capital depreciates. This function computes a new housing supply including a time inertia parameter, that will be the (more realistic) housing supply simulated for target year (all the other equilibrium values will be updated correspondingly under this constraint). Then, the function returns the difference between simulated future and current values for housing supply.

Parameters

housing_limit (Series) – Maximum housing supply (in m² per km²) in each grid cell (24,014)
param (dict) – Dictionary of default parameters
t1 (float64) – Target year (relative to baseline set at 0) for evolution of housing supply
t0 (float64) – Origin year (relative to baseline set at 0) for evolution of housing supply
housing_supply_1 (ndarray(float64)) – (Unconstrained) equilibrium housing supply per unit of available land (in m² per km²) for target year, per grid cell (24,014)
housing_supply_0 (ndarray(float64)) – Equilibrium housing supply per unit of available land (in m² per km²) for origin year, per grid cell (24,014)

Returns

Difference between simulated future and current values for housing supply per unit of available land (in m² per km²), per grid cell (24,014).

Return type

Series
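
The mechanism described above can be sketched as follows. This is a minimal illustration, not the package's code: the helper name housing_supply_change, the keyword arguments time_invest and depreciation_rate, and their default values are assumptions. It takes numpy arrays for simplicity, whereas the documented function returns a Series.

```python
import numpy as np


def housing_supply_change(housing_limit, t1, t0, supply_1, supply_0,
                          time_invest=3.0, depreciation_rate=0.03):
    """Change in housing supply between t0 and t1 (illustrative only)."""
    elapsed = t1 - t0
    # Developers only build when the unconstrained target exceeds
    # the current stock
    gap = np.maximum(supply_1 - supply_0, 0.0)
    # Inertia: only a fraction of the gap is closed per unit of time
    growth = gap * elapsed / time_invest
    # Existing capital depreciates in all cases
    decay = supply_0 * depreciation_rate * elapsed
    # The resulting supply cannot exceed the maximum housing supply
    target = np.minimum(supply_0 + growth - decay, housing_limit)
    return target - supply_0


limit = np.array([1000.0, 1000.0])
supply_target_year = np.array([400.0, 100.0])
supply_origin_year = np.array([100.0, 400.0])
diff = housing_supply_change(limit, 1.0, 0.0,
                             supply_target_year, supply_origin_year)
```

In the first cell the stock grows by a third of the gap minus depreciation; in the second, where the target is below the current stock, only depreciation applies.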

equilibrium.functions_dynamic.import_scenarios(income_baseline, param, grid, path_scenarios, options)[source]

Define linear interpolations for time-moving exogenous variables.

Parameters

income_baseline (DataFrame) – Table summarizing, for each income group in the data (12, including people out of employment), the number of households living in each endogenous housing type (3), their total number at baseline year (2011) in retrospect (2001), as well as the distribution of their average income (at baseline year)
param (dict) – Dictionary of default parameters
grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
path_scenarios (str) – Path towards raw scenarios used for time-moving exogenous variables
options (dict) – Dictionary of default options

Returns

spline_agricultural_price (interp1d) – Linear interpolation for the agricultural land price (in rands) over the years (baseline year set at 0)
spline_interest_rate (interp1d) – Linear interpolation for the interest rate (in %) over the years
spline_population_income_distribution (interp1d) – Linear interpolation for total population per income group in the data (12) over the years (baseline year set at 0)
spline_inflation (interp1d) – Linear interpolation for inflation rate (in base 100 relative to baseline year) over the years (baseline year set at 0)
spline_income_distribution (interp1d) – Linear interpolation for median annual income (in rands) per income group in the data (12) over the years (baseline year set at 0)
spline_population (interp1d) – Linear interpolation for total population over the years (baseline year set at 0)
spline_income (interp1d) – Linear interpolation for overall average (annual) income over the years (baseline year set at 0), used to avoid money illusion in future simulations when computing the housing supply (see equilibrium.run_simulations)
spline_minimum_housing_supply (interp1d) – Linear interpolation for minimum housing supply (in m²) over the years (baseline year set at 0)
spline_fuel (interp1d) – Linear interpolation for fuel price (in rands per km) over the years (baseline year set at 0)
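
For readers unfamiliar with the returned objects, the sketch below builds one such interpolation with scipy.interpolate.interp1d, shifting calendar years so that the baseline year sits at 0. The years and fuel prices are made-up values, not scenario data.

```python
import numpy as np
from scipy.interpolate import interp1d

baseline_year = 2011
# Made-up scenario points (calendar year, fuel price in rands per km)
years = np.array([2001, 2011, 2040])
fuel_price = np.array([0.5, 1.0, 2.45])

# Express years relative to the baseline year, as in the returned splines
spline_fuel = interp1d(years - baseline_year, fuel_price, kind="linear")

# Evaluate ten years after the baseline year
value_year_10 = float(spline_fuel(10))
```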

equilibrium.functions_dynamic.interpolate_interest_rate(spline_interest_rate, t)[source]

Return real interest rate used in model, for a given year.

Parameters

spline_interest_rate (interp1d) – Linear interpolation for the interest rate (in %) over the years
t (int) – Year for which we want to run the function

Returns

Real interest rate used in the model, defined as the average over the past (3) years to capture the structural (as opposed to cyclical) component of the interest rate

Return type

float64
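
A minimal sketch of the averaging, under stated assumptions: the window covers the three years strictly before t, and the result is converted from percent to a fraction. Neither detail is confirmed by this reference. The helper name and the stand-in callable toy_spline (which mimics an interp1d object) are hypothetical.

```python
import numpy as np


def average_past_interest_rate(spline_interest_rate, t, nb_years=3):
    """Average the interpolated rate over the nb_years before t."""
    # Assumption: the window is [t - nb_years, t), excluding year t
    past_years = np.arange(t - nb_years, t)
    rates_pct = np.asarray(spline_interest_rate(past_years), dtype=float)
    # Assumption: rates are stored in % and returned as a fraction
    return float(np.mean(rates_pct)) / 100.0


def toy_spline(years):
    """Stand-in for an interp1d object, with made-up rates in %."""
    return np.interp(years, [-5.0, 0.0, 5.0], [2.0, 4.0, 6.0])


rate_t0 = average_past_interest_rate(toy_spline, t=0)
```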

Compute intermediate outputs

equilibrium.sub.compute_outputs.compute_outputs(housing_type, utility, amenities, param, income_net_of_commuting_costs, fraction_capital_destroyed, grid, income_class_by_housing_type, options, housing_limit, agricultural_rent, interest_rate, coeff_land, minimum_housing_supply, construction_param, housing_in, param_pockets, param_backyards_pockets)[source]

Compute equilibrium outputs from theoretical formulas.

From optimality conditions on supply and demand (see technical documentation for math formulas), this function computes, for a given housing type, the following outputs. First, the demanded dwelling size in each place per income group. Then, the bid-rent function / willingness to pay (per m² of housing) in each place per income group. By selecting the highest bid, we recover the final simulated dwelling size and market rent. From there, we also compute the housing supply per unit of available land, and the total number of households in each location, per income group. To do so, it leverages the equilibrium.sub.functions_solver module.

Parameters

housing_type (str) – Endogenous housing type considered in the function: should be set to “formal”, “backyard”, or “informal”
utility (ndarray(float64)) – Utility levels for each income group (4) considered in a given iteration
amenities (ndarray(float64)) – Normalized amenity index (relative to the mean) for each grid cell (24,014)
param (dict) – Dictionary of default parameters
income_net_of_commuting_costs (ndarray(float64, ndim=2)) – Expected annual income net of commuting costs (in rands, for one household), for each geographic unit, by income group (4)
fraction_capital_destroyed (DataFrame) – Data frame of expected fractions of capital destroyed, for housing structures and contents in different housing types, in each grid cell (24,014)
grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
income_class_by_housing_type (DataFrame) – Set of dummies coding for housing market access (across 4 housing submarkets) for each income group (4, from poorest to richest)
options (dict) – Dictionary of default options
housing_limit (Series) – Maximum housing supply (in m² per km²) in each grid cell (24,014)
agricultural_rent (float64) – Annual housing rent below which it is not profitable for formal private developers to urbanize (agricultural) land: endogenously limits urban sprawl
interest_rate (float64) – Real interest rate for the overall economy, corresponding to an average over past years
coeff_land (ndarray(float64, ndim=2)) – Updated land availability for each grid cell (24,014) and each housing type (4: formal private, informal backyards, informal settlements, formal subsidized)
minimum_housing_supply (ndarray(float64)) – Minimum housing supply (in m²) for each grid cell (24,014), allowing for an ad hoc correction of low values in Mitchells Plain
construction_param (ndarray(float64)) – (Calibrated) scale factor for the construction function of formal private developers
housing_in (ndarray(float64)) – Theoretical minimum housing supply when formal private developers do not adjust (not used in practice), per grid cell (24,014)
param_pockets (ndarray(float64)) – (Calibrated) disamenity index for living in an informal settlement, per grid cell (24,014)
param_backyards_pockets (ndarray(float64)) – (Calibrated) disamenity index for living in an informal backyard, per grid cell (24,014)

Returns

job_simul (ndarray(float64)) – Simulated number of households per income group (4) for a given housing type, at a given iteration
R (ndarray(float64)) – Simulated average annual rent (in rands/m²) for a given housing type, at a given iteration, for each selected pixel (4,043)
people_init (ndarray(float64)) – Simulated number of households for a given housing type, at a given iteration, for each selected pixel (4,043)
people_center (ndarray(float64, ndim=2)) – Simulated number of households for a given housing type, at a given iteration, for each selected pixel (4,043) and each income group (4)
housing_supply (ndarray(float64)) – Simulated housing supply per unit of available land (in m² per km²) for a given housing type, at a given iteration, for each selected pixel (4,043)
dwelling_size (ndarray(float64)) – Simulated average dwelling size (in m²) for a given housing type, at a given iteration, for each selected pixel (4,043)
R_mat (ndarray(float64, ndim=2)) – Simulated willingness to pay / bid-rents (in rands/m²) for a given housing type, at a given iteration, for each selected pixel (4,043) and each income group (4)
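
The highest-bid selection mentioned in the description can be illustrated with a toy example. The array layout (income groups along the first axis, pixels along the second) and all values are assumptions made for the sketch, with three groups and two pixels instead of the model's dimensions.

```python
import numpy as np

# Made-up bid rents (rands/m²): rows are income groups, columns are pixels
bid_rents = np.array([[10.0, 50.0],
                      [30.0, 20.0],
                      [5.0, 60.0]])
# Made-up demanded dwelling sizes (m²), same layout
dwelling_sizes = np.array([[40.0, 35.0],
                           [80.0, 90.0],
                           [120.0, 110.0]])

# In each pixel, the group with the highest bid wins the location
winning_group = np.argmax(bid_rents, axis=0)
pixels = np.arange(bid_rents.shape[1])

# Market rent and dwelling size are those of the winning group
market_rent = bid_rents[winning_group, pixels]
market_dwelling_size = dwelling_sizes[winning_group, pixels]
```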

Define optimality conditions for solver

equilibrium.sub.functions_solver.compute_dwelling_size_formal(utility, amenities, param, income_net_of_commuting_costs, fraction_capital_destroyed)[source]

Return optimal dwelling size per income group for formal housing.

This function leverages the explicit_qfunc() function to express dwelling size as an implicit function of observed values, coming from optimality conditions.

Parameters

utility (ndarray(float64)) – Utility levels for each income group (4) considered in a given iteration
amenities (ndarray(float64)) – Normalized amenity index (relative to the mean) for each grid cell (24,014)
param (dict) – Dictionary of default parameters
income_net_of_commuting_costs (ndarray(float64, ndim=2)) – Expected annual income net of commuting costs (in rands, for one household), for each geographic unit, by income group (4)
fraction_capital_destroyed (DataFrame) – Data frame of expected fractions of capital destroyed, for housing structures and contents in different housing types, in each grid cell (24,014)

Returns

dwelling_size – Simulated average dwelling size (in m²) for each selected pixel (4,043) and each income group (4)

Return type

ndarray(float64, ndim=2)

equilibrium.sub.functions_solver.compute_housing_supply_backyard(R, param, income_net_of_commuting_costs, fraction_capital_destroyed, grid, income_class_by_housing_type)[source]

Return optimal housing supply for informal backyards.

This function leverages optimality conditions to express housing supply as a function of rents.

Parameters

R (ndarray(float64)) – Simulated average annual rent (in rands/m²) for a given housing type, for each selected pixel (4,043)
param (dict) – Dictionary of default parameters
income_net_of_commuting_costs (ndarray(float64, ndim=2)) – Expected annual income net of commuting costs (in rands, for one household), for each geographic unit, by income group (4)
fraction_capital_destroyed (DataFrame) – Data frame of expected fractions of capital destroyed, for housing structures and contents in different housing types, in each grid cell (24,014)
grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
income_class_by_housing_type (DataFrame) – Set of dummies coding for housing market access (across 4 housing submarkets) for each income group (4, from poorest to richest)

Returns

housing_supply – Simulated housing supply per unit of available land (in m² per km²) for informal backyards, for each selected pixel (4,043)

Return type

ndarray(float64)

equilibrium.sub.functions_solver.compute_housing_supply_formal(R, options, housing_limit, param, agricultural_rent, interest_rate, fraction_capital_destroyed, minimum_housing_supply, construction_param, housing_in, dwelling_size)[source]

Return optimal housing supply for formal private housing.

This function leverages optimality conditions to express housing supply as a function of rents.

Parameters

R (ndarray(float64)) – Simulated average annual rent (in rands/m²) for a given housing type, for each selected pixel (4,043)
options (dict) – Dictionary of default options
housing_limit (Series) – Maximum housing supply (in m² per km²) in each grid cell (24,014)
param (dict) – Dictionary of default parameters
agricultural_rent (float64) – Annual housing rent below which it is not profitable for formal private developers to urbanize (agricultural) land: endogenously limits urban sprawl
interest_rate (float64) – Real interest rate for the overall economy, corresponding to an average over past years
fraction_capital_destroyed (DataFrame) – Data frame of expected fractions of capital destroyed, for housing structures and contents in different housing types, in each grid cell (24,014)
minimum_housing_supply (ndarray(float64)) – Minimum housing supply (in m²) for each grid cell (24,014), allowing for an ad hoc correction of low values in Mitchells Plain
construction_param (ndarray(float64)) – (Calibrated) scale factor for the construction function of formal private developers
housing_in (ndarray(float64)) – Theoretical minimum housing supply when formal private developers do not adjust (not used in practice), per grid cell (24,014)
dwelling_size (ndarray(float64)) – Simulated average dwelling size (in m²) for a given housing type, for each selected pixel (4,043)

Returns

housing_supply – Simulated housing supply per unit of available land (in m² per km²) for formal private housing, for each selected pixel (4,043)

Return type

ndarray(float64)

equilibrium.sub.functions_solver.explicit_qfunc(q, q_0, alpha)[source]

Explicit function that will be inverted to recover optimal dwelling size.

This function is used as part of compute_dwelling_size_formal().

Parameters

q (ndarray(float64)) – Arbitrary values for dwelling size (in m²)
q_0 (ndarray(float64)) – Parametric basic need in housing (in m²)
alpha (float64) – (Calibrated) composite good elasticity in households’ utility function

Returns

result – Theoretical values associated with observed variable left_side (see compute_dwelling_size_formal function) through optimality conditions, for arbitrary values of dwelling size

Return type

ndarray(float64)
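
One common way to invert an explicit increasing function numerically is to tabulate it on a grid of candidate dwelling sizes and interpolate in the reverse direction. The sketch below shows that technique only. The function toy_qfunc is a placeholder chosen because it is increasing in q for q > q_0; it is not claimed to be the expression used in explicit_qfunc, and the helper invert_qfunc, the grid bounds, and the parameter values are hypothetical.

```python
import numpy as np


def toy_qfunc(q, q_0, alpha):
    """Placeholder explicit function, increasing in q for q > q_0."""
    return (q - q_0) / (q - alpha * q_0) ** alpha


def invert_qfunc(left_side, q_0, alpha, q_max=1000.0, n_points=100_000):
    """Recover q from observed left-hand-side values by interpolation."""
    # Tabulate the explicit function on a fine grid of dwelling sizes
    q_grid = np.linspace(q_0 + 1e-3, q_max, n_points)
    values = toy_qfunc(q_grid, q_0, alpha)
    # Swap the roles of x and y: valid because values are increasing
    return np.interp(left_side, values, q_grid)


q_0, alpha = 4.0, 0.25  # made-up parameter values
q_true = np.array([20.0, 50.0, 200.0])
left_side = toy_qfunc(q_true, q_0, alpha)
q_recovered = invert_qfunc(left_side, q_0, alpha)
```

Tabulating once and interpolating avoids running a root-finder for every pixel and income group, at the cost of a small grid-dependent approximation error.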

Calibration

Main functions for calibration

calibration.calib_main_func.estim_construct_func_param(options, param, data_sp, threshold_income_distribution, income_distribution, data_rdp, housing_types_sp, data_number_formal, data_income_group, selected_density, path_data, path_precalc_inp, path_folder)[source]

Estimate coefficients of housing production function (Cobb-Douglas).

This function leverages a partial relation from our general equilibrium model, that is estimated on SP data which does not enter the simulations as an input. More precisely, it combines the expression of the optimal housing supply in the formal private sector with the highest bid condition (see technical documentation for math formulas).

Parameters

options (dict) – Dictionary of default options
param (dict) – Dictionary of default parameters
data_sp (DataFrame) – Table yielding, for each Small Place (1,046), the average dwelling size (in m²), the average land price and annual income level (in rands), the size of unconstrained area for construction (in m²), the total area (in km²), the distance to the city centre (in km), whether or not the location belongs to Mitchells Plain, and the SP code
threshold_income_distribution (ndarray(int32)) – Annual income level (in rands) above which a household is taken as being part of one of the 4 income groups in the model
income_distribution (ndarray(uint16, ndim=2)) – Exogenous number of households in each Small Place (1,046) for each income group in the model (4)
data_rdp (DataFrame) – Table yielding, for each grid cell (24,014), the associated cumulative count of cells with some formal subsidized housing, and the associated area (in m²) dedicated to such housing
housing_types_sp (DataFrame) – Table yielding, for each Small Place (1,046), the number of informal backyards, of informal settlements, and total dwelling units, as well as their (centroid) x and y coordinates
data_number_formal (Series) – Number of formal private housing units considered for each Small Place (1,046)
data_income_group (ndarray(float64)) – Categorical variable indicating, for each Small Place (1,046), the dominant income group (from 0 to 3)
selected_density (Series) – Dummy variable allowing for sample selection across Small Places (1,046) for regressions that are only valid in the formal private housing sector
path_data (str) – Path towards data used in the model
path_precalc_inp (str) – Path for precalculated input data (calibrated parameters)
path_folder (str) – Path towards the root data folder

Returns

coeff_b (float64) – Calibrated capital elasticity in housing production function
coeff_a (float64) – Calibrated land elasticity in housing production function
coeffKappa (float64) – Calibrated scale factor in housing production function

calibration.calib_main_func.estim_incomes_and_gravity(param, grid, list_lambda, households_per_income_class, average_income, income_distribution, spline_inflation, spline_fuel, spline_population_income_distribution, spline_income_distribution, path_data, path_precalc_inp, path_precalc_transp, options)[source]

Estimate incomes per job center and income group, with gravity parameter.

This function leverages theoretical formulas from calibration.sub.compute_income and data imported through the calibration.sub.import_employment_data module. It first imports transport costs and the observed number of commuters per selected job centre and income group, then estimates the associated incomes for a given gravity parameter by minimizing the error over the simulated number of commuters. Then, it selects among a list of scanned values the final value of the gravity parameter (and the associated incomes) by minimizing the error over the distribution of commuters along their residence-workplace distances.

Parameters

param (dict) – Dictionary of default parameters
grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
list_lambda (ndarray(float64)) – List of values over which to scan for the gravity parameter used in the commuting choice model
households_per_income_class (ndarray(float64)) – Exogenous total number of households per income group (excluding people out of employment, for 4 groups)
average_income (ndarray(float64)) – Average median income for each income group in the model (4)
income_distribution (ndarray(uint16, ndim=2)) – Exogenous number of households in each Small Place (1,046) for each income group in the model (4)
spline_inflation (interp1d) – Linear interpolation for inflation rate (in base 100 relative to baseline year) over the years (baseline year set at 0)
spline_fuel (interp1d) – Linear interpolation for fuel price (in rands per km) over the years (baseline year set at 0)
spline_population_income_distribution (interp1d) – Linear interpolation for total population per income group in the data (12) over the years (baseline year set at 0)
spline_income_distribution (interp1d) – Linear interpolation for median annual income (in rands) per income group in the data (12) over the years (baseline year set at 0)
path_data (str) – Path towards data used in the model
path_precalc_inp (str) – Path for precalculated input data (calibrated parameters)
path_precalc_transp (str) – Path for precalculated transport inputs (intermediate outputs from commuting choice model)
options (dict) – Dictionary of default options

Returns

incomeCentersKeep (ndarray(float64, ndim=2)) – Calibrated average annual household income (including unemployment) for each income group (4), per grid cell (24,014)
lambdaKeep (float64) – Calibrated gravity parameter from the commuting choice model
cal_avg_income (ndarray(float64)) – Overall calibrated average income across income groups (4), for validation only
scoreKeep (ndarray(float64)) – Mean ratio of simulated over observed number of commuters per job center (185), collapsed at the income-group level (4): captures the error over our calibrated parameters
bhattacharyyaDistances (ndarray(float64)) – Bhattacharyya distances (measure the similarity of two probability distributions) between the calculated distribution of commuting distances and aggregates from the Transport Survey, for each scanned value of the gravity parameter. This is used as an auxiliary measure to pin down a unique gravity parameter (and associated matrix of incomes), and is only given as an output of the function for reference.
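
For reference, the Bhattacharyya distance between two discrete distributions p and q is -ln(Σ √(p·q)). The sketch below computes it for a few candidate distributions and picks the closest one, which mirrors the selection among scanned gravity parameters only in spirit. The helper name, the distance bins, and all values are made up.

```python
import numpy as np


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Normalize so that both inputs sum to one
    p = p / p.sum()
    q = q / q.sum()
    # Bhattacharyya coefficient, equal to 1 for identical distributions
    coefficient = np.sum(np.sqrt(p * q))
    return -np.log(coefficient)


# Made-up shares of commuters per distance bin
observed = np.array([0.5, 0.3, 0.2])
# One simulated distribution per scanned parameter value (made up)
simulated = np.array([[0.2, 0.3, 0.5],
                      [0.5, 0.3, 0.2],
                      [0.4, 0.4, 0.2]])

distances = np.array(
    [bhattacharyya_distance(sim, observed) for sim in simulated]
)
# Keep the candidate whose distribution is closest to the observed one
best_index = int(np.argmin(distances))
```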

calibration.calib_main_func.estim_util_func_param(data_number_formal, data_income_group, housing_types_sp, data_sp, coeff_a, coeff_b, coeffKappa, interest_rate, incomeNetOfCommuting, path_data, path_precalc_inp, options, param)[source]

Calibrate utility function parameters.

This function leverages the following modules: import_amenities, estimate_parameters_by_scanning, and estimate_parameters_by_optimization. As before, we use partial relations coming from our general equilibrium structure (see technical documentation for math formulas). This time, we look at the utility function parameters that maximize a composite likelihood function for the fit on observed amenities, dwelling sizes, and population sorting by income (see calibration.sub.loglikelihood module). We proceed first by scanning over a discrete range of parameter values, then by running a smooth solver taking outputs from scanning as initial values.

Parameters

data_number_formal (Series) – Number of formal private housing units considered for each Small Place (1,046)
data_income_group (ndarray(float64)) – Categorical variable indicating, for each Small Place (1,046), the dominant income group (from 0 to 3)
housing_types_sp (DataFrame) – Table yielding, for each Small Place (1,046), the number of informal backyards, of informal settlements, and total dwelling units, as well as their (centroid) x and y coordinates
data_sp (DataFrame) – Table yielding, for each Small Place (1,046), the average dwelling size (in m²), the average land price and annual income level (in rands), the size of unconstrained area for construction (in m²), the total area (in km²), the distance to the city centre (in km), whether or not the location belongs to Mitchells Plain, and the SP code
coeff_a (float64) – Calibrated land elasticity in housing production function
coeff_b (float64) – Calibrated capital elasticity in housing production function
coeffKappa (float64) – Calibrated scale factor in housing production function
interest_rate (float64) – Interest rate for the overall economy, corresponding to an average over past years
incomeNetOfCommuting (ndarray(float64, ndim=2)) – Expected annual income net of commuting costs (in rands, for one household), for each geographic unit, by income group (4)
path_data (str) – Path towards data used in the model
path_precalc_inp (str) – Path for precalculated input data (calibrated parameters)
options (dict) – Dictionary of default options
param (dict) – Dictionary of default parameters

Returns

calibratedUtility_beta (float64) – Calibrated surplus housing elasticity in households’ utility function
calibratedUtility_q0 (float64) – Parametric basic need in housing (in m²). This parameter is not an output of the calibration per se, as it is exogenously set. It is included here for reference, as it enters the households’ utility function and is an input to the optimization programme. It could be optimized over (as in Pfeiffer et al.), but only within a narrow range of values to preserve feasibility of allocations.
cal_amenities (ndarray(float64)) – Calibrated amenity index for each grid cell (24,014): this is not normalized yet.

Calibrate income net of commuting costs and gravity parameter

calibration.sub.compute_income.EstimateIncome(param, timeOutput, distanceOutput, monetaryCost, costTime, job_centers, average_income, income_distribution, list_lambda, options)[source]

Estimate incomes per job center and number of commuters per distance bin.

This function leverages the commutingSolve() function to iterate over income values until the target number of commuters for each job center is reached. This makes it possible to compute incomes per job center (and income group) and to find the split of commuters across distance-from-CBD brackets, for several values of the gravity parameter (from the commuting choice model).

Parameters

param (dict) – Dictionary of default parameters
timeOutput (ndarray(float64, ndim=3)) – Duration (in min) of a round trip for each transport mode (5) between each selected geographic unit and each selected job center (185) centroids
distanceOutput (ndarray(float64, ndim=3)) – Distance (in km) of a one-way trip for each transport mode (5) between each selected geographic unit and each selected job center (185) centroids. Note that distances actually do not change across transport modes.
monetaryCost (ndarray(float64, ndim=3)) – Annual monetary cost (in rands) of a round trip for each transport mode (5) between each selected geographic unit and each selected job center (185) centroids
costTime (ndarray(float64, ndim=3)) – Daily share of working time spent commuting for each transport mode (5) between each selected geographic unit and each selected job center (185) centroids. This will be multiplied by expected income to get the opportunity cost of time.
job_centers (ndarray(float64, ndim=2)) – Number of jobs in each selected job center (185) per income group (4). Remember that we rescale the number of individual jobs to reflect total household employment, as our income and population data are for households only: one job basically provides employment for two people. This simplification allows us to model households as a single representative agent and to abstract from a two-body problem. Empirically, this holds on aggregate as households’ position on the labor market is often determined by one household head.
average_income (ndarray(float64)) – Average median income for each income group in the model (4)
income_distribution (ndarray(uint16, ndim=2)) – Exogenous number of households in each Small Place (1,046) for each income group in the model (4)
list_lambda (ndarray(float64)) – List of values over which to scan for the gravity parameter used in the commuting choice model
options (dict) – Dictionary of default options

Returns

incomeCentersSave (ndarray(float64, ndim=3)) – Calibrated annual household income (in rands) for each income group (4) and each selected job center (185), for each scanned value of the gravity parameter
distanceDistribution (ndarray(float64, ndim=2)) – Share of residence-workplace distances in each 5-km from CBD bracket for each scanned value of the gravity parameter
scoreMatrix (ndarray(float64, ndim=2)) – Ratio of simulated over observed (rescaled) number of jobs for each income group (4) and each scanned value of the gravity parameter: defines an error metric for the quality of our calibration
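
Example (illustrative sketch, not the package's implementation): the idea of iterating over incomes until the simulated number of commuters matches the target in each job center can be illustrated with a toy logit commuting model. Everything below is an assumption made for the illustration: the function names, the utility specification lam * (income - cost), the log-ratio update rule, and the data.

```python
import numpy as np


def simulate_jobs(income, cost, pop_residence, lam):
    """Simulated number of commuters per job center in a toy logit model."""
    utility = lam * (income[None, :] - cost)
    # Subtract the row maximum before exponentiating, for numerical stability
    utility = utility - utility.max(axis=1, keepdims=True)
    prob = np.exp(utility)
    prob /= prob.sum(axis=1, keepdims=True)
    return pop_residence @ prob


def calibrate_incomes(target_jobs, cost, pop_residence, lam,
                      tol=1e-10, max_iter=1000):
    """Raise (lower) incomes where simulated commuters fall short (overshoot)."""
    income = np.full(target_jobs.shape[0], 10.0)  # arbitrary starting level
    for _ in range(max_iter):
        simulated = simulate_jobs(income, cost, pop_residence, lam)
        if np.max(np.abs(simulated / target_jobs - 1.0)) < tol:
            break
        # Hypothetical update rule: log-ratio step scaled by 1 / lam
        income = income + np.log(target_jobs / simulated) / lam
    return income, simulate_jobs(income, cost, pop_residence, lam)


# Hypothetical data: 3 residential locations, 2 job centers
pop_residence = np.array([100.0, 200.0, 150.0])
cost = np.array([[1.0, 4.0], [3.0, 2.0], [5.0, 1.0]])
target_jobs = np.array([180.0, 270.0])  # sums to the total population

income, simulated_jobs = calibrate_incomes(
    target_jobs, cost, pop_residence, lam=0.5
)
```

In this toy setting only income differences across job centers are pinned down, so the starting level is arbitrary. Repeating the procedure for several values of lam mimics the scan over list_lambda.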

calibration.sub.compute_income.commutingSolve(incomeCentersTemp, averageIncomeGroup, popCenters, popResidence, monetaryCost, costTime, param_lambda, householdSize, whichCenters, bracketsDistance, distanceOutput, options)[source]

Compute error and distribution for job allocation simulated from incomes.

This function leverages the compute_ODflows() function to compute a theoretical equivalent to number of jobs in each (narrowly) selected job center, and the distribution of those jobs across pre-defined distance-from-CBD brackets.

Parameters

incomeCentersTemp (ndarray(float64)) – Hourly household income (in rands) for some income group and some iteration, per selected job center
averageIncomeGroup (float64) – Average hourly median income (from data) for some income group
popCenters (ndarray(float64)) – Number of jobs in each (narrowly) selected job center for some income group. Remember that we further restrict the set of selected job centers when calibrating incomes per income group for the sake of numerical simplicity. Also remember that we rescale the number of individual jobs to reflect total household employment, as our income and population data are for households only: one job basically provides employment for two people. This simplification allows us to model households as a single representative agent and to abstract from a two-body problem. Empirically, this holds on aggregate as households’ position on the labor market is often determined by one household head.
popResidence (ndarray(float64)) – Number of households living in each SP (1046), per income group (4): comes from census data and does not include people not working
monetaryCost (ndarray(float64, ndim=3)) – Hourly monetary cost (in rands) of a round trip for each transport mode (5) between each selected geographic unit and each selected job center (185) centroids
costTime (ndarray(float64, ndim=3)) – Daily share of working time spent commuting for each transport mode (5) between each selected geographic unit and each selected job center (185) centroids. This will be multiplied by expected income to get the opportunity cost of time.
param_lambda (float64) – Tested value for the gravity parameter used in the commuting choice model
householdSize (float) – Average number of employed workers per household for one income class (corresponds to 2 times the employment rate)
whichCenters (ndarray(bool)) – Dummy variable allowing the narrow selection of job centers used for income calibration for a given income group, among pre-selected job centers (185)
bracketsDistance (ndarray(int32)) – Array of floor values for distance brackets used to select the set of incomes + gravity parameter that best fit the distribution of residence-workplace distances according to those brackets
distanceOutput (ndarray(float64, ndim=3)) – Distance (in km) of a one-way trip for each transport mode (5) between each selected geographic unit and each selected job center (185) centroids. Note that distances actually do not change across transport modes.
options (dict) – Dictionary of default options

Returns

score (ndarray(float64)) – Difference between observed and simulated population working in each (narrowly) selected job center (for a given income group)
nbCommuters (ndarray(float64)) – Number of commuters in each pre-defined distance bracket, for a given income group

calibration.sub.compute_income.compute_ODflows(householdSize, monetaryCost, costTime, incomeCentersFull, whichCenters, param_lambda, options)[source]

Compute probability distribution and transport costs of commuting pairs.

This function applies theoretical formulas from the commuting choice model, for a given income group and a given gravity parameter (see math appendix for more details).

Parameters

householdSize (float) – Average number of employed workers per household for one income class (corresponds to 2 times the employment rate)
monetaryCost (ndarray(float64, ndim=3)) – Hourly monetary cost (in rands) of a round trip for each transport mode (5) between each selected geographic unit and each selected job center (185) centroids
costTime (ndarray(float64, ndim=3)) – Daily share of working time spent commuting for each transport mode (5) between each selected geographic unit and each selected job center (185) centroids. This will be multiplied by expected income to get the opportunity cost of time.
incomeCentersFull (ndarray(float64)) – Hourly household income (in rands) for some income group and some iteration, per selected job center, rescaled to match income distribution across the overall population
whichCenters (ndarray(bool)) – Dummy variable allowing the narrow selection of job centers used for income calibration for a given income group, among pre-selected job centers (185)
param_lambda (float64) – Tested value for the gravity parameter used in the commuting choice model

Returns

transportCostModes (ndarray(float64, ndim=3)) – Expected (hourly) total commuting cost (in rands) for one agent, for all (narrowly) selected job centers, all SPs of residence (1046), and all transport modes (5)
transportCost (ndarray(float64, ndim=2)) – Expected (hourly) total commuting cost (in rands) for one agent, for all (narrowly) selected job centers and all SPs of residence (1046): corresponds to min(transportCostModes) across transport modes.
ODflows (ndarray(float64, ndim=2)) – Probability, for a given income group, to work in each (narrowly) selected job center for each potential SP of residence (1046)
valueMax (ndarray(float64, ndim=2)) – Numerical parameter for each commuting pair that prevents logarithms and exponentials used in calculations from diverging towards infinity: it corresponds to the maximum argument for exponentials that appear in the formula for transportCost: this is not used as an input in other functions and is just included here for reference
minIncome (ndarray(float64)) – Numerical parameter for each SP of residence that prevents logarithms and exponentials used in calculations from diverging towards infinity: it corresponds to minus the minimum argument for exponentials that appear in the formula for ODflows: this is not used as an input in other functions and is just included here for reference
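
Example (illustrative sketch): valueMax and minIncome are described above as numerical parameters that keep exponentials and logarithms from diverging. The generic technique behind this kind of safeguard is to shift the arguments of the exponentials by their maximum before exponentiating (the log-sum-exp trick). The function names and test values below are hypothetical and do not reproduce the package's formulas.

```python
import numpy as np


def stable_logsumexp(x, axis=-1):
    """Return log(sum(exp(x))) along `axis` without overflow."""
    value_max = np.max(x, axis=axis, keepdims=True)
    shifted = np.exp(x - value_max)  # all arguments are now <= 0
    return np.squeeze(value_max, axis=axis) + np.log(np.sum(shifted, axis=axis))


def choice_probabilities(x, axis=-1):
    """Logit probabilities computed from shifted exponentials."""
    value_max = np.max(x, axis=axis, keepdims=True)
    shifted = np.exp(x - value_max)
    return shifted / np.sum(shifted, axis=axis, keepdims=True)


# Hypothetical arguments: the first row overflows if exponentiated directly
scores = np.array([[1000.0, 1001.0], [0.0, 0.0]])

with np.errstate(over="ignore"):
    naive = np.log(np.sum(np.exp(scores), axis=1))  # first entry is inf

stable = stable_logsumexp(scores, axis=1)
probabilities = choice_probabilities(scores, axis=1)
```

The shift cancels out analytically, so the stable and naive versions agree whenever the naive one is finite.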

calibration.sub.compute_income.import_transport_costs(grid, param, yearTraffic, households_per_income_class, spline_inflation, spline_fuel, spline_population_income_distribution, spline_income_distribution, path_precalc_inp, path_precalc_transp, dim, options)[source]

Compute monetary and time costs from commuting for some given year.

This function leverages the CoCT’s EMME/2 transport model for inputs.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
param (dict) – Dictionary of default parameters
yearTraffic (int) – Year for which we want to run the function
households_per_income_class (ndarray(float64)) – Exogenous total number of households per income group (excluding people out of employment, for 4 groups)
spline_inflation (interp1d) – Linear interpolation for inflation rate (in base 100 relative to baseline year) over the years (baseline year set at 0)
spline_fuel (interp1d) – Linear interpolation for fuel price (in rands per km) over the years (baseline year set at 0)
spline_population_income_distribution (interp1d) – Linear interpolation for total population per income group in the data (12) over the years (baseline year set at 0)
spline_income_distribution (interp1d) – Linear interpolation for median annual income (in rands) per income group in the data (12) over the years (baseline year set at 0)
path_precalc_inp (str) – Path for precalculated input data (calibrated parameters)
path_precalc_transp (str) – Path for precalculated transport inputs (intermediate outputs from commuting choice model)
dim (str) – Geographic level of analysis at which we want to run the commuting choice model: should be set to “GRID” or “SP”
options (dict) – Dictionary of default options

Returns

timeOutput (ndarray(float64, ndim=3)) – Duration (in min) of a round trip for each transport mode (5) between each selected geographic unit and each selected job center (185) centroids
distanceOutput (ndarray(float64, ndim=3)) – Distance (in km) of a one-way trip for each transport mode (5) between each selected geographic unit and each selected job center (185) centroids. Note that distances actually do not change across transport modes.
monetaryCost (ndarray(float64, ndim=3)) – Annual monetary cost (in rands) of a round trip for each transport mode (5) between each selected geographic unit and each selected job center (185) centroids
costTime (ndarray(float64, ndim=3)) – Daily share of working time spent commuting for each transport mode (5) between each selected geographic unit and each selected job center (185) centroids. This will be multiplied by expected income to get the opportunity cost of time.
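
Example (illustrative sketch): the time cost is obtained by multiplying costTime by expected income, and can then be added to monetaryCost. The snippet below shows this with NumPy broadcasting. The axis order (geographic unit, job center, transport mode), the array sizes, and all values are assumptions made for the illustration.

```python
import numpy as np

# Hypothetical shapes: (2 geographic units, 3 job centers, 2 transport modes)
monetary_cost = np.full((2, 3, 2), 1000.0)  # annual cost, in rands
monetary_cost[:, :, 1] = 4000.0             # second mode is more expensive
cost_time = np.full((2, 3, 2), 0.10)        # share of working time lost
cost_time[:, :, 1] = 0.05                   # second mode is faster

# Hypothetical expected annual income (in rands) per job center
income_centers = np.array([50000.0, 80000.0, 120000.0])

# Opportunity cost of time: share of working time times expected income,
# broadcasting income along the job-center axis
time_cost = cost_time * income_centers[None, :, None]
total_cost = monetary_cost + time_cost

# Cheapest transport mode for each (unit, job center) pair
cheapest_mode = np.argmin(total_cost, axis=2)
```

With these numbers, the faster but more expensive mode becomes the cheapest option once income (and hence the value of time) is high enough.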

Estimate utility function parameters by optimization

calibration.sub.estimate_parameters_by_optimization.EstimateParametersByOptimization(incomeNetOfCommuting, dataRent, dataDwellingSize, dataIncomeGroup, selectedSP, tableAmenities, variablesRegression, initRho, initBeta, initBasicQ, initUti2, initUti3, initUti4, options)[source]

Estimate parameters by maximizing log likelihood via gradient descent.

This function runs an interior-point algorithm over values of endogenous parameters entering households’ utility function, and returns those which maximize a composite log-likelihood. The latter is defined as the sum of log-likelihoods measuring the fit of the model on several observed data moments, namely dwelling sizes, exogenous amenities, and income sorting. To compute those separate log-likelihoods, this function calls on the calibration.sub.loglikelihood module after providing sample and variable selection for the estimation, as well as theoretical partial relations from the structure of the model, used in regressions. It uses the initial guess on parameter values from the estimate_parameters_by_scanning module to start within the set of feasible solutions, and to converge towards an interior (as opposed to corner) solution. The algorithm only converges when considering the fit along all dimensions (not only exogenous amenities). This function is therefore not appropriate when we want to focus calibration on the amenity score only.

Parameters

incomeNetOfCommuting (ndarray(float64, ndim=2)) – Expected annual income net of commuting costs (in rands, for one household), for SP (1,046), by income group (4)
dataRent (Series) – Theoretical average annual rent for formal private housing, computed from data on average land prices, for each SP (1,046)
dataDwellingSize (Series) – Average dwelling size (in m²) for each SP (1,046), from SP data
dataIncomeGroup (ndarray(float64)) – Categorical variable indicating, for each Small Place (1,046), the dominant income group (from 0 to 3)
selectedSP (Series) – Dummy variable used to select SPs (1,046) with enough formal private housing to identify the regressions used in the function (less stringent than selectedDensity)
tableAmenities (DataFrame) – Table yielding, for each selected geographic unit, a set of dummy variables corresponding to available exogenous amenities
variablesRegression (list) – List containing labels for exogenous amenity variables used in the final regressions
initRho (int) – Spatial autocorrelation parameter (not used in practice, as is set to zero)
initBeta (float64) – Initial guess for the surplus housing elasticity from households’ utility function: comes from the estimate_parameters_by_scanning module
initBasicQ (float64) – Initial guess for the basic need in housing (in m²) from households’ utility function: comes from the estimate_parameters_by_scanning module
initUti2 (int32) – Target utility level of the second poorest income group
initUti3 (float64) – Initial guess for the utility level of the second richest income group: comes from the estimate_parameters_by_scanning module
initUti4 (float64) – Initial guess for the utility level of the richest income group: comes from the estimate_parameters_by_scanning module
options (dict) – Dictionary of default options

Returns

parameters (ndarray(float64)) – Vector of “calibrated parameters”. In the order: surplus housing elasticity, basic need in housing, and utility levels for income groups 3 and 4
scoreTot (float64) – Maximum value of composite log-likelihood for the fit on observed amenities, dwelling sizes, and population sorting by income: provides a performance metric for the calibration process
parametersAmenities (ndarray(float64)) – List of estimates for the impact of exogenous (dummy) amenities on the calibrated amenity index (in log-form). In the order: intercept, distance to the ocean <2km, distance to the ocean between 2 and 4km, slope between 1 and 5%, slope >5%, being located within the airport cone, distance to district parks <2km, distance to biosphere reserve <2km, distance to train station <2km, distance to urban heritage site <2km
modelAmenities (regression.linear_model.RegressionResultsWrapper) – Object summarizing the results of the log-regressions of the theoretical amenity index over observed exogenous amenity dummies
parametersHousing (int) – List of estimates related to the fit of the model on building density / housing supply: this is not included in this version of the model (vector is set equal to zero) as we already exploit this relation to estimate parameters of the construction function (compared to other versions where we use construction costs)
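
Example (illustrative sketch, not the package's solver): refining an initial guess with a bounded smooth optimizer can be illustrated with scipy.optimize.minimize. Note that the snippet uses the L-BFGS-B method with box bounds as a stand-in, not the interior-point algorithm mentioned above. The toy dwelling-size relation, the data, the bounds, and the objective (sum of squared log residuals in place of the composite log-likelihood) are all hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical ratio of income to rent (in m²) for five locations
ratio = np.array([20.0, 40.0, 60.0, 80.0, 100.0])


def dwelling_size(beta, q0, ratio):
    """Toy relation, assumed for this illustration only."""
    return beta * ratio + (1.0 - beta) * q0


# Synthetic "observed" data generated from known parameter values
observed = dwelling_size(0.25, 4.0, ratio)


def negative_score(x):
    """Sum of squared log residuals (to be minimized)."""
    beta, q0 = x
    residuals = np.log(dwelling_size(beta, q0, ratio)) - np.log(observed)
    return np.sum(residuals ** 2)


# Initial guess, e.g. as would be obtained from a prior scan over a grid
initial_guess = np.array([0.20, 5.0])
bounds = [(0.05, 0.95), (1.0, 10.0)]  # keep parameters in a feasible range

result = minimize(negative_score, initial_guess, method="L-BFGS-B", bounds=bounds)
beta_hat, q0_hat = result.x
```

Starting from a reasonable initial guess inside the bounds helps the solver reach an interior solution instead of stopping at a bound.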

Estimate utility function parameters by scanning

calibration.sub.estimate_parameters_by_scanning.EstimateParametersByScanning(incomeNetOfCommuting, dataRent, dataDwellingSize, dataIncomeGroup, selectedSP, tableAmenities, variablesRegression, initRho, listBeta, listBasicQ, initUti2, listUti3, listUti4, options)[source]

Estimate parameters by maximizing log likelihood over scanned values.

This function scans over values of endogenous parameters entering households’ utility function, and returns those which maximize a composite log-likelihood. The latter is defined as the sum of log-likelihoods measuring the fit of the model on several observed data moments, namely dwelling sizes, exogenous amenities, and income sorting. To compute those separate log-likelihoods, this function calls on the calibration.sub.loglikelihood module after providing sample and variable selection for the estimation, as well as theoretical partial relations from the structure of the model, used in regressions. The purpose of this function is to provide an informed guess on initial values for the estimate_parameters_by_optimization module to run a proper gradient descent optimization, by starting within the set of interior feasible solutions. By default, we only fit the data moment on exogenous amenities (and forget about the other dimensions of the composite likelihood) when beta and q0 are pinned down, as it is the only relevant moment for the amenity score calibration. The other dimensions can be recovered by commenting out the cancelling out terms at the end of the script, if we want to recalibrate beta and q0 internally (and potentially improve the fit of the model at the price of empirical validity).

Parameters

incomeNetOfCommuting (ndarray(float64, ndim=2)) – Expected annual income net of commuting costs (in rands, for one household), for SP (1,046), by income group (4)
dataRent (Series) – Theoretical average annual rent for formal private housing, computed from data on average land prices, for each SP (1,046)
dataDwellingSize (Series) – Average dwelling size (in m²) for each SP (1,046), from SP data
dataIncomeGroup (ndarray(float64)) – Categorical variable indicating, for each Small Place (1,046), the dominant income group (from 0 to 3)
selectedSP (Series) – Dummy variable used to select SPs (1,046) with enough formal private housing to identify the regressions used in the function (less stringent than selectedDensity)
tableAmenities (DataFrame) – Table yielding, for each selected geographic unit, a set of dummy variables corresponding to available exogenous amenities
variablesRegression (list) – List containing labels for exogenous amenity variables used in the final regressions
initRho (int) – Spatial autocorrelation parameter (not used in practice, as is set to zero)
listBeta (ndarray(float64)) – List of values over which to scan for the surplus housing elasticity parameter from households’ utility function
listBasicQ (ndarray(float64)) – List of values over which to scan for the basic need in housing (in m²) from households’ utility function
initUti2 (int32) – Target utility level of the second poorest income group
listUti3 (ndarray(float64)) – List of values over which to scan for the utility level of the second richest income group
listUti4 (ndarray(float64)) – List of values over which to scan for the utility level of the richest income group
options (dict) – Dictionary of default options

Returns

parameters (ndarray(float64)) – Vector of “calibrated parameters”. In the order: surplus housing elasticity, basic need in housing, and utility levels for income groups 3 and 4
scoreTot (float64) – Maximum value of composite log-likelihood for the fit on observed amenities, dwelling sizes, and population sorting by income: provides a performance metric for the calibration process
parametersAmenities (ndarray(float64)) – List of estimates for the impact of exogenous (dummy) amenities on the calibrated amenity index (in log-form). In the order: intercept, distance to the ocean <2km, distance to the ocean between 2 and 4km, slope between 1 and 5%, slope >5%, being located within the airport cone, distance to district parks <2km, distance to biosphere reserve <2km, distance to train station <2km, distance to urban heritage site <2km
modelAmenities (regression.linear_model.RegressionResultsWrapper) – Object summarizing the results of the log-regressions of the theoretical amenity index over observed exogenous amenity dummies
parametersHousing (int) – List of estimates related to the fit of the model on building density / housing supply: this is not included in this version of the model (vector is set equal to zero) as we already exploit this relation to estimate parameters of the construction function (compared to other versions where we use construction costs)
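
Example (illustrative sketch, not the package's code): scanning over a discrete grid of parameter values and keeping the combination with the highest score can be written in a few lines. The toy dwelling-size relation, the data, the grids, and the score (minus the sum of squared log residuals, standing in for the composite log-likelihood) are all hypothetical.

```python
import itertools

import numpy as np

# Hypothetical ratio of income to rent (in m²) for five locations
ratio = np.array([20.0, 40.0, 60.0, 80.0, 100.0])


def dwelling_size(beta, q0, ratio):
    """Toy relation, assumed for this illustration only."""
    return beta * ratio + (1.0 - beta) * q0


# Synthetic "observed" data generated from known parameter values
observed = dwelling_size(0.25, 4.0, ratio)


def score(beta, q0):
    """Higher is better: minus the sum of squared log residuals."""
    residuals = np.log(dwelling_size(beta, q0, ratio)) - np.log(observed)
    return -np.sum(residuals ** 2)


# Hypothetical grids of values to scan
list_beta = np.linspace(0.10, 0.50, 9)
list_q0 = np.linspace(2.0, 6.0, 9)

# Evaluate the score for every combination and keep the best one
candidates = list(itertools.product(list_beta, list_q0))
scores = np.array([score(beta, q0) for beta, q0 in candidates])
best_beta, best_q0 = candidates[int(np.argmax(scores))]
```

The best grid point can then serve as the initial guess for a smooth solver.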

Import exogenous amenities (for amenity index)

calibration.sub.import_amenities.import_exog_amenities(path_data, path_precalc_inp, dim)[source]

Import relevant amenity data at SP level.

Parameters

path_data (str) – Path towards data used in the model
path_precalc_inp (str) – Path for precalculated input data (calibrated parameters)
dim (str) – Geographic level of analysis at which we want to run the commuting choice model: should be set to “grid” or “SP”

Returns

table_amenities – Table yielding, for each selected geographic unit, a set of dummy variables corresponding to available exogenous amenities

Return type

DataFrame

Import employment data (for income and gravity)

calibration.sub.import_employment_data.import_employment_data(households_per_income_class, param, path_data)[source]

Import number of jobs per income group in each selected employment center.

Parameters

households_per_income_class (ndarray(float64)) – Exogenous total number of households per income group (excluding people out of employment, for 4 groups)
param (dict) – Dictionary of default parameters
path_data (str) – Path towards data used in the model

Returns

jobsCentersNGroupRescaled – Number of jobs in each selected job center (185) per income group (4). Remember that we rescale the number of individual jobs to reflect total household employment, as our income and population data are for households only: one job basically provides employment for two people. This simplification allows us to model households as a single representative agent and to abstract from a two-body problem. Empirically, this holds on aggregate as households’ position on the labor market is often determined by one household head.

Return type

ndarray(float64, ndim=2)
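
Example (illustrative sketch): selecting employment centers and rescaling jobs so that totals per income group match the number of households can be expressed with a boolean mask and a column-wise rescaling. The job counts, the threshold, and the selection rule (total jobs above the threshold) are hypothetical and may differ from what the package does.

```python
import numpy as np

# Hypothetical number of jobs per zone (rows) and income group (columns)
jobs = np.array([
    [500.0, 300.0, 200.0, 100.0],
    [50.0, 20.0, 10.0, 5.0],
    [400.0, 600.0, 300.0, 200.0],
    [100.0, 100.0, 500.0, 700.0],
])

# Hypothetical total number of households per income group
households_per_income_class = np.array([2000.0, 2500.0, 1500.0, 1200.0])

# Keep only zones whose total number of jobs exceeds a (hypothetical) threshold
threshold = 1000
selected_centers = jobs.sum(axis=1) > threshold
jobs_selected = jobs[selected_centers]

# Rescale each income group so that jobs sum to the number of households
jobs_rescaled = (
    jobs_selected * households_per_income_class / jobs_selected.sum(axis=0)
)
```

After rescaling, the column totals equal households_per_income_class, so job counts and household counts are expressed in the same units.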

Log-likelihood (for utility function parameters)

calibration.sub.loglikelihood.LogLikelihoodModel(X0, Uo2, net_income, groupLivingSpMatrix, dataDwellingSize, selectedDwellingSize, dataRent, selectedRents, predictorsAmenitiesMatrix, tableRegression, variables_regression, CalculateDwellingSize, ComputeLogLikelihood, optionRegression, options)[source]

Compute composite log-likelihood for model fit given scanned parameters.

This function computes three separate log-likelihoods that capture the fit of the model along distinct data moments. Then, it sums them to give a composite log-likelihood that will be maximized to calibrate the utility function parameters (as part of scanning or smooth optimization process). More precisely, it starts by regressing theoretical values of the log amenity index on observed exogenous dummy amenity variables. The first log-likelihood is computed based on the value of residuals. Then, it defines a second log-likelihood for the fit on income sorting (matching the observed dominant income group with the highest bidding income group). Finally, it gets the residuals from the log-difference between theoretical and observed dwelling sizes, and computes the third log-likelihood from there.

Parameters

X0 (ndarray(float64)) – Set of inputs (namely, surplus housing elasticity, basic need in housing, utility levels for income groups 3 and 4) tested for a given iteration
Uo2 (int32) – Target utility level for income group 2
net_income (ndarray(float64, ndim=2)) – Expected annual income net of commuting costs (in rands, for one household), for SP (1,046), by income group excluding the poorest (3)
groupLivingSpMatrix (ndarray(bool, ndim=2)) – Dummy variable indicating, for each income group excluding the poorest (3) whether it is dominant in each SP (1,046)
dataDwellingSize (Series) – Average dwelling size (in m²) for each SP (1,046), from SP data
selectedDwellingSize (Series) – Dummy variable indicating, for each SP (1,046), whether it is selected into the sample used when regressing observed dwelling sizes on their theoretical equivalent
dataRent (Series) – Theoretical average annual rent for formal private housing, computed from data on average land prices, for each SP (1,046)
selectedRents (Series) – Dummy variable indicating, for each SP (1,046), whether it is selected into the sample used when estimating the discrete choice logit model associated with income sorting (identifying observed rents with highest bid-rents from the dominant income group)
predictorsAmenitiesMatrix (ndarray(float64, ndim=2)) – Values of selected exogenous dummy amenity variables (10, including the intercept) in each selected SP (according to selectedRents)
tableRegression (DataFrame) – Values of all exogenous dummy amenity variables (16, excluding the intercept) in each selected SP (according to selectedRents)
variables_regression (list) – List of labels for selected exogenous dummy amenity variables (9, excluding the intercept)
CalculateDwellingSize (function) – Function defining the relationship between rents and dwelling sizes in the formal private sector (see technical documentation)
ComputeLogLikelihood (function) – Log-likelihood function for a lognormal law of mean 0: we will assume that dwelling size and amenity residuals follow such a law
optionRegression (int) – Option to run GLM (instead of OLS) regression for the estimation of exogenous amenity estimates: default is set as zero as GLM is unstable
options (dict) – Dictionary of default options

Returns

scoreTotal (float64) – Value of the composite log-likelihood for the set of parameters scanned
scoreAmenities (float64) – Value of the log-likelihood for the fit on exogenous amenities
scoreDwellingSize (float64) – Value of the log-likelihood for the fit on observed dwelling sizes
scoreIncomeSorting (float64) – Value of the log-likelihood for the fit on observed income sorting (matching observed rents to highest bid-rents from dominant income group)
scoreHousing (float64) – Value of the log-likelihood for the fit on observed housing supply / building density: this is not used in this version of the model as the relation is already used for the calibration of construction function parameters (hence is set equal to zero)
parametersAmenities (ndarray(float64)) – List of estimates for the impact of exogenous (dummy) amenities on the calibrated amenity index (in log-form). In the order: distance to the ocean <2km, distance to the ocean between 2 and 4km, slope between 1 and 5%, slope >5%, being located within the airport cone, distance to district parks <2km, distance to biosphere reserve <2km, distance to train station <2km, distance to urban heritage site <2km
modelAmenities (regression.linear_model.RegressionResultsWrapper) – Object summarizing the results of the log-regressions of the theoretical amenity index over observed exogenous amenity dummies
parametersHousing (int) – List of estimates related to the fit of the model on building density / housing supply: this is not included in this version of the model (vector is set equal to zero) as we already exploit this relation to estimate parameters of the construction function (compared to other versions where we use construction costs)
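
Example (illustrative sketch, not the package's code): a composite log-likelihood can be built by summing separate log-likelihoods computed from regression residuals. The snippet assumes that log residuals are Gaussian with mean zero, regresses a hypothetical log amenity index on one dummy variable by OLS, and adds a second term for log differences in dwelling sizes. The income sorting term is omitted, and all names and values are hypothetical.

```python
import numpy as np


def log_likelihood(sigma, error):
    """Gaussian log-likelihood of zero-mean residuals with std `sigma`."""
    return np.sum(
        -0.5 * np.log(2 * np.pi * sigma ** 2) - error ** 2 / (2 * sigma ** 2)
    )


# Amenity term: OLS of a hypothetical log amenity index on one dummy variable
dummy = np.array([0.0, 0.0, 1.0, 1.0])
predictors = np.column_stack([np.ones_like(dummy), dummy])  # with intercept
log_amenity = np.array([0.0, 0.2, 0.5, 0.7])
coefficients, *_ = np.linalg.lstsq(predictors, log_amenity, rcond=None)
residuals_amenity = log_amenity - predictors @ coefficients
sigma_amenity = np.sqrt(np.mean(residuals_amenity ** 2))
score_amenities = log_likelihood(sigma_amenity, residuals_amenity)

# Dwelling size term: log difference between observed and theoretical sizes
observed_size = np.array([50.0, 80.0, 120.0, 60.0])
theoretical_size = np.array([55.0, 75.0, 110.0, 66.0])
residuals_size = np.log(observed_size) - np.log(theoretical_size)
sigma_size = np.sqrt(np.mean(residuals_size ** 2))
score_dwelling_size = log_likelihood(sigma_size, residuals_size)

# Composite log-likelihood: sum of the separate terms
score_total = score_amenities + score_dwelling_size
```

Each term measures the fit along one data moment, so dropping a term from the sum amounts to ignoring that moment in the calibration.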

Outputs

Process and display general values

outputs.export_outputs.export_map(value, grid, geo_grid, path_plots, export_name, title, path_tables, ubnd, lbnd=0, cmap='Reds')[source]

Generate 2D heat maps of any spatial input.

Parameters

value (ndarray) – Any one-dimensional array with values given at grid level
grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
geo_grid (GeoDataFrame) – Data frame with geometry for the analysis grid (24,014 points)
path_plots (str) – Path for saving output plots
export_name (str) – Name given to saved output file
title (str) – Title given to output plot
path_tables (str) – Path for saving output tables
ubnd (float64) – Upper bound for plotted values
lbnd (float64, optional) – Lower bound for plotted values. The default is 0.
cmap (str, optional) – Type of choropleth map to be plotted (see pyplot options). The default is ‘Reds’.

Returns

gdf – Data frame with geolocalized observations from input array

Return type

GeoDataFrame

outputs.export_outputs.from_df_to_gdf(array, geo_grid)[source]

Convert map array/series inputs into grid-level GeoDataFrames.

Parameters

array (ndarray) – Any array with a grid geographic dimension
geo_grid (GeoDataFrame) – Data frame with geometry for the analysis grid (24,014 points)

Returns

gdf – Data frame with geolocalized observations from input array

Return type

GeoDataFrame

outputs.export_outputs.import_employment_geodata(households_per_income_class, param, path_data)[source]

Import number of jobs per selected employment center.

Parameters

households_per_income_class (ndarray(float64)) – Exogenous total number of households per income group (excluding people out of employment, for 4 groups)
param (dict) – Dictionary of default parameters
path_data (str) – Path towards data used in the model

Returns

jobsTable (DataFrame) – Number of jobs in each selected job center (185) per income group (4), with x and y coordinates
selected_centers (ndarray(bool)) – Array of dummies for selecting employment centers above some number of jobs threshold (185), out of the total set of transport zones (1787)
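
The relation between selected_centers and jobsTable can be pictured with the minimal sketch below, which builds one boolean per transport zone and keeps the zones above a jobs threshold. The helper name select_centers, the threshold value, the sample data, and the choice to compare total jobs across income groups are all assumptions made for illustration; they are not taken from the model code.

```python
def select_centers(jobs_per_zone, threshold):
    """Return one boolean per transport zone: True if total jobs exceed threshold."""
    return [sum(jobs) > threshold for jobs in jobs_per_zone]


# Hypothetical number of jobs per income group (4) in five transport zones
jobs_per_zone = [
    [120, 300, 80, 10],
    [5, 12, 3, 0],
    [900, 1500, 700, 400],
    [40, 60, 20, 5],
    [2000, 2500, 1800, 900],
]

# Hypothetical threshold, for illustration only
selected = select_centers(jobs_per_zone, threshold=500)

# Keep only the rows of the selected zones, as a stand-in for jobsTable
jobs_table = [jobs for jobs, keep in zip(jobs_per_zone, selected) if keep]

print(selected)
print(len(jobs_table))
```

In the model, the mask has one entry per transport zone (1787) and the table has one row per selected centre (185).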

outputs.export_outputs.plot_average_income(grid, average_income, path_plots, path_tables)[source]

Line plot for average income across 1D-space.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
average_income (ndarray(float64)) – Average median income for each income group in the model (4)
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables

Returns

df – Table for average income across 1D-space.

Return type

DataFrame
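
Several functions in this module summarize grid-level values "across 1D-space", which presumably means aggregating them along the distance to the city centre given in grid. The sketch below shows one way to do this: assign each cell to a distance bin and average within bins. The helper name average_by_distance, the 1 km bin width, and the sample values are assumptions, not the module's actual implementation.

```python
from collections import defaultdict


def average_by_distance(distances_km, values, bin_width_km=1.0):
    """Average grid-cell values within bins of distance to the city centre."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for dist, val in zip(distances_km, values):
        # Lower edge of the distance bin this cell falls into
        bin_start = int(dist // bin_width_km) * bin_width_km
        sums[bin_start] += val
        counts[bin_start] += 1
    # Bins are returned in increasing order of distance
    return {b: sums[b] / counts[b] for b in sorted(sums)}


# Hypothetical cells: distance to the centre (km) and an income value
distances = [0.2, 0.8, 1.5, 1.9, 3.1]
incomes = [100.0, 200.0, 300.0, 500.0, 50.0]

profile = average_by_distance(distances, incomes)
print(profile)
```

Bins with no cell (here, 2 to 3 km) are simply absent from the result, which a line plot would render as a gap or a straight segment depending on the plotting choice.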

outputs.export_outputs.plot_income_net_of_commuting_costs(grid, income_net_of_commuting_costs, path_plots, path_tables)[source]

Line plot for average income net of commuting costs across 1D-space.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
income_net_of_commuting_costs (ndarray(float64, ndim=2)) – Expected annual income net of commuting costs (in rands, for one household), for each geographic unit, by income group (4)
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables

Returns

df – Table for average income net of commuting costs across 1D-space

Return type

DataFrame

outputs.export_outputs.retrieve_name(var, depth)[source]

Return a string for the name of a variable.

Parameters

var – Any variable
depth (int) – Depth level of the function call (indicates how deep it should go to retrieve the original name of the variable): from 0 to 2

Returns

Name of input variable

Return type

str

outputs.export_outputs.simul_housing_demand(grid, center, initial_state_dwelling_size, initial_state_households_housing_types, path_plots, path_tables)[source]

Line plot for average dwelling size in formal private housing across 1D-space.

This is not a validation plot function, and it is specifically used for subsequent periods when validation data is not available.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
center (ndarray(float64)) – x and y coordinates of geographic centre of analysis grid
initial_state_dwelling_size (ndarray(float64, ndim=2)) – Average dwelling size (in m²) per grid cell in each housing type (4)
initial_state_households_housing_types (ndarray(float64, ndim=2)) – Number of households per grid cell in each housing type (4)
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables

Returns

df – Table with simulated dwelling sizes across 1-D space

Return type

DataFrame

outputs.export_outputs.simul_pop_housing_types(grid, initial_state_households_housing_types, path_plots, path_tables)[source]

Line plot for number of households per housing type across 1D-space.

This is used specifically for subsequent periods where no validation data is available.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
initial_state_households_housing_types (ndarray(float64, ndim=2)) – Number of households per grid cell in each housing type (4)
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables

Returns

df – Table for number of households (no density) per housing type across 1D-space.

Return type

DataFrame

outputs.export_outputs.simul_pop_htype_income(grid, initial_state_households, path_plots, path_tables)[source]

Line plot per housing and income groups across 1D-space (no validation).

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
initial_state_households (ndarray(float64, ndim=3)) – Number of households per grid cell in each income group (4) and each housing type (4)
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables

Returns

df – Table for number of households (no density) per housing and income groups across 1D-space

Return type

DataFrame

outputs.export_outputs.simul_pop_income_groups(grid, initial_state_household_centers, path_plots, path_tables)[source]

Line plot for number of households per income group across 1D-space.

This is used specifically for subsequent periods when validation data is not available.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
initial_state_household_centers (ndarray(float64, ndim=2)) – Number of households per grid cell in each income group (4) at baseline year (2011)
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables

Returns

df – Table for number of households per income group across 1D-space (no density)

Return type

DataFrame

outputs.export_outputs.simulation_housing_price(grid, initial_state_rent, interest_rate, param, center, housing_types_sp, path_plots, path_tables, land_price)[source]

Line plot for housing/land prices/rents across 1D-space.

Breakdown is given per housing type. This function is specifically used for subsequent periods when validation data is not available.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
initial_state_rent (ndarray(float64, ndim=2)) – Average annual rent (in rands) per grid cell for each housing type (4)
interest_rate (float64) – Interest rate for the overall economy, corresponding to an average over past years
param (dict) – Dictionary of default parameters
center (ndarray(float64)) – x and y coordinates of geographic centre of analysis grid
housing_types_sp (DataFrame) – Table yielding, for each Small Place (1,046), the number of informal backyards, of informal settlements, and total dwelling units, as well as their (centroid) x and y coordinates
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables
land_price (int) – Dummy set to 1 or 0, depending on whether we want to consider theoretical land price or annual housing rent

Returns

df – Table for housing prices across 1D-space

Return type

DataFrame

outputs.export_outputs.valid_housing_demand(grid, center, initial_state_dwelling_size, initial_state_households_housing_types, housing_types_sp, data_sp, path_plots, path_tables)[source]

Line plot for average dwelling size in formal private housing across 1D-space.

Note that this is a validation plot using data at SP level.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
center (ndarray(float64)) – x and y coordinates of geographic centre of analysis grid
initial_state_dwelling_size (ndarray(float64, ndim=2)) – Average dwelling size (in m²) per grid cell in each housing type (4)
initial_state_households_housing_types (ndarray(float64, ndim=2)) – Number of households per grid cell in each housing type (4)
housing_types_sp (DataFrame) – Table yielding, for each Small Place (1,046), the number of informal backyards, of informal settlements, and total dwelling units, as well as their (centroid) x and y coordinates
data_sp (DataFrame) – Table yielding, for each Small Place (1,046), the average dwelling size (in m²), the average land price and annual income level (in rands), the size of unconstrained area for construction (in m²), the total area (in km²), the distance to the city centre (in km), whether or not the location belongs to Mitchells Plain, and the SP code
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables

Returns

df – Table with simulated and validation dwelling sizes across 1-D space

Return type

DataFrame

outputs.export_outputs.valid_housing_price(grid, initial_state_rent, interest_rate, param, housing_types_sp, data_sp, path_plots, path_tables)[source]

Line plot for housing prices across 1D-space.

This focuses on validation for housing prices in the formal private sector. Note that we have missing values in the city centre and beyond 40km.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
initial_state_rent (ndarray(float64, ndim=2)) – Average annual rent (in rands) per grid cell for each housing type (4)
interest_rate (float64) – Interest rate for the overall economy, corresponding to an average over past years
param (dict) – Dictionary of default parameters
housing_types_sp (DataFrame) – Table yielding, for each Small Place (1,046), the number of informal backyards, of informal settlements, and total dwelling units, as well as their (centroid) x and y coordinates
data_sp (DataFrame) – Table yielding, for each Small Place (1,046), the average dwelling size (in m²), the average land price and annual income level (in rands), the size of unconstrained area for construction (in m²), the total area (in km²), the distance to the city centre (in km), whether or not the location belongs to Mitchells Plain, and the SP code
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables

Returns

df – Table for formal private housing prices across 1-D space (with validation data)

Return type

DataFrame

outputs.export_outputs.valid_housing_supply(grid, initial_state_housing_supply, path_plots, path_tables)[source]

Line plot for housing supply per unit of available land across 1D-space.

Breakdown is given per housing type.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
initial_state_housing_supply (ndarray(float64, ndim=2)) – Housing supply per unit of available land (in m² per km²) for each housing type (4) in each grid cell
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables

Returns

df – Table for housing supply per unit of available land across 1D-space

Return type

DataFrame

outputs.export_outputs.valid_housing_supply_noland(grid, initial_state_housing_supply, path_plots, path_tables)[source]

Line plot for housing supply (no land availability) across 1D-space.

Breakdown is given per housing type.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
initial_state_housing_supply (ndarray(float64, ndim=2)) – Housing supply per unit of available land (in m² per km²) for each housing type (4) in each grid cell
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables

Returns

df – Table for housing supply (no land availability) across 1D-space

Return type

DataFrame

outputs.export_outputs.valid_pop_housing_type(housing_type_1, housing_type_2, legend1, legend2, path_plots, path_tables)[source]

Validation bar plot for number of simulated households per housing type.

Parameters

housing_type_1 (ndarray(float64, ndim=2)) – Number of simulated households per grid cell in each housing type (4)
housing_type_2 (ndarray(float64, ndim=2)) – Number of households from data per grid cell in each housing type (4)
legend1 (str) – Legend for first array
legend2 (str) – Legend for second array
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables

Returns

data – Validation table for number of simulated households per housing type

Return type

DataFrame
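
Since this is a bar plot rather than a spatial plot, the comparison presumably rests on city-wide totals per housing type. A minimal sketch of that aggregation follows, assuming one row per housing type and one column per grid cell; the helper name totals_per_housing_type, the array layout, and the sample counts are hypothetical.

```python
def totals_per_housing_type(households):
    """Sum households over grid cells for each housing type (one row per type)."""
    return [sum(row) for row in households]


# Hypothetical counts: 4 housing types x 3 grid cells
simulated = [[10.0, 20.0, 5.0], [2.0, 0.0, 1.0], [4.0, 4.0, 4.0], [0.0, 7.0, 3.0]]
observed = [[12.0, 18.0, 6.0], [1.0, 1.0, 1.0], [5.0, 3.0, 2.0], [0.0, 8.0, 2.0]]

# One column per legend entry, one value per housing type
table = {
    "Simulation": totals_per_housing_type(simulated),
    "Data": totals_per_housing_type(observed),
}
print(table)
```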

outputs.export_outputs.valid_pop_housing_types(grid, initial_state_households_housing_types, housing_types, path_plots, path_tables)[source]

Validation line plot for number of households per housing type across 1D-space.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
initial_state_households_housing_types (ndarray(float64, ndim=2)) – Number of households per grid cell in each housing type (4)
housing_types (DataFrame) – Table yielding, for 4 different housing types (informal settlements, formal backyards, informal backyards, and formal private housing), the number of households in each grid cell (24,014), from SAL data.
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables

Returns

df – Validation table for number of households (no density) per housing type across 1D-space.

Return type

DataFrame

outputs.export_outputs.valid_pop_htype_income(initial_state_households, households_per_income_and_housing, legend1, legend2, path_plots, path_tables)[source]

Validation bar plot for number of households across housing and income groups.

Parameters

initial_state_households (ndarray(float64, ndim=3)) – Number of households per grid cell in each income group (4) and each housing type (4)
households_per_income_and_housing (ndarray(float64, ndim=2)) – Exogenous number of households per income group (4, from poorest to richest) in each endogenous housing type (3: formal private, informal backyards, informal settlements)
legend1 (str) – Legend for first array
legend2 (str) – Legend for second array
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables

Returns

data0 (DataFrame) – Validation table for simulated number of households across income groups in formal private housing
data1 (DataFrame) – Validation table for simulated number of households across income groups in informal backyards
data2 (DataFrame) – Validation table for simulated number of households across income groups in informal settlements

outputs.export_outputs.valid_pop_income_groups(grid, initial_state_household_centers, income_distribution_grid, path_plots, path_tables)[source]

Validation line plot for number of households per income group across 1D-space.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
initial_state_household_centers (ndarray(float64, ndim=2)) – Number of households per grid cell in each income group (4) at baseline year (2011)
income_distribution_grid (ndarray(uint16, ndim=2)) – Exogenous number of households in each grid cell (24,014) for each income group in the model (4)
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables

Returns

df – Validation table for number of households per income group across 1D-space (no density)

Return type

DataFrame

outputs.export_outputs.validation_density(grid, initial_state_households_housing_types, housing_types, path_plots, path_tables)[source]

Validation line plot for household density across space in 1D.

Parameters

grid (DataFrame) – Table yielding, for each grid cell (24,014), its x and y (centroid) coordinates, and its distance (in km) to the city centre
initial_state_households_housing_types (ndarray(float64, ndim=2)) – Number of households per grid cell in each housing type (4)
housing_types (DataFrame) – Table yielding, for 4 different housing types (informal settlements, formal backyards, informal backyards, and formal private housing), the number of households in each grid cell (24,014), from SAL data.
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables

Returns

df – Validation table for household density across space in 1D.

Return type

DataFrame

Process values related to floods

outputs.flood_outputs.annualize_damages(array_init, type_flood, housing_type, options)[source]

Return expected value of flood damages for given location and housing type.

The logic is the same as in inputs.data.compute_fraction_capital_destroyed.

Parameters

array_init (ndarray(float64)) – Array containing estimated damage values per available return period, for a given flood type, housing type, and grid cell
type_flood (str) – Type of flood risk considered, used to determine the number of return periods available (depends on FATHOM/DELTARES data source)
housing_type (str) – Housing type considered, used to determine which corrections to apply for pluvial flood risks
options (dict) – Dictionary of default options

Returns

Annualized / expected value of future damage flows for a given flood type, housing type, and grid cell

Return type

float64
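
A common way to turn damages per return period into an expected annual value is to integrate them over annual exceedance probability (1 / return period) with the trapezoidal rule. The sketch below shows that generic technique only. It does not reproduce the corrections that the function applies by flood type and housing type, and the helper name expected_annual_damage and the sample figures are assumptions.

```python
def expected_annual_damage(return_periods, damages):
    """Trapezoidal integration of damages over annual exceedance probability."""
    # A return period of T years corresponds to an annual probability of 1/T
    points = sorted(
        (1.0 / period, damage) for period, damage in zip(return_periods, damages)
    )
    total = 0.0
    for (p_low, d_low), (p_high, d_high) in zip(points, points[1:]):
        # Area of the trapezoid between two consecutive probabilities
        total += 0.5 * (d_low + d_high) * (p_high - p_low)
    return total


# Hypothetical damages (in rands) for 5-, 10- and 100-year events
return_periods = [5, 10, 100]
damages = [0.0, 100.0, 1000.0]

ead = expected_annual_damage(return_periods, damages)
print(round(ead, 2))
```

This version ignores events rarer than the longest return period and more frequent than the shortest one; how those tails are handled is a modelling choice that the sketch leaves out.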

outputs.flood_outputs.compute_content_cost(initial_state_households, initial_state_housing_supply, income_net_of_commuting_costs, param, fraction_capital_destroyed, initial_state_rent, initial_state_dwelling_size, interest_rate)[source]

Estimate value of flood-prone composite good consumption in space.

Again, this is based on model outcomes. Since cost estimates are specific to housing type, we rely on an estimation of average income per housing type that we plug into households’ budget constraint.

Parameters

initial_state_households (ndarray(float64, ndim=3)) – Number of households per grid cell in each income group (4) and each housing type (4)
initial_state_housing_supply (ndarray(float64, ndim=2)) – Housing supply per unit of available land (in m² per km²) for each housing type (4) in each grid cell
income_net_of_commuting_costs (ndarray(float64, ndim=2)) – Expected annual income net of commuting costs (in rands, for one household), for each geographic unit, by income group (4)
param (dict) – Dictionary of default parameters
fraction_capital_destroyed (DataFrame) – Data frame of expected fractions of capital destroyed, for housing structures and contents in different housing types, in each grid cell (24,014)
initial_state_rent (ndarray(float64, ndim=2)) – Average annual rent (in rands) per grid cell for each housing type (4)
initial_state_dwelling_size (ndarray(float64, ndim=2)) – Average dwelling size (in m²) per grid cell in each housing type (4)
interest_rate (float64) – Real interest rate for the overall economy, corresponding to an average over past years

Returns

content_cost – Estimated value of composite good consumption that is considered as flood-prone, for each grid cell (24,014) and each housing type (4)

Return type

DataFrame
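
The two steps named in the description (an average income per housing type, then the budget constraint) can be sketched for a single grid cell as below. This assumes that the average is weighted by the number of households of each income group, that rent is expressed per m², and that composite consumption is the residual of income after housing costs. The helper names and figures are hypothetical, and the share of consumption treated as flood-prone is left out.

```python
def average_income_per_housing_type(households, incomes):
    """Household-weighted average income in one grid cell, per housing type.

    households[g][h]: households of income group g in housing type h
    incomes[g]: income net of commuting costs for income group g
    """
    n_groups = len(incomes)
    n_types = len(households[0])
    averages = []
    for h in range(n_types):
        total = sum(households[g][h] for g in range(n_groups))
        if total == 0:
            # No household of this housing type in the cell
            averages.append(0.0)
            continue
        weighted = sum(households[g][h] * incomes[g] for g in range(n_groups))
        averages.append(weighted / total)
    return averages


def composite_consumption(income, rent_per_m2, dwelling_size):
    """Budget-constraint residual: income left after paying for housing."""
    return max(income - rent_per_m2 * dwelling_size, 0.0)


# Hypothetical cell with 2 income groups and 2 housing types
households = [[0.0, 30.0], [10.0, 10.0]]
incomes = [20000.0, 60000.0]

avg = average_income_per_housing_type(households, incomes)
spending = composite_consumption(avg[0], rent_per_m2=500.0, dwelling_size=40.0)
print(avg, spending)
```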

outputs.flood_outputs.compute_damages(floods, path_data, param, content_cost, nb_households_formal, nb_households_subsidized, nb_households_informal, nb_households_backyard, dwelling_size, formal_structure_cost, content_damages, structural_damages_type4b, structural_damages_type4a, structural_damages_type2, structural_damages_type3a, options, spline_inflation, year_temp, path_tables, flood_categ)[source]

Compute total structure and content damages per housing type.

This function leverages the depth-damage functions from the literature to estimate the monetary value lost to floods based on the estimated total value of the underlying asset, per available return period. The logic is the same as in inputs.data.import_full_floods_data.

Parameters

floods (list) – List of file names for available flood maps per return period and flood type
path_data (str) – Path towards data used in the model
param (dict) – Dictionary of default parameters
content_cost (DataFrame) – Estimated value of composite good consumption that is considered as flood-prone, for each grid cell (24,014) and each housing type (4)
nb_households_formal (Series) – Number of households living in formal private housing, per grid cell
nb_households_subsidized (Series) – Number of households living in formal subsidized housing, per grid cell
nb_households_informal (Series) – Number of households living in informal settlements, per grid cell
nb_households_backyard (Series) – Number of households living in informal backyards, per grid cell
dwelling_size (ndarray(float64, ndim=2)) – Average dwelling size (in m²) per grid cell in each housing type (4)
formal_structure_cost (ndarray(float64)) – Estimated construction cost of formal private housing structures, based on their market capital values, per grid cell (24,014)
content_damages (interp1d) – Linear interpolation for fraction of capital destroyed (house contents) over maximum flood depth (in m) in a given area, from de Villiers et al., 2007
structural_damages_type4b (interp1d) – Linear interpolation for fraction of capital destroyed (type-4b house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (two-floor reinforced masonry/concrete and steel buildings)
structural_damages_type4a (interp1d) – Linear interpolation for fraction of capital destroyed (type-4a house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (one-floor reinforced masonry/concrete and steel buildings)
structural_damages_type2 (interp1d) – Linear interpolation for fraction of capital destroyed (type-2 house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (wooden buildings)
structural_damages_type3a (interp1d) – Linear interpolation for fraction of capital destroyed (type-3a house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (one-floor unreinforced masonry/concrete buildings)
options (dict) – Dictionary of default options
spline_inflation (interp1d) – Linear interpolation for inflation rate (in base 100 relative to baseline year) over the years (baseline year set at 0)
year_temp (int) – Year (relative to baseline year set at 0) for which we want to run the function
path_tables (str) – Path for saving output tables
flood_categ (str) – Category of flood risks considered, used in name of output file

Returns

damages – Table yielding, for each return period and housing types, the estimated total damages in terms of housing structures and contents

Return type

DataFrame
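
The depth-damage logic described above can be illustrated as follows: a piecewise-linear curve maps maximum flood depth to a fraction of capital destroyed, which is then applied to the value of the exposed asset in each grid cell. The sketch uses a small stdlib stand-in for the interp1d objects (clamped outside the curve's range, which is a simplification), and the curve points, the flood-prone shares, and the helper names are illustrative assumptions rather than values from the cited literature.

```python
def linear_interpolator(xs, ys):
    """Piecewise-linear function through (xs, ys), clamped outside the range."""
    def interpolate(x):
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
            if x0 <= x <= x1:
                return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return interpolate


def total_content_damage(curve, flood_depths, prone_shares, nb_households, content_cost):
    """Sum over grid cells of damage fraction x exposed households x content value."""
    total = 0.0
    for depth, share, n, cost in zip(flood_depths, prone_shares, nb_households, content_cost):
        total += curve(depth) * share * n * cost
    return total


# Hypothetical curve: fraction of contents destroyed at 0, 0.5, 1 and 2 m
content_damages = linear_interpolator([0.0, 0.5, 1.0, 2.0], [0.0, 0.2, 0.5, 1.0])

# Hypothetical data for three grid cells and one return period
damage = total_content_damage(
    content_damages,
    flood_depths=[0.25, 1.5, 0.0],
    prone_shares=[0.5, 1.0, 0.0],
    nb_households=[100, 10, 50],
    content_cost=[1000.0, 2000.0, 1500.0],
)
print(round(damage, 2))
```

Repeating this for each return period and each housing type, with the matching structural or content curve, would give a table with the same shape as the damages output.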

outputs.flood_outputs.compute_damages_2d(floods, path_data, param, content_cost, nb_households_formal, nb_households_subsidized, nb_households_informal, nb_households_backyard, dwelling_size, formal_structure_cost, content_damages, structural_damages_type4b, structural_damages_type4a, structural_damages_type2, structural_damages_type3a, options, spline_inflation, year_temp, path_tables, flood_categ)[source]

Compute structure and content damages per housing type across space.

This function leverages the depth-damage functions from the literature to estimate the monetary value lost to floods based on the estimated total value of the underlying asset, per available return period. Here, we get spatial, and not aggregate, data. The use of this function instead of its 1D counterpart depends on the plots we are interested in.

Parameters

floods (list) – List of file names for available flood maps per return period and flood type
path_data (str) – Path towards data used in the model
param (dict) – Dictionary of default parameters
content_cost (DataFrame) – Estimated value of composite good consumption that is considered as flood-prone, for each grid cell (24,014) and each housing type (4)
nb_households_formal (Series) – Number of households living in formal private housing, per grid cell
nb_households_subsidized (Series) – Number of households living in formal subsidized housing, per grid cell
nb_households_informal (Series) – Number of households living in informal settlements, per grid cell
nb_households_backyard (Series) – Number of households living in informal backyards, per grid cell
dwelling_size (ndarray(float64, ndim=2)) – Average dwelling size (in m²) per grid cell in each housing type (4)
formal_structure_cost (ndarray(float64)) – Estimated construction cost of formal private housing structures, based on their market capital values, per grid cell (24,014)
content_damages (interp1d) – Linear interpolation for fraction of capital destroyed (house contents) over maximum flood depth (in m) in a given area, from de Villiers et al., 2007
structural_damages_type4b (interp1d) – Linear interpolation for fraction of capital destroyed (type-4b house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (two-floor reinforced masonry/concrete and steel buildings)
structural_damages_type4a (interp1d) – Linear interpolation for fraction of capital destroyed (type-4a house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (one-floor reinforced masonry/concrete and steel buildings)
structural_damages_type2 (interp1d) – Linear interpolation for fraction of capital destroyed (type-2 house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (wooden buildings)
structural_damages_type3a (interp1d) – Linear interpolation for fraction of capital destroyed (type-3a house structures) over maximum flood depth (in m) in a given area, from Englhardt et al., 2019 (one-floor unreinforced masonry/concrete buildings)
options (dict) – Dictionary of default options
spline_inflation (interp1d) – Linear interpolation for inflation rate (in base 100 relative to baseline year) over the years (baseline year set at 0)
year_temp (int) – Year (relative to baseline year set at 0) for which we want to run the function
path_tables (str) – Path for saving output tables
flood_categ (str) – Category of flood risks considered, used in name of output file

Returns

dict_damages – Dictionary yielding, for each return period, the estimated damages per grid cell (24,014) and housing type (4) in terms of housing structures and contents

Return type

dict

outputs.flood_outputs.compute_formal_structure_cost(initial_state_capital_land, initial_state_households_housing_types, coeff_land)[source]

Estimate construction costs of formal private housing structures in space.

Note that the estimation process relies on a theoretical relation linking market prices to capital values. The value obtained is therefore an outcome of the model, and may not correspond to accounting estimates that are common in the impact evaluation literature.

Parameters

initial_state_capital_land (ndarray(float64, ndim=2)) – Value (in rands) of the housing capital stock per unit of available land (in km²) for each endogenous housing type (3) per grid cell at baseline year (2011)
initial_state_households_housing_types (ndarray(float64, ndim=2)) – Number of households per grid cell in each housing type (4)
coeff_land (ndarray(float64, ndim=2)) – Updated land availability for each grid cell (24,014) and each housing type (4: formal private, informal backyards, informal settlements, formal subsidized)

Returns

formal_structure_cost – Estimated construction cost of formal private housing structures, based on their market capital values, per grid cell (24,014)

Return type

ndarray(float64)

outputs.flood_outputs.compute_stats_per_housing_type(floods, path_floods, nb_households_formal, nb_households_subsidized, nb_households_informal, nb_households_backyard, path_tables, flood_categ, threshold=0.1)[source]

Compute aggregate flood exposure statistics for a given flood type.

More specifically, the function returns, for each available return period and each housing type, an estimated total number of households exposed and the associated average maximum flood depth level. The output is used as an argument by the validation_flood function (export_outputs_floods module).

Parameters

floods (list) – List of file names for available flood maps per return period and flood type
path_floods (str) – Path towards flood maps directory
nb_households_formal (Series) – Number of households living in formal private housing, per grid cell
nb_households_subsidized (Series) – Number of households living in formal subsidized housing, per grid cell
nb_households_informal (Series) – Number of households living in informal settlements, per grid cell
nb_households_backyard (Series) – Number of households living in informal backyards, per grid cell
path_tables (str) – Path for saving output tables
flood_categ (str) – Category of flood risks considered, used in name of output file
threshold (float64, optional) – Maximum flood depth level (in m) below which we choose to discard flood risks. The default is 0.1, but is not used in benchmark version of the function.

Returns

stats_per_housing_type – Table summarizing, for a given flood type and each associated return period, the estimated total number of households per housing type living in flood-prone areas, and the associated average maximum flood depth level

Return type

DataFrame
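
For one return period and one housing type, the two statistics can be sketched as below: the number of exposed households is the household count scaled by the flood-prone share of each cell, and the average depth is weighted by those exposed households. The helper name exposure_stats, the use of a flood-prone share per cell, the weighting scheme, and the sample data are assumptions; the threshold is applied here only to show its intended effect.

```python
def exposure_stats(nb_households, flood_depth, prone_share, threshold=0.0):
    """Households exposed and their household-weighted average maximum depth."""
    exposed = 0.0
    depth_sum = 0.0
    for n, depth, share in zip(nb_households, flood_depth, prone_share):
        if depth <= threshold:
            # Cells at or below the threshold are treated as not flood-prone
            continue
        exposed += n * share
        depth_sum += n * share * depth
    average_depth = depth_sum / exposed if exposed > 0 else 0.0
    return exposed, average_depth


# Hypothetical data for three grid cells
nb_households = [100, 50, 200]
flood_depth = [0.05, 0.5, 1.0]   # maximum flood depth (in m)
prone_share = [0.2, 0.5, 0.25]   # fraction of the cell that is flood-prone

exposed, average_depth = exposure_stats(
    nb_households, flood_depth, prone_share, threshold=0.1
)
print(exposed, round(average_depth, 3))
```

With threshold=0.0 the first cell would also count, raising the exposed total while lowering the average depth.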

outputs.flood_outputs.compute_stats_per_income_group(floods, path_floods, nb_households_rich, nb_households_midrich, nb_households_midpoor, nb_households_poor, path_tables, flood_categ, threshold=0.1)[source]

Compute aggregate flood exposure statistics for a given flood type.

More specifically, the function returns for each available return period and each income group, an estimated total number of households exposed and associated average maximum flood depth level.

Parameters

floods (list) – List of file names for available flood maps per return period and flood type
path_floods (str) – Path towards flood maps directory
nb_households_rich (Series) – Number of rich households, per grid cell
nb_households_midrich (Series) – Number of mid-rich households, per grid cell
nb_households_midpoor (Series) – Number of mid-poor households, per grid cell
nb_households_poor (Series) – Number of poor households, per grid cell
path_tables (str) – Path for saving output tables
flood_categ (str) – Category of flood risks considered, used in name of output file
threshold (float64, optional) – Maximum flood depth level (in m) below which we choose to discard flood risks. The default is 0.1, but is not used in benchmark version of the function.

Returns

stats_per_income_group – Table summarizing, for a given flood type and each associated return period, the estimated total number of households per income group living in flood-prone areas, and the associated average maximum flood depth level

Return type

DataFrame
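
A minimal sketch of the aggregation described above, for one income group and one return period. It assumes the flood map is already loaded as a table with one row per grid cell; the column names prop_flood_prone and flood_depth and the helper exposure_for_group are hypothetical, and the household-weighted average is an assumption about how the average depth is computed, not a statement of the package's implementation.

```python
import pandas as pd


def exposure_for_group(nb_households, flood_map):
    """Return (exposed households, household-weighted mean flood depth)."""
    # Households exposed in each cell: count times flood-prone share of the cell
    exposed = nb_households * flood_map["prop_flood_prone"]
    total_exposed = exposed.sum()
    if total_exposed == 0:
        return 0.0, 0.0
    # Average depth, weighted by the number of exposed households per cell
    mean_depth = (exposed * flood_map["flood_depth"]).sum() / total_exposed
    return float(total_exposed), float(mean_depth)


# Made-up three-cell example (hypothetical column names)
flood_map = pd.DataFrame(
    {"prop_flood_prone": [0.0, 0.5, 1.0], "flood_depth": [0.0, 0.2, 1.0]}
)
nb_households_poor = pd.Series([100.0, 40.0, 10.0])
total, depth = exposure_for_group(nb_households_poor, flood_map)
```

Repeating this call for each income group and each flood map would give one row per return period in a summary table of the kind described under Returns.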

outputs.flood_outputs.create_flood_dict(floods, path_floods, path_tables, sim_nb_households_poor, sim_nb_households_midpoor, sim_nb_households_midrich, sim_nb_households_rich)[source]

Create dictionary for household distribution in a given set of flood maps.

The spatial distribution is broken into income groups, as this is the relevant dimension in the plot_flood_severity_distrib function (export_outputs_floods module) that takes the output of this function as an argument.

Parameters

floods (list) – List of file names for available flood maps per return period and flood type
path_floods (str) – Path towards flood maps directory
path_tables (str) – Path for saving output tables
sim_nb_households_poor (Series) – Number of poor households, per grid cell
sim_nb_households_midpoor (Series) – Number of mid-poor households, per grid cell
sim_nb_households_midrich (Series) – Number of mid-rich households, per grid cell
sim_nb_households_rich (Series) – Number of rich households, per grid cell

Returns

dictio – Dictionary yielding, for each return period of a given flood type, the spatial distribution of households broken into income groups, along with the maximum flood depth level and fraction of flood-prone area in their residential location

Return type

dict
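
The structure of the returned dictionary can be illustrated as follows. This is only a sketch under stated assumptions: flood maps are passed in memory rather than read from path_floods, and the helper build_flood_dict, the keys such as "FD_20yr", and the column names are hypothetical.

```python
import pandas as pd


def build_flood_dict(flood_maps, households_by_group):
    """Map each flood-map name to a per-cell table of households and flood data."""
    dictio = {}
    for name, flood_map in flood_maps.items():
        # One column per income group, one row per grid cell
        table = pd.DataFrame(households_by_group)
        # Attach the flood characteristics of each cell for this return period
        table["flood_depth"] = flood_map["flood_depth"].to_numpy()
        table["prop_flood_prone"] = flood_map["prop_flood_prone"].to_numpy()
        dictio[name] = table
    return dictio


# Made-up data for three grid cells and two return periods
households_by_group = {
    "poor": pd.Series([100.0, 40.0, 10.0]),
    "rich": pd.Series([5.0, 20.0, 60.0]),
}
flood_maps = {
    "FD_20yr": pd.DataFrame(
        {"flood_depth": [0.0, 0.1, 0.4], "prop_flood_prone": [0.0, 0.2, 0.5]}
    ),
    "FD_100yr": pd.DataFrame(
        {"flood_depth": [0.1, 0.3, 0.9], "prop_flood_prone": [0.1, 0.4, 0.8]}
    ),
}
dictio = build_flood_dict(flood_maps, households_by_group)
```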

Display values related to floods

outputs.export_outputs_floods.plot_flood_severity_distrib(barWidth, transparency, dictio, flood_type, path_plots, ylim)[source]

Bar plot distribution of households across flood zones of varying severity.

Only selected return periods are shown for illustrative purposes. Output plot is specific to a given flood type and is broken down into income groups as a proxy of households’ vulnerability. Households are grouped into bins for different maximum flood depth levels in their residential location and the resulting population distribution is displayed without the majority living in low-risk zones (below the flood depth level given by barWidth argument), to preserve plot readability.

Parameters

barWidth (float) – Bar width argument expected by pyplot: corresponds to the step used to group maximum flood depth (in m) levels into bins
transparency (list) – List of transparency indices to distinguish across plotted return periods
dictio (dict) – Dictionary yielding, for each return period of a given flood type, the spatial distribution of households broken into income groups, along with the maximum flood depth level and fraction of flood-prone area in their residential location
flood_type (str) – Code for flood type used as the first component of dictionary keys
path_plots (str) – Path for saving output plots
ylim (int) – Maximum value on the y-axis

Return type

None.

outputs.export_outputs_floods.round_nearest(x, a)[source]

Return rounded value to nearest decimal number above some threshold.

This is a technical calculation called by the plot_flood_severity_distrib function: it splits values across bins, while discarding small values (smaller than half the bar width) to avoid a peak at the origin of the plot.

Parameters

x (Series) – Any series of numbers with many decimals
a (float) – Level below which we want to discard small values

Returns

Value rounded to the nearest decimal

Return type

float
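
A sketch of how this kind of rounding can be used to bin flood depths, assuming that values are rounded to the nearest multiple of the step so that anything below half a step falls into the zero bin. The helper round_to_step is hypothetical and is not claimed to reproduce the body of round_nearest; the depths and household counts are made up.

```python
import pandas as pd


def round_to_step(x, step):
    """Round each value of a Series to the nearest multiple of `step`."""
    # Second rounding removes floating-point noise such as 0.30000000000000004
    return ((x / step).round() * step).round(2)


depths = pd.Series([0.01, 0.04, 0.12, 0.26, 0.31])
households = pd.Series([500.0, 300.0, 40.0, 25.0, 10.0])

bins = round_to_step(depths, 0.1)
# Values below half a step have been rounded to 0: drop that bin
keep = bins > 0
per_bin = households[keep].groupby(bins[keep]).sum()
```

Here the two shallowest cells end up in the zero bin and are dropped, which is what keeps the large low-risk population from dominating the bar plot.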

outputs.export_outputs_floods.simul_damages(damages, path_plots, flood_categ, options)[source]

Bar plot for estimated damages for a given flood type.

Breakdown is given by housing type as depth-damage functions used to estimate costs are specific to building materials. Here, we just annualize previously estimated total damages across available return periods. The function returns two separate plots for damages done to housing structures and contents. This function (as opposed to plot_damages) is specifically used for subsequent periods when validation data is not available.

Parameters

damages (DataFrame) – Table yielding, for each return period and housing types, the estimated total damages in terms of housing structures and contents
path_plots (str) – Path for saving output plots
flood_categ (str) – Type of flood risk considered, used to define output file name
options (dict) – Dictionary of default options

Return type

None.
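
The annualization step mentioned above can be illustrated with a standard expected-annual-damage calculation: damages are integrated over annual exceedance probabilities (one over the return period) with the trapezoidal rule. Whether the package uses exactly this rule is an assumption; the helper expected_annual_damage and the figures below are hypothetical.

```python
import numpy as np


def expected_annual_damage(return_periods, damages):
    """Trapezoidal integral of damage over annual exceedance probability."""
    probs = 1.0 / np.asarray(return_periods, dtype=float)
    dmg = np.asarray(damages, dtype=float)
    # Sort by increasing probability so that the widths below are positive
    order = np.argsort(probs)
    probs, dmg = probs[order], dmg[order]
    # Sum of trapezoid areas between consecutive probabilities
    return float(np.sum(np.diff(probs) * (dmg[:-1] + dmg[1:]) / 2.0))


# Made-up total damages for a 10-year and a 100-year flood
ead = expected_annual_damage([10, 100], [1.0e6, 5.0e6])
```

This sketch ignores the tails of the distribution (events more frequent than the shortest return period or rarer than the longest one), which an actual implementation may treat differently.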

outputs.export_outputs_floods.simul_damages_time(list_damages, path_plots, path_tables, flood_categ, options)[source]

Plot evolution of aggregate annualized damages per housing type over time.

The function returns different plots for each housing type. Again, they are broken down across structures and contents damages. The value obtained for each year corresponds to the annualized value of the damages summed across all locations.

Parameters

list_damages (list) – List, for each simulation year, of the output of the outputs.flood_outputs.compute_damages_2d function: a dictionary yielding, for each return period, the estimated damages per grid cell (24,014) and housing type (4) in terms of housing structures and contents
path_plots (str) – Path for saving output plots
path_tables (str) – Path for saving output tables
flood_categ (str) – Type of flood risk considered, used to define output file name
options (dict) – Dictionary of default options

Return type

None.

outputs.export_outputs_floods.valid_damages(damages1, damages2, path_plots, flood_categ, options)[source]

Validation bar plot for estimated damages for a given flood type.

Breakdown is given by housing type as depth-damage functions used to estimate costs are specific to building materials. Here, we just annualize previously estimated total damages across available return periods. The function returns two separate plots for damages done to housing structures and contents.

Parameters

damages1 (DataFrame) – Table yielding, for each return period and housing types, the estimated total damages in terms of housing structures and contents. Here, we expect simulation results.
damages2 (DataFrame) – Table yielding, for each return period and housing types, the estimated total damages in terms of housing structures and contents. Here, we expect validation data.
path_plots (str) – Path for saving output plots
flood_categ (str) – Type of flood risk considered, used to define output file name
options (dict) – Dictionary of default options

Return type

None.

outputs.export_outputs_floods.validation_flood(stats1, stats2, legend1, legend2, type_flood, path_plots)[source]

Validation bar plot for household spatial distribution across flood zones.

The validation is done across some selected return periods for illustrative purposes. Breakdown is given by housing type. The output plots show the average maximum flood depth level to which households are exposed, and the total number of households living in flood-prone areas.

Parameters

stats1 (DataFrame) – Table summarizing, for a given flood type and each associated return period, the estimated total number of households per housing type living in flood-prone areas, and the associated average maximum flood depth level. We expect validation data here.
stats2 (DataFrame) – Table summarizing, for a given flood type and each associated return period, the estimated total number of households per housing type living in flood-prone areas, and the associated average maximum flood depth level. We expect simulation results here.
legend1 (str) – Legend for first data frame
legend2 (str) – Legend for second data frame
type_flood (str) – Type of flood risk considered, used to define output file name
path_plots (str) – Path for saving output plots

Return type

None.
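
A minimal sketch of a side-by-side validation bar plot of the kind described above, comparing two tables indexed by return period. The helper grouped_bar, the column name households, and all numbers are hypothetical; the actual function also breaks results down by housing type and saves the figures to path_plots.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def grouped_bar(stats1, stats2, legend1, legend2, column):
    """Draw two sets of bars, one per table, for each return period."""
    positions = np.arange(len(stats1))
    width = 0.4
    fig, ax = plt.subplots()
    # Shift each series by half a bar width so the bars sit side by side
    ax.bar(positions - width / 2, stats1[column], width, label=legend1)
    ax.bar(positions + width / 2, stats2[column], width, label=legend2)
    ax.set_xticks(positions)
    ax.set_xticklabels(stats1.index)
    ax.set_ylabel(column)
    ax.legend()
    return fig, ax


# Made-up numbers of households in flood-prone areas
index = ["20 years", "50 years", "100 years"]
data = pd.DataFrame({"households": [1200, 1900, 2600]}, index=index)
simul = pd.DataFrame({"households": [1000, 2100, 2500]}, index=index)
fig, ax = grouped_bar(data, simul, "Data", "Simulation", "households")
```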

outputs.export_outputs_floods.validation_flood_coastal(stats1, stats2, legend1, legend2, type_flood, path_plots)[source]

Validation bar plot for household spatial distribution across flood zones.

The validation is done across some selected return periods for illustrative purposes. Breakdown is given by housing type. The output plots show the average maximum flood depth level to which households are exposed, and the total number of households living in flood-prone areas. This function is specific to coastal floods as the underlying flood maps are not available for the same return periods as fluvial / pluvial flood maps.

Parameters

stats1 (DataFrame) – Table summarizing, for a given flood type and each associated return period, the estimated total number of households per housing type living in flood-prone areas, and the associated average maximum flood depth level. We expect validation data here.
stats2 (DataFrame) – Table summarizing, for a given flood type and each associated return period, the estimated total number of households per housing type living in flood-prone areas, and the associated average maximum flood depth level. We expect simulation results here.
legend1 (str) – Legend for first data frame
legend2 (str) – Legend for second data frame
type_flood (str) – Type of flood risk considered, used to define output file name
path_plots (str) – Path for saving output plots

Return type

None.