Utils

Utility functions: data manipulation, plotting helpers, and analysis tools.

When to use this module

Use project.utils for shared helpers that are reused across modules, such as:

  • I/O convenience wrappers (JSON/CSV readers)

  • data-index transformation helpers

  • logging, timing, and memory utility functions

project.utils.add_no_renovation(df)[source]
project.utils.calculate_annuities(capex, lifetime=50, discount_rate=0.032)[source]
project.utils.calculate_average(df, lifetime=50, discount_rate=0.032)[source]
project.utils.calculate_loan_annuity(capex, lifetime=50, discount_rate=0.032)[source]
project.utils.compare_bar_plot(df, y_label, legend=True, format_y=<function <lambda>>, save=None)[source]
project.utils.conditional_expectation(x)[source]

Calculate the conditional expectation of epsilon given epsilon > x, where epsilon follows a logistic distribution.

Parameters: x (float): The threshold that epsilon must exceed.

Returns: float: The conditional expectation of epsilon given epsilon > x.
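For reference, the closed form can be sketched as follows. This is a minimal sketch that assumes a standard logistic distribution (location 0, scale 1), which the docstring does not state; the function name is hypothetical and this is not necessarily how project.utils implements it. Under that assumption, E[epsilon | epsilon > x] = x + (1 + exp(x)) * ln(1 + exp(-x)).

```python
import math


def logistic_conditional_expectation(x: float) -> float:
    """E[eps | eps > x] for eps ~ standard logistic (assumed location 0, scale 1)."""
    # P(eps > x) for the standard logistic distribution.
    survival = 1.0 / (1.0 + math.exp(x))
    # Integral of P(eps > t) for t from x to infinity, i.e. ln(1 + exp(-x)).
    tail_integral = math.log1p(math.exp(-x))
    # Integration by parts gives E[eps | eps > x] = x + tail_integral / survival.
    # Note: math.exp overflows for |x| above roughly 709, so this is only
    # suitable for moderate thresholds.
    return x + tail_integral / survival


print(logistic_conditional_expectation(0.0))  # about 1.386 (= 2 ln 2)
```

As x becomes very negative the condition stops binding and the result tends to the unconditional mean, 0.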

project.utils.create_logger(path=None, level='DEBUG')[source]

Create logger for one run.

Parameters:

path (str) –

Return type:

Logger
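A per-run logger of this kind can be sketched with the standard logging module. The function below, make_run_logger, is a hypothetical stand-in: the logger name, the message format, and the choice to log to the console and optionally to a file at path are all assumptions, not the documented behavior of create_logger.

```python
import logging


def make_run_logger(path=None, level="DEBUG", name="run"):
    """Hypothetical stand-in for create_logger: one configured logger per run."""
    logger = logging.getLogger(name)
    logger.setLevel(level)  # accepts level names such as "DEBUG" or "INFO"
    logger.propagate = False  # avoid duplicate lines via the root logger

    # Drop handlers left over from a previous run that used the same name.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()

    formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
    handlers = [logging.StreamHandler()]
    if path is not None:
        # Assumption: path is the log file to write for this run.
        handlers.append(logging.FileHandler(path))
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


logger = make_run_logger(level="INFO")
logger.info("run started")
```

Removing stale handlers first keeps repeated runs in the same process from writing each message several times.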

project.utils.cumulated_plot(x, y, plot=True, format_x=<function <lambda>>, format_y=<function <lambda>>, round=None, ref=None, hlines=None)[source]

Plot y against cumulated x.

Used for marginal abatement cost curves.

Parameters:
  • x (Series) –

  • y (Series) –
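The data behind such a curve can be sketched without any plotting. In this sketch, x holds quantities (for example abatement potentials) and y holds unit costs; sorting by increasing y before accumulating x is the usual convention for a marginal abatement cost curve, but whether cumulated_plot sorts this way is an assumption. The helper name and the numbers are hypothetical.

```python
from itertools import accumulate


def cumulated_points(x, y):
    """Return (cumulated x, y) pairs, ordered by increasing y."""
    pairs = sorted(zip(y, x))  # cheapest measure first
    costs = [cost for cost, _ in pairs]
    cumulated = list(accumulate(quantity for _, quantity in pairs))
    return list(zip(cumulated, costs))


# Hypothetical abatement potentials (MtCO2) and unit costs (EUR/tCO2).
potentials = [2.0, 1.0, 3.0]
unit_costs = [50.0, 10.0, 120.0]
print(cumulated_points(potentials, unit_costs))
# [(1.0, 10.0), (3.0, 50.0), (6.0, 120.0)]
```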

project.utils.cumulated_plots(dict_df, y_label, legend=True, format_y=<function <lambda>>, save=None, ylim=None, ymin=0)[source]
project.utils.deciles2quintiles(stock, policies_heater, policies_insulation, inputs)[source]

Change all inputs from deciles to quintiles.

Parameters:
  • stock

  • policies_heater

  • policies_insulation

  • inputs

project.utils.deciles2quintiles_dict(inputs)[source]
project.utils.deciles2quintiles_list(item)[source]
project.utils.deciles2quintiles_pandas(data, func='mean')[source]
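The decile-to-quintile conversion can be illustrated on a plain dictionary. This sketch assumes that deciles are labeled D1 to D10, that quintiles are labeled C1 to C5, and that each quintile aggregates two adjacent deciles with a function such as the mean (mirroring the func='mean' default above); the labels and the helper name are hypothetical.

```python
from statistics import mean


def deciles_to_quintiles(values, func=mean):
    """Aggregate a {'D1': ..., 'D10': ...} mapping into {'C1': ..., 'C5': ...}."""
    quintiles = {}
    for q in range(1, 6):
        # Quintile q covers deciles 2q - 1 and 2q.
        pair = [values[f"D{2 * q - 1}"], values[f"D{2 * q}"]]
        quintiles[f"C{q}"] = func(pair)
    return quintiles


income_by_decile = {f"D{i}": float(i) for i in range(1, 11)}
print(deciles_to_quintiles(income_by_decile))
# {'C1': 1.5, 'C2': 3.5, 'C3': 5.5, 'C4': 7.5, 'C5': 9.5}
```

Passing func=sum instead would suit quantities that add up across groups, such as stock counts.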
project.utils.dict2data(dict_df)[source]

Concatenate different series into a single DataFrame by interpolating indexes.

Parameters:

dict_df (dict) – Dictionary of DataFrames.

Return type:

pd.DataFrame

project.utils.factor_annuities(lifetime=50, discount_rate=0.032)[source]
project.utils.find_discount_rate(factor, lifetime=30)[source]
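The annuity helpers above are not documented, so the following is only a sketch of the standard annuity arithmetic they appear to be named after. It assumes the factor is the present value of one unit paid at the end of each year, (1 - (1 + r)^-n) / r, and recovers a discount rate from a factor by bisection. The function names are hypothetical and the convention used by project.utils may differ.

```python
def annuity_factor(lifetime=50, discount_rate=0.032):
    """Present value of 1 paid at the end of each year for `lifetime` years."""
    if discount_rate == 0:
        return float(lifetime)
    return (1 - (1 + discount_rate) ** -lifetime) / discount_rate


def annuity(capex, lifetime=50, discount_rate=0.032):
    """Constant yearly payment whose present value equals capex."""
    return capex / annuity_factor(lifetime, discount_rate)


def implied_discount_rate(factor, lifetime=30, tol=1e-10):
    """Find the rate in [0, 1] that yields `factor`, by bisection."""
    low, high = 0.0, 1.0
    while high - low > tol:
        mid = (low + high) / 2
        # The factor decreases as the rate increases.
        if annuity_factor(lifetime, mid) > factor:
            low = mid
        else:
            high = mid
    return (low + high) / 2


print(round(annuity(1000.0), 2))
print(round(implied_discount_rate(annuity_factor(30, 0.05), 30), 6))
```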
project.utils.format_ax(ax, y_label=None, title=None, format_x=None, format_y=<function <lambda>>, ymin=0, ymax=None, xinteger=True, xmin=None, xmax=None, horizontal=False)[source]
Parameters:
  • y_label (str) –

  • format_y (function) –

  • ymin (float or None) –

  • xinteger (bool) –

  • title (str, optional) –

project.utils.format_legend(ax, ncol=3, offset=1, labels=None, loc='upper', left=1.04, order='reverse')[source]
project.utils.format_table(df, name='Years')[source]
project.utils.get_json(path)[source]
project.utils.get_pandas(path, func=<function <lambda>>)[source]
project.utils.get_series(path, header=0)[source]
project.utils.get_size(obj, seen=None)[source]

Recursively find the size of an object.
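A common way to write such a helper is to walk the object graph with sys.getsizeof while tracking visited object ids in seen. The sketch below follows that well-known recipe under a hypothetical name; it is not necessarily the exact implementation of get_size, and it only descends into dicts, instance attributes, and the basic built-in containers.

```python
import sys


def deep_size(obj, seen=None):
    """Approximate memory footprint of obj and everything it references."""
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0  # already counted (shared or self-referencing object)
    seen.add(obj_id)

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_size(k, seen) + deep_size(v, seen) for k, v in obj.items())
    elif hasattr(obj, "__dict__"):
        size += deep_size(vars(obj), seen)  # instance attributes
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_size(item, seen) for item in obj)
    return size


nested = {"a": [1, 2, 3], "b": {"c": "text"}}
print(deep_size(nested) > sys.getsizeof(nested))  # True
```

The seen set is what prevents double counting of shared objects and infinite recursion on cyclic references.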

project.utils.horizontal_stack_bar_plot(df, columns=None, title=None, order=None, save_path=None)[source]

Create a horizontal stacked bar plot from a DataFrame.

Examples:

horizontal_stack_bar_plot(sobol_df.rename(index=NAME_COLUMNS), columns=['First order', 'Total order'], title='Influence of parameters that the ban i', order='Total order', save_path=folder_name / Path('sobol_ban.png'))

Parameters:
  • df

  • columns

  • title

  • order

  • save_path

project.utils.make_area_plot(df, y_label, colors=None, format_y=<function <lambda>>, save=None, ncol=3, total=True, offset=1, ymin=None, loc='upper', scatter=None, left=1.04, xinteger=True)[source]
project.utils.make_clusterstackedbar_plot(df, groupby, colors=None, format_y=<function <lambda>>, save=None, rotation=0, year_ini=None, order_scenarios=None, fonttick=14, ymin=0, legend=True, figtitle=None, ymax=None, display_total=False)[source]
project.utils.make_distribution_plot(dict_df, y_label, cbar_title, format_y=<function <lambda>>, cbar_format=None, save=None)[source]
project.utils.make_grouped_scatterplots(dict_df, x, y, n_columns=3, format_y=<function <lambda>>, n_bins=2, save=None, order=None, colors=None)[source]

Plot a line for each index in a subplot.

Parameters:
  • dict_df (dict) – dict_df values are pd.DataFrame (index=years, columns=scenario)

  • format_y (function, optional) –

  • n_columns (int, default 3) –

  • n_bins (int, default 2) –

  • save (str, default None) –

  • scatter (dict, default None) –

project.utils.make_grouped_subplots(dict_df, n_columns=3, format_y=<function <lambda>>, n_bins=2, save=None, scatter=None, order=None, colors=None)[source]

Plot a line for each index in a subplot.

Parameters:
  • dict_df (dict) – dict_df values are pd.DataFrame (index=years, columns=scenario)

  • format_y (function, optional) – function to format y axis

  • n_columns (int, default 3) –

  • n_bins (int, default 2) – if not None, the x axis is divided into n_bins bins

  • save (str, default None) –

  • scatter (dict, default None) – scatter keys are the same as dict_df keys, values are pd.DataFrame (index=years, columns=scenario)

project.utils.make_hist(df, x, hue, y_label, legend=True, format_y=<function <lambda>>, save=None, kde=False, palette=None, bins=20, xlim=None)[source]
project.utils.make_horizontal_stackedbar_plot(df, y_label, colors=None, format_x=<function <lambda>>, save=None, ncol=3, ymin=0, hline=None, lineplot=None, rotation=0, loc='left', left=1.04, xmin=None, scatterplot=None, fontxtick=16, scatterplot_bis=None, legend_label='Social benefits', annotate='{:.0f}')[source]

Make horizontal stacked bar plot.

Parameters:
  • df (pd.DataFrame) –

  • y_label (str) –

  • colors (dict) –

  • format_x (function) –

  • save (str, optional) –

  • ncol (int, default 3) –

  • ymin (float, optional) –

  • hline (float, optional) –

  • lineplot (pd.Series, default None) –

  • rotation (int, default 0) –

  • loc (str, default 'left') –

  • left (float, default 1.04) –

  • xmin (int, default None) –

  • scatterplot (pd.Series, default None) –

  • fontxtick (int, default 16) –

  • scatterplot_bis (dict, default None) –

  • legend_label (str, default 'Social benefits') –

  • annotate (str, default '{:.0f}') –

project.utils.make_plot(df, y_label, colors=None, format_x=None, format_y=<function <lambda>>, save=None, scatter=None, legend=True, integer=True, ymin=0, ymax=None, hlines=None, labels=None, loc='upper', left=1.04, order_legend='reverse', ncol=3)[source]

Make plot.

Parameters:
  • df (pd.DataFrame or pd.Series) –

  • y_label (str) –

  • colors (dict) –

  • format_y (function) –

  • save (str, optional) –

  • scatter (pd.Series, default None) –

  • ymin (float, optional) –

project.utils.make_plots(dict_df, y_label, colors=None, format_y=<function <lambda>>, save=None, scatter=None, legend=True, integer=False, loc='upper', left=1.04, ymax=None, ymin=0, format_x=None, hlines=None, scatter_dict=None, labels=None, order_legend='reverse', x_tick_interval=None, ncol=3, xmin=None, xmax=None, export_csv=False)[source]

Make plot.

Parameters:
  • dict_df (dict) –

  • y_label (str) –

  • colors (dict) –

  • format_y (function) –

  • save (str, optional) –

  • scatter (pd.Series, default None) –

project.utils.make_policies_tables(policies, path, plot=True)[source]
project.utils.make_relplot(df, x, y, col=None, hue=None, palette=None, save=None, title=None, format_y=<function <lambda>>)[source]
project.utils.make_scatter_plot(df, x, y, x_label, y_label, hlines=None, format_y=<function <lambda>>, format_x=<function <lambda>>, save=None, xmin=None, ymin=None, col_size=None, leg_title=None, col_colors=None, annotate=True, xmax=None, ymax=None, diagonal_line=False, s=30)[source]
project.utils.make_sensitivity_tables(table_result, path)[source]
project.utils.make_stacked_bar_subplot(df, format_y=<function <lambda>>, fonttick=18, color=None, save=None, subplot_groups=['Housing type', 'Occupancy status'], index_group='Income tenant', stack_group='Type', annotate='{:.0f}€', annotate_bis=None, replace_legend=None, figtitle=None)[source]

Make stacked bar plot.

Parameters:
  • df (pd.Series with 4 levels of index) –

  • fonttick (int, default 18) –

  • color (str, optional) –

  • format_y (function, optional) –

  • save (str, optional) –

project.utils.make_stackedbar_plot(df, y_label, colors=None, format_y=<function <lambda>>, save=None, ncol=3, ymin=0, hline=None, lineplot=None, rotation=0, loc='left', left=1.04, xmin=None, scatterplot=None, fontxtick=16, scatterplot_bis=None, legend_label='Social benefits', annotate='{:.0f}')[source]

Make stacked bar plot.

Parameters:
  • df (pd.DataFrame) –

  • y_label (str) –

  • colors (dict) –

  • format_y (function) –

  • save (str, optional) –

  • ncol (int, default 3) –

  • ymin (float, optional) –

  • hline (float, optional) –

  • lineplot (pd.Series, default None) –

  • rotation (int, default 0) –

  • loc (str, default 'left') –

  • left (float, default 1.04) –

  • xmin (int, default None) –

  • scatterplot (pd.Series, default None) –

  • fontxtick (int, default 16) –

  • scatterplot_bis (dict, default None) –

  • legend_label (str, default 'Social benefits') –

  • annotate (str, default '{:.0f}') –

project.utils.make_swarmplot(df, y_label, hue=None, colors=None, hue_order=None, format_y=<function <lambda>>, save=None, name='Years')[source]
project.utils.make_uncertainty_plot(df, title, detailed=False, format_y=<function <lambda>>, ymin=0, save=None, scatter=None, columns=None, ncol=3, offset=1, loc='upper', left=1.04, reference='Reference')[source]

Plot multiple scenarios and the uncertainty area between the lowest and highest scenario values.

Parameters:
  • df (pd.DataFrame) – Each column represents one scenario

  • title (str) –

  • detailed (bool, default False) –

  • format_y (func) –

  • ymin (float or int) –

project.utils.manual_shapley_analysis(scenarios, list_features, y)[source]
project.utils.manual_sobol_analysis(scenarios, list_features, y)[source]

Manually compute the Sobol indices for a given set of scenarios and a given output variable y.

Parameters:
  • scenarios (DataFrame) – DataFrame containing the scenarios

  • list_features (list) – List of features to consider

  • y (str) – Output variable
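As an illustration of what a manual Sobol computation involves, the first-order index of a feature is Var(E[y | feature]) / Var(y). The sketch below computes it on a list of dictionaries rather than a DataFrame to stay dependency-free, and it assumes the scenarios form a complete, equally weighted design. The function name and the toy model are hypothetical, and manual_sobol_analysis may also compute other indices (such as total-order ones) that are not shown here.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def first_order_sobol(scenarios, list_features, y):
    """First-order Sobol index of each feature: Var(E[y | feature]) / Var(y)."""
    outputs = [row[y] for row in scenarios]
    total_variance = pvariance(outputs)
    indices = {}
    for feature in list_features:
        # Group outputs by the value the feature takes in each scenario.
        groups = defaultdict(list)
        for row in scenarios:
            groups[row[feature]].append(row[y])
        # Replace each output by its group mean, then take the variance.
        conditional_means = [fmean(groups[row[feature]]) for row in scenarios]
        indices[feature] = pvariance(conditional_means) / total_variance
    return indices


# Toy additive model y = 2a + b on a full factorial design.
runs = [{"a": a, "b": b, "y": 2 * a + b} for a in (0, 1) for b in (0, 1)]
print(first_order_sobol(runs, ["a", "b"], "y"))  # a: about 0.8, b: about 0.2
```

For an additive model such as this one the first-order indices sum to 1; interactions between features would leave a remainder.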

project.utils.memory_object(buildings)[source]
project.utils.parse_policies(config)[source]
project.utils.plot_attribute(stock, attribute, dict_order=None, suptitle=None, percent=False, dict_color=None, width=0.3, save=None, figsize=(12.8, 9.6))[source]

Make a bar plot of a single stock by one attribute, for graphical comparison.

Parameters:
  • stock (pd.Series) –

  • attribute (str) – Level name of stock.

  • dict_order (dict, optional) –

  • suptitle (str, optional) –

  • percent (bool) –

  • dict_color (dict, optional) –

  • width (float, default 0.3) –

project.utils.plot_attribute2attribute(stock, attribute1, attribute2, suptitle=None, dict_order={}, dict_color={}, percent=False, save=None, legend=True, left=1.1)[source]
project.utils.plot_ldmi_method(channel, emission, colors=None, rotation=0, save=None, format_y=<function <lambda>>, title=None, y_label='Emissions (MtCO2)')[source]

Plots LDMI decomposition method.

project.utils.plot_table(tables_policies, path)[source]
project.utils.plot_thermal_insulation(stock, save=None)[source]
project.utils.reindex_mi(df, mi_index, levels=None, axis=0)[source]

Return a DataFrame re-indexed on mi_index, matching on only a subset of its levels.

Parameters:
  • df (pd.DataFrame, pd.Series) – data to reindex

  • mi_index (pd.MultiIndex, pd.Index) – master index used to reindex df

  • levels (list, default df.index.names) – list of levels to use to reindex df

  • axis ({0, 1}, default 0) – axis to reindex df

Return type:

pd.DataFrame, pd.Series

Example

reindex_mi(surface_ds, segments, ['Occupancy status', 'Housing type'])

reindex_mi(cost_invest_ds, segments, ['Heating energy final', 'Heating energy'])

project.utils.reverse_dict(data)[source]
project.utils.save_fig(fig, save=None, bbox_inches='tight')[source]
project.utils.select(df, dict_levels)[source]
project.utils.size_dict(dict_vars, n=30, display=True)[source]
project.utils.stack_catplot(x, y, cat, stack, data, palette, y_label, save=None, leg_title=None, format_y=<function <lambda>>)[source]
project.utils.subplots_attributes(stock, dict_order={}, suptitle=None, percent=False, dict_color=None, n_columns=3, sharey=False, save=None)[source]

Multiple bar plots of stock by attributes.

Parameters:
  • stock (pd.Series) –

  • dict_order (dict) –

  • suptitle (str) –

  • percent (bool) –

  • dict_color (dict) –

  • n_columns (int) –

  • sharey (bool) –

project.utils.subplots_pie(stock, dict_order={}, pie={}, suptitle=None, percent=False, dict_color=None, n_columns=3, save=None)[source]

Multiple pie plots of stock by attributes.

Parameters:
  • stock (pd.Series) –

  • dict_order (dict) –

  • pie (dict) –

  • suptitle (str) –

  • percent (bool) –

  • dict_color (dict) –

  • n_columns (int) –

  • sharey (bool) –

project.utils.timing(f)[source]
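timing(f) has no docstring; its signature suggests a decorator that measures how long a function takes. A minimal sketch of that pattern, under a hypothetical name and with an assumed print-based report, could look like this:

```python
import functools
import time


def timed(func):
    """Hypothetical decorator: print how long each call to func takes."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failures are timed too.
            elapsed = time.perf_counter() - start
            print(f"{func.__name__} took {elapsed:.4f} s")

    return wrapper


@timed
def add(a, b):
    return a + b


print(add(1, 2))
```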
project.utils.waterfall_chart(df, title=None, save=None, colors=None, figsize=(12.8, 9.6))[source]

Make waterfall chart. Used for Social Economic Assessment.

Parameters:
  • df (pd.Series) –

  • title (str, optional) –

  • figsize