.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/plot_nonconforming_ggl.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code. .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_plot_nonconforming_ggl.py: Nonconforming Group Graphical Lasso experiment =================================================== Example script for Group Graphical Lasso with non-conforming dimension, i.e. some variables exist in some instances but not in all. We generate one underlying precision matrix and then drop one block of variables in each instance. .. GENERATED FROM PYTHON SOURCE LINES 9-28 .. code-block:: Python import numpy as np import pandas as pd from matplotlib import pyplot as plt import seaborn as sns from gglasso.helper.ext_admm_helper import create_group_array, construct_indexer from gglasso.helper.utils import sparsity from gglasso.helper.data_generation import generate_precision_matrix, group_power_network, sample_covariance_matrix from gglasso.helper.ext_admm_helper import check_G, consensus from gglasso.problem import glasso_problem K = 4 p = 50 M = 10 B = int(p/M) N = 200 .. GENERATED FROM PYTHON SOURCE LINES 29-35 Generating the data ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ We generate one precision matrix, sample observations, and finally drop one block of variables in each instance. It is important that the observations for each instance is a ``DataFrame`` of shape ``(p_k,N_k)`` where the index has unique ids for each variable. .. GENERATED FROM PYTHON SOURCE LINES 35-52 .. code-block:: Python p_arr = (p-B)*np.ones(K, dtype = int) num_samples = N*np.ones(K) Sigma, Theta = generate_precision_matrix(p=p, M=M, style = 'powerlaw', gamma = 2.8, prob = 0.1, seed = 3456) all_obs = dict() S = dict() for k in np.arange(K): _, obs = sample_covariance_matrix(Sigma, N, seed = 456) # drop the k-th block starting from the end all_obs[k] = pd.DataFrame(obs).drop(np.arange(p-(k+1)*B, p-k*B), axis = 0) S[k] = np.cov(all_obs[k], bias = True) .. GENERATED FROM PYTHON SOURCE LINES 53-62 Creating the groups ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ In this section, we create two important objects for the non-conforming case using functionalities of ``GGLasso``: ``ix_location`` is a dataframe with every variable as index and each columns is one dataset. It contains the index of the variable in the respective instance (``NaN`` when not present). ``G`` is a bookeeping array which keeps count of the indices of each group of overlapping variables. It is needed in the solver later on. **Important:** We only consider pairs of variables which appear at least in 3 instances here! .. GENERATED FROM PYTHON SOURCE LINES 62-75 .. code-block:: Python ix_exist, ix_location = construct_indexer(list(all_obs.values())) G = create_group_array(ix_exist, ix_location, min_inst = K-1) check_G(G, p) print("Dimensions p_k: ", p_arr) print("Sample sizes N_k: ", num_samples) print("Number of groups found: ", G.shape[1]) .. rst-class:: sphx-glr-script-out .. code-block:: none Creation of bookeeping array... 10% finished 20% finished 30% finished 40% finished 50% finished 60% finished 70% finished 80% finished 90% finished Dimensions p_k: [45 45 45 45] Sample sizes N_k: [200. 200. 200. 200.] Number of groups found: 1075 .. GENERATED FROM PYTHON SOURCE LINES 76-82 Visualizing ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ We visualize the case of non-conforming variables by plotting the given empirical covariance matrices. Missing variable observations are in **white**. .. GENERATED FROM PYTHON SOURCE LINES 82-96 .. code-block:: Python fig, axs = plt.subplots(2,2, figsize = (8,8)) for k in range(K): ind = ix_exist.index[ix_exist.loc[:,k]] S_k = pd.DataFrame(S[k], index = ind, columns = ind) # extend matrix by nonexistent variables S_k = S_k.reindex(columns = ix_exist.index, index = ix_exist.index) ax = axs.ravel()[k] sns.heatmap(S_k, ax = ax, cmap = plt.cm.coolwarm, linewidth = 0.005, linecolor = 'lightgrey',\ cbar = False, vmin = -.5, vmax = .5, xticklabels = [], yticklabels = []) ax.set_title(f"Empirical covariance, instance {k}") .. image-sg:: /auto_examples/images/sphx_glr_plot_nonconforming_ggl_001.png :alt: Empirical covariance, instance 0, Empirical covariance, instance 1, Empirical covariance, instance 2, Empirical covariance, instance 3 :srcset: /auto_examples/images/sphx_glr_plot_nonconforming_ggl_001.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 97-101 Defining the GGL problem ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ We now create the instance of Group Graphical Lasso problem. As we are in the non-conforming case, we need to spcify the array ``G`` which we created before. .. GENERATED FROM PYTHON SOURCE LINES 101-106 .. code-block:: Python P = glasso_problem(S = S, N = N, reg = "GGL", reg_params = None, latent = True, G = G, do_scaling = True) print(P) .. rst-class:: sphx-glr-script-out .. code-block:: none GROUP GRAPHICAL LASSO PROBLEM WITH LATENT VARIABLES Regularization parameters: {'lambda1': None, 'lambda2': None, 'mu1': None} .. GENERATED FROM PYTHON SOURCE LINES 107-110 Model selection ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Set the regularization parameter grids and do model selection. .. GENERATED FROM PYTHON SOURCE LINES 110-121 .. code-block:: Python l1 = np.logspace(0,-2,7) mu1 = np.logspace(1,-1,3) l2 = np.logspace(0,-2,4) modelselect_params = {'lambda1_range' : l1, 'mu1_range': mu1, 'lambda2_range': l2} P.model_selection(modelselect_params = modelselect_params, method = 'eBIC', gamma = 0.1, tol = 1e-7, rtol = 1e-7) print(P.reg_params) .. rst-class:: sphx-glr-script-out .. code-block:: none ------------Range search for instance 0------------ ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 83 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 83 iterations with status: optimal. ADMM terminated after 28 iterations with status: optimal. ADMM terminated after 25 iterations with status: optimal. ADMM terminated after 83 iterations with status: optimal. ADMM terminated after 35 iterations with status: optimal. ADMM terminated after 33 iterations with status: optimal. ADMM terminated after 101 iterations with status: optimal. ADMM terminated after 37 iterations with status: optimal. ADMM terminated after 40 iterations with status: optimal. ADMM terminated after 94 iterations with status: optimal. ADMM terminated after 23 iterations with status: optimal. ADMM terminated after 27 iterations with status: optimal. ADMM terminated after 50 iterations with status: optimal. ADMM terminated after 34 iterations with status: optimal. ADMM terminated after 32 iterations with status: optimal. ADMM terminated after 38 iterations with status: optimal. ------------Range search for instance 1------------ ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 82 iterations with status: optimal. ADMM terminated after 17 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 82 iterations with status: optimal. ADMM terminated after 28 iterations with status: optimal. ADMM terminated after 25 iterations with status: optimal. ADMM terminated after 82 iterations with status: optimal. ADMM terminated after 35 iterations with status: optimal. ADMM terminated after 33 iterations with status: optimal. ADMM terminated after 114 iterations with status: optimal. ADMM terminated after 37 iterations with status: optimal. ADMM terminated after 40 iterations with status: optimal. ADMM terminated after 90 iterations with status: optimal. ADMM terminated after 22 iterations with status: optimal. ADMM terminated after 27 iterations with status: optimal. ADMM terminated after 53 iterations with status: optimal. ADMM terminated after 34 iterations with status: optimal. ADMM terminated after 32 iterations with status: optimal. ADMM terminated after 35 iterations with status: optimal. ------------Range search for instance 2------------ ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 84 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 84 iterations with status: optimal. ADMM terminated after 28 iterations with status: optimal. ADMM terminated after 25 iterations with status: optimal. ADMM terminated after 83 iterations with status: optimal. ADMM terminated after 35 iterations with status: optimal. ADMM terminated after 33 iterations with status: optimal. ADMM terminated after 98 iterations with status: optimal. ADMM terminated after 36 iterations with status: optimal. ADMM terminated after 40 iterations with status: optimal. ADMM terminated after 91 iterations with status: optimal. ADMM terminated after 22 iterations with status: optimal. ADMM terminated after 27 iterations with status: optimal. ADMM terminated after 56 iterations with status: optimal. ADMM terminated after 33 iterations with status: optimal. ADMM terminated after 32 iterations with status: optimal. ADMM terminated after 34 iterations with status: optimal. ------------Range search for instance 3------------ ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 84 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 84 iterations with status: optimal. ADMM terminated after 28 iterations with status: optimal. ADMM terminated after 25 iterations with status: optimal. ADMM terminated after 84 iterations with status: optimal. ADMM terminated after 35 iterations with status: optimal. ADMM terminated after 33 iterations with status: optimal. ADMM terminated after 119 iterations with status: optimal. ADMM terminated after 38 iterations with status: optimal. ADMM terminated after 40 iterations with status: optimal. ADMM terminated after 87 iterations with status: optimal. ADMM terminated after 22 iterations with status: optimal. ADMM terminated after 27 iterations with status: optimal. ADMM terminated after 53 iterations with status: optimal. ADMM terminated after 33 iterations with status: optimal. ADMM terminated after 32 iterations with status: optimal. ADMM terminated after 39 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 21 iterations with status: optimal. ADMM terminated after 33 iterations with status: optimal. ADMM terminated after 32 iterations with status: optimal. ADMM terminated after 20 iterations with status: optimal. ADMM terminated after 29 iterations with status: optimal. ADMM terminated after 47 iterations with status: optimal. ADMM terminated after 45 iterations with status: optimal. ADMM terminated after 146 iterations with status: optimal. ADMM terminated after 146 iterations with status: optimal. ADMM terminated after 516 iterations with status: optimal. ADMM terminated after 203 iterations with status: optimal. ADMM terminated after 143 iterations with status: optimal. ADMM terminated after 147 iterations with status: optimal. ADMM terminated after 257 iterations with status: optimal. ADMM terminated after 168 iterations with status: optimal. ADMM terminated after 155 iterations with status: optimal. ADMM terminated after 164 iterations with status: optimal. ADMM terminated after 208 iterations with status: optimal. ADMM terminated after 180 iterations with status: optimal. {'lambda1': np.float64(0.2154434690031884), 'lambda2': np.float64(0.01), 'mu1': array([10., 10., 10., 10.])} .. GENERATED FROM PYTHON SOURCE LINES 122-129 Evaluation of results ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ We print the ratio of non-zero entries per instance. We plot the distribution of non-zero entries per group. With group sparsity regularization we aim for many groups with either no non-zero or many non-zero entries per group. .. GENERATED FROM PYTHON SOURCE LINES 129-143 .. code-block:: Python print("Solution sparsity (ratio of nonzero entries): ") for k in np.arange(K): print(f"Instance {k}: ", sparsity(P.solution.precision_[k])) stats = P.modelselect_stats.copy() nnz,_,_ = consensus(P.solution.precision_, G) fig, ax = plt.subplots() sns.histplot(nnz, discrete = True, ax = ax) ax.set_yscale('log') ax.set_title('Nonzero entries per group') .. image-sg:: /auto_examples/images/sphx_glr_plot_nonconforming_ggl_002.png :alt: Nonzero entries per group :srcset: /auto_examples/images/sphx_glr_plot_nonconforming_ggl_002.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none Solution sparsity (ratio of nonzero entries): Instance 0: 0.03232323232323232 Instance 1: 0.03333333333333333 Instance 2: 0.03333333333333333 Instance 3: 0.03333333333333333 Text(0.5, 1.0, 'Nonzero entries per group') .. rst-class:: sphx-glr-timing **Total running time of the script:** (0 minutes 15.461 seconds) .. _sphx_glr_download_auto_examples_plot_nonconforming_ggl.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_nonconforming_ggl.ipynb ` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_nonconforming_ggl.py ` .. container:: sphx-glr-download sphx-glr-download-zip :download:`Download zipped: plot_nonconforming_ggl.zip ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_