Tutorial 1: Running Mapper as a Library
All the code for this tutorial can be found in the tutorials/tutorial1 folder. Run those files or just copy the snippets into your MATLAB environment.
Step 1: Import the library
To use the demapper library, you need to add the paths to the demapper files. We need all the files within the code subfolder:
clear;
% from this file location, go up two levels to find the base folder
basefolder = fileparts(fileparts(fileparts(mfilename('fullpath'))));
codefolder = [basefolder,'/code'];
addpath(genpath(codefolder)); % add the code folder to the path
% locate the data folder
datafolder = [basefolder, '/hasegan_et_al_netneuro_2024/data/trefoil_knot/'];
This script is set up to run from the tutorials/tutorial1 folder. If you run it from a different location, adjust basefolder accordingly.
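If the snippet is pasted into the command window instead of being run from the tutorial file, mfilename will not point at the tutorial folder. The following is a minimal sketch of one alternative, assuming the current working directory is the repository root (that assumption, and the warning text, are illustrative and not part of the library):

```matlab
% Hypothetical alternative to the mfilename-based lookup: start from the
% current working directory, assumed here to be the repository root.
basefolder = pwd;
codefolder = fullfile(basefolder, 'code');   % fullfile inserts the right separator

if isfolder(codefolder)
    addpath(genpath(codefolder));            % add code/ and all of its subfolders
else
    % The folder layout is an assumption; adjust basefolder if this fires.
    warning('Code folder not found: %s. Adjust basefolder.', codefolder);
end
```

Checking the folder before calling addpath makes a wrong basefolder fail loudly instead of silently leaving the library off the path.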
Step 2: Explore the data
Load the trefoil knot data and explore its 3D representation.
% The data file is a matrix defined in a CSV file format without a header.
data_path = [datafolder, 'data_treefoil.1D'];
data = read_1d(data_path);
% Size of data will be 120x3. A 3D dataset with 120 points and 3 coordinates
size(data)
% Explore the data
figure;
plot3(data(:,1), data(:,2), data(:,3), 'x');
% Load the "nodeCData" data, which assigns one of 3 colors to each data point
coloring_data = [datafolder, 'data_treefoil_task_nodeCData.1D'];
nodeCData = read_1d(coloring_data);
% Explore the trefoil knot data with the colors
figure;
scatter3(data(:,1), data(:,2), data(:,3), 100, nodeCData(:, 1:3), 'filled');
Resulting image:

Figure 2.1: trefoil Knot data in 3D
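For readers who do not have the data file at hand, a similar point cloud can be generated from the standard trefoil knot parametrization. This is only a sketch: the parametrization and the sampling below are assumptions, and the tutorial's data file may have been generated differently.

```matlab
% Sample 120 points along a standard trefoil knot parametrization.
% Assumption: this is NOT necessarily how the tutorial's data file was built.
n = 120;
t = linspace(0, 2*pi, n + 1)';
t(end) = [];                          % drop the endpoint, which repeats t = 0

knot = [sin(t) + 2*sin(2*t), ...      % x coordinate
        cos(t) - 2*cos(2*t), ...      % y coordinate
        -sin(3*t)];                   % z coordinate

% Color each point by its position along the curve
figure;
scatter3(knot(:,1), knot(:,2), knot(:,3), 36, t, 'filled');
```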
Step 3: Run Mapper
Run the following code to run a simple Mapper configuration on our dataset.
% zscore the data before running Mapper
data = zscore(data);
% Generate simple Mapper options. k=6, resolution=5, gain=30
opts = BDLMapperOpts(6, 5, 30);
% Run the Mapper algorithm
res = mapper(data, opts);
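The z-scoring step rescales every coordinate to zero mean and unit standard deviation, so that no single column dominates the distance computations. As a sketch of what zscore does with its default settings, here is the same computation written by hand on a small made-up matrix (the values are illustrative only; implicit expansion requires MATLAB R2016b or later):

```matlab
% Toy matrix: 3 points, 2 coordinates on very different scales
X = [1 10; 2 20; 3 60];

% Column-wise z-score: subtract the column mean, divide by the
% sample standard deviation (normalized by N-1)
Z = (X - mean(X, 1)) ./ std(X, 0, 1);

disp(mean(Z, 1));       % approximately 0 for every column
disp(std(Z, 0, 1));     % approximately 1 for every column
```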
What do the results look like?
res
% struct with fields:
% options: [1×1 struct]
% memberMat: [16×120 logical]
% adjacencyMat: [16×16 double]
% nodeMembers: {16×1 cell}
% knn_g: [1×1 graph]
It looks like mapper identified 16 nodes that represent the topology of the input data. Each of the original 120 data points is a member of one or more of the nodes. The res.memberMat matrix represents this membership.
memberMat
: a logical matrix of size numNodes x numDataPoints, where each row is a logical vector indicating which data points are members of that node.
Another way to view the membership is through the nodeMembers cell array.
nodeMembers
: a cell array containing the indices of the data points that are members of each node.
res.nodeMembers
% ans =
% 16×1 cell array
% {[107 108 109 110 111 112 113 114 115 116 117 118]}
% {[100 101 102 103 104 105 106 107 108 109 110 111]}
% {[ 93 94 95 96 97 98 99 100 101 102]}
% {[ 85 86 87 88 89 90 91 92 93 94 95 96]}
% {[ 79 80 81 82 83 84 85 86 87]}
% {[ 71 72 73 74 75 76 77 78 79 80]}
% {[ 63 64 65 66 67 68 69 70 71 72 73 74]}
% {[ 57 58 59 60 61 62 63 64]}
% {[ 47 48 49 50 51 52 53 54 55 56 57 58]}
% {[ 41 42 43 44 45 46 47 48 49 50]}
% {[ 34 35 36 37 38 39 40 41 42]}
% {[ 25 26 27 28 29 30 31 32 33 34 35 36]}
% {[ 19 20 21 22 23 24 25 26 27 28]}
% {[ 10 11 12 13 14 15 16 17 18 19 20 21]}
% {[ 3 4 5 6 7 8 9 10 11 12 13 14]}
% {[ 1 2 3 4 117 118 119 120]}
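The two views carry the same information. The following sketch converts a small made-up membership matrix into the cell-array form and counts how many nodes each point belongs to (the toy matrix is an assumption for illustration, not Mapper output):

```matlab
% Toy membership matrix: 3 nodes (rows) x 6 data points (columns)
memberMat = logical([1 1 1 0 0 0;
                     0 0 1 1 1 0;
                     0 0 0 0 1 1]);

% Row-wise find: the indices of the member points of each node
nodeMembers = cellfun(@find, num2cell(memberMat, 2), 'UniformOutput', false);

% Column sums: the number of nodes that contain each data point
nodesPerPoint = sum(memberMat, 1);

celldisp(nodeMembers);
disp(nodesPerPoint);
```

Applied to res.memberMat, the same two lines should reproduce res.nodeMembers and show which points sit in the overlap between nodes.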
Those nodes are connected to each other based on how many data points they share. The res.adjacencyMat matrix represents this connectivity.
adjacencyMat
: a matrix of size numNodes x numNodes, where each entry is the number of data points shared by the corresponding pair of nodes.
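Counting shared points amounts to a matrix product of the membership matrix with its own transpose. The sketch below shows this on a made-up membership matrix; whether res.adjacencyMat is computed exactly this way inside the library (for example, how the diagonal is treated) is an assumption based on the description above.

```matlab
% Toy membership matrix: 3 nodes (rows) x 5 data points (columns)
memberMat = logical([1 1 1 0 0;
                     0 0 1 1 0;
                     0 0 0 1 1]);

M = double(memberMat);
shared = M * M';                         % entry (i,j): points in both node i and node j
shared(1:size(shared, 1) + 1:end) = 0;   % clear the diagonal (no self-loops)

disp(shared);
% Nodes 1-2 share point 3 and nodes 2-3 share point 4; nodes 1-3 share nothing.
```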
The other fields of the resulting structure are:
options
: the opts used to generate the results.
knn_g
: the Penalized Reciprocal K-Nearest Neighbors graph of the data points, used to generate the Mapper graph. See Hasegan et al. 2024 for more details.
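To give some intuition for the "reciprocal" part, the sketch below builds a plain reciprocal (mutual) k-nearest-neighbors graph on random points: two points are connected only if each is among the other's k nearest neighbors. This is a simplified stand-in, not the penalized variant used by the library; the random data, the value of k, and all variable names are assumptions for illustration.

```matlab
rng(0);                      % reproducible toy data
X = rand(20, 3);             % 20 random points in 3D
k = 4;
n = size(X, 1);

% Pairwise Euclidean distances via implicit expansion (R2016b or later)
D = sqrt(sum((permute(X, [1 3 2]) - permute(X, [3 1 2])).^2, 3));
D(1:n + 1:end) = Inf;        % a point is never its own neighbor

% K(i,j) is true when j is among the k nearest neighbors of i
[~, order] = sort(D, 2);
nbrs = order(:, 1:k);
rows = repmat((1:n)', 1, k);
K = false(n);
K(sub2ind([n n], rows, nbrs)) = true;

% Keep only reciprocal relations: i lists j AND j lists i
R = K & K';
g = graph(R);                % logical adjacency gives an unweighted graph
fprintf('Reciprocal kNN graph has %d edges\n', numedges(g));
```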
Step 4: Visualize the results
To visualize the results, we can simply use the MATLAB built-in plot function.
%% Figure 4.1
% Plot the simplest graph, built directly from the adjacency matrix
figure;
g = graph(res.adjacencyMat);
plot(g, 'Layout', 'force', 'Usegravity', true, 'WeightEffect', 'inverse');
%% Figure 4.2
% We can do better by coloring the nodes based on the average value of the node members
% averaging over nodeMembers cell array
avgNode = cellfun(@mean, res.nodeMembers);
% and by sizing the nodes based on the number of node members
nodeSize = cellfun(@numel, res.nodeMembers);
nodeSize = normalize(nodeSize, 'range', [10, 20]);
figure;
g = graph(res.adjacencyMat);
plot(g, 'Layout', 'force', 'Usegravity', true, 'WeightEffect', 'inverse', ...
'MarkerSize', nodeSize, 'NodeCData', avgNode);
colorbar
colormap parula
%% Figure 4.3
% We can do even better by coloring the nodes based on the average color of the node members
% This time, we use the `nodeCData` data to color the nodes
nodeColor = cellfun(@(x) mean(nodeCData(x, :), 1), res.nodeMembers, 'UniformOutput', false);
nodeColor = cell2mat(nodeColor);
figure;
g = graph(res.adjacencyMat);
plot(g, 'Layout', 'force', 'Usegravity', true, 'WeightEffect', 'inverse', ...
'MarkerSize', nodeSize, 'NodeColor', nodeColor(:, 1:3));
The resulting figures are the following:

Figure 4.1: trefoil Knot data representation after Mapper

Figure 4.2: trefoil Knot data representation after Mapper, with node size representing the number of points within each node and node color representing the average point index, based on `nodeMembers`

Figure 4.3: trefoil Knot data representation after Mapper, with node color representing the average color of the node's member points, using the colors defined in `nodeCData` (as in Figure 2.1)
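Because each entry of the adjacency matrix counts shared points, the edge weights can also be shown in the plot. The sketch below scales the line width by edge weight on a small made-up weighted graph (the matrix and the scaling constants are arbitrary choices for illustration); the same properties can be passed when plotting graph(res.adjacencyMat).

```matlab
% Toy symmetric adjacency matrix; entries play the role of shared-point counts
A = [0 2 0 1;
     2 0 3 0;
     0 3 0 2;
     1 0 2 0];
g = graph(A);

% Map edge weights to line widths between 1 and 4 points
w = g.Edges.Weight;
lw = 1 + 3 * w / max(w);

figure;
plot(g, 'Layout', 'force', 'LineWidth', lw, 'EdgeLabel', w);
```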
Step 5: “Advanced” Visualization: plot_task
For an even better view of how the data points are distributed across the nodes, we can use the plot_task utility:
% Set the path of the resulting plot
save_path = 'figure_5_1.png';
% Load the "timing" data, which labels each data point with one of 3 task names
fn_timing = [datafolder, 'data_treefoil_task.csv'];
timing_table = readtable(fn_timing, 'FileType', 'text', 'Delimiter', ',');
timing_table.task_name = string(timing_table.task_name);
% Define the colormap, so that each node is colored according to the task
cmap = [0 0 1; 0 1 0; 1 0 0]; % Blue, Green, Red
% Simply plot the task by calling `plot_task`
plot_task(res, timing_table, save_path, false, cmap);
The timing_table is loaded from a CSV file containing the label of each data point. The first 10 lines of the CSV, shown with the BASH command head, are:
head hasegan_et_al_netneuro_2024/data/trefoil_knot/data_treefoil_task.csv
task_name
green
green
green
green
green
green
blue
blue
blue
The resulting figure is:

Figure 5.1: trefoil Knot data representation after Mapper with a pie chart for each node, representing its point contribution
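The quantity behind each pie chart is the share of every label among a node's member points. The sketch below computes these proportions on made-up labels and memberships; it illustrates the idea only and is an assumption about what plot_task displays, not a copy of its implementation.

```matlab
% Toy inputs: one label per data point and the member points of 3 nodes
labels = ["green"; "green"; "green"; "blue"; "blue"; "red"];
nodeMembers = {[1 2 3]; [3 4 5]; [5 6]};
tasks = ["blue", "green", "red"];

% props(i,j): fraction of node i's members that carry label tasks(j)
props = zeros(numel(nodeMembers), numel(tasks));
for i = 1:numel(nodeMembers)
    memberLabels = labels(nodeMembers{i});
    for j = 1:numel(tasks)
        props(i, j) = mean(memberLabels == tasks(j));
    end
end

disp(array2table(props, 'VariableNames', cellstr(tasks)));
```

With the tutorial data, labels would be timing_table.task_name and nodeMembers would be res.nodeMembers.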
Step 6: Using other utilities
The library comes with a set of predefined utilities that help interpret Mapper outputs.
%% Compute some general statistics on the computed graph
resdir = './stats'; % results directory
mkdir(resdir);
stats_args = struct; % no extra args needed here.
compute_stats(res, stats_args, resdir);
%% Compute the temporal connectivity matrices
resdir = './tmp'; % results directory
mkdir(resdir);
skip_temp = false; % do not skip saving the matrices
compute_temp(res, resdir, skip_temp);
Utility compute_stats
compute_stats: computes general statistics about the Mapper output. It generates the following files, as listed with the BASH command:
ls tutorials/tutorial1/stats/
stats.json
stats_betweenness_centrality.1D
stats_betweenness_centrality_TRs_avg.1D
stats_betweenness_centrality_TRs_max.1D
stats_core_periphery.1D
stats_core_periphery_TRs_avg.1D
stats_core_periphery_TRs_max.1D
stats_degrees_TRs.1D
stats_rich_club_coeffs.1D
Specifically the stats.json file contains the following information:
cat tutorials/tutorial1/stats/stats.json
{
"n_nodes" : 16,
"coverage_nodes" : 1,
"coverage_TRs" : 1,
"distances_max" : 8,
"distances_entropy" : 3.14629,
"assortativity" : 0.238255,
"degree_TRs_avg" : 2.83333,
"degree_TRs_entropy" : 0.979869
}
Those files can be used for further analysis or visualization. Check the file code/analysis/compute_stats.m for a detailed explanation of each generated file.
Note: If we were using a dataset that contains fMRI data, we could also generate the hrfdur_stat, which is the autocorrelation statistic used in Hasegan et al., 2024.
To generate it, we would need to provide the HRF_threshold and the TR value, for example:
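The JSON summary can be read back into MATLAB with the built-in jsondecode. The sketch below assumes the stats folder created in this step sits in the current working directory; when the file is missing, it falls back to an excerpt of the values listed above so that the snippet still runs.

```matlab
% Path assumes compute_stats was run from the current folder (resdir = './stats')
stats_file = fullfile('stats', 'stats.json');

if isfile(stats_file)
    stats = jsondecode(fileread(stats_file));
else
    % Fallback for illustration: excerpt of the stats.json contents shown above
    stats = jsondecode('{"n_nodes": 16, "distances_max": 8, "assortativity": 0.238255}');
end

% Fields of the decoded struct mirror the JSON keys
fprintf('n_nodes = %d, distances_max = %d\n', stats.n_nodes, stats.distances_max);
```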
stats_args = struct;
stats_args.HRF_threshold = 10;
stats_args.TR = 2.5;
Utility compute_temp
compute_temp: computes and plots the temporal matrices of the Mapper output. It generates the following files, as listed with the BASH command:
ls tutorials/tutorial1/tmp/
compute_temp-TCM-mat.1D
compute_temp-TCM.png
compute_temp-TCM_inv-mat.1D
compute_temp-TCM_inv.png
The Temporal Connectivity Matrix (or TCM) is the similarity matrix between the original points. The TCM_inv matrix is the inverse of the TCM matrix, representing the dissimilarity of the original points. Those matrices can be used as input for further analysis or visualization.
The TCM_inv matrix looks as follows:

Figure 6.1: Dissimilarity matrix between the original points of the trefoil knot