Tutorial 3: Running Mappers on fMRI data
=========================================
In this tutorial, we will run the Mapper algorithm on fMRI data. We expect that you have gone through the previous tutorials and have a basic understanding of the Mapper algorithm.

Let's say that you have fMRI data from one participant performing a task over two MRI sessions. You analyzed your data with `fmriprep` and `xcpengine`. Your xcpengine run generated a `cohort.csv` file, using the schaefer400x7 parcellation atlas, with the following columns:

.. literalinclude:: ../../tutorials/tutorial3/cohort.csv

The `path` column contains the path to the timeseries data, and the `TR` column contains the repetition time of the fMRI data.
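
To make the cohort format concrete, here is a minimal Python sketch of reading such a file with the standard `csv` module. The file contents below are hypothetical; only the `path` and `TR` columns come from this tutorial.

.. code-block:: python

   import csv

   # Hypothetical cohort contents -- replace with your actual cohort.csv.
   # Only the `path` and `TR` columns are guaranteed by the tutorial.
   lines = [
       "path,TR",
       "data/sub-1_ses-1_timeseries.csv,2.0",
       "data/sub-1_ses-2_timeseries.csv,2.0",
   ]

   # DictReader accepts any iterable of CSV lines, not just a file handle.
   rows = list(csv.DictReader(lines))
   for row in rows:
       print(row["path"], float(row["TR"]))
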
In this tutorial, we will run the Mapper algorithm on the fMRI data of the participant for both sessions.

===============================
Step 1. Mappers configuration
===============================

Let's say we want to analyze the Mapper graphs for the fMRI data of the participant. We do not know which parameters to use for the Mapper algorithm, so we have to run the DeMapper toolbox to generate all of the Mapper graphs for different parameters.
We will generate the following configuration:

.. literalinclude:: ../../tutorials/tutorial3/mappers.json

This configuration introduces two new preprocessing steps: `drop-nan` and `drop-lowvar`. The `drop-nan` step drops the columns (TRs) with NaN values and then drops the rows that still contain at least one NaN value. The `drop-lowvar` step drops the rows (ROIs) whose variance is below the configured threshold (`0.01`). You can read more about these steps in the preprocessing section.
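
The row-dropping behavior of these two steps can be sketched in plain Python on a toy matrix (rows are ROIs, columns are TRs; all values below are made up, and this is a simplification of the actual DeMapper steps, not their implementation):

.. code-block:: python

   import math

   # Toy timeseries matrix: rows = ROIs, columns = TRs (values are made up).
   nan = float("nan")
   X = [
       [0.1, 0.9, 0.1, 0.9],   # normal ROI
       [0.5, nan, 0.4, 0.5],   # ROI with a NaN -> removed by drop-nan
       [0.5, 0.5, 0.5, 0.5],   # zero-variance ROI -> removed by drop-lowvar
   ]

   def drop_nan(rows):
       """Keep only rows without any NaN values."""
       return [r for r in rows if not any(math.isnan(v) for v in r)]

   def drop_lowvar(rows, threshold=0.01):
       """Keep only rows whose variance is at or above the threshold."""
       def var(r):
           m = sum(r) / len(r)
           return sum((v - m) ** 2 for v in r) / len(r)
       return [r for r in rows if var(r) >= threshold]

   X = drop_lowvar(drop_nan(X))
   print(len(X))  # only the first ROI survives both filters
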
A set of Mapper parameters will be varied: `k`, `resolution`, and `gain`. In total, we will generate 4x3x5=60 Mapper configurations.
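
The full parameter grid can be enumerated with `itertools.product`. The parameter values below are placeholders; the actual lists live in `mappers.json`, and only the counts (4 x 3 x 5) come from this tutorial.

.. code-block:: python

   from itertools import product

   # Hypothetical parameter values -- the real ones are set in mappers.json.
   ks = [4, 8, 16, 32]          # 4 values of k
   resolutions = [10, 20, 30]   # 3 resolutions
   gains = [2, 3, 4, 5, 6]      # 5 gains

   configs = [
       {"k": k, "resolution": r, "gain": g}
       for k, r, g in product(ks, resolutions, gains)
   ]
   print(len(configs))  # 4 * 3 * 5 = 60
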
The analysis step `compute_stats` now takes extra arguments. The `HRF_threshold` argument generates additional stats. Check the analysis section for more details.

===============================
Step 2. Running DeMapper CLI
===============================

Putting all the files together, we can set up the script to run the DeMapper toolbox:

.. literalinclude:: ../../tutorials/tutorial3/run_mappers.sh
   :language: bash

Then we can run the above command:

.. code-block:: bash

   ./tutorials/tutorial3/run_mappers.sh
   # Output:
   # ...
   # Total mapper errors: 0

The script might take a while to generate all the Mapper graphs. You can check the status of the Mapper runs by running:

.. code-block:: bash

   tail -f results/tutorial3_mappers/status.csv


.. warning::

   We expect 120 Mappers in total (60 for each session). Running them sequentially will take too long, so we will run the script in parallel instead.


==========================================
Step 3. Running DeMapper CLI in parallel
==========================================

Let's modify the above script to tell the DeMapper toolbox to generate the Mappers in parallel. We also want to avoid regenerating the Mappers we already have, so we will use the `--skip` argument to skip them.

.. literalinclude:: ../../tutorials/tutorial3/run_mappers_parallel.sh
   :language: bash

Then we can run the above command:

.. code-block:: bash

   ./tutorials/tutorial3/run_mappers_parallel.sh
   # Output:
   # ...
   # Total mapper errors: 0

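
The skip-and-parallelize pattern can be sketched in Python with a thread pool. This is not DeMapper's implementation, just the general idea behind `--skip`: each worker checks for an existing output file and only computes what is missing, so reruns are cheap.

.. code-block:: python

   import tempfile
   from concurrent.futures import ThreadPoolExecutor
   from pathlib import Path

   def run_one(config_id, outdir):
       """Run one Mapper configuration unless its output already exists
       (the same idea as DeMapper's --skip argument)."""
       out = Path(outdir) / f"mapper_{config_id}.done"
       if out.exists():
           return "skipped"
       # ... the real Mapper computation would go here ...
       out.write_text("ok")
       return "generated"

   outdir = tempfile.mkdtemp()
   # First run: everything is generated in parallel.
   with ThreadPoolExecutor(max_workers=4) as pool:
       first = list(pool.map(lambda i: run_one(i, outdir), range(6)))
   # Second run: everything is skipped, since the outputs already exist.
   with ThreadPoolExecutor(max_workers=4) as pool:
       second = list(pool.map(lambda i: run_one(i, outdir), range(6)))
   print(first.count("generated"), second.count("skipped"))
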
One could analyze the results, as we did in the previous tutorials.

===============================
Step 4. Generate task graphs
===============================

Let's say that we have task information that varies consistently throughout the sessions. We want to see how the datapoints recorded during each task condition are represented in the Mapper graph.
For example, during each session, the participant performed a task divided into two conditions: `rest` and `thinking`. From our task design, we know when the participant was in each condition during each session. We can therefore generate a task CSV file for each session.

.. code-block:: bash

   head tutorials/tutorial3/sub-1_task1.csv
   # Output:
   # task_name
   # rest
   # rest
   # rest
   # rest
   # rest
   # thinking
   # thinking
   # thinking
   # thinking

   head tutorials/tutorial3/sub-1_task2.csv
   # Output:
   # task_name
   # thinking
   # thinking
   # thinking
   # thinking
   # thinking
   # thinking
   # rest
   # rest
   # rest

.. note::

   The task CSV files contain a `task_name` column with the task condition for each TR.

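
Building such a file from a block design can be sketched as follows. The design below (conditions and TR counts) is entirely hypothetical; the real one comes from your task logs, and the label list must have exactly one row per TR of the timeseries.

.. code-block:: python

   # Hypothetical block design: (condition, number of TRs) per block.
   design = [("rest", 5), ("thinking", 10), ("rest", 5)]

   labels = ["task_name"]  # header row, as in sub-1_task1.csv
   for condition, n_trs in design:
       labels.extend([condition] * n_trs)

   # Writing `labels` out one per line yields a task CSV like the ones above.
   print(len(labels) - 1)  # 20 TR labels after the header
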
Now we can create a new cohort file that provides the task information for each session:

.. literalinclude:: ../../tutorials/tutorial3/cohort_wtask.csv


.. warning::

   The name of the task column is very important. It has to be of the format `task_path_