MATLAB is a high-level technical computing language and interactive environment for algorithm development, data visualization, data analysis, and numeric computation.
Georgia Tech licenses MATLAB for research and classroom use.
See the MathWorks website for extensive documentation on MATLAB.
What you need to know about running MATLAB on PACE-Managed systems
There are two main methods of running MATLAB on PACE systems. The first method is more complex but more efficient; if you are new to PACE, we recommend the second method:
- Running parallel MATLAB on a single-computer/node (more efficient)
- Running lots of MATLAB programs (less efficient)
  - Performing parameter sweeps
  - Evaluating many different inputs using the same program
This page assumes that you already have an account with PACE and are familiar with the scheduler and terminal. Please see the getting started page for a brief tutorial on using PACE systems.
A. Running a simple, sequential (single-core) MATLAB job in batch mode (no GUI)
- First, prepare your MATLAB script (e.g. "matlab_test.m").
- Then write a PBS script using your favorite editor as follows (change the parameters depending on your queue and job):
#PBS -N simple_matlab_job
#PBS -q <queue_name> #<-- Include the correct queue name here
#PBS -l walltime=30:00 #<-- Change the walltime here
#PBS -l nodes=1:ppn=1 #<-- Don't ask for more than 1 core if your matlab code is not parallelized
#PBS -j oe #<-- Merge output and error files
#PBS -o matlab_test.$PBS_JOBID #<-- Creates a merged output file named after the job ID
module load matlab/r2017a #<-- Change the version if needed
# Start a no-GUI, single-process MATLAB instance to run "matlab_test.m", passing "1" as input
matlab -nodisplay -nojvm -singleCompThread -r "matlab_test(1)"
B. Running parallel MATLAB on a single-computer/node
The PaceParalleltoolbox_r2016b script is intended to work both on desktop machines and on PACE computational clusters. To use it, place the script in any directory that MATLAB searches (the current folder or a folder on the MATLAB path) and call it from your own code (i.e. add PaceParalleltoolbox_r2016b to your script). Once called, the script opens n workers for a MATLAB pool, where n is the number of logical processors; on non-PACE resources it opens n-1 workers unless PaceParalleltoolbox_r2016b(true) is called.
There are a few options for the script. For more information, either read the comments at the top of the script or enter 'help PaceParalleltoolbox_r2016b' at the MATLAB prompt while the script is in a directory that MATLAB can read.
We've updated PaceParalleltoolbox_r2016b.m to fix a few problems with the r2014b version. It remains compatible as far back as r2013b. The script is intended to be transferable between desktop testing (Windows and Mac OS) and cluster runs. More comments have been added to make it easier to read. All functionality from the previous script is preserved (so you can replace the old one and it should still work), and there is some new functionality.
This script is maintained by Blake Fleischer. GT researchers should feel free to contact the maintainer at firstname.lastname@example.org with any problems or concerns. A more detailed version history for the script is available here.
Running lots of MATLAB programs or running lots of inputs through one program
Important note: The "How to run lots of jobs" scripts are not compatible with the PaceParalleltoolbox script (any version).
Many PACE users need to run hundreds or thousands of test cases through one or two MATLAB scripts. Each test case may take only a few minutes or hours on average, but running thousands of cases at 10 minutes apiece takes far too long on a desktop.
For these MATLAB applications, each test is completely independent of every other test. No results from one test depend on the results from another test.
This section describes a method for running these tests on PACE systems that does not require any modification of your MATLAB scripts. However, every test requires its own full-fledged copy of MATLAB, and each copy must be started independently. For this reason, this method is less efficient than the previous single-node method.
Basic steps (the "How to run lots of jobs" page details how to construct the files needed to perform this type of parameter sweep):
- Work out how to run each test by changing a function parameter.
- Enumerate all of the tests that must be executed
- Put every test into the jobs.txt file
- Make certain you use the "-singleCompThread" flag in the jobs.txt file
- The matlab command in the jobs.txt file should look something like this: matlab -nodisplay -singleCompThread -r "matlab_app(argument1,argument2)"
- Estimate how long you expect each test to take on average
- Edit the paralleljob.txt script (taken from the "How to run lots of jobs" page) to use a good number of cores.
- A "good" number of cores is greater than 1 and returns the results in a reasonable amount of time. If you expect 1000 jobs to run for 5 minutes each, 20 cores is probably a "good" number, since the run should take approximately 4 hours (1000 * 5 / 20 = 250 minutes).
- Important: The script must use only one node: #PBS -l nodes=1:ppn=X
- Submit the paralleljobs.txt script, splitting by batchsize and batchcount if necessary
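The enumeration and timing steps above can be sketched in a few lines of shell. This is only an illustration under stated assumptions: it assumes jobs.txt holds one command per line (check the "How to run lots of jobs" page for the exact format), matlab_app and its two arguments are the placeholders from the example command above, and the sweep ranges, minutes_per_job and cores values are hypothetical.

```shell
#!/bin/bash
# Sketch: build jobs.txt for a parameter sweep and estimate the run time.
# Assumption: jobs.txt takes one command per line.
# matlab_app and the sweep ranges below are placeholders, not real values.

jobs_file="jobs.txt"
: > "$jobs_file"        # start from an empty file
num_jobs=0

# One line per test: each (a, b) combination is one independent MATLAB run.
for a in $(seq 1 100); do
    for b in $(seq 1 10); do
        printf 'matlab -nodisplay -singleCompThread -r "matlab_app(%s,%s)"\n' \
            "$a" "$b" >> "$jobs_file"
        num_jobs=$((num_jobs + 1))
    done
done

# Rough wall-time estimate. Each core runs whole jobs one after another,
# so the number of rounds is ceil(num_jobs / cores).
minutes_per_job=5       # assumed average run time of one test
cores=20                # the X in "#PBS -l nodes=1:ppn=X"
rounds=$(( (num_jobs + cores - 1) / cores ))
est_minutes=$(( rounds * minutes_per_job ))

echo "jobs: $num_jobs, estimated wall time: $est_minutes minutes on $cores cores"
```

With these placeholder values the script writes 1000 lines to jobs.txt and reports 250 minutes, which matches the roughly 4 hours quoted above. Treat the estimate as a lower bound when choosing a walltime, since MATLAB startup time and uneven test lengths are not accounted for.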
MATLAB with GPU on PACE:
There have been reports of issues with MATLAB GPU calculations and virtual memory usage. matlab/r2015a apparently would attempt to allocate hundreds of GBs of virtual memory, causing the scheduler to kill the job. Some improvement was seen when using a small number of threads (fewer than 4 or so), but no comprehensive study has been performed to determine the cause. In general, for matlab/r2015a, combining heavy GPU loading/unloading operations with many threads isn't recommended, although it may succeed.