MA4261 Distributed Scientific Computing

General principles of parallel computing, parallel techniques and algorithms, solution of systems of linear equations, eigenvalue problems and singular value decomposition, domain decomposition, and applications (e.g., satellite orbit determination and shallow water fluid flow).

Prerequisite

Programming experience in some area of computational mathematics is required, such as numerical linear algebra (MA3046), numerical analysis (MA3232), machine learning, or artificial intelligence.

Lecture Hours

4

Lab Hours

0

Course Learning Outcomes

A student who has successfully met the objectives of this course can:


- Describe the architecture and operational principles of GPU and distributed memory systems, including CUDA and MPI frameworks.


- Apply the Julia programming language to implement parallel algorithms using CUDA.jl and MPI.jl for scientific computing tasks.


- Analyze the performance characteristics of parallel programs using profiling tools such as Nsight Compute and Nsight Systems.


- Evaluate the suitability of different parallelization strategies (e.g., thread-level, block-level, distributed memory) for solving computational problems in scientific and/or AI domains.


- Design and optimize parallel computing solutions that leverage GPU and MPI resources to address real-world research problems, demonstrating effective use of memory hierarchies and communication patterns.