Commit ee3b7a67 (unverified)
Authored 6 years ago by Steve Plimpton; committed by GitHub 6 years ago
Parents: 2c190797, 2b5618dc

Merge pull request #1066 from rbberger/doc-fixes

minor tweak to docs

doc/src/Speed_gpu.txt (35 additions, 21 deletions)
@@ -9,17 +9,17 @@ Documentation"_ld - "LAMMPS Commands"_lc :c
 
 GPU package :h3
 
-The GPU package was developed by Mike Brown at ORNL and his
-collaborators, particularly Trung Nguyen (ORNL). It provides GPU
-versions of many pair styles, including the 3-body Stillinger-Weber
-pair style, and for "kspace_style pppm"_kspace_style.html for
-long-range Coulombics. It has the following general features:
+The GPU package was developed by Mike Brown while at SNL and ORNL
+and his collaborators, particularly Trung Nguyen (now at Northwestern).
+It provides GPU versions of many pair styles and for parts of the
+"kspace_style pppm"_kspace_style.html for long-range Coulombics.
+It has the following general features:
 
 It is designed to exploit common GPU hardware configurations where one
 or more GPUs are coupled to many cores of one or more multi-core CPUs,
 e.g. within a node of a parallel machine. :ulb,l
 
-Atom-based data (e.g. coordinates, forces) moves back-and-forth
+Atom-based data (e.g. coordinates, forces) are moved back-and-forth
 between the CPU(s) and GPU every timestep. :l
 
 Neighbor lists can be built on the CPU or on the GPU :l
@@ -28,8 +28,8 @@ The charge assignment and force interpolation portions of PPPM can be
 run on the GPU. The FFT portion, which requires MPI communication
 between processors, runs on the CPU. :l
 
-Asynchronous force computations can be performed simultaneously on the
-CPU(s) and GPU. :l
+Force computations of different style (pair vs. bond/angle/dihedral/improper)
+can be performed concurrently on the GPU and CPU(s), respectively. :l
 
 It allows for GPU computations to be performed in single or double
 precision, or in mixed-mode precision, where pairwise forces are
@@ -39,21 +39,32 @@ force vectors. :l
 LAMMPS-specific code is in the GPU package. It makes calls to a
 generic GPU library in the lib/gpu directory. This library provides
 NVIDIA support as well as more general OpenCL support, so that the
-same functionality can eventually be supported on a variety of GPU
-hardware. :l
+same functionality is supported on a variety of hardware. :l
 :ule
 
 [Required hardware/software:]
 
-To use this package, you currently need to have an NVIDIA GPU and
-install the NVIDIA CUDA software on your system:
+To compile and use this package in CUDA mode, you currently need
+to have an NVIDIA GPU and install the corresponding NVIDIA CUDA
+toolkit software on your system (this is primarily tested on Linux
+and completely unsupported on Windows):
 
-Check if you have an NVIDIA GPU: cat
-/proc/driver/nvidia/gpus/0/information Go to
-http://www.nvidia.com/object/cuda_get.html Install a driver and
-toolkit appropriate for your system (SDK is not necessary) Run
-lammps/lib/gpu/nvc_get_devices (after building the GPU library, see
-below) to list supported devices and properties :ul
+Check if you have an NVIDIA GPU: cat /proc/driver/nvidia/gpus/*/information :ulb,l
+Go to http://www.nvidia.com/object/cuda_get.html :l
+Install a driver and toolkit appropriate for your system (SDK is not necessary) :l
+Run lammps/lib/gpu/nvc_get_devices (after building the GPU library, see below) to
+list supported devices and properties :ule,l
+
+To compile and use this package in OpenCL mode, you currently need
+to have the OpenCL headers and the (vendor neutral) OpenCL library installed.
+In OpenCL mode, the acceleration depends on having an "OpenCL Installable Client
+Driver (ICD)"_https://www.khronos.org/news/permalink/opencl-installable-client-driver-icd-loader
+installed. There can be multiple of them for the same or different hardware
+(GPUs, CPUs, Accelerators) installed at the same time. OpenCL refers to those
+as 'platforms'. The GPU library will select the [first] suitable platform,
+but this can be overridded using the device option of the "package"_package.html
+command. run lammps/lib/gpu/ocl_get_devices to get a list of available
+platforms and devices with a suitable ICD available.
 
 [Building LAMMPS with the GPU package:]
 
@@ -120,7 +131,10 @@ GPUs/node to use, as well as other options.
 
 The performance of a GPU versus a multi-core CPU is a function of your
 hardware, which pair style is used, the number of atoms/GPU, and the
-precision used on the GPU (double, single, mixed).
+precision used on the GPU (double, single, mixed). Using the GPU package
+in OpenCL mode on CPUs (which uses vectorization and multithreading) is
+usually resulting in inferior performance compared to using LAMMPS' native
+threading and vectorization support in the USER-OMP and USER-INTEL packages.
 
 See the "Benchmark page"_http://lammps.sandia.gov/bench.html of the
 LAMMPS web site for performance of the GPU package on various
@@ -146,7 +160,7 @@ The "package gpu"_package.html command has several options for tuning
 performance. Neighbor lists can be built on the GPU or CPU. Force
 calculations can be dynamically balanced across the CPU cores and
 GPUs. GPU-specific settings can be made which can be optimized
-for different hardware. See the "packakge"_package.html command
+for different hardware. See the "package"_package.html command
 doc page for details. :l
 
 As described by the "package gpu"_package.html command, GPU
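
The hardware checks listed under [Required hardware/software:] in the diff above could be scripted roughly as follows. This is a minimal sketch and not part of the commit: the function names and the LAMMPS_DIR variable are hypothetical, and it assumes the nvc_get_devices and ocl_get_devices tools have already been built in lib/gpu as the doc text describes.

```shell
#!/bin/sh
# Sketch only: has_nvidia_gpu, list_devices and LAMMPS_DIR are
# illustrative names, not part of LAMMPS.

# Succeeds if at least one NVIDIA GPU information file is readable.
# $1 optionally overrides the proc root (default: /proc).
has_nvidia_gpu() {
    for info in "${1:-/proc}"/driver/nvidia/gpus/*/information; do
        [ -r "$info" ] && return 0
    done
    return 1
}

# Runs a device-query tool from lib/gpu if it has been built.
# $1 is the tool name: nvc_get_devices (CUDA) or ocl_get_devices (OpenCL).
list_devices() {
    tool="${LAMMPS_DIR:-lammps}/lib/gpu/$1"
    if [ -x "$tool" ]; then
        "$tool"
    else
        echo "$1 not found; build the GPU library first" >&2
        return 1
    fi
}
```

A caller might run `has_nvidia_gpu && list_devices nvc_get_devices` for CUDA mode, and `list_devices ocl_get_devices` to see which OpenCL platforms and devices have a suitable ICD.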