Skip to content
GitLab
Explore
Sign in
Primary navigation
Search or go to…
Project
L
lammps
Manage
Activity
Members
Labels
Plan
Issues
Issue boards
Milestones
Wiki
Code
Merge requests
Repository
Branches
Commits
Tags
Repository graph
Compare revisions
Snippets
Build
Pipelines
Jobs
Pipeline schedules
Artifacts
Deploy
Releases
Container Registry
Model registry
Operate
Environments
Monitor
Incidents
Analyze
Value stream analytics
Contributor analytics
CI/CD analytics
Repository analytics
Model experiments
Help
Help
Support
GitLab documentation
Compare GitLab plans
Community forum
Contribute to GitLab
Provide feedback
Keyboard shortcuts
?
Snippets
Groups
Projects
Show more breadcrumbs
multiscale
lammps
Commits
f4421a1c
Commit
f4421a1c
authored
10 years ago
by
sjplimp
Browse files
Options
Downloads
Patches
Plain Diff
git-svn-id:
svn://svn.icms.temple.edu/lammps-ro/trunk@12472
f3b2605a-c512-4ea7-a41b-209d697bcdaa
parent
0eb045dc
No related branches found
No related tags found
No related merge requests found
Changes
4
Hide whitespace changes
Inline
Side-by-side
Showing
4 changed files
doc/accelerate_opt.html
+1
-1
1 addition, 1 deletion
doc/accelerate_opt.html
doc/accelerate_opt.txt
+1
-1
1 addition, 1 deletion
doc/accelerate_opt.txt
doc/package.html
+53
-16
53 additions, 16 deletions
doc/package.html
doc/package.txt
+53
-16
53 additions, 16 deletions
doc/package.txt
with
108 additions
and
34 deletions
doc/accelerate_opt.html
+
1
−
1
View file @
f4421a1c
...
...
@@ -10,7 +10,7 @@
<HR>
<P><A
HREF =
"Section_accelerate.html"
>
Return to Section accelerate
</A>
<P><A
HREF =
"Section_accelerate.html"
>
Return to Section accelerate
overview
</A>
</P>
<H4>
5.3.6 OPT package
</H4>
...
...
This diff is collapsed.
Click to expand it.
doc/accelerate_opt.txt
+
1
−
1
View file @
f4421a1c
...
...
@@ -7,7 +7,7 @@
:line
"Return to Section accelerate"_Section_accelerate.html
"Return to Section accelerate
overview
"_Section_accelerate.html
5.3.6 OPT package :h4
...
...
This diff is collapsed.
Click to expand it.
doc/package.html
+
53
−
16
View file @
f4421a1c
...
...
@@ -22,7 +22,10 @@
<PRE>
<I>
cuda
</I>
args = Ngpu keyword value ...
Ngpu = # of GPUs per node
zero or more keyword/value pairs may be appended
keywords =
<I>
gpuID
</I>
or
<I>
timing
</I>
or
<I>
test
</I>
or
<I>
thread
</I>
keywords =
<I>
newton
</I>
or
<I>
gpuID
</I>
or
<I>
timing
</I>
or
<I>
test
</I>
or
<I>
thread
</I>
<I>
newton
</I>
=
<I>
off
</I>
or
<I>
on
</I>
off = set Newton pairwise and bonded flags off (default)
on = set Newton pairwise and bonded flags on
<I>
gpuID
</I>
values = gpu1 .. gpuN
gpu1 .. gpuN = IDs of the Ngpu GPUs to use
<I>
timing
</I>
values = none
...
...
@@ -39,6 +42,9 @@
<I>
neigh
</I>
value =
<I>
yes
</I>
or
<I>
no
</I>
yes = neighbor list build on GPU (default)
no = neighbor list build on CPU
<I>
newton
</I>
=
<I>
off
</I>
or
<I>
on
</I>
off = set Newton pairwise flag off (default and required)
on = set Newton pairwise flag on (currently not allowed)
<I>
split
</I>
= fraction
fraction = fraction of atoms assigned to GPU (default = 1.0)
<I>
gpuID
</I>
values = first last
...
...
@@ -76,6 +82,9 @@
half = half neighbor list, not thread-safe, only use when 1 thread/MPI task
n2 = non-binning neighbor list build, O(N^2) algorithm
full/cluster = full neighbor list with clustered groups of atoms
<I>
newton
</I>
=
<I>
off
</I>
or
<I>
on
</I>
off = set Newton pairwise and bonded flags off (default)
on = set Newton pairwise and bonded flags on
<I>
comm
</I>
value =
<I>
no
</I>
or
<I>
host
</I>
or
<I>
device
</I>
use value for both comm/exchange and comm/forward
<I>
comm/exchange
</I>
value =
<I>
no
</I>
or
<I>
host
</I>
or
<I>
device
</I>
...
...
@@ -163,6 +172,12 @@ exactly one MPI task per GPU, as set by the mpirun or mpiexec command.
<P>
Optional keyword/value pairs can also be specified. Each has a
default value as listed below.
</P>
<P>
The
<I>
newton
</I>
keyword sets the Newton flags for pairwise and bonded
interactions to
<I>
off
</I>
or
<I>
on
</I>
, the same as the
<A
HREF =
"newton.html"
>
newton
</A>
command allows. The default is
<I>
off
</I>
because this will almost always
give better performance for the USER-CUDA package. This means
more computation is done, but less communication.
</P>
<P>
The
<I>
gpuID
</I>
keyword allows selection of which GPUs on each node will
be used for a simulation. GPU IDs range from 0 to N-1 where N is the
physical number of GPUs/node. An ID is specified for each of the
...
...
@@ -227,6 +242,16 @@ enabled command requires a neighbor list, it will also be built on the
CPU. In these cases, it will typically be more efficient to only use
CPU neighbor list builds.
</P>
<P>
The
<I>
newton
</I>
keyword sets the Newton flags for pairwise (not bonded)
interactions to
<I>
off
</I>
or
<I>
on
</I>
, the same as the
<A
HREF =
"newton.html"
>
newton
</A>
command allows. Currently, only an
<I>
off
</I>
value is allowed, since all
the GPU package pair styles require this setting. This means more
computation is done, but less communication. In the future a value of
<I>
on
</I>
may be allowed, so the
<I>
newton
</I>
keyword is included as an option
for compatibility with the package command for other accelerator
styles. Note that the newton setting for bonded interactions is not
affected by this keyword.
</P>
<P>
The
<I>
split
</I>
keyword can be used for load balancing force calculations
between CPU and GPU cores in GPU-enabled pair styles. If 0
<
<
I
>
split
</I>
<
1.0,
a
fixed
fraction
of
particles
is
offloaded
to
the
GPU
while
force
...
...
@@ -372,7 +397,10 @@ than the other methods, which use binning.
<P>
A value of
<I>
full
</I>
uses a full neighbor lists and is the default. This
performs twice as much computation as the
<I>
half
</I>
option, however that
is often a win because it is thread-safe and doesn't require atomic
operations in the calculation of pair forces.
operations in the calculation of pair forces. For that reason,
<I>
full
</I>
is the default setting. However, when running in MPI-only mode with 1
thread per MPI task,
<I>
half
</I>
neighbor lists will typically be faster,
just as it is for non-accelerated pair styles.
</P>
<P>
A value of
<I>
full/cluster
</I>
is an experimental neighbor style, where
particles interact with all particles within a small cluster, if at
...
...
@@ -382,6 +410,14 @@ architectures such as the Intel Phi. If also reduces the size of the
neighbor list by roughly a factor of the cluster size, thus reducing
the total memory footprint considerably.
</P>
<P>
The
<I>
newton
</I>
keyword sets the Newton flags for pairwise and bonded
interactions to
<I>
off
</I>
or
<I>
on
</I>
, the same as the
<A
HREF =
"newton.html"
>
newton
</A>
command allows. The default is
<I>
off
</I>
because this will almost always
give better performance for the KOKKOS package. This means more
computation is done, but less communication. However, when running in
MPI-only mode with 1 thread per MPI task, a value of
<I>
on
</I>
will
typically be faster, just as it is for non-accelerated pair styles.
</P>
<P>
The
<I>
comm
</I>
and
<I>
comm/exchange
</I>
and
<I>
comm/forward
</I>
keywords determine
whether the host or device performs the packing and unpacking of data
when communicating per-atom data between processors. "Exchange"
...
...
@@ -513,17 +549,17 @@ setting</A>
<P><B>
Default:
</B>
</P>
<P>
For the USER-CUDA package, the default is Ngpu = 1 and the option
defaults are gpuID = 0 to Ngpu-1, timing = not enabled,
test = not
enabled, and thread = auto. These settings are made
automatically by
the required "-c on"
<A
HREF =
"Section_start.html#start_7"
>
command-line
switch
</A>
.
You can change them bu using the
package cuda command in your input
script or via the "-pk cuda"
<A
HREF =
"Section_start.html#start_7"
>
command-line
switch
</A>
.
defaults are
newton = off,
gpuID = 0 to Ngpu-1, timing = not enabled,
test = not
enabled, and thread = auto. These settings are made
automatically by
the required "-c on"
<A
HREF =
"Section_start.html#start_7"
>
command-line
switch
</A>
.
You can change them bu using the
package cuda command in your input script or via the "-pk cuda"
<A
HREF =
"Section_start.html#start_7"
>
command-line
switch
</A>
.
</P>
<P>
For the GPU package, the default is Ngpu = 1 and the option defaults
are neigh = yes, split = 1.0, gpuID = 0 to Ngpu-1, tpa =
1, binsize =
pair cutoff + neighbor skin, device = not used. These
settings are
made automatically if the "-sf gpu"
<A
HREF =
"Section_start.html#start_7"
>
command-line
are neigh = yes,
newton = off,
split = 1.0, gpuID = 0 to Ngpu-1, tpa =
1, binsize =
pair cutoff + neighbor skin, device = not used. These
settings are
made automatically if the "-sf gpu"
<A
HREF =
"Section_start.html#start_7"
>
command-line
switch
</A>
is used. If it is not used, you
must invoke the package gpu command in your input script or via the
"-pk gpu"
<A
HREF =
"Section_start.html#start_7"
>
command-line switch
</A>
.
...
...
@@ -539,11 +575,12 @@ switch</A> is used. If it is not used, you
must invoke the package intel command in your input script or or via
the "-pk intel"
<A
HREF =
"Section_start.html#start_7"
>
command-line switch
</A>
.
</P>
<P>
For the KOKKOS package, the option defaults neigh = full and comm =
host. These settings are made automatically by the required "-k on"
<A
HREF =
"Section_start.html#start_7"
>
command-line switch
</A>
. You can change them
bu using the package kokkos command in your input script or via the
"-pk kokkos"
<A
HREF =
"Section_start.html#start_7"
>
command-line switch
</A>
.
<P>
For the KOKKOS package, the option defaults neigh = full, newton =
off, and comm = host. These settings are made automatically by the
required "-k on"
<A
HREF =
"Section_start.html#start_7"
>
command-line switch
</A>
.
You can change them bu using the package kokkos command in your input
script or via the "-pk kokkos"
<A
HREF =
"Section_start.html#start_7"
>
command-line
switch
</A>
.
</P>
<P>
For the OMP package, the default is Nthreads = 0 and the option
defaults are neigh = yes. These settings are made automatically if
...
...
This diff is collapsed.
Click to expand it.
doc/package.txt
+
53
−
16
View file @
f4421a1c
...
...
@@ -17,7 +17,10 @@ args = arguments specific to the style :l
{cuda} args = Ngpu keyword value ...
Ngpu = # of GPUs per node
zero or more keyword/value pairs may be appended
keywords = {gpuID} or {timing} or {test} or {thread}
keywords = {newton} or {gpuID} or {timing} or {test} or {thread}
{newton} = {off} or {on}
off = set Newton pairwise and bonded flags off (default)
on = set Newton pairwise and bonded flags on
{gpuID} values = gpu1 .. gpuN
gpu1 .. gpuN = IDs of the Ngpu GPUs to use
{timing} values = none
...
...
@@ -34,6 +37,9 @@ args = arguments specific to the style :l
{neigh} value = {yes} or {no}
yes = neighbor list build on GPU (default)
no = neighbor list build on CPU
{newton} = {off} or {on}
off = set Newton pairwise flag off (default and required)
on = set Newton pairwise flag on (currently not allowed)
{split} = fraction
fraction = fraction of atoms assigned to GPU (default = 1.0)
{gpuID} values = first last
...
...
@@ -71,6 +77,9 @@ args = arguments specific to the style :l
half = half neighbor list, not thread-safe, only use when 1 thread/MPI task
n2 = non-binning neighbor list build, O(N^2) algorithm
full/cluster = full neighbor list with clustered groups of atoms
{newton} = {off} or {on}
off = set Newton pairwise and bonded flags off (default)
on = set Newton pairwise and bonded flags on
{comm} value = {no} or {host} or {device}
use value for both comm/exchange and comm/forward
{comm/exchange} value = {no} or {host} or {device}
...
...
@@ -157,6 +166,12 @@ exactly one MPI task per GPU, as set by the mpirun or mpiexec command.
Optional keyword/value pairs can also be specified. Each has a
default value as listed below.
The {newton} keyword sets the Newton flags for pairwise and bonded
interactions to {off} or {on}, the same as the "newton"_newton.html
command allows. The default is {off} because this will almost always
give better performance for the USER-CUDA package. This means
more computation is done, but less communication.
The {gpuID} keyword allows selection of which GPUs on each node will
be used for a simulation. GPU IDs range from 0 to N-1 where N is the
physical number of GPUs/node. An ID is specified for each of the
...
...
@@ -221,6 +236,16 @@ enabled command requires a neighbor list, it will also be built on the
CPU. In these cases, it will typically be more efficient to only use
CPU neighbor list builds.
The {newton} keyword sets the Newton flags for pairwise (not bonded)
interactions to {off} or {on}, the same as the "newton"_newton.html
command allows. Currently, only an {off} value is allowed, since all
the GPU package pair styles require this setting. This means more
computation is done, but less communication. In the future a value of
{on} may be allowed, so the {newton} keyword is included as an option
for compatibility with the package command for other accelerator
styles. Note that the newton setting for bonded interactions is not
affected by this keyword.
The {split} keyword can be used for load balancing force calculations
between CPU and GPU cores in GPU-enabled pair styles. If 0 < {split} <
1.0, a fixed fraction of particles is offloaded to the GPU while force
...
...
@@ -366,7 +391,10 @@ than the other methods, which use binning.
A value of {full} uses a full neighbor lists and is the default. This
performs twice as much computation as the {half} option, however that
is often a win because it is thread-safe and doesn't require atomic
operations in the calculation of pair forces.
operations in the calculation of pair forces. For that reason, {full}
is the default setting. However, when running in MPI-only mode with 1
thread per MPI task, {half} neighbor lists will typically be faster,
just as it is for non-accelerated pair styles.
A value of {full/cluster} is an experimental neighbor style, where
particles interact with all particles within a small cluster, if at
...
...
@@ -376,6 +404,14 @@ architectures such as the Intel Phi. If also reduces the size of the
neighbor list by roughly a factor of the cluster size, thus reducing
the total memory footprint considerably.
The {newton} keyword sets the Newton flags for pairwise and bonded
interactions to {off} or {on}, the same as the "newton"_newton.html
command allows. The default is {off} because this will almost always
give better performance for the KOKKOS package. This means more
computation is done, but less communication. However, when running in
MPI-only mode with 1 thread per MPI task, a value of {on} will
typically be faster, just as it is for non-accelerated pair styles.
The {comm} and {comm/exchange} and {comm/forward} keywords determine
whether the host or device performs the packing and unpacking of data
when communicating per-atom data between processors. "Exchange"
...
...
@@ -507,17 +543,17 @@ setting"_Section_start.html#start_7
[Default:]
For the USER-CUDA package, the default is Ngpu = 1 and the option
defaults are gpuID = 0 to Ngpu-1, timing = not enabled,
test = not
enabled, and thread = auto. These settings are made
automatically by
the required "-c on" "command-line
switch"_Section_start.html#start_7.
You can change them bu using the package cuda command in your input
script or via the "-pk cuda"
"command-line
switch"_Section_start.html#start_7.
defaults are
newton = off,
gpuID = 0 to Ngpu-1, timing = not enabled,
test = not
enabled, and thread = auto. These settings are made
automatically by
the required "-c on" "command-line
switch"_Section_start.html#start_7. You can change them bu using the
package cuda command in your input
script or via the "-pk cuda"
"command-line
switch"_Section_start.html#start_7.
For the GPU package, the default is Ngpu = 1 and the option defaults
are neigh = yes, split = 1.0, gpuID = 0 to Ngpu-1, tpa =
1, binsize =
pair cutoff + neighbor skin, device = not used. These
settings are
made automatically if the "-sf gpu" "command-line
are neigh = yes,
newton = off,
split = 1.0, gpuID = 0 to Ngpu-1, tpa =
1, binsize =
pair cutoff + neighbor skin, device = not used. These
settings are
made automatically if the "-sf gpu" "command-line
switch"_Section_start.html#start_7 is used. If it is not used, you
must invoke the package gpu command in your input script or via the
"-pk gpu" "command-line switch"_Section_start.html#start_7.
...
...
@@ -533,11 +569,12 @@ switch"_Section_start.html#start_7 is used. If it is not used, you
must invoke the package intel command in your input script or or via
the "-pk intel" "command-line switch"_Section_start.html#start_7.
For the KOKKOS package, the option defaults neigh = full and comm =
host. These settings are made automatically by the required "-k on"
"command-line switch"_Section_start.html#start_7. You can change them
bu using the package kokkos command in your input script or via the
"-pk kokkos" "command-line switch"_Section_start.html#start_7.
For the KOKKOS package, the option defaults neigh = full, newton =
off, and comm = host. These settings are made automatically by the
required "-k on" "command-line switch"_Section_start.html#start_7.
You can change them bu using the package kokkos command in your input
script or via the "-pk kokkos" "command-line
switch"_Section_start.html#start_7.
For the OMP package, the default is Nthreads = 0 and the option
defaults are neigh = yes. These settings are made automatically if
...
...
This diff is collapsed.
Click to expand it.
Preview
0%
Loading
Try again
or
attach a new file
.
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Save comment
Cancel
Please
register
or
sign in
to comment