multiscale/lammps, commit 1c550d8f
Authored 6 years ago by Stan Moore

Change defaults for GPU-direct to use comm host

Parent: d8aa6d53

Showing 3 changed files with 23 additions and 12 deletions:
  doc/src/Speed_kokkos.txt   +3 -4
  doc/src/package.txt        +6 -5
  src/KOKKOS/kokkos.cpp      +14 -3

doc/src/Speed_kokkos.txt (+3 -4)

@@ -102,8 +102,8 @@ the case, especially when using pre-compiled MPI libraries provided by
 a Linux distribution. This is not a problem when using only a single
 GPU and a single MPI rank on a desktop. When running with multiple
 MPI ranks, you may see segmentation faults without GPU-direct support.
-These can be avoided by adding the flags '-pk kokkos comm no gpu/direct no'
-to the LAMMPS command line or using "package kokkos comm no gpu/direct no"_package.html
+These can be avoided by adding the flags '-pk kokkos gpu/direct no'
+to the LAMMPS command line or using "package kokkos gpu/direct no"_package.html
 in the input file.
 Use a C++11 compatible compiler and set KOKKOS_ARCH variable in
@@ -273,8 +273,7 @@ to the same GPU with the KOKKOS package, but this is usually only
 faster if significant portions of the input script have not been
 ported to use Kokkos. Using CUDA MPS is recommended in this
 scenario. Using a CUDA-aware MPI library with support for GPU-direct
-is highly recommended and for some KOKKOS-enabled styles even required.
-Most GPU-direct use can be avoided by using "-pk kokkos comm no".
+is highly recommended. GPU-direct use can be avoided by using "-pk kokkos gpu/direct no".
 As above for multi-core CPUs (and no GPU), if N is the number of
 physical cores/node, then the number of MPI tasks/node should not
 exceed N.

doc/src/package.txt (+6 -5)

@@ -489,10 +489,9 @@ packing/unpacking operation.
 The optimal choice for these keywords depends on the input script and
 the hardware used. The {no} value is useful for verifying that the
-Kokkos-based {host} and {device} values are working correctly. The {no}
-value should also be used, in case of using an MPI library that does
-not support GPU-direct. It may also be the fastest choice when using
-Kokkos styles in MPI-only mode (i.e. with a thread count of 1).
+Kokkos-based {host} and {device} values are working correctly.
+It may also be the fastest choice when using Kokkos styles in
+MPI-only mode (i.e. with a thread count of 1).
 When running on CPUs or Xeon Phi, the {host} and {device} values work
 identically. When using GPUs, the {device} value will typically be
@@ -513,7 +512,9 @@ this keyword is set to {on}, buffers in GPU memory are passed directly
 through MPI send/receive calls. This reduces overhead of first copying
 the data to the host CPU. However GPU-direct is not supported on all
 systems, which can lead to segmentation faults and would require
-using a value of {off}.
+using a value of {off}. When the {gpu/direct} keyword is set to {off}
+while any of the {comm} keywords are set to {device}, the value for the
+{comm} keywords will be automatically changed to {host}.
 :line

src/KOKKOS/kokkos.cpp (+14 -3)

@@ -156,11 +156,11 @@ KokkosLMP::KokkosLMP(LAMMPS *lmp, int narg, char **arg) : Pointers(lmp)
     } else if (-1 == have_gpu_direct()) {
       error->warning(FLERR,"Kokkos with CUDA assumes GPU-direct is available,"
                      " but cannot determine if this is the case\n         try"
-                     " '-pk kokkos comm no gpu/direct no' when getting segmentation faults");
+                     " '-pk kokkos gpu/direct no' when getting segmentation faults");
     } else if (0 == have_gpu_direct()) {
       error->warning(FLERR,"GPU-direct is NOT available, but some parts of "
                      "Kokkos with CUDA require it\n         try"
-                     " '-pk kokkos comm no gpu/direct no' when getting segmentation faults");
+                     " '-pk kokkos gpu/direct no' when getting segmentation faults");
     } else {
       ; // should never get here
     }
@@ -186,7 +186,7 @@ KokkosLMP::KokkosLMP(LAMMPS *lmp, int narg, char **arg) : Pointers(lmp)
   exchange_comm_on_host = 0;
   forward_comm_on_host = 0;
   reverse_comm_on_host = 0;
-  gpu_direct = 0;
+  gpu_direct = 1;
 #ifdef KILL_KOKKOS_ON_SIGSEGV
   signal(SIGSEGV, my_signal_handler);
@@ -310,6 +310,17 @@ void KokkosLMP::accelerator(int narg, char **arg)
     } else error->all(FLERR,"Illegal package kokkos command");
   }
+  // if "gpu/direct no" and "comm device", change to "comm host"
+  if (!gpu_direct) {
+    if (exchange_comm_classic == 0 && exchange_comm_on_host == 0)
+      exchange_comm_on_host = 1;
+    if (forward_comm_classic == 0 && forward_comm_on_host == 0)
+      forward_comm_on_host = 1;
+    if (reverse_comm_classic == 0 && reverse_comm_on_host == 0)
+      reverse_comm_on_host = 1;
+  }
   // set newton flags
   // set neighbor binsize, same as neigh_modify command