- 22 Oct, 2018 1 commit
-
-
Tomas Härdin authored
-
- 17 Oct, 2018 1 commit
-
-
Tomas Härdin authored
-
- 10 Oct, 2018 2 commits
-
-
Tomas Härdin authored
-
Tomas Härdin authored
Having some trouble running with the foss toolchain, else I'd include them instead.
-
- 09 Oct, 2018 3 commits
-
-
Tomas Härdin authored
-
Tomas Hardin authored
-
Tomas Hardin authored
Downloading and placing all required tarballs in the correct places is likely to be tedious.
-
- 08 Oct, 2018 2 commits
-
-
Tomas Härdin authored
-
Tomas Härdin authored
This is needed because OpenMPI on trusty fails with this error:

/usr/lib/openmpi/include/mpi_portable_platform.h:374:34: error: invalid suffix on literal; C++11 requires a space between literal and string macro [-Werror=literal-suffix]
_STRINGIFY(__GNUC__)"."_STRINGIFY(__GNUC_MINOR__)"."_STRINGIFY(__GNUC_PATCHLEVEL__)
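For context, a minimal sketch of the kind of code that triggers this diagnostic and the spacing that avoids it. The macro names (STRINGIFY, DEMO_MAJOR and friends) and the helper functions are hypothetical and are not taken from OpenMPI or from this repository. The sketch only illustrates the C++11 rule named in the error message, not the change made in this commit.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical two-step stringification macros, similar in spirit to the
// _STRINGIFY macro named in the error message above.
#define STRINGIFY_INNER(x) #x
#define STRINGIFY(x) STRINGIFY_INNER(x)

// Hypothetical version numbers standing in for __GNUC__ and friends.
#define DEMO_MAJOR 4
#define DEMO_MINOR 8
#define DEMO_PATCH 4

// Problematic form (left commented out): an identifier written directly after
// a string literal can be read as a C++11 user-defined literal suffix, which
// GCC reports through -Wliteral-suffix.
// const char *bad = STRINGIFY(DEMO_MAJOR)"."STRINGIFY(DEMO_MINOR);

// With a space between each literal and the following macro name, the pieces
// are ordinary adjacent string literals and get concatenated.
inline const char *demo_version() {
    return STRINGIFY(DEMO_MAJOR) "." STRINGIFY(DEMO_MINOR) "." STRINGIFY(DEMO_PATCH);
}

// Small self-check of the concatenated result.
inline void check_demo_version() {
    assert(std::strcmp(demo_version(), "4.8.4") == 0);
}
```

Since the offending line lives in a system header, adding spaces there is usually not an option for a project, which is presumably why a workaround on the build side was needed.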
-
- 02 Oct, 2018 3 commits
-
-
Tomas Härdin authored
-
Tomas Härdin authored
They're probably not complete.
-
Tomas Härdin authored
-
- 19 Sep, 2018 1 commit
-
-
Tomas Härdin authored
-
- 12 Sep, 2018 8 commits
-
-
Tomas Härdin authored
This has brought the speed of perftest2.sh from 2.3 kHz to 8.8 kHz, nearly quadruple the speed!

There's some stuff still left to do in the master:

* queueX() should only need to be called once for each "stage" in each master. It accounts for 23% of cycles.
* initRefValues() accounts for 16% of cycles, perhaps there's still something that can be done to it.
* There's probably some overhead in message packing/unpacking to take care of.

Overall, the master spends about as much time inside itself as inside ZMQ. Memory allocations seem to account for most of the time not accounted for by queueX() or initRefValues().

As for the server, there's not much else to do since 36% of time is spent in FMI and 55% is spent inside ZMQ. The remaining 9% can be considered acceptable overhead.
-
Tomas Härdin authored
-
Tomas Härdin authored
1% callgrind improvement
-
Tomas Härdin authored
No change in walltime, but callgrind says 3% fewer cycles.
-
Tomas Härdin authored
This reduces cachegrind cycles a bit, but has no impact on walltime.
-
Tomas Härdin authored
31% Speedup in Release, woo!
-
Tomas Härdin authored
4% Release speedup, and nicer code.
-
Tomas Härdin authored
7% Release speedup.
-
- 11 Sep, 2018 9 commits
-
-
Tomas Härdin authored
12% Release speedup. We're currently at 12 kHz on granular, compared to 5.5 kHz a day ago :]
-
Tomas Härdin authored
This gives a 25% speedup in Release mode!
-
Tomas Härdin authored
Speedup is 2% in Release, 5% in Debug
-
Tomas Härdin authored
Sometimes it's good, sometimes not. Since it modifies the master's behavior, leaving it off seems best.
-
Tomas Härdin authored
~1% speedup in Release, 3-5% in Debug
-
Tomas Härdin authored
Speedup is a modest 3% in Release mode
-
Tomas Härdin authored
-
Tomas Härdin authored
-
Tomas Härdin authored
-
- 10 Sep, 2018 6 commits
-
-
Tomas Härdin authored
It seems granular tops out around 68 kHz with the current architecture, with fmigo-mpi being roughly an order of magnitude slower.
-
Tomas Härdin authored
-
Tomas Härdin authored
walltime down another 20%
-
Tomas Härdin authored
This gives an MPI speedup of 27% in Debug mode with current perftest2.sh.
-
Tomas Härdin authored
-
Tomas Härdin authored
perftest2.sh results:

before:
real 0m55,910s
user 1m28,044s
sys 1m17,734s

after:
real 0m46,482s
user 1m2,104s
sys 1m12,223s

callgrind results were more impressive, but it can't ever fully represent reality.
-
- 04 Sep, 2018 1 commit
-
-
Tomas Härdin authored
-
- 29 Aug, 2018 1 commit
-
-
Tomas Härdin authored
Provide examples of TCP/IP command-line use while we're at it. This fixes #2.
-
- 28 Aug, 2018 2 commits
-
-
Tomas Härdin authored
This fixes #1.
-
Tomas Härdin authored
-