path: root/PKGBUILD
Age  Commit message  Author
2024-06-05  upgpkg: blis-git 1.0.r19.g5cbec6503-1  Chocobo1
2024-01-09  upgpkg: blis-git 0.9.0.r134.ga72e4569f-1  Chocobo1
2022-04-02  add option !lto  haawda
2021-08-26  add conficts/provides for blas  haawda
2021-03-07  remove blis_profile.sh file  haawda
2020-08-07  fix license directory and (make-)dependencies  haawda
2019-10-07  add epoch  haawda
2019-10-06  remove patch, simplify package function  haawda
2019-04-10  minimal change in pkgver function  haawda
2019-03-20  better pkgver function  haawda
2018-11-20  small consistency fix  haawda
2018-10-20  no longer depends on glibc  haawda
2018-07-02  copy filey manually to the package  haawda
2018-07-02  separate profile script  haawda
2018-04-12  Upstream fixed the Issue, download latest commits again  haawda
2018-04-08  stick to a specific commit for now  haawda
2018-01-05  upstream changed build system, patch needed  haawda
2017-12-18  update  haawda
2017-12-18  prepare flatten-headers.py for use with python2  haawda
2017-12-14  Updated version (0.2.2.119.g784289d6 -> 0.2.2.123.ga32e8a47).  AUR Update Bot
Changelog ========= Added an exclusion to .travis.yml. (a32e8a47) Cleaned up after previous travis oot debugging. (b9f7d987) Attempted fix to travis oot build failure. (9091a207) Added debugging output to Makefile. (c01c71c3) Updated SHELL in common.mk from /bin/bash to bash. (784289d6) Defined SHELL in common.mk so "echo -n" works. (d9bb1d1d) Attempt 3 on .travis.yml. (9289a086) More fixes to .travis.yml. (720bfcf0) Added 'pwd' commands to .travis.yml for debugging. (8717c9c9) Added temp_dir argument to flatten-headers.sh. (6526d1d4) Merge branch 'master' of github.com:flame/blis (94755017) Added out-of-tree build test to .travis.yml file. (d0c4dd00) Ignore blis.h.interm [ci skip] (5cf7b0c4) Further attempt to fix out-of-tree builds. (8d8ff74d) Fixed off-by-one indexing in bli_cpuid.c. (70a64432) Fixed broken out-of-tree builds since 52f9e6f. (87978f62) Various typecasting fixes, mis-typed enums, etc. (513ef4d0) Removed most "old" directories. (b1508703) Modified bli_getopt() for thread-safety. (270c6598) Merge branch 'master' of github.com:flame/blis (ce4d8fab) Replaced several macros with static function APIs. (39be59f2) Merge branch 'rt' (e05a8dfa) Adding SKX kernels and configuration. (4423e33d) Various checks to ensure that arch_t id is in range. (79507337) Added 'uninstall-old-headers' target to Makefile. (fde7c112) Create/install monolithic cblas.h. (d4ee770b) Merge branch 'rt' (52f9e6f1) Fixed cntx_t packm query when ker_id > _NUM_PACKM_KERS. (21360dd8) Fixed POSIX sed non-compliance in flatten-header.sh. (244a6f4e) Generate/compile with/install monolithic blis.h. (45078621) Added missing framework support for x86_64 family. (1f30b130) Fixed a bug in e31f0b3/b131b9a. (9f39806c) Updated configs to omit setting some blocksizes. (b131b9a0) Merge branch 'rt' of github.com:flame/blis into rt (499a4c00) Subtle update to bli_blksz_init*() API. (e31f0b3e) Added 'x86_64' sub-config directory. (6c3ba502) Added a dummy file to kernels/generic. 
(25eee3cc) More tweaks to monolithify-header.sh (ef024ce4) Second attempt to implement travis_wait. (5028e7de) Added travis_wait prefix to testsuite via Travis. (13e5d910) Removed pnacl, emscripten support from Makefile. (a1caeba0) Improvements, bugfixes to monolithify-header.sh. (9df6dda9) Merge branch 'rt' of github.com:flame/blis into rt (21d26201) Removed unnecessary flags for generic config. (43baa3b3) [WIP] Add x86 and x86_64 processor families. (#154) (b7ca5806) Added bash script for creating monolithic headers. (870597d1) Removed unnecessary #include "blis.h" from header. (c76f77f4) Miscellaneous tweaks to gks, rt functionality. (2bb9bc6e) Miscellaneous tweaks and fixes. (d5bf79e5) Merge branch 'rt' of github.com:flame/blis into rt (673e5184) Implemented runtime hardware detection via cpuid. (2c51356a) Revert to default SIMD alignment for bulldozer. (ab57b979) Revert to default SIMD alignment for bulldozer. (8f150f28) Use perl for some substitution for OS X compatibility. (e3f10557) Merge branch 'master' into rt (dd45cfdf) Fix CVECFLAGS for bulldozer config. (f60c827b) Typecast l1mkr_t enum value prior to comparison. (3e4f42a4) Removed associative arrays from configure. (aec6e038) Added "generic" configuration. (07c35218) Minor update to .travis.yml file. (c1a98d6f) Minor header renaming ahead of bli_arch.c. (75b9383f) Fixed 'make test' target from top-level Makefile. (482af51a) Makefile updates for test drivers, testsuite. (3c269f70) Minor updates to .travis.yml, configure script. (0557189d) Merge branch 'master' into rt (2553734d) Removed a duplicate bli_avx512_macros.h header. (37534279) Implemented runtime kernel management. (453deb29) Merge branch 'master' into rt (b882648a) Fixed a pthread typo in previous commit. (e02d3cb8) Fixed bugs in gemm/gemmtrsm ukr tests in testsuite. (f5962a1a) Updated bibtex info for BLIS5 (3m4m) article. (8e917b25) Merge pull request #150 from devinamatthews/vzeroupper (adafe974) Add vzeroupper to Intel AVX kernels. 
(7dc78b49) Removed trailing enum commas from bli_type_defs.h. (f86ce54d) Added edge handling to _determine_blocksize_b(). (60a1eeb2) Fixed a minor bug in level-3 packm management. (b01c8082) Merge branch 'master' into rt (8b379069) Merge pull request #146 from devinamatthews/master (05925dd5) Change lsame_ signature to match lapacke. (cecdc05d) Fixed pthreads compile bug with previous commit. (803bbef0) Moved 'family' field from cntx_t to cntl_t. (c63980f4) Merge pull request #139 from Maratyszcza/emscripten (07837395) Merge branch 'master' into emscripten (ad8610b4) Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85) Clang can't make up it's mind what to support. (733faf84) Add default #define for __has_extension. (7425d074) Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb) Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a) Updated ar option list used by all configurations. (1f1ec0db) Added --force-version=STRING option to configure. (5caaba2d) Updated openmp/pthread barriers with GNU atomics. (13175c5f) Added API to set mt environment variables. (0e58ba1b) Fix Emscripten builds (8772a0b3) Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b) set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5) Update LICENSE (70cc825b) Add new SSI acknowledgment (cf54c77b) PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a) Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af) Change PACKDIM_MR (double) for haswell to 8. (681eec91) Restored deleted lines from makefile fragments. (6e04f9df) Change to /bin/sh. (ec5c0c04) Remove shebangs from makefiles. (555ddc30) Merge pull request #128 from iotamudelta/master (f26bd7f4) Fix if/else structure. Thanks to TravisCI. (169fb05f) Restore version. (0579dfea) Mark piledriver compilable w/ clang. (a75b05c2) Mark bulldozer compilable w/ clang. 
(7541d46e) Correct error message. (91f89707) Indeed once can compile for carrizo also using clang. (f5131e1e) A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943) Housekeeping, induced method file/function renames. (1f3a5819) Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a) Fixed a bug in norm1v, norm1m. (cf39d3ef) Merge pull request #121 from jeffhammond/not-real-knl (79948512) Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12) Merge branch 'master' of github.com:flame/blis (773a24ef) Disable complex 3m/4m in testsuite by default. (dd58c954) allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f) Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259) Fixed stray parentheses in README citations. (43007f7b) CHANGELOG update (0.2.2) (a4f1d0b8) never use libm with Intel compilers (c2c91e09)
2017-12-13  update  haawda
2017-12-11  Updated version (0.2.2.106.gb1508703 -> 0.2.2.109.g70a64432).  AUR Update Bot
Changelog ========= Fixed off-by-one indexing in bli_cpuid.c. (70a64432) Fixed broken out-of-tree builds since 52f9e6f. (87978f62) Various typecasting fixes, mis-typed enums, etc. (513ef4d0) Removed most "old" directories. (b1508703) Modified bli_getopt() for thread-safety. (270c6598) Merge branch 'master' of github.com:flame/blis (ce4d8fab) Replaced several macros with static function APIs. (39be59f2) Merge branch 'rt' (e05a8dfa) Adding SKX kernels and configuration. (4423e33d) Various checks to ensure that arch_t id is in range. (79507337) Added 'uninstall-old-headers' target to Makefile. (fde7c112) Create/install monolithic cblas.h. (d4ee770b) Merge branch 'rt' (52f9e6f1) Fixed cntx_t packm query when ker_id > _NUM_PACKM_KERS. (21360dd8) Fixed POSIX sed non-compliance in flatten-header.sh. (244a6f4e) Generate/compile with/install monolithic blis.h. (45078621) Added missing framework support for x86_64 family. (1f30b130) Fixed a bug in e31f0b3/b131b9a. (9f39806c) Updated configs to omit setting some blocksizes. (b131b9a0) Merge branch 'rt' of github.com:flame/blis into rt (499a4c00) Subtle update to bli_blksz_init*() API. (e31f0b3e) Added 'x86_64' sub-config directory. (6c3ba502) Added a dummy file to kernels/generic. (25eee3cc) More tweaks to monolithify-header.sh (ef024ce4) Second attempt to implement travis_wait. (5028e7de) Added travis_wait prefix to testsuite via Travis. (13e5d910) Removed pnacl, emscripten support from Makefile. (a1caeba0) Improvements, bugfixes to monolithify-header.sh. (9df6dda9) Merge branch 'rt' of github.com:flame/blis into rt (21d26201) Removed unnecessary flags for generic config. (43baa3b3) [WIP] Add x86 and x86_64 processor families. (#154) (b7ca5806) Added bash script for creating monolithic headers. (870597d1) Removed unnecessary #include "blis.h" from header. (c76f77f4) Miscellaneous tweaks to gks, rt functionality. (2bb9bc6e) Miscellaneous tweaks and fixes. 
(d5bf79e5) Merge branch 'rt' of github.com:flame/blis into rt (673e5184) Implemented runtime hardware detection via cpuid. (2c51356a) Revert to default SIMD alignment for bulldozer. (ab57b979) Revert to default SIMD alignment for bulldozer. (8f150f28) Use perl for some substitution for OS X compatibility. (e3f10557) Merge branch 'master' into rt (dd45cfdf) Fix CVECFLAGS for bulldozer config. (f60c827b) Typecast l1mkr_t enum value prior to comparison. (3e4f42a4) Removed associative arrays from configure. (aec6e038) Added "generic" configuration. (07c35218) Minor update to .travis.yml file. (c1a98d6f) Minor header renaming ahead of bli_arch.c. (75b9383f) Fixed 'make test' target from top-level Makefile. (482af51a) Makefile updates for test drivers, testsuite. (3c269f70) Minor updates to .travis.yml, configure script. (0557189d) Merge branch 'master' into rt (2553734d) Removed a duplicate bli_avx512_macros.h header. (37534279) Implemented runtime kernel management. (453deb29) Merge branch 'master' into rt (b882648a) Fixed a pthread typo in previous commit. (e02d3cb8) Fixed bugs in gemm/gemmtrsm ukr tests in testsuite. (f5962a1a) Updated bibtex info for BLIS5 (3m4m) article. (8e917b25) Merge pull request #150 from devinamatthews/vzeroupper (adafe974) Add vzeroupper to Intel AVX kernels. (7dc78b49) Removed trailing enum commas from bli_type_defs.h. (f86ce54d) Added edge handling to _determine_blocksize_b(). (60a1eeb2) Fixed a minor bug in level-3 packm management. (b01c8082) Merge branch 'master' into rt (8b379069) Merge pull request #146 from devinamatthews/master (05925dd5) Change lsame_ signature to match lapacke. (cecdc05d) Fixed pthreads compile bug with previous commit. (803bbef0) Moved 'family' field from cntx_t to cntl_t. (c63980f4) Merge pull request #139 from Maratyszcza/emscripten (07837395) Merge branch 'master' into emscripten (ad8610b4) Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85) Clang can't make up it's mind what to support. 
(733faf84) Add default #define for __has_extension. (7425d074) Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb) Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a) Updated ar option list used by all configurations. (1f1ec0db) Added --force-version=STRING option to configure. (5caaba2d) Updated openmp/pthread barriers with GNU atomics. (13175c5f) Added API to set mt environment variables. (0e58ba1b) Fix Emscripten builds (8772a0b3) Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b) set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5) Update LICENSE (70cc825b) Add new SSI acknowledgment (cf54c77b) PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a) Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af) Change PACKDIM_MR (double) for haswell to 8. (681eec91) Restored deleted lines from makefile fragments. (6e04f9df) Change to /bin/sh. (ec5c0c04) Remove shebangs from makefiles. (555ddc30) Merge pull request #128 from iotamudelta/master (f26bd7f4) Fix if/else structure. Thanks to TravisCI. (169fb05f) Restore version. (0579dfea) Mark piledriver compilable w/ clang. (a75b05c2) Mark bulldozer compilable w/ clang. (7541d46e) Correct error message. (91f89707) Indeed once can compile for carrizo also using clang. (f5131e1e) A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943) Housekeeping, induced method file/function renames. (1f3a5819) Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a) Fixed a bug in norm1v, norm1m. (cf39d3ef) Merge pull request #121 from jeffhammond/not-real-knl (79948512) Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12) Merge branch 'master' of github.com:flame/blis (773a24ef) Disable complex 3m/4m in testsuite by default. 
(dd58c954) allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f) Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259) Fixed stray parentheses in README citations. (43007f7b) CHANGELOG update (0.2.2) (a4f1d0b8) never use libm with Intel compilers (c2c91e09)
2017-12-09  Updated version (0.2.2.104.gce4d8fab -> 0.2.2.106.gb1508703).  AUR Update Bot
Changelog ========= Removed most "old" directories. (b1508703) Modified bli_getopt() for thread-safety. (270c6598) Merge branch 'master' of github.com:flame/blis (ce4d8fab) Replaced several macros with static function APIs. (39be59f2) Merge branch 'rt' (e05a8dfa) Adding SKX kernels and configuration. (4423e33d) Various checks to ensure that arch_t id is in range. (79507337) Added 'uninstall-old-headers' target to Makefile. (fde7c112) Create/install monolithic cblas.h. (d4ee770b) Merge branch 'rt' (52f9e6f1) Fixed cntx_t packm query when ker_id > _NUM_PACKM_KERS. (21360dd8) Fixed POSIX sed non-compliance in flatten-header.sh. (244a6f4e) Generate/compile with/install monolithic blis.h. (45078621) Added missing framework support for x86_64 family. (1f30b130) Fixed a bug in e31f0b3/b131b9a. (9f39806c) Updated configs to omit setting some blocksizes. (b131b9a0) Merge branch 'rt' of github.com:flame/blis into rt (499a4c00) Subtle update to bli_blksz_init*() API. (e31f0b3e) Added 'x86_64' sub-config directory. (6c3ba502) Added a dummy file to kernels/generic. (25eee3cc) More tweaks to monolithify-header.sh (ef024ce4) Second attempt to implement travis_wait. (5028e7de) Added travis_wait prefix to testsuite via Travis. (13e5d910) Removed pnacl, emscripten support from Makefile. (a1caeba0) Improvements, bugfixes to monolithify-header.sh. (9df6dda9) Merge branch 'rt' of github.com:flame/blis into rt (21d26201) Removed unnecessary flags for generic config. (43baa3b3) [WIP] Add x86 and x86_64 processor families. (#154) (b7ca5806) Added bash script for creating monolithic headers. (870597d1) Removed unnecessary #include "blis.h" from header. (c76f77f4) Miscellaneous tweaks to gks, rt functionality. (2bb9bc6e) Miscellaneous tweaks and fixes. (d5bf79e5) Merge branch 'rt' of github.com:flame/blis into rt (673e5184) Implemented runtime hardware detection via cpuid. (2c51356a) Revert to default SIMD alignment for bulldozer. (ab57b979) Revert to default SIMD alignment for bulldozer. 
(8f150f28) Use perl for some substitution for OS X compatibility. (e3f10557) Merge branch 'master' into rt (dd45cfdf) Fix CVECFLAGS for bulldozer config. (f60c827b) Typecast l1mkr_t enum value prior to comparison. (3e4f42a4) Removed associative arrays from configure. (aec6e038) Added "generic" configuration. (07c35218) Minor update to .travis.yml file. (c1a98d6f) Minor header renaming ahead of bli_arch.c. (75b9383f) Fixed 'make test' target from top-level Makefile. (482af51a) Makefile updates for test drivers, testsuite. (3c269f70) Minor updates to .travis.yml, configure script. (0557189d) Merge branch 'master' into rt (2553734d) Removed a duplicate bli_avx512_macros.h header. (37534279) Implemented runtime kernel management. (453deb29) Merge branch 'master' into rt (b882648a) Fixed a pthread typo in previous commit. (e02d3cb8) Fixed bugs in gemm/gemmtrsm ukr tests in testsuite. (f5962a1a) Updated bibtex info for BLIS5 (3m4m) article. (8e917b25) Merge pull request #150 from devinamatthews/vzeroupper (adafe974) Add vzeroupper to Intel AVX kernels. (7dc78b49) Removed trailing enum commas from bli_type_defs.h. (f86ce54d) Added edge handling to _determine_blocksize_b(). (60a1eeb2) Fixed a minor bug in level-3 packm management. (b01c8082) Merge branch 'master' into rt (8b379069) Merge pull request #146 from devinamatthews/master (05925dd5) Change lsame_ signature to match lapacke. (cecdc05d) Fixed pthreads compile bug with previous commit. (803bbef0) Moved 'family' field from cntx_t to cntl_t. (c63980f4) Merge pull request #139 from Maratyszcza/emscripten (07837395) Merge branch 'master' into emscripten (ad8610b4) Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85) Clang can't make up it's mind what to support. (733faf84) Add default #define for __has_extension. (7425d074) Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb) Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. 
(8823f91a) Updated ar option list used by all configurations. (1f1ec0db) Added --force-version=STRING option to configure. (5caaba2d) Updated openmp/pthread barriers with GNU atomics. (13175c5f) Added API to set mt environment variables. (0e58ba1b) Fix Emscripten builds (8772a0b3) Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b) set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5) Update LICENSE (70cc825b) Add new SSI acknowledgment (cf54c77b) PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a) Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af) Change PACKDIM_MR (double) for haswell to 8. (681eec91) Restored deleted lines from makefile fragments. (6e04f9df) Change to /bin/sh. (ec5c0c04) Remove shebangs from makefiles. (555ddc30) Merge pull request #128 from iotamudelta/master (f26bd7f4) Fix if/else structure. Thanks to TravisCI. (169fb05f) Restore version. (0579dfea) Mark piledriver compilable w/ clang. (a75b05c2) Mark bulldozer compilable w/ clang. (7541d46e) Correct error message. (91f89707) Indeed once can compile for carrizo also using clang. (f5131e1e) A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943) Housekeeping, induced method file/function renames. (1f3a5819) Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a) Fixed a bug in norm1v, norm1m. (cf39d3ef) Merge pull request #121 from jeffhammond/not-real-knl (79948512) Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12) Merge branch 'master' of github.com:flame/blis (773a24ef) Disable complex 3m/4m in testsuite by default. (dd58c954) allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f) Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259) Fixed stray parentheses in README citations. 
(43007f7b) CHANGELOG update (0.2.2) (a4f1d0b8) never use libm with Intel compilers (c2c91e09)
2017-12-08  Updated version (0.2.2.102.ge05a8dfa -> 0.2.2.104.gce4d8fab).  AUR Update Bot
Changelog ========= Merge branch 'master' of github.com:flame/blis (ce4d8fab) Replaced several macros with static function APIs. (39be59f2) Merge branch 'rt' (e05a8dfa) Adding SKX kernels and configuration. (4423e33d) Various checks to ensure that arch_t id is in range. (79507337) Added 'uninstall-old-headers' target to Makefile. (fde7c112) Create/install monolithic cblas.h. (d4ee770b) Merge branch 'rt' (52f9e6f1) Fixed cntx_t packm query when ker_id > _NUM_PACKM_KERS. (21360dd8) Fixed POSIX sed non-compliance in flatten-header.sh. (244a6f4e) Generate/compile with/install monolithic blis.h. (45078621) Added missing framework support for x86_64 family. (1f30b130) Fixed a bug in e31f0b3/b131b9a. (9f39806c) Updated configs to omit setting some blocksizes. (b131b9a0) Merge branch 'rt' of github.com:flame/blis into rt (499a4c00) Subtle update to bli_blksz_init*() API. (e31f0b3e) Added 'x86_64' sub-config directory. (6c3ba502) Added a dummy file to kernels/generic. (25eee3cc) More tweaks to monolithify-header.sh (ef024ce4) Second attempt to implement travis_wait. (5028e7de) Added travis_wait prefix to testsuite via Travis. (13e5d910) Removed pnacl, emscripten support from Makefile. (a1caeba0) Improvements, bugfixes to monolithify-header.sh. (9df6dda9) Merge branch 'rt' of github.com:flame/blis into rt (21d26201) Removed unnecessary flags for generic config. (43baa3b3) [WIP] Add x86 and x86_64 processor families. (#154) (b7ca5806) Added bash script for creating monolithic headers. (870597d1) Removed unnecessary #include "blis.h" from header. (c76f77f4) Miscellaneous tweaks to gks, rt functionality. (2bb9bc6e) Miscellaneous tweaks and fixes. (d5bf79e5) Merge branch 'rt' of github.com:flame/blis into rt (673e5184) Implemented runtime hardware detection via cpuid. (2c51356a) Revert to default SIMD alignment for bulldozer. (ab57b979) Revert to default SIMD alignment for bulldozer. (8f150f28) Use perl for some substitution for OS X compatibility. 
(e3f10557) Merge branch 'master' into rt (dd45cfdf) Fix CVECFLAGS for bulldozer config. (f60c827b) Typecast l1mkr_t enum value prior to comparison. (3e4f42a4) Removed associative arrays from configure. (aec6e038) Added "generic" configuration. (07c35218) Minor update to .travis.yml file. (c1a98d6f) Minor header renaming ahead of bli_arch.c. (75b9383f) Fixed 'make test' target from top-level Makefile. (482af51a) Makefile updates for test drivers, testsuite. (3c269f70) Minor updates to .travis.yml, configure script. (0557189d) Merge branch 'master' into rt (2553734d) Removed a duplicate bli_avx512_macros.h header. (37534279) Implemented runtime kernel management. (453deb29) Merge branch 'master' into rt (b882648a) Fixed a pthread typo in previous commit. (e02d3cb8) Fixed bugs in gemm/gemmtrsm ukr tests in testsuite. (f5962a1a) Updated bibtex info for BLIS5 (3m4m) article. (8e917b25) Merge pull request #150 from devinamatthews/vzeroupper (adafe974) Add vzeroupper to Intel AVX kernels. (7dc78b49) Removed trailing enum commas from bli_type_defs.h. (f86ce54d) Added edge handling to _determine_blocksize_b(). (60a1eeb2) Fixed a minor bug in level-3 packm management. (b01c8082) Merge branch 'master' into rt (8b379069) Merge pull request #146 from devinamatthews/master (05925dd5) Change lsame_ signature to match lapacke. (cecdc05d) Fixed pthreads compile bug with previous commit. (803bbef0) Moved 'family' field from cntx_t to cntl_t. (c63980f4) Merge pull request #139 from Maratyszcza/emscripten (07837395) Merge branch 'master' into emscripten (ad8610b4) Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85) Clang can't make up it's mind what to support. (733faf84) Add default #define for __has_extension. (7425d074) Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb) Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a) Updated ar option list used by all configurations. 
(1f1ec0db) Added --force-version=STRING option to configure. (5caaba2d) Updated openmp/pthread barriers with GNU atomics. (13175c5f) Added API to set mt environment variables. (0e58ba1b) Fix Emscripten builds (8772a0b3) Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b) set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5) Update LICENSE (70cc825b) Add new SSI acknowledgment (cf54c77b) PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a) Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af) Change PACKDIM_MR (double) for haswell to 8. (681eec91) Restored deleted lines from makefile fragments. (6e04f9df) Change to /bin/sh. (ec5c0c04) Remove shebangs from makefiles. (555ddc30) Merge pull request #128 from iotamudelta/master (f26bd7f4) Fix if/else structure. Thanks to TravisCI. (169fb05f) Restore version. (0579dfea) Mark piledriver compilable w/ clang. (a75b05c2) Mark bulldozer compilable w/ clang. (7541d46e) Correct error message. (91f89707) Indeed once can compile for carrizo also using clang. (f5131e1e) A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943) Housekeeping, induced method file/function renames. (1f3a5819) Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a) Fixed a bug in norm1v, norm1m. (cf39d3ef) Merge pull request #121 from jeffhammond/not-real-knl (79948512) Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12) Merge branch 'master' of github.com:flame/blis (773a24ef) Disable complex 3m/4m in testsuite by default. (dd58c954) allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f) Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259) Fixed stray parentheses in README citations. (43007f7b) CHANGELOG update (0.2.2) (a4f1d0b8) never use libm with Intel compilers (c2c91e09)
2017-12-07  Updated version (0.2.2.99.gfde7c112 -> 0.2.2.102.ge05a8dfa).  AUR Update Bot
Changelog ========= Merge branch 'rt' (e05a8dfa) Adding SKX kernels and configuration. (4423e33d) Various checks to ensure that arch_t id is in range. (79507337) Added 'uninstall-old-headers' target to Makefile. (fde7c112) Create/install monolithic cblas.h. (d4ee770b) Merge branch 'rt' (52f9e6f1) Fixed cntx_t packm query when ker_id > _NUM_PACKM_KERS. (21360dd8) Fixed POSIX sed non-compliance in flatten-header.sh. (244a6f4e) Generate/compile with/install monolithic blis.h. (45078621) Added missing framework support for x86_64 family. (1f30b130) Fixed a bug in e31f0b3/b131b9a. (9f39806c) Updated configs to omit setting some blocksizes. (b131b9a0) Merge branch 'rt' of github.com:flame/blis into rt (499a4c00) Subtle update to bli_blksz_init*() API. (e31f0b3e) Added 'x86_64' sub-config directory. (6c3ba502) Added a dummy file to kernels/generic. (25eee3cc) More tweaks to monolithify-header.sh (ef024ce4) Second attempt to implement travis_wait. (5028e7de) Added travis_wait prefix to testsuite via Travis. (13e5d910) Removed pnacl, emscripten support from Makefile. (a1caeba0) Improvements, bugfixes to monolithify-header.sh. (9df6dda9) Merge branch 'rt' of github.com:flame/blis into rt (21d26201) Removed unnecessary flags for generic config. (43baa3b3) [WIP] Add x86 and x86_64 processor families. (#154) (b7ca5806) Added bash script for creating monolithic headers. (870597d1) Removed unnecessary #include "blis.h" from header. (c76f77f4) Miscellaneous tweaks to gks, rt functionality. (2bb9bc6e) Miscellaneous tweaks and fixes. (d5bf79e5) Merge branch 'rt' of github.com:flame/blis into rt (673e5184) Implemented runtime hardware detection via cpuid. (2c51356a) Revert to default SIMD alignment for bulldozer. (ab57b979) Revert to default SIMD alignment for bulldozer. (8f150f28) Use perl for some substitution for OS X compatibility. (e3f10557) Merge branch 'master' into rt (dd45cfdf) Fix CVECFLAGS for bulldozer config. (f60c827b) Typecast l1mkr_t enum value prior to comparison. 
(3e4f42a4) Removed associative arrays from configure. (aec6e038) Added "generic" configuration. (07c35218) Minor update to .travis.yml file. (c1a98d6f) Minor header renaming ahead of bli_arch.c. (75b9383f) Fixed 'make test' target from top-level Makefile. (482af51a) Makefile updates for test drivers, testsuite. (3c269f70) Minor updates to .travis.yml, configure script. (0557189d) Merge branch 'master' into rt (2553734d) Removed a duplicate bli_avx512_macros.h header. (37534279) Implemented runtime kernel management. (453deb29) Merge branch 'master' into rt (b882648a) Fixed a pthread typo in previous commit. (e02d3cb8) Fixed bugs in gemm/gemmtrsm ukr tests in testsuite. (f5962a1a) Updated bibtex info for BLIS5 (3m4m) article. (8e917b25) Merge pull request #150 from devinamatthews/vzeroupper (adafe974) Add vzeroupper to Intel AVX kernels. (7dc78b49) Removed trailing enum commas from bli_type_defs.h. (f86ce54d) Added edge handling to _determine_blocksize_b(). (60a1eeb2) Fixed a minor bug in level-3 packm management. (b01c8082) Merge branch 'master' into rt (8b379069) Merge pull request #146 from devinamatthews/master (05925dd5) Change lsame_ signature to match lapacke. (cecdc05d) Fixed pthreads compile bug with previous commit. (803bbef0) Moved 'family' field from cntx_t to cntl_t. (c63980f4) Merge pull request #139 from Maratyszcza/emscripten (07837395) Merge branch 'master' into emscripten (ad8610b4) Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85) Clang can't make up it's mind what to support. (733faf84) Add default #define for __has_extension. (7425d074) Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb) Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a) Updated ar option list used by all configurations. (1f1ec0db) Added --force-version=STRING option to configure. (5caaba2d) Updated openmp/pthread barriers with GNU atomics. 
(13175c5f) Added API to set mt environment variables. (0e58ba1b) Fix Emscripten builds (8772a0b3) Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b) set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5) Update LICENSE (70cc825b) Add new SSI acknowledgment (cf54c77b) PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a) Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af) Change PACKDIM_MR (double) for haswell to 8. (681eec91) Restored deleted lines from makefile fragments. (6e04f9df) Change to /bin/sh. (ec5c0c04) Remove shebangs from makefiles. (555ddc30) Merge pull request #128 from iotamudelta/master (f26bd7f4) Fix if/else structure. Thanks to TravisCI. (169fb05f) Restore version. (0579dfea) Mark piledriver compilable w/ clang. (a75b05c2) Mark bulldozer compilable w/ clang. (7541d46e) Correct error message. (91f89707) Indeed once can compile for carrizo also using clang. (f5131e1e) A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943) Housekeeping, induced method file/function renames. (1f3a5819) Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a) Fixed a bug in norm1v, norm1m. (cf39d3ef) Merge pull request #121 from jeffhammond/not-real-knl (79948512) Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12) Merge branch 'master' of github.com:flame/blis (773a24ef) Disable complex 3m/4m in testsuite by default. (dd58c954) allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f) Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259) Fixed stray parentheses in README citations. (43007f7b) CHANGELOG update (0.2.2) (a4f1d0b8) never use libm with Intel compilers (c2c91e09)
2017-12-05  Updated version (0.2.2.97.g52f9e6f1 -> 0.2.2.99.gfde7c112).  AUR Update Bot
Changelog ========= Added 'uninstall-old-headers' target to Makefile. (fde7c112) Create/install monolithic cblas.h. (d4ee770b) Merge branch 'rt' (52f9e6f1) Fixed cntx_t packm query when ker_id > _NUM_PACKM_KERS. (21360dd8) Fixed POSIX sed non-compliance in flatten-header.sh. (244a6f4e) Generate/compile with/install monolithic blis.h. (45078621) Added missing framework support for x86_64 family. (1f30b130) Fixed a bug in e31f0b3/b131b9a. (9f39806c) Updated configs to omit setting some blocksizes. (b131b9a0) Merge branch 'rt' of github.com:flame/blis into rt (499a4c00) Subtle update to bli_blksz_init*() API. (e31f0b3e) Added 'x86_64' sub-config directory. (6c3ba502) Added a dummy file to kernels/generic. (25eee3cc) More tweaks to monolithify-header.sh (ef024ce4) Second attempt to implement travis_wait. (5028e7de) Added travis_wait prefix to testsuite via Travis. (13e5d910) Removed pnacl, emscripten support from Makefile. (a1caeba0) Improvements, bugfixes to monolithify-header.sh. (9df6dda9) Merge branch 'rt' of github.com:flame/blis into rt (21d26201) Removed unnecessary flags for generic config. (43baa3b3) [WIP] Add x86 and x86_64 processor families. (#154) (b7ca5806) Added bash script for creating monolithic headers. (870597d1) Removed unnecessary #include "blis.h" from header. (c76f77f4) Miscellaneous tweaks to gks, rt functionality. (2bb9bc6e) Miscellaneous tweaks and fixes. (d5bf79e5) Merge branch 'rt' of github.com:flame/blis into rt (673e5184) Implemented runtime hardware detection via cpuid. (2c51356a) Revert to default SIMD alignment for bulldozer. (ab57b979) Revert to default SIMD alignment for bulldozer. (8f150f28) Use perl for some substitution for OS X compatibility. (e3f10557) Merge branch 'master' into rt (dd45cfdf) Fix CVECFLAGS for bulldozer config. (f60c827b) Typecast l1mkr_t enum value prior to comparison. (3e4f42a4) Removed associative arrays from configure. (aec6e038) Added "generic" configuration. (07c35218) Minor update to .travis.yml file. 
(c1a98d6f) Minor header renaming ahead of bli_arch.c. (75b9383f) Fixed 'make test' target from top-level Makefile. (482af51a) Makefile updates for test drivers, testsuite. (3c269f70) Minor updates to .travis.yml, configure script. (0557189d) Merge branch 'master' into rt (2553734d) Removed a duplicate bli_avx512_macros.h header. (37534279) Implemented runtime kernel management. (453deb29) Merge branch 'master' into rt (b882648a) Fixed a pthread typo in previous commit. (e02d3cb8) Fixed bugs in gemm/gemmtrsm ukr tests in testsuite. (f5962a1a) Updated bibtex info for BLIS5 (3m4m) article. (8e917b25) Merge pull request #150 from devinamatthews/vzeroupper (adafe974) Add vzeroupper to Intel AVX kernels. (7dc78b49) Removed trailing enum commas from bli_type_defs.h. (f86ce54d) Added edge handling to _determine_blocksize_b(). (60a1eeb2) Fixed a minor bug in level-3 packm management. (b01c8082) Merge branch 'master' into rt (8b379069) Merge pull request #146 from devinamatthews/master (05925dd5) Change lsame_ signature to match lapacke. (cecdc05d) Fixed pthreads compile bug with previous commit. (803bbef0) Moved 'family' field from cntx_t to cntl_t. (c63980f4) Merge pull request #139 from Maratyszcza/emscripten (07837395) Merge branch 'master' into emscripten (ad8610b4) Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85) Clang can't make up it's mind what to support. (733faf84) Add default #define for __has_extension. (7425d074) Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb) Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a) Updated ar option list used by all configurations. (1f1ec0db) Added --force-version=STRING option to configure. (5caaba2d) Updated openmp/pthread barriers with GNU atomics. (13175c5f) Added API to set mt environment variables. 
(0e58ba1b) Fix Emscripten builds (8772a0b3) Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b) set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5) Update LICENSE (70cc825b) Add new SSI acknowledgment (cf54c77b) PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a) Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af) Change PACKDIM_MR (double) for haswell to 8. (681eec91) Restored deleted lines from makefile fragments. (6e04f9df) Change to /bin/sh. (ec5c0c04) Remove shebangs from makefiles. (555ddc30) Merge pull request #128 from iotamudelta/master (f26bd7f4) Fix if/else structure. Thanks to TravisCI. (169fb05f) Restore version. (0579dfea) Mark piledriver compilable w/ clang. (a75b05c2) Mark bulldozer compilable w/ clang. (7541d46e) Correct error message. (91f89707) Indeed once can compile for carrizo also using clang. (f5131e1e) A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943) Housekeeping, induced method file/function renames. (1f3a5819) Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a) Fixed a bug in norm1v, norm1m. (cf39d3ef) Merge pull request #121 from jeffhammond/not-real-knl (79948512) Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12) Merge branch 'master' of github.com:flame/blis (773a24ef) Disable complex 3m/4m in testsuite by default. (dd58c954) allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f) Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259) Fixed stray parentheses in README citations. (43007f7b) CHANGELOG update (0.2.2) (a4f1d0b8) never use libm with Intel compilers (c2c91e09)
2017-12-02  Updated version (0.2.2.56.gab57b979 -> 0.2.2.97.g52f9e6f1).  AUR Update Bot
Changelog ========= Merge branch 'rt' (52f9e6f1) Fixed cntx_t packm query when ker_id > _NUM_PACKM_KERS. (21360dd8) Fixed POSIX sed non-compliance in flatten-header.sh. (244a6f4e) Generate/compile with/install monolithic blis.h. (45078621) Added missing framework support for x86_64 family. (1f30b130) Fixed a bug in e31f0b3/b131b9a. (9f39806c) Updated configs to omit setting some blocksizes. (b131b9a0) Merge branch 'rt' of github.com:flame/blis into rt (499a4c00) Subtle update to bli_blksz_init*() API. (e31f0b3e) Added 'x86_64' sub-config directory. (6c3ba502) Added a dummy file to kernels/generic. (25eee3cc) More tweaks to monolithify-header.sh (ef024ce4) Second attempt to implement travis_wait. (5028e7de) Added travis_wait prefix to testsuite via Travis. (13e5d910) Removed pnacl, emscripten support from Makefile. (a1caeba0) Improvements, bugfixes to monolithify-header.sh. (9df6dda9) Merge branch 'rt' of github.com:flame/blis into rt (21d26201) Removed unnecessary flags for generic config. (43baa3b3) [WIP] Add x86 and x86_64 processor families. (#154) (b7ca5806) Added bash script for creating monolithic headers. (870597d1) Removed unnecessary #include "blis.h" from header. (c76f77f4) Miscellaneous tweaks to gks, rt functionality. (2bb9bc6e) Miscellaneous tweaks and fixes. (d5bf79e5) Merge branch 'rt' of github.com:flame/blis into rt (673e5184) Implemented runtime hardware detection via cpuid. (2c51356a) Revert to default SIMD alignment for bulldozer. (ab57b979) Revert to default SIMD alignment for bulldozer. (8f150f28) Use perl for some substitution for OS X compatibility. (e3f10557) Merge branch 'master' into rt (dd45cfdf) Fix CVECFLAGS for bulldozer config. (f60c827b) Typecast l1mkr_t enum value prior to comparison. (3e4f42a4) Removed associative arrays from configure. (aec6e038) Added "generic" configuration. (07c35218) Minor update to .travis.yml file. (c1a98d6f) Minor header renaming ahead of bli_arch.c. 
(75b9383f) Fixed 'make test' target from top-level Makefile. (482af51a) Makefile updates for test drivers, testsuite. (3c269f70) Minor updates to .travis.yml, configure script. (0557189d) Merge branch 'master' into rt (2553734d) Removed a duplicate bli_avx512_macros.h header. (37534279) Implemented runtime kernel management. (453deb29) Merge branch 'master' into rt (b882648a) Fixed a pthread typo in previous commit. (e02d3cb8) Fixed bugs in gemm/gemmtrsm ukr tests in testsuite. (f5962a1a) Updated bibtex info for BLIS5 (3m4m) article. (8e917b25) Merge pull request #150 from devinamatthews/vzeroupper (adafe974) Add vzeroupper to Intel AVX kernels. (7dc78b49) Removed trailing enum commas from bli_type_defs.h. (f86ce54d) Added edge handling to _determine_blocksize_b(). (60a1eeb2) Fixed a minor bug in level-3 packm management. (b01c8082) Merge branch 'master' into rt (8b379069) Merge pull request #146 from devinamatthews/master (05925dd5) Change lsame_ signature to match lapacke. (cecdc05d) Fixed pthreads compile bug with previous commit. (803bbef0) Moved 'family' field from cntx_t to cntl_t. (c63980f4) Merge pull request #139 from Maratyszcza/emscripten (07837395) Merge branch 'master' into emscripten (ad8610b4) Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85) Clang can't make up it's mind what to support. (733faf84) Add default #define for __has_extension. (7425d074) Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb) Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a) Updated ar option list used by all configurations. (1f1ec0db) Added --force-version=STRING option to configure. (5caaba2d) Updated openmp/pthread barriers with GNU atomics. (13175c5f) Added API to set mt environment variables. 
(0e58ba1b) Fix Emscripten builds (8772a0b3) Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b) set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5) Update LICENSE (70cc825b) Add new SSI acknowledgment (cf54c77b) PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a) Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af) Change PACKDIM_MR (double) for haswell to 8. (681eec91) Restored deleted lines from makefile fragments. (6e04f9df) Change to /bin/sh. (ec5c0c04) Remove shebangs from makefiles. (555ddc30) Merge pull request #128 from iotamudelta/master (f26bd7f4) Fix if/else structure. Thanks to TravisCI. (169fb05f) Restore version. (0579dfea) Mark piledriver compilable w/ clang. (a75b05c2) Mark bulldozer compilable w/ clang. (7541d46e) Correct error message. (91f89707) Indeed once can compile for carrizo also using clang. (f5131e1e) A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943) Housekeeping, induced method file/function renames. (1f3a5819) Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a) Fixed a bug in norm1v, norm1m. (cf39d3ef) Merge pull request #121 from jeffhammond/not-real-knl (79948512) Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12) Merge branch 'master' of github.com:flame/blis (773a24ef) Disable complex 3m/4m in testsuite by default. (dd58c954) allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f) Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259) Fixed stray parentheses in README citations. (43007f7b) CHANGELOG update (0.2.2) (a4f1d0b8) never use libm with Intel compilers (c2c91e09)
2017-11-01  Updated version (0.2.2.55.gf60c827b -> 0.2.2.56.gab57b979).  haawda
Changelog ========= Revert to default SIMD alignment for bulldozer. (ab57b979) Fix CVECFLAGS for bulldozer config. (f60c827b) Removed a duplicate bli_avx512_macros.h header. (37534279) Fixed a pthread typo in previous commit. (e02d3cb8) Fixed bugs in gemm/gemmtrsm ukr tests in testsuite. (f5962a1a) Updated bibtex info for BLIS5 (3m4m) article. (8e917b25) Merge pull request #150 from devinamatthews/vzeroupper (adafe974) Add vzeroupper to Intel AVX kernels. (7dc78b49) Removed trailing enum commas from bli_type_defs.h. (f86ce54d) Added edge handling to _determine_blocksize_b(). (60a1eeb2) Fixed a minor bug in level-3 packm management. (b01c8082) Merge pull request #146 from devinamatthews/master (05925dd5) Change lsame_ signature to match lapacke. (cecdc05d) Fixed pthreads compile bug with previous commit. (803bbef0) Moved 'family' field from cntx_t to cntl_t. (c63980f4) Merge pull request #139 from Maratyszcza/emscripten (07837395) Merge branch 'master' into emscripten (ad8610b4) Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85) Clang can't make up it's mind what to support. (733faf84) Add default #define for __has_extension. (7425d074) Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb) Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a) Updated ar option list used by all configurations. (1f1ec0db) Added --force-version=STRING option to configure. (5caaba2d) Updated openmp/pthread barriers with GNU atomics. (13175c5f) Added API to set mt environment variables. (0e58ba1b) Fix Emscripten builds (8772a0b3) Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b) set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5) Update LICENSE (70cc825b) Add new SSI acknowledgment (cf54c77b) PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a) Revert "Change PACKDIM_MR (double) for haswell to 8." 
(d87614af) Change PACKDIM_MR (double) for haswell to 8. (681eec91) Restored deleted lines from makefile fragments. (6e04f9df) Change to /bin/sh. (ec5c0c04) Remove shebangs from makefiles. (555ddc30) Merge pull request #128 from iotamudelta/master (f26bd7f4) Fix if/else structure. Thanks to TravisCI. (169fb05f) Restore version. (0579dfea) Mark piledriver compilable w/ clang. (a75b05c2) Mark bulldozer compilable w/ clang. (7541d46e) Correct error message. (91f89707) Indeed once can compile for carrizo also using clang. (f5131e1e) A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943) Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a) Fixed a bug in norm1v, norm1m. (cf39d3ef) Merge pull request #121 from jeffhammond/not-real-knl (79948512) Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12) Merge branch 'master' of github.com:flame/blis (773a24ef) Disable complex 3m/4m in testsuite by default. (dd58c954) allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f) Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259) Fixed stray parentheses in README citations. (43007f7b) CHANGELOG update (0.2.2) (a4f1d0b8) never use libm with Intel compilers (c2c91e09)
2017-10-30  Updated version (0.2.2.54.g37534279 -> 0.2.2.55.gf60c827b).  haawda
Changelog ========= Fix CVECFLAGS for bulldozer config. (f60c827b) Removed a duplicate bli_avx512_macros.h header. (37534279) Fixed a pthread typo in previous commit. (e02d3cb8) Fixed bugs in gemm/gemmtrsm ukr tests in testsuite. (f5962a1a) Updated bibtex info for BLIS5 (3m4m) article. (8e917b25) Merge pull request #150 from devinamatthews/vzeroupper (adafe974) Add vzeroupper to Intel AVX kernels. (7dc78b49) Removed trailing enum commas from bli_type_defs.h. (f86ce54d) Added edge handling to _determine_blocksize_b(). (60a1eeb2) Fixed a minor bug in level-3 packm management. (b01c8082) Merge pull request #146 from devinamatthews/master (05925dd5) Change lsame_ signature to match lapacke. (cecdc05d) Fixed pthreads compile bug with previous commit. (803bbef0) Moved 'family' field from cntx_t to cntl_t. (c63980f4) Merge pull request #139 from Maratyszcza/emscripten (07837395) Merge branch 'master' into emscripten (ad8610b4) Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85) Clang can't make up it's mind what to support. (733faf84) Add default #define for __has_extension. (7425d074) Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb) Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a) Updated ar option list used by all configurations. (1f1ec0db) Added --force-version=STRING option to configure. (5caaba2d) Updated openmp/pthread barriers with GNU atomics. (13175c5f) Added API to set mt environment variables. (0e58ba1b) Fix Emscripten builds (8772a0b3) Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b) set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5) Update LICENSE (70cc825b) Add new SSI acknowledgment (cf54c77b) PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a) Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af) Change PACKDIM_MR (double) for haswell to 8. 
(681eec91) Restored deleted lines from makefile fragments. (6e04f9df) Change to /bin/sh. (ec5c0c04) Remove shebangs from makefiles. (555ddc30) Merge pull request #128 from iotamudelta/master (f26bd7f4) Fix if/else structure. Thanks to TravisCI. (169fb05f) Restore version. (0579dfea) Mark piledriver compilable w/ clang. (a75b05c2) Mark bulldozer compilable w/ clang. (7541d46e) Correct error message. (91f89707) Indeed once can compile for carrizo also using clang. (f5131e1e) A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943) Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a) Fixed a bug in norm1v, norm1m. (cf39d3ef) Merge pull request #121 from jeffhammond/not-real-knl (79948512) Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12) Merge branch 'master' of github.com:flame/blis (773a24ef) Disable complex 3m/4m in testsuite by default. (dd58c954) allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f) Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259) Fixed stray parentheses in README citations. (43007f7b) CHANGELOG update (0.2.2) (a4f1d0b8) never use libm with Intel compilers (c2c91e09)
2017-10-18  Updated version (0.2.2.53.ge02d3cb8 -> 0.2.2.54.g37534279).  haawda
Changelog ========= Removed a duplicate bli_avx512_macros.h header. (37534279) Fixed a pthread typo in previous commit. (e02d3cb8) Fixed bugs in gemm/gemmtrsm ukr tests in testsuite. (f5962a1a) Updated bibtex info for BLIS5 (3m4m) article. (8e917b25) Merge pull request #150 from devinamatthews/vzeroupper (adafe974) Add vzeroupper to Intel AVX kernels. (7dc78b49) Removed trailing enum commas from bli_type_defs.h. (f86ce54d) Added edge handling to _determine_blocksize_b(). (60a1eeb2) Fixed a minor bug in level-3 packm management. (b01c8082) Merge pull request #146 from devinamatthews/master (05925dd5) Change lsame_ signature to match lapacke. (cecdc05d) Fixed pthreads compile bug with previous commit. (803bbef0) Moved 'family' field from cntx_t to cntl_t. (c63980f4) Merge pull request #139 from Maratyszcza/emscripten (07837395) Merge branch 'master' into emscripten (ad8610b4) Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85) Clang can't make up it's mind what to support. (733faf84) Add default #define for __has_extension. (7425d074) Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb) Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a) Updated ar option list used by all configurations. (1f1ec0db) Added --force-version=STRING option to configure. (5caaba2d) Updated openmp/pthread barriers with GNU atomics. (13175c5f) Added API to set mt environment variables. (0e58ba1b) Fix Emscripten builds (8772a0b3) Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b) set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5) Update LICENSE (70cc825b) Add new SSI acknowledgment (cf54c77b) PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a) Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af) Change PACKDIM_MR (double) for haswell to 8. (681eec91) Restored deleted lines from makefile fragments. 
(6e04f9df) Change to /bin/sh. (ec5c0c04) Remove shebangs from makefiles. (555ddc30) Merge pull request #128 from iotamudelta/master (f26bd7f4) Fix if/else structure. Thanks to TravisCI. (169fb05f) Restore version. (0579dfea) Mark piledriver compilable w/ clang. (a75b05c2) Mark bulldozer compilable w/ clang. (7541d46e) Correct error message. (91f89707) Indeed once can compile for carrizo also using clang. (f5131e1e) A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943) Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a) Fixed a bug in norm1v, norm1m. (cf39d3ef) Merge pull request #121 from jeffhammond/not-real-knl (79948512) Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12) Merge branch 'master' of github.com:flame/blis (773a24ef) Disable complex 3m/4m in testsuite by default. (dd58c954) allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f) Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259) Fixed stray parentheses in README citations. (43007f7b) CHANGELOG update (0.2.2) (a4f1d0b8) never use libm with Intel compilers (c2c91e09)
2017-09-27  Updated version (0.2.2.51.g8e917b25 -> 0.2.2.53.ge02d3cb8).  haawda
Changelog ========= Fixed a pthread typo in previous commit. (e02d3cb8) Fixed bugs in gemm/gemmtrsm ukr tests in testsuite. (f5962a1a) Updated bibtex info for BLIS5 (3m4m) article. (8e917b25) Merge pull request #150 from devinamatthews/vzeroupper (adafe974) Add vzeroupper to Intel AVX kernels. (7dc78b49) Removed trailing enum commas from bli_type_defs.h. (f86ce54d) Added edge handling to _determine_blocksize_b(). (60a1eeb2) Fixed a minor bug in level-3 packm management. (b01c8082) Merge pull request #146 from devinamatthews/master (05925dd5) Change lsame_ signature to match lapacke. (cecdc05d) Fixed pthreads compile bug with previous commit. (803bbef0) Moved 'family' field from cntx_t to cntl_t. (c63980f4) Merge pull request #139 from Maratyszcza/emscripten (07837395) Merge branch 'master' into emscripten (ad8610b4) Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85) Clang can't make up it's mind what to support. (733faf84) Add default #define for __has_extension. (7425d074) Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb) Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a) Updated ar option list used by all configurations. (1f1ec0db) Added --force-version=STRING option to configure. (5caaba2d) Updated openmp/pthread barriers with GNU atomics. (13175c5f) Added API to set mt environment variables. (0e58ba1b) Fix Emscripten builds (8772a0b3) Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b) set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5) Update LICENSE (70cc825b) Add new SSI acknowledgment (cf54c77b) PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a) Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af) Change PACKDIM_MR (double) for haswell to 8. (681eec91) Restored deleted lines from makefile fragments. (6e04f9df) Change to /bin/sh. 
(ec5c0c04) Remove shebangs from makefiles. (555ddc30) Merge pull request #128 from iotamudelta/master (f26bd7f4) Fix if/else structure. Thanks to TravisCI. (169fb05f) Restore version. (0579dfea) Mark piledriver compilable w/ clang. (a75b05c2) Mark bulldozer compilable w/ clang. (7541d46e) Correct error message. (91f89707) Indeed once can compile for carrizo also using clang. (f5131e1e) A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943) Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a) Fixed a bug in norm1v, norm1m. (cf39d3ef) Merge pull request #121 from jeffhammond/not-real-knl (79948512) Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12) Merge branch 'master' of github.com:flame/blis (773a24ef) Disable complex 3m/4m in testsuite by default. (dd58c954) allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f) Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259) Fixed stray parentheses in README citations. (43007f7b) CHANGELOG update (0.2.2) (a4f1d0b8) never use libm with Intel compilers (c2c91e09)
2017-09-10  Updated version (0.2.2.50.gadafe974 -> 0.2.2.51.g8e917b25).  haawda
Changelog ========= Updated bibtex info for BLIS5 (3m4m) article. (8e917b25) Merge pull request #150 from devinamatthews/vzeroupper (adafe974) Add vzeroupper to Intel AVX kernels. (7dc78b49) Removed trailing enum commas from bli_type_defs.h. (f86ce54d) Added edge handling to _determine_blocksize_b(). (60a1eeb2) Fixed a minor bug in level-3 packm management. (b01c8082) Merge pull request #146 from devinamatthews/master (05925dd5) Change lsame_ signature to match lapacke. (cecdc05d) Fixed pthreads compile bug with previous commit. (803bbef0) Moved 'family' field from cntx_t to cntl_t. (c63980f4) Merge pull request #139 from Maratyszcza/emscripten (07837395) Merge branch 'master' into emscripten (ad8610b4) Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85) Clang can't make up it's mind what to support. (733faf84) Add default #define for __has_extension. (7425d074) Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb) Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a) Updated ar option list used by all configurations. (1f1ec0db) Added --force-version=STRING option to configure. (5caaba2d) Updated openmp/pthread barriers with GNU atomics. (13175c5f) Added API to set mt environment variables. (0e58ba1b) Fix Emscripten builds (8772a0b3) Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b) set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5) Update LICENSE (70cc825b) Add new SSI acknowledgment (cf54c77b) PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a) Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af) Change PACKDIM_MR (double) for haswell to 8. (681eec91) Restored deleted lines from makefile fragments. (6e04f9df) Change to /bin/sh. (ec5c0c04) Remove shebangs from makefiles. (555ddc30) Merge pull request #128 from iotamudelta/master (f26bd7f4) Fix if/else structure. 
Thanks to TravisCI. (169fb05f) Restore version. (0579dfea) Mark piledriver compilable w/ clang. (a75b05c2) Mark bulldozer compilable w/ clang. (7541d46e) Correct error message. (91f89707) Indeed once can compile for carrizo also using clang. (f5131e1e) A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943) Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a) Fixed a bug in norm1v, norm1m. (cf39d3ef) Merge pull request #121 from jeffhammond/not-real-knl (79948512) Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12) Merge branch 'master' of github.com:flame/blis (773a24ef) Disable complex 3m/4m in testsuite by default. (dd58c954) allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f) Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259) Fixed stray parentheses in README citations. (43007f7b) CHANGELOG update (0.2.2) (a4f1d0b8) never use libm with Intel compilers (c2c91e09)
2017-08-16  Updated version (0.2.2.48.gf86ce54d -> 0.2.2.50.gadafe974).  haawda
Changelog ========= Merge pull request #150 from devinamatthews/vzeroupper (adafe974) Add vzeroupper to Intel AVX kernels. (7dc78b49) Removed trailing enum commas from bli_type_defs.h. (f86ce54d) Added edge handling to _determine_blocksize_b(). (60a1eeb2) Fixed a minor bug in level-3 packm management. (b01c8082) Merge pull request #146 from devinamatthews/master (05925dd5) Change lsame_ signature to match lapacke. (cecdc05d) Fixed pthreads compile bug with previous commit. (803bbef0) Moved 'family' field from cntx_t to cntl_t. (c63980f4) Merge pull request #139 from Maratyszcza/emscripten (07837395) Merge branch 'master' into emscripten (ad8610b4) Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85) Clang can't make up it's mind what to support. (733faf84) Add default #define for __has_extension. (7425d074) Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb) Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a) Updated ar option list used by all configurations. (1f1ec0db) Added --force-version=STRING option to configure. (5caaba2d) Updated openmp/pthread barriers with GNU atomics. (13175c5f) Added API to set mt environment variables. (0e58ba1b) Fix Emscripten builds (8772a0b3) Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b) set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5) Update LICENSE (70cc825b) Add new SSI acknowledgment (cf54c77b) PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a) Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af) Change PACKDIM_MR (double) for haswell to 8. (681eec91) Restored deleted lines from makefile fragments. (6e04f9df) Change to /bin/sh. (ec5c0c04) Remove shebangs from makefiles. (555ddc30) Merge pull request #128 from iotamudelta/master (f26bd7f4) Fix if/else structure. Thanks to TravisCI. (169fb05f) Restore version. 
(0579dfea) Mark piledriver compilable w/ clang. (a75b05c2) Mark bulldozer compilable w/ clang. (7541d46e) Correct error message. (91f89707) Indeed once can compile for carrizo also using clang. (f5131e1e) A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943) Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a) Fixed a bug in norm1v, norm1m. (cf39d3ef) Merge pull request #121 from jeffhammond/not-real-knl (79948512) Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12) Merge branch 'master' of github.com:flame/blis (773a24ef) Disable complex 3m/4m in testsuite by default. (dd58c954) allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f) Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259) Fixed stray parentheses in README citations. (43007f7b) CHANGELOG update (0.2.2) (a4f1d0b8) never use libm with Intel compilers (c2c91e09)
2017-08-11  update  haawda
2017-08-06  update  haawda
2017-08-05  Updated version (0.2.2.45.g05925dd5 -> 0.2.2.46.gb01c8082).  haawda
Changelog
=========
Fixed a minor bug in level-3 packm management. (b01c8082)
Merge pull request #146 from devinamatthews/master (05925dd5)
Change lsame_ signature to match lapacke. (cecdc05d)
Fixed pthreads compile bug with previous commit. (803bbef0)
Moved 'family' field from cntx_t to cntl_t. (c63980f4)
Merge pull request #139 from Maratyszcza/emscripten (07837395)
Merge branch 'master' into emscripten (ad8610b4)
Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85)
Clang can't make up it's mind what to support. (733faf84)
Add default #define for __has_extension. (7425d074)
Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb)
Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a)
Updated ar option list used by all configurations. (1f1ec0db)
Added --force-version=STRING option to configure. (5caaba2d)
Updated openmp/pthread barriers with GNU atomics. (13175c5f)
Added API to set mt environment variables. (0e58ba1b)
Fix Emscripten builds (8772a0b3)
Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b)
set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5)
Update LICENSE (70cc825b)
Add new SSI acknowledgment (cf54c77b)
PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a)
Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af)
Change PACKDIM_MR (double) for haswell to 8. (681eec91)
Restored deleted lines from makefile fragments. (6e04f9df)
Change to /bin/sh. (ec5c0c04)
Remove shebangs from makefiles. (555ddc30)
Merge pull request #128 from iotamudelta/master (f26bd7f4)
Fix if/else structure. Thanks to TravisCI. (169fb05f)
Restore version. (0579dfea)
Mark piledriver compilable w/ clang. (a75b05c2)
Mark bulldozer compilable w/ clang. (7541d46e)
Correct error message. (91f89707)
Indeed once can compile for carrizo also using clang. (f5131e1e)
A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943)
Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a)
Fixed a bug in norm1v, norm1m. (cf39d3ef)
Merge pull request #121 from jeffhammond/not-real-knl (79948512)
Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12)
Merge branch 'master' of github.com:flame/blis (773a24ef)
Disable complex 3m/4m in testsuite by default. (dd58c954)
allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f)
Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259)
Fixed stray parentheses in README citations. (43007f7b)
CHANGELOG update (0.2.2) (a4f1d0b8)
never use libm with Intel compilers (c2c91e09)
2017-08-01 Updated version (0.2.2.43.g803bbef0 -> 0.2.2.45.g05925dd5). haawda
Changelog
=========
Merge pull request #146 from devinamatthews/master (05925dd5)
Change lsame_ signature to match lapacke. (cecdc05d)
Fixed pthreads compile bug with previous commit. (803bbef0)
Moved 'family' field from cntx_t to cntl_t. (c63980f4)
Merge pull request #139 from Maratyszcza/emscripten (07837395)
Merge branch 'master' into emscripten (ad8610b4)
Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85)
Clang can't make up it's mind what to support. (733faf84)
Add default #define for __has_extension. (7425d074)
Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb)
Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a)
Updated ar option list used by all configurations. (1f1ec0db)
Added --force-version=STRING option to configure. (5caaba2d)
Updated openmp/pthread barriers with GNU atomics. (13175c5f)
Added API to set mt environment variables. (0e58ba1b)
Fix Emscripten builds (8772a0b3)
Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b)
set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5)
Update LICENSE (70cc825b)
Add new SSI acknowledgment (cf54c77b)
PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a)
Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af)
Change PACKDIM_MR (double) for haswell to 8. (681eec91)
Restored deleted lines from makefile fragments. (6e04f9df)
Change to /bin/sh. (ec5c0c04)
Remove shebangs from makefiles. (555ddc30)
Merge pull request #128 from iotamudelta/master (f26bd7f4)
Fix if/else structure. Thanks to TravisCI. (169fb05f)
Restore version. (0579dfea)
Mark piledriver compilable w/ clang. (a75b05c2)
Mark bulldozer compilable w/ clang. (7541d46e)
Correct error message. (91f89707)
Indeed once can compile for carrizo also using clang. (f5131e1e)
A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943)
Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a)
Fixed a bug in norm1v, norm1m. (cf39d3ef)
Merge pull request #121 from jeffhammond/not-real-knl (79948512)
Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12)
Merge branch 'master' of github.com:flame/blis (773a24ef)
Disable complex 3m/4m in testsuite by default. (dd58c954)
allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f)
Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259)
Fixed stray parentheses in README citations. (43007f7b)
CHANGELOG update (0.2.2) (a4f1d0b8)
never use libm with Intel compilers (c2c91e09)
2017-07-30 Updated version (0.2.2.41.g07837395 -> 0.2.2.43.g803bbef0). haawda
Changelog
=========
Fixed pthreads compile bug with previous commit. (803bbef0)
Moved 'family' field from cntx_t to cntl_t. (c63980f4)
Merge pull request #139 from Maratyszcza/emscripten (07837395)
Merge branch 'master' into emscripten (ad8610b4)
Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85)
Clang can't make up it's mind what to support. (733faf84)
Add default #define for __has_extension. (7425d074)
Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb)
Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a)
Updated ar option list used by all configurations. (1f1ec0db)
Added --force-version=STRING option to configure. (5caaba2d)
Updated openmp/pthread barriers with GNU atomics. (13175c5f)
Added API to set mt environment variables. (0e58ba1b)
Fix Emscripten builds (8772a0b3)
Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b)
set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5)
Update LICENSE (70cc825b)
Add new SSI acknowledgment (cf54c77b)
PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a)
Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af)
Change PACKDIM_MR (double) for haswell to 8. (681eec91)
Restored deleted lines from makefile fragments. (6e04f9df)
Change to /bin/sh. (ec5c0c04)
Remove shebangs from makefiles. (555ddc30)
Merge pull request #128 from iotamudelta/master (f26bd7f4)
Fix if/else structure. Thanks to TravisCI. (169fb05f)
Restore version. (0579dfea)
Mark piledriver compilable w/ clang. (a75b05c2)
Mark bulldozer compilable w/ clang. (7541d46e)
Correct error message. (91f89707)
Indeed once can compile for carrizo also using clang. (f5131e1e)
A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943)
Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a)
Fixed a bug in norm1v, norm1m. (cf39d3ef)
Merge pull request #121 from jeffhammond/not-real-knl (79948512)
Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12)
Merge branch 'master' of github.com:flame/blis (773a24ef)
Disable complex 3m/4m in testsuite by default. (dd58c954)
allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f)
Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259)
Fixed stray parentheses in README citations. (43007f7b)
CHANGELOG update (0.2.2) (a4f1d0b8)
never use libm with Intel compilers (c2c91e09)
2017-07-22 Updated version (0.2.2.38.gca1d1d85 -> 0.2.2.41.g07837395). haawda
Changelog
=========
Merge pull request #139 from Maratyszcza/emscripten (07837395)
Merge branch 'master' into emscripten (ad8610b4)
Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85)
Clang can't make up it's mind what to support. (733faf84)
Add default #define for __has_extension. (7425d074)
Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb)
Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a)
Updated ar option list used by all configurations. (1f1ec0db)
Added --force-version=STRING option to configure. (5caaba2d)
Updated openmp/pthread barriers with GNU atomics. (13175c5f)
Added API to set mt environment variables. (0e58ba1b)
Fix Emscripten builds (8772a0b3)
Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b)
set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5)
Update LICENSE (70cc825b)
Add new SSI acknowledgment (cf54c77b)
PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a)
Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af)
Change PACKDIM_MR (double) for haswell to 8. (681eec91)
Restored deleted lines from makefile fragments. (6e04f9df)
Change to /bin/sh. (ec5c0c04)
Remove shebangs from makefiles. (555ddc30)
Merge pull request #128 from iotamudelta/master (f26bd7f4)
Fix if/else structure. Thanks to TravisCI. (169fb05f)
Restore version. (0579dfea)
Mark piledriver compilable w/ clang. (a75b05c2)
Mark bulldozer compilable w/ clang. (7541d46e)
Correct error message. (91f89707)
Indeed once can compile for carrizo also using clang. (f5131e1e)
A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943)
Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a)
Fixed a bug in norm1v, norm1m. (cf39d3ef)
Merge pull request #121 from jeffhammond/not-real-knl (79948512)
Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12)
Merge branch 'master' of github.com:flame/blis (773a24ef)
Disable complex 3m/4m in testsuite by default. (dd58c954)
allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f)
Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259)
Fixed stray parentheses in README citations. (43007f7b)
CHANGELOG update (0.2.2) (a4f1d0b8)
never use libm with Intel compilers (c2c91e09)
2017-07-21 Updated version (0.2.2.34.gb537b5bb -> 0.2.2.38.gca1d1d85). haawda
Changelog
=========
Merge pull request #144 from devinamatthews/fix_atomics_on_bgq (ca1d1d85)
Clang can't make up it's mind what to support. (733faf84)
Add default #define for __has_extension. (7425d074)
Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb)
Add fallbacks to __sync_* or __c11_atomic_* builtins when __atomic_* is not supported. Fixes #143. (8823f91a)
Updated ar option list used by all configurations. (1f1ec0db)
Added --force-version=STRING option to configure. (5caaba2d)
Updated openmp/pthread barriers with GNU atomics. (13175c5f)
Added API to set mt environment variables. (0e58ba1b)
Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b)
set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5)
Update LICENSE (70cc825b)
Add new SSI acknowledgment (cf54c77b)
PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a)
Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af)
Change PACKDIM_MR (double) for haswell to 8. (681eec91)
Restored deleted lines from makefile fragments. (6e04f9df)
Change to /bin/sh. (ec5c0c04)
Remove shebangs from makefiles. (555ddc30)
Merge pull request #128 from iotamudelta/master (f26bd7f4)
Fix if/else structure. Thanks to TravisCI. (169fb05f)
Restore version. (0579dfea)
Mark piledriver compilable w/ clang. (a75b05c2)
Mark bulldozer compilable w/ clang. (7541d46e)
Correct error message. (91f89707)
Indeed once can compile for carrizo also using clang. (f5131e1e)
A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943)
Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a)
Fixed a bug in norm1v, norm1m. (cf39d3ef)
Merge pull request #121 from jeffhammond/not-real-knl (79948512)
Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12)
Merge branch 'master' of github.com:flame/blis (773a24ef)
Disable complex 3m/4m in testsuite by default. (dd58c954)
allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f)
Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259)
Fixed stray parentheses in README citations. (43007f7b)
CHANGELOG update (0.2.2) (a4f1d0b8)
never use libm with Intel compilers (c2c91e09)
2017-07-20 Updated version (0.2.2.30.g1f1ec0db -> 0.2.2.34.gb537b5bb). haawda
Changelog
=========
Merge pull request #133 from devinamatthews/haswell-packdim (b537b5bb)
Updated ar option list used by all configurations. (1f1ec0db)
Added --force-version=STRING option to configure. (5caaba2d)
Updated openmp/pthread barriers with GNU atomics. (13175c5f)
Added API to set mt environment variables. (0e58ba1b)
Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b)
set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5)
Update LICENSE (70cc825b)
Add new SSI acknowledgment (cf54c77b)
PACKDIM_MR=8 didn't work out, but messing with the prefetching helps 2%. (7f41bb0a)
Revert "Change PACKDIM_MR (double) for haswell to 8." (d87614af)
Change PACKDIM_MR (double) for haswell to 8. (681eec91)
Restored deleted lines from makefile fragments. (6e04f9df)
Change to /bin/sh. (ec5c0c04)
Remove shebangs from makefiles. (555ddc30)
Merge pull request #128 from iotamudelta/master (f26bd7f4)
Fix if/else structure. Thanks to TravisCI. (169fb05f)
Restore version. (0579dfea)
Mark piledriver compilable w/ clang. (a75b05c2)
Mark bulldozer compilable w/ clang. (7541d46e)
Correct error message. (91f89707)
Indeed once can compile for carrizo also using clang. (f5131e1e)
A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943)
Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a)
Fixed a bug in norm1v, norm1m. (cf39d3ef)
Merge pull request #121 from jeffhammond/not-real-knl (79948512)
Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12)
Merge branch 'master' of github.com:flame/blis (773a24ef)
Disable complex 3m/4m in testsuite by default. (dd58c954)
allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f)
Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259)
Fixed stray parentheses in README citations. (43007f7b)
CHANGELOG update (0.2.2) (a4f1d0b8)
never use libm with Intel compilers (c2c91e09)
2017-07-20 Updated version (0.2.2.28.g13175c5f -> 0.2.2.30.g1f1ec0db). haawda
Changelog
=========
Updated ar option list used by all configurations. (1f1ec0db)
Added --force-version=STRING option to configure. (5caaba2d)
Updated openmp/pthread barriers with GNU atomics. (13175c5f)
Added API to set mt environment variables. (0e58ba1b)
Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b)
set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5)
Update LICENSE (70cc825b)
Add new SSI acknowledgment (cf54c77b)
Restored deleted lines from makefile fragments. (6e04f9df)
Change to /bin/sh. (ec5c0c04)
Remove shebangs from makefiles. (555ddc30)
Merge pull request #128 from iotamudelta/master (f26bd7f4)
Fix if/else structure. Thanks to TravisCI. (169fb05f)
Restore version. (0579dfea)
Mark piledriver compilable w/ clang. (a75b05c2)
Mark bulldozer compilable w/ clang. (7541d46e)
Correct error message. (91f89707)
Indeed once can compile for carrizo also using clang. (f5131e1e)
A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943)
Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a)
Fixed a bug in norm1v, norm1m. (cf39d3ef)
Merge pull request #121 from jeffhammond/not-real-knl (79948512)
Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12)
Merge branch 'master' of github.com:flame/blis (773a24ef)
Disable complex 3m/4m in testsuite by default. (dd58c954)
allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f)
Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259)
Fixed stray parentheses in README citations. (43007f7b)
CHANGELOG update (0.2.2) (a4f1d0b8)
never use libm with Intel compilers (c2c91e09)
2017-07-19 Updated version (0.2.1.131.g0e58ba1b -> 0.2.2.28.g13175c5f). haawda
Changelog
=========
Updated openmp/pthread barriers with GNU atomics. (13175c5f)
Added API to set mt environment variables. (0e58ba1b)
Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b)
set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5)
Update LICENSE (70cc825b)
Add new SSI acknowledgment (cf54c77b)
Restored deleted lines from makefile fragments. (6e04f9df)
Change to /bin/sh. (ec5c0c04)
Remove shebangs from makefiles. (555ddc30)
Merge pull request #128 from iotamudelta/master (f26bd7f4)
Fix if/else structure. Thanks to TravisCI. (169fb05f)
Restore version. (0579dfea)
Mark piledriver compilable w/ clang. (a75b05c2)
Mark bulldozer compilable w/ clang. (7541d46e)
Correct error message. (91f89707)
Indeed once can compile for carrizo also using clang. (f5131e1e)
A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943)
Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a)
Fixed a bug in norm1v, norm1m. (cf39d3ef)
Merge pull request #121 from jeffhammond/not-real-knl (79948512)
Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12)
Merge branch 'master' of github.com:flame/blis (773a24ef)
Disable complex 3m/4m in testsuite by default. (dd58c954)
allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f)
Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259)
Fixed stray parentheses in README citations. (43007f7b)
CHANGELOG update (0.2.2) (a4f1d0b8)
never use libm with Intel compilers (c2c91e09)
2017-07-18 Updated version (0.2.1.130.g72c8b49b -> 0.2.1.131.g0e58ba1b). haawda
Changelog
=========
Added API to set mt environment variables. (0e58ba1b)
Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b)
set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5)
Update LICENSE (70cc825b)
Add new SSI acknowledgment (cf54c77b)
Restored deleted lines from makefile fragments. (6e04f9df)
Change to /bin/sh. (ec5c0c04)
Remove shebangs from makefiles. (555ddc30)
Merge pull request #128 from iotamudelta/master (f26bd7f4)
Fix if/else structure. Thanks to TravisCI. (169fb05f)
Restore version. (0579dfea)
Mark piledriver compilable w/ clang. (a75b05c2)
Mark bulldozer compilable w/ clang. (7541d46e)
Correct error message. (91f89707)
Indeed once can compile for carrizo also using clang. (f5131e1e)
A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943)
Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a)
Fixed a bug in norm1v, norm1m. (cf39d3ef)
Merge pull request #121 from jeffhammond/not-real-knl (79948512)
Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12)
Merge branch 'master' of github.com:flame/blis (773a24ef)
Disable complex 3m/4m in testsuite by default. (dd58c954)
allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f)
Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259)
Fixed stray parentheses in README citations. (43007f7b)
CHANGELOG update (0.2.2) (a4f1d0b8)
Version file update (0.2.2) (940a707a)
Fixed a trsm1m bug that affected right-side cases. (d5a5e003)
Merge branch 'master' into 1m (e80993e7)
README.md update. (ca3a7924)
Minor updates to test/3m4m. (6e7de6ef)
Whitespace reformatting to armv8a kernels file. (f484c6cd)
Merge branch 'master' into 1m (a509fbd5)
Disabled experiment-related 1m code. (69b4846a)
Merge pull request #118 from devinamatthews/master (513944e4)
Handle k=0 correctly in KNL dgemm ukernel. (0e18f68c)
Merge pull request #117 from devinamatthews/master (8b462a0e)
Cast dim_t and inc_t parameters to 64-bit in KNL microkernels. (7d42fc07)
Added missing "level-0" BLAS [sd]cabs1_(). (c362afc5)
Fixed a minor bug in configure (issue #114). (018180c9)
Merge pull request #113 from devinamatthews/knl_thread_params (ddf45e71)
Change default threading parameters for KNL. (78e1b16e)
Added 1m-specific APIs for bp, pb gemm algorithms. (1c732d3d)
Merge pull request #111 from figual/master (a6ab91bc)
Fixed missing cntx argument in ARMv8 microkernels. (7f31a630)
Implemented the 1m method. (126482a3)
Switched to simpler trsm_r implementation. (145a551d)
Reimplemented 4x12 haswell ukernels (real only). (b3e58ee3)
Adjusted stride selection of ct in macrokernels. (bdc0a264)
Fixed inactive trsm_r blocksize constraint code. (031978d2)
Merge pull request #109 from devinamatthews/omp_num_threads (6b5a4032)
- Fix typo in bli_cntx.c - Bump BLIS_DEFAULT_NR_THREAD_MAX to 4 (a8220e3a)
Add automatic loop thread assignment. (c05b3862)
Consolidated 3m1/4m1 gemmtrsm, trsm ukernel code. (3b524a08)
Merge pull request #108 from devinamatthews/patch-2 (ead231ac)
Allow KNL to fail (62987f60)
Fix some problems with OSX builds: (8f901054)
Can disable trsm_r-specific blocksize constraints. (d25e6f8b)
Bogus commit (1a67e368)
Some fixes for .travis.yml (2cd82d67)
Update .travis.yml with additional tests (a3db4e6b)
Updates to non-default haswell microkernels. (8a11a217)
Align strides of ct in macrokernels to that of c. (618f4331)
never use libm with Intel compilers (c2c91e09)
Merge pull request #105 from devinamatthews/knl (63039100)
Fix up for merge to master. (216206c1)
Merge branch 'master' into knl (11eb7957)
Don't use %rbp in KNL packing kernels. (cd5b6681)
Merge pull request #104 from devinamatthews/misspellings (956b3edf)
Add flexible options for thread model (pthread/posix for pthreads etc.). (0662a3c1)
Merge pull request #103 from devinamatthews/patch-1 (b7e41d71)
Change .align to .p2align in Bulldozer ukernels (5117d444)
Merge pull request #93 from ShadenSmith/config_check (4bd905bd)
Fixed multithreading compilation bug in 970745a. (936d5fdc)
Removed auto-prototyping of malloc()/free() substitutes. (8feb0f85)
Reorganized typedefs to avoid compiler warnings. (970745a5)
Added disabled code to print thrinfo_t structures. (28b2af8a)
Fixed a configure -t omp/openmp bug from fd04869. (11eed3f6)
Removed previously renamed/old files. (9cda6057)
Fixed bli_gemm() segfault on empty C matrices. (22377abd)
Fixed segfault in bli_free_align() for NULL ptrs. (0b571cd9)
CHANGELOG update (0.2.1) (4fb9b4ef)
Adds sanity check to configuration choice. (7f32dd57)
Add prefetchw to 30x8 kernel. (c8e4ef93)
Merge remote-tracking branch 'origin/knl' into knl (4b5a2f3d)
Add (new) 30x8 KNL kernel and fix non-scatter prefetch bug. (380736bf)
Try prefetchw[t1] instead of regular prefetch for C. (9f52a587)
This version gets ~1550 GFLOPs on KNL wuth 16x4. (8945a151)
Switch back to 24x8. I could only squeeze 24.5GFLOP out of 8x24, and scalability is not improved. (6ce4c022)
Try an 8x24 kernel for the hell of it. (b8f2b555)
Allocate pack buffer on MCDRAM for KNL. (7ede5863)
Merge branch 'knl' of github.com:devinamatthews/blis into knl (ad89ed2e)
This version gets ~26GF on one core. (2c9de740)
Add optimized packing kernels for KNL. (81e2b05f)
All fixed. (a7d8ca97)
Add 24xk pack kernel. (963d0393)
In the midst of debugging. (117b7673)
Fix some row/column confusion. (8c0a4fd1)
Simplify displacements -- clang assembler was badly botching EVEX compressed displacements giving false alarms for instruction length. (c44f9f96)
Minor fixes for 8x24 KNL kernel. (e0cce177)
Switch to 24x8 kernel, unrolled by 16. (65735bbe)
Add 24x8 "KNC-style" kernel for KNL. (45d5dc97)
Add 4x unrolled variant for KNL microkernel. (8ff2e069)
Git rid of one RBX update. (9cb2ed9b)
Add some more knobs to twiddle for KNL microkernel. (451bde07)
Make knl conform to new kernel dir structure. (8c6e621c)
Merge remote-tracking branch 'origin/master' into knl (ce7214c6)
Add 8x24 KNL kernel. (119d0399)
Merge remote-tracking branch 'origin/master' into knl (b58cda9e)
Add new KNL microkernel derived from Haswell. (318f063d)
Fix SIMD definitions in KNL config, and a couple of fixes to C update. (e3bd5ca6)
Move bli_kernel.h before bli_threading.h in order of inclusion in blis.h. (4fe02e3d)
Merge branch 'move_simd_defs' into knl (619dee0d)
Merge branch 'master' into knl (b790b3d9)
Rearrange KNL dgemm kernel again to streamline usage of ymm register. sgemm and dgemm now both working with Intel SDE. (4f8c05c9)
Work around missing VPMULLQ on KNL. (7193230f)
Fix copy-paste errors in KNL kernels. (bd44cf13)
Add sgemm ukernels for KNL. vpmullq is not implemented on KNL -- needs workaround. (a11eec05)
Merge remote-tracking branch 'origin/master' into knl (c38e0dab)
Merge remote-tracking branch 'origin/knl' into knl (bd5e2296)
Add 64-bit offset vector so we can use vgatherqpd. (4745def0)
KNL ukernel compiles with gcc. (49f85177)
Rewrite of KNL kernel in GNU extended asm syntax. (58b2c3cf)
Translated MIC kernel to KNL and cleaned up a bit. Only real change is lack of swizzle modifiers for FMA instructions (used bcast from memory instead). (dd856c2c)
Copy mic kernel to knl for transliteration. (7f27431d)
Merge branch 'master' into const_correctness (f8f02f03)
Merge branch 'master' into const_correctness (32c92d94)
Merge branch 'master' into const_correctness (62914ccb)
Add missing const to bli_read_nway_from_env. (bbf704bf)
Set default value for debug_type variable. (a4d77297)
Add const correctness to auxinfo_t struct (microkernels need update theoretically). (0e2447fa)
2017-07-13 Updated version (0.2.1.128.g70cc825b -> 0.2.1.130.g72c8b49b). haawda
Changelog
=========
Merge pull request #138 from hominhquan/membrk_set_free_fp (72c8b49b)
set missing free_fp in bli_membrk_init for free-ing GEN_USE buffers (ba7cada5)
Update LICENSE (70cc825b)
Add new SSI acknowledgment (cf54c77b)
Restored deleted lines from makefile fragments. (6e04f9df)
Change to /bin/sh. (ec5c0c04)
Remove shebangs from makefiles. (555ddc30)
Merge pull request #128 from iotamudelta/master (f26bd7f4)
Fix if/else structure. Thanks to TravisCI. (169fb05f)
Restore version. (0579dfea)
Mark piledriver compilable w/ clang. (a75b05c2)
Mark bulldozer compilable w/ clang. (7541d46e)
Correct error message. (91f89707)
Indeed once can compile for carrizo also using clang. (f5131e1e)
A bunch of shebang fixes from unportable /bin/bash to portable /usr/bin/env bash (5fa4e943)
Merge pull request #127 from devinamatthews/fix_blis_nt_xx (cbf8710a)
Fixed a bug in norm1v, norm1m. (cf39d3ef)
Merge pull request #121 from jeffhammond/not-real-knl (79948512)
Setting any one of BLIS_NT_[IJ][CR] overrides BLIS_NUM_THEADS. Missing BLIS_NT_XX's are defaulted to 1. Fixes #123. (fdc66f12)
Merge branch 'master' of github.com:flame/blis (773a24ef)
Disable complex 3m/4m in testsuite by default. (dd58c954)
allow KNL build without hbwmalloc.h (i.e. emulated) (0df3541f)
Merge pull request #107 from jeffhammond/intel-compilers-no-use-libm (b8854259)
Fixed stray parentheses in README citations. (43007f7b)
CHANGELOG update (0.2.2) (a4f1d0b8)
Version file update (0.2.2) (940a707a)
Fixed a trsm1m bug that affected right-side cases. (d5a5e003)
Merge branch 'master' into 1m (e80993e7)
README.md update. (ca3a7924)
Minor updates to test/3m4m. (6e7de6ef)
Whitespace reformatting to armv8a kernels file. (f484c6cd)
Merge branch 'master' into 1m (a509fbd5)
Disabled experiment-related 1m code. (69b4846a)
Merge pull request #118 from devinamatthews/master (513944e4)
Handle k=0 correctly in KNL dgemm ukernel. (0e18f68c)
Merge pull request #117 from devinamatthews/master (8b462a0e)
Cast dim_t and inc_t parameters to 64-bit in KNL microkernels. (7d42fc07)
Added missing "level-0" BLAS [sd]cabs1_(). (c362afc5)
Fixed a minor bug in configure (issue #114). (018180c9)
Merge pull request #113 from devinamatthews/knl_thread_params (ddf45e71)
Change default threading parameters for KNL. (78e1b16e)
Added 1m-specific APIs for bp, pb gemm algorithms. (1c732d3d)
Merge pull request #111 from figual/master (a6ab91bc)
Fixed missing cntx argument in ARMv8 microkernels. (7f31a630)
Implemented the 1m method. (126482a3)
Switched to simpler trsm_r implementation. (145a551d)
Reimplemented 4x12 haswell ukernels (real only). (b3e58ee3)
Adjusted stride selection of ct in macrokernels. (bdc0a264)
Fixed inactive trsm_r blocksize constraint code. (031978d2)
Merge pull request #109 from devinamatthews/omp_num_threads (6b5a4032)
- Fix typo in bli_cntx.c - Bump BLIS_DEFAULT_NR_THREAD_MAX to 4 (a8220e3a)
Add automatic loop thread assignment. (c05b3862)
Consolidated 3m1/4m1 gemmtrsm, trsm ukernel code. (3b524a08)
Merge pull request #108 from devinamatthews/patch-2 (ead231ac)
Allow KNL to fail (62987f60)
Fix some problems with OSX builds: (8f901054)
Can disable trsm_r-specific blocksize constraints. (d25e6f8b)
Bogus commit (1a67e368)
Some fixes for .travis.yml (2cd82d67)
Update .travis.yml with additional tests (a3db4e6b)
Updates to non-default haswell microkernels. (8a11a217)
Align strides of ct in macrokernels to that of c. (618f4331)
never use libm with Intel compilers (c2c91e09)
Merge pull request #105 from devinamatthews/knl (63039100)
Fix up for merge to master. (216206c1)
Merge branch 'master' into knl (11eb7957)
Don't use %rbp in KNL packing kernels. (cd5b6681)
Merge pull request #104 from devinamatthews/misspellings (956b3edf)
Add flexible options for thread model (pthread/posix for pthreads etc.). (0662a3c1)
Merge pull request #103 from devinamatthews/patch-1 (b7e41d71)
Change .align to .p2align in Bulldozer ukernels (5117d444)
Merge pull request #93 from ShadenSmith/config_check (4bd905bd)
Fixed multithreading compilation bug in 970745a. (936d5fdc)
Removed auto-prototyping of malloc()/free() substitutes. (8feb0f85)
Reorganized typedefs to avoid compiler warnings. (970745a5)
Added disabled code to print thrinfo_t structures. (28b2af8a)
Fixed a configure -t omp/openmp bug from fd04869. (11eed3f6)
Removed previously renamed/old files. (9cda6057)
Fixed bli_gemm() segfault on empty C matrices. (22377abd)
Fixed segfault in bli_free_align() for NULL ptrs. (0b571cd9)
CHANGELOG update (0.2.1) (4fb9b4ef)
Adds sanity check to configuration choice. (7f32dd57)
Add prefetchw to 30x8 kernel. (c8e4ef93)
Merge remote-tracking branch 'origin/knl' into knl (4b5a2f3d)
Add (new) 30x8 KNL kernel and fix non-scatter prefetch bug. (380736bf)
Try prefetchw[t1] instead of regular prefetch for C. (9f52a587)
This version gets ~1550 GFLOPs on KNL wuth 16x4. (8945a151)
Switch back to 24x8. I could only squeeze 24.5GFLOP out of 8x24, and scalability is not improved. (6ce4c022)
Try an 8x24 kernel for the hell of it. (b8f2b555)
Allocate pack buffer on MCDRAM for KNL. (7ede5863)
Merge branch 'knl' of github.com:devinamatthews/blis into knl (ad89ed2e)
This version gets ~26GF on one core. (2c9de740)
Add optimized packing kernels for KNL. (81e2b05f)
All fixed. (a7d8ca97)
Add 24xk pack kernel. (963d0393)
In the midst of debugging. (117b7673)
Fix some row/column confusion. (8c0a4fd1)
Simplify displacements -- clang assembler was badly botching EVEX compressed displacements giving false alarms for instruction length. (c44f9f96)
Minor fixes for 8x24 KNL kernel. (e0cce177)
Switch to 24x8 kernel, unrolled by 16. (65735bbe)
Add 24x8 "KNC-style" kernel for KNL. (45d5dc97)
Add 4x unrolled variant for KNL microkernel. (8ff2e069)
Git rid of one RBX update. (9cb2ed9b)
Add some more knobs to twiddle for KNL microkernel. (451bde07)
Make knl conform to new kernel dir structure. (8c6e621c)
Merge remote-tracking branch 'origin/master' into knl (ce7214c6)
Add 8x24 KNL kernel. (119d0399)
Merge remote-tracking branch 'origin/master' into knl (b58cda9e)
Add new KNL microkernel derived from Haswell. (318f063d)
Fix SIMD definitions in KNL config, and a couple of fixes to C update. (e3bd5ca6)
Move bli_kernel.h before bli_threading.h in order of inclusion in blis.h. (4fe02e3d)
Merge branch 'move_simd_defs' into knl (619dee0d)
Merge branch 'master' into knl (b790b3d9)
Rearrange KNL dgemm kernel again to streamline usage of ymm register. sgemm and dgemm now both working with Intel SDE. (4f8c05c9)
Work around missing VPMULLQ on KNL. (7193230f)
Fix copy-paste errors in KNL kernels. (bd44cf13)
Add sgemm ukernels for KNL. vpmullq is not implemented on KNL -- needs workaround. (a11eec05)
Merge remote-tracking branch 'origin/master' into knl (c38e0dab)
Merge remote-tracking branch 'origin/knl' into knl (bd5e2296)
Add 64-bit offset vector so we can use vgatherqpd. (4745def0)
KNL ukernel compiles with gcc. (49f85177)
Rewrite of KNL kernel in GNU extended asm syntax. (58b2c3cf)
Translated MIC kernel to KNL and cleaned up a bit. Only real change is lack of swizzle modifiers for FMA instructions (used bcast from memory instead). (dd856c2c)
Copy mic kernel to knl for transliteration. (7f27431d)
Merge branch 'master' into const_correctness (f8f02f03)
Merge branch 'master' into const_correctness (32c92d94)
Merge branch 'master' into const_correctness (62914ccb)
Add missing const to bli_read_nway_from_env. (bbf704bf)
Set default value for debug_type variable. (a4d77297)
Add const correctness to auxinfo_t struct (microkernels need update theoretically). (0e2447fa)
2017-06-07 Updated version (0.2.1.126.g6e04f9df -> 0.2.1.128.g70cc825b). haawda
2017-05-29 initial upload haawda