Guidelines
(optimize speed)
simple-array
and avoiding broadcasts
The above numbers indicate how many times faster the corresponding lisp function is than the corresponding numpy or pytorch function. The numbers were obtained on an Intel i7-8750H running at 3GHz. The lisp code was compiled with (optimize speed) settings for arrays of sizes smaller than 80000. Appropriate type declarations were also used. The dense-numericals code used 6 threads for arrays of sizes larger than 80000, while PyTorch was left at its default settings, with torch.get_num_threads() returning a value of 6. Thus, lisp and PyTorch had the benefit of multithreading, while numpy did not.
SBCL Version: 2.2.6
Python Version: 3.8.5
Numpy Version: 1.19.0
PyTorch Version: 1.7.1.post2
And while it is true that the lisp code was inlined and compiled wherever appropriate, that is precisely the point. Note that lisp uses incremental compilation and incremental typing, giving you the benefits of dynamicity by default. In addition, certain compilers like SBCL also bring you the safety and performance of static typing. The safety and type guarantees certainly fall far short of those in languages like Haskell, but they get the work done.
None of the code used (safety 0) declarations; thus, in some cases, even higher performance may be attainable at the cost of safety and the risk of running into segmentation faults if types are declared incorrectly.
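As a rough illustration of what "(optimize speed) with appropriate type declarations, but without (safety 0)" can look like, here is a minimal sketch in plain Common Lisp. The function name sin-in-place is hypothetical and is not part of numericals or dense-numericals; the sketch also assumes an implementation that specialises single-float arrays, as SBCL does.

```lisp
;; Hypothetical helper for illustration only; not part of numericals.
;; Elementwise sine, written in place, using only standard Common Lisp.
(defun sin-in-place (array)
  (declare (optimize speed)                                ; no (safety 0) here
           (type (simple-array single-float (*)) array))   ; specialised 1-D array
  ;; With the element type declared, a compiler such as SBCL can open-code
  ;; AREF and work on unboxed single-floats inside the loop.
  (dotimes (i (length array) array)
    (setf (aref array i) (sin (aref array i)))))

;; Usage: the array is modified in place and returned.
(let ((a (make-array 3 :element-type 'single-float
                       :initial-contents '(0.0 1.0 2.0))))
  (sin-in-place a))
```

Because safety is left at its default, SBCL should treat the type declaration as an assertion and signal a type error when the argument does not match, instead of reading memory blindly.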
I think that keeping numericals / dense-numericals as projects separate from existing ones is justified, because several goals (listed in the table below) are unachievable for numcl or magicl without a significant rewrite or a change of goals or approach. These differences perhaps exist because numericals, dense-numericals and their dependencies depend on the CLTL2 API, while numcl and magicl try to stick to ANSI CL. On the other hand, it also seems that Common Lisp without the CLTL2 API would be terribly ill-suited for numerical computing. (See here for my wishlist of features for a numerical computing library.)
(optimize speed) and appropriate type declarations should result in maximal inlining and minimal runtime spent on function calls.

(numericals:sin array :out array :broadcast nil) can be shortened to (numericals:sin! array). (Goal 5)

Goal | numericals | numcl | magicl | Description |
---|---|---|---|---|
1. Maximal inlining | ✔ | ✔ | ? | needs compiler-macros or polymorphic-functions to dispatch on specialized arrays |
2. Inlining without code bloat | ✔ | ? | ? | needs separate handling of broadcast and non-broadcasting operations |
3. Compiler-notes | ✔ | ? | ? | can use compiler-macro-notes, but requires compiler-macros |
4. Numpy-like API | ✔ | ✔ | + | - |
5. OUT parameter | ✔ | + | ✔ | - |
6. Transparent printed representation | ✔ | ? | ✔ | needs wrapper structures/classes, for example dense-arrays or magicl:tensor |
7. Optional array broadcasting | ✔ | + | ? | - |
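
To make Goal 5 (the OUT parameter) and the sin / sin! shorthand mentioned earlier more concrete, here is a small sketch in plain Common Lisp. It is not the numericals implementation: my-sin and my-sin! are hypothetical names, only one-dimensional single-float arrays are handled, and broadcasting is left out entirely.

```lisp
;; Hypothetical sketch; not how numericals implements SIN or SIN!.
(defun my-sin (array &key (out (make-array (length array)
                                           :element-type 'single-float)))
  (declare (type (simple-array single-float (*)) array out))
  ;; No broadcasting in this sketch: input and output lengths must match.
  (assert (= (length array) (length out)))
  (dotimes (i (length array) out)
    (setf (aref out i) (sin (aref array i)))))

;; In-place shorthand: reuse the input array as the output array.
(defun my-sin! (array)
  (my-sin array :out array))

;; Usage: MY-SIN allocates a fresh result, MY-SIN! overwrites its argument.
(let ((a (make-array 3 :element-type 'single-float
                       :initial-contents '(0.0 1.0 2.0))))
  (my-sin a)    ; A is left untouched
  (my-sin! a))  ; A now holds the sines
```

Passing the output array explicitly lets callers avoid an allocation per call, which is one reason an OUT parameter matters in tight numerical loops.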