I noticed that v1.11.0 of Crystal is going to ship with some new compiler options that are meant to control how much optimization is done. This should help compile times a bit, so I decided to check out the git trunk, compile Crystal, and see how it handles Benben. And, at the same time, I decided to check the performance of the Common Lisp version of Benben to see how it's coming around, and to see if it's still a good idea to continue working on that port.
For these benchmarks, I used my desktop computer, which has these specifications:
Intel Core i9-10850K, 10 physical cores, 20 logical cores
64 GB RAM
Slackware 15.0 + a custom kernel v6.6.8
Far too much disk space on both HDDs and SSDs.
The software had these versions:
Crystal: git commit e3200d9eb8814e92849503e646325b09647cfe9e (the current WIP v1.11.0)
SBCL: v2.4.0
Benben (Crystal): fossil commit 93e7e91c61c31f2738fb9ef5804ea32bd2c727a1f31f3fa04f9b5f482db234e0
Benben (Lisp): fossil commit b9b57d54983253bf092ca319e2ac6205c210231e6ed8aa1a5c96315bc1c96b00
YunoSynth: fossil commit dd7d35f6457a78a25f15dbbab958e096d7bce626
SatouSynth: fossil commit 705f3f0624a216ccbf62ca134047c13c0d680efe (not online yet)
Things to remember:
YunoSynth will always be in Crystal. Same repo as always.
SatouSynth is a rewrite of YunoSynth in Common Lisp, and is in a separate repo.
Benben is getting ported, not forked. Same repo, different branches.
SatouSynth has about 9k lines of code right now; YunoSynth has 24,205.
The method I used for the Crystal benchmarks was this, in order:
Remove the Crystal cache (~/.cache/crystal)
Remove the /tmp/foo directory
"rake clean"
Build
Render a pre-defined playlist to /tmp/foo using the new binary.
For the Crystal version, I built with this command, adding the appropriate -Ox and --single-module flags as needed:
time shards build -p -Dpreview_mt -Dremiconf_no_hjson --no-debug -Dyunosynth_wd40 -Dremiaudio_wd40 -Dremisound_wd40
This is the same as the Rakefile would do for a release build, except I removed the "--release" parameter. Those "-Dxxx_wd40" defines are just to enable some unsafe optimizations in my libraries.
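As a rough sketch of how the per-mode command lines fit together, here is a small Python helper that appends the optimization flags to the base command above. The mode labels match the ones used in the results tables; the helper names are made up for illustration, and passing a literal --release for the "release" mode is an assumption (the docs describe it as equivalent to "-O3 --single-module").

```python
# Sketch: build the `shards build` command line for each benchmark mode.
# BASE_ARGS is copied from the command shown above. The function names and
# the use of a literal --release flag for "release" are assumptions.
BASE_ARGS = [
    "shards", "build", "-p",
    "-Dpreview_mt", "-Dremiconf_no_hjson", "--no-debug",
    "-Dyunosynth_wd40", "-Dremiaudio_wd40", "-Dremisound_wd40",
]

MODES = ["O0", "O1", "O2", "O3", "O0sm", "O1sm", "O2sm", "release"]


def mode_flags(mode):
    """Map a mode label such as "O2" or "O1sm" to extra compiler flags."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if mode == "release":
        return ["--release"]  # assumption: same as -O3 --single-module
    flags = ["-" + mode[:2]]  # "O1sm" -> "-O1"
    if mode.endswith("sm"):
        flags.append("--single-module")
    return flags


def build_command(mode):
    """Return the full argv list for building Benben in the given mode."""
    return BASE_ARGS + mode_flags(mode)


if __name__ == "__main__":
    # Only prints the commands; nothing is executed.
    for mode in MODES:
        print(f"{mode:8s}", " ".join(build_command(mode)))
```

Prefixing each printed command with "time" gives the build-time measurement described above.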
Building Benben for Common Lisp is a bit different right now. I have some "prepare-deps-XXX.sh" scripts that pre-build some of the dependencies for either a debug or release build. Then, I use "make release=1" or "make debug=1" to build the binary. The two-step approach is just to match my personal workflow, and because I can get away with pre-building most stuff with Lisp and then not rebuilding it later. This is just temporary, but it does work.
So, building for Common Lisp was done like this:
Remove the Common Lisp build cache (~/.cache/common-lisp)
Remove the /tmp/foo directory
"make clean"
"time ./prepare-deps-release.sh"
"time make release=1"
Render a pre-defined playlist to /tmp/foo using the new binary.
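Both procedures boil down to running a fixed list of steps in order and timing some of them. A minimal Python sketch of that idea, using the Common Lisp steps as the example, could look like the following. The step labels and the run_steps helper are hypothetical, and the render step is left out because the Benben command line isn't shown here.

```python
import os
import subprocess
import time

# (label, argv, timed?) for each Common Lisp step listed above.
# Labels are made up for illustration; the commands come from the list.
LISP_STEPS = [
    ("clear cache", ["rm", "-rf", os.path.expanduser("~/.cache/common-lisp")], False),
    ("clear output", ["rm", "-rf", "/tmp/foo"], False),
    ("make clean", ["make", "clean"], False),
    ("prepare deps", ["./prepare-deps-release.sh"], True),
    ("build", ["make", "release=1"], True),
]


def run_steps(steps, runner=subprocess.run):
    """Run each step in order and return {label: seconds} for the timed ones.

    `runner` defaults to subprocess.run; passing a stand-in makes a dry run
    possible without touching the filesystem.
    """
    timings = {}
    for label, argv, timed in steps:
        start = time.perf_counter()
        runner(argv, check=True)  # raises if a step fails
        elapsed = time.perf_counter() - start
        if timed:
            timings[label] = elapsed
    return timings

# Real run (destructive: deletes the cache and /tmp/foo):
#   timings = run_steps(LISP_STEPS)
```

Stopping on the first failed step (check=True) keeps a broken build from silently producing a bogus timing.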
Finally, I rendered a playlist containing 79 songs to a directory on an SSD ("/tmp/foo"). The songs were mostly from X68000, with a few MSX2 songs thrown in as well. These were chosen because both versions of Benben currently support their chips (I could have thrown in the PC Engine as well, but oh well). So the chip emulators exercised were the YM2151, the OKI MSM6258, and the AY-3-8910.
The times are all in seconds. Lower is better.
The "Avg. Samples/sec" column is the average number of audio samples that Benben is rendering per second. Higher is better.
The "Mode" column indicates how Benben was built. For Crystal, this is -Ox (so -O1, -O3, etc., without --single-module) or -Oxsm (the same, but with --single-module added). According to the docs, Crystal's --release is the same as "-O3 --single-module", so I just labeled that one "release".
My Common Lisp setup only has two modes right now: release or debug.
Build Times (Crystal):
Mode      Time
O0        9
O1        15
O2        16
O3        33
O0sm      12
O1sm      81
O2sm      87
release   91
Run Times (Crystal):
Mode      Time   Avg. Samples/sec
O0        204    926,130
O1        93     2,022,830
O2        85     2,228,536
O3        87     2,178,511
O0sm      200    939,526
O1sm      32     5,864,521
O2sm      28     6,830,182
release   27     7,071,019
Build Times (Common Lisp):
Mode      Time
debug     66
release   38
Run Times (Common Lisp):
Mode      Time   Avg. Samples/sec
debug     120    1,578,634
release   41     4,633,274
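A quick sanity check on these numbers: if "Avg. Samples/sec" is roughly the total samples rendered divided by the run time (an assumption, since how Benben computes it isn't stated here), then time multiplied by rate should come out about the same for every row, because every run renders the same playlist. The Python sketch below does that multiplication with the figures from the run-time tables; the names are made up for illustration.

```python
# (mode, run time in seconds, average samples/sec) copied from the tables.
RUNS = [
    ("O0", 204, 926_130),
    ("O1", 93, 2_022_830),
    ("O2", 85, 2_228_536),
    ("O3", 87, 2_178_511),
    ("O0sm", 200, 939_526),
    ("O1sm", 32, 5_864_521),
    ("O2sm", 28, 6_830_182),
    ("release", 27, 7_071_019),
    ("lisp debug", 120, 1_578_634),
    ("lisp release", 41, 4_633_274),
]


def implied_total_samples(seconds, samples_per_sec):
    """Total samples implied by a run, assuming rate = total / time."""
    return seconds * samples_per_sec


totals = {mode: implied_total_samples(t, rate) for mode, t, rate in RUNS}

# Relative spread between the largest and smallest implied totals.
spread = (max(totals.values()) - min(totals.values())) / min(totals.values())

if __name__ == "__main__":
    for mode, total in totals.items():
        print(f"{mode:13s} {total:>13,d}")
    print(f"spread: {spread:.1%}")
```

The implied totals all land within a few percent of each other, which is about what rounding the run times to whole seconds would explain on the shortest runs.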
So there we have it. The new -Ox options are... kinda cool, I guess? I can see myself using -O1 or -O2 to build a quick debug binary during development, and pretty much abandoning -O0 for most of my use cases. Overall, though, the difference between them is pretty underwhelming unless you add --single-module, and that doesn't look like something I would ever do because of the increased build times (I'm not counting --release here). So really, I don't see the new options being much use for me.
On the Lisp side, it's interesting to see that the release build took *less* time than the debug build. This is pretty neat, though not useful during development since I'd then lack debug information. The speed of a release build is quite good, though doing a "-O1 --single-module" is enough to inch past it with Crystal. Still, it's impressive.
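To put rough numbers on those comparisons, the Python sketch below computes each Crystal mode's run-time speedup and build-time cost relative to -O0, using only the times from the tables. The variable and function names are made up for illustration.

```python
# Build and run times in seconds, copied from the Crystal tables.
BUILD = {"O0": 9, "O1": 15, "O2": 16, "O3": 33,
         "O0sm": 12, "O1sm": 81, "O2sm": 87, "release": 91}
RUN = {"O0": 204, "O1": 93, "O2": 85, "O3": 87,
       "O0sm": 200, "O1sm": 32, "O2sm": 28, "release": 27}

# Run time of the Common Lisp release build, for comparison.
LISP_RELEASE_RUN = 41


def run_speedup(mode, baseline="O0"):
    """How many times faster `mode` renders than the baseline."""
    return RUN[baseline] / RUN[mode]


def build_cost(mode, baseline="O0"):
    """How many times longer `mode` takes to build than the baseline."""
    return BUILD[mode] / BUILD[baseline]


if __name__ == "__main__":
    print(f"{'mode':8s} {'run speedup':>12s} {'build cost':>11s}")
    for mode in RUN:
        print(f"{mode:8s} {run_speedup(mode):>11.2f}x {build_cost(mode):>10.2f}x")
    # Crystal modes that render faster than the Lisp release build.
    faster = [m for m in RUN if RUN[m] < LISP_RELEASE_RUN]
    print("faster than Lisp release:", ", ".join(faster))
```

Only the three optimized --single-module builds (O1sm, O2sm, and release) beat the Lisp release build's 41 seconds, and each of them takes more than 80 seconds to build.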
I'm not sure yet whether I'll use this info to decide what to do, or what I'll even do. So for now, I'll continue on my present course. Regardless, it was interesting to see the new compiler options for Crystal.