Planet Lisp
http://planet.lisp.org/
Quicklisp news: July 2018 Quicklisp dist update now available
http://blog.quicklisp.org/2018/07/july-2018-quicklisp-dist-update-now.html
Hi everyone! There's a new Quicklisp update for July, and regular updates should resume on a monthly schedule.<br /><br />I'm using a new release system that <a href="https://github.com/quicklisp/quicklisp-docker">involves Docker</a> for easier build server setup and management. It took a while to get going but it should (eventually) make it easier for others to run things in an environment similar to mine. For example, it has all the required foreign libraries needed to compile and load everything in Quicklisp.<br /><br />Here's the info for the new update:<br /><b><br /></b><b>New projects</b>:<br /><ul><li><a href="https://github.com/phantomics/april/">april</a> — April is a subset of the APL programming language that compiles to Common Lisp. — Apache-2.0</li><li><a href="https://github.com/stablecross/aws-foundation/">aws-foundation</a> — Amazon AWS low-level utilities — BSD</li><li><a href="https://github.com/mgi/binary-io/">binary-io</a> — Library for reading and writing binary data. — BSD</li><li><a href="https://github.com/zodmaner/cl-bip39/">cl-bip39</a> — A Common Lisp implementation of BIP-0039 — MIT</li><li><a href="https://github.com/diasbruno/cl-bnf/">cl-bnf</a> — A simple BNF parser. — MIT</li><li><a href="https://github.com/ukari/cl-generator">cl-generator</a> — cl-generator, a generator implementation for common lisp — MIT</li><li><a href="https://github.com/defaultxr/cl-patterns/">cl-patterns</a> — Pattern library for algorithmic music composition and performance in Common Lisp. — GNU General Public License v3.0</li><li><a href="https://github.com/sirherrbatka/cl-progress-bar/">cl-progress-bar</a> — Display progress bars directly in REPL. — MIT</li><li><a href="https://gitlab.common-lisp.net/mantoniotti/CLAD">clad</a> — The CLAD System. — BSD</li><li><a href="https://github.com/robert-strandh/Concrete-Syntax-Tree/">concrete-syntax-tree</a> — Library for parsing Common Lisp code into a concrete syntax tree. 
— FreeBSD</li><li><a href="https://github.com/Shinmera/definitions">definitions</a> — General definitions reflection library. — Artistic</li><li><a href="https://github.com/robert-strandh/eclector/">eclector</a> — A Common Lisp reader that can adapt to different implementations, and that can return Concrete Syntax Trees — BSD</li><li><a href="https://github.com/ailisp/flute/">flute</a> — A beautiful, easily composable HTML5 generation library — MIT</li><li><a href="https://github.com/FungusHumungus/froute/">froute</a> — An HTTP routing class that takes advantage of the MOP — MIT</li><li><a href="https://github.com/Shinmera/language-codes">language-codes</a> — A small library mapping language codes to language names. — Artistic</li><li><a href="https://github.com/Shirakumo/lichat-ldap">lichat-ldap</a> — LDAP backend for the Lichat server profiles. — Artistic</li><li><a href="https://github.com/Shinmera/multilang-documentation">multilang-documentation</a> — A drop-in replacement for CL:DOCUMENTATION providing multi-language docstrings — Artistic</li><li><a href="https://github.com/Shinmera/multiposter">multiposter</a> — A small application to post to multiple services at once. — Artistic</li><li><a href="https://github.com/Bike/sandalphon.lambda-list/">sandalphon.lambda-list</a> — Lambda list parsing and usage — WTFPL</li><li><a href="https://github.com/GrammaTech/sel/">sel</a> — Programmatic modification and evaluation of software — GPL 3</li><li><a href="https://github.com/Shinmera/system-language">system-locale</a> — System locale and language discovery — Artistic</li><li><a href="https://github.com/mv2devnul/taglib/">taglib</a> — Pure Lisp implementation to read (and write, perhaps, one day) tags — UNLICENSE </li><li><a href="https://github.com/Shirakumo/terrable">terrable</a> — Terragen TER file format reader — Artistic</li><li><a href="https://github.com/Shinmera/tooter">tooter</a> — A client library for Mastodon instances.
— Artistic</li><li><a href="https://github.com/GrammaTech/trace-db/">trace-db</a> — Writing, reading, storing, and searching of program traces — GPL 3</li><li><a href="https://www.michaelfiano.com/projects/umbra">umbra</a> — A library of reusable GPU shader functions. — MIT</li></ul><b>Updated projects</b>: <a href="http://quickdocs.org/3d-matrices/">3d-matrices</a>, <a href="http://quickdocs.org/3d-vectors/">3d-vectors</a>, <a href="http://quickdocs.org/agnostic-lizard/">agnostic-lizard</a>, <a href="http://quickdocs.org/alexa/">alexa</a>, <a href="http://quickdocs.org/aws-sign4/">aws-sign4</a>, <a href="http://quickdocs.org/base-blobs/">base-blobs</a>, <a href="http://quickdocs.org/bodge-blobs-support/">bodge-blobs-support</a>, <a href="http://quickdocs.org/bodge-chipmunk/">bodge-chipmunk</a>, <a href="http://quickdocs.org/bodge-nanovg/">bodge-nanovg</a>, <a href="http://quickdocs.org/bodge-nuklear/">bodge-nuklear</a>, <a href="http://quickdocs.org/bordeaux-threads/">bordeaux-threads</a>, <a href="http://quickdocs.org/bt-semaphore/">bt-semaphore</a>, <a href="http://quickdocs.org/caveman/">caveman</a>, <a href="http://quickdocs.org/cepl/">cepl</a>, <a href="http://quickdocs.org/cepl.drm-gbm/">cepl.drm-gbm</a>, <a href="http://quickdocs.org/cerberus/">cerberus</a>, <a href="http://quickdocs.org/chirp/">chirp</a>, <a href="http://quickdocs.org/cl-algebraic-data-type/">cl-algebraic-data-type</a>, <a href="http://quickdocs.org/cl-async/">cl-async</a>, <a href="http://quickdocs.org/cl-autorepo/">cl-autorepo</a>, <a href="http://quickdocs.org/cl-cognito/">cl-cognito</a>, <a href="http://quickdocs.org/cl-conllu/">cl-conllu</a>, <a href="http://quickdocs.org/cl-darksky/">cl-darksky</a>, <a href="http://quickdocs.org/cl-flow/">cl-flow</a>, <a href="http://quickdocs.org/cl-forms/">cl-forms</a>, <a href="http://quickdocs.org/cl-gamepad/">cl-gamepad</a>, <a href="http://quickdocs.org/cl-geos/">cl-geos</a>, <a 
href="http://quickdocs.org/cl-gobject-introspection/">cl-gobject-introspection</a>, <a href="http://quickdocs.org/cl-gopher/">cl-gopher</a>, <a href="http://quickdocs.org/cl-hamcrest/">cl-hamcrest</a>, <a href="http://quickdocs.org/cl-interpol/">cl-interpol</a>, <a href="http://quickdocs.org/cl-liballegro/">cl-liballegro</a>, <a href="http://quickdocs.org/cl-libsvm-format/">cl-libsvm-format</a>, <a href="http://quickdocs.org/cl-mechanize/">cl-mechanize</a>, <a href="http://quickdocs.org/cl-mpi/">cl-mpi</a>, <a href="http://quickdocs.org/cl-muth/">cl-muth</a>, <a href="http://quickdocs.org/cl-online-learning/">cl-online-learning</a>, <a href="http://quickdocs.org/cl-pslib/">cl-pslib</a>, <a href="http://quickdocs.org/cl-python/">cl-python</a>, <a href="http://quickdocs.org/cl-random-forest/">cl-random-forest</a>, <a href="http://quickdocs.org/cl-readline/">cl-readline</a>, <a href="http://quickdocs.org/cl-redis/">cl-redis</a>, <a href="http://quickdocs.org/cl-rules/">cl-rules</a>, <a href="http://quickdocs.org/cl-sdl2/">cl-sdl2</a>, <a href="http://quickdocs.org/cl-str/">cl-str</a>, <a href="http://quickdocs.org/cl-toml/">cl-toml</a>, <a href="http://quickdocs.org/cl-yesql/">cl-yesql</a>, <a href="http://quickdocs.org/clack/">clack</a>, <a href="http://quickdocs.org/claw/">claw</a>, <a href="http://quickdocs.org/clip/">clip</a>, <a href="http://quickdocs.org/closer-mop/">closer-mop</a>, <a href="http://quickdocs.org/closure-html/">closure-html</a>, <a href="http://quickdocs.org/clss/">clss</a>, <a href="http://quickdocs.org/clx/">clx</a>, <a href="http://quickdocs.org/codex/">codex</a>, <a href="http://quickdocs.org/coleslaw/">coleslaw</a>, <a href="http://quickdocs.org/common-lisp-actors/">common-lisp-actors</a>, <a href="http://quickdocs.org/configuration.options/">configuration.options</a>, <a href="http://quickdocs.org/croatoan/">croatoan</a>, <a href="http://quickdocs.org/curry-compose-reader-macros/">curry-compose-reader-macros</a>, <a 
href="http://quickdocs.org/delta-debug/">delta-debug</a>, <a href="http://quickdocs.org/deploy/">deploy</a>, <a href="http://quickdocs.org/dexador/">dexador</a>, <a href="http://quickdocs.org/djula/">djula</a>, <a href="http://quickdocs.org/dml/">dml</a>, <a href="http://quickdocs.org/documentation-utils/">documentation-utils</a>, <a href="http://quickdocs.org/documentation-utils-extensions/">documentation-utils-extensions</a>, <a href="http://quickdocs.org/doubly-linked-list/">doubly-linked-list</a>, <a href="http://quickdocs.org/dufy/">dufy</a>, <a href="http://quickdocs.org/dynamic-mixins/">dynamic-mixins</a>, <a href="http://quickdocs.org/elf/">elf</a>, <a href="http://quickdocs.org/fare-scripts/">fare-scripts</a>, <a href="http://quickdocs.org/femlisp/">femlisp</a>, <a href="http://quickdocs.org/fft/">fft</a>, <a href="http://quickdocs.org/flare/">flare</a>, <a href="http://quickdocs.org/flexi-streams/">flexi-streams</a>, <a href="http://quickdocs.org/for/">for</a>, <a href="http://quickdocs.org/freebsd-sysctl/">freebsd-sysctl</a>, <a href="http://quickdocs.org/fxml/">fxml</a>, <a href="http://quickdocs.org/gamebox-frame-manager/">gamebox-frame-manager</a>, <a href="http://quickdocs.org/gamebox-math/">gamebox-math</a>, <a href="http://quickdocs.org/glsl-spec/">glsl-spec</a>, <a href="http://quickdocs.org/glsl-toolkit/">glsl-toolkit</a>, <a href="http://quickdocs.org/glyphs/">glyphs</a>, <a href="http://quickdocs.org/golden-utils/">golden-utils</a>, <a href="http://quickdocs.org/graph/">graph</a>, <a href="http://quickdocs.org/gsll/">gsll</a>, <a href="http://quickdocs.org/harmony/">harmony</a>, <a href="http://quickdocs.org/helambdap/">helambdap</a>, <a href="http://quickdocs.org/hunchensocket/">hunchensocket</a>, <a href="http://quickdocs.org/ironclad/">ironclad</a>, <a href="http://quickdocs.org/jose/">jose</a>, <a href="http://quickdocs.org/lichat-protocol/">lichat-protocol</a>, <a href="http://quickdocs.org/lichat-serverlib/">lichat-serverlib</a>, <a 
href="http://quickdocs.org/lichat-tcp-server/">lichat-tcp-server</a>, <a href="http://quickdocs.org/lisp-chat/">lisp-chat</a>, <a href="http://quickdocs.org/listopia/">listopia</a>, <a href="http://quickdocs.org/maiden/">maiden</a>, <a href="http://quickdocs.org/mcclim/">mcclim</a>, <a href="http://quickdocs.org/media-types/">media-types</a>, <a href="http://quickdocs.org/mito/">mito</a>, <a href="http://quickdocs.org/nineveh/">nineveh</a>, <a href="http://quickdocs.org/omer-count/">omer-count</a>, <a href="http://quickdocs.org/opticl/">opticl</a>, <a href="http://quickdocs.org/overlord/">overlord</a>, <a href="http://quickdocs.org/oxenfurt/">oxenfurt</a>, <a href="http://quickdocs.org/parachute/">parachute</a>, <a href="http://quickdocs.org/parseq/">parseq</a>, <a href="http://quickdocs.org/parser.ini/">parser.ini</a>, <a href="http://quickdocs.org/pathname-utils/">pathname-utils</a>, <a href="http://quickdocs.org/pgloader/">pgloader</a>, <a href="http://quickdocs.org/physical-quantities/">physical-quantities</a>, <a href="http://quickdocs.org/plump/">plump</a>, <a href="http://quickdocs.org/postmodern/">postmodern</a>, <a href="http://quickdocs.org/ppath/">ppath</a>, <a href="http://quickdocs.org/pythonic-string-reader/">pythonic-string-reader</a>, <a href="http://quickdocs.org/qbase64/">qbase64</a>, <a href="http://quickdocs.org/qlot/">qlot</a>, <a href="http://quickdocs.org/qt-libs/">qt-libs</a>, <a href="http://quickdocs.org/qtools/">qtools</a>, <a href="http://quickdocs.org/random-sample/">random-sample</a>, <a href="http://quickdocs.org/rove/">rove</a>, <a href="http://quickdocs.org/rtg-math/">rtg-math</a>, <a href="http://quickdocs.org/s-dot2/">s-dot2</a>, <a href="http://quickdocs.org/serapeum/">serapeum</a>, <a href="http://quickdocs.org/shadow/">shadow</a>, <a href="http://quickdocs.org/simple-flow-dispatcher/">simple-flow-dispatcher</a>, <a href="http://quickdocs.org/slime/">slime</a>, <a href="http://quickdocs.org/spinneret/">spinneret</a>, <a 
href="http://quickdocs.org/staple/">staple</a>, <a href="http://quickdocs.org/string-case/">string-case</a>, <a href="http://quickdocs.org/stumpwm/">stumpwm</a>, <a href="http://quickdocs.org/sxql/">sxql</a>, <a href="http://quickdocs.org/the-cost-of-nothing/">the-cost-of-nothing</a>, <a href="http://quickdocs.org/trivia/">trivia</a>, <a href="http://quickdocs.org/trivial-ldap/">trivial-ldap</a>, <a href="http://quickdocs.org/trivial-mmap/">trivial-mmap</a>, <a href="http://quickdocs.org/ubiquitous/">ubiquitous</a>, <a href="http://quickdocs.org/uiop/">uiop</a>, <a href="http://quickdocs.org/unit-formula/">unit-formula</a>, <a href="http://quickdocs.org/varjo/">varjo</a>, <a href="http://quickdocs.org/websocket-driver/">websocket-driver</a>, <a href="http://quickdocs.org/whofields/">whofields</a>, <a href="http://quickdocs.org/xhtmlambda/">xhtmlambda</a>, <a href="http://quickdocs.org/xlsx/">xlsx</a>, <a href="http://quickdocs.org/xml-emitter/">xml-emitter</a>.<br /><br /><b>Removed projects</b>: binge, black-tie, cl-ctrnn, cl-directed-graph, cl-ledger, cl-scan, readable, spartns.<br /><br />The removed projects either didn't build (cl-directed-graph) or could no longer be found for download (everything else).<br /><br />To get this update, use (ql:update-dist "quicklisp").<br /><br />Enjoy!<br /><br />Wed, 11 Jul 2018 15:34:00 GMT

Paul Khuong: The Confidence Sequence Method: A Computer-age Test for Statistical SLOs
http://www.pvk.ca/Blog/2018/07/06/testing-slo-type-properties-with-the-confidence-sequence-method/
<p><em>This post goes over <a href="https://github.com/pkhuong/csm">some code that I pushed to github today</a>. All the snippets below should be <a href="https://github.com/pkhuong/csm/blob/master/csm.lisp">in the repo</a>,
which also <a href="https://github.com/pkhuong/csm/blob/master/csm.h">includes</a>
<a href="https://github.com/pkhuong/csm/blob/master/csm.c">C</a> and
<a href="https://github.com/pkhuong/csm/blob/master/csm.py">Python code</a> with the same structure.</em></p>
<p>I recently resumed thinking about balls and bins for hash tables. This
time, I’m looking at large bins (on the order of one 2MB huge page).
There are
<a href="https://en.wikipedia.org/wiki/Cuckoo_hashing">many</a>
<a href="https://xlinux.nist.gov/dads/HTML/twoLeftHashing.html">hashing</a>
<a href="https://en.wikipedia.org/wiki/Dynamic_perfect_hashing">methods</a>
with solid worst-case guarantees that unfortunately query multiple
uncorrelated locations; I feel like we could automatically adapt them
to modern hierarchical storage (or address translation) to make them
more efficient, for a small loss in density.</p>
<p>In theory,
<a href="https://en.wikipedia.org/wiki/Balls_into_bins#Random_allocation">large enough bins can be allocated statically with a minimal waste of space</a>.
I wanted some actual non-asymptotic numbers, so I
<a href="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/balls-and-bins.ispc">ran numerical experiments</a> and got the following distribution of
global utilisation (fill rate) when the first bin fills up.</p>
<p><a href="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/grid.png">
<img class="center" src="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/grid-small.png" />
</a></p>
<p>It looks like, even with one thousand bins of thirty thousand values,
we can expect almost 98% space utilisation until the first bin
saturates. I want something more formal.</p>
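<p>To make the experiment concrete, here's a small Python sketch of the same measurement (my toy re-implementation, not the ISPC code linked above; the bin count and capacity are scaled down so it runs quickly):</p>

```python
import random

def utilisation_at_first_full(num_bins, capacity, rng):
    """Throw balls into bins uniformly at random until some bin reaches
    its capacity; return the global fill rate at that instant."""
    counts = [0] * num_bins
    total = 0
    while True:
        i = rng.randrange(num_bins)
        counts[i] += 1
        total += 1
        if counts[i] == capacity:
            return total / (num_bins * capacity)

rng = random.Random(42)
samples = [utilisation_at_first_full(100, 1000, rng) for _ in range(10)]
print(min(samples), max(samples))  # both around 0.9 at this scale
```

<p>Even at this much smaller scale, utilisation at first saturation is already high; the full 1000-bin, 30000-capacity setting should concentrate more tightly still.</p>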
<p>Could I establish something like a service level objective, “When
distributing balls randomly between one thousand bins with individual
capacity of thirty thousand balls, we can utilise at least 98% of the
total space before a bin fills up, x% of the time?”</p>
<p>The natural way to compute the “x%” that makes the proposition
true is to first fit a distribution on the observed data, then find
out the probability mass for that distribution that lies above 98%
fill rate. Fitting distributions takes a lot of judgment, and I’m not
sure I trust myself that much.</p>
<p>Alternatively, we can observe independent
identically distributed fill rates, check if they achieve 98% space
utilisation, and bound the success rate for this
<a href="https://en.wikipedia.org/wiki/Bernoulli_process">Bernoulli process</a>.</p>
<p>There are some non-trivial questions associated with this approach.</p>
<ol>
<li>How do we know when to stop generating more observations... without
fooling ourselves with \(p\)-hacking?</li>
<li>How can we generate something like a confidence interval for the
success rate?</li>
</ol>
<p>Thankfully, I have been sitting on a
<a href="https://github.com/pkhuong/csm">software package</a>
to compute satisfaction rates for exactly this kind of <a href="https://landing.google.com/sre/book/chapters/service-level-objectives.html">SLO</a>-type property,
properties of the form “this indicator satisfies $PREDICATE x% of the
time,” with arbitrarily bounded false positive rates.</p>
<p>The code takes care of adaptive stopping, generates a credible
interval, and spits out a report like this <a href="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/1k-30k-report.svg"><img src="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/1k-30k-report.svg" width="100px" /></a>:
we see the threshold (0.98), the empirical success rate estimate
(0.993 ≫ 0.98), a credible interval for the success rate, and
the shape of the probability mass for success rates.</p>
<p>This post shows how to compute credible intervals for the Bernoulli’s
success rate, how to implement a dynamic stopping criterion, and how
to combine the two while compensating for multiple hypothesis
testing. It also gives two examples of converting more general
questions to SLO form, and answers them with the same code.</p>
<h1>Credible intervals for the Binomial</h1>
<p>If we run the same experiment \(n\) times, and observe \(a\)
successes (\(b = n - a\) failures), it’s natural to ask for
an estimate of the success rate \(p\) for the underlying
<a href="https://en.wikipedia.org/wiki/Bernoulli_process">Bernoulli process</a>,
assuming the observations are independent and identically distributed.</p>
<p>Intuitively, that estimate should be close to \(a / n\), the
empirical success rate, but that’s not enough. I also want something
that reflects the uncertainty associated with small \(n\), much like
in the following
<a href="http://serialmentor.com/blog/2017/9/15/goodbye-joyplots">ridge line plot</a>,
where different phrases are assigned not only a different average
probability, but also a different spread.</p>
<p><a href="https://github.com/zonination/perceptions">
<img class="center" src="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/perception-probability.png" />
</a></p>
<p>I’m looking for an interval of plausible success rates \(p\) that
responds to both the empirical success rate \(a / n\) and the sample
size \(n\); that interval should be centered around \(a / n\), be
wide when \(n\) is small, and become gradually tighter as \(n\)
increases.</p>
<p>The Bayesian approach is straightforward, if we’re willing to shut up
and calculate. Once we fix the underlying success rate \(p = \hat{p}\),
the conditional probability of observing \(a\) successes and \(b\)
failures is</p>
<p>\[P((a, b) | p = \hat{p}) \sim \hat{p}\sp{a} \cdot (1 - \hat{p})\sp{b},\]</p>
<p>where the right-hand side is a proportion<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>, rather than a
probability.</p>
<p>We can now apply
<a href="https://en.wikipedia.org/wiki/Bayes%27_theorem">Bayes’s theorem</a> to
invert the condition and the event. The inversion will give us the
conditional probability that \(p = \hat{p}\), given that we observed
\(a\) successes and \(b\) failures. We only need to impose a
prior distribution on the underlying rate \(p\). For simplicity,
I’ll go with the uniform \(U[0, 1]\), i.e., every success rate is
equally plausible, at first. We find</p>
<p>\[P(p = \hat{p} | (a, b)) = \frac{P((a, b) | p = \hat{p}) P(p = \hat{p})}{P(a, b)}.\]</p>
<p>We already picked the uniform prior,
\(P(p = \hat{p}) = 1\,\forall \hat{p}\in [0,1],\)
and the denominator is a constant with respect to \(\hat{p}\).
The expression simplifies to</p>
<p>\[P(p = \hat{p} | (a, b)) \sim \hat{p}\sp{a} \cdot (1 - \hat{p})\sp{b},\]</p>
<p>or, if we normalise to obtain a probability,</p>
<p>\[P(p = \hat{p} | (a, b)) = \frac{\hat{p}\sp{a} \cdot (1 - \hat{p})\sp{b}}{\int\sb{0}\sp{1} \hat{p}\sp{a} \cdot (1 - \hat{p})\sp{b}\, d\hat{p}} = \textrm{Beta}(a+1, b+1).\]</p>
<p>A bit of calculation, and we find that our credibility estimate for
the underlying success rate follows a
<a href="https://en.wikipedia.org/wiki/Beta_distribution">Beta distribution</a>. If
one is really into statistics, they can observe that the uniform prior
distribution is just the \(\textrm{Beta}(1, 1)\) distribution, and
rederive that the Beta is the
<a href="https://en.wikipedia.org/wiki/Conjugate_prior_distribution">conjugate distribution</a>
for the
<a href="https://en.wikipedia.org/wiki/Binomial_distribution">Binomial</a>
distribution.</p>
<p>For me, it suffices to observe that the distribution
\(\textrm{Beta}(a+1, b+1)\) is unimodal, does peak around
\(a / (a + b)\), and becomes tighter as the number of observations
grows. In the following image, I plotted three Beta distributions, all
with empirical success rate 0.9; red corresponds to \(n = 10\)
(\(a = 9\), \(b = 1\), \(\textrm{Beta}(10, 2)\)), black to
\(n = 100\) (\(\textrm{Beta}(91, 11)\)), and blue to \(n =
1000\) (\(\textrm{Beta}(901, 101)\)).</p>
<p><a href="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/credible-beta-large.png">
<img class="center" src="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/credible-beta.png" />
</a></p>
<p>We calculated, and we got something that matches my
intuition. Before trying to understand <em>what it means</em>, let’s take a
detour to simply plot points from that un-normalised proportion function
\(\hat{p}\sp{a} \cdot (1 - \hat{p})\sp{b}\), on an arbitrary \(y\)
axis.</p>
<p>Let \(\hat{p} = 0.4\), \(a = 901\), \(b = 101\). Naïvely
entering the expression at the REPL yields nothing useful.</p>
<pre><code>CL-USER> (* (expt 0.4d0 901) (expt (- 1 0.4d0) 101))
0.0d0
</code></pre>
<p>The issue here is that the un-normalised proportion is so small that
it underflows double floats and becomes a round zero. We can
guess that the normalisation factor \(\frac{1}{\mathrm{Beta}(\cdot,\cdot)}\)
quickly grows very large, which will bring its own set of issues
when we do care about the normalised probability.</p>
<p>How can we renormalise a set of points without underflow? The usual
trick to handle extremely small or large magnitudes is to work in the
log domain. Rather than computing
\(\hat{p}\sp{a} \cdot (1 - \hat{p})\sp{b}\), we shall compute</p>
<p>\[\log\left[\hat{p}\sp{a} \cdot (1 - \hat{p})\sp{b}\right] = a \log\hat{p} + b \log (1 - \hat{p}).\]</p>
<pre><code>CL-USER> (+ (* 901 (log 0.4d0)) (* 101 (log (- 1 0.4d0))))
-877.1713374189787d0
CL-USER> (exp *)
0.0d0
</code></pre>
<p>That’s somewhat better: the log-domain value is not \(-\infty\), but
converting it back to a regular value still gives us 0.</p>
<p>The \(\log\) function is monotonic, so we can find the maximum
proportion value for a set of points, and divide everything by that
maximum value to get plottable points. There’s one last thing that
should change: when \(x\) is small, \(1 - x\) will round most of
\(x\) away.
<a href="https://www.johndcook.com/blog/2010/06/07/math-library-functions-that-seem-unnecessary/">Instead of <code>(log (- 1 x))</code>, we should use <code>(log1p (- x))</code></a>
to compute \(\log (1 + -x) = \log (1 - x)\). Common
Lisp did not standardise <a href="https://linux.die.net/man/3/log1p"><code>log1p</code></a>,
but SBCL does have it in internals, as a wrapper around <code>libm</code>. We’ll
just abuse that for now.</p>
<pre><code>CL-USER> (defun proportion (x) (+ (* 901 (log x)) (* 101 (sb-kernel:%log1p (- x)))))
PROPORTION
CL-USER> (defparameter *points* (loop for i from 1 upto 19 collect (/ i 20d0)))
*POINTS*
CL-USER> (reduce #'max *points* :key #'proportion)
-327.4909190001001d0
</code></pre>
<p>We have to normalise in the log domain, which is simply a subtraction:
\(\log(x / y) = \log x - \log y\). In the case above, we will subtract
\(-327.49\ldots\), or add <em>a massive</em> \(327.49\ldots\) to each log
proportion (i.e., multiply by \(10\sp{142}\)). The resulting values
should have a reasonably non-zero range.</p>
<pre><code>CL-USER> (mapcar (lambda (x) (cons x (exp (- (proportion x) *)))) *points*)
((0.05d0 . 0.0d0)
(0.1d0 . 0.0d0)
[...]
(0.35d0 . 3.443943164733533d-288)
[...]
(0.8d0 . 2.0682681158181894d-16)
(0.85d0 . 2.6252352579425913d-5)
(0.9d0 . 1.0d0)
(0.95d0 . 5.65506756824607d-10))
</code></pre>
<p>There’s finally some signal in there. This is still just an
un-normalised proportion function, not a probability density function,
but that’s already useful to show the general shape of the density
function, something like the following, for \(\mathrm{Beta}(901, 101)\).</p>
<p><a href="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/proportion.svg">
<img class="center" src="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/proportion.svg" />
</a></p>
<p>Finally, we have a probability density function for the Bayesian update of
our belief about the success rate after \(n\) observations of a
Bernoulli process, and we know how to compute its proportion
function. Until now, I’ve carefully avoided the question of what all
these computations even mean. No more (:</p>
<p>The Bayesian view assumes that the underlying success rate
(the value we’re trying to estimate) is unknown, but sampled from some
distribution. In our case, we assumed a uniform
distribution, i.e., that every success rate is <em>a priori</em> equally
likely. We then observe \(n\) outcomes (successes or failures), and
assign an updated probability to each success rate. It’s like
a many-worlds interpretation in which we assume we live in one of a set of
worlds, each with a success rate sampled from the uniform
distribution; after observing 900 successes and 100 failures,
we’re more likely to be in a world where the success rate is 0.9 than
in one where it’s 0.2. With Bayes’s theorem to formalise the update,
we assign posterior probabilities to each potential success rate
value.</p>
<p>We can compute an
<a href="https://en.wikipedia.org/wiki/Credible_interval">equal-tailed credible interval</a>
from that
\(\mathrm{Beta}(a+1,b+1)\)
posterior distribution by excluding the left-most values, \([0,
l)\), such that the
<a href="https://en.wikipedia.org/wiki/Beta_distribution">Beta</a> CDF
(<a href="https://en.wikipedia.org/wiki/Cumulative_distribution_function">cumulative distribution function</a>)
at \(l\) is \(\varepsilon / 2\), and doing
the same with the right most values to cut away \(\varepsilon / 2\) of the
probability density. The CDF for \(\mathrm{Beta}(a+1,b+1)\) at
\(x\) is the <a href="https://dlmf.nist.gov/8.17">incomplete beta function</a>,
\(I\sb{x}(a+1,b+1)\). That
function is really hard to compute (<a href="http://www.dtic.mil/dtic/tr/fulltext/u2/a210118.pdf">this technical report</a>
detailing <a href="https://dl.acm.org/citation.cfm?id=131776">Algorithm 708</a>
deploys <em>five</em> different evaluation strategies), so I’ll address that
later.</p>
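<p>In the meantime, if we just want numbers, we can dodge the incomplete beta function entirely: sample from the posterior and take empirical quantiles. A quick Python sketch (the function name and defaults are mine; the standard library's <code>random.betavariate</code> does the sampling):</p>

```python
import random

def credible_interval(successes, failures, eps=0.05, n_samples=50000, seed=42):
    """Monte Carlo approximation of the equal-tailed (1 - eps) credible
    interval for a Bernoulli success rate under a uniform prior: draw
    from the Beta(successes + 1, failures + 1) posterior and cut off
    eps / 2 of the mass on each side."""
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(successes + 1, failures + 1)
                   for _ in range(n_samples))
    return (draws[int(n_samples * eps / 2)],
            draws[int(n_samples * (1 - eps / 2))])

print(credible_interval(900, 100))  # roughly (0.88, 0.92)
```

<p>This is noisy and slow compared to evaluating \(I\sb{x}(a+1,b+1)\) directly, but it matches the intuition above: for 900 successes out of 1000, the interval hugs 0.9 and tightens as \(n\) grows.</p>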
<p>The more orthodox “frequentist” approach to confidence intervals
treats the whole experiment, from data collection to analysis (to
publication, independent of the observations 😉) as an
<a href="https://en.wikipedia.org/wiki/Atlantic_City_algorithm">Atlantic City algorithm</a>:
if we allow a false positive rate of \(\varepsilon\) (e.g.,
\(\varepsilon=5\%\)), the experiment must return a confidence
interval that includes the actual success rate (population statistic
or parameter, in general) with probability \(1 - \varepsilon\), for
any actual success rate (or underlying population statistic /
parameter). When the procedure fails, with probability at most
\(\varepsilon\), it is allowed to fail in an arbitrary manner.</p>
<p>The same Atlantic City logic applies to \(p\)-values. An experiment
(data collection and analysis) that accepts when the \(p\)-value is
at most \(0.05\) is an Atlantic City algorithm that returns a
correct result (including “don’t know”) with probability at least
\(0.95\), and is otherwise allowed to yield any result with
probability at most \(0.05\). The \(p\)-value associated with a
conclusion, e.g., “success rate is more than 0.8” (the confidence level
associated with an interval) means something like “I’m pretty sure that the
success rate is more than 0.8, because the odds of observing our data
if that were false are small (less than 0.05).” If we set that
threshold (of 0.05, in the example) <em>ahead of time</em>, we get an
Atlantic City algorithm to determine if “the success rate is more than
0.8” with failure probability 0.05. (In practice,
reporting is censored in all sorts of ways, so...)</p>
<p>There are ways to recover a classical confidence interval, given \(n\) observations from a Bernoulli. However,
they’re pretty convoluted, and, as
<a href="https://bayes.wustl.edu/etj/articles/confidence.pdf">Jaynes argues in his note on confidence intervals</a>, the classical
approach gives values that are roughly the same<sup id="fnref:2"><a href="#fn:2" rel="footnote">2</a></sup> as the Bayesian
approach... so I’ll just use the Bayesian credibility interval instead.</p>
<p><em>See <a href="https://stats.stackexchange.com/questions/2272/whats-the-difference-between-a-confidence-interval-and-a-credible-interval/2287#2287">this stackexchange post</a>
for a lot more details.</em></p>
<h1>Dynamic stopping for Binomial testing</h1>
<p>The way statistics are usually deployed is that someone collects a
data set, as rich as is practical, and squeezes that static data set
dry for significant results. That’s exactly the setting for the
credible interval computation I sketched in the previous section.</p>
<p>When studying the properties of computer programs or systems, we can
usually generate additional data on demand, given more time. The problem
is knowing when it’s ok to stop wasting computer time, because we have
enough data... and how to determine that without running into multiple
hypothesis testing issues (<a href="https://www.google.com/search?q=a%2Fb+test+early+stopping">ask anyone who’s run A/B tests</a>).</p>
<p>Here’s an example of an intuitive but completely broken dynamic
stopping criterion. Let’s say we’re trying to find out if the success
rate is less than or greater than 90%, and are willing to be wrong 5%
of the time. We could get \(k\) data points, run a statistical test
on those data points, and stop if the data let us conclude with 95%
confidence that the underlying success rate differs from
90%. Otherwise, collect \(2k\) fresh points, run the same test;
collect \(4k, \ldots, 2\sp{i}k\) points. Eventually, we’ll have enough data.</p>
<p>The issue is that each time we execute the statistical test that
determines if we should stop, we run a 5% risk of being totally
wrong. For an extreme example, if the success rate is exactly 90%, we
will eventually stop, with probability 1. When we do stop, we’ll
inevitably conclude that the success rate differs from 90%, and we
will be wrong. The worst-case (over all underlying success rates)
false positive rate is 100%, not 5%!</p>
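<p>This failure mode is easy to reproduce. Here is a Monte Carlo sketch in Python (parameters are hypothetical): the true success rate is exactly 90%, yet repeatedly applying a 95% test to fresh, doubling batches rejects far more often than 5% of the time:</p>

```python
# Monte Carlo sketch (hypothetical parameters) of the broken doubling
# scheme described above: the true success rate is exactly 0.9, and each
# stage draws a fresh, twice-as-large batch and runs a two-sided 95%
# z-test against 0.9, stopping at the first rejection.
import math
import random

def z_rejects(successes, n, p0=0.9, z_crit=1.959964):
    """Two-sided normal-approximation test of H0: success rate = p0."""
    se = math.sqrt(p0 * (1 - p0) / n)
    return abs(successes / n - p0) > z_crit * se

def ever_rejects(k=50, stages=6):
    """Run the doubling scheme once; True means a (false) rejection."""
    n = k
    for _ in range(stages):
        successes = sum(random.random() < 0.9 for _ in range(n))
        if z_rejects(successes, n):
            return True  # false positive: the true rate *is* 0.9
        n *= 2
    return False

random.seed(42)
runs = 1000
false_positives = sum(ever_rejects() for _ in range(runs))
print(false_positives / runs)  # much more than the nominal 5%
```

Each stage is an independent 5%-ish chance of a false rejection, so the overall false positive rate compounds with the number of stages.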
<p>In my experience, programmers tend to sidestep the question by wasting
CPU time on a large, fixed number of iterations... people are then
less likely to run our statistical tests, since they’re so slow, and
everyone loses (the other popular option is to impose a reasonable CPU
budget, with error thresholds so lax that we end up with a smoke test).</p>
<p>Robbins, in
<a href="https://projecteuclid.org/euclid.aoms/1177696786">Statistical Methods Related to the Law of the Iterated Logarithm</a>,
introduces a criterion that, given a threshold success rate \(p\) and a
sequence of (infinitely many!) observations from the same Bernoulli with
unknown success rate parameter, will be satisfied infinitely often when
\(p\) differs from the Bernoulli’s success rate. Crucially,
Robbins also bounds the false positive rate, the probability that the
criterion be satisfied <em>even once in the infinite sequence of
observations</em> if the Bernoulli’s unknown success rate is exactly equal
to \(p\). That criterion is</p>
<p>\[{n \choose a} p\sp{a} (1-p)\sp{n-a} \leq \frac{\varepsilon}{n+1},\]</p>
<p>where \(n\) is the number of observations, \(a\) the number of
successes, \(p\) the threshold success rate, and \(\varepsilon\)
the error (false positive) rate. As the number of
observations grows, the criterion becomes more and more stringent to
maintain a bounded false positive rate over the whole infinite
sequence of observations.</p>
<p>There are similar “Confidence Sequence” results for other
distributions (see, for example,
<a href="https://projecteuclid.org/euclid.aos/1176343406">this paper of Lai</a>),
but we only care about the Binomial here.</p>
<p>More recently, <a href="https://arxiv.org/abs/1611.01675">Ding, Gandy, and Hahn</a>
showed that Robbins’s criterion also guarantees that, when it is
satisfied, the empirical success rate (\(a/n\)) lies on the correct
side of the threshold \(p\) (same side as the actual unknown success
rate) with probability \(1-\varepsilon\). This result leads
them to propose the use of Robbins’s criterion to stop Monte Carlo
statistical tests, which they refer to as the Confidence Sequence Method (CSM).</p>
<pre><code>(defun csm-stop-p (successes failures threshold eps)
  "Pseudocode, this will not work on a real machine."
  (let ((n (+ successes failures)))
    (<= (* (choose n successes)
           (expt threshold successes)
           (expt (- 1 threshold) failures))
        (/ eps (1+ n)))))
</code></pre>
<p>We may call this predicate at any time with more independent and
identically distributed results, and stop as soon as it returns true.</p>
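<p>The pseudocode above can be made runnable with exact rational arithmetic; here is a sketch in Python, where <code>math.comb</code> and <code>fractions.Fraction</code> keep every intermediate value exact, so no rounding can occur. It is only practical for small \(n\), which is exactly why the log-domain rewrite matters:</p>

```python
# The stopping predicate above, made runnable with exact rational
# arithmetic: math.comb and fractions.Fraction keep every intermediate
# value exact, so no rounding can occur. Practical only for small n.
from fractions import Fraction
from math import comb

def csm_stop_p(successes, failures, threshold, eps):
    """True iff Robbins's criterion allows us to stop."""
    n = successes + failures
    p = Fraction(threshold)
    lhs = comb(n, successes) * p**successes * (1 - p)**failures
    return lhs <= Fraction(eps) / (n + 1)

# 9/10 successes is not enough evidence against a 90% threshold...
print(csm_stop_p(9, 1, Fraction(9, 10), Fraction(5, 100)))    # False
# ...but 395/400 successes is.
print(csm_stop_p(395, 5, Fraction(9, 10), Fraction(5, 100)))  # True
```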
<p>The CSM is simple (it’s all in Robbins’s criterion), but still
provides good guarantees. The downside is that it is conservative when
we have a limit on the number of observations: the method “hedges”
against the possibility of having a false positive in the infinite
number of observations after the limit, observations we will never
make. For computer-generated data sets, I think having a principled
limit is pretty good; it’s not ideal to ask for more data than
strictly necessary, but not a blocker either.</p>
<p>In practice, there are still real obstacles to implementing the CSM on
computers with finite precision (floating point) arithmetic,
especially since I want to preserve the method’s theoretical guarantees
(i.e., make sure rounding is one-sided to overestimate the left-hand
side of the inequality).</p>
<p>If we implement the expression well, the effect of rounding on
correctness should be less than marginal. However, I don’t want to be
stuck wondering if my bad results are due to known approximation
errors in the method, rather than errors in the code. Moreover, if we
do have a tight expression with little rounding error, adjusting it
to make the errors one-sided should have almost no impact. That seems
like a good trade-off to me, especially if I’m going to use the CSM
semi-automatically, in continuous integration scripts, for example.</p>
<p>One look at <code>csm-stop-p</code> shows we’ll have the same problem we had with
the proportion function for the Beta distribution: we’re multiplying
very small and very large values. We’ll apply the same fix: work in
the log domain and exploit \(\log\)’s monotonicity.</p>
<p>\[{n \choose a} p\sp{a} (1-p)\sp{n-a} \leq \frac{\varepsilon}{n+1}\]</p>
<p>becomes</p>
<p>\[\log {n \choose a} + a \log p + (n-a)\log (1-p) \leq \log\varepsilon -\log(n+1),\]</p>
<p>or, after some more expansions, and with \(b = n - a\),</p>
<p>\[\log n! - \log a! - \log b! + a \log p + b \log(1 - p) + \log(n+1) \leq \log\varepsilon.\]</p>
<p>The new obstacle is computing the factorial \(x!\), or the
log-factorial \(\log x!\). We shouldn’t compute the
factorial iteratively: otherwise, we could spend more time in the stopping criterion than in the data generation subroutine.
<a href="https://www-fourier.ujf-grenoble.fr/~marin/une_autre_crypto/articles_et_extraits_livres/Robbin_H.-A_remark_on_Stirling%5C's_Formula.pdf">Robbins has another useful result</a>
for us:</p>
<p>\[\sqrt{2\pi} n\sp{n + \frac{1}{2}} \exp(-n) \exp\left(\frac{1}{12n+1}\right) < n! < \sqrt{2\pi} n\sp{n + \frac{1}{2}} \exp(-n) \exp\left(\frac{1}{12n}\right),\]</p>
<p>or, in the log domain,</p>
<p>\[\log\sqrt{2\pi} + \left(n + \frac{1}{2}\right)\log n -n + \frac{1}{12n+1} < \log n! < \log\sqrt{2\pi} + \left(n + \frac{1}{2}\right)\log n -n +\frac{1}{12n}.\]</p>
<p>This double inequality gives us a way to over-approximate
\(\log {n \choose a} = \log \frac{n!}{a! b!} = \log n! - \log a! - \log b!,\)
where \(b = n - a\):</p>
<p>\[\log {n \choose a} < -\log\sqrt{2\pi} + \left(n + \frac{1}{2}\right)\log n -n +\frac{1}{12n} - \left(a + \frac{1}{2}\right)\log a +a - \frac{1}{12a+1} - \left(b + \frac{1}{2}\right)\log b +b - \frac{1}{12b+1},\]</p>
<p>where the right-most expression in Robbins’s double inequality
replaces \(\log n!\), which must be over-approximated, and the
left-most \(\log a!\) and \(\log b!\), which must be
under-approximated.</p>
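<p>The double inequality is easy to verify numerically; here is a Python sketch that checks Robbins’s bounds bracket the exact log-factorial (via <code>math.lgamma</code>) and that the gap shrinks as \(n\) grows:</p>

```python
# Numerical check of Robbins's double inequality: the two bounds must
# bracket the exact log-factorial, math.lgamma(n + 1), and the gap
# between them shrinks as n grows.
from math import exp, lgamma, log, pi, sqrt

def robbins_log_factorial_bounds(n):
    """Robbins's lower and upper bounds on log n!."""
    base = log(sqrt(2 * pi)) + (n + 0.5) * log(n) - n
    return base + 1 / (12 * n + 1), base + 1 / (12 * n)

gaps = []
for n in (5, 10, 100, 1000):
    lo, hi = robbins_log_factorial_bounds(n)
    assert lo < lgamma(n + 1) < hi
    gaps.append(exp(hi - lo) - 1)  # relative error on n! itself
print(gaps)  # about 2.7e-4 at n = 5, and decreasing
```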
<p>Robbins’s approximation works well for us because it is
one-sided, and guarantees that the relative error in \(n!\),
\(\exp\left(\frac{1}{12n} - \frac{1}{12n+1}\right) - 1 \approx \frac{1}{12n(12n+1)},\)
is small, even for small values like \(n = 5\)
(error \(< 0.03\%\)), and decreases with \(n\): as we perform
more trials, the approximation is increasingly accurate, thus less
likely to spuriously prevent us from stopping.</p>
<p>Now that we have a conservative approximation of Robbins’s criterion
that only needs the four arithmetic operations and logarithms (and
<code>log1p</code>), we can implement it on a real computer. The only challenge
left is regular floating point arithmetic stuff: if rounding must
occur, we must make sure it is in a safe (conservative) direction for
our predicate.</p>
<p>Hardware usually lets us manipulate the rounding mode to force
floating point arithmetic operations to round up or down, instead of
the usual round-to-even. However, that tends to be slow, so most
languages (and their implementations) don’t support changing the
rounding mode, or do so badly... which leaves us in a multi-decade
hardware/software co-evolution Catch-22.</p>
<p>I could think hard and derive tight bounds on the round-off error, but I’d
rather apply a bit of brute force. IEEE-754 compliant implementations
must round the four basic operations correctly. This means that
\(z = x \oplus y\) is at most half a ULP away from \(x + y,\)
and thus either \(z = x \oplus y \geq x + y,\) or the next floating
point value after \(z,\) \(z^\prime \geq x + y\). We can find this
“next value” portably in Common Lisp, with
<code>decode-float</code>/<code>scale-float</code>, and some hand-waving for denormals.</p>
<pre><code>(defun next (x &optional (delta 1))
  "Increment x by delta ULPs. Very conservative for
small (0/denormalised) values."
  (declare (type double-float x)
           (type unsigned-byte delta))
  (let* ((exponent (nth-value 1 (decode-float x)))
         (ulp (max (scale-float double-float-epsilon exponent)
                   least-positive-normalized-double-float)))
    (+ x (* delta ulp))))
</code></pre>
<p>I prefer to manipulate IEEE-754 bits directly. That’s theoretically
not portable, but the platforms I care about make sure we can treat
floats as sign-magnitude integers.</p>
<span>next </span>
<div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class="line-number">1</span>
<span class="line-number">2</span>
<span class="line-number">3</span>
<span class="line-number">4</span>
<span class="line-number">5</span>
<span class="line-number">6</span>
<span class="line-number">7</span>
<span class="line-number">8</span>
<span class="line-number">9</span>
<span class="line-number">10</span>
<span class="line-number">11</span>
<span class="line-number">12</span>
<span class="line-number">13</span>
<span class="line-number">14</span>
<span class="line-number">15</span>
<span class="line-number">16</span>
<span class="line-number">17</span>
<span class="line-number">18</span>
<span class="line-number">19</span>
<span class="line-number">20</span>
<span class="line-number">21</span>
<span class="line-number">22</span>
<span class="line-number">23</span>
<span class="line-number">24</span>
<span class="line-number">25</span>
<span class="line-number">26</span>
<span class="line-number">27</span>
<span class="line-number">28</span>
<span class="line-number">29</span>
<span class="line-number">30</span>
<span class="line-number">31</span>
<span class="line-number">32</span>
<span class="line-number">33</span>
<span class="line-number">34</span>
<span class="line-number">35</span>
<span class="line-number">36</span>
<span class="line-number">37</span>
<span class="line-number">38</span>
<span class="line-number">39</span>
<span class="line-number">40</span>
</pre></td><td class="code"><pre><code class="lisp"><span class="line"><span class="o">#+</span><span class="nv">sbcl</span>
</span><span class="line"><span class="p">(</span><span class="k">progn</span>
</span><span class="line"> <span class="p">(</span><span class="nb">declaim</span> <span class="p">(</span><span class="k">inline</span> <span class="nv">%float-bits</span> <span class="nv">%bits-float</span> <span class="nv">next</span> <span class="nv">prev</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">defun</span> <span class="nv">%float-bits</span> <span class="p">(</span><span class="nv">x</span><span class="p">)</span>
</span><span class="line"> <span class="s">"Convert a double float x to sign-extended sign/magnitude, and</span>
</span><span class="line"><span class="s"> then to 2's complement."</span>
</span><span class="line"> <span class="p">(</span><span class="k">declare</span> <span class="p">(</span><span class="k">type</span> <span class="kt">double-float</span> <span class="nv">x</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="k">let*</span> <span class="p">((</span><span class="nv">hi</span> <span class="p">(</span><span class="nv">sb-kernel:double-float-high-bits</span> <span class="nv">x</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">lo</span> <span class="p">(</span><span class="nv">sb-kernel:double-float-low-bits</span> <span class="nv">x</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">word</span> <span class="p">(</span><span class="nb">+</span> <span class="p">(</span><span class="nb">ash</span> <span class="p">(</span><span class="nb">ldb</span> <span class="p">(</span><span class="nb">byte</span> <span class="mi">31</span> <span class="mi">0</span><span class="p">)</span> <span class="nv">hi</span><span class="p">)</span> <span class="mi">32</span><span class="p">)</span> <span class="nv">lo</span><span class="p">)))</span>
</span><span class="line"> <span class="c1">;; hi is the high half of the 64 bit sign-magnitude</span>
</span><span class="line"> <span class="c1">;; representation... in two's complement. Extract the significand,</span>
</span><span class="line"> <span class="c1">;; and then apply the sign bit. We want to preserve signed zeros,</span>
</span><span class="line"> <span class="c1">;; so return -1 - word instead of -word.</span>
</span><span class="line"> <span class="c1">;;</span>
</span><span class="line"> <span class="c1">;; (- -1 word) = (lognot word) = (logxor word -1).</span>
</span><span class="line"> <span class="p">(</span><span class="nb">logxor</span> <span class="nv">word</span> <span class="p">(</span><span class="nb">ash</span> <span class="nv">hi</span> <span class="mi">-32</span><span class="p">))))</span>
</span><span class="line">
</span><span class="line"> <span class="p">(</span><span class="nb">defun</span> <span class="nv">%bits-float</span> <span class="p">(</span><span class="nv">bits</span><span class="p">)</span>
</span><span class="line"> <span class="s">"Convert 2's complement to sign-extended sign/magnitude, then</span>
</span><span class="line"><span class="s"> double float."</span>
</span><span class="line"> <span class="p">(</span><span class="k">declare</span> <span class="p">(</span><span class="k">type</span> <span class="p">(</span><span class="kt">signed-byte</span> <span class="mi">64</span><span class="p">)</span> <span class="nv">bits</span><span class="p">))</span>
</span><span class="line"> <span class="c1">;; convert back to sign-magnitude: if bits is negative, all but the</span>
</span><span class="line"> <span class="c1">;; sign bit must be flipped again.</span>
</span><span class="line"> <span class="p">(</span><span class="k">let</span> <span class="p">((</span><span class="nv">bits</span> <span class="p">(</span><span class="nb">logxor</span> <span class="nv">bits</span>
</span><span class="line"> <span class="p">(</span><span class="nb">ldb</span> <span class="p">(</span><span class="nb">byte</span> <span class="mi">63</span> <span class="mi">0</span><span class="p">)</span> <span class="p">(</span><span class="nb">ash</span> <span class="nv">bits</span> <span class="mi">-64</span><span class="p">)))))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">sb-kernel:make-double-float</span> <span class="p">(</span><span class="nb">ash</span> <span class="nv">bits</span> <span class="mi">-32</span><span class="p">)</span>
</span><span class="line"> <span class="p">(</span><span class="nb">ldb</span> <span class="p">(</span><span class="nb">byte</span> <span class="mi">32</span> <span class="mi">0</span><span class="p">)</span> <span class="nv">bits</span><span class="p">))))</span>
</span><span class="line">
</span><span class="line"> <span class="p">(</span><span class="nb">defun</span> <span class="nv">next</span> <span class="p">(</span><span class="nv">x</span> <span class="k">&optional</span> <span class="p">(</span><span class="nv">delta</span> <span class="mi">1</span><span class="p">))</span>
</span><span class="line"> <span class="s">"Increment x by delta ULPs."</span>
</span><span class="line"> <span class="p">(</span><span class="k">declare</span> <span class="p">(</span><span class="k">type</span> <span class="kt">double-float</span> <span class="nv">x</span><span class="p">)</span>
</span><span class="line"> <span class="p">(</span><span class="k">type</span> <span class="kt">unsigned-byte</span> <span class="nv">delta</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">%bits-float</span> <span class="p">(</span><span class="nb">+</span> <span class="p">(</span><span class="nv">%float-bits</span> <span class="nv">x</span><span class="p">)</span> <span class="nv">delta</span><span class="p">)))</span>
</span><span class="line">
</span><span class="line"> <span class="p">(</span><span class="nb">defun</span> <span class="nv">prev</span> <span class="p">(</span><span class="nv">x</span> <span class="k">&optional</span> <span class="p">(</span><span class="nv">delta</span> <span class="mi">1</span><span class="p">))</span>
</span><span class="line"> <span class="s">"Decrement x by delta ULPs."</span>
</span><span class="line"> <span class="p">(</span><span class="k">declare</span> <span class="p">(</span><span class="k">type</span> <span class="kt">double-float</span> <span class="nv">x</span><span class="p">)</span>
</span><span class="line"> <span class="p">(</span><span class="k">type</span> <span class="kt">unsigned-byte</span> <span class="nv">delta</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">%bits-float</span> <span class="p">(</span><span class="nb">-</span> <span class="p">(</span><span class="nv">%float-bits</span> <span class="nv">x</span><span class="p">)</span> <span class="nv">delta</span><span class="p">))))</span>
</span></code></pre></td></tr></table></div>
<pre><code>CL-USER> (%float-bits pi)
4614256656552045848
CL-USER> (%float-bits (- pi))
-4614256656552045849
</code></pre>
<p>The two’s complement value for <code>(- pi)</code> is one less than
<code>(- (%float-bits pi))</code> because two’s complement does not support
signed zeros.</p>
<pre><code>CL-USER> (eql 0 (- 0))
T
CL-USER> (eql 0d0 (- 0d0))
NIL
CL-USER> (%float-bits 0d0)
0
CL-USER> (%float-bits -0d0)
-1
</code></pre>
<p>We can quickly check that the round trip from float to integer and back
is an identity.</p>
<pre><code>CL-USER> (eql pi (%bits-float (%float-bits pi)))
T
CL-USER> (eql (- pi) (%bits-float (%float-bits (- pi))))
T
CL-USER> (eql 0d0 (%bits-float (%float-bits 0d0)))
T
CL-USER> (eql -0d0 (%bits-float (%float-bits -0d0)))
T
</code></pre>
<p>We can also check that incrementing or decrementing the integer
representation does increase or decrease the floating point value.</p>
<pre><code>CL-USER> (< (%bits-float (1- (%float-bits pi))) pi)
T
CL-USER> (< (%bits-float (1- (%float-bits (- pi)))) (- pi))
T
CL-USER> (%bits-float (1- (%float-bits 0d0)))
-0.0d0
CL-USER> (%bits-float (1+ (%float-bits -0d0)))
0.0d0
CL-USER> (%bits-float (1+ (%float-bits 0d0)))
4.9406564584124654d-324
CL-USER> (%bits-float (1- (%float-bits -0d0)))
-4.9406564584124654d-324
</code></pre>
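<p>For comparison, the same sign-magnitude-to-two’s-complement trick can be sketched in Python, where <code>struct</code> reinterprets the IEEE-754 bits (the function names here are mine, not part of the post’s code):</p>

```python
# The same trick sketched in Python: struct reinterprets the IEEE-754
# bits of a double, and we convert the sign-magnitude representation
# to a signed-zero-preserving two's complement ordering.
import math
import struct

def float_bits(x):
    """Double -> integer, monotone in x, preserving signed zeros."""
    word = struct.unpack("<q", struct.pack("<d", x))[0]
    return word if word >= 0 else -1 - (word & 0x7FFFFFFFFFFFFFFF)

def bits_float(bits):
    """Inverse of float_bits."""
    raw = bits if bits >= 0 else ((-1 - bits) | (1 << 63))
    return struct.unpack("<d", struct.pack("<Q", raw))[0]

print(float_bits(math.pi))   # 4614256656552045848
print(float_bits(-math.pi))  # -4614256656552045849
print(bits_float(float_bits(math.pi) + 1) > math.pi)  # True
print(bits_float(1))         # 5e-324, the smallest positive double
```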
<p>The code doesn’t handle special values like infinities or NaNs, but
that’s out of scope for the CSM criterion anyway. That’s all we need
to nudge the result of the four operations to guarantee an over- or
under- approximation of the real value. We can also look at the
documentation for our <code>libm</code> (e.g., <a href="https://www.gnu.org/software/libc/manual/html_node/Errors-in-Math-Functions.html">for GNU libm</a>)
to find error bounds on functions like <code>log</code>; GNU claims their
<code>log</code> is never off by more than 3 ULP. We can round up to the
fourth next floating point value to obtain a conservative upper bound
on \(\log x\).</p>
<span>log </span>
<div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class="line-number">1</span>
<span class="line-number">2</span>
<span class="line-number">3</span>
<span class="line-number">4</span>
<span class="line-number">5</span>
<span class="line-number">6</span>
<span class="line-number">7</span>
<span class="line-number">8</span>
<span class="line-number">9</span>
<span class="line-number">10</span>
<span class="line-number">11</span>
<span class="line-number">12</span>
<span class="line-number">13</span>
<span class="line-number">14</span>
<span class="line-number">15</span>
<span class="line-number">16</span>
<span class="line-number">17</span>
<span class="line-number">18</span>
<span class="line-number">19</span>
<span class="line-number">20</span>
<span class="line-number">21</span>
<span class="line-number">22</span>
<span class="line-number">23</span>
<span class="line-number">24</span>
<span class="line-number">25</span>
<span class="line-number">26</span>
<span class="line-number">27</span>
</pre></td><td class="code"><pre><code class="lisp"><span class="line"><span class="p">(</span><span class="nb">declaim</span> <span class="p">(</span><span class="k">type</span> <span class="p">(</span><span class="kt">unsigned-byte</span> <span class="mi">31</span><span class="p">)</span> <span class="vg">*libm-error-limit*</span><span class="p">))</span>
</span><span class="line"><span class="p">(</span><span class="nb">defvar</span> <span class="vg">*libm-error-limit*</span> <span class="mi">4</span>
</span><span class="line"> <span class="s">"Assume libm is off by less than 4 ULPs."</span><span class="p">)</span>
</span><span class="line">
</span><span class="line"><span class="p">(</span><span class="nb">declaim</span> <span class="p">(</span><span class="k">inline</span> <span class="nv">log-up</span> <span class="nv">log-down</span><span class="p">))</span>
</span><span class="line"><span class="p">(</span><span class="nb">defun</span> <span class="nv">log-up</span> <span class="p">(</span><span class="nv">x</span><span class="p">)</span>
</span><span class="line"> <span class="s">"Conservative upper bound on log(x)."</span>
</span><span class="line"> <span class="p">(</span><span class="k">declare</span> <span class="p">(</span><span class="k">type</span> <span class="kt">double-float</span> <span class="nv">x</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">next</span> <span class="p">(</span><span class="nb">log</span> <span class="nv">x</span><span class="p">)</span> <span class="vg">*libm-error-limit*</span><span class="p">))</span>
</span><span class="line">
</span><span class="line"><span class="p">(</span><span class="nb">defun</span> <span class="nv">log-down</span> <span class="p">(</span><span class="nv">x</span><span class="p">)</span>
</span><span class="line"> <span class="s">"Conservative lower bound on log(x)."</span>
</span><span class="line"> <span class="p">(</span><span class="k">declare</span> <span class="p">(</span><span class="k">type</span> <span class="kt">double-float</span> <span class="nv">x</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">prev</span> <span class="p">(</span><span class="nb">log</span> <span class="nv">x</span><span class="p">)</span> <span class="vg">*libm-error-limit*</span><span class="p">))</span>
</span><span class="line">
</span><span class="line"><span class="o">#+</span><span class="nv">sbcl</span>
</span><span class="line"><span class="p">(</span><span class="k">progn</span>
</span><span class="line"> <span class="p">(</span><span class="nb">declaim</span> <span class="p">(</span><span class="k">inline</span> <span class="nv">log1p-up</span> <span class="nv">log1p-down</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">defun</span> <span class="nv">log1p-up</span> <span class="p">(</span><span class="nv">x</span><span class="p">)</span>
</span><span class="line"> <span class="s">"Conservative upper bound on log(1 + x)."</span>
</span><span class="line"> <span class="p">(</span><span class="k">declare</span> <span class="p">(</span><span class="k">type</span> <span class="kt">double-float</span> <span class="nv">x</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">next</span> <span class="p">(</span><span class="nv">sb-kernel:%log1p</span> <span class="nv">x</span><span class="p">)</span> <span class="vg">*libm-error-limit*</span><span class="p">))</span>
</span><span class="line">
</span><span class="line"> <span class="p">(</span><span class="nb">defun</span> <span class="nv">log1p-down</span> <span class="p">(</span><span class="nv">x</span><span class="p">)</span>
</span><span class="line"> <span class="s">"Conservative lower bound on log(1 + x)"</span>
</span><span class="line"> <span class="p">(</span><span class="k">declare</span> <span class="p">(</span><span class="k">type</span> <span class="kt">double-float</span> <span class="nv">x</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">prev</span> <span class="p">(</span><span class="nv">sb-kernel:%log1p</span> <span class="nv">x</span><span class="p">)</span> <span class="vg">*libm-error-limit*</span><span class="p">)))</span>
</span></code></pre></td></tr></table></div>
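<p>In a language without easy access to the float bits, the same hedging can be approximated with <code>math.nextafter</code> (Python 3.9+); a sketch, assuming the same 4-ULP libm error budget as above:</p>

```python
# The same 4-ULP hedging, sketched in Python: math.nextafter walks to
# the next representable double, so we can nudge log's result a few
# floats in the conservative direction.
import math

LIBM_ERROR_LIMIT = 4  # assume libm's log is off by less than 4 ULPs

def nudge(x, delta):
    """Move x by |delta| representable doubles, up if delta > 0."""
    target = math.inf if delta > 0 else -math.inf
    for _ in range(abs(delta)):
        x = math.nextafter(x, target)
    return x

def log_up(x):
    """Conservative upper bound on log(x)."""
    return nudge(math.log(x), LIBM_ERROR_LIMIT)

def log_down(x):
    """Conservative lower bound on log(x)."""
    return nudge(math.log(x), -LIBM_ERROR_LIMIT)

print(log_down(10.0) < math.log(10.0) < log_up(10.0))  # True
```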
<p>I could go ahead and use the building blocks above (ULP nudging for
directed rounding) to directly implement Robbins’s criterion,</p>
<p>\[\log {n \choose a} + a \log p + b\log (1-p) + \log(n+1) \leq \log\varepsilon,\]</p>
<p>with Robbins’s factorial approximation,</p>
<p>\[\log {n \choose a} < -\log\sqrt{2\pi} + \left(n + \frac{1}{2}\right)\log n -n +\frac{1}{12n} - \left(a + \frac{1}{2}\right)\log a +a - \frac{1}{12a+1} - \left(b + \frac{1}{2}\right)\log b +b - \frac{1}{12b+1}.\]</p>
<p>However, even in the log domain, there’s a lot of cancellation: we’re
taking the difference of relatively large numbers to find a small
result. It’s possible to avoid that by re-associating some of the
terms above, e.g., for \(a\):</p>
<p>\[-\left(a + \frac{1}{2}\right) \log a + a - a \log p =
-\frac{\log a}{2} + a (-\log a + 1 - \log p).\]</p>
<p>Instead, I’ll just brute force things (again) with
<a href="https://en.wikipedia.org/wiki/Kahan_summation_algorithm">Kahan summation</a>.
Shewchuk’s presentation in <a href="http://www.cs.cmu.edu/~quake/robust.html">Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates</a>
highlights how the only step where we may lose precision to
rounding is when we add the current compensation term to the new
summand. We can implement Kahan summation with directed rounding in
only that one place: all the other operations are exact!</p>
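<p>Shewchuk’s exactness claim for the two-term (Dekker) sum is easy to spot-check; here is a Python sketch, verified with exact rational arithmetic:</p>

```python
# Spot-check of the claim that the two-term Dekker sum is exact: when
# |a| >= |b|, x = fl(a + b) and the residual y recover a + b with no
# rounding error at all. Verified in exact rational arithmetic.
from fractions import Fraction

def fast_two_sum(a, b):
    """Return (x, y) with x = fl(a + b) and x + y == a + b exactly."""
    assert abs(a) >= abs(b) or a == 0.0
    x = a + b
    b_virtual = x - a  # the part of b that actually made it into x
    y = b - b_virtual  # the rounding error; exact, by Dekker's theorem
    return x, y

x, y = fast_two_sum(1e16, 3.14159)
print(x, y)
assert Fraction(x) + Fraction(y) == Fraction(1e16) + Fraction(3.14159)
```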
<span>“kahan summation” </span>
<div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class="line-number">1</span>
<span class="line-number">2</span>
<span class="line-number">3</span>
<span class="line-number">4</span>
<span class="line-number">5</span>
<span class="line-number">6</span>
<span class="line-number">7</span>
<span class="line-number">8</span>
<span class="line-number">9</span>
<span class="line-number">10</span>
<span class="line-number">11</span>
<span class="line-number">12</span>
<span class="line-number">13</span>
<span class="line-number">14</span>
<span class="line-number">15</span>
<span class="line-number">16</span>
<span class="line-number">17</span>
<span class="line-number">18</span>
<span class="line-number">19</span>
<span class="line-number">20</span>
<span class="line-number">21</span>
<span class="line-number">22</span>
<span class="line-number">23</span>
<span class="line-number">24</span>
<span class="line-number">25</span>
<span class="line-number">26</span>
<span class="line-number">27</span>
<span class="line-number">28</span>
<span class="line-number">29</span>
<span class="line-number">30</span>
<span class="line-number">31</span>
<span class="line-number">32</span>
<span class="line-number">33</span>
<span class="line-number">34</span>
<span class="line-number">35</span>
<span class="line-number">36</span>
<span class="line-number">37</span>
<span class="line-number">38</span>
<span class="line-number">39</span>
<span class="line-number">40</span>
<span class="line-number">41</span>
<span class="line-number">42</span>
<span class="line-number">43</span>
<span class="line-number">44</span>
<span class="line-number">45</span>
<span class="line-number">46</span>
<span class="line-number">47</span>
<span class="line-number">48</span>
<span class="line-number">49</span>
<span class="line-number">50</span>
<span class="line-number">51</span>
<span class="line-number">52</span>
<span class="line-number">53</span>
<span class="line-number">54</span>
<span class="line-number">55</span>
<span class="line-number">56</span>
<span class="line-number">57</span>
<span class="line-number">58</span>
<span class="line-number">59</span>
<span class="line-number">60</span>
<span class="line-number">61</span>
<span class="line-number">62</span>
<span class="line-number">63</span>
<span class="line-number">64</span>
<span class="line-number">65</span>
<span class="line-number">66</span>
<span class="line-number">67</span>
<span class="line-number">68</span>
<span class="line-number">69</span>
</pre></td><td class="code"><pre><code class="lisp"><span class="line"><span class="c1">;;; Kahan-style summation.</span>
</span><span class="line"><span class="c1">;;;</span>
</span><span class="line"><span class="c1">;;; Represent the accumulator as an evaluated sum of two doubles. As</span>
</span><span class="line"><span class="c1">;;; long as the compensation term is initially 0, the result is a safe</span>
</span><span class="line"><span class="c1">;;; upper bound on the real value, and the two terms are</span>
</span><span class="line"><span class="c1">;;; "non-overlapping." For more details, see "Adaptive Precision</span>
</span><span class="line"><span class="c1">;;; Floating-Point Arithmetic and Fast Robust Geometric Predicates",</span>
</span><span class="line"><span class="c1">;;; Shewchuk, 1997; Technical report CMU-CS-96-140R / Discrete & Comp</span>
</span><span class="line"><span class="c1">;;; Geom 18(3), October 1997. Theorem 6 in particular.</span>
</span><span class="line">
</span><span class="line"><span class="p">(</span><span class="nb">declaim</span> <span class="p">(</span><span class="k">inline</span> <span class="nv">sum-update-up</span> <span class="nv">sum-update-finish</span><span class="p">))</span>
</span><span class="line"><span class="p">(</span><span class="nb">defun</span> <span class="nv">sum-update-up</span> <span class="p">(</span><span class="nv">accumulator</span> <span class="nv">compensation</span> <span class="nv">term</span> <span class="k">&optional</span> <span class="nv">ordered</span><span class="p">)</span>
</span><span class="line"> <span class="s">"Given an evaluated sum</span>
</span><span class="line"><span class="s"> (accumulator + compensation),</span>
</span><span class="line"><span class="s"> return a new unevaluated sum for an upper bound on</span>
</span><span class="line"><span class="s"> (accumulator + compensation + term).</span>
</span><span class="line">
</span><span class="line"><span class="s"> If ordered, assume</span>
</span><span class="line"><span class="s"> term < accumulator,</span>
</span><span class="line"><span class="s"> or</span>
</span><span class="line"><span class="s"> accumulator = compensation = 0."</span>
</span><span class="line"> <span class="p">(</span><span class="k">declare</span> <span class="p">(</span><span class="k">type</span> <span class="kt">double-float</span> <span class="nv">accumulator</span> <span class="nv">compensation</span>
</span><span class="line"> <span class="nv">term</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">when</span> <span class="p">(</span><span class="nb">and</span> <span class="p">(</span><span class="nb">not</span> <span class="nv">ordered</span><span class="p">)</span>
</span><span class="line"> <span class="p">(</span><span class="nb"><</span> <span class="p">(</span><span class="nb">abs</span> <span class="nv">accumulator</span><span class="p">)</span> <span class="p">(</span><span class="nb">abs</span> <span class="nv">term</span><span class="p">)))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">rotatef</span> <span class="nv">accumulator</span> <span class="nv">term</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="k">let*</span> <span class="p">((</span><span class="nv">rest-1</span> <span class="p">(</span><span class="nv">next</span> <span class="p">(</span><span class="nb">+</span> <span class="nv">compensation</span> <span class="nv">term</span><span class="p">)))</span> <span class="c1">; safe upper bound on c + t</span>
</span><span class="line"> <span class="p">(</span><span class="nb">rest</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb"><=</span> <span class="nv">compensation</span> <span class="mf">0d0</span><span class="p">)</span> <span class="c1">; tighter, still safe.</span>
</span><span class="line"> <span class="p">(</span><span class="nb">min</span> <span class="nv">term</span> <span class="nv">rest-1</span><span class="p">)</span>
</span><span class="line"> <span class="nv">rest-1</span><span class="p">))</span>
</span><span class="line"> <span class="c1">;; Perform a Dekker sum of accumulator + rest. The result is</span>
</span><span class="line"> <span class="c1">;; exact, so no need for next/prev here.</span>
</span><span class="line"> <span class="c1">;;</span>
</span><span class="line"> <span class="c1">;; Precondition: |accumulator| >= |rest| (or accumulator = 0).</span>
</span><span class="line"> <span class="p">(</span><span class="nv">a</span> <span class="nv">accumulator</span><span class="p">)</span>
</span><span class="line"> <span class="p">(</span><span class="nv">b</span> <span class="nb">rest</span><span class="p">)</span>
</span><span class="line"> <span class="p">(</span><span class="nv">x</span> <span class="p">(</span><span class="nb">+</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">b-virtual</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">x</span> <span class="nv">a</span><span class="p">))</span> <span class="c1">; b-virtual = value really added to a</span>
</span><span class="line"> <span class="p">(</span><span class="nv">y</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">b</span> <span class="nv">b-virtual</span><span class="p">)))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">values</span> <span class="nv">x</span> <span class="nv">y</span><span class="p">)))</span>
</span><span class="line">
</span><span class="line"><span class="p">(</span><span class="nb">defun</span> <span class="nv">sum-update-finish</span> <span class="p">(</span><span class="nv">accumulator</span> <span class="nv">compensation</span><span class="p">)</span>
</span><span class="line"> <span class="s">"Return a conservative upper bound for accumulator + compensation.</span>
</span><span class="line">
</span><span class="line"><span class="s"> In theory, (+ accumulator compensation) is equal to accumulator.</span>
</span><span class="line"><span class="s"> In practice, it doesn't hurt to do this right. The second return</span>
</span><span class="line"><span class="s"> value is the new compensation term (should never be positive)."</span>
</span><span class="line"> <span class="p">(</span><span class="k">declare</span> <span class="p">(</span><span class="k">type</span> <span class="kt">double-float</span> <span class="nv">accumulator</span> <span class="nv">compensation</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="k">let*</span> <span class="p">((</span><span class="nv">raw-sum</span> <span class="p">(</span><span class="nv">next</span> <span class="p">(</span><span class="nb">+</span> <span class="nv">accumulator</span> <span class="nv">compensation</span><span class="p">)))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">sum</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">></span> <span class="nv">compensation</span> <span class="mf">0d0</span><span class="p">)</span>
</span><span class="line"> <span class="nv">raw-sum</span>
</span><span class="line"> <span class="c1">;; if compensation <= 0, acc is already an upper</span>
</span><span class="line"> <span class="c1">;; bound.</span>
</span><span class="line"> <span class="p">(</span><span class="nb">min</span> <span class="nv">accumulator</span> <span class="nv">raw-sum</span><span class="p">)))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">delta</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">sum</span> <span class="nv">accumulator</span><span class="p">)))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">assert</span> <span class="p">(</span><span class="nb">>=</span> <span class="nv">delta</span> <span class="nv">compensation</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">values</span> <span class="nv">sum</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">compensation</span> <span class="nv">delta</span><span class="p">))))</span>
</span><span class="line">
</span><span class="line"><span class="p">(</span><span class="nb">declaim</span> <span class="p">(</span><span class="k">ftype</span> <span class="p">(</span><span class="k">function</span> <span class="p">(</span><span class="k">&rest</span> <span class="kt">double-float</span><span class="p">)</span>
</span><span class="line"> <span class="p">(</span><span class="nb">values</span> <span class="kt">double-float</span> <span class="kt">double-float</span> <span class="k">&optional</span><span class="p">))</span>
</span><span class="line"> <span class="nv">sum-up</span><span class="p">))</span>
</span><span class="line"><span class="p">(</span><span class="nb">defun</span> <span class="nv">sum-up</span> <span class="p">(</span><span class="k">&rest</span> <span class="nb">values</span><span class="p">)</span>
</span><span class="line"> <span class="s">"Conservative upper bound for the sum of values, with a Kahan</span>
</span><span class="line"><span class="s"> summation loop."</span>
</span><span class="line"> <span class="p">(</span><span class="k">let</span> <span class="p">((</span><span class="nv">acc</span> <span class="mf">0d0</span><span class="p">)</span>
</span><span class="line"> <span class="p">(</span><span class="nv">err</span> <span class="mf">0d0</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">dolist</span> <span class="p">(</span><span class="nv">value</span> <span class="nb">values</span> <span class="p">(</span><span class="nv">sum-update-finish</span> <span class="nv">acc</span> <span class="nv">err</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">setf</span> <span class="p">(</span><span class="nb">values</span> <span class="nv">acc</span> <span class="nv">err</span><span class="p">)</span>
</span><span class="line"> <span class="p">(</span><span class="nv">sum-update-up</span> <span class="nv">acc</span> <span class="nv">err</span> <span class="nv">value</span><span class="p">)))))</span>
</span></code></pre></td></tr></table></div>
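<p>As a quick sanity check (a hypothetical REPL sketch, assuming the definitions
above are loaded), <code>sum-up</code> should never return less than the exact
rational sum of its arguments:</p>
<pre><code>CL-USER> (loop repeat 10000
               do (let* ((values (loop repeat 10
                                       collect (- (random 2d0) 1d0)))
                         (exact (reduce #'+ values :key #'rational))
                         (upper (apply #'sum-up values)))
                    ;; the primary value of SUM-UP is the upper bound.
                    (assert (>= (rational upper) exact))))
NIL
</code></pre>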
<p>We need one last thing to implement \(\log {n \choose a}\), and then
Robbins’s confidence sequence: a safely rounded floating-point
approximation of \(-\log \sqrt{2 \pi}\). I precomputed one with
<a href="https://github.com/tarballs-are-good/computable-reals">computable-reals</a>:</p>
<pre><code>CL-USER> (computable-reals:-r
(computable-reals:log-r
(computable-reals:sqrt-r computable-reals:+2pi-r+)))
-0.91893853320467274178...
CL-USER> (computable-reals:ceiling-r
(computable-reals:*r *
(ash 1 53)))
-8277062471433908
-0.65067431749790398594...
CL-USER> (* -8277062471433908 (expt 2d0 -53))
-0.9189385332046727d0
CL-USER> (computable-reals:-r (rational *)
***)
+0.00000000000000007224...
</code></pre>
<p>We can safely replace \(-\log\sqrt{2\pi}\) with
<code>-0.9189385332046727d0</code>, or, equivalently,
<code>(scale-float -8277062471433908.0d0 -53)</code>, for an upper bound.
If we wanted a lower bound, we could decrement the integer significand
by one.</p>
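<p>Concretely, the decrement looks like this (a sketch; only the upper-bound
value appears in the transcript above):</p>
<pre><code>;; Upper bound on -log sqrt(2pi), as computed above:
(scale-float -8277062471433908d0 -53) ; => -0.9189385332046727d0
;; Decrementing the integer significand gives a safe lower bound:
(scale-float -8277062471433909d0 -53)
</code></pre>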
<span>log-choose </span>
<div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class="line-number">1</span>
<span class="line-number">2</span>
<span class="line-number">3</span>
<span class="line-number">4</span>
<span class="line-number">5</span>
<span class="line-number">6</span>
<span class="line-number">7</span>
<span class="line-number">8</span>
<span class="line-number">9</span>
<span class="line-number">10</span>
<span class="line-number">11</span>
<span class="line-number">12</span>
<span class="line-number">13</span>
<span class="line-number">14</span>
<span class="line-number">15</span>
<span class="line-number">16</span>
<span class="line-number">17</span>
<span class="line-number">18</span>
<span class="line-number">19</span>
<span class="line-number">20</span>
<span class="line-number">21</span>
<span class="line-number">22</span>
<span class="line-number">23</span>
<span class="line-number">24</span>
<span class="line-number">25</span>
<span class="line-number">26</span>
<span class="line-number">27</span>
<span class="line-number">28</span>
<span class="line-number">29</span>
<span class="line-number">30</span>
<span class="line-number">31</span>
<span class="line-number">32</span>
<span class="line-number">33</span>
<span class="line-number">34</span>
<span class="line-number">35</span>
</pre></td><td class="code"><pre><code class="lisp"><span class="line"><span class="c1">;;; Upper bound for log c(n, s).</span>
</span><span class="line">
</span><span class="line"><span class="p">(</span><span class="nb">declaim</span> <span class="p">(</span><span class="k">type</span> <span class="kt">double-float</span> <span class="vg">*minus-log-sqrt-2pi*</span><span class="p">))</span>
</span><span class="line"><span class="p">(</span><span class="nb">defvar</span> <span class="vg">*minus-log-sqrt-2pi*</span> <span class="mf">-0.9189385332046727d0</span>
</span><span class="line"> <span class="s">"Smallest double precision value > -log sqrt(2pi)."</span><span class="p">)</span>
</span><span class="line">
</span><span class="line"><span class="p">(</span><span class="nb">declaim</span> <span class="p">(</span><span class="k">ftype</span> <span class="p">(</span><span class="k">function</span> <span class="p">((</span><span class="kt">unsigned-byte</span> <span class="mi">49</span><span class="p">)</span> <span class="p">(</span><span class="kt">unsigned-byte</span> <span class="mi">49</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">values</span> <span class="kt">double-float</span> <span class="kt">double-float</span> <span class="k">&optional</span><span class="p">))</span>
</span><span class="line"> <span class="nv">robbins-log-choose</span><span class="p">))</span>
</span><span class="line"><span class="p">(</span><span class="nb">defun</span> <span class="nv">robbins-log-choose</span> <span class="p">(</span><span class="nv">n</span> <span class="nv">s</span><span class="p">)</span>
</span><span class="line"> <span class="s">"Compute a conservative upper bound on log c(n, s) based on</span>
</span><span class="line"><span class="s"> Robbins's bounds for k!."</span>
</span><span class="line"> <span class="p">(</span><span class="nb">check-type</span> <span class="nv">n</span> <span class="p">(</span><span class="kt">unsigned-byte</span> <span class="mi">49</span><span class="p">))</span> <span class="c1">;; ensure 53 bit arith is exact.</span>
</span><span class="line"> <span class="p">(</span><span class="nb">check-type</span> <span class="nv">s</span> <span class="p">(</span><span class="kt">unsigned-byte</span> <span class="mi">49</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">assert</span> <span class="p">(</span><span class="nb"><=</span> <span class="mi">0</span> <span class="nv">s</span> <span class="nv">n</span><span class="p">))</span>
</span><span class="line"> <span class="c1">;; Handle easy cases, where c(n, s) is 1 or n.</span>
</span><span class="line"> <span class="p">(</span><span class="nb">when</span> <span class="p">(</span><span class="nb">or</span> <span class="p">(</span><span class="nb">=</span> <span class="nv">n</span> <span class="nv">s</span><span class="p">)</span>
</span><span class="line"> <span class="p">(</span><span class="nb">zerop</span> <span class="nv">s</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="k">return-from</span> <span class="nv">robbins-log-choose</span> <span class="p">(</span><span class="nb">values</span> <span class="mf">0d0</span> <span class="mf">0d0</span><span class="p">)))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">when</span> <span class="p">(</span><span class="nb">or</span> <span class="p">(</span><span class="nb">=</span> <span class="nv">s</span> <span class="mi">1</span><span class="p">)</span>
</span><span class="line"> <span class="p">(</span><span class="nb">=</span> <span class="nv">s</span> <span class="p">(</span><span class="nb">1-</span> <span class="nv">n</span><span class="p">)))</span>
</span><span class="line"> <span class="p">(</span><span class="k">return-from</span> <span class="nv">robbins-log-choose</span> <span class="p">(</span><span class="nb">values</span> <span class="p">(</span><span class="nv">log-up</span> <span class="p">(</span><span class="nb">float</span> <span class="nv">n</span> <span class="mf">1d0</span><span class="p">))</span>
</span><span class="line"> <span class="mf">0d0</span><span class="p">)))</span>
</span><span class="line"> <span class="p">(</span><span class="k">let*</span> <span class="p">((</span><span class="nv">n</span> <span class="p">(</span><span class="nb">float</span> <span class="nv">n</span> <span class="mf">1d0</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">s</span> <span class="p">(</span><span class="nb">float</span> <span class="nv">s</span> <span class="mf">1d0</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">n-s</span> <span class="p">(</span><span class="nb">float</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">n</span> <span class="nv">s</span><span class="p">)</span> <span class="mf">1d0</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">l1</span> <span class="p">(</span><span class="nv">next</span> <span class="p">(</span><span class="nb">*</span> <span class="p">(</span><span class="nb">+</span> <span class="nv">n</span> <span class="mf">.5d0</span><span class="p">)</span> <span class="p">(</span><span class="nv">log-up</span> <span class="nv">n</span><span class="p">))))</span> <span class="c1">; (+ n .5d0) is exact.</span>
</span><span class="line"> <span class="p">(</span><span class="nv">l2</span> <span class="p">(</span><span class="nv">next</span> <span class="p">(</span><span class="nb">-</span> <span class="p">(</span><span class="nb">*</span> <span class="p">(</span><span class="nb">+</span> <span class="nv">s</span> <span class="mf">.5d0</span><span class="p">)</span> <span class="p">(</span><span class="nv">log-down</span> <span class="nv">s</span><span class="p">)))))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">l3</span> <span class="p">(</span><span class="nv">next</span> <span class="p">(</span><span class="nb">-</span> <span class="p">(</span><span class="nb">*</span> <span class="p">(</span><span class="nb">+</span> <span class="nv">n-s</span> <span class="mf">.5d0</span><span class="p">)</span> <span class="p">(</span><span class="nv">log-down</span> <span class="nv">n-s</span><span class="p">)))))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">r1</span> <span class="p">(</span><span class="nv">next</span> <span class="p">(</span><span class="nb">/</span> <span class="p">(</span><span class="nb">*</span> <span class="mf">12d0</span> <span class="nv">n</span><span class="p">))))</span> <span class="c1">; (* 12d0 n) is exact.</span>
</span><span class="line"> <span class="p">(</span><span class="nv">r2</span> <span class="p">(</span><span class="nv">next</span> <span class="p">(</span><span class="nb">-</span> <span class="p">(</span><span class="nb">/</span> <span class="p">(</span><span class="nb">1+</span> <span class="p">(</span><span class="nb">*</span> <span class="mf">12d0</span> <span class="nv">s</span><span class="p">))))))</span> <span class="c1">; also exact.</span>
</span><span class="line"> <span class="p">(</span><span class="nv">r3</span> <span class="p">(</span><span class="nv">next</span> <span class="p">(</span><span class="nb">-</span> <span class="p">(</span><span class="nb">/</span> <span class="p">(</span><span class="nb">1+</span> <span class="p">(</span><span class="nb">*</span> <span class="mf">12d0</span> <span class="nv">n-s</span><span class="p">)))))))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">sum-up</span> <span class="vg">*minus-log-sqrt-2pi*</span>
</span><span class="line"> <span class="nv">l1</span> <span class="nv">l2</span> <span class="nv">l3</span>
</span><span class="line"> <span class="nv">r1</span> <span class="nv">r2</span> <span class="nv">r3</span><span class="p">)))</span>
</span></code></pre></td></tr></table></div>
<p>We can quickly check against an exact implementation with
<code>computable-reals</code> and a brute-force factorial.</p>
<pre><code>CL-USER> (defun cr-log-choose (n s)
(computable-reals:-r
(computable-reals:log-r (alexandria:factorial n))
(computable-reals:log-r (alexandria:factorial s))
(computable-reals:log-r (alexandria:factorial (- n s)))))
CR-LOG-CHOOSE
CL-USER> (computable-reals:-r (rational (robbins-log-choose 10 5))
(cr-log-choose 10 5))
+0.00050526703375914436...
CL-USER> (computable-reals:-r (rational (robbins-log-choose 1000 500))
(cr-log-choose 1000 500))
+0.00000005551513197557...
CL-USER> (computable-reals:-r (rational (robbins-log-choose 1000 5))
(cr-log-choose 1000 5))
+0.00025125559085509706...
</code></pre>
<p>That’s not obviously broken: the error is pretty small, and always positive.</p>
<p>Given a function to over-approximate log-choose, the Confidence
Sequence Method’s stopping criterion is straightforward.</p>
<span>csm </span>
<div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class="line-number">1</span>
<span class="line-number">2</span>
<span class="line-number">3</span>
<span class="line-number">4</span>
<span class="line-number">5</span>
<span class="line-number">6</span>
<span class="line-number">7</span>
<span class="line-number">8</span>
<span class="line-number">9</span>
<span class="line-number">10</span>
<span class="line-number">11</span>
<span class="line-number">12</span>
<span class="line-number">13</span>
<span class="line-number">14</span>
<span class="line-number">15</span>
<span class="line-number">16</span>
<span class="line-number">17</span>
<span class="line-number">18</span>
<span class="line-number">19</span>
<span class="line-number">20</span>
<span class="line-number">21</span>
<span class="line-number">22</span>
<span class="line-number">23</span>
<span class="line-number">24</span>
<span class="line-number">25</span>
<span class="line-number">26</span>
<span class="line-number">27</span>
<span class="line-number">28</span>
<span class="line-number">29</span>
<span class="line-number">30</span>
</pre></td><td class="code"><pre><code class="lisp"><span class="line"><span class="p">(</span><span class="nb">declaim</span> <span class="p">(</span><span class="k">ftype</span> <span class="p">(</span><span class="k">function</span> <span class="p">((</span><span class="kt">unsigned-byte</span> <span class="mi">49</span><span class="p">)</span>
</span><span class="line"> <span class="p">(</span><span class="nc">real</span> <span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="p">(</span><span class="mi">1</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="kt">unsigned-byte</span> <span class="mi">49</span><span class="p">)</span>
</span><span class="line"> <span class="nc">real</span><span class="p">)</span>
</span><span class="line"> <span class="p">(</span><span class="nb">values</span> <span class="kt">boolean</span> <span class="kt">double-float</span> <span class="k">&optional</span><span class="p">))</span>
</span><span class="line"> <span class="nv">csm</span><span class="p">))</span>
</span><span class="line"><span class="p">(</span><span class="nb">defun</span> <span class="nv">csm</span> <span class="p">(</span><span class="nv">n</span> <span class="nv">alpha</span> <span class="nv">s</span> <span class="nv">log-eps</span><span class="p">)</span>
</span><span class="line"> <span class="s">"Given n trials and s successes, are we reasonably sure that the</span>
</span><span class="line"><span class="s"> success rate is *not* alpha (with a false positive rate < exp(log-eps))?</span>
</span><span class="line">
</span><span class="line"><span class="s"> Answer that question with Ding, Gandy, and Hahn's confidence</span>
</span><span class="line"><span class="s"> sequence method (CSM). The second return value is an estimate of the</span>
</span><span class="line"><span class="s"> false positive target rate we would need to stop here. This value</span>
</span><span class="line"><span class="s"> should only be used for reporting; the target rate eps should always</span>
</span><span class="line"><span class="s"> be fixed before starting the experiment."</span>
</span><span class="line"> <span class="p">(</span><span class="nb">check-type</span> <span class="nv">n</span> <span class="p">(</span><span class="kt">unsigned-byte</span> <span class="mi">49</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">check-type</span> <span class="nv">alpha</span> <span class="p">(</span><span class="nc">real</span> <span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="p">(</span><span class="mi">1</span><span class="p">)))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">check-type</span> <span class="nv">s</span> <span class="p">(</span><span class="kt">unsigned-byte</span> <span class="mi">49</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">check-type</span> <span class="nv">log-eps</span> <span class="nc">real</span><span class="p">)</span>
</span><span class="line"> <span class="p">(</span><span class="nb">assert</span> <span class="p">(</span><span class="nb"><=</span> <span class="mi">0</span> <span class="nv">s</span> <span class="nv">n</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="k">let*</span> <span class="p">((</span><span class="nv">log-choose</span> <span class="p">(</span><span class="nv">robbins-log-choose</span> <span class="nv">n</span> <span class="nv">s</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">n</span> <span class="p">(</span><span class="nb">float</span> <span class="nv">n</span> <span class="mf">1d0</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">alpha</span> <span class="p">(</span><span class="nb">float</span> <span class="nv">alpha</span> <span class="mf">1d0</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">s</span> <span class="p">(</span><span class="nb">float</span> <span class="nv">s</span> <span class="mf">1d0</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">log-eps</span> <span class="p">(</span><span class="nb">float</span> <span class="nv">log-eps</span> <span class="mf">1d0</span><span class="p">))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">log-level</span> <span class="p">(</span><span class="nv">sum-up</span> <span class="p">(</span><span class="nv">log-up</span> <span class="p">(</span><span class="nb">1+</span> <span class="nv">n</span><span class="p">))</span>
</span><span class="line"> <span class="nv">log-choose</span>
</span><span class="line"> <span class="p">(</span><span class="nv">next</span> <span class="p">(</span><span class="nb">*</span> <span class="nv">s</span> <span class="p">(</span><span class="nv">log-up</span> <span class="nv">alpha</span><span class="p">)))</span>
</span><span class="line"> <span class="p">(</span><span class="nv">next</span> <span class="p">(</span><span class="nb">*</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">n</span> <span class="nv">s</span><span class="p">)</span> <span class="p">(</span><span class="nv">log1p-up</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">alpha</span><span class="p">)))))))</span>
</span><span class="line"> <span class="p">(</span><span class="nb">values</span> <span class="p">(</span><span class="nb"><</span> <span class="nv">log-level</span> <span class="nv">log-eps</span><span class="p">)</span> <span class="nv">log-level</span><span class="p">)))</span>
</span></code></pre></td></tr></table></div>
<p>The other, much harder, part is computing credible (Bayesian)
intervals for the Beta distribution. I won’t go over the code,
but the <a href="https://github.com/pkhuong/csm/blob/47ecdd8f676a213dfb47ad0a329090a4c22b3125/csm.lisp#L465">basic strategy</a> is to invert the CDF, a monotonic function,
by <a href="https://en.wikipedia.org/wiki/Bisection_method">bisection</a><sup id="fnref:3"><a href="#fn:3" rel="footnote">3</a></sup>, and to assume we’re looking for improbable
(\(\mathrm{cdf} < 0.5\)) thresholds. This assumption lets us pick a
<a href="http://dlmf.nist.gov/8.17#ii">simple hypergeometric series</a> that is
normally useless, but
<a href="https://github.com/pkhuong/csm/blob/47ecdd8f676a213dfb47ad0a329090a4c22b3125/csm.lisp#L381">converges well for \(x\) that correspond to such small cumulative probabilities</a>;
when the series converges too slowly,
<a href="https://github.com/pkhuong/csm/blob/47ecdd8f676a213dfb47ad0a329090a4c22b3125/csm.lisp#L482">it’s always conservative to assume that \(x\) is too central</a> (not
extreme enough).</p>
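<p>The bisection step itself is the textbook loop; a minimal sketch
(hypothetical helper, not the actual <code>csm.lisp</code> code) for inverting
a monotonically increasing CDF on the unit interval:</p>
<pre><code>(defun invert-cdf (cdf target &aux (lo 0d0) (hi 1d0))
  "Find x in [0, 1] such that (funcall cdf x) ~ target, for CDF
monotonically increasing on the unit interval. Each iteration halves
the bracket, so 64 iterations reach double-float resolution."
  (loop repeat 64
        for mid = (* 0.5d0 (+ lo hi))
        do (if (< (funcall cdf mid) target)
               (setf lo mid)   ; x lies in the upper half
               (setf hi mid))  ; x lies in the lower half
        finally (return lo)))
</code></pre>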
<p>That’s all we need to demo the code. Looking at the distribution of
fill rates for the 1000 bins @ 30K ball/bin facet in</p>
<p><a href="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/grid.png">
<img class="center" src="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/grid-small.png" />
</a></p>
<p>it looks like we almost always hit at least 97.5% global density,
let’s say with probability at least 98%. We can ask the CSM to tell us
when we have enough data to confirm or disprove that hypothesis, with
a 0.1% false positive rate.</p>
<p>Instead of generating more data on demand, I’ll keep things simple and
prepopulate a list with new independently observed fill rates.</p>
<pre><code>CL-USER> (defparameter *observations* '(0.978518900
0.984687300
0.983160833
[...]))
CL-USER> (defun test (n)
(let ((count (count-if (lambda (x) (>= x 0.975))
*observations*
:end n)))
(csm:csm n 0.98d0 count (log 0.001d0))))
CL-USER> (test 10)
NIL
2.1958681996231784d0
CL-USER> (test 100)
NIL
2.5948497850893184d0
CL-USER> (test 1000)
NIL
-3.0115331544604658d0
CL-USER> (test 2000)
NIL
-4.190687115879456d0
CL-USER> (test 4000)
T
-17.238559826956475d0
</code></pre>
<p>We can also use the inverse Beta CDF to get a 99.9% credible
interval. After 4000 trials, we found 3972 successes.</p>
<pre><code>CL-USER> (count-if (lambda (x) (>= x 0.975))
*observations*
:end 4000)
3972
</code></pre>
<p>These values give us the following lower and upper bounds on the 99.9% CI.</p>
<pre><code>CL-USER> (csm:beta-icdf 3972 (- 4000 3972) 0.001d0)
0.9882119750976562d0
1.515197753898523d-5
CL-USER> (csm:beta-icdf 3972 (- 4000 3972) 0.001d0 t)
0.9963832682169742d0
2.0372679238045424d-13
</code></pre>
<p>And we can even re-use and extend the Beta proportion code from
earlier to generate this embeddable <a href="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/1k-30k-report.svg"><img src="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/1k-30k-report.svg" width="100px" /></a> SVG report.</p>
<p>There’s one small problem with the sample usage above: if we compute
the stopping criterion with a false positive rate of 0.1%, and do the
same for each end of the credible interval, our total false
positive (error) rate might actually be 0.3%! The next section will
address that, and the equally important problem of estimating power.</p>
<h1>Monte Carlo power estimation</h1>
<p>It’s not always practical to generate data forever. For example, we
might want to bound the number of iterations we’re willing to waste in
an automated testing script. When there is a bound on the sample size,
the CSM is still correct, just conservative.</p>
<p>We would then like to know the probability that the CSM will stop
successfully when the underlying success rate differs from the
threshold rate \(p\) (<a href="https://github.com/pkhuong/csm/blob/47ecdd8f676a213dfb47ad0a329090a4c22b3125/csm.lisp#L291"><code>alpha</code> in the code</a>). The problem here is
that, for any bounded number of iterations, we can come up with an
underlying success rate so close to \(p\) (but still different) that
the CSM can’t reliably distinguish between the two.</p>
<p>If we want to be able to guarantee any termination rate, we need <em>two</em>
thresholds: the CSM will stop whenever it’s likely that the underlying
success rate differs from either of them. The hardest probability to
distinguish from both thresholds is close to the midpoint between them.</p>
<p>With two thresholds and the credible interval, we’re
running three tests in parallel. I’ll apply a
<a href="https://en.wikipedia.org/wiki/Bonferroni_correction">Bonferroni correction</a>,
and use \(\varepsilon / 3\) for each of the two CSM tests, and
\(\varepsilon / 6\) for each end of the CI.</p>
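<p>The union bound makes the arithmetic easy to check: the split below adds up
to exactly \(\varepsilon\) (a sketch of the allocation; the real logic lives
in <code>csm-driver</code>):</p>
<pre><code>(defun split-error-budget (eps)
  "Bonferroni split of a total false positive budget EPS:
eps/3 for each of the two CSM tests, eps/6 for each CI endpoint.
2 * eps/3 + 2 * eps/6 = eps, so the union bound holds."
  (values (/ eps 3)     ; CSM test against the low threshold
          (/ eps 3)     ; CSM test against the high threshold
          (/ eps 6)     ; lower end of the credible interval
          (/ eps 6)))   ; upper end of the credible interval
</code></pre>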
<p>That <a href="https://github.com/pkhuong/csm/blob/47ecdd8f676a213dfb47ad0a329090a4c22b3125/csm.lisp#L542">logic is encapsulated in <code>csm-driver</code></a>.
We only have to pass a
success value generator function to the driver. In our case, the
generator is itself a call to <code>csm-driver</code>, with fixed thresholds
(e.g., 96% and 98%), and a Bernoulli sampler (e.g., return <code>T</code> with
probability 97%). We can see if the driver returns successfully <em>and
correctly</em> at each invocation of the generator function, with the
parameters we would use in production, and recursively compute
an estimate for that procedure’s success rate with CSM. The following
expression simulates a CSM procedure with thresholds at 96% and 98%,
the (usually unknown) underlying success rate in the middle, at 97%, a
false positive rate of at most 0.1%, and an iteration limit of ten thousand
trials. We pass that simulation’s result to <code>csm-driver</code>, and ask
whether the simulation’s success rate differs from 99%, while allowing
one in a million false positives.</p>
<div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class="line-number">1</span>
<span class="line-number">2</span>
<span class="line-number">3</span>
<span class="line-number">4</span>
<span class="line-number">5</span>
<span class="line-number">6</span>
<span class="line-number">7</span>
<span class="line-number">8</span>
<span class="line-number">9</span>
<span class="line-number">10</span>
<span class="line-number">11</span>
<span class="line-number">12</span>
<span class="line-number">13</span>
<span class="line-number">14</span>
<span class="line-number">15</span>
<span class="line-number">16</span>
<span class="line-number">17</span>
<span class="line-number">18</span>
<span class="line-number">19</span>
<span class="line-number">20</span>
<span class="line-number">21</span>
<span class="line-number">22</span>
<span class="line-number">23</span>
<span class="line-number">24</span>
<span class="line-number">25</span>
<span class="line-number">26</span>
<span class="line-number">27</span>
<span class="line-number">28</span>
<span class="line-number">29</span>
<span class="line-number">30</span>
<span class="line-number">31</span>
<span class="line-number">32</span>
<span class="line-number">33</span>
<span class="line-number">34</span>
<span class="line-number">35</span>
</pre></td><td class="code"><pre><code class=""><span class="line">CL-USER> (labels ((bernoulli (i &amp;aux (p 0.97d0))
</span><span class="line"> (declare (ignore i))
</span><span class="line"> (< (random 1d0) p))
</span><span class="line"> (generator (i &amp;aux (p 0.97d0)
</span><span class="line"> (alpha 0.96d0) (alpha-hi 0.98d0)
</span><span class="line"> (eps 1d-3) (max-count 10000))
</span><span class="line"> (declare (ignore i))
</span><span class="line"> (multiple-value-bind (success success-hi estimate)
</span><span class="line"> (csm:csm-driver #'bernoulli alpha eps
</span><span class="line"> :alpha-hi alpha-hi
</span><span class="line"> :max-count max-count)
</span><span class="line"> ;; check that the CSM succeeds, and that it does so
</span><span class="line"> ;; with correct estimates.
</span><span class="line"> (let ((correct-alpha (if (< p alpha)
</span><span class="line"> (< estimate alpha)
</span><span class="line"> (> estimate alpha)))
</span><span class="line"> (correct-hi (if (< p alpha-hi)
</span><span class="line"> (< estimate alpha-hi)
</span><span class="line"> (> estimate alpha-hi))))
</span><span class="line"> (cond ((and success success-hi)
</span><span class="line"> (and correct-alpha correct-hi))
</span><span class="line"> (success
</span><span class="line"> correct-alpha)
</span><span class="line"> (success-hi
</span><span class="line"> correct-hi)
</span><span class="line"> (t
</span><span class="line"> nil))))))
</span><span class="line"> (csm:csm-driver #'generator 0.99d0 1d-6))
</span><span class="line">T
</span><span class="line">T
</span><span class="line">1.0d0
</span><span class="line">2210
</span><span class="line">2210
</span><span class="line">0.993145939238895d0
</span><span class="line">0.9999999998869291d0</span></code></pre></td></tr></table></div>
<p>We find that yes, we can expect the 96%/98%/0.1% false positive/10K
iterations setup to succeed more than 99% of the time. The
<a href="https://github.com/pkhuong/csm/blob/47ecdd8f676a213dfb47ad0a329090a4c22b3125/csm.lisp#L690">code above is available as <code>csm-power</code></a>,
with a tighter outer false positive rate of 1e-9. If we only allow
1000 iterations, <code>csm-power</code> quickly tells us that, with one CSM
success in 100 attempts, we can expect the CSM success rate to be less
than 99%.</p>
<pre><code>CL-USER> (csm:csm-power 0.97d0 0.96d0 1000 :alpha-hi 0.98d0 :eps 1d-3 :stream *standard-output*)
1 0.000e+0 1.250e-10 10.000e-1 1.699e+0
10 0.000e+0 0.000e+0 8.660e-1 1.896e+1
20 0.000e+0 0.000e+0 6.511e-1 3.868e+1
30 0.000e+0 0.000e+0 5.099e-1 5.851e+1
40 2.500e-2 5.518e-7 4.659e-1 7.479e+1
50 2.000e-2 4.425e-7 3.952e-1 9.460e+1
60 1.667e-2 3.694e-7 3.427e-1 1.144e+2
70 1.429e-2 3.170e-7 3.024e-1 1.343e+2
80 1.250e-2 2.776e-7 2.705e-1 1.542e+2
90 1.111e-2 2.469e-7 2.446e-1 1.741e+2
100 1.000e-2 2.223e-7 2.232e-1 1.940e+2
100 iterations, 1 successes (false positive rate < 1.000000e-9)
success rate p ~ 1.000000e-2
confidence interval [2.223495e-7, 0.223213 ]
p < 0.990000
max inner iteration count: 816
T
T
0.01d0
100
1
2.2234953205868331d-7
0.22321314110840665d0
</code></pre>
<h1>SLO-ify all the things with this Exact test</h1>
<p>Until now, I’ve only used the Confidence Sequence Method (CSM) for
Monte Carlo simulation of phenomena that are naturally seen as boolean
success / failures processes. We can apply the same CSM to implement
an <a href="https://en.wikipedia.org/wiki/Exact_test">exact test</a> for null
hypothesis testing, with a bit of resampling magic.</p>
<p>Looking back at the balls and bins grid, the average fill rate seems
to be slightly worse for 100 bins @ 60K ball/bin than for 1000 bins
@ 128K ball/bin. How can we test that with the CSM?</p>
<p><a href="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/grid.png">
<img class="center" src="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/grid-small.png" />
</a></p>
<p>First, we should get a fresh dataset for the two setups we wish to
compare.</p>
<pre><code>CL-USER> (defparameter *100-60k* #(0.988110167
0.990352500
0.989940667
0.991670667
[...]))
CL-USER> (defparameter *1000-128k* #(0.991456281
0.991559578
0.990970109
0.990425805
[...]))
CL-USER> (alexandria:mean *100-60k*)
0.9897938
CL-USER> (alexandria:mean *1000-128k*)
0.9909645
CL-USER> (- * **)
0.0011706948
</code></pre>
<p>The mean for 1000 bins @ 128K ball/bin is slightly higher than that
for 100 bins @ 60k ball/bin. We will now simulate the null hypothesis
(in our case, that the distributions for the two setups are
identical), and determine how rarely we observe a difference of
<code>0.00117</code> in means. I only use a null hypothesis where the
distributions are identical for simplicity; we could use the same
resampling procedure to simulate distributions that, e.g., have
identical shapes, but one is shifted right of the other.</p>
<p>In order to simulate our null hypothesis, we want to be as close to
the test we performed as possible, with the only difference being that
we generate data by reshuffling from our observations.</p>
<pre><code>CL-USER> (defparameter *resampling-data* (concatenate 'simple-vector *100-60k* *1000-128k*))
*RESAMPLING-DATA*
CL-USER> (length *100-60k*)
10000
CL-USER> (length *1000-128k*)
10000
</code></pre>
<p>The two observation vectors have the same size, 10000 values; in
general, that’s not always the case, and we must make sure to
replicate the sample sizes in the simulation. We’ll generate our
simulated observations by shuffling the <code>*resampling-data*</code> vector,
and splitting it in two subvectors of ten thousand elements.</p>
<pre><code>CL-USER> (let* ((shuffled (alexandria:shuffle *resampling-data*))
(60k (subseq shuffled 0 10000))
(128k (subseq shuffled 10000)))
(- (alexandria:mean 128k) (alexandria:mean 60k)))
6.2584877e-6
</code></pre>
<p>We’ll convert that to a truth value by comparing the difference of
simulated means with the difference we observed in our real data,
\(0.00117\ldots\), and declare success when the simulated difference
is at least as large as the actual one. This approach gives us a
one-sided test; a two-sided test would compare the absolute
values of the differences.</p>
<pre><code>CL-USER> (csm:csm-driver
(lambda (_)
(declare (ignore _))
(let* ((shuffled (alexandria:shuffle *resampling-data*))
(60k (subseq shuffled 0 10000))
(128k (subseq shuffled 10000)))
(>= (- (alexandria:mean 128k) (alexandria:mean 60k))
0.0011706948)))
0.005 1d-9 :alpha-hi 0.01 :stream *standard-output*)
1 0.000e+0 7.761e-11 10.000e-1 -2.967e-1
10 0.000e+0 0.000e+0 8.709e-1 -9.977e-1
20 0.000e+0 0.000e+0 6.577e-1 -1.235e+0
30 0.000e+0 0.000e+0 5.163e-1 -1.360e+0
40 0.000e+0 0.000e+0 4.226e-1 -1.438e+0
50 0.000e+0 0.000e+0 3.569e-1 -1.489e+0
60 0.000e+0 0.000e+0 3.086e-1 -1.523e+0
70 0.000e+0 0.000e+0 2.718e-1 -1.546e+0
80 0.000e+0 0.000e+0 2.427e-1 -1.559e+0
90 0.000e+0 0.000e+0 2.192e-1 -1.566e+0
100 0.000e+0 0.000e+0 1.998e-1 -1.568e+0
200 0.000e+0 0.000e+0 1.060e-1 -1.430e+0
300 0.000e+0 0.000e+0 7.207e-2 -1.169e+0
400 0.000e+0 0.000e+0 5.460e-2 -8.572e-1
500 0.000e+0 0.000e+0 4.395e-2 -5.174e-1
600 0.000e+0 0.000e+0 3.677e-2 -1.600e-1
700 0.000e+0 0.000e+0 3.161e-2 2.096e-1
800 0.000e+0 0.000e+0 2.772e-2 5.882e-1
900 0.000e+0 0.000e+0 2.468e-2 9.736e-1
1000 0.000e+0 0.000e+0 2.224e-2 1.364e+0
2000 0.000e+0 0.000e+0 1.119e-2 5.428e+0
NIL
T
0.0d0
2967
0
0.0d0
0.007557510165262294d0
</code></pre>
<p>We tried to replicate the difference 2967 times, and did not succeed
even once. The CSM stopped us there, and we find a CI for the
probability of observing our difference, under the null hypothesis, of
<code>[0, 0.007557]</code> (i.e., \(p < 0.01\)). Or, for a graphical summary, <a href="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/exact-report.svg"><img src="http://www.pvk.ca/images/2018-07-06-testing-slo-type-properties-with-the-confidence-sequence-method/exact-report.svg" width="100px" /></a>.
We can also test for a lower \(p\)-value by changing the
thresholds and running the simulation more times (around thirty
thousand iterations for \(p < 0.001\)).</p>
<p>This experiment lets us conclude that the
difference in mean fill rate between 100 bins @ 60K ball/bin and
1000 @ 128K is probably not due to chance: it’s unlikely
that we would observe such a difference between data sampled from
the same distribution. In other words, “I’m confident that the fill rate for
1000 bins @ 128K ball/bin is greater than for 100 bins @ 60K
ball/bin, because it would be highly unlikely to observe a difference in means
that extreme if they had the same distribution (\(p < 0.01\))”.</p>
<p>In general, we can use this exact test when we have two sets of
observations, \(X\sb{0}\) and \(Y\sb{0}\), and a statistic
\(f\sb{0} = f(X\sb{0}, Y\sb{0})\), where \(f\) is a pure
function (the extension to three or more sets of observations is
straightforward).</p>
<p>The test lets us determine the likelihood of observing
\(f(X, Y) \geq f\sb{0}\)
(we could also test for \(f(X, Y) \leq f\sb{0}\)), if
\(X\) and \(Y\) were taken from similar distributions, modulo simple
transformations (e.g., \(X\)’s mean is shifted compared to \(Y\)’s, or
the latter’s variance is double the former’s).</p>
<p>We answer that question by repeatedly sampling without replacement
from \(X\sb{0} \cup Y\sb{0}\) to generate \(X\sb{i}\) and
\(Y\sb{i}\), such that \(|X\sb{i}| = |X\sb{0}|\) and
\(|Y\sb{i}| = |Y\sb{0}|\) (e.g., by shuffling a vector and
splitting it in two). We can apply any simple transformation here
(e.g., increment every value in \(Y\sb{i}\) by \(\Delta\) to shift
its mean by \(\Delta\)). Finally, we check if
\(f(X\sb{i}, Y\sb{i}) \geq f\sb{0} = f(X\sb{0}, Y\sb{0})\); if
so, we return success for this iteration, otherwise failure.</p>
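<p>The recipe above fits in one function (my code: it assumes the
<code>csm-driver</code> interface used in the transcripts earlier, and
takes the statistic \(f\) as a function argument):</p>
<pre><code>(defun exact-test (x0 y0 f alpha eps)
  "Estimate how often F(X, Y) >= F(X0, Y0) under the null hypothesis
that X0 and Y0 were sampled from the same distribution."
  (let* ((f0 (funcall f x0 y0))
         (pool (concatenate 'simple-vector x0 y0))
         (cut (length x0)))  ; replicate the original sample sizes
    (csm:csm-driver
     (lambda (i)
       (declare (ignore i))
       ;; Resample without replacement: shuffle the pooled data, then
       ;; split it back into |X0| and |Y0| values.
       (let ((shuffled (alexandria:shuffle (copy-seq pool))))
         (>= (funcall f (subseq shuffled 0 cut) (subseq shuffled cut))
             f0)))
     alpha eps)))
</code></pre>
<p>Calling it with the mean-difference statistic from the previous
section should reproduce the run above, modulo shuffling randomness.</p>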
<p>The loop above is a Bernoulli process that generates independent,
identically distributed (assuming the random sampling is correct)
truth values, and its success rate is equal to the probability of
observing a value for \(f\) “as extreme” as \(f\sb{0}\) under the
null hypothesis. We use the CSM with false positive rate
\(\varepsilon\) to know when to stop generating more
values and compute a credible interval for the probability under the
null hypothesis. If that probability is low (less than some
predetermined threshold, like \(\alpha = 0.001\)), we infer that the
null hypothesis does not hold, and declare that the difference in our
sample data points at a real difference in distributions. If we do
everything correctly (<em>cough</em>), we will have implemented an Atlantic
City procedure that fails with probability \(\alpha + \varepsilon\).</p>
<p>Personally, I often just set the threshold and the false positive
rate unreasonably low and handwave some Bayes.</p>
<h1>That’s all!</h1>
<p>I pushed
<a href="https://github.com/pkhuong/csm">the code above, and much more, to github</a>,
in Common
Lisp, C, and Python (probably Py3, although 2.7 might work). Hopefully
anyone can run with the code and use it to test, not only
<a href="https://landing.google.com/sre/book/chapters/service-level-objectives.html">SLO</a>-type
properties, but also answer more general questions, with an exact
test. I’d love to have ideas or contributions on the usability front.
I have some
<a href="https://github.com/pkhuong/csm/blob/master/attic/beta-distribution.lisp">throw-away code in <code>attic/</code></a>,
which I used to generate the SVG in this post, but it’s not great. I
also feel like I can do something to make it easier to stick the logic
in shell scripts and continuous testing pipelines.</p>
<p>When I passed around a first draft for this post, many readers that
could have used the CSM got stuck on the process of moving from
mathematical expressions to computer code; not just how to do it,
but, more fundamentally, why we can’t just transliterate Greek to
C or CL. I hope this revised post is clearer. Also, I hope it’s clear
that the reason I care so much about not introducing false positives
via rounding isn’t that I believe they’re likely to make a difference,
but simply that I want peace of mind with respect to numerical issues;
I really don’t want to be debugging some issue in my tests and have to
wonder if it’s all just caused by numerical errors.</p>
<p>The reason I care so much about making sure users can understand what
the CSM code does (and why it does what it does) is that I strongly
believe we should minimise dependencies whose inner working we’re
unable to (legally) explore. Every abstraction leaks, and leakage is
particularly frequent in failure situations. We may not need to
understand magic if everything works fine, but everything breaks
eventually, and that’s when expertise is most useful. When shit’s on
fire, we must be able to break the abstraction and understand how the
magic works, and how it fails.</p>
<p>This post only tests ideal SLO-type properties (and
regular null hypothesis tests translated to SLO properties),
properties of the form “I claim that this indicator satisfies
$PREDICATE x% of the time, with false positive rate y%” where the
indicator’s values are independent and identically distributed.</p>
<p>The last assumption is rarely <em>truly</em> satisfied in practice. I’ve seen
an interesting choice, where the <a href="https://landing.google.com/sre/book/chapters/service-level-objectives.html">service level objective</a> is defined in
terms of a sample of production requests, which can be replayed, shuffled,
etc. to ensure i.i.d.-ness. If the nature of the traffic changes
abruptly, the SLO may not be representative of behaviour in
production; but, then again, how could the service provider have
guessed the change was about to happen? I like this approach because
it is amenable to predictive statistical analysis, and incentivises
communication between service users and providers, rather than users
assuming the service will gracefully handle radically new crap being
thrown at it.</p>
<p>Even if we have a representative sample of production, it’s not true
that the <a href="https://landing.google.com/sre/book/chapters/service-level-objectives.html">service level indicators</a>
for individual requests are distributed identically. There’s an easy
fix for the CSM and our credible intervals: generate
i.i.d. <em>sets of requests</em> by resampling (e.g., shuffle the
requests sample) and count successes and failures for individual
requests, but only test for CSM termination after each resampled set.</p>
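<p>A sketch of that scheme (all names here are hypothetical; in
particular, <code>csm-stop-p</code> stands in for whatever confidence
sequence test one consults between batches):</p>
<pre><code>;; Accumulate per-request successes and failures, but only consult the
;; stopping rule at resampled-set boundaries, so each test of the
;; confidence sequence sees an i.i.d. unit of work.
(defun run-until-stopped (requests run-request csm-stop-p)
  (let ((successes 0)
        (trials 0))
    (loop
      (let ((set (alexandria:shuffle (copy-seq requests)))) ; fresh i.i.d. set
        (loop for request across set
              do (incf trials)
                 (when (funcall run-request request)
                   (incf successes))))
      ;; Only here, between sets, may we terminate.
      (when (funcall csm-stop-p successes trials)
        (return (values successes trials))))))
</code></pre>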
<p>On a more general note, I see the Binomial and Exact tests as
instances of a general pattern that avoids intuitive functional decompositions
that create subproblems that are harder to solve than the original
problem. For example, instead of trying to directly determine how
frequently the SLI satisfies some threshold, it’s natural to
first fit a distribution on the SLI, and then compute percentiles on
that distribution. Automatically fitting an arbitrary distribution is
hard, especially with the weird outliers computer systems spit
out. Reducing to a Bernoulli process before applying statistics is
much simpler. Similarly, rather than coming up with analytical
distributions in the Exact test, we brute-force the problem by
resampling from the empirical data. I have more examples from online
control systems... I guess the moral is to be wary of decompositions
where internal subcomponents generate intermediate values that are
richer in information than the final output.</p>
<p><small>Thank you Jacob, Ruchir, Barkley, and Joonas for all the
editing and restructuring comments.</small></p>
<div class="footnotes">
<hr />
<ol>
<li id="fn:1">
<p>Proportions are unscaled probabilities that don’t have to sum or integrate to 1. Using proportions instead of probabilities tends to make calculations simpler, and we can always get a probability back by rescaling a proportion by the inverse of its integral.<a href="#fnref:1" rev="footnote">↩</a></p></li>
<li id="fn:2">
<p>Instead of a \(\mathrm{Beta}(a+1, b+1)\), they tend to bound with a \(\mathrm{Beta}(a, b)\). The difference is marginal for double-digit \(n\).<a href="#fnref:2" rev="footnote">↩</a></p></li>
<li id="fn:3">
<p>I used the bisection method instead of more sophisticated ones with better convergence, like <a href="https://en.wikipedia.org/wiki/Newton%27s_method">Newton’s method</a> or the derivative-free <a href="https://en.wikipedia.org/wiki/Secant_method">Secant method</a>, because bisection already adds one bit of precision per iteration, only needs a predicate that returns “too high” or “too low,” and is easily tweaked to be conservative when the predicate declines to return an answer.<a href="#fnref:3" rev="footnote">↩</a></p></li>
</ol>
</div>Fri, 06 Jul 2018 22:02:40 GMTEugene Zaikonnikov: A tiny Lisp bytecode interpreter in Z-80 assemblyhttp://blog.funcall.org//lisp/2018/07/03/spectrum-lisp-vm/
http://blog.funcall.org//lisp/2018/07/03/spectrum-lisp-vm/
<p>It all started with a raid on a long abandoned hosting service. I had seen a mention of it in the news, which led to a vague recollection of using it for something. The email address associated with the account was long defunct, and the service itself changed ownership a few times in the past two decades. But incredibly, I could recall the login credentials <em>and</em> they still worked.</p>
<p>Amazingly, in a pile of abandoned HTML templates, obsolete software archives and Under Construction GIFs, there was a source file for a project I long considered lost. It's a minimal <a href="https://github.com/varjagg/zxlispvm">Lisp bytecode interpreter</a> written in assembly for the ZX Spectrum along the lines of MIT <a href="https://dspace.mit.edu/handle/1721.1/5731">AIM-514</a>. Save for address locations and maybe a couple of ROM calls for error reporting, it's generic Z-80 code.</p>
<p>It was part of a bigger project that should have included a primitive REPL, but no trace of that was found. Also, I am quite sure there is a heinous bug lurking in the mark&amp;sweep GC. I should really find time to finally debug that!</p>Mon, 02 Jul 2018 15:00:00 GMTPaul Khuong: An Old Conjecture on Stream Transducershttp://www.pvk.ca/Blog/2018/06/24/an-old-conjecture-on-stream-transducers/
http://www.pvk.ca/Blog/2018/06/24/an-old-conjecture-on-stream-transducers/
<p>I’ve been thinking about stream processing again, and came back to
<a href="https://www.pvk.ca/Blog/Lisp/Pipes/introducing_pipes.html">an old pick-two-of-three conjecture</a>
of mine: for stream processing without dynamic allocation, “arbitrary
outputs per input, multiple consumers, multiple producers: choose
two.”</p>
<p>The question is interesting because stream processing
in constant space is a subset of <a href="https://complexityzoo.uwaterloo.ca/Complexity_Zoo:L">L</a>
(or <a href="https://en.wikipedia.org/wiki/FL_(complexity)">FL</a>), and thus
probably not P-complete, let alone Turing complete. Having easily characterisable
subsets of stream processing that can be implemented in constant
space would be a boon for the usability of stream DSLs.</p>
<p>I think I find this academic trope as suspicious as
<a href="https://twitter.com/DRMacIver">@DRMacIver</a>
does, so I have mixed feelings about the fact that this one still
feels true seven years later.</p>
<blockquote class="tw-center-align twitter-tweet"><p lang="en" dir="ltr">Is it just me or do impossibility theorems which claim "these three obviously desirable properties cannot simultaneously be satisfied" always include at least one obviously undesirable or at least suspicious property?</p>— David R. MacIver (@DRMacIver) <a href="https://twitter.com/DRMacIver/status/1008977092028059648?ref_src=twsrc%5Etfw">June 19, 2018</a></blockquote>
<p>The main reason I believe in this conjecture is the following
example, <code>F(S(X), X)</code>, where <code>S</code> is the function that takes a stream
and outputs every other value. Or, more formally, \(F\sb{i} = f(X\sb{2i}, X\sb{i})\).</p>
<p><img class="center" src="http://www.pvk.ca/images/2018-06-24-an-old-conjecture-on-stream-transducers/network.png" /></p>
<p>Let’s say <code>X</code> is some stream of values that can’t be easily
re-computed (e.g., each output value is the result of a slow
computation). How do we then compute <code>F(S(X), X)</code> without either
recomputing the stream <code>X</code>, or buffering an unbounded amount of past
values from that stream? I don’t see a way to do so, not just in any
stream processing DSL (domain specific language), but also in any
general purpose language.</p>
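<p>A quick sketch (my code, not any particular DSL) makes the buffering
concrete: to pair \(X\sb{2i}\) with \(X\sb{i}\) from a single
non-replayable stream, the live window \(X\sb{i} \ldots X\sb{2i}\)
holds \(i + 1\) values at step \(i\), and thus grows without bound:</p>
<pre><code>(defun first-n-of-f (f next-x n)
  "Compute F[i] = (f X[2i] X[i]) for i < n, where NEXT-X is a thunk
producing successive values of a stream we cannot replay."
  (let ((buffer (make-array 0 :adjustable t :fill-pointer t)))
    (flet ((x-ref (i)
             ;; Pull from the stream until X[i] is buffered.
             (loop while (<= (fill-pointer buffer) i)
                   do (vector-push-extend (funcall next-x) buffer))
             (aref buffer i)))
      (loop for i below n
            ;; This buffer keeps everything for simplicity; even a
            ;; smarter implementation that discards X[0 .. i-1] must
            ;; keep indices i .. 2i live: i+1 values at step i.
            collect (funcall f (x-ref (* 2 i)) (x-ref i))))))
</code></pre>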
<p>For me, the essence of the problem is that the two inputs to <code>F</code> are
out of sync with respect to the same source of values, <code>X</code>: one
consumes two values of <code>X</code> per invocation of <code>F</code>, and the other only
one. This issue could also occur if we forced stream transducers
(processing nodes) to output a fixed number of values at each
invocation: let <code>S</code> repeat each value of <code>X</code> twice,
i.e., interleave <code>X</code> with <code>X</code> (\(F\sb{i} = f(X\sb{\lfloor i / 2\rfloor}, X\sb{i})\)).</p>
<p>Forcing each invocation of a transducer to always produce exactly one
value is one way to rule out this class of stream processing network.
Two other common options are to forbid either forks (everything is
single-use, or subtrees are copied and recomputed for each reuse) or
joins (only single-input stream processing nodes).</p>
<p>I don’t think this turtle-and-hare desynchronisation problem is a
weakness in stream DSLs; I only see a reasonable task that can’t be
performed in constant space. Given the existence of such tasks, I’d
like to see stream processing DSLs be explicit about the tradeoffs
they make to balance performance <em>guarantees</em>, expressiveness, and
usability, especially when it comes to the performance model.</p>Mon, 25 Jun 2018 01:21:23 GMTChristophe Rhodes: sbcl method tracinghttp://christophe.rhodes.io/notes/blog/posts/2018/sbcl_method_tracing/
http://christophe.rhodes.io/notes/blog/posts/2018/sbcl_method_tracing/
<p>Since
<a href="https://sourceforge.net/p/sbcl/sbcl/ci/9d36021d86b7db7561b2edc40324c8e5229f88b3">approximately forever</a>,
<a href="http://sbcl.org/">sbcl</a> has advertised the possibility of tracing
individual methods of a generic function by passing <code>:methods t</code> as an
argument to <a href="http://www.xach.com/clhs?q=trace"><code>trace</code></a>. Until
recently, tracing methods was only supported using the <code>:encapsulate
nil</code> style of tracing, modifying the compiled code for function
objects directly.</p>
<p>For a variety of reasons, the alternative <code>:encapsulate t</code>
implementation of tracing, effectively wrapping the function with some
code to run around it, is more robust. One problem with <code>:encapsulate
nil</code> tracing is that if the object being traced is a closure, the
modification of the function's code will affect all of the closures,
not just any single one - closures are distinct objects with distinct
closed-over environments, but they share the same executable code, so
modifying one of them modifies all of them. However, the
implementation of method tracing I wrote in 2005 - essentially,
finding and tracing the method functions and the method fast-functions
(on which more later) - was fundamentally incompatible with
encapsulation; the method functions are essentially never called by
name by CLOS, but by more esoteric means.</p>
<p>What are those esoteric means, I hear you ask?! I'm glad I can hear
you. The Metaobject Protocol defines a method calling convention,
such that method calls receive as two arguments firstly: the entire
argument list as the method body would expect to handle; and secondly:
the list of sorted applicable next methods, such that the first
element is the method which should be invoked if the method
uses
<a href="http://www.xach.com/clhs?q=call-next-method"><code>call-next-method</code></a>. So
a method function conforming to this protocol has to:</p>
<ol>
<li>destructure its first argument to bind the method parameters to the
arguments given;</li>
<li>if <code>call-next-method</code> is used, reconstruct an argument list (in
general, because the arguments to the next method need not be the
same as the arguments to the existing method) before calling the
next method's method-function with the reconstructed argument list
and the rest of the next methods.</li>
</ol>
<p>But! For a given set of actual arguments, for that call, the set of
applicable methods is known; the precedence order is known; and, with
a bit of bookkeeping in the implementation
of <a href="http://www.xach.com/clhs?q=defmethod"><code>defmethod</code></a>, whether any
individual method actually calls <code>call-next-method</code> is known. So it
is possible, at the point of calling a generic-function with a set of
arguments, to know not only the first applicable method, but in fact
all the applicable methods, their ordering, and the combination of
those methods that will actually get called (which is determined by
whether methods invoke <code>call-next-method</code> and also by the generic
function's <a href="http://christophe.rhodes.io/notes/blog/posts/2018/sbcl_method-combination_fixes/">method combination</a>).</p>
<p>Therefore, a sophisticated (and by "sophisticated" here I mean
"written by the wizards
at <a href="https://en.wikipedia.org/wiki/PARC_(company)">Xerox PARC</a>")
implementation of CLOS can compile an effective method for a given
call, resolve all the next-method calls, perform
some
<a href="http://www.sbcl.org/sbcl-internals/Slot_002dValue.html#Slot_002dValue">extra optimizations</a> on
<a href="http://www.xach.com/clhs?q=slot-value"><code>slot-value</code></a> and slot
accessors, improve the calling convention (we no longer need the list
of next methods, but only a single next effective-method, so we can
spread the argument list once more), and cache the resulting function
for future use. So the one-time cost for each set of applicable
methods generates an optimized effective method, making use of
fast-method-functions with the improved calling convention.</p>
<p>Here's the trick, then: this effective method is compiled into a chain
of <code>method-call</code> and <code>fast-method-call</code> objects, which call their
embedded functions. This, then, is ripe for encapsulation; to allow
method tracing, all we need to do is arrange at
<code>compute-effective-method</code> time that those embedded functions are
wrapped in code that performs the tracing, and that any attempt to
<code>untrace</code> the generic function (or to modify the tracing parameters)
reinitializes the generic function instance, which clears all the
effective method caches. And then Hey Bob, Your Uncle's Presto! and
everything works.</p>
<pre><code>(defgeneric foo (x)
(:method (x) 3))
(defmethod foo :around ((x fixnum))
(1+ (call-next-method)))
(defmethod foo ((x integer))
(* 2 (call-next-method)))
(defmethod foo ((x float))
(* 3 (call-next-method)))
(defmethod foo :before ((x single-float))
'single)
(defmethod foo :after ((x double-float))
'double)
</code></pre>
<p>Here's a generic function <code>foo</code> with moderately complex methods. How
can we work out what is going on? Call the method tracer!</p>
<pre><code>CL-USER> (foo 2.0d0)
0: (FOO 2.0d0)
1: ((SB-PCL::COMBINED-METHOD FOO) 2.0d0)
2: ((METHOD FOO (FLOAT)) 2.0d0)
3: ((METHOD FOO (T)) 2.0d0)
3: (METHOD FOO (T)) returned 3
2: (METHOD FOO (FLOAT)) returned 9
2: ((METHOD FOO :AFTER (DOUBLE-FLOAT)) 2.0d0)
2: (METHOD FOO :AFTER (DOUBLE-FLOAT)) returned DOUBLE
1: (SB-PCL::COMBINED-METHOD FOO) returned 9
0: FOO returned 9
9
</code></pre>
<p>This mostly works. It doesn't <em>quite</em> handle all cases, specifically
when the CLOS user adds a method and implements <code>call-next-method</code> for
themselves:</p>
<pre><code>(add-method #'foo
(make-instance 'standard-method
:qualifiers '()
:specializers (list (find-class 'fixnum))
:lambda-list '(x)
:function (lambda (args nms) (+ 2 (funcall (sb-mop:method-function (first nms)) args (rest nms))))))
CL-USER> (foo 3)
0: (FOO 3)
1: ((METHOD FOO :AROUND (FIXNUM)) 3)
2: ((METHOD FOO (FIXNUM)) 3)
2: (METHOD FOO (FIXNUM)) returned 8
1: (METHOD FOO :AROUND (FIXNUM)) returned 9
0: FOO returned 9
9
</code></pre>
<p>In this trace, we have lost the method trace from the direct call to
the <code>method-function</code>, <em>and</em> calls that that function makes; this is
the cost of performing the trace in the effective method, though a
mitigating factor is that we have visibility of method combination
(through the <code>(sb-pcl::combined-method foo)</code> line in the trace above).
It would probably be possible to do the encapsulation in the method
object itself, by modifying the function and the fast-function, but
this requires rather more book-keeping and (at least theoretically)
breaks the object identity: we do not have licence to modify the
function stored in a method object. So, for now, sbcl has this
imperfect solution for users to try (expected to be in sbcl-1.4.9,
probably released towards the end of June).</p>
<p>(I can't really believe it's taken me twelve years to do this. Other
implementations have had this working for years. Sorry!)</p>Fri, 15 Jun 2018 21:00:21 GMTQuicklisp news: No May 2018 Quicklisp dist updatehttp://blog.quicklisp.org/2018/06/no-may-2018-quicklisp-dist-update.html
http://blog.quicklisp.org/2018/06/no-may-2018-quicklisp-dist-update.html
The computer on which I make Quicklisp builds stopped working a little while ago, and I haven't had time to dive in and work on it. As soon as it's fixed, I'll prepare and release a new dist. Sorry about the inconvenience!Fri, 01 Jun 2018 12:50:00 GMTChristophe Rhodes: sbcl method-combination fixeshttp://christophe.rhodes.io/notes/blog/posts/2018/sbcl_method-combination_fixes/
http://christophe.rhodes.io/notes/blog/posts/2018/sbcl_method-combination_fixes/
<p>At
the
<a href="https://www.european-lisp-symposium.org/2018/index.html">2018 European Lisp Symposium</a>,
the most obviously actionable feedback
for <a href="http://www.sbcl.org/">SBCL</a> from a presentation was
from
<a href="https://www.european-lisp-symposium.org/static/2018/verna.pdf">Didier's remorseless deconstruction</a> of
SBCL's support for method combinations (along with the lack of
explicitness about behavioural details in
the
<a href="http://www.xach.com/clhs?q=define-method-combination">ANSI CL specification</a> and
the
<a href="http://metamodular.com/CLOS-MOP/method-combinations.html">Art of the Metaobject Protocol</a>).
I don't think that Didier meant to imply that SBCL was particularly
bad at method combinations, compared with other available
implementations - merely that SBCL was a convenient target. And, to
be fair, there
was <a href="https://bugs.launchpad.net/sbcl/+bug/309084">a bug report</a> from a
discussion with Bruno Haible back in SBCL's history - May/June 2004,
according to my search - which had languished largely unfixed for
fourteen years.</p>
<p>I <a href="http://christophe.rhodes.io/notes/blog/posts/2018/els2018_reflections/">said</a> that I found the Symposium energising.
And what better use to put that energy than addressing user feedback?
So, I spent a bit of time earlier this month thinking, fixing and
attempting to work out what behaviours might actually be useful. To
be clear, SBCL's support for <code>define-method-combination</code> was (probably)
standards-compliant in the usual case, but if you follow the links
from above, or listen to Didier's talk, you will be aware that that's
not saying all that much, in that almost nothing is specified about
behaviours under redefinition of method combinations.</p>
<p>So, to start with, I solved the cache invalidation (one of
the
<a href="https://martinfowler.com/bliki/TwoHardThings.html">hardest problems in Computer Science</a>),
making sure that discriminating functions and effective methods are
reset and invalidated for all affected generic functions. This was
slightly complicated by the strategy that SBCL has of distinguishing
short and long <code>method-combination</code>s with distinct classes (and distinct
implementation strategies
for
<a href="http://metamodular.com/CLOS-MOP/compute-effective-method.html"><code>compute-effective-method</code></a>);
but this just needed to be methodical and careful. Famous last words:
I think that all method-combination behaviour in SBCL is now coherent
and should meet user expectations.</p>
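<p>To make the intended behaviour concrete, here is a small sketch (not
taken from SBCL's test suite; the names are illustrative) of what
coherence under redefinition should mean: redefining a method
combination must invalidate the cached effective methods of every
generic function that uses it.</p>
<pre><code>(define-method-combination plus () ((primary ()))
  `(+ ,@(mapcar (lambda (m) `(call-method ,m)) primary)))
(defgeneric f (x)
  (:method-combination plus)
  (:method (x) x))
(f 3) ; => 3
;; redefine the combination; F's cached effective method must be discarded
(define-method-combination plus () ((primary ()))
  `(* 2 (+ ,@(mapcar (lambda (m) `(call-method ,m)) primary))))
(f 3) ; now => 6, without F or its methods having been redefined
</code></pre>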
<p>More interesting, I think, was coming up with test cases for desired
behaviours. Method combinations are not, I think, widely used in
practice; whether that is because of lack of support, lack of
understanding or lack of need of what they provide, I don't know. (In
fact in conversations at ELS we discussed another possibility, which
is that everyone is more comfortable customising
<code>compute-effective-method</code> instead - both that and
<code>define-method-combination</code> provide ways for inserting arbitrary code
for the effect of a generic function call with particular arguments.)
But what this means is that there isn't, as far as I know at least, a
large corpus of interesting method combinations to play with.</p>
<p>One interesting one which came up: <code>Bike</code> on <code>#lisp</code> designed
an
<a href="https://gist.github.com/Bike/5ca14ba142f3ca3fc65e4c912f4cde9f">implementation using method-combinations of finite state machines</a>,
which I adapted to add to SBCL's test suite. My version looks like:</p>
<pre><code>(define-method-combination fsm (default-start)
  ((primary *))
  (:arguments &key start)
  `(let ((state (or ,start ',default-start)))
     (restart-bind
         (,@(mapcar (lambda (m) `(,(first (method-qualifiers m))
                                  (lambda ()
                                    (setq state (call-method ,m))
                                    (if (and (typep state '(and symbol (not null)))
                                             (find-restart state))
                                        (invoke-restart state)
                                        state))))
                    primary))
       (invoke-restart state))))
</code></pre>
<p>and there will be more on this use
of <a href="http://www.xach.com/clhs?q=restart-bind"><code>restart-bind</code></a> in a
later post, I hope. Staying on the topic of method combinations, how
might one use this <code>fsm</code> method combination? A simple example might
be to recognize strings with an even number of <code>#\a</code> characters:</p>
<pre><code>;;; first, define something to help with all string parsing
(defclass info ()
  ((string :initarg :string)
   (index :initform 0)))
;;; then the state machine itself
(defgeneric even-as (info &key &allow-other-keys)
  (:method-combination fsm :yes))
(defmethod even-as :yes (info &key)
  (with-slots ((s string) (i index)) info
    (cond ((= i (length s)) t)
          ((char= (char s i) #\a) (incf i) :no)
          (t (incf i) :yes))))
(defmethod even-as :no (info &key)
  (with-slots ((s string) (i index)) info
    (cond ((= i (length s)) nil)
          ((char= (char s i) #\a) (incf i) :yes)
          (t (incf i) :no))))
</code></pre>
<p>(Exercise for the reader: adapt this to implement a Turing Machine)</p>
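<p>For concreteness, driving the machine looks like this (a sketch;
<code>:yes</code> is the start state supplied as the
<code>default-start</code> argument above, and each call uses a fresh
<code>info</code> instance since the index slot is stateful):</p>
<pre><code>(even-as (make-instance 'info :string "abab")) ; => T   (two #\a characters)
(even-as (make-instance 'info :string "abc"))  ; => NIL (odd number of #\a's)
</code></pre>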
<p>Another example of (I think) an interesting method combination was one
which I came up with in the context of generalized specializers, for
an ELS a while ago:
the <a href="http://christophe.rhodes.io/notes/blog/posts/2014/http-content-negotiation-and-generalized-specializers/">HTTP Request method-combination</a> to
be used
with <a href="http://research.gold.ac.uk/9924/">HTTP Accept specializers</a>.
I'm interested in more! A github search
found
<a href="https://github.com/sirherrbatka/herrblog/blob/aa53949b984bd8eecb34a046f7d1c2d1060a4f2d/src/common/method-combinations.lisp">some</a> <a href="https://github.com/sellout/method-combination-utilities/blob/9abd3ffd8c10cbaa16664dec05e0db9c87ab51dc/src/method-combinations.lisp">examples</a> before
I ran out of patience; do you have any examples?</p>
<p>And I have one further question. The method combination takes
arguments at generic-function definition time (the <code>:yes</code> in
<code>(:method-combination fsm :yes)</code>). Normally, arguments to things are
evaluated at the appropriate time. At the moment, SBCL (and indeed
all other implementations I tested, but that's not strong evidence
given the shared heritage) do not evaluate the arguments to
<code>:method-combination</code> - treating it more like a macro call than a
function call. I'm not sure that is the most helpful behaviour, but
I'm struggling to come up with an example where the other is
definitely better. Maybe something like</p>
<pre><code>(let ((lock (make-lock)))
  (defgeneric foo (x)
    (:method-combination locked lock)
    (:method (x) ...)))
</code></pre>
<p>Which would allow automatic locking around the effective method of
<code>FOO</code> through the method combination? I need to think some more here.</p>
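<p>(For what it's worth, under evaluated-argument semantics that
hypothetical <code>locked</code> combination might be sketched as
follows, assuming a <code>with-lock-held</code> macro such as the one
in bordeaux-threads; the lock object itself would be spliced into the
effective method as a literal:)</p>
<pre><code>(define-method-combination locked (lock)
  ((primary ()))
  `(bt:with-lock-held (,lock)
     (call-method ,(first primary))))
</code></pre>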
<p>In any case: the <code>method-combination</code> fixes are in the current SBCL
<code>master</code> branch, shortly to be released as sbcl-1.4.8. And there is
still time (though not very much!) to apply for
the
<a href="https://www.jobs.ac.uk/search/?keywords=goldsmiths&salary_from=&salary_to=&category=0600&category=0700&jobtype=&location=01§or=&x=41&y=5">many jobs</a> advertised
at
<a href="https://www.gold.ac.uk/">Goldsmiths</a> <a href="https://www.gold.ac.uk/computing/">Computing</a> -
what better things to do on a Bank Holiday weekend?</p>Sat, 26 May 2018 08:02:23 GMTMarco Antoniotti: Some updates: bugs fixing and CLAD.http://within-parens.blogspot.com/2018/05/some-updates-bugs-fixing-and-clad.html
http://within-parens.blogspot.com/2018/05/some-updates-bugs-fixing-and-clad.html
<div dir="ltr">
Hello there,<br />
<br />
it has been a very long time since I posted here, but most recently, thanks to a couple of pesky bug reports on <a href="http://helambdap.sourceforge.net/" target="_blank">HEΛP</a> by Mirko Vukovic, and because I had a couple of days relatively free, I was able to go back to do some programming, fix (some of) the bugs and post here.<br />
<br />
Here is the story. There were two bugs which I had to deal with (there are more, of course).<br />
<ol>
<li>A bug triggered by CCL and its implementation of the Common Lisp Reader algorithm.</li>
<li>A buglet due to missing supporting data (.css files in this case) in the deployment of the <a href="http://helambdap.sourceforge.net/" target="_blank">HEΛP</a> generated documentation.</li>
</ol>
The first bug was quite difficult to track down, and it boiled down to CCL bailing out on <span>READ</span> in an unexpected way (that is, with respect to other implementations). As an aside, this is a problem of the standard not having a predefined condition for "<i>error caused by the reader because it does not find a package</i>"; LW has <span>CONDITIONS:PACKAGE-NOT-FOUND-READER</span>, but that is not standard, and some implementations just signal <span>READER-ERROR</span> or <span>PACKAGE-ERROR</span>. The error was easy to "fix" once diagnosed: just don't process files that you know will be problematic, and <a href="http://helambdap.sourceforge.net/" target="_blank">HEΛP</a> can already "exclude" such files.<br />
<br />
The second bug was easier to diagnose, but the fix was more complicated (especially due to the NIH syndrome I suffer from). The problem is that ASDF moves the compiled code around, but not auxiliary data, like in my case, .css files. I could have followed what ASDF does, but I decided to go another way and came up with a small library I called <i>Common Lisp Application Data</i> (<a href="https://gitlab.common-lisp.net/mantoniotti/CLAD" target="_blank">CLAD</a>, because you need to "dress" your code).<br />
<br />
<h3>
<a href="https://gitlab.common-lisp.net/mantoniotti/CLAD" target="_blank">CLAD</a></h3>
By now, at least on Windows 10 and Mac OS X (and Linux), there is a notion of an <i>Application and Data Folder</i>. The user version of this folder (as opposed to the system one) is <span>~/Library/</span> on Mac OS X and <span>%USERPROFILE%\AppData\Roaming\</span> (this is the "roaming" profile in W10 parlance). For Linux there are several semi-standards, one of which is the <a href="https://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html" target="_blank">XDG base directory specification</a>; in this case a common solution is to use the <span>~/.config/</span> folder.<br />
<br />
<a href="https://gitlab.common-lisp.net/mantoniotti/CLAD" target="_blank">CLAD</a> relies on these "fixed" locations to create per-application or per-library subfolders of a "Common Lisp" one. That is, <a href="https://gitlab.common-lisp.net/mantoniotti/CLAD" target="_blank">CLAD</a> ensures the presence of the following folders in your account.<br />
<ul>
<li><span>Mac Os X: </span><span>~/Library/Common Lisp/</span></li>
<li><span>Windows 10: </span><span>%USERPROFILE%\AppData\Roaming\Common Lisp\</span></li>
<li><span>Linux/Unix: </span><span>~/.config/common-lisp/</span><span> (in this case, I am following ASDF's lead)</span></li>
</ul>
The library exports three simple functions,<br />
<ol>
<li><span><b>user-cl-data-folder</b></span>, which returns the pathnames of the folders above.</li>
<li><span><b>ensure-app-or-library-data-folder</b></span>, which, as the name implies, ensures that a subfolder exists in the proper location.</li>
<li><span><b>app-or-library-data-folder</b></span>, which returns the pathname associated with a library or app.</li>
</ol>
A library or app can now set itself up by doing something like<br />
<br />
<b><span>(defparameter *helambdap-data-folder*</span></b><br />
<b><span> (clad:ensure-app-or-library-data-folder "HELambdaP")</span></b><br />
<b><span> "The user HELambdaP data folder.")</span></b><br />
<br />
On Mac OS X, this results in the folder <span>~/Library/Common Lisp/HELambdaP</span>; a library or an application can now rely on a clear space in which to store "common" data files. For <a href="http://helambdap.sourceforge.net/" target="_blank">HEΛP</a> it solved the problem of where to find the .css files in a reliable place.<br />
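<p>(For illustration, the other two entry points; the pathname shown is
what one would expect on Mac OS X, not verified output:)</p>
<pre><code>(clad:app-or-library-data-folder "HELambdaP")
;; e.g. => #P"/Users/me/Library/Common Lisp/HELambdaP/"
</code></pre>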
<br />
Trivial? Yes.<br />
NIH syndrome? Of course.<br />
Complete? No.<br />
Useful? You be the judge of that.<br />
<br />
<hr />
(cheers)<br />
<br /></div>Thu, 17 May 2018 21:44:00 GMTLispjobs: Lisp Developer, 3E, Brussels, Belgiumhttps://lispjobs.wordpress.com/2018/05/15/lisp-developer-3e-brussels-belgium/
https://lispjobs.wordpress.com/2018/05/15/lisp-developer-3e-brussels-belgium/
<p>See: <a href="http://3eeu.talentfinder.be/en/vacature/30101/lisp-developer">http://3eeu.talentfinder.be/en/vacature/30101/lisp-developer</a></p>
<blockquote><p>You join a team of developers, scientists, engineers and business developers that develop, operate and commercialize SynaptiQ worldwide.</p>
<p>You work in a Linux-based Java, Clojure and Common Lisp environment. Your focus is on the development, maintenance, design and unit testing of SynaptiQ's real-time aggregation and alerting engine that processes time-series and events. This data engine is Common Lisp based.</p>
<p>The objective is to own the entire lifecycle of the platform, that is from the architecture and development of new features to the deployment and operation of the platform in production environment. The position is open to candidates with no knowledge of LISP if they have a good affinity and experience in functional languages.</p></blockquote>Tue, 15 May 2018 13:45:46 GMTChristophe Rhodes: algorithms and data structures term2http://christophe.rhodes.io/notes/blog/posts/2018/algorithms_and_data_structures_term2/
http://christophe.rhodes.io/notes/blog/posts/2018/algorithms_and_data_structures_term2/
<p>I presented some of the work on teaching algorithms and data
structures at
the
<a href="https://www.european-lisp-symposium.org/2018/index.html">2018 European Lisp Symposium</a>.</p>
<p>Given that I wanted to go to the symposium (and <a href="http://christophe.rhodes.io/notes/blog/posts/2018/els2018_reflections/">I'm glad I
did!</a>), the most economical method for going was
if I presented research work - because then there was a reasonable
chance that <a href="https://www.gold.ac.uk/">my employer</a> would fund the
expenses (spoiler: they did; thank you!). It might perhaps be
surprising to hear that they don't usually directly fund attending
events where one is not presenting; on the other hand, it's perhaps
reasonable on the basis that part of an academic's job as a scholar
and researcher is to be creating and disseminating new knowledge, and
of course universities, like any organizations, need to prioritise
spending money on things which bring value or further the
organization's mission.</p>
<p>In any case, I found that I wanted to write about the teaching work
that I have been doing, and in particular I chose to write about a
small, Lisp-related aspect. Specifically, it is now fairly normal in
technical subjects to perform a lot of automated testing of students;
it relieves the burden on staff to assess things which can be
mechanically assessed, and deliver feedback to individual students
which can be delivered automatically; this frees up staff time to
perform targeted interventions, give better feedback on more
qualitative aspects of the curriculum, or work fewer weekends of the
year. A large part of my teaching work for the last 18 months has
been developing material for these automated tests, and working on the
infrastructure underlying them, for my and colleagues' teaching.</p>
<p>So, the more that we can test automatically <em>and meaningfully</em>, the
more time we have to spend on other things. The main novelty here,
and the lisp-related hook for the paper I submitted to ELS, was being
able to give meaningful feedback on numerical answer questions which
probed whether students were developing a good mental model of the
meaning of pseudocode. That's a bit vague; let's be specific and
consider the <code>break</code> and <code>continue</code> keywords:</p>
<pre><code>x ← 0
for 0 ≤ i < 9
    x ← x + i
    if x > 17
        continue
    end if
    x ← x + 1
end for
return x
</code></pre>
<p>The above pseudocode is typical of what a student might see; the
question would be "what does the above block of pseudocode return?",
which is mildly arithmetically challenging, particularly under time
pressure, but the conceptual aspect that was being tested here was
whether the student understood the effect of <code>continue</code>. Therefore,
it is important to give the student specific feedback; the more
specific, the better. So if a student answered 20 to this question
(as if the <code>continue</code> acted as a <code>break</code>), they would receive a
specific feedback message reminding them about the difference between
the two operators; if they answered 45, they received a message
reminding them that <code>continue</code> has a particular meaning in loops; and
any other answers received generic feedback.</p>
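<p>(To check the arithmetic, here is a quick sketch, in Common Lisp
rather than the pseudocode, of the fragment under the three readings a
student might apply to <code>continue</code>:)</p>
<pre><code>(defun run-fragment (reading) ; READING is :continue, :break or :ignored
  (let ((x 0))
    (dotimes (i 9 x)
      (incf x i)
      (unless (and (> x 17)
                   (case reading
                     (:break (return x)) ; as if CONTINUE were BREAK
                     (:continue t)       ; skip the rest of the body
                     (:ignored nil)))    ; fall through regardless
        (incf x 1)))))
;; (run-fragment :continue) => 41, the correct answer
;; (run-fragment :break)    => 20, feedback on BREAK vs CONTINUE
;; (run-fragment :ignored)  => 45, feedback on what CONTINUE means
</code></pre>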
<p>Having just one of these questions does no good, though. Students
will go to almost any lengths to avoid learning things, and it is easy
to communicate answers to multiple-choice and short-answer questions
among a cohort. So, I needed hundreds of these questions: at least
one per student, but in fact by design the students could take these
multiple-choice quizzes multiple times, as they are primarily an aid
for the students themselves, to help them discover what they know.</p>
<p>Now of course I could treat the above pseudocode fragment as a
template, parameterise it (initial value, loop bounds, increment) and
compute the values needing the specific feedback in terms of the
values of the parameters. But this generalizes badly: what happens
when I decide that I want to vary the operators (say to introduce
multiplication) or modify the structure somewhat (<em>e.g.</em> by swapping
the two increments before and after the <code>continue</code>)? The
parametrization gets more and more complicated, the chances of (my)
error increase, and perhaps most importantly it's not any fun.</p>
<p>Instead, what did I do? With some sense of grim inevitability, I
evolved (or maybe accreted) an interpreter (in emacs lisp) for a
sexp-based representation of this pseudocode. At the start of the
year, it's pretty simple; towards the end it has developed into an
almost reasonable mini-language. Writing the interpreter is
straightforward, though the way it evolved into one gigantic <code>case</code>
statement for supported operators rather than having reasonable
semantics is a bit of a shame; as a bonus, implementing a
pretty-printer for the sexp-based pseudocode, with correct indentation
and keyword highlighting, is straightforward. Then armed with the
pseudocode I will ask the students to interpret, I can mutate it in
ways that I anticipate students might think like (replacing <code>continue</code>
with <code>break</code> or <code>progn</code>) and interpret that form to see which wrong
answer should generate what feedback.</p>
<p>Anyway, that was the hook. There's some evidence
in <a href="https://research.gold.ac.uk/23155/">the paper</a> that the general
approach of repeated micro-assessment, and also the consideration
of likely student mistakes and giving specific feedback, actually
works. And now that the (provisional) results are in, how does this
term compare with <a href="http://christophe.rhodes.io/notes/blog/posts/2018/algorithms_and_data_structures_term1/">last term</a>?
We can look at the relationship between this term's marks and last
term's. What should we be looking for? Generally, I would expect
marks in the second term's coursework to be broadly similar to the
marks in the first term - all else being equal, students who put in a
lot of effort and are confident with the material in term 1 are likely
to have an easier time integrating the slightly more advanced material
in term 2. That's not a deterministic rule, though; some students
will have been given a wake-up call by the term 1 marks, and equally
some students might decide to coast.</p>
<p><a href="http://christophe.rhodes.io/notes/blog/posts/2018/algorithms_and_data_structures_term2/term2-vs-term1.png"><img src="http://christophe.rhodes.io/notes/blog/posts/2018/algorithms_and_data_structures_term2/200x-term2-vs-term1.png" width="200" height="150" alt="plot of term 2 marks against term 1: a = 0.82, R² = 0.67" class="img" /></a></p>
<p>I've asked R to draw the regression line in the above picture; a
straight line fit seems reasonable based on the plot. What are the
statistics of that line?</p>
<pre><code>R> summary(lm(Term2~Term1, data=d))

Call:
lm(formula = Term2 ~ Term1, data = d)

Residuals:
    Min      1Q  Median      3Q     Max
-41.752  -6.032   1.138   6.107  31.155

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.18414    4.09773   0.777    0.439
Term1        0.82056    0.05485  14.961   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.46 on 107 degrees of freedom
  (32 observations deleted due to missingness)
Multiple R-squared:  0.6766,  Adjusted R-squared:  0.6736
F-statistic: 223.8 on 1 and 107 DF,  p-value: < 2.2e-16
</code></pre>
<p>Looking at the summary above, we have a strong positive relationship
between term 1 and term 2 marks. The intercept is approximately zero
(if you got no marks in term 1, you should expect no marks in term 2),
and the slope is less than one: on average, each mark a student got in
term 1 tended to convert to 0.8 marks in term 2 - this is plausibly
explained by the material being slightly harder in term 2, and by the
fact that some of the assessments were more explicitly designed to
allow finer discrimination at the top end - marks in the 90s. (A note
for international readers: in the UK system, the pass mark is 40%,
excellent work is typically awarded a mark in the 70% range - marks of
90% should be reserved for exceptional work). The average case is,
however, only that: there was significant variation from that average
line, and indeed (looking at the quartiles) over 50% of the cohort was
more than half a degree class (5 percentage points) away from their
term 2 mark as "predicted" from their mark for term 1.</p>
<p>All of this seems reasonable, and it was a privilege to work with this
cohort of students, and to present the sum of their interactions on
this course to the audience I had. I got the largest round of
applause, I think, for revealing that as part of the peer-assessment I
had required that students run each other's code. I also had to
present some of the context for the work; not only because this was an
international gathering, with people in various university systems and
from industry, but also because of the large-scale disruption caused
by
<a href="https://www.ucu.org.uk/strikeforuss">industrial</a> <a href="https://ussbriefs.com/">action</a> over
the <a href="https://www.uss.co.uk/">Universities Superannuation Scheme</a> (the
collective, defined benefit pension fund for academics at about 68
Universities and ~300 other bodies associated with Higher Education).
Perhaps most gratifyingly, students were able to continue learning
despite being deprived of their tuition for three consecutive weeks;
judging by their performance on the various assessments so far, the
disruption does not seem to have set them back substantially.</p>
<p>And now? The students will sit an exam, after which I and colleagues
will look in detail at those results and the relationship with the
students' coursework marks (as I did <a href="http://christophe.rhodes.io/notes/blog/posts/2017/analysing_algorithms_and_data_structures_data/">last
year</a>). I will
continue developing this material (my board for this module currently
lists 33 todo items), and adapt it for next year and for new cohorts.
And maybe you will join me?
The <a href="https://www.doc.gold.ac.uk/computing/">Computing department</a>
at <a href="https://www.gold.ac.uk">Goldsmiths</a> is hiring lecturers and senior
lecturers to come and participate in research, scholarship and
teaching in computing:
a
<a href="https://jobs.gold.ac.uk/vacancy/lecturer-in-creative-computing-348799.html">lecturer in creative computing</a>,
a
<a href="https://jobs.gold.ac.uk/vacancy/lecturer-in-computer-games-348630.html">lecturer in computer games</a>,
a
<a href="https://jobs.gold.ac.uk/vacancy/lecturer-in-data-science-348531.html">lecturer in data science</a>,
a
<a href="https://jobs.gold.ac.uk/vacancy/lecturer-in-physical-and-creative-computing-348527.html">lecturer in physical and creative computing</a>,
a
<a href="https://jobs.gold.ac.uk/vacancy/lecturer-in-computer-science-348441.html">lecturer in computer science</a> and
a
<a href="https://jobs.gold.ac.uk/vacancy/senior-lecturer-in-computer-science-348301.html">senior lecturer in computer science</a>.
Anyone reading this is welcome
to <a href="mailto:c.rhodes@gold.ac.uk">contact me</a> to find out more!</p>Tue, 08 May 2018 19:17:43 GMT