Thursday, June 8, 2017

The Making of Cells: A Case Study in Dumb Luck

Preamble

Almost ten years ago I wrote up a library I had developed twelve years earlier, in another post named The Cells Manifesto.

Cells and the many other dataflow/reactive libraries like it are great but developers generally are immune to the paradigm shift. I do not feel bad: they are also immune to Lisp.

And Lisp is quite the analog for the dataflow paradigm: the deader each gets, the more you run into them. C++ and Java have discovered lambda, and good luck finding a UI framework that does not have "data-binding" or some reactive hack behind it.

What the Hell does this have to do with the invention of Cells? Well, I have a chance to land my second Cells user and he keeps saying he needs a deeper understanding and I flashed back to a crazy talk before Jay Sulzberger's Linux group on the Isle of Manhattoes as Jay likes to say in which a digression on Lisp led to an unplanned re-invention of Cells over the keyboard as a pack of FSF "C" programmers looking over my shoulder shouted out Lisp bug fixes (they did pretty good, and loved Cells) and it occurred to me that maybe the problem is that when programmers get excited about a new coding trick they start chewing the scenery with Holy Grail hyperbole and turn everyone off. Where was I?

Oh, right. The idea here is to finally get across the incredible power of dataflow by just talking about the problem we were trying to solve and how we solved it and how we then fixed the solution and slippery sloped up to dataflow.

The Problem: Dynamic, Nested UI Layout

We were building an educational math app aimed at struggling students, so the last thing we wanted was for them to be fighting to type in their math. We developed a WYSIWYG maths editor as a start and at a higher level of layout worried about them running out of room as they worked away. Resize bars need not apply, these kids and grownups are unhappy already.

I came up with a dynamic layout calculation scheme and on like day two my lead developer JD reported a problem with my approach. The use case was quite specific: fractions.

Background: the geometry I conceived for any visual element consisted of (a) that element's (x,y) offset within its parent container, and (b) the element's local bounds: left, top, right, bottom relative to (0,0) (hence "local"). Where did a box get drawn? Sum the offsets of my containers starting from the window's (0,0) and shift the local bounding box accordingly.
Fun note: a later subordinate complained (to others) that my geometry was understandable only to myself. Harrumph. I felt better when I learned OpenGL has the same system.
 So what was the problem with fractions? My algorithm hoped to calculate a child's geometry before calculating its parent's geometry, but to keep the operands of a fraction centered vertically the numerator, e.g., had to be horizontally inset half the difference between the fraction's width and the numerator's width. But the fraction did not know its width until the numerator had decided its width, because the fraction would take the max of the numerator and denominator. So neither child nor parent could go first.

Ok, ok: later I realized fractions should have a vertical alignment attribute saying "centered" and that the numerator should simply position itself horizontally at minus half its width, but I looked at this accidental use case and realized my overall idea of deciding an element's entire geometry in one go would never hunt. I also knew there was no circularity in fact, only in my algorithm.

"Everyone to the whiteboard!", I called out as if it were a breach in the city wall of Harfleur.

The Invention

True to my insight that it was not just fractions, I started with laying out the problem number (be it "1." or "42(a)."), the problem statement (be it "Simplify" or "Solve and characterize as conditional, contradiction, or identity") and below them the problem of any size the student typed in.

Step Zero

"OK, what is the left edge of the problem number?", I asked rhetorically, accidentally solving the first problem: my "baby steps" habit when struggling with hard problems had led me to contemplate the derivation of one bit of the widget's geometry, not its full geometry, 

"Zero," I concluded. "Same with the top." Unlike OpenGL, screen geometries tend to grow as you go down the screen.

"We are on a roll," I said, sensing Harfleur was ours. "Now what is the right edge?"

Step 1

 Nah, we need some caveats here.

One. All code below was just typed in using the Blogger text editor. Parens will not balance, typos will abound.

Two. If you do not know Lisp, the code may be a challenge. A good example is that (defstruct a b c) is a raely used short from that defines a struct of type 'a with slots b and c.

Three, Hell, it has been a year since I coded Lisp in anger. If something below looks like Clojure, it probably is.

Four. Goofs and obscurity aside, if like me you stare hard at examples and try to work out what must be going on, you will come away with a ton of questions and "that won't work!"s. The story here of incremental elaboration continued for weeks at a steady pace and for years in fits and starts as new challenges to the scheme arose. Maybe this one time, do not stare too hard.

We return you now to our show, picking up with Step 1 where, after luckily shifting to slot-wise thinking for the left edge (and top) of a widget, I contemplated the right edge of the same widget, which we wanted to grow and shrink as the student edited the problem number.

Step 1

"OK, the right edge is the left edge plus the string width of the problem number given the font metrics of the chosen font," I stated rather obviously since we were already doing that. But Harfleur was ours.

A value had been specified as a function of other values. When making the instance of the textedit widget, we would not specify a reasonable width of 42, too big for most problem numbers and scrolling for larger numbers, we would specify:

  (make-instance 'textedit
    ...etc...
    :lr (lambda (self)
          (+ (ll self)
             (fontStringWidth (text self) (math-font *app*))))

Cool. We never again have to worry about in which order to calculate things: other values we read will be calculated JIT. Er, how?

We just need an accessor (reader) that knows what we are up to (and a new defmodel macro to write all these accessors):

    (defmethod lr ((self textedit))
        (let ((sv (slot-value self 'lr)))
            (if (functionp sv)
               (funcall sv self)
               sv)))

Yeah, we need to do better because a function is a fine value for a slot so we cannot assume it is a rule to be funcalled, but it is a fair assumption for screen locations and it will not last long anyway.

JD knocked it off in an afternoon and we were delighted. Then of course the wheels came off: forever recomputing geometry was too slow. Note that this is an exponential problem, because property P might  compute off several other properties and several other properties might compute off P, and recursively so as container after nested container computes their properties.

Step 2

We cache the calculation, and now we need a struct:

     (defstruct cell rule value)

Note this solves our earlier problem: we no longer need to assume a function is a formula. Indeed, a function is a function, we are looking for cells now.

Brilliant: when we get a Mac OS update event we recompute the whole geometry but never recompute anything twice. We clear all the cached values and recompute and redraw. 

That lasted two days. Back then 100mhz was fast so soon enough we had enough stuff on the screen that a full recomputation too slow.

The good news is that I knew it would not scale, but when it comes to optimizing performance it is always nice to be asked. We had been asked.

Step 3

To get past this we have to recalculate only as needed. If the student types another digit in the denominator, we need to recalculate the denominator's width, recalculate its horizontal offset within the fraction, and -- iff the denominator is wider than the numerator -- recalculate the width of the fraction and pass that up the container hierarchy.

Note the happy optimization: if the denominator was sufficiently less wide than the numerator, the fraction would decide on the same width as before and there would be no need to recompute further up the container hierarchy. I digress.

So how do we decide "recalculate only as needed"?

   (defstruct cell rule value users)

A cell needs to know what other cells used its value in their calculations, so it can tell them to recalculate if it changes. Hold onto your hats (and this is just the first gust):

   (defmethod lr ((self textedit))
        (let ((sv (slot-value self 'lr)))
            (if (typep sv 'cell)
                (progn
                  (when *user*
                      (c-record-user sv *user*))
                  (let ((*user* sv))
                      (setf (value sv) (funcall rule self))
                sv)))

Ya gotta love Lisp special variables: above we are at once (a) tracking who might be using us and (b) letting anyone our rule happens to access know that we are using them. What may not jump out at the reader is that this works recursively because, at start-up of new structure, the variables our rule accesses may have to JIT compute their values, rebinding *user* to themselves.

Thanks to Lisp macros, this does not require too much typing (so the semantics stand out). First, the macro:

   (defmacro c? (rule)
       `(make-instance 'cell
            :rule (lambda (self)
                         ,rule)))

(The sick thing above is that the lambda parameter self will be captured by the code of the rule. Hygiene? We don't need no stinkin' hygiene!)

And now:

  (make-instance 'textedit
    ...etc...
    :lr (c? (+ (ll self)
               (fontStringWidth (text self) *math-font*)))

But that only records the dependency. Gust two:

    (defmethod (setf lr) (new-value self)
      (let ((sv (slot-value self 'lr)))
        (if (typep sv 'cell)
           (progn
              (when (not (eql new-value (value sv)))
                (setf (value sv) new-value)
                (dolist (user (users sv))
                   (c-recompute user))) ;; <---- dominos="" fall="" font="" the="">
              new-value)
           (setf (slot-value self 'lr) new-value))))

Note how propagation halts on a non-changing change. We are really making functional scream.

Note also that I was so focused on making functional fast that I missed that the world had turned upside down.


Those are inversion goggles. Wear them for a few days non-stop and suddenly the world becomes easily navigable; the brain flips the world back, as if a single bit controlled our vision.

In this case, the better metaphor might be of a river changing direction. Yes, the original problem knowing in what order to pull information into a computation, but after tweak number three the library was about empowering some initial change in state to domino efficiently to other state. Pull had become push, causation had been harnessed.

There we go again, chewing the scenery. Let's do practical again: how does the screen actually get updated? Let us just look at the snippet above where change is detected, now with a simple trick to do more than just change other computed values:

      (when (not (eql new-value (value sv)))
         (c-observe self 'lr new-value (value sv)) ;; <--- font="" the="" trick="">
         (setf (value sv) new-value)
         (dolist (user (users sv))
            (c-recompute user)))

c-observe is a generic multi-method we author any way we like as an "on-change" callback:

   (defmethod c-observe ((self vis-obj) (slot (eql 'lr)) new old)
       ... insert here code to trigger the update of the
       union of the old + new rectangles of self...)

Not only are we arranging for efficient, minimal, and complete change, we now have a hook to augment change processing arbitrarily.

One final point: we mentioned up front that nowadays everyone is doing reactive. That is true, but Hoplon/Javelin is the only one that does not require explicit subscribe and notify (and that works by code inspection so it recomputes excessively). Thanks to Lisp macros and some relatively modest engineering, subscribe looks like this (again):

  (make-instance 'textedit
    ...etc...
    :lr (c? (+ (ll self)
               (fontStringWidth (text self) *math-font*)))

...and notify looks like this:

     (setf (text self) (concatenate 'string (text self) new-char))

ie, Invisible. No need to break my coding flow and code up a publisher at a new source and subscription here, I just write my code and let the engine note the dependency. The more complicated a UI gets, the exponentially greater this win.







3 comments:

  1. love the tone and mindfulness of this. 'chewing the scenery' is a great expression!

    ReplyDelete
  2. Thanks, Johann!

    btw, Nice catch: I love that "chewing" expression. What a great, hilarious mental image. Leave it to stage folk.

    ReplyDelete
  3. "Fun note: a later subordinate complained (to others) that my geometry was understandable only to myself. Harrumph. I felt better when I learned OpenGL has the same system."
    Being a math student and having tried to work out what the heck is going on with the math in OpenGL and why people decided to choose the conventions as they did, I kind of see the point. They could decide their conventions however they pleased, but for some reason they decided that the axes are to be interpreted completely differently depending on what space you're working with. World space has z up but camera space has it forwards I think and texture space interprets it differently yet I think and after this went on for a while I thought to myself "screw this, it seems easier to just pretend that none of this stuff exists and I have to do everything myself".

    ReplyDelete