Wednesday, March 5, 2008

My Nastiest Macro Ever?!

[WARNING: Do not try at home. The following code relies on extensions not included here. Plz visit the DSB thread for something you can play with]

It all started because a Lisp/Arc noob "got it" when they saw how I used Lisp destructuring to make some code a little more readable. That in turn started with a challenge to improve some code to write an SQL query or something.

I suggested rearranging the Lisp query spec to be friendlier to the consumer, then used Arc destructuring as the first improvement (and then greatly missed CL format):
(let (tbl key . fields) *sql-template*
(prn (string "insert into " tbl
" ( " (let c nil
(each (f . rest) fields ;; more destructuring
(when c (= c (string c ", ")))
(= c (string c f)))
c)
", " (car key)
" ) values ( "
(let c nil
(each (f . rest) fields
(when c (= c (string c ", ")))
(= c (string c ":" f)))
c)
", " (last key) ".nextVal"
" ) returning " (car key) " into " (string ":" (car key)))))


When the "got it" came in, I decided to dust off my lite version of CL's destructuring-bind (actually done to implement a CL-style defun for Arc since its DEF has only optional args) and really give the Arcers an eyeful.

The functionality we are after, by example:
(let data (list 1 2 7 nil 'p 5)
(dsb (x y &o (a (+ y 1)) (z 4) &k (p 98) (q (+ a 1))) data
;; we want to see identical pairs, cuz next I
;; print first a variables runtime binding
;; and then its expected value
(prs "args" x 1 y a 7 z nil p 5 q 8)
(prn)))


The above should produce "args 1 1 2 2 7 7 nil nil 5 5 8 8". Yes, I apologized to the Arc forum for my lame test "utility". Anyway, here is what I came up with, and even though I have written well over five hundred macros this one just might be the most mind-bending I have done.

First, the final form (probably still buggy!):
(mac dsb (params data . body)
(w/uniq (tree kvs)
`(withs (,tree ,data
,@(with (reqs nil key? nil opt? nil keys nil opts nil)
(each p params
(if
(is p '&o) (do (assert (no opt?) "Duplicate &o:" ',params)
(assert (no key?) "&k cannot precede &o:" ',params)
(= opt? t))
(is p '&k) (do (assert (no key?) "Duplicate &k:" ',params)
(= key? t))
key? (push-end p keys)
opt? (push-end p opts)
(do (assert (~acons p) "Reqd parameters need not be defaulted:" p)
(push-end p reqs))))
(with (n -1)
(+ (mappend [list _ `(nth ,(++ n) ,tree)] reqs)
(mappend [list (carif _) `(if (< ,(++ n) (len ,tree))
(nth ,n ,tree)
,(cadrif _))] opts)
`(,kvs (pair (nthcdr ,(++ n) ,tree)))
(mappend [list (carif _)
`(aif (member',(carif _) ,kvs)
(cadr it)
,(cadrif _))] keys)))))
,@body)))

That hurt. But why is it possibly the wildest one I have ever written? It certainly is not the longest. Let me put it this way: macros are always tricky but most of them only go up to ten; this one goes to eleven.

First, let's look at a simpler example:
(dsb (x &k (y (++ x))) (list 1 'y 2)
(list x y))

...and the desired expansion:
(withs (u1 (list 1 2)
x (nth 0 u1)
(u2 (pair (nthcdr 2 u1)))
y (bif (assoc 'y u2) (cadr it) (++ x)))
(list x y))

Come to think of it, next time I have to write a macro this tricky I just might write out the expansion I have in mind rather than try to hold it in my imagination as I code. Lesson learned, although macros normally arise after we have indeed written several long versions and then decide to write a macro, so we do have expansions to stare at. Anyway...

Macro writing does present a challenge in all but the simplest token-replacing cases: while writing the macro we are working in two different times at once, macro-expansion time and run-time. I am writing macro code M (to run at macro-expansion time) whose input is source code S and whose output is code R that will Actually Run at run-time. As the macro author, I start a macro function in mindset M, then use a backquote to start specifying code in mindset R, then use the unquoting comma to jump back to mindset M. It is as if my time machine went haywire and I am experiencing now and the future on alternate blinks of the eye.

When this becomes second nature, btw, you can add Lisp to your resume.

So why is this example worse than usual? How does it go to eleven? Look at the binding of U2 (which for the cognoscenti is my fakery of the unique symbol uniq (CL's gensym) would produce). Omigod! The runtime code R is finishing the job of parsing the list being destructured by taking arguments beyond the last optional position and building them into an a-list to be read by the value forms of each keyword, themselves doing more parsing to decide if a runtime value was supplied or if they should use the form (if any!!!) specified as the default in the parameter list!!!

Aaaaaargghh!!! Maybe this one goes to twelve. I observed on the Arc forum while describing all this that it was a good thing I had written the macro before thinking about it. But there we were, so before closing the thread I decided to mention some simple macro-writing tips:

1. The nice thing about CL-style and now Arc-style macros is that they are just like any other code we write, so we can debug them the same way, e.g, by placing print statements in the M code, if you recall my naming convention. This will be a mind-bender at first, or it was for me anywho.

2. If when you test a macro it dies on the compile of a usage (not the run) your bug is in the M code, not the produced runtime R code. Let's go into some detail:

I can write a macro and have that not compile:
(mac surprise! (f1 f2)
`(do1 ,f1, f2, ,f3))

... because f3 is undefined. Duh. Easy enough to nail those. Once I fix the macro:
(mac surprise! (f1 f2)
`(do1 ,f1, (/ ,f2 0))

...I can try it out and come to grief on a reasonable looking usage because the macro generates buggy code:
(surprise! 10 11)

...which should produce a division by zero error by expanding to:
(do1 10 (/ 11 0))

But there is another way to mess up! Isn't this fun! You can have macro code M that compiles OK but has a bug of its own that will be encountered while the M code is running (to produce the R code for the compiler):
(mac surprise! (f1 f2)
(let x (len f2) ;; golly I hope f2 is a proper list
`(do1 ,f1, (/ ,f2 0))

That version of the macro compiles, but now:
(def three-two-one ()
(surprise! 10 11)

...should not compile, Arc should squawk during compilation of three-two-one about 11 not being a proper list.

In summary, when and how a macro usage goes South tells you in which of these many ways possible you have screwed up. Until you develop an instinct for that, this will help:

2. Once your macro itself compiles (not a use thereof), use macex to preview and debug the R code your macro will pass on to the compiler:
(prn (macex '(surprise! 10 11)))

The above may fail if, as we discussed, you have a runtime bug in your macro expansion code M. If you are developing a macro and testing it in your application and get confused as to when the wheels are coming off, take a step back and cut-and-paste the failing usage into the above, not forgetting the quote.

This by the way is a helpful reminder that macros eat symbolic source code, not text strings, and not runtime values, by which latter I mean this next bit should work even though the variables hi and mom do not exist -- at this point again it is all just symbols:
(prn (macex '(surprise! hi mom)))

Use macex also when you simply do not get the behavior you want, when a macro usage compiles and runs but gives the wrong result. It might be that a macro expansion is writing bad code, and it is easier to see in the expanded form of an actual usage than it is staring at the slicing-dicing of a macro function. Come to think of it, we can see that with this example:
(surprise! 1 2)
-> 1

Puzzled as to why we get 1 instead of 2, we use macex:
(macex '(surprise! 1 2))
->(do1 1 2)

Aha! That is supposed to be do! The do1 is left over from an earlier version where it made sense. Something like that.

3. Divide and Conquer I: first I did DSB just with required params, then with optional params, then with keyword params, then with default values, then with computed default values, yadda yadda yadda, nothing new in this tip but there ya go.

4. Divide and Conquer II: This gets back to tip #1: macros are like any other code, meaning we can break complex chunks out into standalone functions to be tested separately. See that little state machine at the beginning of DSB that parses the parameter list? That is a nice little standalone chunk with a big job to do, let's give it its own home:
(def dsb-params-parse (params)
(with (reqs nil key? nil opt? nil keys nil opts nil)
(each p params
(if
(is p '&o) (do (assert (no opt?) "Duplicate &o:" ',params)
(assert (no key?) "&k cannot precede &o:" ',params)
(= opt? t))
(is p '&k) (do (assert (no key?) "Duplicate &k:" ',params)
(= key? t))
key? (push-end p keys)
opt? (push-end p opts)
(do (assert (~acons p) "Reqd parameters need not be defaulted:" p)
(push-end p reqs))))
(list reqs opts keys)))

And test:
(prn (dsb-params-parse '(x y &o (a (+ y 1)) z &k (p 98) (q (+ a 1)))))
-> ((x y)((a (+ y 1)) z)((p 98) (q (+ a 1))))

Awesome. Now look at DSB:
(mac dsb (params data . body)
(w/uniq (tree kvs)
`(withs (,tree ,data
,@(let (reqs opts keys) (dsb-params-parse params)
(with (n -1)
(+ (mappend [list _ `(nth ,(++ n) ,tree)] reqs)
(mappend [list (carif _) `(if (< ,(++ n) (len ,tree))
(nth ,n ,tree)
,(cadrif _))] opts)
`(,kvs (pair (nthcdr ,(++ n) ,tree)))
(mappend [list (carif _)
`(aif (assoc ',(carif _) ,kvs)
(cadr it)
,(cadrif _))] keys)))))
,@body)))

Sweet! Now we can give Arc a Cl-style defun if like me we are fond of them:
(mac defun (name params . body)
(w/uniq (args)
`(def ,name ,args
(dsb ,params ,args ,@body))))

Applied, we create the CL poster boy for mixed optional and keyword args:
(defun read-from-string (s &o eoferr? eofval &k start end keep-whitey?)
....)

Too easy? :)

Somewhere out there some Pythonista believes the sky is falling, that I have forked Arc, and that no one will ever be able to read my code. (a) They are wrong, and (b) this is a wonderfully extreme example in which we really are (yes!) extending the language. If they do not like this, maybe they will like some of the simpler transformations mac allows.

2 comments:

Daniel said...

I do not wish to sound dismissive, and I fear I'll sound so, but I was doing that in Forth back in the 80s.

Well, not that that, and I understand you have also been doing that for a long time now, and so have a lot of people. But I cannot help but feel amazed that this is any news to anyone. That you have to go to the trouble to explain it.

In Forth, using the oldest syntax for it for the sake of it's elegance, is:

: "macro"-name <build (stuff that happens at compile time) does> (stuff that happens at run time) ;

Or, in other words, everything between <build and does> happens at compile time, and it set up whatever you will be needing at run time. The stuff after does> will take place at run time.

I'm sorry, I'm ranting. It's just that the concept feels obvious to me.

Kenny Tilton said...

But I cannot help but feel amazed that this is any news to anyone.

(a) My imagined audience was mostly Arc noobs who are also macro noobs.

(b) I do not see where you addressed the paragraph in which I explicitly explained why this macro felt like "one more" than usual. That is what you have to explain to be obvious, not just the usual compile vs runtime thing. Maybe you missed it, explaining your astonishment.