Sunday, March 9, 2008

Tilton's Law of Programming: Fear No Evil

We just had this exchange on comp.lang.lisp:


srdjan.m wrote:
On Mar 9, 12:01 am, Ken Tilton wrote:

danb wrote:

On Mar 8, 10:59 am, "srdjan.m" wrote:

(defun test ()
(let ((x '(1)))
(not-ok x)
x))
CL-USER> (test)
(1 C)
CL-USER> (test)
(1 C C)
I really do not understand why let is not initializing x
quote (the single quotation mark) doesn't create a new list, and of
course nconc alters x in place. So you're repeatedly binding x to the
same list and appending 'c to that list.
Incidentally, this is the universal beginner's question. It's been
asked at least three or four times in the last few months. So you
have plenty of company :)

IIANM somewhere in here is either a list or a pointer to a list of these
common nooby coding gotchas:

http://www.cs.cmu.edu/Groups/AI/html/faqs/lang/lisp/top.html

kenny


Indeed there it is http://www.cs.cmu.edu/Groups/AI/html/faqs/lang/lisp/part1/faq-doc-4.html.

Oh my: "Destructive operations, such as NCONC, SORT, DELETE, RPLACA, and RPLACD, should be used carefully and sparingly. In general, trust the garbage collector: allocate new data structures when you need them."

That's nuts! It sows fear and trembling while offering no actual guidance as to when to use destructive operations and when not.

"Use them when you need them." I wonder what the next question will be. (Hint: "When do I need them?".)

As for using them sparingly, what on earth does that mean? What if I "need them" twenty-five times, should I spare a few of them because twenty-five is too many?

Technology is like an ill-tempered dog: never show it fear. If you are not sure how X works do not start inventing weird usage rules unrelated to how the technology works hoping they will somehow keep you safe. You will end up worrying a lot, not get the most out of your tool, and still get bit in the ass.

Instead, take a minute or hour to find out how X works. In this case, there is nothing "in general" or "sparingly" or "as a rule" about it: if you own all the list structure on which you are about to operate, always use the destructive variant. If not, never use the destructive variant.

The non-excepting exception is when the function you are writing is meant to be destructive, in which case the caller is responsible for using it as they would any other destructive function.

So when do I own structure? When I have created it, or called functions that have created it for me. When do I not own structure? When it has been passed to me.

Suppose we have a silly function like this:
(defun do-filter-and-flatten (lists do test)
(apply 'append
(delete-if-not test
(mapcar do lists))
mapcar generates fresh structure, so I can use delete-if-not instead of remove-if-not on the list produced by mapcar. But the lists themselves -- the members of the input parameter lists -- came from somewhere else and might be in use, so they cannot be touched. Note that the lists structure itself came from someplace else but does not come into play in this example because we begin by effectively copying that structure while mapcar-ing the do function.

We can turn that around to drive home the point, by applying the filter first (and getting a much different function):
(defun filter-do-and-flatten (lists do test)
(apply #'append
(let ((tmp (loop for list in lists
when (funcall test list)
collect list)))
(map-into tmp do tmp))))
We cannot touch lists (via delete-if-not) because it has been passed to us, so we test and collect. But now we own the returned cons cells bound to tmp and are free to whack their CARs with the mutating map-into -- which Stanislaw Halik just pointed out to me and which I do not think I had even heard of in thirteen years of Lisping!

But if I may digress, Kenny don't play let, certainly not in the middle of a Grahamanian cascade:
(defun filter-do-and-flatten (lists do test)
(loop for list in lists
when (funcall test list)
nconc (funcall do list)))
A final note. What if in this last version we knew that any "do" function would massage every element returning a new list along the way. Could we make nconc the last step? I would not. Sometimes functions like these decide they have no work to do and then return the input list structure. We use destructive functions only when we know we own the structure we will be altering, otherwise not.

Then and only then shall we fear no evil.

[All code above Just Typed In, corrections are welcome (and thanks to Anonymous for further reminding me that the non-destructive remove-if-not cannot be assumed to copy -- it might return the input list untouched).]

9 comments:

srdjan said...

Hi,

I very much enjoyed reading this post and I especially liked this line "Technology is like an ill-tempered dog: never show it fear."

Hopefully, SLIME debugger will help me in showing no fear:)

Anonymous said...

Nice rant. Paul Graham has written something very similar about who owns what, in On Lisp:

"Who owns arguments and return values? The convention in Lisp seems to be that an invocation owns objects it receives as return values, but not objects passed to it as arguments."

Also, thanks for the c.l.l pointer to the newbie guidelines. Some of them are idiosyncratic, as you observe. This one struck me: "Use complex numbers to represent points in a plane." Okay, even if I've come across commercial systems that don't do this...

Anonymous said...

According to the hyperspec, remove-if-not can return the list unchanged. Thus, if no element is removed, it would not be safe to pass tmp to map-into.

Kenny Tilton said...

According to the hyperspec, remove-if-not can return the list unchanged.

PWUAHAHAHAHAHAA! Does anyone remember a similar sequence in which some guy publicly tried to write a decent C++ "copy" constructor or something leading to this just enormous thread of amendments and refinements and gotcha fixes, the best part being he had started out with, "Look, this is easy..."?

I'll be honest, I have seen that cautionary note about remove-if-not or maybe it was something else and (a) been aghast but (b) never internalized it, I must have time bombs sitting out there waiting to go off.

Oh, well. Once more into the breach!...

Anonymous said...

Doesn't this mistake Anonymous pointed out show exactly why one shoudn't use the destructive functions without a good reason?
Using destructive updates forces you to think about this 'ownership' thing, while that could be completely ignored if you used only the nondestructive functions. One less unnecessary detail to think about and potential bug to resolve. nconc & co should be used as performance optimizations only.

Kenny Tilton said...

Doesn't this mistake Anonymous pointed out show exactly why one shoudn't use the destructive functions without a good reason?

No, the title still holds: "Fear No Evil". The standard is perfectly clear, so there is no ambiguity, I just screwed up because I had been programming Lisp for years without realizing remove does not always copy. (Makes me wonder why I never got bit. Anyway...) But I did happen to read that append does not copy the last list, so I am not all bad. The Right Answer is to learn more, not fear more.

The thing is, we need to learn this stuff anyway because we simply cannot get away with never using destructive functions -- noobs are forever writing dead slow stuff even on simple exercises because they get into a combinatorial explosion of consing by having unnecessary non-destructive calls in a heavy-hitter location.

And since we have to learn anyway, we may as well use the right function. Not sure? Look it up. The CLHS is real handy that way.

Anonymous said...

Just was pointed at your blog, excellent stories. It struck me that this particular item shows the equivalent of classic Objective-C memory management (which uses retain counts); it's a common newbie mistake regarding object ownership.

Anonymous said...

Well I've decided to dive in. No fear, right?

I'll be checking this space often while I make my own contributions
here:

http://dihymo.blogspot.com/ and on c.l.l.

Dave Roberts said...

My own personal rule on this is to use functional programming whenever possible and let the GC deal with it. This eliminates all potential for a whole class of bugs. The only time I won't do that is when it's just so dead obvious that I own the list structure in the particular function I'm writing (like the function is building it cons by cons using push with a possible nreverse).

The thing is, even if you make sure your destructive operations are working right the first time, if you're playing around with some hairy corner of the CLHS then you're probably asking for trouble when you next go in and touch that function. At a minimum, leave yourself a good comment telling yourself what you were doing and where the limits are.

If I later find that I have produced some pig-slow piece of crap code, I'll go in and optimize. Typically, my biggest optimizations are in not doing something completely retarded with my algorithm as opposed to need to use nconc rather than append.