Monday, February 11, 2008

A Tall Ship and a Star to Steer Her By

A few things collided yesterday. Background: I slammed Scheme and academia for the star by which they chose to steer Scheme's design, namely purity and a small specification. Yep, you heard me right. A design goal was a small spec, nothing to do with making programming easier. From the FAQ:
Advocates of Scheme often find it amusing that the Scheme standard is shorter than the index to CLtL2.
Amuse this. Anyway, after that I got yelled at for being mean to Scheme and academia. Hey, I thought that was what blogs were for.

Now the other thing that happened. Paul Graham released a new Lisp called Arc for which his guiding star is the brevity of the *application code* written in Arc. (He also likes a language to be small but I cannot be mean to him because he once was nice enough to give me a tip on how my RoboCup team could play better defense.)

Mr. Graham further mentioned that Arc development includes a Web app library because he wants there to be a real-world application "pushing down" on the core language design. This reminds me of my cherished Cells library whose twelve-year evolution has always been pushed by application development mostly because I am just a simple application programmer not smart enough to come up with fancy theories in the abstract so I just use things till they break and then fix them just well enough to get them working again. But I digress.

The idea of having an application shape Arc went down well -- no one whined about it and on Usenet that is tantamount to a standing ovation -- but Mr. Graham's choice of brevity as a pole star in navigating his Arc led to a raucous blogwar and I still had that defender of academia challenging me with "Hey, McCarthy was an academic". Things got ugly, the lights went out, I took a chair over the head, and here is what we (I) decided.

First principles are Good Things. In Lisp, every form returns a value. That first principle works great. This despite Tilton's Law of Programming:

All X all the time is the root of all evil.

Examples: Prolog where it is all logic and unification and Smalltalk and Java where it is all objects. So how did "every form returns a value" get past Tilton's Law? Beats me, must be a Deep Truth somewhere about the power of the functional paradigm which is not all that hard to understand if one has worked on even a small amount of stateful code. At any rate, we will not reject out of hand the idea of one idea shaping a language, but we do see that we gots to pick the right idea.
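
For anyone who has not lived with "every form returns a value", here is a minimal Common Lisp sketch of what it buys you. The function names are made up for illustration. The last pair also shows the one place the principle bites: dolist returns NIL unless you hand it a result form.

```lisp
;; IF is an expression, so its value is simply the function's value.
(defun sign-name (n)
  (if (minusp n) 'negative 'non-negative))

;; LET returns the value of its last body form; no RETURN statement needed.
(defun hypotenuse (a b)
  (let ((sum (+ (* a a) (* b b))))
    (sqrt sum)))

;; DOLIST is the odd one out: with no result form it returns NIL,
;; so this function loses the total it just computed.
(defun sum-list-buggy (numbers)
  (let ((total 0))
    (dolist (n numbers)
      (incf total n))))

;; Naming TOTAL as the result form in the DOLIST header fixes it.
(defun sum-list (numbers)
  (let ((total 0))
    (dolist (n numbers total)
      (incf total n))))
```

Calling (sum-list '(1 2 3)) gives 6 while (sum-list-buggy '(1 2 3)) gives NIL, which is the sort of bug that only exists because the rest of the language has trained you to expect a value.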

Next question: are we OK with shorter being better, all other things being equal? Yes, that is correct. Note that reductio ad absurdum counterexamples such as APL and K and Perl do not work because in those cases the shorterness takes us out of the space of all other things being equal. Changing Lisp's lambda to Arc's fn makes it shorter without making it obscurer, but changing lambda to a hieroglyphic would.

Next issue: the shorter specs of Scheme and Arc. Would you walk up to a working programmer and start jumping up and down about this amazing new language guess what its spec fits on one page!!! The working programmer would worry about you. Yeah it makes the compiler easier to write but do we working programmers tell you compiler writers our problems?

Small specs. Pfft. Ever walk into an auto garage and see one of those huge red chests filled with drawer after drawer of tools? Those chests are the seven years of med school for a mechanic. We civilians might think it an overwhelming selection but the mechanic knows it by heart, even those tools used like every two months.

Mr. Graham talked at one point about a language being small enough to keep in one's head. I think the mistake there is the assessment of how big that small can be. People using something a lot (and I hope we are talking about designing a language for someone using it every day) can keep quite a bit in their heads. Meanwhile, hey, these are computers, a good IDE lets us zero in on documentation with a minimum of keystrokes to get the details of a keyword we forget, including zero keystrokes cuz as I finish typing the name of a function my IDE status bar shows the expected parameter list.

On the flip side of big, a scanty starting tool chest means everyone either reinvents the missing bits or grabs one or another incompatible library off the shelf which is why Scheme programmers cannot share their work. Some of them point to specific implementations such as DrScheme and say Hey, it has everything! in which case what was the point of the small standard? You now have a big language unlike anyone else's. Juusst peachy.

All that has been achieved with a small starting standard is making my code incompatible with someone else using some other big Scheme implementation. Lisp went through that without trying to because it too started small, insanely small, but then they cured the natural outcome (a fragmented community) when that threatened its DARPA contracts by creating Common Lisp the standard and everyone saluted and conformed. They still lost the DARPA contracts but those will be back eventually because the standard now has the Common Lisp community all pulling in the same direction: I can use Other People's Code and they can use mine.

The subtext here is that CL may be a big language but I use all of it (except series, but give me time, it has only been thirteen years). One of the classic auto mechanic sayings is that any given task is easy when you have the right tool, and all of us shade tree mechanics forced into retirement by ever more complex automobiles remember how great were those moments when someone handed us the right tool and we were able to put down the Vice-Grips.

Brevity of application code? Great! Brevity of spec, not so much. Scheme needs to close up shop because if they fix what is wrong with it (hygienic macros, NIL not being false, a tiny standard) they end up with CL.

Arc is differentiated by its use of syntax to yield shorter applications, so it can stay, but it does need to add some drawers to the tool chest. And academia needs to start writing applications at the same time they are having bright ideas so they stop designing cool things that happen to be useless for programming. More on constraints later.

23 comments:

Anonymous said...

Not returning w/o an explicit return may seem annoying in Python, but in fact it is very consistent with the design of the language. One of the most cherished Python maxims is “Explicit is better than implicit.”, which combined with an implicit return in every form would sound quite stupid.

Kenny Tilton said...

Ryszard: [cough] yes, it is consistent with the design. I am saying the design choice was a mistake forcing added noise obfuscating what is really going on in the code. As for explicit is better, I realize one would have to live with it for a while to get the "feel", but once one internalizes the principle, well, it *is* explicit! Hey, there's a form! Look at that value shooting out of it! As a reverse example, one of my (and others') more common bugs stems from CL iterators such as dolist and even loop *not* returning a value by default.

Unknown said...

I wish all language wars were between different dialects of lisp :)

Anonymous said...

I'm interested in why you consider hygienic macros to be something Scheme needs to fix. Can you expand on that?

Anonymous said...

How do you consider (progn (values))?

I think that it might be more correct to say that any form can be interpreted as producing results if the caller wishes to do so.

That is, there is a universal protocol for getting results from the evaluation of forms -- not that all forms produce results.

Kenny Tilton said...

I wish all language wars were between different dialects of lisp

The nastiest fights are intra-family. :)

But seriously, if Arc gets someone fired up about Lisp and they immediately want to start doing production work, they will want to know whether to use Scheme or Common Lisp. Now they know. :)

Anonymous said...

... I am just a simple application programmer not smart enough to come up with fancy theories in the abstract so I just use things till they break and then fix them just well enough to get them working again.
- KT

(F) We continue our explorations. The prime paradigma of the pragmatic designer is known as "poor man's induction", i.e. he believes in his design as long as "it works", i.e. until faced with evidence to the contrary. (He will then "fix the design".) The scientific designer, however, believes in his design because he understands why it will work under all circumstances. The transition from pragmatic to scientific design would indeed be a drastic change within the computer industry.
-EWD

reference

Timbo

Kenny Tilton said...

The scientific designer, however, believes in his design because he understands why it will work under all circumstances.

Cool. Written by an academic, though. :) One I wager who has never written an application as sophisticated as the first one I ever did on my Apple II. (I had to expand to 32k!)

The tip off is ever thinking anything can be known to work under all circumstances. Sounds like his tour de force calculated Fibonacci numbers.

PWUAAHHAHAAHA! I just scrolled down! Dijkstra! But I was close, his shortest-path algorithm comes in at 14 lines. Puh-leaze.

The space Paul Graham and I are concerned with is what he describes as the one in which we do not even know what program to write. These will be big phat application programs with layers of abstraction and modules within modules, tens of thousands of lines of Lisp.

My code pleases me in two ways. First, it does tend to work under all circumstances, even straightaway after massive refactoring. Second, it tends to work under unanticipated circumstances. Third (I lied) it easily absorbs new requirements.

But all this happens not because I turn away from the computer or towards some UML tool and come up with a scientific design, it happens because I Just Start Writing and then follow one simple rule: if I am having trouble programming under my current design, I change or discard the design.

Sure, I try to design well up front and I do have some luck at that, but my real strength is simple honesty with myself. Kenny, this sucks, let's redo it. I never get married to my own code.

I am also lazy. That makes me look for higher-order solutions and, again, cut and run when my own code starts wearing me out.

[I should confess that that Apple II application was my education in how agonizing it is not to throw out my own code. :)]

If Dijkstra could have done something as dense and hairy as Cells by starting with a scientific design that he knew a priori would always work, great, Uncle Bert was right, there are aliens walking amongst us.

Kenny Tilton said...

I'm interested in why you consider hygienic macros to be something Scheme needs to fix. Can you expand on that?

Regarding hygiene, unhygiene (capturing a lexical variable) does not come up a lot but when it does it rocks. It means we have developed a sufficiently elaborate DSL that some macros expect to be expanded within a lexical context set up (in part) by another overarching macro in the same DSL.

But my phrasing misled. What I meant to say was "Scheme macros should be like CL macros", including how one writes a macro. But! I have never made an effort to learn the latter, I just watched (and failed to understand) an earnest attempt by a top Schemer to explain them on comp.lang.lisp and I read PG on Scheme macros, this from Graham's essay "Being Popular":

"Hygienic macros embody the opposite principle. They try to protect you from understanding what they're doing. I have never heard hygienic macros explained in one sentence...snip...Hygienic macros are intended to protect me from variable capture, among other things, but variable capture is exactly what I want in some macros."

I suppose someday I should try again on Scheme macros just to have a better understanding of such a train wreck of design.
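
A minimal sketch of the deliberate capture described above, in Common Lisp. The names with-collector, collect and items are hypothetical, invented for this illustration and not taken from Cells or any real library. The outer macro sets up a lexical variable and the inner macro expands into code that quietly relies on it.

```lisp
;; The overarching macro: binds the lexical variable ITEMS around BODY
;; and returns whatever was collected, in the order it was collected.
(defmacro with-collector (&body body)
  `(let ((items '()))
     ,@body
     (nreverse items)))

;; The inner macro: expands into a reference to ITEMS on purpose.
;; It only makes sense when expanded inside WITH-COLLECTOR.
(defmacro collect (form)
  `(push ,form items))

;; Usage: collect the squares of the even numbers in a list.
(defun even-squares (numbers)
  (with-collector
    (dolist (n numbers)
      (when (evenp n)
        (collect (* n n))))))
```

Here (even-squares '(1 2 3 4)) returns (4 16). A hygienic system would rename items in each expansion and the two macros would no longer see each other's variable unless the author went out of the way to allow it.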

Anonymous said...

Scheme's spec may actually be a net loss. I know at least two would-be Schemers who heard how cool it was that the spec was so short and tried to teach themselves the language by reading it.

The Arc-in-Arc approach has a different purpose. Besides appealing to 'hacker' types who dig that sort of thing (Scheme has that too), it provides an especially smooth and reassuring path to deep understanding. "There are no secret handshakes here," it says to new users. "You won't bog down on Arc, and we promise not to waste your time with silly 'magic' things like in Perl."

Potential new users of a language -- particularly a lisp -- are risk-averse. They're naturally afraid of having wasted their time if the new language is weird or hard or the exclusive domain of a monastic elite.

Kenny Tilton said...

I suppose someday I should try again on Scheme macros just to have a better understanding of such a train wreck of design.

http://www.xs4all.nl/~hipster/lib/scheme/gauche/define-syntax-primer.txt

That is a beautifully written introduction to Scheme macros. I read just a little of the ELEVEN THOUSAND WORDS!!!!!! Sorry. Why so much to explain how to take the input source code and munch on it to cobble together output source code for the compiler?

(Un)simple: "The syntax-rules sublanguage is not Scheme!" (emphasis theirs)

More expansively: "At this point, things are starting to get complicated. We can no longer look upon macros as `simple rewrites'. We are starting to write macros whose purpose is to control the actions of the macro processing engine. We will be writing macros whose purpose is not to produce code but rather to perform computation."

Super! Just like CL! Hang on...

"A macro is a compiler. ... The language in which we will be writing these compilers is NOT Scheme. It is the pattern and template language of syntax-rules."

Prolog!? Say it ain't so!

"There is just one problem: the model of computation is non-procedural. Simple standard programming abstractions such as subroutines, named variables, structured data, and conditionals are not only different from Scheme, they don't exist in a recognizable form!"

Sounds hard, let's go shopping!

Don't get me wrong, I do not mind tackling Prolog (so to speak) when I have to, and in fact I am using embedded Prolog from CL as we speak on my Algebra application. But it can be devilishly hard and produce surprising and hard to debug behavior at the drop of a cut, so... why can't I just manipulate the input source like any other data to straightforwardly get what I want?

Anyone know? Does this all somehow derive from the hygiene thing? Or is this just academia showing off again how much smarter they are than me? :)

Kenny Tilton said...

> monastic elite

The link was intended as a reference to the title of the blog, rather a statement about its author.

I cannot parse that sentence, I think you left out a "not" and an "as" but then I cannot guess which goes where. However this bit is easy to settle once and for all:

"Potential new users of a language -- particularly a lisp -- are risk-averse. They're naturally afraid of having wasted their time if the new language is weird or hard or the exclusive domain of a monastic elite."

Nonsense. Can you have a few of these people send me emails? How can I possibly waste time on a language if it is too hard? I'll punch out in three days.

As for these imaginary friends of yours fearing some exclusive domain of a monastic elite, puh-lease, it is a programming language, not the Order of the Raccoon. People do not read Graham and/or the Road to Lisp and get excited and try Lisp and get even more excited and then discover an obnoxious Lisper and start digging their Java books out of the waste bin.

Thomas said...

I have never made an effort to learn the latter, I just watched (and failed to understand) an earnest attempt by a top Schemer to explain them on comp.lang.lisp and I read PG on Scheme macros

Only somewhat related, but not all "hygienic" (hypochondriac? OCD?) macro systems are equally bad. The scheme community had a variety of design spaces to choose from, and not surprisingly they chose the most arcane. Syntactic closures were much more defmacro-like, but let you / made you decide what environment you want the symbols in the expansion to use. They're significantly more difficult to understand than Lisp2+packages, which solves the same problem, but if you insist on a Lisp1 with a flat namespace, they seem to be a reasonable compromise.

Kenny Tilton said...

How do you consider (progn (values))?

(print
  (if (values)
    'good-point
    'not-much-of-an-exception))
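
To spell out what is going on there, a small sketch follows. The function name values-demo is made up for illustration. A form really can return zero values, but any caller that wants exactly one value gets NIL in its place, which is why the IF above takes the else branch.

```lisp
(defun values-demo ()
  (list
   ;; Ask for all the values as a list: there are none.
   (length (multiple-value-list (values)))
   ;; PROGN passes along the values of its last form, so still none.
   (length (multiple-value-list (progn (values))))
   ;; IF wants exactly one value for its test, so it sees NIL.
   (if (values) 'good-point 'not-much-of-an-exception)
   ;; Same story when binding a variable: X is NIL.
   (let ((x (values))) x)))
```

Evaluating (values-demo) returns (0 0 NOT-MUCH-OF-AN-EXCEPTION NIL).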

Anonymous said...

Wow, the consistent message goes at least as far back as 2002:

http://mail.python.org/pipermail/python-list/2002-December/174264.html

Dan Weinreb said...

The Common Lisp index is about 1080 lines. The R5RS Scheme standard has about 100 lines per page, so the Common Lisp index would be about 11 pages. The R5RS Scheme standard itself is 50 pages. So the claim that the Scheme standard is shorter than the Common Lisp index is hyperbole.

The new R6RS Scheme standard is 90 pages for the basic standard, plus 71 pages for the basic libraries (the kind of thing that's in the Common Lisp manual), for a total of 161 pages.

Kenny Tilton said...

the Common Lisp index would be about 11 pages. The R5RS Scheme standard itself is 50 pages.

I won't argue with your numbers, but I have CLtL2 in my lap and its three indexes total forty-six pages. Meanwhile, my favorite c.l.l post of 2007:

R6RS is now available at www.r6rs.org . A copy will also be posted on schemers.org .

kenny wrote:
Gee, one hundred forty-two pages, up from five [sic]. Got my hopes up for the language.

Josip Gracin wrote:
It's gonna be tough creating an index that big.

Anonymous said...

Small specs. Pfft. Ever walk into an auto garage and see one of those huge red chests filled with drawer after drawer of tools? Those chests are the seven years of med school for a mechanic. We civilians might think it an overwhelming selection but the mechanic knows it by heart, even those tools used like every two months.

You're slipping - isn't this one of those arguments by metaphor that you're always griping about?

As Arc wedges everything into the language (they're discussing infix operators at the moment, but only when the first argument is a number, but only on a full moon, sigh), it's going to turn into one hell of a mess.

A language should have a small spec. Scheme (like C) is already bigger than it should be. It's the library that needs to be big and standardized.

Kenny Tilton said...

You're slipping - isn't this one of those arguments by metaphor that you're always griping about?

One of the great things about laying down rules is being able to post hoc lay down the exceptions. :)

Analogies are fine for communicating where one is going with an argument but do nothing to prove one's point. What is scary is when someone comes back with, "Oh, yeah? What happens when the guy gets a new job and he has to move that big tool chest and it falls over on him when he is loading it in the truck?".

In this case, tho, the mechanic's tool chest is almost not an analogy, is it? Maybe we need a new metric: analog distance.

As for language vs. library size, you forget that I am just a simple application programmer: no difference.

Anonymous said...

Maybe we need a new metric: analog distance.

Sound like something an academic would invent. :)

As for language vs. library size, you forget that I am just a simple application programmer: no difference.

One difference I'd point out is when I look at Common Lisp's (and Scheme's) special syntax for things dealing with quasiquotes, I only see gibberish. I understand breaking the "everything is an s-expression" rule once for 'a, and maybe a second time for (a . b), but backtick pound comma splat is pretty `(#'(,@ugly)) and too much for me to keep in my little application developer head.

Is this tool used so frequently that it should be part of the language with a special syntax and not part of a library with sensible names?

Kenny Tilton said...

Maybe we need a new metric: analog distance.

Sound like something an academic would invent. :)


Yes! You people are coming up to speed nicely.

backtick pound comma splat is pretty `(#'(,@ugly)) and too much for me to keep in my little application developer head.

I feel your pain. But I think (loop for hack across "`,@") is the library. You can try rewriting macro bodies without them, but after a week you will invent them.

And they are pretty easy unnested: ` is just ' plus "hey, not everything, look for ,". And @ is just "Splice here each thing in the list (not the list)".

When they get nested, well, that is what the infinite number of monkeys are for, they just keep trying things until it works.

I mean, I still do not understand why ,', works (if I even have that right). :)
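
For the unnested case described above, a minimal sketch one can paste into a REPL. The function name backquote-demo and the placeholder symbol f are made up for illustration. It builds the same little form four ways so the difference between quote, backquote, comma and comma-at shows up side by side.

```lisp
(defun backquote-demo ()
  (let ((x 10)
        (args (list 1 2 3)))
    (list
     ;; Plain quote: nothing is evaluated.
     '(f x args)
     ;; Backquote with one comma: only X is evaluated.
     `(f ,x args)
     ;; Comma on ARGS: the list itself is inserted as one element.
     `(f ,x ,args)
     ;; Comma-at on ARGS: each element is spliced in, not the list.
     `(f ,x ,@args))))
```

Evaluating (backquote-demo) returns ((F X ARGS) (F 10 ARGS) (F 10 (1 2 3)) (F 10 1 2 3)). Nested backquotes are left to the monkeys.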

Dave Roberts said...

I think you're right on here. Starting small is fine. The biggest problem is that after multiple decades, standard Scheme is still too small.

In fact, as big as CL is, I would argue that it isn't big enough and many of the issues I have with it are in areas where there isn't any common specification and implementations solve the problems differently. Good examples there are sockets and threads. IMO, CL needs another reformation to bring it up to date with modern issues such as those, following the 17 years it has been in suspended animation. Unfortunately, it doesn't look like that's going to happen. Perhaps the CLRFI or CDR initiatives will help there.

Kenny Tilton said...

Dave, we had good luck solving the disparate FFI problem with CFFI, and my recall is faint but were you the gentleman we almost paired with Perry Metzger to write a specification for the equivalent pseudo-standard for sockets? Too bad that never happened, it sounded like you two were going to do a great job between you.