Saturday, March 22, 2008

"We Can Live With the Way You Handled That"

Tilton's Lemma had come under attack and the title of this entry was the happy result. You forgot Tilton's Lemma? Here ya go:
  • Refinements to requirements cannot vary fundamentally from the original.
The point is that careful design is worth the effort so that software can be successfully rolled out even though users naturally produce RFE after RFE during user acceptance testing. No matter what they come up with the software will adapt to it with a minimum of refurbishing.

So what was the attack? We must begin with the launch meeting of the relevant project, where my conduct was such I rarely again got asked to meetings, but as a tease I will share that this was one of the most profound God is speaking to you experiences I have had in software development, one of those moments where one senses forces unknown at play.

I was being given a nice new standalone task perfect for a cowboy contractor such as myself who was good with code but not with teams. My manager Frank would be perfect for such a loose cannon, one who cared not about how loud I snored when I catnapped or at what hour I showed up or how badly my lunch reeked as I ate it over the keyboard but only if the code worked so he could sleep at night.

Just one problem. I walk into the launch meeting and sit down having no idea what is happening. After thirty minutes of sitting quietly at one corner of the table having nothing to do with any of the discussion I have deduced that a foreign branch of the bank has accounting software that will be shut down and covered by the software at our central location and that I will be writing the interface to take the feed from an off-the-shelf humongoid accounting package and feed it into the central system in such a way that it replicates their existing system.

I have grown increasingly intrigued at this prospect as the meeting drags on because our group has more than a few people assigned to supporting the massive off-the-shelf system (shall we call it Isis?) and it seemed worthy of months of apprenticeship. On top of that I certainly had no knowledge of the foreign branch's system. Best of all I knew less than nothing about accounting. Okay, no, best of all was that no one was talking to me at the meeting so I developed this crazy idea that they were just going to turn to me at the end and say, Need anything else, Ken?

Then Tony the Senior AVP turned to me.

"Need anything else, Ken?"

Left to my own devices for the first 99% of the meeting I had had time not only to deduce its ending but also prepare an answer. My fans from comp.lang.lisp can tell you His Kennyness likes nothing better than a good analogy and it had occurred to Himself that the developing situation resonated strongly with an image that formed spontaneously in his mind of a couple with a dog on a leash standing on some corner in the middle of Manhattan deciding how best to get to the Botanical Gardens in Brooklyn and after considerable discussion and perhaps even a consultation with maps street, bus, and subway settling on a means and path of transveyance and turning to the dog and saying, "Lead on, Fido!"

Hence my prepared answer, before not one but two AVPs and the users on conference call from Canada:

"Woof! Woof!", I offered, and looking back I regret only not having thought to wiggle my ass in my chair to signify enthusiastic tail wagging.

When it became clear that the import of my barking had eluded those present and had at least stunned Canada into silence I expanded on it by observing that I knew neither the sending application nor the application I was meant to replicate nor the entire subject of accounting and concluded by wondering aloud if the guy outside tending the falafel cart might not be a better fit since he at least knew how to make change.

The Canadians had heretofore been utterly thrilled at the idea of having their accounting taken over by our central gang that could not shoot straight (I'll wait while those of you who missed the irony go back for a look) and now they had White Fang as the contractor assigned to the task, confirming their worst nightmares.

They agreed to provide a thorough specification and the AVP and I ever after when meeting in the hallway would bark at each other in greeting and at his going away dinner -- well, after the second round of shots it was pretty much non-stop howling, thank god for private rooms, we should have had one. I digress.

Remember: this is about users not being able to change requirements discontinuously. So now we know how I ended up with the most beautiful requirements from which I have ever worked. Requirements? Bah, this was pseudo code. Just brilliant. They identified a dozen kinds of transactions/situations/circumstances I could expect from Isis and they specified exactly how each should be handled.

And this "exactly" was not just exact. There was this tremendous regularity in how they specified the handling of each case. Each was expressed cleanly as so many debit this, credit that, debit this subtransactions. These subtransactions just begged to be viewed as microcode for the larger instructions specifying how each case was to be handled.

So that is the code I wrote, after effusively praising Team Canada to themselves and anyone else I could find. The high-level code read Isis transactions, identified the high-level case, and invoked its handler, each handler dispatching one or more microcode subtask handlers. We are talking easy work and soon we sail into user acceptance done thoroughly for a couple of months because I mentioned they did not trust us. I also know it was a long time because by the time the call came through with the title of this entry I had forgotten about the entire issue to which they referred, which was this:

After a week my bugs had been rooted out and IRs were being handled by letting them know how they had screwed up the test. Frank was happy, I was snoring, life was good, we had time to buzz about this new write once run anywhere language called Java that would let browsers run applets.

Then well into the test when the IRs had slowed to a drip I got a call from Canada. I had blown a transaction. Lemme see it... OK, sorry, that is what the spec says. No, these have to be handled this way, the debit goes over here and credit goes over there when X is Y.

Now as usually happens in these deals I still did not know anything about accounting but some of the words had taken on glimmers of meaning and I asked, But isn't this the same as whatever? And then I threw around some terms and said that from my poor understanding what they were suggesting seemed wrong. Nope, that is how we do it. OK. Bye. Bye.

And now comes the good part. I invoked Tilton's Lemma -- not because Kenny could not see the legitimacy of their accounting, though that encouraged me, but because of something else I found to be strangely compelling: my code would not do what they wanted.

I tried to see how I could patch in the exception and there was no way. I would have needed action at a distance and Einstein had already spoken on that. What was needed was for the microcode to behave differently based on information (X and Y) available only at the top level in which the higher level transactions were being identified and their handlers invoked. The only way to give Canada what they wanted was either to add a parameter to everything in the call chain (which I would have called woof-woof, I think) or even more elegantly to set a global variable at the higher level that this one microcode handler could watch out for. Ewwww.
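To make the shape of the problem concrete, here is a minimal Python sketch of that two-level design. It is an illustration only: the transaction kinds, account names, and function names are invented for the example, and the real system was not written in Python.

```python
from dataclasses import dataclass


@dataclass
class Posting:
    side: str      # "debit" or "credit"
    account: str   # hypothetical account name
    amount: int    # amount in cents


# "Microcode": tiny sub-transaction steps that know nothing about the case
# being handled. They see only an account and an amount.
def debit(account, amount):
    return Posting("debit", account, amount)


def credit(account, amount):
    return Posting("credit", account, amount)


# Case handlers: each high-level case is a fixed sequence of microcode steps.
def handle_deposit(txn):
    return [debit("cash", txn["amount"]), credit("customer", txn["amount"])]


def handle_fee(txn):
    return [debit("customer", txn["amount"]), credit("fee_income", txn["amount"])]


HANDLERS = {"deposit": handle_deposit, "fee": handle_fee}


def process(txn):
    # Top level: the only place that sees the whole incoming transaction.
    return HANDLERS[txn["kind"]](txn)


# A rule of the form "when X is Y, post somewhere else" depends on fields of
# txn, but credit() never sees txn. Honoring it would mean either passing txn
# (or a flag) through every handler into every microcode call, or having
# credit() peek at a global that process() sets.
```

Calling `process({"kind": "deposit", "amount": 500})` yields a debit to `cash` and a credit to `customer`. Nothing in `debit` or `credit` can tell which case it is serving, which is the property that made the requested exception so awkward.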

Now please do not think I am posing as a person of integrity or principle or anything because I am the most miserably pragmatic soul you can hope to meet, but I got back on the phone to Canada.

"Marcia, Kenny. My code does not want to do what you want, eh? Your spec is pretty clear on all these things and just from the sound of it it seems the way I am handling it now is right anyway. I can kludge things up to get what you want, but I will hate myself in the morning."

Marcia said she would talk to Bernie, her VP, and get back to me. She did not do so for so long that I forgot about it and to her credit she did not even have to make the call but she did.

"Kenny, Marcia. I spoke to Bernie. The way you are handling that transaction will be OK. We can live with the way you are doing it."

Tilton's Lemma lived on. Somehow clean design (starting I freely acknowledge with a great spec) had flushed out a faulty business practice. The users had indeed offered an RFE discontinuous with all their other stated requirements, but it turned out to be such an egregious error that they agreed to fix it. (To their credit again!)

Why does this war story give me goose bumps? Yes, I need to get out more. And OK, maybe it is not such a big deal. The careful architecture forced on us by having to program these godawful machines makes it easy for programmers to see inconsistencies the business experts miss.

I just get a kick out of it -- a sense of God or Plato or the Tao throwing their weight around as long as we take design seriously and hop on the phone over such things.

TLOP: The Worst Thing You Can Say About Software Is That It Works

I walked out of my cubicle, turned left, went up one maze turn, turned into the next cubicle. This was clearly a while ago, before email and even before I had reached the point where I would phone a guy ten feet away in case they were not there so I could leave voice mail.

I wanted to tell Mike that I had enhanced the julian date function he had shared with me so he could replace his with mine instead of the system growing a multiplicity of julian date functions.

Mike was another contractor on some godforsakenly dreary application being developed at cruising altitude in some megabank, a great guy always joking well and laughing about everything including his obsession with the real estate scheme infomercials he apparently watched non-stop in his free time.

I told Mike I had snazzed his function up nicely, generalizing it or adding some whiz-bang feature or something, and mentioned he could now modify his program to use the new version.

Something I need to get across to My Cherished Reader is the utter and profound honesty of Mike's reaction. Indeed, sufficient googling may well surface Mike's blog entry detailing in turn his astonishment and horror at what I had proposed. For when I finished there was a brief moment of incomprehension and then a dawning and his eyes grew wide and his jaw dropped and then a huge smile of wonder spread across his face, his palms rose in supplication for any possible justification of my madness and at last words came to him.

"But...but...," he stammered for effect. "My code already works!"

It was a grand moment, two solid technologists sincere in their craft standing face to face with not one scintilla of common ground on the simplest of questions.

Mike was reeling at the idea that a working module should be touched. He repeated his encomium ("It works!") and then with no offense intended and with solid good-feeling camaraderie just started laughing at me.

Meanwhile I knew without doubt that swapping in my version would likely take one minute (what's that? about sixty seconds?) and that application systems should be kept as simple as possible (and so not have two julian date functions (can you say Y2K?)).
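For readers who never met one, here is a minimal Python sketch of what a single shared julian date routine might look like. It assumes "julian date" in the data-processing sense of a year followed by a day-of-year number, which the story does not actually specify, and the function names are made up for the example. The four-digit year is the part that matters for the Y2K aside.

```python
from datetime import date, timedelta


def to_julian(d: date) -> int:
    """Return YYYYDDD: four-digit year followed by day of year (001-366)."""
    return d.year * 1000 + d.timetuple().tm_yday


def from_julian(yyyyddd: int) -> date:
    """Inverse of to_julian; rejects day numbers that do not exist."""
    year, day_of_year = divmod(yyyyddd, 1000)
    if not 1 <= day_of_year <= 366:
        raise ValueError(f"day of year out of range: {day_of_year}")
    result = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    if result.year != year:
        # Day 366 was requested for a year that is not a leap year.
        raise ValueError(f"{year} has no day {day_of_year}")
    return result
```

With one pair of functions like this in one place, a change such as widening the year only has to be made once.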

I shrugged -- it was a small matter -- and retraced my path back to my cubicle before the pheromone trail could grow cold. Well, this is my blog, so Mike was wrong. Tilton's Law:
  • The worst thing you can say about software is that it works.
I think I can prove that with geometry: if a pile of code does not work it is not software, we'll talk about its merit when it works, OK? Therefore to say software works is to say nothing. Therefore anything substantive one can say about software is better than to say it works. QED. (That last bit is the geometry.)

So what else besides that it works can you say about software, especially about applications written in tall buildings? Well, unless one is banking on the relatively high probability that the project will get canceled before ever it sees light of day, the software will eventually be asked to do new things or to do old things differently.

So the software must not be brittle and must not have any more built-in assumptions than necessary and where one builds in the assumptions it would be nice if they were abstracted out of the core so that new requirements can be handled as gracefully as possible.

Note that new and changed requirements come along not just after a successful deployment but also during development, perhaps even more so because only when users see software do they realize what they wanted. They should do a study: I bet requirements change more just implementing the software than they do ever after. Brittle, inflexible software can be so hard to change during development that projects never make it out the door no matter how well the software "works" on the original requirements.

Reminds me of another war story. Schedule "B".

The law on tax shelters had just been changed eliminating the bogus write-offs those things produced and some financial establishment needed some contractor to write the software to generate statements for clients detailing exactly how badly they would be hosed by the new law. Apparently they had tried it in-house and the guy assigned to it had quit after working on it for a couple of months to no avail. Jobs were everywhere back then in the Eighties, why not?

God forbid I should digress: it was a fun interview, really nice group and nice manager. The best point was when he asked His Kennyness what was His greatest weakness. Neat question!

Lemme see. I was looking at the floor going through coding, debugging, DB design, and it was taking a while which I thought was pretty funny and I am sure they were all enjoying it but I really wanted to come up with something and then it came to me.

"Testing!" I almost yelled. "I am terrible at testing!"

Whew! Reminds me of the time I had a little team (this was before people realized I should just be chained to an out-of-the-way workstation and hosed down and thrown food as needed on big projects) and was assigned a junior nooby who asked me the same question about three times and who hated me because apparently Hell hath no fury like a woman gently suggested she might take notes, which was great (that she hated me) because she was so useless we just had her testing my code and believe me no software has ever been subjected to such merciless, sustained, and deliberate abuse as was mine. Every twenty or thirty minutes we would hear this triumphant rejoicing "Ha!" explode from her cubicle as she found another bug and imagined another dagger plunged between my ribs when in fact I was loving the whole thing because to me testing is just God and Abraham and Isaac all over again. Where was I? Schedule "B".

Anyway, the task went very well. The manager (another Mike) gave me a very clear spec and I did my usual structured architectural blah-blah-blah and the users were very nice and called me every once in a while with RFEs. Bringing me to my point.

Mike was a great guy but also buttoned-down and into process and told me not to listen to the users, make them ask him and he would tell them, No. So I was always very nice with the users and said I welcomed their thoughtful consideration and input and could barely restrain myself from just dashing it off right now if I could just put them on hold for a second and I will indeed start looking at the matter while it would help very much if they could just run this great idea past Mike, he's a bit of a stickler for protocol, terrible bother I know, hate to ask this of you.

OK, they would say.

Then I would put down the phone and look at the code to see what was needed. Oh. Scratch, scratch, poke. Done.

Then I would go into Mike's office and tell him the users had asked for X and he would make a face and I would say I had already made the change if that would affect his decision.

Mind you this did not happen because I consciously planned for change, I just worried about things besides whether the code worked. Simple architectural tidiness went straight to the business bottom line: the users got exactly what they needed at no extra cost to the enterprise. How does simple clean design make RFEs easy to handle? Tilton's Lemma:
  • Refinements to requirements cannot vary fundamentally from the original.
It is possible for users to say, Oh my, you have the runners running the bases clockwise? We wanted them going counterclockwise. It is not possible for the users to come back and say, We wanted them going first to second base, then over to first, then to third, and then home. Some fundamental force of nature constrains how we can screw up and keeps it within a metalevel such that refinements to requirements will never break code all that badly.

[Keep your eyes open for the "We Can Live With the Way You Handled That" mind-bending illustration of the power of clean design.]

The flipside war story was in a different tall building working on a system so badly designed that after a certain point it pretty much stopped evolving at all.

I walked up to my manager and said I had an idea on how to improve something I was working on for her so that it could be used for other things.

"No, no, no," she said. "I just need this one thing working before Thursday's meeting so I can score one on the users. They are out to get us."

Ah, systems development as an act of war. The system they had developed was such a horror that the only thing that ever moved it forward was a team of very well-paid outside contractors who managed against all odds to add new functionality, pausing from their toils every several days when one of them would take a design two-by-four across his skull from said system and lean back and poll the others on how badly each felt the system hindered them, an order of magnitude generally being settled on as no exaggeration however often it usually is.

The IT department ended up with a bunker mentality lobbing back hand grenade code releases to suppress fire from users strafing us with gratuitous RFEs as the organization staggered towards collapse and acquisition, and it being a financial institution to no small degree dependent on how well it processed information I cannot help thinking this system was to blame.

But it did work.

Oh, sorry. Lisp? Well, the more powerful the language the more time one has for that much easier a job of designing an application as opposed to banging out whatever source code passes the unit tests. And when it comes to separating specifics out into a configuration area that drives a more universal engine... can you say DSL? Sher ya can.

Monday, March 10, 2008

Tilton's Law: Solve the First Problem

This was such a weird project. Scheduled for five days altogether. My friend from the clinical drug trial venture was also a tech recruiter who got me about half my tech jobs over the years and this one was a real throwaway.

What we had was a mid-80s start-up in the educational software game producing exactly the kind of mind-numbing drill and practice software that was supposed to revolutionize education because Look, Ma! We used computers!

Now they were stuck on some software problem and needed help fast. Their stack was Tandy, Cobol, and some micro database package. My skills were Apple, Cobol, and ISAM and in those days that was a deerskin glove fit so off I went for a mutual look-see.

I was on the beach, why not?

The next morning I am walking up to an apartment building where this enterprise had wedged itself into what was meant to be doctor's offices. Inside I sit down with the top guy in his office and the entire company joins us.

The staff unleashes a thirty minute nightmare tale of software crashes, dysfunctions, anomalies, and disrepair as each person takes turns reciting some utterly bizarre malfunction of the application, all with the database software as the likely culprit. It was a tag-team misery report, a through the looking glass panoply of software non-determinism. It was wonderful.

A half dozen times I formulated "Explanatory Guess X" only to hear in the speaker's next sentence that they had thought it might be X but no luck. I mean it was really wonderful and then finally it ended. My head was spinning.

"Have you worked with the Tandy OS?" the manager asked.
"Yes, but it does not sound like Cobol is your problem."
"No. I don't suppose you have worked with this DBMS?"
A pause.

"Can you help us?" See straw. Clutch.

I have no idea what to tell them.

"Is the DBMS any good?", I recover enough to ask.
"I checked it out pretty well. It got great reviews, it is supposed to be the best."
I look down at my shoes.

The contract was for five days. The longest any single glitch had stopped me was for five days. Do the arithmetic.

"Yes," I say.

It took seven. They paid up front for the first five, never paid for the last two probably because they did not have it or maybe because of the way things went. You'll see. And I am surprised it came to seven days, I only remember one or two. I never ran their software once and I do not remember even touching a computer. Here is what happened.

After signing on I took home the manuals for their DBMS and a listing of their schema definition. It took maybe a day to decide that everything looked right. The next day I ask Tom the programmer how hard it would be to just initialize an empty database and start over entering the data.

"Easy", says Tom.

Welcome to Tilton's Law: Solve the First Problem. They had described to me twenty distinct failures and that was too many for me, I am not smart like you guys, I cannot just figure these things out in the shower.

I wanted to turn the software off and turn it back on with a clean slate and see what went wrong first and stop right there. I just wanted to see what went wrong first and fix that. I suspect that needs no explanation, but what am I doing up on this soapbox if I am not going to explain these things?

Here goes. Once upon a time my sleazebag ward politician buddy and I were cruising the singles bars back when they had such things and he got nicely eviscerated by a woman we were chatting up. My buddy had said something cynical and she had challenged him on it.

"Oh, I have compromised my principles a few times," he conceded with a sly grin.

"You can only compromise your principles once," she replied. "After that you don't have any."

Software is the same. This stuff is hard enough to get right when things are working nominally, but once they go wrong we no longer have a system that even should work.

Back on the project, the next day I get a call.

"Bad news," Tom says. Uh-oh.

"What happened?"
"Same thing. Mary was entering the 118th record and the program crashed."

I pretty much fell out of my chair. Somewhere in the thirty minute firestorm of issues I had heard the number 118.

"118 sounds familiar."
"Yep," Tom moaned inconsolably. "That's what happened before. Sorry, no difference."
I was doing cartwheels.

"Tom, how hard would it be to write a program to just write out a couple hundred records, just put in dummy data, 1-2-3-4-5...?"
"That would be easy."
"Awesome, do that and let's see what happens in batch mode," says me.
"And reinitialize the DB first, OK?"

The next day I hear from Tom. Sounds like he is calling from the morgue.

"Bad news, Kenny."
Oh, no. It worked.

"What happened?"
"Same thing. The program wrote out 118 records and crashed. Sorry, Kenny."
Oh, yeah, I just hate easily reproducible errors. Not!

"Listen, Tom, let's try making the buffer allocation bigger."

The next day, "Bad news. Same thing."
I am icing the champagne; this is one solid, reproducible bug. But what about the others?

"Tom, remember the first time this thing crashed, before I came on board?"
"Did you start over from a fresh database or just resume working on the one that had been open when the DBMS had crashed?"
"We just continued working with the same DB."
"Oh. OK."

Tilton's Law (Solve the First Problem) had been broken as badly as broken can be. A DBMS had failed while writing data and they had tried to continue using the same physical DB. This transgression is so severe it almost does not count.

Normally Tilton's Law refers to two or three observed issues that do not necessarily seem even to be in the same ballpark. The law says pick out the one that seems most firstish and work on that and only that until it is solved. The other problems might just go away and even if not the last thing we need to do while working on one problem is to be looking over our shoulders at possible collateral damage from some other problem.
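The batch test Tom ran (reinitialize the database, write dummy records, note where it first breaks) can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: `sqlite3` stands in for the unnamed DBMS, and `first_failure`, `fresh_db`, and `flaky_writer` are hypothetical names invented here.

```python
import sqlite3


def first_failure(write_record, count):
    """Write dummy records 1..count; return the index of the first failure, or None."""
    for i in range(1, count + 1):
        try:
            write_record(i)
        except Exception:
            return i  # stop at the first problem instead of pressing on
    return None


def fresh_db():
    """Start from a clean, empty database every run (in-memory stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
    return conn


def flaky_writer(limit):
    """Simulate a store that fails once more than `limit` records are written."""
    written = []

    def write(i):
        if len(written) >= limit:
            raise RuntimeError("simulated crash")
        written.append(i)

    return write


conn = fresh_db()
healthy = first_failure(
    lambda i: conn.execute("INSERT INTO records VALUES (?, ?)", (i, f"dummy-{i}")),
    200,
)
broken = first_failure(flaky_writer(limit=117), 200)
```

Here `healthy` is `None` because the stand-in database accepts all 200 rows, while `broken` is `118`, the same record number every run. A failure that shows up at the same place on a clean slate is the one to chase first.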

Two minutes later I am on the phone to DBMS tech support.

"Hi, we're reliably crashing after adding 118 records in one sitting."
"Yes, that is a known problem."
Oh. My. God.

"Would you like us to send you the patch for that?", she asks.
"That would be lovely."

This being before the advent of the Interweb we confirmed our mailing address and asked for it to be sent out ASAP by overnight delivery. But we are not done yet. Tilton's Law or no, all I have solved is P1, the first problem.

"One more thing," I say.
"If we continue working with the DB after this crash..."
"Oh, no. Don't do that. It's hopelessly corrupted at that point."

Were some of the other issues unrelated to the first crash? I will let you know as soon as this test I have running to solve the halting problem finishes.

Meanwhile, the conversation had suggested how we might get them up and entering data now. Apparently we were crashing because of a bug that surfaced when more than so many records were being held in the buffer before being written out. We had tried making the buffer bigger, only making things worse.

"Tom, we can wait for the patch, but I have one last idea in mind that might get this thing working for you. Want to try one more thing?"
"Try making the buffer half the size it was when we started."
A few minutes later he comes back.

"It works now."
"Yeah, baby!"
"I had it loop to one thousand. No problem."
"Cool. Let's tell the others and go get drunk."

Nope. Something is wrong. Tom is just standing in the doorway all deer and headlights.

"Can I ask you something?", Tom asks quietly.
"I do not understand why making the buffer smaller made the program work."
"Well there was this bug that had to do with being unable to keep more than so many records in memory and with a smaller buffer the software did not try to keep so many in memory."
Long pause.

"OK, but why does it work now?"

"Maybe 118 multiplied by the record size is more than 16,384 and somewhere in the DBMS logic there was an integer overflow so the problem does not come up if the cache is smaller and the software flushes the cache before it gets to 16,384."

"All right," says Tom. "But I do not understand why we make the buffer smaller and now the software works."

This was surreal. I try a different tack, a really dumb one, but sometimes when a grizzly bear has your back to the wall all you can do is tap dance.

"Look. There are multiple code paths in an application, right? Every conditional is a fork in the path. A bug exists in some branch or other out of all the code paths, right? By changing a fundamental parameter we send the code down a different code path. Avoiding the bug."

"I just don't understand why making the buffer smaller makes the program work."

Then it came to me. I was Dr. Chandra in 2010 trying to get Hal to fire the rockets, and Tom was Hal stuck in a Mobius loop unable to resolve my understanding of the confusion with his confusion of the understanding.

"I don't know, Tom," I say. "I don't know why it works now."
Tom nods.

Suddenly Mike, the project lead, appears.

"Kenny, Tom. In my office. Now."

"OK, this has to stop. Kenny, I am paying you to solve this problem and you have Tom doing all your work. He has his own work to do. From now on you work on this problem and Tom you do what you are supposed to be doing. Have I made myself clear?"

Remember in Annie Hall when Woody Allen turns to the camera and asks, Why can't real life be like this?

"Actually...I think I'm done."

Leaving Mike and his facial expression frozen in spacetime, I turn to Tom with raised eyebrows for his assent and Tom nods. I turn back to Mike, who no longer knows where he is.

"It turns out this is a known bug. You'll have a patch tomorrow or the next day. In the meantime we found a workaround and you are up and running. Mary can start entering your data, um, now."

Mike recovers.

"So basically I am sitting here making a complete ass out of myself?"

Good for him. We all had a good laugh, shook hands and I was on my way and Tilton's Law of Programming was reaffirmed: Always solve the first problem. The corollary: there only ever is the first problem.

Kenny and The Firing Squad, Episode II: COBOL Has Macros?!!

[If it is good enough for George Lucas (random release of episodes) it is good enough for me.]

[Roll the trailer: Kenny hanging with one hand onto the rear axle of a runaway Cobol stagecoach headed straight for a cliff off which a team of otherwise sensible horses soon will be jumping, reaching with his other hand to his ankle sheath and pulling out... COPY...REPLACING!]

Well, someone on comp.lang.lisp made a joke about Cobol and Lisp so I thought I would tell this twenty years later still astonishing cobol-and-lisp-separated-at-birth? war story within a war story about how COPY...REPLACING -- no, that is real Cobol, not a Cobol joke -- ended up taking away the award for Best Supporting Language Feature.

A war story within a war story, with the same theme because I guess they shared a central figure, a young project lead we will call (...thinking of a name...) Reggie. That is good. I got it from Regular Guy and I want to make clear that Reggie was a good guy with a good sense of humor, someone I would party with any time, a young guy his manager (another good guy... look, all these people working in tall buildings are good guys tolerating loose cannon per diems like me to get their code written but not inviting us to the company Christmas party (in the cafeteria -- no loss -- I would have showered had I known) actually causing a boycott by one great guy who could not believe I was left sitting alone at my workstation) ... a young guy his manager trusted implicitly I am sure because of Reggie's technical triumphs past.

The outer story is that I was not even supposed to be working on the task from which I was about to be fired for incompetence. I had not even interviewed with Reggie. (Was that the problem? Hmm.) Reggie had gone to an RDB training class and done his first relational schema and an in-house consulting team had deemed it faulty. Reggie's manager backed him so I was brought in as an out-house... hang on... external consultant to provide a second opinion.

I agreed with the in-house team but Reggie would not budge and the manager still backed him (and I need to admit no one was able to make Reggie see his confusion which must be a story in itself because he was listening to us, he really was a good guy). The schema stayed, and a stage was now a phase:

Reggie's mistake was that he saw that two drug trial business objects (phase and stage) both had a one-to-many relationship with the sample biz object, so he treated stage as a phase. But a stage is not a phase. (In Phase IV of testing the samples from Phase III were divided into different stages.) But Reg was fixated on that one abstract commonality and could not get past it.

So? Going away lunch for Kenny? Maybe just handshakes all around since I had only been there a couple of weeks? Nope. It turns out I was there for another reason besides my brilliant mastery of relational DB design: they needed someone to grind out nineteen nearly identical query screens which can best be described as exactly what 4GL packages provide at the push of a button.

Feel free to groan.

Yep. Nineteen screens, each aimed at some node in the DB hierarchy. A few fields at the top constituting key fields and selection criteria, while below was a scrolling list of matching items, each column showing some attribute. Select a row, hit the zoom key and.... sorry, could those of you still awake please nudge the ones who are snoring?

The stack was VAX/VMS, COBOL, TDMS for forms management, and I forget what RDB. Based on the nineteen identical screens take one down pass it around eighteen screens to code on the wall functional spec a child of three could see some template approach was in order, if only through the miracle of cut and paste. But I had been around long enough to know how fast the Red Sea can collapse when after cutting and pasting nineteen screens the user comes along and says could these screens work just a wee bit differently?

Now it happens I had been on a project run by a team of English yobbos who had nothing but contempt for us the few American cowboys (they called us) hired to fill some chairs but they were great fun and pretty good and we had together discovered that Cobol not only had copy-replacing but it could even replace part of a name. I forget what it took, but we found just one letter in the ASCII character set (pretend it was the octothorpe) that let us code:
perform edit-#fieldname.
in a template and then (guessing at much of this syntax):
copy "edit-template" 

Well, one was all we needed, so away we went and life was good and now here I am staring at nineteen screens and I forgot to mention Cobol's MOVE CORRESPONDING which could populate TDMS data records from DB data records if I was bright enough to name things the same and stand back Argentina! Here come the screens!

I told Reggie what I was going to do. He said, Fine. I said it would help knowing which of the nineteen screens was the nastiest (OK, they were not all that identical). He got it. Fourteen is a beast. It pulls in data from two yadda yadda. Off I went.

You all know how it goes. I spend a week on screen fourteen, not because it takes a week but because I am building a framework, with one eye on the other eighteen screens sitting in the corner. Hell, I even built an IDE.

The command language for VMS was DCL, a gorgeous little thing. This was all before the multi-window days of VMS. Things got a little tedious as I bounced from editing the TDMS form to the Cobol template to the screen-specific Cobol copy-replacing the template to testing the thing, so I wrote a DCL script that just needed to know on which screen I was working and what I wanted to do and then did The Right Thing. When I exited I fell back into the script, which asked what now? Way cool and productive.

Where was I? Oh. You know how it goes. With screen fourteen working I launch into screen one, hoping it Just Works because Reggie and I had picked out A Beast. Ah, not bad. Screen two. Check. Screen three. Here's a twist. Scribble scribble... good.

Screen four. Excuse me? Two lines of info in order to display one detail? Ouch, did not see that one coming. Bang bang hammer thwack... OK. And on we go. Sure, I still had to build a TDMS form for each screen, but I will live. Or so I thought.

It is Monday of week three and Reggie is standing at the door to my windowless little room.

"How's it going?"
"I ask because I haven't seen anything yet."
"Sorry?" Hairs on end.

In five years of programming including three as a pricey independent consultant I had yet to hear one discouraging word from a manager unless you count the time the guy (a very big guy) stood over me beet red literally spitting "Sh*t! Sh*t! You didn't test it, did you?!" as my program gleefully copied only every third record in its first production run because I had left in the bit that did that to make tests run faster. Another day.

"Well it has been two weeks and I haven't seen one screen yet."
"I told you. I am doing all nineteen, not one."
"That's all well and good, but I will be at a class all week, won't be back till Friday, and I need to see something by then."
"Fine." Not sure if you can see the smoke rising from my collar.

It is now Friday noon. Kenny has logged no overtime, but Kenny has had his game face on for four solid days. Co-workers discovered they could rewarm their coffee faster than the microwave if they just put it in the same room with me.

Kenny has little hope of a Hollywood ending, but suddenly Reggie is standing in the doorway, right where he was when he hung the sword over my neck.

"How's it going?"
"Good. How was the class?"
"So where do we stand?"
"With the screens?"
"They're done."
"They're done? All of them?"
"Well, if that's true, that's great."
Withering look. And the fun has only begun.

Late Monday Reggie swings by with a list of trivial bugs and issues.

"I really like the way it works."
"It's your design."
"Right," and reaches over his shoulder to pat himself on the back.
Did I mention he was a good guy?

"Could you print out the source code for me? I'll read it on the train."
"Will do," looking forward to the praise and astonishment soon to be heaped on me.

Tuesday. My turn to stand in the door of his office.

"Kenny, I was trying to read your code on the train home."

"First I'm up here, it says perform this, I have to flip down to find it, then there it says perform that, I'm flipping down again, then I flip back up and it says perform something else and I am flipping down again. Then up then down then up again. What's up with that?"

"It's called structured programming. It's in all the books."

I forgot to mention that by this point I had already given notice. I explained that I thought I was signing on for a short DB design review, not a job in a body shop. They reacted with all the chagrin of the monks at learning Ace Ventura was leaving the temple.

"Yeah, well that's all well and good, but if I cannot understand your code we won't be able to use it."

Well, I was right about one thing, only it turned out to be the garbage heap. By now I think I have made clear to my Honored Reader that I do not consider myself an exemplar of human social interaction including especially the delicacy required by the workplace, but what I did next disappointed even myself but felt great because it was so perfectly honest: I slowly closed my eyes, leaned my head against the door jamb, and just stayed that way.

Reggie could not help laughing. Did I mention... yeah, I did.

"All right. Look. We'll do a code walk-through with Roger [his co-lead] this afternoon, see if we can figure this out."


This afternoon. Twenty-thirty minutes. Copy-replacing. Here is the template, here is a usage. The usage has all and only the stuff specific to that screen. The template. blah blah blah. This is here because of screen eight, two lines to show one detail. If you get other screens like that use eight as your starting point. Oh, this is because of fourteen, that special XYZ requirement. Yadda yadda yadda but there was one strange thing and I was deeply worried.

Neither of them asked a question. Ever. Not one. Have I conveyed the smallness of the questioning? This cannot be good, but I had no more to say.

"Any questions? Reg?"
"No." Pause. "Pretty obvious, actually."

And then silence.

Sunday, March 9, 2008

Tilton's Law of Programming: Fear No Evil

We just had this exchange on comp.lang.lisp:

srdjan.m wrote:
On Mar 9, 12:01 am, Ken Tilton wrote:

danb wrote:

On Mar 8, 10:59 am, "srdjan.m" wrote:

(defun test ()
  (let ((x '(1)))
    (not-ok x)))
CL-USER> (test)
(1 C)
CL-USER> (test)
(1 C C)
I really do not understand why let is not initializing x
quote (the single quotation mark) doesn't create a new list, and of
course nconc alters x in place. So you're repeatedly binding x to the
same list and appending 'c to that list.
Incidentally, this is the universal beginner's question. It's been
asked at least three or four times in the last few months. So you
have plenty of company :)
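To make that explanation concrete, here is a minimal sketch of the repair. The poster's own function is not shown in the thread, so add-c below is a hypothetical stand-in that destructively appends C. The only change that matters is building x with (list 1), which conses a fresh list on every call, instead of the quoted literal '(1), which is the same object every time (and modifying a literal is undefined behavior in any case).

```lisp
;; Hypothetical stand-in for the poster's function (its definition is
;; not shown in the thread): destructively append the symbol C.
(defun add-c (list)
  (nconc list (list 'c)))

;; (list 1) allocates a new list on each call, so the destructive
;; NCONC has no old state to pile onto.
(defun test-fresh ()
  (let ((x (list 1)))
    (add-c x)))

(test-fresh) ; => (1 C)
(test-fresh) ; => (1 C) again, not (1 C C)
```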

IIANM somewhere in here is either a list or a pointer to a list of these
common nooby coding gotchas:


Indeed there it is

Oh my: "Destructive operations, such as NCONC, SORT, DELETE, RPLACA, and RPLACD, should be used carefully and sparingly. In general, trust the garbage collector: allocate new data structures when you need them."

That's nuts! It sows fear and trembling while offering no actual guidance as to when to use destructive operations and when not.

"Use them when you need them." I wonder what the next question will be. (Hint: "When do I need them?".)

As for using them sparingly, what on earth does that mean? What if I "need them" twenty-five times, should I spare a few of them because twenty-five is too many?

Technology is like an ill-tempered dog: never show it fear. If you are not sure how X works do not start inventing weird usage rules unrelated to how the technology works hoping they will somehow keep you safe. You will end up worrying a lot, not get the most out of your tool, and still get bit in the ass.

Instead, take a minute or hour to find out how X works. In this case, there is nothing "in general" or "sparingly" or "as a rule" about it: if you own all the list structure on which you are about to operate, always use the destructive variant. If not, never use the destructive variant.

The non-excepting exception is when the function you are writing is meant to be destructive, in which case the caller is responsible for using it as they would any other destructive function.

So when do I own structure? When I have created it, or called functions that have created it for me. When do I not own structure? When it has been passed to me.

Suppose we have a silly function like this:
(defun do-filter-and-flatten (lists do test)
  (apply 'append
         (delete-if-not test
                        (mapcar do lists))))
mapcar generates fresh structure, so I can use delete-if-not instead of remove-if-not on the list produced by mapcar. But the lists themselves -- the members of the input parameter lists -- came from somewhere else and might be in use, so they cannot be touched. Note that the lists structure itself came from someplace else but does not come into play in this example because we begin by effectively copying that structure while mapcar-ing the do function.

We can turn that around to drive home the point, by applying the filter first (and getting a much different function):
(defun filter-do-and-flatten (lists do test)
  (apply #'append
         (let ((tmp (loop for list in lists
                          when (funcall test list)
                            collect list)))
           (map-into tmp do tmp))))
We cannot touch lists (via delete-if-not) because it has been passed to us, so we test and collect. But now we own the returned cons cells bound to tmp and are free to whack their CARs with the mutating map-into -- which Stanislaw Halik just pointed out to me and which I do not think I had even heard of in thirteen years of Lisping!

But if I may digress, Kenny don't play let, certainly not in the middle of a Grahamanian cascade:
(defun filter-do-and-flatten (lists do test)
  (loop for list in lists
        when (funcall test list)
          append (funcall do list)))
A final note. What if in this last version we knew that any "do" function would massage every element returning a new list along the way. Could we make nconc the last step? I would not. Sometimes functions like these decide they have no work to do and then return the input list structure. We use destructive functions only when we know we own the structure we will be altering, otherwise not.

Then and only then shall we fear no evil.

[All code above Just Typed In, corrections are welcome (and thanks to Anonymous for further reminding me that the non-destructive remove-if-not cannot be assumed to copy -- it might return the input list untouched).]
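For anyone who wants to watch the hazard in that final note happen, here is a small sketch. The function names and the two sample lists are made up for illustration, and identity plays the part of a "do" function that decides it has no work to do and hands back its input.

```lisp
;; Two hypothetical flatteners that differ only in the LOOP clause.
(defun flatten-with-append (lists do)
  (loop for list in lists append (funcall do list)))

(defun flatten-with-nconc (lists do)
  (loop for list in lists nconc (funcall do list)))

;; Structure that belongs to the caller, not to the flatteners.
(defparameter *a* (list 1 2))
(defparameter *b* (list 3 4))

;; IDENTITY returns its argument untouched, so each flattener is
;; really operating on the caller's own cons cells.
(flatten-with-append (list *a* *b*) #'identity) ; => (1 2 3 4), *a* still (1 2)
(flatten-with-nconc  (list *a* *b*) #'identity) ; => (1 2 3 4), but *a* is now (1 2 3 4)
```

The nconc version spliced *b* onto the tail of *a*, which is exactly the bite you get for altering structure you do not own.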

Saturday, March 8, 2008

My Biggest Lisp Project

Someone asked how much Lisp I have really done. I am building a resume these days getting ready to look for some Lisp work, so I thought I would kill two birds with one stone and write up my experience as the architect and lead developer (out of two, for the most part) of a clinical drug trial management system.

Over a couple of years we built a system consisting of eighty thousand lines of Lisp, having probably thrown away another fifty thousand along the way. We were in the classic situation described by Paul Graham in On Lisp: not knowing exactly what program we were writing when we set out on our mission. We also used C libraries for: writing and reading 2D barcodes; forms scanning; character recognition; generating TIFFs, and probably a couple I am forgetting.

The application was a nasty one:

-- capture clinical drug trial patient data as it was generated on paper at the participating physician's site, using scanner and handwriting recognition tools;

-- validate as much as possible with arbitrary edits, such as complex cross-edits against other data;

-- allow a distributed workgroup to monitor sites and correct high-order mistakes;

-- track all changes and corrections; and

-- do everything with an impeccable audit trail, because the FDA has very strict requirements along these lines.

Making things worse, doctors are often as sloppy about trial details as the FDA is strict about having the rules followed. But drug companies cannot run trials themselves because the FDA demands that investigators be independent to avoid conflict of interest. Getting compliance is tough, and that is the opportunity we were targeting -- more better compliance through automation at a granular, near real-time level sufficient to give drug companies effective oversight over investigator performance.

The stakes are tremendous. Blockbuster drugs can earn millions of dollars a day but only while under patent protection. Unfortunately, patents must be acquired at the start of the trial process, which can run for many years. A third or more of the revenue-rich patent life is spent just getting to market. Big snafus in trials can force months of delay with an opportunity cost of millions a day.

When I got the call from my good friend who was the angel and visionary on this project, he was two years in with not a lot to show for it and his last top developer had just given notice. I went in to hear what they were up to and do an exit interview with the dearly departing.

The business plan was to score big by handling hundreds of trials a year. This would be especially tough because every trial was different. Each involved a custom set of forms to be collected over a series of patient exams. These forms varied from exam to exam. Business logic dictated validation of the forms and how the trial was to run and varied from trial to trial as dictated by what is known as a trial protocol.

When I heard all this I knew we would have to find a solution that did not involve custom programming for each trial. The application would have to be configurable without programming, by a power user trained in the software. If Lisp is the programmable programming language, we needed a programmable application.

Later I learned that competitors in our space had half our functionality and could not handle more than fifteen trials a year and were not profitable. They attempted what they called a "technology transfer" to the drug companies, translating as "we cannot scale this approach but maybe you can". Hmm.

The departing guru showed me what they had so far, which was a system built in Visual Basic with an SQL database. First came the demo of the interactive module, a pure mockup with no substance behind it. Then he showed me the scanning and forms recognition tools in action. He printed out a form built using Word or Visio, scanned it back in, then opened the JPEG file in a manual training tool that came with the recognition software. Field by field he painfully showed the software where each field was, what its name should be, and whether it was numeric, or alpha, yadda yadda.

Ouch. The process was slow and created a brittle bridge from form to application. Worse, these forms might be redesigned at any time leading up to trial commencement in response to concerns from external trial review panels and they can change during a trial in response to field experience. At any given time multiple versions of the same form could be in existence (trials at different sites do not start and stop together), so not only would developers be forever retraining the recognition and then modifying the software to know about new or changed fields, but they also would have to keep alive all the multiple versions of form definitions and match them to specific forms as they got scanned back in or opened for editing.

I was already thinking about automating everything and this process was one that had to be automated. I asked the departing guru if the forms recognition software could be trained via an API instead of via the utility program he was using. He looked at me a moment and said, Yes, realizing I think that is what they should have done. I realized he was leaving in part because he was just a systems guy at heart and this was one deadly application problem to undertake.

To my friend's relief I agreed to give his vision a try. Now it was my turn, but I did not consider for a moment switching to Lisp. This was a serious business application and I knew my friend would never go for it. Nowadays with more experience that is exactly where I would have begun; back then I did not even consider fighting that fight.

No problem. I had pulled off table-driven designs in the past using COBOL and the table-driven design was going to be the key to our success, not the language. Making the table thing work also meant we would want a custom language so we could express the table data easily.

Back home, I looked up at my bookshelf for the manuals to various Lex/Yacc tools I had bought over the years. But I did not look for very long. I knew I could at least use Lisp to quickly prototype the language using a few clever macros, while Lex/Yacc were known to be bears to use -- and I had only bought the tools, never really played with either in anger. So Lisp it would be to prototype the trial specification language.

Then I noticed something, my latest hobby creation using Lisp and my cherished Cells hack: an interactive version of the New York Times Double-Acrostic puzzle. It had individual boxes for each letter like any good crossword and that is what we needed for the patient data! So I was halfway home, and sure enough after a few hours I was looking at a perfect mockup of the first page of the sample clinical trial we were using to build the software. And my mouth dropped open at what I saw.

A blinking cursor. In the first character box on the page. I saw that and knew we would be using Lisp for the project. Hunh?

Well, we had to print these forms out, train the forms recognition software, scan the forms back in, feed the scan image to the trained software, and then -- wait for it -- let the users make corrections to the data. And there it was, the blinking cursor waiting for my input. I had been so concerned first with getting the form laid out and ready for printing that I had forgotten that I had cannibalized the code from the crossword puzzle software.

We're done! How famous are those last words? Anyway, my next step was to make sure this wonderful stuff running on my Macintosh under Digitool MCL would run on Windows NT, the planned deployment platform. I figured Lisp I could sell, but the Mac? In 1998? Not.

So on my own dime I acquired an entry-level version of AllegroCL and spent a few weeks porting everything to Windows and it was more of an OS port because Common Lisp is an ANSI standard with which vendors comply pretty well. When I saw everything working on Windows I called my friend, took a deep breath, and said I wanted to use Lisp.

"OK," he said.
"And an OODB," I said.

Some fight. The OODB was AllegroStore, then the vendor's solution for persistent CLOS. (Now they push AllegroCache, a native solution where AStore was built atop the C++ OODB ObjectStore.)

Why the OODB? I had noticed something while playing with the database design for the system: I missed objects. I had done some intense relational design back in the day and enjoyed it but by the time in question I had also done a ton of OO and I missed it in my schema design. Not only would AStore give me OO in my DB, it would do so in the form of persistent CLOS, meaning my persistent data would be very much like all my other application data. The whole problem of mapping from runtime data structures to the database -- something programmers have long taken for granted -- would go away.

About a year later we had the whole thing working and I was about to take a working vacation traveling first to San Francisco for LUGM '99, the annual Lisp meeting (here is the best talk) and then on to Taiwan for two weeks of work and socializing with friends. I decided to use this time to tackle a performance problem that had reached "The Point of Being Dealt With". The solution I had in mind seemed like it would be fun (more famous last words) and nicely self-contained and not too onerous for a working holiday.

Basically my approach to performance is not to worry about it until I have introduced enough nonsense that the application is running slow enough to hurt my productivity. Then it is time for what I call Speed Week, a true break from new development dedicated purely to making the system hum again. And we had reached that point.

Otherwise the application was wonderful, just as planned. A trial form was specified using defform, defpage, defpanel, deffield, etc etc. From that single specification came beautifully typeset paper forms, the corresponding DB schema, the unrestricted business logic, the automatic training, scanning, and recognition, and screen forms for on-line editing of the data once scanned in.
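The real defform and friends are not shown here, so what follows is only a toy sketch of the single-specification idea, with invented syntax: this defform takes a flat list of (name type :label text) entries, and form-schema and form-labels are hypothetical stand-ins for the many back ends (paper, schema, screen) that read the same spec.

```lisp
;; Toy registry of form specifications, keyed by form name.
(defvar *forms* (make-hash-table))

(defmacro defform (name &body fields)
  "Record the field specs for NAME as plain data (invented syntax)."
  `(setf (gethash ',name *forms*) ',fields))

;; One specification...
(defform vitals
  (patient-id :string :label "Patient ID")
  (weight-kg  :number :label "Weight (kg)"))

;; ...and two consumers deriving different artifacts from it.
(defun form-schema (name)
  "Column names and types, as a database back end might want them."
  (mapcar (lambda (field) (list (first field) (second field)))
          (gethash name *forms*)))

(defun form-labels (name)
  "Printed captions, as a paper or screen layout might want them."
  (mapcar (lambda (field) (getf (cddr field) :label))
          (gethash name *forms*)))

(form-schema 'vitals) ; => ((PATIENT-ID :STRING) (WEIGHT-KG :NUMBER))
(form-labels 'vitals) ; => ("Patient ID" "Weight (kg)")
```

Change the spec once and every consumer sees the change, which is the whole point of deriving everything from one place.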

Life was good, but run-time instantiation of a form was a pig. It occurred to me that each time a page was instantiated the exact same layout rules were being invoked to calculate where all the text and input fields should fall. The forms author merely specified the content and simple layout such as rows and columns and Cell rules used fontmetrics to decide how big things were and then how to arrange them neatly. The author simply eyeballed the result and decided how many panels (semantically related groups of fields) would fit on a page. So coming straight from the source there was a load of work being done coming up with exactly the same values each time. Some timing runs showed this was where I had my performance problem.

What to do? Sure, we could (and did!) memoize the results and then reuse the values when the same page was loaded a second time, or we could... and then my mouth dropped open again. The alternative (described soon) would let me handle the six hundred pound gorilla I have not mentioned.

First I have to tell you about an overarching problem. One prime directive was that trial sites be fully functional even if they were off the network. Thin-client solutions need not apply. So we had to get client sites set up with information specific to all and only those trials they would be doing. And keep that information current as specifications changed during a trial. My six hundred pound gorilla was version control of the software and forms.

Now here is the alternative. What if we instantiate a form in memory, let the cells compute the layout, and then traverse the form writing out a persistent mirror image of what we find, including computed layout coordinates? Business logic can be written out symbolically and read back in because thanks to Dr. McCarthy code is data. We avoid the redundant computations, but more importantly we now had a changed form specification as a second set of data instead of as a software release. Work on the original performance problem had serendipitously dispatched Kong, because now the replication scheme we would be doing anyway would be moving not just trial data in from the clinical sites but also the configuration data from the drug companies out to the sites. We're done!

If it works. The test of the concept was simple. I designed one of the forms using the trial specification DSL (the macros), compiled, loaded, instantiated, displayed, printed it, did some interactive data entry. Cool, it all works.

Then I "compiled" the form as described above, writing everything out to the OODB. Then I ended my Lisp session to erase any knowledge of the original form source specification.

Now I start a new Lisp session and do not load the source code specifying the form. Using a second utility to read the form information back in from the database, I instantiate the form in memory. And it works the same as the one instantiated from the original specifications. Life was good and about to get better.
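The "code is data" half of that round trip can be sketched with nothing but the standard printer and reader. Everything named here is an assumption for illustration: the validation rule is invented, and a string stands in for the OODB.

```lisp
;; A hypothetical business rule, held as an ordinary list.
(defparameter *rule-source*
  '(lambda (systolic diastolic)
     (and (numberp systolic)
          (numberp diastolic)
          (> systolic diastolic))))

;; "Write it out": print the list readably. A string plays the
;; part of the database in this sketch.
(defparameter *stored* (prin1-to-string *rule-source*))

;; "Read it back in" and compile. *READ-EVAL* is bound to NIL so
;; that reading stored text cannot run code via #. syntax.
(defparameter *rule*
  (compile nil (let ((*read-eval* nil))
                 (read-from-string *stored*))))

(funcall *rule* 120 80) ; => T
(funcall *rule* 80 120) ; => NIL
```

Nothing downstream of *stored* needs the original source form, which is what lets a changed rule travel as data instead of as a software release.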

Round about now we had the whole thing working, by which I mean all the hard or interesting pieces had been solved and seen to run from form design to printing to capture and validation. Work remained, such as the partial replication hack, but the DB design had taken this requirement into account and was set up nicely to support such a beast. That in turn would let us toss off the workgroup requirement by which remote trial monitors hired by drug companies could keep an eye on the doctors. But then came a four-day Fourth of July and I decided to treat myself to some fun.

For eighteen months we had been working with exactly one specific form from the sample trial. It occurred to me to see how powerful was my DSL, the mechanism by which eventually power users would be describing an arbitrary clinical drug trial to the generic trial manager application, and actually build out the remaining forms of the eighteen-visit trial. The experiment would be compromised two ways.

First, I would be the power user. Talk about cheating. But we understood the friendliness of the specification language would have to be developed over time as folks other than me came to grips with it, and also that those power users would have push-button access to software experts when they got stuck -- we were developing an in-house application, not a presentation authoring tool for the general public.

The second compromise went the other way, making things harder. As I proceeded through the new forms I would be encountering specific layout requirements for the first time, and writing new implementation code as much as I would be simply designing the forms in power user mode. Regarding this, I imposed a constraint on myself: I would design the forms to match exactly the forms as they had been designed in Word, even though in reality one normally takes shortcuts and tells the user "you know, if you laid it out this way (which actually looks better) we would not have to change the layout code". But I wanted to put the principle to test that we had developed a general-purpose forms design language.

What happened? It was a four-day weekend and I worked hard all four days but the weather was great and I got a good dose of Central Park each day skating and fruitlessly pursuing romance. (I liked to concentrate on new skaters because they cannot skate well enough to get away from me.) On the development side as predicted I spent at least half my time extending the framework to handle new layout requirements presented by different forms. But by the end of the weekend I was done with the entire trial. Yippee.

I spent a day writing out the pseudo-code for the partial replication scheme and the other guy started on that while I started thinking about the workgroup aspect. It occurred to me that the mechanism for storing forms describing patient visits could be used for any coherent set of trial information, such as a monitor's so-called "data query" in which they did a sanity check on something that had passed validation but still looked wrong. The only difference with these forms compared to the trial data forms (already working) would be that they would not be printed and scanned, so... we're done!

Talk about code re-use. Somewhere along the way I had accidentally created a 4GL in which one simply designed a screen form and went to work, the database work all done for you.

Now I started working on the interface that would knit all this together. GUIs are insanely easy with my Cells hack. Downright fun, in fact. So much fun, so easy to do, and then so foolproof that I missed them almost immediately. The persistent CLOS database lacked Cells technology. Of course. It was just persistent data as stored by the AllegroStore ODB.

But Cells at the time was implemented as a CLOS metaclass, and so was AllegroStore... no, you are kidding... multiple-inheritance at the metaclass level?! That and a little glue and... we're done! The GUI code now simply read the database and showed what it found to the user. When anything changed in the database, the display was updated automatically. (I always enjoy so much seeking out the Windows "Refresh view" menu item. Not!)

For example, the status of form Visit #1 for Patient XYZ might be "Printed". Or -- since one business rule said forms should be printed and scanned in short order so the system could tell when forms had been lost -- it might say "Overdue for Scanning". So now the user looks around, finds the form, puts it in the sheet feeder of the scanner and hits the scan button. In a moment the user sees the status change to "Ready for Review".

The beauty of having wired up the database with Cells is that the scanning/recognizing logic does not need to know that the user is staring at an "Overdue" indicator on the screen that no longer applies. Tech support will not be fielding calls from confused users saying "I scanned it six times and it still says overdue!" The scanning logic simply does its job and writes to the database. The Cells wiring notices that certain GUI elements had been looking for this specific information and automatically triggers a refresh. And as always with Cells, the GUI programmer did not write any special code to make this happen, they simply read that specific bit of the database. Cells machinery transparently recorded the dependency just as it automatically propagates any change.

In the end we had taught Cells and the ODB a few kinds of tricks. It was possible for a dynamic slot of a (dynamic) GUI instance to depend on the persistent slot of a DB instance, or on the population of a persistent class ("oh, look, a new data query just came in"). Persistent instances can have dynamic slots in AllegroStore and these could depend on persistent slots, and perhaps the scariest bit: persistent slots could depend on other persistent data. The database was alive!

One change could ripple out to cause other change in the database. For example, "Overdue Form" was a persistent attribute calculated from the fact that a form had been printed but that it had not been scanned. If that status held for a day the database automatically grew a new persistent instance, an "Alert" instance visible to trial monitors who could intervene to see why the clinical site was not taking care of business. When the form got scanned, Cells logic caused the status to move up to "Scanned/ready for review" and the "Alert" instance got deleted. All in classic Cells declarative fashion: "an Alert exists if a document is overdue for one day" takes care of both creating and destroying the instance.
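This is emphatically not the Cells API, just a toy sketch of the declarative flavor: rules are stated once and rerun on every write, so the same rule both raises and retires the alert. All names (fact, derived, defrule, days-waiting) are invented here, and unlike real Cells this toy recomputes every rule on each write instead of tracking who depends on what.

```lisp
(defvar *facts*   (make-hash-table)) ; input slots, written by "the application"
(defvar *derived* (make-hash-table)) ; values computed by rules
(defvar *rules*   '())               ; (name . function), newest first

(defun fact (name) (gethash name *facts*))
(defun derived (name) (gethash name *derived*))

(defun recompute ()
  "Rerun every rule, oldest first, so later rules can read earlier ones."
  (dolist (rule (reverse *rules*))
    (setf (gethash (car rule) *derived*) (funcall (cdr rule)))))

(defun (setf fact) (value name)
  "Writing any input slot triggers recomputation of all rules."
  (setf (gethash name *facts*) value)
  (recompute)
  value)

(defmacro defrule (name &body body)
  `(progn (push (cons ',name (lambda () ,@body)) *rules*)
          (recompute)
          ',name))

;; "Overdue" means printed but not yet scanned.
(defrule overdue-p
  (and (fact 'printed) (not (fact 'scanned))))

;; "An Alert exists if a document is overdue for one day."
(defrule alert
  (and (derived 'overdue-p)
       (>= (or (fact 'days-waiting) 0) 1)))

(setf (fact 'printed) t)      ; overdue, but no alert yet
(setf (fact 'days-waiting) 1) ; alert appears
(setf (fact 'scanned) t)      ; overdue clears, and the alert goes with it
```

Note that the code doing the scanning never mentions alerts; it only writes the scanned slot.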

I'll never forget a religious moment I had scanning forms so I could work on the interface's mechanism for correcting scanner or recognition errors. When I called up page two of some form all the fields were blank. I had never seen the scanner/recognition software completely miss a page, but a peek at the log file showed the page had indeed gone unrecognized. (We used 2-D barcodes to identify pages.) Stunned at this first ever failure (out of hundreds of trials) I just grabbed the second page, put it back in the sheet feeder and hit the scan button.

The nice thing about the barcodes is that I could just do that, I did not have to tell the software "OK, now I am scanning page two of form 1 for patient XYZ." The barcode data was in fact nothing but the GUID we assigned to the page when creating it in the ODB, so the printed paper had object identity. :) The other neat thing here is that when instantiating a page we linked it to the version of the form template from which it was derived. This took care of the version control problem created by changing forms -- a scanned page was able to look in the database to find the template from which it was created and scan and open itself. Users never had to say what they were dropping into the sheet feeder or worry about the order. Back to my missing page...
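
A small Python sketch of the barcode-as-GUID idea, with a dictionary standing in for the object database. Template, Page, PageStore and their fields are hypothetical names, not the real schema; the only things taken from the story are that the barcode is the page's GUID and that each page remembers the template version it was printed from.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    form_name: str
    version: int


@dataclass
class Page:
    guid: str
    template: Template
    page_no: int
    patient: str


class PageStore:
    """Stand-in for the ODB: pages are found by GUID and nothing else."""

    def __init__(self):
        self._pages = {}

    def create_page(self, template, page_no, patient):
        # The GUID minted here is what gets printed as the 2-D barcode.
        page = Page(str(uuid.uuid4()), template, page_no, patient)
        self._pages[page.guid] = page
        return page

    def on_scan(self, barcode):
        # The barcode alone recovers page, patient and template version,
        # so the user never announces what is in the sheet feeder.
        return self._pages[barcode]


store = PageStore()
v1 = Template("intake", version=1)
page = store.create_page(v1, page_no=2, patient="XYZ")
v2 = Template("intake", version=2)       # the form changes after printing
scanned = store.on_scan(page.guid)       # still resolves to version 1
```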

The scanner started whirring, the page went through, and then log diagnostics from the recognition process started zooming by (OK, this time we recognized the page, the last time was truly a fluke) and then even before it happened I realized (Omigod!) what was about to happen.

I turned my eyes to the blank page still up on my screen and waited...Boom! The data appeared. Having wired the database to the screen, what happened was this: the recognition logic simply read the forms and wrote out the results as usual, updating each persistent form field with a value. The screen field widget had gotten its display value by reading that DB field. Cells told the screen field widget to "calculate" again its display value. It was different, so the Cell observer for the screen field generated an update event for the field. The application redrew the field in response to the update event.
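
A toy Python sketch of the machinery just described, assuming nothing about the real Cells internals: the Cell class and its get/set/observer names are hypothetical. Reading a cell inside a formula records the dependency, writing an input cell recomputes its dependents, and an observer fires only when the computed value actually changes.

```python
class Cell:
    """A value that is either set directly or computed from other cells."""

    _current = None  # the formula cell being computed right now, if any

    def __init__(self, value=None, formula=None, observer=None):
        self.formula = formula
        self.observer = observer
        self.dependents = set()
        self._value = value
        if formula is not None:
            self._value = self._compute()

    def _compute(self):
        # While the formula runs, every cell it reads registers this cell
        # as a dependent. The formula author writes no wiring code.
        previous, Cell._current = Cell._current, self
        try:
            return self.formula()
        finally:
            Cell._current = previous

    def get(self):
        if Cell._current is not None:
            self.dependents.add(Cell._current)
        return self._value

    def set(self, value):
        if value == self._value:
            return
        self._value = value
        self._propagate()

    def _propagate(self):
        for dep in list(self.dependents):
            new = dep._compute()
            if new != dep._value:
                old, dep._value = dep._value, new
                if dep.observer:
                    dep.observer(old, new)
                dep._propagate()


events = []
db_field = Cell(value="")  # plays the persistent form field
widget_text = Cell(
    formula=lambda: db_field.get() or "<blank>",
    observer=lambda old, new: events.append(("redraw", new)),
)
db_field.set("120/80")  # the recognizer just writes; the redraw follows
```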

That was so cool to see happen.

We're done! Literally. :( In the end our tiny little operation was never able to persuade big pharma we could handle the grave responsibility of not screwing up trials, even though IBM itself loved our work and worked with us to pitch it to pharma. But that is another story.

OpenLaszlo-Cells Smackdown 2008

Yes, I pointed this out before on c.l.lisp but there has to be some upside to senility and freely repeating myself is it: an OpenLaszlo introductory video.

Pretty far into the video, after some horrifyingly dirty language like "JDOM" which was all the more horrifying for the speaker clearly believing he was talking plainly (driving home my dinosaurosity and at the same time making me feel a whole lot better about it) we learn that OpenLaszlo includes as one bell amidst all the whistles...wait for it... dataflow! ta-dum!!! Pretty damn transparently, to boot.

Meanwhile over in the comp.lang.lisp archives you'll find the author of the latest Common Lisp tome saying he cannot see the point of my equivalent Cells library, and news of my invitation to ECLM 2008, where I am to unravel this profound mystery live and in color as if that will make any difference.

Meanwhile back in the above video we discover the speaker presuming the advantages are obvious to his audience and offering no further explanation beyond pointing it out.

Tilton's Law: Solve the Right Problem. Maybe I do not need to explain dataflow better, maybe I need a new audience.


ps. In the yobbos's defense, I just checked the Laszlo white paper on their product and they do not even mention dataflow explicitly. Just a couple of mentions of layout being "declarative". :)

Wednesday, March 5, 2008

My Nastiest Macro Ever?!

[WARNING: Do not try at home. The following code relies on extensions not included here. Plz visit the DSB thread for something you can play with]

It all started because a Lisp/Arc noob "got it" when they saw how I used Lisp destructuring to make some code a little more readable. That in turn started with a challenge to improve some code to write an SQL query or something.

I suggested rearranging the Lisp query spec to be friendlier to the consumer, then used Arc destructuring as the first improvement (and then greatly missed CL format):
(let (tbl key . fields) *sql-template*
  (prn (string "insert into " tbl
               " ( " (let c nil
                       (each (f . rest) fields ;; more destructuring
                         (when c (= c (string c ", ")))
                         (= c (string c f)))
                       c) ;; hand back the accumulated column list
               ", " (car key)
               " ) values ( "
               (let c nil
                 (each (f . rest) fields
                   (when c (= c (string c ", ")))
                   (= c (string c ":" f)))
                 c) ;; hand back the accumulated bind variables
               ", " (last key) ".nextVal"
               " ) returning " (car key) " into " (string ":" (car key)))))

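For readers who do not speak Arc, here is roughly the same destructuring in Python. The template shape (table name, then a key spec whose first item is the key column and whose last item is the sequence, then one spec per field) is inferred from how the code above uses it, and the employees data is made up.

```python
def insert_statement(template):
    # tbl, key, *fields mirrors Arc's (tbl key . fields)
    tbl, key, *fields = template
    key_col, key_seq = key[0], key[-1]
    # f, *_rest mirrors the (f . rest) destructuring inside each
    names = [f for f, *_rest in fields]
    columns = ", ".join(names + [key_col])
    values = ", ".join([":" + n for n in names] + [key_seq + ".nextVal"])
    return (
        f"insert into {tbl} ( {columns} ) values ( {values} )"
        f" returning {key_col} into :{key_col}"
    )


# Hypothetical template, shaped the way the Arc code expects.
sql_template = [
    "employees",
    ["id", "emp_seq"],
    ["name", "varchar"],
    ["dept", "varchar"],
]
statement = insert_statement(sql_template)
```
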
When the "got it" came in, I decided to dust off my lite version of CL's destructuring-bind (actually done to implement a CL-style defun for Arc since its DEF has only optional args) and really give the Arcers an eyeful.

The functionality we are after, by example:
(let data (list 1 2 7 nil 'p 5)
  (dsb (x y &o (a (+ y 1)) (z 4) &k (p 98) (q (+ a 1))) data
    ;; we want to see identical pairs, cuz next I
    ;; print first a variable's runtime binding
    ;; and then its expected value
    (prs "args" x 1 y 2 a 7 z nil p 5 q 8)))

The above should produce "args 1 1 2 2 7 7 nil nil 5 5 8 8". Yes, I apologized to the Arc forum for my lame test "utility". Anyway, here is what I came up with, and even though I have written well over five hundred macros this one just might be the most mind-bending I have done.

First, the final form (probably still buggy!):
(mac dsb (params data . body)
  (w/uniq (tree kvs)
    `(withs (,tree ,data
             ,@(with (reqs nil key? nil opt? nil keys nil opts nil)
                 (each p params
                   (if (is p '&o) (do (assert (no opt?) "Duplicate &o:" params)
                                      (assert (no key?) "&k cannot precede &o:" params)
                                      (= opt? t))
                       (is p '&k) (do (assert (no key?) "Duplicate &k:" params)
                                      (= key? t))
                       key? (push-end p keys)
                       opt? (push-end p opts)
                       (do (assert (~acons p) "Reqd parameters need not be defaulted:" p)
                           (push-end p reqs))))
                 (with (n -1)
                   (+ (mappend [list _ `(nth ,(++ n) ,tree)] reqs)
                      (mappend [list (carif _) `(if (< ,(++ n) (len ,tree))
                                                    (nth ,n ,tree)
                                                    ,(cadrif _))] opts)
                      `(,kvs (pair (nthcdr ,(++ n) ,tree)))
                      (mappend [list (carif _)
                                     `(aif (assoc ',(carif _) ,kvs)
                                           (cadr it)
                                           ,(cadrif _))] keys)))))
       ,@body)))

That hurt. But why is it possibly the wildest one I have ever written? It certainly is not the longest. Let me put it this way: macros are always tricky but most of them only go up to ten; this one goes to eleven.

First, let's look at a simpler example:
(dsb (x &k (y (++ x))) (list 1 'y 2)
  (list x y))

...and the desired expansion:
(withs (u1 (list 1 'y 2)
        x (nth 0 u1)
        u2 (pair (nthcdr 1 u1))
        y (aif (assoc 'y u2) (cadr it) (++ x)))
  (list x y))

Come to think of it, next time I have to write a macro this tricky I just might write out the expansion I have in mind rather than try to hold it in my imagination as I code. Lesson learned, although macros normally arise after we have indeed written several long versions and then decide to write a macro, so we do have expansions to stare at. Anyway...

Macro writing does present a challenge in all but the simplest token-replacing cases: while writing the macro we are working in two different times at once, macro-expansion time and run-time. I am writing macro code M (to run at macro-expansion time) whose input is source code S and whose output is code R that will Actually Run at run-time. As the macro author, I start a macro function in mindset M, then use a backquote to start specifying code in mindset R, then use the unquoting comma to jump back to mindset M. It is as if my time machine went haywire and I am experiencing now and the future on alternate blinks of the eye.
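
Python has no macros, so take this only as an analogy: a function that writes source text stands in for the M code, and exec of that text stands in for R. Lisp macros work on symbolic lists rather than strings, and expand_unless is a made-up name. The log shows which time each piece runs in.

```python
log = []


def expand_unless(condition_src, body_src):
    # "M" time: this runs while the code is being generated.
    log.append("expanding")
    # The f-string plays the role of backquote; each {...} hole plays the
    # role of the unquoting comma, dropping back into M to fetch a value.
    return f"if not ({condition_src}):\n    {body_src}\n"


generated = expand_unless("x > 3", "log.append('ran body')")  # M time
log.append("expanded")

namespace = {"x": 1, "log": log}
exec(generated, namespace)  # R time: only now does the body run
```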

When this becomes second nature, btw, you can add Lisp to your resume.

So why is this example worse than usual? How does it go to eleven? Look at the binding of U2 (which for the cognoscenti is my fakery of the unique symbol that uniq (CL's gensym) would produce). Omigod! The runtime code R is finishing the job of parsing the list being destructured by taking arguments beyond the last optional position and building them into an a-list to be read by the value forms of each keyword, themselves doing more parsing to decide if a runtime value was supplied or if they should use the form (if any!!!) specified as the default in the parameter list!!!
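
To see just the run-time half of that, here is a Python sketch of the parsing the expanded code performs. It does everything at run time, whereas the macro walks the parameter list at expansion time and leaves only these lookups for R. The destructure function and its argument format are hypothetical; defaults are passed as functions of the bindings made so far so that a default such as (q (+ a 1)) can see a.

```python
def destructure(data, required, optional, keyword):
    """required: names; optional and keyword: (name, default_fn) pairs."""
    env = {}
    n = 0
    for name in required:
        env[name] = data[n]
        n += 1
    for name, default in optional:
        # A supplied value wins, even if it is None (nil in the Arc example).
        env[name] = data[n] if n < len(data) else default(env)
        n += 1
    # Everything past the last optional position is key/value pairs,
    # collected here the way the expansion builds its a-list.
    rest = data[n:]
    kvs = dict(zip(rest[0::2], rest[1::2]))
    for name, default in keyword:
        env[name] = kvs[name] if name in kvs else default(env)
    return env


optional = [("a", lambda e: e["y"] + 1), ("z", lambda e: 4)]
keyword = [("p", lambda e: 98), ("q", lambda e: e["a"] + 1)]

# Same data as the dsb example: (list 1 2 7 nil 'p 5)
env = destructure([1, 2, 7, None, "p", 5], ["x", "y"], optional, keyword)
# Only the required values supplied, so every default fires.
env_defaults = destructure([1, 2], ["x", "y"], optional, keyword)
```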

Aaaaaargghh!!! Maybe this one goes to twelve. I observed on the Arc forum while describing all this that it was a good thing I had written the macro before thinking about it. But there we were, so before closing the thread I decided to mention some simple macro-writing tips:

1. The nice thing about CL-style and now Arc-style macros is that they are just like any other code we write, so we can debug them the same way, e.g., by placing print statements in the M code, if you recall my naming convention. This will be a mind-bender at first, or it was for me anywho.

2. If when you test a macro it dies on the compile of a usage (not the run) your bug is in the M code, not the produced runtime R code. Let's go into some detail:

I can write a macro and have that not compile:
(mac surprise! (f1 f2)
  `(do1 ,f1 ,f2 ,f3))

... because f3 is undefined. Duh. Easy enough to nail those. Once I fix the macro:
(mac surprise! (f1 f2)
  `(do1 ,f1 (/ ,f2 0)))

...I can try it out and come to grief on a reasonable looking usage because the macro generates buggy code:
(surprise! 10 11)

...which should produce a division by zero error by expanding to:
(do1 10 (/ 11 0))

But there is another way to mess up! Isn't this fun! You can have macro code M that compiles OK but has a bug of its own that will be encountered while the M code is running (to produce the R code for the compiler):
(mac surprise! (f1 f2)
  (let x (len f2) ;; golly I hope f2 is a proper list
    `(do1 ,f1 (/ ,f2 0))))

That version of the macro compiles, but now:
(def three-two-one ()
  (surprise! 10 11))

...should not compile, Arc should squawk during compilation of three-two-one about 11 not being a proper list.
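
The same two failure times can be seen in a Python analogy, with the usual caveat that Python has no macros: the expander functions below write source text and eval runs it. The names expand_ok, expand_buggy and classify are invented for this sketch.

```python
def expand_ok(f1, f2):
    # The M code is fine; the R code it writes divides by zero.
    return f"({f1!r}, {f2!r} / 0)"


def expand_buggy(f1, f2):
    # The M code itself has a bug: len() fails unless f2 is a sequence.
    len(f2)
    return f"({f1!r}, {f2!r} / 0)"


def classify(expander, *forms):
    """Report at which time a usage fails, and with what error."""
    try:
        source = expander(*forms)  # macro-expansion time
    except Exception as err:
        return ("expansion-time", type(err).__name__)
    try:
        eval(source)  # run time
    except Exception as err:
        return ("run-time", type(err).__name__)
    return ("ok", None)


ok_result = classify(expand_ok, 10, 11)        # fails only when run
buggy_result = classify(expand_buggy, 10, 11)  # never gets as far as running
```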

In summary, when and how a macro usage goes South tells you in which of these many ways possible you have screwed up. Until you develop an instinct for that, this will help:

2. Once your macro itself compiles (not a use thereof), use macex to preview and debug the R code your macro will pass on to the compiler:
(prn (macex '(surprise! 10 11)))

The above may fail if, as we discussed, you have a runtime bug in your macro expansion code M. If you are developing a macro and testing it in your application and get confused as to when the wheels are coming off, take a step back and cut-and-paste the failing usage into the above, not forgetting the quote.

This by the way is a helpful reminder that macros eat symbolic source code, not text strings, and not runtime values, by which latter I mean this next bit should work even though the variables hi and mom do not exist -- at this point again it is all just symbols:
(prn (macex '(surprise! hi mom)))

Use macex also when you simply do not get the behavior you want, when a macro usage compiles and runs but gives the wrong result. It might be that a macro expansion is writing bad code, and it is easier to see in the expanded form of an actual usage than it is staring at the slicing-dicing of a macro function. Come to think of it, we can see that with this example:
(surprise! 1 2)
-> 1

Puzzled as to why we get 1 instead of 2, we use macex:
(macex '(surprise! 1 2))
->(do1 1 2)

Aha! That is supposed to be do! The do1 is left over from an earlier version where it made sense. Something like that.

3. Divide and Conquer I: first I did DSB just with required params, then with optional params, then with keyword params, then with default values, then with computed default values, yadda yadda yadda, nothing new in this tip but there ya go.

4. Divide and Conquer II: This gets back to tip #1: macros are like any other code, meaning we can break complex chunks out into standalone functions to be tested separately. See that little state machine at the beginning of DSB that parses the parameter list? That is a nice little standalone chunk with a big job to do, let's give it its own home:
(def dsb-params-parse (params)
  (with (reqs nil key? nil opt? nil keys nil opts nil)
    (each p params
      (if (is p '&o) (do (assert (no opt?) "Duplicate &o:" params)
                         (assert (no key?) "&k cannot precede &o:" params)
                         (= opt? t))
          (is p '&k) (do (assert (no key?) "Duplicate &k:" params)
                         (= key? t))
          key? (push-end p keys)
          opt? (push-end p opts)
          (do (assert (~acons p) "Reqd parameters need not be defaulted:" p)
              (push-end p reqs))))
    (list reqs opts keys)))

And test:
(prn (dsb-params-parse '(x y &o (a (+ y 1)) z &k (p 98) (q (+ a 1)))))
-> ((x y) ((a (+ y 1)) z) ((p 98) (q (+ a 1))))
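
The same little state machine transliterated to Python, to show how easily a standalone parser like this can be exercised on its own. The representation is an assumption made for this sketch: symbols are strings, and a defaulted parameter is a (name, default-source) tuple.

```python
def parse_params(params):
    reqs, opts, keys = [], [], []
    seen_opt = seen_key = False
    for p in params:
        if p == "&o":
            if seen_opt:
                raise ValueError(f"Duplicate &o: {params}")
            if seen_key:
                raise ValueError(f"&k cannot precede &o: {params}")
            seen_opt = True
        elif p == "&k":
            if seen_key:
                raise ValueError(f"Duplicate &k: {params}")
            seen_key = True
        elif seen_key:
            keys.append(p)
        elif seen_opt:
            opts.append(p)
        else:
            if not isinstance(p, str):
                raise ValueError(f"Required parameters take no default: {p}")
            reqs.append(p)
    return reqs, opts, keys


parsed = parse_params(
    ["x", "y", "&o", ("a", "y + 1"), "z", "&k", ("p", "98"), ("q", "a + 1")]
)
```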

Awesome. Now look at DSB:
(mac dsb (params data . body)
  (w/uniq (tree kvs)
    `(withs (,tree ,data
             ,@(let (reqs opts keys) (dsb-params-parse params)
                 (with (n -1)
                   (+ (mappend [list _ `(nth ,(++ n) ,tree)] reqs)
                      (mappend [list (carif _) `(if (< ,(++ n) (len ,tree))
                                                    (nth ,n ,tree)
                                                    ,(cadrif _))] opts)
                      `(,kvs (pair (nthcdr ,(++ n) ,tree)))
                      (mappend [list (carif _)
                                     `(aif (assoc ',(carif _) ,kvs)
                                           (cadr it)
                                           ,(cadrif _))] keys)))))
       ,@body)))

Sweet! Now we can give Arc a CL-style defun if like me we are fond of them:
(mac defun (name params . body)
  (w/uniq (args)
    `(def ,name ,args
       (dsb ,params ,args ,@body))))

Applied, we create the CL poster boy for mixed optional and keyword args:
(defun read-from-string (s &o eoferr? eofval &k start end keep-whitey?)

Too easy? :)

Somewhere out there some Pythonista believes the sky is falling, that I have forked Arc, and that no one will ever be able to read my code. (a) They are wrong, and (b) this is a wonderfully extreme example in which we really are (yes!) extending the language. If they do not like this, maybe they will like some of the simpler transformations mac allows.