Saturday, October 31, 2009

The King is Dead!? Long live... Scala? Clojure?!

Whoa, why wasn't I told Java is closing its doors? I guess I have been out of touch, word seems to be everywhere. I had to go here (blog of the guy who created Groovy) to find out. James is whooping it up over Scala as Java's successor. Yes, the guy who invented Groovy prefers Scala. Quite a bit:
I can honestly say if someone had shown me the Programming in Scala book by Martin Odersky, Lex Spoon & Bill Venners back in 2003 I'd probably have never created Groovy. -- James Strachan

Damn. So he pretty much invented Groovy by mistake? Did not know about the two-year-old Scala? No one mentioned it to him? Groovy got admitted to the standard in the meantime?

Well, Jonathan Edwards reinvented Python's Trellis (ergo Cells) without knowing it, and I did not know about Garnet's KR or constraints -- but Groovy got adopted as official Java! You would think Scala might have come up over coffee. Anyway...

Steele said Java brought the world half-way to Lisp. I do not think Lisp means what he thinks it means. Proof might be how hard it is for folks to climb out of the pit of javathink. If Java had been a stepping stone to Lisp it would have made the next step easier, not harder. But Java still cannot do closures. Please. And a quick look at closures in Scala has me thinking, omigod, they call that closures?
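For calibration, a minimal example (mine, not from any of the posts above) of what a closure means in Common Lisp: the lambdas close over the variable COUNT itself, not a snapshot of it, so separate closures share and mutate one binding across calls.

```lisp
(defun make-counter ()
  "Return two closures sharing one COUNT binding."
  (let ((count 0))
    (values (lambda () (incf count))   ; bump: mutates the shared binding
            (lambda () count))))       ; peek: reads the same binding

(multiple-value-bind (bump peek) (make-counter)
  (funcall bump)
  (funcall bump)
  (funcall peek))  ; => 2
```

That full, shared, mutable capture is the yardstick being applied here.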

Clojure starts to look like a Good Move. I see it mentioned in writings on the death throes of Java and that is a big marketing win. The superwhacky thing here is that both Scala and Clojure are syntactically discontinuous from Java. Folks always thought a successor had to have syntax similar to the language it succeeded, though Dylan should have served as cautionary counter-evidence.

No, it is not the syntax. The necessary bridging element seems to be....wait for it...the Java runtime! How did that tail end up wagging the language adoption dog? But Clojure gets the nod along with Scala just for sitting atop the JRE! You people scare me.

Well, if Steele were right Clojure would prevail over Scala. Right now googlefight has Scala winning five to one. Maybe Rich Hickey can move 17% of the world 90% of the way to Lisp?

Monday, August 10, 2009

To write, or not to write?

>> By the way, to change the subject a little, who was it who said,
>> "there are no dead languages, only dead minds"?
> Dunno, but to change the subject even more, Socrates objected to writing
> since it deprives an idea of a mind in which it can "live". So yeah.

Interesting. "Free writing" is a form that lives within a mind but also creates a permanent record and slows the mind down enough to achieve more coherence, so the mind can work out hard problems. Comedy must be written, because every word matters in triggering the laugh response, but then the words must be delivered as if they were coming live from a mind. Exceptions are improv and semi-improv such as Eddie Izzard, of which Mr. Socrates would approve because they arise within a living mind.

I get a lot of complaints about the writing in this blog because I deliberately write as chaotically as I think. Other times I found I gave a much better talk if I read from something written beforehand precisely because otherwise the living mind is too chaotic to get the talk done in anywhere near the time available.

I was just getting ready to videotape an improvised bit, to capture the good bits and then pull them into a fixed, written bit, because I am finding good stuff comes out only if the mind is not slowed down the way free writing slows it.

The question is whether Eddie Izzard is lazy, or if Socrates is right on this. Does Izzard do better by capturing his improv and distilling it down to a precise fixed bit, or does he do worse? Or does he just lack the ability to deliver the prepared as if it were unprepared?

We're getting pretty close to talking about programming in Lisp vs NotLisp now. Lisp programming, unconstrained by static typing and blessed with a rich library (once that library is mastered such that it is all at the programmer's fingertips), allows the code to flow freely yet mostly correctly from a live mind even as that mind is forming the solution the code embodies. Diagram that.

Friday, June 26, 2009

I Feel A Naggum (RIP) Coming On: Quads

I sometimes begin c.l.l rants with "I feel a naggum coming on...". What is a naggum? Normally:
naggum (n): A rant along one of Erik Naggum(1965-2009)'s themes.
That might be self-referentially hopeless, which is fine because that is not what I am talking about; I just thought it would make a clever title. In this case a "naggum" is a nugget of Erikian technology: first, his specification of what he called quads (see below), then my poor implementation (even further below; good luck even figuring out how to test it).



From: Erik Naggum
Subject: Re: XML->sexpr ideas
Newsgroups: comp.lang.lisp
Date: 2004-01-19 04:24:43 PST

* Kenny Tilton
| Of course it is easy enough for me to come up with a sexpr format off
| the top of my head, but I seem to recall someone (Erik? Tim? Other?)
| saying they had done some work on a formal approach to an alternative
| to XML/HTML/whatever.
| True that? If so, I am all ears.

Really? You are? Maybe I didn't survive 2003 and this is some Hell
where people have to do eternal penance, and now I get to do SGML all
over again.

Much processing of SGML-like data appears to be stream-like and will
therefore appear to be equivalent to an in-order traversal of a tree,
which can therefore be represented with cons cells while the traverser
maintains its own backward links elsewhere, but this is misleading.

The amount of work and memory required to maintain the proper backward
links and to make the right decisions is found in real applications to
balloon and to cause random hacks; the query languages reflect this
complexity. Ease of access to the parent element is crucial to the
decision-making process, so if one wants to use a simple list to keep
track of this, the most natural thing is to create a list of the
element type, the parent, and the contents, such that each element has
the form (type parent . contents), but this has the annoying property
that moving from a particular element to the next can only be done by
remembering the position of the current element in a list, just as one
cannot move to the next element in a list unless you keep the cons
cell around. However, the whole point of this exercise is to be able
to keep only one pointer around. So the contents of an element must
have the form (type parent contents . tail) if it has element contents
or simply a list of objects, or just the object if simple enough.

Example: <foo>123</foo> would thus be represented by (foo nil "123"),
<foo>123</foo><bar>456</bar> by (foo nil "123" bar nil "456"), and
<zot><foo>123</foo><bar>456</bar></zot> by #1=(zot nil (foo #1# "123"
                                                        bar #1# "456")).

Navigation inside this kind of structure is easy: When the contents in
CADDR is exhausted, the CDDDR is the next element, or if NIL, we have
exhausted the contents of the parent and move up to the CADR and look
for its next element, etc. All the important edges of the containers
that make up the *ML document are easily detectible and the operations
that are usually found at the edges are normally tied to the element
type (or as modified by its parents), are easily computable. However,
using a list for this is cumbersome, so I cooked up the «quad». The
«quad» is devoid of any intrinsic meaning because it is intended to be
a general data structure, so I looked for the best meaningless names
for the slots/accessors, and decided on QAR, QBR, QCR, and QDR. The
quad points to the element type (like the operator in a sexpr) in the
QAR, the parent (or back) quad in the QBR, the contents of the element
in the QCR, and the usual pointer to the next quad in the QDR.

Since the intent with this model is to «load» SGML/XML/SALT documents
into memory, one important issue is how to represent long stretches of
character content or binary content. The quad can easily be used to
represent a (sequence of) entity fragments, with the source in QAR,
the start position in QBR, and the end position in QCR, thereby using
a minimum of memory for the contents. Since very large documents are
intended to be loaded into memory, this property is central to the
ability to search only selected elements for their contents -- most
searching processors today parse the entire entity structure and do
very little to maintain the parsed element structure.

Speaking of memory, one simple and efficient way to implement the quad
on systems that lack the ability to add native types without overhead,
is to use a two-dimensional array with a second dimension of 4 and let
quad pointers be integers, which is friendly to garbage collection and
is unambiguous when the quad is used in the way explained above.

Maybe I'll talk about SALT some other day.

Erik Naggum | Oslo, Norway

Act from reason, and failure makes you rethink and study harder.
Act from faith, and failure makes you blame someone and push harder.
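Naggum's array representation from the post above is easy to sketch. Everything here (*quads*, QALLOC, and the AQAR/AQBR/AQCR/AQDR readers, renamed so they do not collide with the list-based accessors further down) is my naming for illustration; only the layout comes from the post: a second dimension of 4, with quad "pointers" being plain integers.

```lisp
;; Quads as rows of a 2-d array; a quad pointer is just a row index.
(defparameter *quads* (make-array '(1024 4) :initial-element nil))
(defparameter *quad-fill* 0)

(defun qalloc (type parent contents next)
  "Allocate a quad row and return its integer pointer."
  (let ((q *quad-fill*))
    (incf *quad-fill*)
    (setf (aref *quads* q 0) type
          (aref *quads* q 1) parent
          (aref *quads* q 2) contents
          (aref *quads* q 3) next)
    q))

(defun aqar (q) (aref *quads* q 0))   ; element type
(defun aqbr (q) (aref *quads* q 1))   ; parent (back) quad
(defun aqcr (q) (aref *quads* q 2))   ; contents
(defun aqdr (q) (aref *quads* q 3))   ; next quad

;; <zot><foo>123</foo><bar>456</bar></zot>, wired up by hand:
(let* ((zot (qalloc 'zot nil nil nil))
       (foo (qalloc 'foo zot "123" nil))
       (bar (qalloc 'bar zot "456" nil)))
  (setf (aref *quads* foo 3) bar    ; foo's next quad is bar
        (aref *quads* zot 2) foo)   ; zot's contents start at foo
  (list (aqar (aqcr zot)) (aqar (aqdr (aqcr zot)))))  ; => (FOO BAR)
```

Keeping the whole structure inside one array with integer pointers is what makes the scheme friendly to the garbage collector, as Naggum notes.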


(in-package :ukt)

;;;(defstruct (juad jar jbr jcr jdr)

(defun qar (q) (car q))
(defun (setf qar) (v q) (setf (car q) v))

(defun qbr (q) (cadr q))
(defun (setf qbr) (v q) (setf (cadr q) v))

(defun qcr (q) (caddr q))
(defun (setf qcr) (v q) (setf (caddr q) v))

(defun qdr (q) (cdddr q))
(defun (setf qdr) (v q) (setf (cdddr q) v))

(defun sub-quads (q)
(loop for childq on (qcr q) by #'qdr
collecting childq))

(defun sub-quads-do (q fn)
(loop for childq on (qcr q) by #'qdr
do (funcall fn childq)))

(defun quad-traverse (q fn &optional (depth 0))
(funcall fn q depth)
(sub-quads-do q
(lambda (subq)
(quad-traverse subq fn (1+ depth)))))

(defun quad (operator parent contents next)
  ;; list*, not list: NEXT goes into the dotted tail so (qdr q) => next
  (list* operator parent contents next))

(defun qups (q)
(loop for up = (qbr q) then (qbr up)
unless up do (loop-finish)
collecting up))

(defun quad-tree (q)
(list* (qar q)
(loop for childq on (qcr q) by #'qdr
while childq
collecting (quad-tree childq))))

(defun tree-quad (tree &optional parent)
  (let* ((q (quad (car tree) parent nil nil))
         (kids (loop for k in (cdr tree)
                     collecting (tree-quad k q))))
    (loop for (k n) on kids
          do (setf (qdr k) n))
    (setf (qcr q) (car kids))
    q))


(defun test-qt ()
  (print (quad-tree '#1=(zot nil (foo #1# ("123" "abc")
                          . #2=(bar #1# (ding #2# "456"
                                         dong #2# "789")))))))


(defun test-tq ()
(let ((*print-circle* t)
(tree '(zot (foo ("123")) (bar (ding) (dong)))))
(assert (equal tree (quad-tree (tree-quad tree))))))

(defun testq ()
  (let ((*print-circle* t))
    (let ((q '#1=(zot nil (foo #1# ("123" "abc")
                   . #2=(bar #1# (ding #2# "456"
                                  dong #2# "789"))))))
      (print '(traverse showing each type and data preceded by its depth))
      (quad-traverse q (lambda (q depth)
                         (print (list depth (qar q) (qcr q)))))
      (print `(listify same ,(quad-tree q))))
    (let ((q '#3=(zot nil (ding #3# "456"
                           dong #3# "789"))))
      (print '(traverse showing each "car" and its parentage preceded by its depth))
      (print '(of data (zot (ding (dong)))))
      (quad-traverse q (lambda (q depth)
                         (print (list depth (qar q)
                                      (mapcar 'qar (qups q)))))))))


Monday, June 22, 2009

America & Iran: Separated at Birth?

Am I the only one grooving specifically on the fact that Iranians are telling their authority figures to go f*ck themselves? Here is the country we thought we hated but it turns out they are as kick-ass as us when it comes to political freedom, and we utterly respect them for their strength. Omigod, Americans and Iranians are going to get along great!

Monday, May 11, 2009

How to teach math

> On Sat, 09 May 2009 15:30:52 -0400, Kenneth Tilton wrote:
Well, I was not really trolling, I was forking the thread to make fun of the New Math that tried to get the numeral/number distinction across to five-year-olds.

Someone responded:

You find it better to start with medieval concepts working gradually on to the mathematics of XIX century, while explaining each next year what was wrong with the things they learnt a year ago?

I said all that? Ma's gonna be right proud. But... 

Funny you should ask. Yes, I suspect the path society took to get to what it knows now about math is the path an individual neuronal mass should follow, i.e., kids should encounter zero and Roman numerals and place value and algebraic variables in the same order society developed those ideas. The history of math is your math curriculum guide.

The New Math erred by selecting the logical organization of mathematical concepts as its curricular pole star. Next came Constructivism, which wanted kids to reinvent math. From scratch. Cool idea, but too slow.

Instead, let the history of mathematics dictate the order in which things are directly taught. Maybe go further and teach math as history with less emphasis on computation. Math often advanced when needed to solve real problems. Maybe we can shut up the little devils asking why they need to learn this stuff.

As for explaining all along the way what was wrong with the ideas taught the day before, hey, ever read a book on programming? They typically develop a chunk of code iteratively, presenting ever more improved variations on a primitive original. Come to think of it, ever develop some software? Same thing.

Here's the deal: most folks do not even know zero had to be invented. One understands zero better if one has done without it and then the teacher invents it for you. Something like that.

Wednesday, February 4, 2009

Cells: The Secret Transcript

The Boss asked me to give the group fifteen minutes on Cells because I have been talking about it for a while as a future better mousetrap for us and then suddenly last week threatened actually to apply it to qooxdoo and the front end.

I forwarded to everyone a link to a reasonably complete yet relatively brief write-up which tells you everything you need to know about Cells. I know that if I were in your shoes I would not have read it so I presume no one has. But I would like to determine how many folks I will be boring to tears if I review said document, so I will first cut to the chase and ask if anyone has any questions based on what they read.


OK. Cells is at once the simplest and hardest thing in the world for programmers to understand. Simple because the idea is just to have slot values of objects work like cells in a spreadsheet, and everyone knows how spreadsheets work. What is hard is understanding that one can program computers this way.

I only have fifteen minutes so: Yes, you can. Program inputs are assigned by good old imperative code to input cells the same way a user types values into a spreadsheet when they are doing what-if analysis or recording, say, actual monthly expenditure into a budget spreadsheet. Intermediate, derived, and aggregate cells compute new values based on those new inputs from predefined rules just as user changes to a spreadsheet propagate to other spreadsheet cells. Observers on cells let the emergent working model manifest its decisions with more good old fashioned imperative code, usually by simply updating the screen or playing a sound or controlling some external device over a serial port or updating a database.

Boom, we're done. Why is it so great? What part of the superiority of functional and declarative paradigms should I explain first? As I outlined in the material you did not read, a lot of things have to happen when a program receives an input. The programmer coding the event handler has to look at the event and decide all the things that have to happen in light of that event, and any things that follow from those first things. Not only must they reliably see to all those things, but they must do them in the right order. The analogy to a real spreadsheet is quite strong, if you imagine hand-implementing a spreadsheet with old-fashioned pencil and paper.

So the first thing Cells does is eliminate a lot of work and thus a lot of bugs. Because the work eliminated is tedious, Cells also makes programming a lot more fun. But there is more.

The declarative paradigm means I always know why a slot has a certain value, because all the logic appears in one place, in the rule assigned to that slot. Without Cells any number of lines of code may have assigned a value to a particular slot and a unified deriving rule certainly cannot be divined even if one were to track them all down.

There is more. Most people hate OO because it never quite panned out. Objects turned out not to be reusable. One of the nicest features of Cells is that two different instances can have different rules for the same slot. That makes objects reusable. Yayyyyyyy.
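To make that concrete, here is a toy sketch of the idea. This is NOT the real Cells API (no defmodel, no dependency tracking, and it recomputes on read rather than eagerly propagating); every name in it is invented for the sketch. A slot is a cell that is either an input or a rule, and an observer fires imperatively when an input changes.

```lisp
;; Toy cells: an input cell holds a value; a derived cell holds a rule.
(defstruct cell value rule observer)

(defun make-model (&rest plist)
  "Build a model as a hash-table of slot-name -> cell."
  (let ((m (make-hash-table)))
    (loop for (slot c) on plist by #'cddr
          do (setf (gethash slot m) c))
    m))

(defun model-get (m slot)
  (let ((c (gethash slot m)))
    (if (cell-rule c)
        (funcall (cell-rule c) m)   ; derived: run the rule now
        (cell-value c))))           ; input: stored value

(defun model-set (m slot value)
  (let ((c (gethash slot m)))
    (setf (cell-value c) value)
    (when (cell-observer c)         ; the imperative edge of the model
      (funcall (cell-observer c) value))))

;; Two instances, same slot (:area), different rules -- the reuse
;; claimed above.
(defparameter *square*
  (make-model :side (make-cell :value 4
                               :observer (lambda (v)
                                           (format t "~&side is now ~a~%" v)))
              :area (make-cell :rule (lambda (m)
                                       (let ((s (model-get m :side)))
                                         (* s s))))))

(defparameter *disk*
  (make-model :radius (make-cell :value 1)
              :area (make-cell :rule (lambda (m)
                                       (* pi (model-get m :radius)
                                             (model-get m :radius))))))

(model-get *square* :area)    ; => 16
(model-set *square* :side 5)  ; observer prints; the area rule now sees 5
(model-get *square* :area)    ; => 25
```

The real Cells library tracks dependencies and pushes changes eagerly, but even this toy shows the shape: inputs are assigned imperatively, derived slots follow from rules, observers manifest the result.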

I did not use Cells in the Kleaner because that was more of a straight calculation running from start to finish. I like to describe the role for Cells as being any situation where one has an unpredictable stream of data and one is keeping a model with a sufficiently large amount of internal state consistent with that stream of inputs, two examples being a GUI and a RoboCup client. The Kleaner worked by compiling statistics from a fixed store and then translating dirty data into corrected data exactly once.

Where Cells would have been useful would have been in implementing Phil's ideas about an ongoing stream of data leading to rediscernment of things previously discerned. Cell rules would take new raw inputs and propagate them over to tables of probabilities which would then reach out to existing cleaned data and possibly redecide from the original raw state a new cleaned state.

So no, I do not use Cells for everything, but in this case it was only because the full functionality had not been addressed.

A fun note is that I have in the past applied Cells to a database, specifically the old AllegroStore persistent CLOS database. This works two ways. One is that a user can be looking at a screen and as the underlying data changes the screen changes. That may sound like old news but with Cells one does not have to write any code to make it happen. One just says "this view shows this user's overdue books" and when the date changes the overdue status on every book gets updated and a new book appears in the list on the screen if someone happens to be looking, simply by someone having written code to list overdue books on the screen as if it were an unchanging value.

The other thing that happens is hinted at above. Things like overdue books and amount of fines owed and paid can be calculated from scratch by reading a user's entire history of checkouts and returns, but sometimes it is useful to record such derived values in the database and update them incrementally as books are checked out and returned. We can have code in programs do it and hope they run at the right time, or we can have the code in the database (as datapoints mediated by Cells) and be sure they run and run immediately. We get timeliness of data, efficiency, and we still get consistency even as we introduce redundancy.

I left something out. That is bad because the thing I left out is the thing I am planning to do with Cells on my own anyway and then apply to the FE. Cells makes it dead easy to drive a separate framework from Lisp. In my note I mentioned tcl/Tk and Gtk. These are two killer C GUI frameworks with their own homebrewed little object models. We want to program in Lisp, and we want our models driven by Cells for all the reasons above. No problem. We build a model out of instances of CLOS classes mapping isomorphically onto Tk or GTk classes and use Cell observers to pipe information (thru an FFI or even literally a pipe) to the C library or runtime to drive there the creation and animation of C instances.

Works great, and one amazing programmer, Peter Hildebrandt, pulled off a trifecta in which he had Cells driving and driven by both GTk and a C physics engine, name forgotten.

For a while I kinda marvelled at how Cells could be so useful for such disparate activities, and do so in the same application, the two activities being building an application model and having some other programming framework dance to that model's tune.

I figured it out in time for ECLM 2008, not that they were able to understand me. I opened by telling them that Cells was the single most powerful library they could use, because Cells is about change and nothing is more fundamental than change.

Programming is hard because like someone doing a spreadsheet on paper we programmers end up with the burden of propagating change throughout our models. It is tedious work, it must be done reliably, and there is a lot of it as internal program state multiplies, exponentially a lot. This exponential growth in interdependence of program state is what led Brooks to declare that a silver bullet was not only unlikely to be found but that it would be impossible to find; he felt the complexity was ineluctable because as states multiply there is nothing that can be done to avoid the explosion of interdependence.

I wrote to Dr Brooks recently and asked him if he had ever looked at dataflow. He said he was familiar with the concept, but no. Oops.

Sunday, January 25, 2009


Spring cannot come soon enough for Bobi, who is losing it badly these days on comp.lang.lisp:

Slobodan Blazeski wrote:
Dear board members

I'm baseball player for a several time periods (days,
moths ,years,decades) I've noticed that interest in baseball is
dwindling, and baseball is becoming less and less relevant and will
soon become extinct with only baby boomers supporting it, and even
those are either going to die or switch to golf. In order to save our
favorite sport I propose we make drastic changes and adapt more modern
things like:
a. Playing on the beach sand wearing swimwear like in beach
volleyball, very modern sport. Check Thiobe for growth rate
b. Replacing bats  with  hockey sticks. Note that hockey is popular in
many world countries and we should think international
c. Including  24-Second Shot Clock like in NBA that will make our
sport more lively and fast paced
d. Square playing fields should be replaced with the more common
rectangular one like found in many popular sports : soccer, football,
tennis etc

Including this will make baseball prosper.
very truly yours
Concerned Semi-Ex Baseball Player
Avenue of delusional weirdos Number 23

Bobi may not be as crazy as he thinks he is. Baseball suffered extreme popularity anxiety in the late Sixties and did indeed tinker with the game. Thinking more offense would attract more fans, the pitching mound was lowered so pitchers did not get extra energy into the ball from falling into a pitch. The American League adopted the designated hitter to eliminate the 11% nil pitcher from batting lineups (eliminating as well an awful lot of interesting strategy). They avoided the salary caps of the NBA and instituted free agency (well, no, they lost a lawsuit), which allowed bigger markets like NYC, Boston, and LA to buy better teams, and bigger markets are always good for ratings. Minnesota fans will follow the Dodgers; Los Angeles fans will not follow the Twins.

The changes went beyond the playing field. Ballparks added mascots and a disgusting cacophony of party music between innings so loud you can barely talk, and limited alcohol sales late in games to make the experience more family-friendly cuz you know how the losing fans get in their third hour of drinking.

Now baseball is hugely popular again so tinkering with grand institutions can work. Right?

Wrong. In the end, baseball is just a great game: multi-dimensional and deep. Quality tells, and which quality one emphasizes matters. Hockey and basketball have non-stop action and are fading in popularity, while baseball and football, like great music, have a variety, a rhythm, a balancing of quiet against intense. Baseball has the pitch, football has the snap. All scales from small to large, from inning or drive to game or season, always and invariably end up condensed into one point of explosive tension when the pitcher releases or the center snaps the ball.

Intense without quiet merely exhausts. A boxing match with two brawlers spurning defense landing bombs back and forth brings the crowd to its feet but those who love the sport do so for its nickname, The Sweet Science. They still talk about one genius of defense who won a round without throwing a punch. Between evenly matched fighters one solid punch (forget the knockout, the cartoon haymakers of Rocky n) brings the crowd screaming to its feet, the culmination of rounds of careful, tentative, mutual exploration. A single knockdown becomes a cause for pandemonium and one punch knockouts almost do not happen between the best and when they do they are talked about for a long time. I digress.

Tinkering. Basketball has all the action in the world and now faces its own popularity crisis. Racism is one factor, another is probably the salary cap that has San Antonio in the championship series instead of New York. Another problem: poor defense, and a twenty-point lead does not mean anything. 

But worst of all is the lack of dimensionality. There just is not that much to these games to argue about over the water cooler. Baseball? Boston still talks about the time Grady Little [thx, Xach. ed.] left Pedro Martinez in one inning too long against the Yankees in game seven of the ALCS. Come on, he had thrown a hundred pitches! Everyone knows Pedro is useless after a hundred pitches! You just never hear anything like that about hockey or basketball, which both boil down to great athletes pretty much just playing run and gun.

Baseball never needed tinkering, though tinker they did. The fundamental quality of the game first ensured its survival through the hard times when fans strayed for the quick fix of non-stop hockey and basketball action. Now the richness, subtlety, and sophistication of the game have some stadiums selling out most games of a very long season.

Moral for Lisp left as an exercise.

Tuesday, January 20, 2009

Tilton's Law: Solve the Failure First

The team was at my throat.

"Just use the new search!" they bellowed.

The mission critical, project saving, do or die demo to upper management was eight hours away and we had not even begun the always dicey process of moving the software from the development system to one within reach of the Demomeister, and I was trying to find out why the old search was so slow.

"Soon," I replied.

We had a new search I was told was a screamer, but I continued poking around, putting in metrics, trying to figure out why the old search was so slow. Had we not been a virtual, remote, telecommuting team I would not have lived to tell this tale, but we were, so they had no choice, and I reassured them that "soon" meant ten minutes and they shut up.

Why was I still trying to understand the perplexing sloth of the old when a whole new replacement module was available and working fine, and pretty much the demo on which all our jobs and a cool project depended was coming on like a freight train?

Tilton's Law: Solve the failure first.

Early on we learned the other side of that coin: Solve the first problem. As for what the two have in common, let's do the war story first; war stories are more fun than preaching.

Back we go a quarter of a century to my first contract with a client who would become my sole recurring client for the next decade. I was being hired to take over maintenance of an application whose author had been one of the first to die of AIDS. I was reminded of the whole business by a conversation with another developer recently about the nature of working on OPC. Other People's Code.

In my IT career I have worked always at the poles of software development, either writing new code or performing massive overhauls of OPC, never that relaxed zone between in which one simply maintains and extends in small ways a long-lived system. The second pole (OPC overhauls) always seemed to me an intimate one-way encounter with some anonymous predecessor, an encounter usually involving me roundly and steadily cursing them out. You can imagine then how eerie it was working on this system from this predecessor who was not so anonymous this time, especially when I learned that the poor guy was in bad shape during one stint but needed the money and so worked on the code I was now working on even as his fate rose up to meet him (this well before the days of the cocktails of today that make ones fate less certain). This guy I do not remember cursing out so much.

But I digress. Our lesson today is how to piss off your coworkers by insisting on solving a failure first, by which I mean even if you do decide to punt on X, make sure you understand how X failed. I am not alone in this. In the movie 2001, when the crew determines that the unit Hal said was failing is actually fine, Hal says fine, put it back in and let it fail. Sure, he was really looking for a way to kill the crew, but we learned in 2010 that Hal was just a computer system, and I think the bit about putting the supposedly OK/not-OK unit back in to see if it failed was one of Hal's systems working nominally, in accordance with Tilton's Law: we need to understand broken things.

And now at long last, my unsolved failure. My predecessor's, actually. The application was a securities database with a nightly feed of data applied to the cumulative DB by a batch program. This is late 80s, primitive stuff. A security could have three IDs because three groups were tracking securities and each had their own ID system. We had tens of thousands of records in our VAX/VMS RMS file, and a separate RMS key for each of the three possible IDs. So far so yawn. Here comes the fun part.

Two of the IDs were populated all the time. The other one was populated five percent of the time. Big deal, right? Right, very big deal, the poster boy for Solve the Failure First. What follows is a guesswork reconstruction:

My predecessor Paul (I picked "Paul" because it is easier to type than predecessor) had a problem. His program ran an initial test load of a hundred securities in a few seconds. Fine. Everything looked good. So then he ran it against a full daily feed, which would include news of every security traded that day so it would be -- OK, I confess I completely forget even the order of magnitude, let's say tens of thousands and declare up front that that is idiotic and I am sorry, but here is what happened: the damn thing ran forever. There probably was no immediate specific great mystery because Paul probably had the program printing something (a count, the last ID recorded, something) right to his VT-100 console as it went and he could see that the program had started out zooming along but then gradually got slower and slower until just adding one security to the database (and this is just good old ISAM, mind you) took... wait for it... twenty seconds. Oh. My. God. What on earth is happening?

Paul got a clue. Every once in a while two records were written out bang-bang, as fast as at the start. Dig dig dig puzzle puzzle...ah, there it is. Any record for which we have all three IDs is written out in nothing flat. Any record (you know, the ninety-five percent) with just two will (by the end of the run) be written out three per minute, 180/hour, or 1000/fuggedaboutit.

Paul realized what was going on. The ISAM file system had no problem storing data with duplicate keys, which was a good thing because Paul was storing a whole lot of data with one key 95% the same: spaces. Poor ISAM it seemed was chugging thru all the duplicates looking for the last one after which it would record the latest duplicate. And apparently it took twenty seconds back then to walk (effectively) the entire index of a hundred-thousand record file.
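The arithmetic of that pathology is easy to simulate. This is only a guess at the RMS behavior just described, sketched in Lisp with names invented for the sketch: each insert under an already-duplicated key walks the whole existing chain before appending, so N same-key inserts cost on the order of N^2 steps in total.

```lisp
;; Toy duplicate-key index: a hash-table of key -> list of records.
(defun insert-dup (index key record)
  "Append RECORD under KEY; return how many entries were walked."
  (let ((walked (length (gethash key index))))  ; the walk to the chain's end
    (setf (gethash key index)
          (nconc (gethash key index) (list record)))
    walked))

(let ((index (make-hash-table :test #'equal))
      (total-steps 0))
  (dotimes (i 1000)                      ; 1000 records, all-blank key
    (incf total-steps (insert-dup index "        " i)))
  total-steps)  ; => 499500
```

A thousand records sharing one key means 0+1+...+999 = 499500 walked entries: the quadratic, slower-and-slower curve Paul watched on his VT-100.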

Now the good news is that we would never need to look something up using spaces as the key value sought, so....what can we do? Paul was no slouch. He popped open the RMS reference manual and to his delight discovered he was not the first to pass this way and gleefully added the option "NULL_VALUE=SPACES" (translated: "if the value is spaces, Just Don't Index this record on this key") to the key definitions in the file definition script he was using to initialize the file and recreated the file and re-ran the program from scratch.

The change did not help. At all. I think we all know that feeling, as visceral as a dropping elevator.

There it was, the option explicitly intended to solve the problem he had explicitly encountered, and it did not change a thing. Impossible. But this happens to us programmers all the time. We know what to do. Compile the damn code, because we made the edit change but forgot to compile. Or link. Or, in Lisp, to zap the faultily specialized method. Or something.

So Paul edited the definition again and checked that NULL_VALUE = SPACES was the right syntax and right spelling and on the right key -- to hell with that, he put it on all three damn keys -- and he saved it and checked the date and created the file again and ran his program again and you know it did not run any faster or I would not be telling this story.

OK, time to get serious. Or if he was good he did this without all the huffing and puffing of the preceding paragraph. He Just Typed In "analyze/rms_file sdb.dat". And RMS looked at the file itself (not the script used to create it) and confirmed that "NULL_VALUE = SPACES" was operative for all indexes.

Momma don't let your kids grow up to be programmers.

What comes next is hard to convey. I can tell you, but unless you have worked on this code or (we will learn) run this batch application you cannot appreciate how much blood, sweat, tears, CPU time, and delayed nightly batch closes for how many years resulted from Paul's not first solving the failure of NULL_VALUE=SPACES.

Well, maybe this is a fair glimpse of the enormity that followed: the problem got sorted out only because the head of operations and I got to talking one day and something reminded him and next thing I know he is pretty much down on his knees begging me to find some way to eliminate the two-hour merge step that held up the nightly close every night. 

"It just sits there for two hours," he groaned. "It kills us every night. Please, if you can, please, do something to make this go away."

Whoa. I had inherited this system and been asked to enhance it but no one had said a word about this. The code was far and away the best OPC (Other People's Code) I had ever dealt with so everything got the benefit of the doubt, including the (soon-to-be explained) two hour merge. As in, if it is there, it must be there for a good reason. What was not there was The Story of the Unsolved Failure of NULL_VALUE=SPACES, but even if it had been I would have taken that at face value, too, because the NULL_VALUE option was unknown to me. But enough of this flash forward, let's get back to poor Paul.

NULL_VALUE was not working as it should. Software is like that. Good programmers do not let bad software stop them. Plan B. A rule is born: Thou shalt not write new securities to the securities database where the massive duplicates will make each write take twenty seconds. Paul decides to write them to a second file initialized empty on each run. Since we only got dozens of new securities in one batch, that file would never have the massive count of duplicates and writes would be lightning fast. Then we just do a sort/merge at the end of the batch to combine the new securities in with the old. Oops. "Just."

The funny "you can run but you cannot hide" moral within the moral being that I did the calculations one day and worked out that twenty seconds times the average number of new securities in a day was exactly as long as the sort/merge that was just killing the folks down in operations. And I bet Paul realized that but only after writing all the crazy code he had to write to work with two files at once as if there were only one file and at that point he just gave up and moved the thing into production. Speaking of crazy code...
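For what it is worth, the break-even arithmetic is easy to sketch, though the real daily count of new securities is a guess ("dozens" above, while two hours of merge at twenty seconds a write implies rather more, so take all the numbers as loose memory):

```python
# Back-of-the-envelope: at what daily count of new securities does
# Plan B's two-hour merge cost exactly what the slow writes would have?
merge_seconds = 2 * 60 * 60          # the nightly sort/merge that killed operations
slow_write_seconds = 20              # one insert under the blank-key pileup
print(merge_seconds / slow_write_seconds)   # 360.0 new securities per day
```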

You should have seen it. Looking back I cannot recall why it should have been so hard, but I did overhaul that code and I was forever tripping over it. The idea is simple. To look up a security to see if we already have it, first look in the real DB and if it is not there look in the daily "new stuff" DB and if it is not there, ah, it is new. If it is there, update it. Just remember to update the right file, because we can get data from two sources about the same new security.

Piece of cake, right? A bottleneck function for all reads and updates... anyway, it seemed like the issue was always getting underfoot as I worked, and just looking at the code one saw again and again this check here/there code, and both Paul and I were the kind of engineers always on the lookout for ways to make code non-redundant. I would think my memory was faulty but I also remember eliminating Paul's Plan B after solving his failure first and that was no picnic. It just permeated the application.
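The bottleneck function itself is long lost, but a minimal Python sketch of the shape it must have had (all names invented) shows why the two-files-as-one trick permeated everything: every read probes both files, and every update has to remember which file the record came from.

```python
# Hypothetical sketch of the bottleneck routine: two files posing as one.
# `master` is the big securities DB, `daily` the per-run "new stuff" file.

def find(sec_id, master, daily):
    """Return (record, home_file); home_file is None if the security is new."""
    if sec_id in master:
        return master[sec_id], master
    if sec_id in daily:
        return daily[sec_id], daily
    return None, None

def store(sec_id, data, master, daily):
    """Update the security in whichever file holds it, else add it as new."""
    _, home = find(sec_id, master, daily)
    if home is None:
        home = daily                  # brand-new today: fast write, no pileup
    home[sec_id] = data
```

The point is not these ten lines; it is that every caller in the application had to go through (or, worse, reimplement) this dance, which is why ripping Plan B out later was no picnic.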

So what was the first failure and how did it get solved? First, I had noticed the issue myself while using Datatrieve to add a record to the securities DB for test purposes. I hit enter and thought I had crashed the system because it went away and never came back and like every egomaniacal programmer out there I always assumed that whenever a system stopped responding the last thing I had done must have broken it so there I sat in dread for twenty seconds until the system finally responded. Wow. Twenty seconds? And then I guess I added a record specifying all three keys and it responded instantly.

But this idea of null values not being recorded in an index was new to me, and we did not have the Internet back then where I could just ask the ether what was going on. So it was only a coincidence that, just after the guy in operations had begged me for a fix, I was visiting with the lads from a prior contract and moaned that RMS sucked because it could not handle files with hundreds of thousands of records, and they laughed at me and said they were handling millions with RMS.

I can actually remember the look on my face, a neat trick when you think on it.

I haul ass back to work and pull out the RMS reference manual and I can tell you that dead trees aside there is one good thing about paper documentation: right above the entry for NULL_VALUE, close enough to catch my eye, was the entry for NULL_KEYS.

Yep. You need to specify both. Paul had specified NULL_VALUE=SPACES. He had not specified NULL_KEYS=YES. The default for NULL_KEYS? Guess.
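A toy illustration (Python standing in for RMS, whose behavior I will not pretend to reproduce exactly) of why one option without the other is a no-op: NULL_VALUE only names which value counts as null, while NULL_KEYS is the switch that actually says to skip indexing records whose key holds that value.

```python
# Toy model: a record stays out of an index only when BOTH options agree.

def indexed(key, null_value=" " * 10, null_keys=False):
    """Should this record appear in the index for this key?"""
    if null_keys and key == null_value:
        return False                 # null key: Just Don't Index this record
    return True                      # default: index everything, blanks included

print(indexed(" " * 10, null_keys=False))   # True  -- Paul's first attempt, no help
print(indexed(" " * 10, null_keys=True))    # False -- the fix
print(indexed("ID00042", null_keys=True))   # True  -- real keys still indexed
```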

I kinda retch inside even now thinking about the astonishing amount of money, work, debugging, and delayed batches that followed from one simple failure to understand one broken thing.

The meta-lesson shared with "Solve the First Problem"? In programming, never deal with the unknown. This game is hard enough.


The punchline is that I never solved the first failure from Scene I of this tragedy. As my father used to say, "Do as I say, not as I do." We did have a deadline, and I did narrow down the location of the problem in a way that reassured me somewhat that it would not jump up to bite the new code in the rear end. And even in the breach the law is confirmed: we do need to address the underlying problem, which I have some confidence I now understand because it still presents problems for the software. But it will go away only when bigger problems are solved, and they are much bigger, so I am keeping my sights set on them.