A Collaboratively-Maintained Set of Exercises Teaching Fluency in the Whole of Practical Programming and Computer Science
Daniel S. Wilkerson; 30 August 2005
I read much about C++ for a few years before I ever wrote code in it; I knew about OO design and many C++ problems, but I couldn't code. I decided I would just do all the exercises in the first 15 chapters of Stroustrup. They varied wildly: some easy, some long and tedious; it really looks like Stroustrup himself never did them. Instead I simply wrote a little program using each C++ feature he mentioned. I used both the EDG and GCC compilers and found quite a few differences; that was particularly unsettling. I managed to get through in 2 weeks, writing about 8 thousand lines of code.
It worked. I went from having a frustrating theoretical understanding of C++ to being able to code in C++; it was so satisfying. And here is the amazing thing: I didn't "know" any more about C++! I was already a C and Java programmer so I basically knew all the design patterns, all the problems with OO, many of the C gotchas, etc. I couldn't have *said* more about C++ than before, and yet now I knew C++.
In many other languages, at least French, German, and Hungarian, there are two words for "know": there is to "know a fact", savoir in French and tudni (TUD-ni) in Hungarian, and there is to "know or be familiar with in an intimate way" (usually of people), connaître in French and ismerni (ISH-mer-ni) in Hungarian (don't know the German, but I know it's there; Karl?). Before, I "tud" or "sait" C++; after, I "ismer" or "connais" (conj.?) C++. And this is the basic problem with Academia: it teaches people to tud, but not to ismer.
I have noticed that people like Scott and Karl who have spent many years coding instead of wasting their time on Theory and Mathematics like I did just have a much deeper understanding of code and of Unix. Despite some years of experience now, I still get stuck with lots of little things like getting my printer or speakers to work, figuring out why the CPAN installer won't get the latest Perl module, updating libc so I can use Yahoo messenger without borking the rest of my system, diagnosing and fixing linker problems (hint: Scott says the line number is meaningless but the message is correct), etc. etc. Yet others can simply figure it out, even if they do not know any more about the specific problem when they start.
These are "little" problems, but when it is something that is stopping you and no one else is around you can't make progress: it is a BIG problem. Also it has a "chilling effect": I am still running Slackware 8.0 and have a crufty old IP chains firewall because I'm afraid to touch anything. I am way better than I was, but my office-mates still can't use the printer hooked up to my machine because we just can't debug the weird lpd response. I know Scott or Karl could figure it out but I refuse to bug them about it.
I have repeatedly asked people what the answer is to this predicament of knowing and not. The answer is always something like "well, you learn with experience". I am sure that is true, but what experience? How can I go have some of that experience? I read the introduction to a book on how to solve math problems by an ex-grad student in Math at Cal; he said he wrote the book because he spent a lot of time in grad school just being completely stuck on homework problems and that he thought it had just been a waste of his time. The problem was that his book was just more top-down advice on problem solving that immediately put me to sleep and didn't help at all.
What we need is some "guided experience" for people designed to produce a specific result. When I was in Hungary, we learned math using the famous Hungarian method: do fun, simple, graduated exercises over and over and don't worry about definition/theorem/proof any more than you have to. It is pretty darn simple and it works. Halmos, the famous Hungarian expositor of mathematics, said that the secret to learning mathematics was "examples". When asked the secret to learning physics, Feynman said "examples".
I like the word "exercise" somehow better than "example" because the latter implies a top-down approach to me and the former a bottom-up one, which I prefer. Either way, what I think is missing is a book like the famous one by László Lovász, "Combinatorial Problems and Exercises". It is a huge math book consisting of three sections in the classic Hungarian Method style. Section 1: A few pages of exposition, a huge number of exercises. Section 2: hints on the exercises. Section 3: solutions to the exercises. I think you are getting the idea.
What if we had some sort of public repository of exercises, an "exercise-pedia"? It would be some sort of online collaboration perhaps like Wikipedia: it could include expositions on various parts of programming and computer science practice, but these would just be intended to frame the real point: carefully graduated lists of exercises to do on all practical topics. Really you could start with just a source repository such as exercises.tigris.org and start checking stuff into it.
A set of exercises would be intended to produce a particular result, such as to really "ismer" or "be intimate with" TCP/IP. A useful one would be just to lead someone through a toy implementation of a TCP/IP stack: if the code isn't optimized and has a bare-bones feature set, it can't be that hard to write. In the end you install it on your computer to prove to yourself that it works. If you want to know something about the traffic patterns, you can just hack it into your own version; no need to read man pages on ethereal.
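To give a taste of how small one step in such a TCP/IP exercise list could be, here is a minimal sketch of the 16-bit Internet checksum described in RFC 1071, which IP, UDP, and TCP headers all use. The function name internet_checksum is made up for illustration, and the code is deliberately unoptimized, in the bare-bones spirit described above; it is one possible exercise solution, not a piece of any real stack.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement checksum of data (RFC 1071 style).

    Hypothetical helper for a toy stack: clear, not fast.
    """
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a trailing zero byte

    total = 0
    for i in range(0, len(data), 2):
        # Combine each pair of bytes into one big-endian 16-bit word.
        total += (data[i] << 8) | data[i + 1]

    # Fold any carries above bit 15 back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    # The checksum is the one's complement of the folded sum.
    return ~total & 0xFFFF


# The example words from RFC 1071: 0001 f203 f4f5 f6f7.
rfc_example = bytes.fromhex("0001f203f4f5f6f7")
print(hex(internet_checksum(rfc_example)))  # 0x220d
```

A natural follow-up exercise is to check that running the function over a header that already contains its correct checksum gives 0, which is how a receiver verifies the header.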
You must submit only exercises that you have actually done yourself, so we don't get the stupid effect of so many books where the exercises are too hard, too easy, boring, patchy, incomplete, full of mistakes, etc. etc. Those who do the exercises could augment the exercises or the exposition, fix bugs, evaluate them for quality, etc. Others could attempt to provide references to standard material, books, web pages, courses, HOWTOs, wiki-entries.
The most important thing would be that the very experienced people could attempt to edit them for comprehensiveness. That is, I not only don't know Unix, I don't even know how much Unix I don't know. A friend of mine was so thankful to learn about "nohup" because he had been up all night hitting "enter" so his process on another machine wouldn't die. This was a grad student in EECS at Berkeley. Holes like this in your understanding can be devastatingly expensive. If he had had an internal understanding of how processes are managed he would have known such a thing must exist or he could have written it. A comprehensive set of exercises would allow you to finally get to the point where "that's all there is!" As my Dad put it, you have islands of ignorance instead of islands of knowledge.
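To make "he could have written it" concrete, here is a rough sketch of the core idea behind a nohup-like tool, assuming a POSIX system: tell the kernel to ignore SIGHUP, then replace the current process with the requested command, which inherits the ignored signal across exec. The name run_immune_to_hangup is hypothetical, and this leaves out things the real nohup does, such as redirecting terminal output to a file.

```python
import os
import signal


def run_immune_to_hangup(argv):
    """Ignore SIGHUP, then exec argv in place of this process.

    Hypothetical toy version of nohup's central trick (POSIX only).
    An ignored signal stays ignored across exec, so the command keeps
    running when the terminal that started it hangs up.
    """
    signal.signal(signal.SIGHUP, signal.SIG_IGN)
    # execvp searches PATH for argv[0] and does not return on success.
    os.execvp(argv[0], argv)
```

Saved in a script that calls run_immune_to_hangup(sys.argv[1:]), this could be used the way nohup is: put it in front of the long-running command and then log out.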
Another part of completeness is minimality: big, pseudo-encyclopedic, verbose books on software engineering are useless. The point of exercises is to distinguish certain kinds of understanding for someone. If they are learning emacs, knowing C-s C-w really is in the core of what you should know; knowing SGML mode could be further down the list. The aesthetic would be to put as little as possible in a given stage, but also to have each stage be complete: if you know how to open a file, you know how to close it, etc.
Another grad student was in my office one day and said "wow, if I knew emacs as well as you I'd be twice as productive!" I really think knowing these "mundane" things has a profound effect on someone's ability to think and work. You really cannot call someone a C programmer who doesn't know how to use the preprocessor to expand macros or gdb to step through code or make to organize their build process, or at least RCS to manage versioning. And yet the programming classes teach this scheme bull-shit instead and I have never seen one yet teach a debugger. We could at least write this stuff down.
But there is an even greater importance. Logging on one day, I saw "Everything should be done top-down, except the first time." The only problem is that in the world of software, most of the time we are doing things for the first time. Yet everything taught in classes is top-down; there is no real respect given to just "hacking".
In Soto Zen Buddhism top-down is called "light" and bottom-up is called "dark". To paraphrase Shunryu Suzuki, sometimes you study the texts of great masters and sometimes you just sit and let it all get mixed together in your belly in the dark.
Darkness is a word for merging upper and lower.
Light is an expression for distinguishing pure and defiled.
...
Right in light there is darkness,
but don't confront it as darkness.
Right in darkness there is light,
but don't see it as light.
There is also an engineering saying, "First imitate, then innovate", which is another way to put it. You don't just learn from ideas; you also learn in ways whose workings you do not know. When I was a kid I would think "I can't have kids, how would I teach them to talk!" But they just learn how to talk if you talk to them and play with them.
True innovation doesn't come top-down from the theory. It comes just from being intimate with reality and paying attention. In Zen and The Art of Motorcycle Maintenance, Pirsig talks about debugging a motorcycle. He says something like when all else fails, you just sit there; the answer is just waiting for you to be quiet enough so that it can come up and whisper in your ear.
Most of what people seem to be doing is just repeating each other's old ideas. I had to explain one very simple idea to a hardware professor three times in a row before he heard it; I could *see* his mind snapping back repeatedly to the way "everyone knows" you are supposed to think about that problem. It turns out, no one seems to have ever thought of it my way. I thought of it because I just didn't know enough about hardware. As Suzuki says in Zen Mind, Beginner's Mind:
In the mind of the beginner there are many possibilities. In the mind of the expert there are few.
Now we can't all be wholly ignorant of how computers work and expect to make progress. However, we can gain an intimacy with computers by really knowing how they work, by playing with all the intimate details, rather than by just indulging in the fantasy that "The Theory is the Real Thing". There is the conference called "Foundations of Computer Science". You should see it! It's all math! The Foundation of Computer Science isn't math, it's silicon!