Wednesday, March 30, 2011

Loops and Making Horseshoes

Loops, such as the FOR loop or the WHILE loop, are programing constructs for repeating a set of instructions until some termination criteria is met. Two primary things define a loop: 1) What you do inside the loop, and 2) the conditions to stop looping.

Drex's new apprentice, Simon, was not working out. In fact, Drex never had a worse apprentice in his thirty-five years as a blacksmith. Simon could barely lift the hammer, let alone swing it with sufficient force to shape metal. However, worse than that Simon also lacked the necessary intelligence to carry out even simple tasks. Had it not been for his diminutive size, Drex might have thought that Simon was actually an ogre.

Drex found himself constantly repeating instructions:
“Now, hit the metal again.”
“And again.”
“And again...”
Drex's patience was wearing very thin. Drex hated repeating himself.

Finally, Drex decided to try an experiment. “Simon, hit the metal twice.” he commanded.

Clank. Clank. Simon complied.

“Now turn it over and hit it three times.” Drex commanded.

Simon flipped the deformed looking horseshoe with the tongs, and he hit it once. Then he paused. He looked back at Drex. Simon looked confused.

Drex sighed loudly. Was that really too much for Simon to handle? The boy was hopeless.

“It is a loop!” proclaimed Drex loudly. He knew that Simon would not understand, but at least shouting made Drex feel better. “A simple, simple loop.”

“A loop?” asked Simon.

“Yes. Haven't you ever heard of a loop?”

Simon shook his head sadly.

Drex realized that they had hit the core of the problem. How could Simon function as a reasonable blacksmith without understanding loops? Then again, Drex had no idea how Simon could function as a human without understanding loops.

“A loop is defined by two things: something to do and a way to know when to stop doing it. You keep doing that one thing over and over until you stop.” explained Drex calmly.

Simon stared back blankly.

“Think about a one mile race.” Drex suggested. “You run around the track, until you have gone a mile. That is four laps, right? So, the running is the thing you do and having run a full mile is the criteria for stopping. The track even looks like a loop.”

“I run until someone tells me to stop.” declared Simon loudly.

“Of course you do.” muttered Drex.

“In this case,” continued Drex, “I want you to keep hammering that horseshoe until it is flat. As soon as it is flat, you can stop. WHILE the horseshoe is not flat, hit it with the hammer.”

“Okay.” agreed Simon happily. He promptly set about hitting the horseshoe over and over again until it was flat. Then he stopped. By the end Simon was breathing heavily from the effort, but the horseshoe was completely flat.

Drex was stunned. How had Simon understood that?

“Good. Now go get the coals hot.” Drex commanded.

Simon looked confused again.

Drex sighed. “It is another loop. Pump the bellows 10 times. FOR each number that you count from 1 to 10, give the bellows a pump.”

“Okay.” Simon again got to work, pumping the bellows exactly ten times. He counted loudly each time.


Over the course of a week, Drex determined that Simon would repeat tasks if they were well specified in a loop. He would tell Simon exactly what task to repeat and exactly how long to repeat it. Sometimes he would simply tell Simon to count up to a certain number. Other times he would phrase it like a WHILE loop, telling Simon to continue doing something until a goal had been met.

Simon responded amazingly well to these structured commands. The blacksmith's shop was filled with the noise of Simon cheerfully counting and hammering. “One… bang… two... bang…”

Eventually, Drex was even able to move onto nested loops, issuing instructions such as “WHILE the sword is not thin enough: Turn it over and FOR each number from 1 to 5, hit it with the hammer.” And Simon would happily go about banging the sword into shape while turning it over after every five hits.

Unfortunately, this formalized approach only worked to a point. Disaster finally struck when Drex tried to teach Simon about multi-threading. As Drex and Simon stood outside the burning blacksmith's shop, Drex admitted defeat. Before leaving town for an open blacksmith's position in Garroow, he set Simon up with a better matching job -- counting laps for runners on the local track team.

Tuesday, March 29, 2011

The Importance of (Variable) Names

Having clear, meaningful variable names is an important factor in writing understandable and maintainable code.

By the time Princess Ann had reached the northernmost outpost within the kingdom, she was losing hope. Her father, King Fredrick, had sent her on a quest to save the kingdom from impending darkness almost a month ago. So far, Ann had found nothing. Meanwhile reports of roaming dragons and hordes of goblins increased throughout the kingdom. Ann felt completely demoralized.

The outpost of Garroow had been hit particularly hard by the recent chaos. The goblin attacks had been increasing in recent weeks. The commander, Sir Aat, had sent word to Ann's father that the outpost was in desperate need of reinforcements. At a loss for better stops on her quest, Princess Ann headed north to Garroow.

The situation in Garroow was worse than she had expected. During her first night at the outpost, a small goblin attack almost overwhelmed it. The fifty person garrison barely held off just three, relatively lethargic, goblins. She heard the captain shouting orders at his solders: “Ut, guard the South wall. No, I meant Ot. Ut, stay where you are.” “Drex, swap places with Plex, we need an archer on the wall not a blacksmith.” “Et, secure that door.”

Eventually, the soldiers repelled the attack and extinguished the fires. However, the lingering feeling of chaos and confusion continued to bother Ann. It worried her that the garrison's response had been so disorganized. It was like watching a turtle try to chase its own tail. The problem was not the number of soldiers in Garroow, but rather how they were being commanded.

Ann resolved to fix the situation before leaving the garrison. She spent the entire night pondering the different algorithmic strategies, certain that one of them would help the garrison run more efficiently. As she had been taught from an early age: almost every problem has an algorithmic solution. She debated the pros and cons of using dynamic programming to organize communication lines, using a min-max search to optimize strategy, and using randomization to avoid local minima. But none of these approaches seemed to address the core problem. Ultimately, the true issue dawned on her at 3am, and she fell asleep confident that she knew how to fix the situation.

“Sir Aat,” she addressed the commander at breakfast the next morning. “We need to discuss the attack last night.”

“Yes.” agreed the commander. “Now you see why we need the reinforcements.”

“No. I do not.” responded Ann.

The commander looked shocked. The rest of the dining hall fell silent. Everyone waited to see what Ann said next.

“What you need are better names.” Ann continued.

The commander laughed deeply. “You don't understand. We have already improved our names. When a soldier joins the outpost, they are assigned a new name. Every name is short, so that commanders can call out orders quickly in battle.”

“No. It is not.” disagreed Ann. “It is inefficient.”

“No offense Princess Ann, but what do you know about commanding in battle?” he asked.

“Only what I observed last night. But from that limited introduction, I can assure you that the names are hurting your efforts.”

“I think you are mistaken.” declared the commander. “They allow us to issue commands at incredible speeds.”

“Yes they do.” agreed Ann. “But they are prone to mistakes. Last night, you corrected yourself 89 different times. The names are too similar and thus too easy to confuse. Plex and Drex. Ut, Ot, Et, and Aat. The short names do not help!”

“Ha! What would you suggest?” scoffed the commander.

“Use descriptive names. For example, Plex should be called 'South Tower Archer' or at least 'Archer 1'. That more accurately reflects his role.”

“That is crazy!” bellowed the commander as he slammed his mug of coffee on the table. “Do you know how long it takes to say 'South Tower Archer' in the heat of battle? We would waste valuable time.”

“Do you know how long it takes to say 'Drex, swap places with Plex, we need a archer on the wall not a blacksmith.'? Any measure of efficiency needs to take into account the time spent on corrections.” countered Ann.

The commander had no response for that.

“And what about you?” Ann continued. “Why not have them call you Commander or Captain?”

“Our names already reflect rank.” replied the commander. “The names proceed down the ranks in alphabetical order. It allows any solder to instantly know who outranks them! It makes life simple!”

“No. It does not.” corrected Ann again. “In order for the solders to even refer to each other, they have to learn new made-up names. Why not have them learn the ranks instead? Either way they have to learn something new. Only, in this case, the ranks mean something.”

“But we have a good system!” argued the commander.

Ann sighed. “It is like programming a complex algorithm.” she explained. “Using very short variable names can make it feel more efficient to program, because you can type out the code faster. But, in the long run, it can do more harm than good. It becomes easy to make mistakes and difficult to sort out what is happening. Often times, slightly longer names can make a significant difference.”

The commander opened his mouth to argue, but was unable to think of a rebuttal. Instead, he sat at his table, mouth open, with a confused look on his face. After a while he spoke.

“I think you might have a point.” Secretly, the commander also felt a small pang of relief. He had never been fond of his own assigned name. He often found himself day dreaming of his solders snapping a salute and shouting “Yes Commander!” in unison. Maybe this change would not be so bad after all.

That afternoon, the commander changed every soldier's name to be longer, but more meaningful. Over the next few days, the troops stumbled through drills, getting used to their longer names. But soon, Ann began to see efficiency improve.

A week later, on Ann's final night in Garroow, there was another goblin attack. This time the invading force consisted of 10 highly trained goblin special forces troops. The Garroow soldiers turned away the attack with ease.

As Ann left the garrison, she took a small bit of pride in the dramatic improvements in the forces there. After indulging in the brief moment of happiness, she turned her horse East and continued on her quest to save the kingdom.


Monday, March 28, 2011

Computer Memory and Making Dinner

Memory in a computer is a hierarchy. At the very top you have the ultra-expensive and ultra-fast memory called registers. And at the bottom you have mind-numbingly slow but large storage such as a hard disk. Why does this matter? From a programmer's point of view, understanding the memory hierarchy can allow the programmer to optimize the program to work efficiently within this hierarchy. The correct bits of information are kept at the lowest (and fastest) levels. And from a non-programmers point of view, these concepts provide insight into how a computer works.

It has been rumored* that the entire basis of the modern computer memory hierarchy is based off a single late night conversation between Dr. Simon Babbage XI and Chef Louis Von-Register IX (aka the Iron Cache). The two had been arguing late into the night about the most efficient way to organize ingredients in a kitchen. It was a heated debate between empirical evidence and learned rules of thumb. Unfortunately, the argument became so heated that neither of the men realized that they were both proposing the exact same scheme. Ultimately, the night marked both the end of a years' long friendship and a major step forward in computer architecture.

Thus, a good analogy for understanding how computer memory works is naturally to view it in terms of food storage and dinner preparation:

Registers - Registers hold values that you are using at this moment. For example, these could be the inputs into an addition operation in the CPU. This storage is analogous to your own hands or maybe the current mixing bowl. It is literally the "stuff" that you are working with at that moment.

L1 Cache - The L1 Cache holds data that is cached from main memory, but may not be in use at the moment. It is relatively small, but very fast to access. This cache is analogous to the kitchen counter right next to where you are working. You can put stuff down there that you know you will need in just one minute. Just do not leave open containers of milk there all night.

L2 Cache - The L2 Cache is a larger cache between the L1 cache and main memory. It is like the far end of the counter - still easy to get things, but somehow a little more annoying.

RAM - RAM is your kitchen cabinets: larger, but further. You can store a lot more ingredients there than you will need for just the current meal. But, when you are carefully stirring the soup, while trying to keep it at the perfect simmer, it can be a little annoying to run to the cabinet to get more salt.

Hard Drive / Network / Flashdrive - These represent the slow, but very large storage devices. They are the neighborhood supermarket of the memory system. There is a lot of food there, but if you need to go there to get something for a particular recipe, you will probably miss dinner. Ideally, you want to limit your trips to the supermarket and not dash out at every step of the recipe.

Of course most of these levels are conveniently hidden from even low-level programmers. The obvious analogy here is the assistant chef that almost reads your mind by: helping you get the correct ingredients, putting things on the counter, cleaning up unused items from the counter, and running simple errands. Fortunately, the computer's memory system does not get cranky and talk about you behind you back (I think).

*Actually, no such rumor existed until I made up that story. There is probably a much better (or at least more accurate) story for how the computer memory hierarchy came into existence, but I doubt it involves a thrown souffle.

Friday, March 25, 2011

Hunting Dragons with Binary Search

Binary search is an algorithm for efficiently finding a target value within a sorted list. The algorithm maintains a shrinking search window. It narrows down that window by repeatedly ruling out half of the remaining search space.

At each step, the algorithm checks the middle item in the current window and compares it with the target value. If the value in the middle is less than the target, you can rule out the lower half of the window. If the value is greater than the target, you can rule out the upper half. This process repeats until the target item is found or the window shrinks to a single (non-matching) item.

The morning's news was nothing short of dismal. King Fredrick had only been awake for twenty minutes when he decided that today was going to be awful. It had started manageably enough, with his head butler informing him that his favorite crown was still being polished and was not available. However, that news was quickly followed by his chief knight, Sir Braver, informing King Fredrick that a dragon was now terrorizing his kingdom.

"Tell me what happened," commanded the king in a weary tone. He hated dragons.

"There is a dragon," declared Sir Braver.

"You have already told me that part. Now tell me what has happened. Has anyone seen this dragon? Has it attacked anyone?" prompted the king.

“Yes, sir. We have received six confirmed sightings. Farmer McDonald has lost his prize cow."

"The one that won last year’s competition?" inquired the king. If he recalled correctly, it was a sizable cow and would be a perfect target for the dragon.

"Yes, sir," answered the knight.

The king sighed. The story sounded like a textbook dragon attack. Soon the dragon would eat half the cows in the kingdom. "Well, you had better take the dragon hunter and get rid of the dragon."

"But sir, no one knows where the dragon is. How shall we hunt it?" asked the knight.

"Really?" the king asked, surprised that his head knight did not know how to hunt a dragon.

Sir Braver remained silent.

"Find the last farm that it attacked,” said the king. "Dragons take long naps after each meal, so you have time to catch it there.”

"But sir, how will we know where that is?" asked the knight.

"Have you studied dragon attacks at all?" the king asked. "There is an order to these attacks. Dragons are smart. They eat the biggest cow in the area, then the second biggest, then the third, and so forth. They keep eating smaller and smaller cows, until the only remaining cows are too small to be worth the bother. Then the dragon goes on to the next kingdom."

"But sir…" started the knight.

"Use the rankings from last year's cow competition!" cried the king, losing his patience. "They have a sorted list of the largest cows!"

"Ah. Of course, sir. We shall work down the list until we find the dragon!" confirmed the knight.

The king nodded, relieved that the knight was finally thinking on his own. Sir Braver and the dragon hunter would visit the each of the farms on the list, in order, and find the dragon. Even as he tried to comfort himself with this thought, warning bells went off in the king's head.

"Not down the list," the king admonished. "Do you know anything about searching?"

The knight looked confused. He had been certain that he was on the correct track. "But sir, I don't understand" he said.

"How long does it take you to travel between farms?" asked the king.

"A few hours, sir."

"How many farms are on the list?"

"Maybe a hundred."

"That means that it could take you hundreds of hours. By that time, the dragon will have eaten more cows. It might even be gone by then."

"But, sir…"

"Use binary search," commanded the king.

"Binary search, sir?" asked Sir Braver, regretting having fallen asleep in his 'Algorithms for Knights' class.

"Start in the middle of the list and check whether the cow is still there. If it is still there: rip the list in half, throw away the bottom half, and repeat the process on the top half of the list. If the cow is gone: rip the list in half, throw away the top half, and repeat the process on the bottom half of the list. Each time you visit a farm, rip the list in half and concentrate your search on only one of the two halves."

"But sir, that sounds complex," protested Sir Braver.

“It is not," argued the king. "It is simple. If the cow in the middle of the list is still at the farm, then so are all the cows below it on the list. That is how dragons eat. It is not going to eat a smaller cow first. That would be absurd.

“On the other hand, if the cow is missing, then so are all the cows above it on the list. Dragons do not skip around lists randomly. There is an order to these things."

“What if the dragon moves while we are searching?” asked the knight.

Finally, a reasonable question. “Your search effectively tracks two bounds: the smallest cow that you know has been eaten and the largest cow that you know has not been eaten. If you get to a point where those bounds are next to each other on the list, and you do not see a dragon, then proceed to the largest uneaten cow and wait there.”

"But sir, how is that any faster than just scanning down the list?" asked the knight.

The king sighed, again. His confidence in the knight’s ability to handle this task was eroding rapidly. "If the dragon has attacked 45 farms, how long will it take you to scan those farms? 90 hours? But in the case of binary search: after the first farm you will have ruled out 50 farms on the list -- one way or another. After the second farm, you will have ruled out another 25. Each time, you are cutting the problem in half. In fact, you can find the dragon after checking only 7 farms."

The knight nodded, yet looked unconvinced. "But sir, are you sure?"

The king screamed. "Of course I am sure! I am the king! And, further, every idiot knows that binary search is a logarithmic time algorithm while scanning down a list is a linear time algorithm. NOW GO AND GET THAT DRAGON!"

For the first time in his long career as a knight, Sir Braver ran away. He told himself that he was running in order to carry out the king's orders as quickly as possible. But in truth, the king looked really mad.

As he watched his top knight clank noisily down the hall, the king decided that it was indeed time to go back to bed.

For more on binary search also see Binary Searching for Cinderella.

Thursday, March 24, 2011

Pointers and Walk-In Closets

Pointers are variables that hold a particular type of data: addresses in the computer's memory. A main advantage of using pointers is the flexibility that they provide when interacting with data stored in memory. A single, relatively small, pointer can give the location of a massive chunk of data sitting in memory, allowing functions to access the data without having to pass around the full block of data itself. Programmers can also create complex data structures by using pointers to link different blocks of data.

The powerful Wizard Marcus lived in a small apartment in downtown New Athens. The rent was high and the views were terrible, but that was the price of living downtown. To be honest, neither of those factors bothered Marcus. What annoyed him was the size of the apartment. Marcus had accumulated many belongings over the years, and he liked having a place to put them all. The apartment had only one pathetically small closet.

Being a wise and powerful magician, Marcus turned to magical solutions to make his life better. First, he shrunk all of his belongings so that they would all fit in his closet. This created quite a mess. There were so many tiny items thrown into the closet that he could never find anything. One night it took him hours of searching with a magnifying glass in order to find his favorite wizard's hat. As a result, he was tremendously late for a date.

Next, he tried dehydrating all of his possessions down to powder and putting them in labeled jars. His closet soon resembled a spice rack filled with hundreds of tiny jars. Unfortunately, rehydrating cloths left them thoroughly soaked, and a single poorly timed sneeze scattered his favorite cloak over his living room. Marcus quickly abandoned that idea as well.

Finally, he realized that he did not need to actually keep all of his possessions “right here” as long as he could easily retrieve them. He bought a huge castle in the rural outlands to store his belongings. When he needed an object, he would summon it. Similarly, when he was done with using it, he would send it back. The process effectively gave him a giant virtual closet.

In only a short matter of time, Marcus found that he had a new problem. He no longer remembered what was in the castle or even how much space he had left. He found himself randomly summoning items from different rooms in the castle just to see what was there. Marcus needed a better system.

In order to keep track of what he stored in the castle and where it was, Marcus created a formal accounting system. He divided the castle into a giant labeled grid of squares, each indicating different spaces where items could be. Then he created a scroll listing all of his items. Each time he sent an item to the castle, he carefully recorded a pointer to where that object was sent. Using this scroll he could quickly find the location from which to summon any item. He only needed to keep one small scroll of pointers in his downtown apartment. This system worked great.

Or, at least, the system worked great most of the time. Occasionally, Marcus got careless and forgot to record where he sent an item. As a result the item was lost and the space inside the castle wasted. Other times, Marcus would forget to check that a space was free before sending an object there. One night, after returning from a party, he accidentally sent his coat to the same location as a priceless vase. The vase exploded as the large coat appeared inside it. These were the inherent dangers of carelessness with his pointers.

Despite the occasional accident, the system worked well. Marcus learned to be careful about tracking everything in his scroll, and he was able to live for many years in the most fashionable neighborhoods.

Wednesday, March 23, 2011

Caching and the Librarian of Alexandria

Caching data means storing a copy of the data in a quickly accessible location to speedup future accesses of that data. The key trade-off is that the faster storage locations for data tend to be smaller and thus hold less data. For example, accessing data from RAM is much faster than reading it off a hard drive, but RAM is both smaller and more expensive per-unit of storage. By caching data that is used often, you can greatly speed up a program.

The Library of Alexandria is still famous to this day for the size and breadth of its collection. However, few people know that the true magic of the library was the librarian’s ability to find relevant resources with amazing speed. Within a few seconds he could often locate the canonical reference to answer a patron’s question.

For young Peter, this magical ability had always been a thing of wonder. He had grown up in a small house four blocks from the library and had spent many afternoons sitting at the tables listening to the librarian answer questions. “What should I feed my new pet yak?” “Where can I find the sweetest prunes in Alexandria?” The scope of knowledge commanded by the librarian was almost legendary. Peter dreamed of one day learning the magic. And, after ten years of training, Peter got his wish: he was accepted as the librarian’s apprentice.

On the first day of his apprenticeship, Peter arrived three hours early. As the librarian took him on a tour of the library, Peter eagerly followed him around, scribbling notes into his new notebook. Although Peter knew the public areas by heart, he soaked in every word. After the tour was complete, the librarian and his new apprentice took their places behind the reference desk and waited for the first patron of the day.

The first question came from a small man holding a rather large rock. “I’m trying to build a wheel. And, well … I’m not sure what shape to go with. Do you have any good references?”

The librarian disappeared into the stacks, only to reappear a few seconds later with a scroll entitled “Wheels and How to Design Them.” The patron was delighted. He took the scroll over to a nearby desk and started to quickly skim through it.

“Oh … I see … a circle … hmm … brilliant,” the patron quietly mumbled to himself.

“How did you do that?” The apprentice asked. “How did you find the scroll so quickly? Please teach me the magic of the library.”

“Magic?” The librarian asked. “I’m not sure about that, but I can show you my system.”

The librarian led Peter into the back room. The room was dimly lit by two small oil lamps. At the back was a single door leading to the vaults where the thousands of scrolls were kept. The librarian walked over to the a small table that held a coffee pot and a small stack of scrolls. He pointed at the scrolls. “I usually find the answers in one of these.”

Peter was confused. “Twenty scrolls? The library has tens of thousands of scrolls. How does having twenty in the back room help? Are they magic?”

“Not at all,” chuckled the librarian. “It turns out that people are pretty boring. They all ask the same types of questions. In fact, I’ve found that 95% of the questions in a single day can be answered with only twenty scrolls.”

“Is that all?” Peter asked, visibly disappointed.

“Yes,” replied the librarian. “What did you hope for?”

“I’m not sure. But I was certainly expecting something magical. Maybe you were summoning the scrolls or could teleport yourself to the correct shelf. But … this seems like cheating.”

“Cheating?” The librarian laughed. “Do you realize how much effort goes into making this efficient enough to appear magical? You can’t pick any twenty scrolls. They have to be the right twenty scrolls.”

At the news of this challenge, the apprentice perked up a bit. “And how do you pick them? Telepathy?”

“When I first started, I kept the most frequently read scrolls in my cache. Every time someone read a scroll, I would note that. I kept a count for each scroll.”

“I see why that would work,” said the apprentice, sounding slightly bored.

“Aha! But it didn’t work!” exclaimed the librarian. “Everything seemed to be going fine, until the same scrolls got stuck in my cache. Remember that massive flood we had two years ago? During the entire flood, I still had three scrolls on how to survive a drought. It was a disaster.”

“Then I had a new idea,” he continued. “I could keep more data and use statistics. I meticulously kept track of which scrolls people were viewing and correlated it to the season, weather, and fads of the day. I created complex predictive models of which scrolls would be most requested on a given day. I could pre-fill my shelf every morning!”

“Did that work?” asked Peter.

“Yes. But was a lot of work. So I stopped doing it.”

Peter stared at the floor. Disappointment flowed through him. He wondered if he should have become a blacksmith instead.

“Eventually, I got lazy and left the most recent scrolls on the shelf. When I needed to go down to the stacks, I’d grab whatever scroll had been used least recently and put it back. It turned out to be a pretty good system.”

“Really? So the magic behind the Library of Alexandria is that you keep the twenty most recently viewed scrolls on a shelf in the back room?”

“Yes. It allows me to answer 95% of people’s questions without going past this room.

The apprentice nodded numbly. Now he was certain that he should have become a blacksmith.