Monday, April 30, 2007

And you thought all the good domain names were taken?

Using only the magnetic letters stuck to my refrigerator, we were able to discover these promising, yet unregistered domain names! (imagine the media empire that could be built around that!) (quick doctor)

And venturing beyond the refrigerator, we also found these gems: (I went ahead and registered this one) (that has eight 'r's)

There's a goldmine in replacing 'for' with '4' and 'ate' with '8'.

Don't limit yourself to good ideas...

Saturday, April 28, 2007

Whose reality are you living in? Whose reality would you rather live in?

Mental frames are the biased and limited way in which information is perceived or understood. Because the human brain is inherently biased and limited, we are always in some mental frame. That frame determines how we relate to and understand reality. As far as you can tell, that frame is reality.

This is why the world seems better when we are happy, and worse when we are sad. It is also a reason that people on "the other side" of an issue seem so stupid, misinformed, and out of touch with reality. Not surprisingly, mental frames are often discussed with respect to politics. However, their scope and importance go well beyond that -- they ultimately determine not only our perceptions, but our whole mindset about what is valuable, practical, or dangerous, and what behaviors are responsible and acceptable.

If someone pursues their passions, are they boldly living life to the fullest, or are they simply being frivolous and irresponsible? The answer, of course, depends on your frame.

Perhaps you might prefer to be objective about reality and escape these limiting frames. Unfortunately, that's not really an option, at least not while we are stuck with these monkey brains inside our heads. Although it is nice to attempt objectivity, we must accept that we are human, and therefore limited. To deny that and claim true objectivity is to deny the truth and be stuck in a very limiting and annoying frame.

Mental frames help to explain why some people needlessly stay in bad jobs, bad relationships, or other bad situations -- in their reality, it makes sense. Clearly, some frames are better than others (from my perspective :).

Here's the good news: You can switch frames!

Many influences can shift your frame, such as reading books or taking a walk in the park, but one of the most powerful influences is the people around us. We tend to synchronize frames with our friends, family, co-workers, and anyone else we encounter (including the people on tv), though obviously some of those people are more influential than others.

Therefore, if you want to change your reality, change your surroundings. Find people with a more attractive reality, and live with them. This is very important. When you spend a lot of time with people, their reality becomes your reality.

For example, if you are interested in startups, but work in a big company, you are in danger. If you stay there too long, you will be drawn into the big company frame shared by the other "lifers". Startups will all seem too risky, frivolous, or impractical, and you'll spend the rest of your life in that big company (and posting bitter comments on TechCrunch).

Similarly, if you dream of pursuing some other career or lifestyle that is not shared by the people around you, then you either need to accept that it's not going to happen, or you need to change your situation.

Please note: I'm not suggesting that you completely cut off contact with people outside of your desired frame (that's how cults operate, btw). To the contrary, it's good to keep in contact with a wide variety of people -- that will help provide perspective and keep you from becoming trapped in your new frame (which might not be as great as it seemed from the outside). What I am suggesting is that your target frame should have significant representation in your life (like 50%), or at the very least, some minimal representation (5%) so that it doesn't completely fade from your reality.

Update: To better understand frames, think of a time when you were excited about an idea or possibility (or anything else), but when you shared that thought with someone else, they somehow ridiculed, doubted, or otherwise criticized it. How did that make you feel about the idea? Were you a little less excited, more doubtful, or perhaps even somewhat embarrassed about it? If so, you've just entered further into their frame -- their reality is becoming your reality. If you want to nurture your dreams, it's better to share them with people whose frame is compatible with the dream.

Read my previous post on reality. What am I doing to your frame?

Update Two: One of the commenters on news.yc suggests that if everyone really were in their own frame/reality, then it would be impossible to communicate or build products for other people. This is a good point, and it would be true if our frames were completely disjoint. Fortunately, they are not -- we always have something in common. However, the more our frames differ, the more trouble we have communicating. This is why it can be so difficult to communicate with a broad audience, such as on a blog, and why I've decided to write primarily for those with similar frames (because it's easier).

For more thoughts on how your mental frame affects your life, there are some interesting posts on Steve Olson's blog (though he uses the term 'belief system').

Tuesday, April 24, 2007

The secret to making things easy: avoid hard problems

That may seem obvious, but in my experience most engineers prefer to focus on the hard problems. Working on hard problems is impressive to other engineers, but it's not a great way to build successful products. In fact, this is one of several reasons why YouTube beat Google Video: Google spent a lot of time solving technically challenging problems, while YouTube built a product that people actually used (using PHP and MySQL, I think, which is not at all technically impressive).

For me, the most effective method of getting things done quickly is to cheat (technically), take a lot of shortcuts, and find an easier way around the problem (and before anyone jumps in with some comment about security or bank transactions, there are obviously a few exceptions). You only need to think ahead enough to avoid painting yourself into a corner, or have a plausible plan for escaping the corner. There's always an easier way -- work lazier, not harder. Note that this doesn't preclude doing things that SEEM difficult -- easy solutions to important problems that LOOK really hard are the best.

I was reminded of this while replying to the comments on news.yc in response to my post on disks and databases. Whenever anyone mentions the possibility of not using a conventional database, a lot of people will immediately reply that databases solve a lot of very difficult problems, and that you shouldn't put a lot of work into reinventing the wheel. These people are, of course, correct.

The thing is, a lot of those difficult problems are irrelevant for 99% of products. For example, "real" databases can handle transactions that are too large to fit in memory. That was probably a really important feature in 1980. Today, you can buy a computer with 32GB of memory for around $5000. How many GB transactions do you suppose Twitter performs? My guess is zero -- I suspect that their average transaction size is closer to 0.0000002 GB (messages are limited to 140 characters).
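To put that guess in perspective, here is a rough back-of-the-envelope sketch. The 200-byte figure is simply the 0.0000002 GB guess above (in decimal gigabytes), and the variable names are made up for illustration.

```python
# Figures from the paragraph above, in decimal units (1 GB = 10**9 bytes).
memory_bytes = 32 * 10**9      # the 32 GB machine
transaction_bytes = 200        # 0.0000002 GB, the guessed average transaction

# How many such transactions would fit in memory at the same time?
transactions_in_memory = memory_bytes // transaction_bytes

print(f"{transactions_in_memory:,} transactions fit in memory at once")
```

In other words, a transaction that overflows memory is not a problem this kind of product is likely to have.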

I want to be perfectly clear about one thing though: I'm not advising you to ditch your database! If your database is plenty fast, then the easiest thing to do is probably "nothing", and that's what I advise you do. If, however, your db is getting slow or overloaded, then you need to do two things:
  1. Understand the problem
  2. Fix the problem
The correct solution to your problem will depend on your situation. For example, if you have some data that's very important but doesn't change very often (username and password), and some data that gets updated continually but doesn't have to be correct (last active time or hit counters), then a simple solution would be to leave the important data in your database and move the less important data into something really simple but less reliable.

Want an example of "simple but less reliable"? Here's one (in one or two easy steps):
  1. All updates go into memcached, but not the database
  2. (optional) A background process occasionally copies entries from memcached to the db. Without this, the values will be completely lost when memcached restarts.
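
Those two steps might look something like the sketch below. It is only an illustration: a plain Python dict stands in for memcached, an in-memory SQLite table stands in for the database, and the names `cache`, `record_hit`, `flush_to_db`, and `stored_count` are made up. A real setup would talk to memcached through a client library and run the flush from a separate background process.

```python
import sqlite3

# Stand-in for memcached: fast, but the contents vanish on restart.
cache = {}

# Stand-in for the "real" database (in-memory SQLite for the demo).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE hits (page TEXT PRIMARY KEY, count INTEGER)")


def record_hit(page):
    """Step 1: updates touch only the cache, never the database."""
    cache[page] = cache.get(page, 0) + 1


def flush_to_db():
    """Step 2 (optional): occasionally copy cached values to the db."""
    with db:  # one transaction for the whole batch
        db.executemany(
            "INSERT OR REPLACE INTO hits (page, count) VALUES (?, ?)",
            list(cache.items()),
        )


def stored_count(page):
    """Read the count the database currently knows about (0 if none)."""
    row = db.execute("SELECT count FROM hits WHERE page = ?", (page,)).fetchone()
    return row[0] if row else 0


for _ in range(1000):
    record_hit("/home")

before = stored_count("/home")   # the db has seen none of the updates yet
flush_to_db()
after = stored_count("/home")    # one batched write covers all 1000 updates
print(before, after)
```

The database sees one write instead of a thousand. The price is that anything recorded since the last flush is gone if the cache restarts, which is the "less reliable" part.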

Monday, April 23, 2007

The problem with conventional databases

They were designed for old computers, and a lot has changed in the last 10 or 20 years.

Here are stats on some random hard disk from 1991, the Seagate ST-2106N:
Capacity: 106 MB
Average seek: 18 ms
Max full seek: 35 ms

And here are some numbers from a newer drive, the Seagate ST3500830NS:
Capacity: 500,000 MB
Random write seek: less than 10 ms

In the past 15 years, hard disk capacity has increased about 5000x. However, seek time has improved only about 3.5x (from a 35 ms full seek to under 10 ms).

In those same 15 years, DRAM capacity has also increased dramatically, probably more than 5000x. My friend just bought a computer with 32 GB of RAM -- that's roughly 300x the capacity of the 1991 hard disk!
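
A quick check of those ratios, using only the figures quoted above. The variable names are made up for illustration, and the 32 GB of RAM is counted in decimal megabytes to match the way disk capacities are quoted.

```python
# Figures quoted above; units are MB and milliseconds.
disk_1991_mb = 106
disk_2007_mb = 500_000
full_seek_1991_ms = 35
seek_2007_ms = 10           # "less than 10 ms", so the real gain is a bit higher
ram_2007_mb = 32 * 1000     # 32 GB in decimal MB

capacity_gain = disk_2007_mb / disk_1991_mb
seek_gain = full_seek_1991_ms / seek_2007_ms
ram_vs_old_disk = ram_2007_mb / disk_1991_mb

print(f"disk capacity:          {capacity_gain:,.0f}x")
print(f"seek speed:             {seek_gain:.1f}x")
print(f"32 GB RAM vs 1991 disk: {ram_vs_old_disk:,.0f}x")
```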

Here's another important stat:
1991: strlen("Paul Buchheit") = 13
2007: strlen("Paul Buchheit") = 13

Surprised? Probably not. My point? Our databases are still storing a lot of the same things they were in 1991 (or 1970).

The problem? Most databases are still storing my name using the same techniques that they did 15 or more years ago -- they seek the disk head to some specific location and then read or write my name to that location (this seek can be deferred with write ahead logging, but it will need to happen eventually). These databases rely on the one operation that DID NOT dramatically improve in the past 15 years! They still perform like it's 1991.

Here's another way of looking at what has happened:

Relative to storage capacity, seeks/sec is approaching zero. Disks should no longer be thought of as "random access" devices.

Some people try to beat that exponential curve by simply buying more disks. Here are a few more stats to consider (actual measurements from my computer):
Disk - sequential read/write: 52 MB/sec
Disk - random seeks: 100/sec
DRAM - sequential read/write: 3237 MB/sec
DRAM - random read/write: 10,000,000/sec

For sequential access, DRAM is about 62x the speed of disk -- disk is way slower, but by less than two orders of magnitude.

However, for random access, DRAM is 100,000 times the speed of disk! Buying and maintaining 100,000 disks would certainly be a hassle -- I don't recommend it.
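
To make that gap concrete, here is a small sketch built on the measurements above. The workload of one million small random reads is a made-up example, not something I measured.

```python
# Measurements quoted above.
disk_seq_mb_s = 52
disk_random_ops_s = 100
dram_seq_mb_s = 3237
dram_random_ops_s = 10_000_000

seq_ratio = dram_seq_mb_s / disk_seq_mb_s
random_ratio = dram_random_ops_s / disk_random_ops_s

# Hypothetical workload: one million small random reads.
n_ops = 1_000_000
disk_seconds = n_ops / disk_random_ops_s
dram_seconds = n_ops / dram_random_ops_s

print(f"sequential: DRAM is {seq_ratio:.0f}x disk")
print(f"random:     DRAM is {random_ratio:,.0f}x disk")
print(f"{n_ops:,} random reads: disk {disk_seconds / 3600:.1f} h, DRAM {dram_seconds:.1f} s")
```

At 100 seeks per second, the disk needs close to three hours for work that DRAM finishes in a fraction of a second.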

What can you do instead of buying 100,000 disks? Keep your data in memory, or at least all of the small items, such as names, tags, etc. Fortunately, there are some easy tools available for doing that, such as memcached. If your db is rarely updated, that may be enough. However, if you also have frequent updates, then your database will be back to thrashing the disk around updating its b-trees, and you will be back to 1991 performance. To fix that, you may need to switch to a different method of storing data, one which is log based (meaning that the db updates are all written to sequential locations instead of random locations). Maybe I'll address that in another post.
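
In the meantime, here is a toy sketch of the log-based idea, not a real storage engine. Every write is appended to the end of one file, and an in-memory dict holds the latest value for each key. The class name `AppendOnlyStore`, the tab-separated line format, and the temp-file path are all assumptions made for the example. It also ignores log compaction, crash safety, and keys or values that contain tabs or newlines.

```python
import os
import tempfile


class AppendOnlyStore:
    """Toy log-based key/value store: every write is appended to one file."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> latest value, kept in memory
        self._replay()
        self.log = open(path, "a", encoding="utf-8")

    def _replay(self):
        # On startup, rebuild the in-memory index by reading the log in order.
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                key, _, value = line.rstrip("\n").partition("\t")
                self.index[key] = value

    def put(self, key, value):
        # Sequential write: always at the end of the file, never in place.
        self.log.write(f"{key}\t{value}\n")
        self.log.flush()
        self.index[key] = value

    def get(self, key):
        # Reads are served from memory, so they never touch the disk.
        return self.index.get(key)

    def close(self):
        self.log.close()


path = os.path.join(tempfile.mkdtemp(), "names.log")
store = AppendOnlyStore(path)
store.put("user:1", "Paul Buchheit")
store.put("user:1", "Paul B.")      # an update is just another appended line
store.close()

reopened = AppendOnlyStore(path)    # the index is rebuilt from the log
print(reopened.get("user:1"))
reopened.close()
```

The old value is still sitting in the log after the update. A real system would eventually have to compact the file, which is one of the things this sketch skips.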

Finally, one more interesting stat: 8 GB of flash memory costs about $80.

Flash has some weird performance characteristics, but those can be overcome with smarter controllers. I expect that flash will replace disk for all applications other than large object storage (such as video streams) and backup.

Update: I'm not suggesting that everyone should get rid of their databases and replace them with some kind of custom data storage. That would be a big mistake. If your database is working fine, then you shouldn't waste much time worrying about these issues. The intent of this post is to help you understand the performance challenges faced by databases that rely on disk. If your database is having performance problems, the correct solution will depend on your situation, and may be as simple as tuning the db configuration.

I also neglected to mention something very important, which is that most databases have a configurable buffer cache. Increasing the size of that cache may be one of the easiest and most effective ways to improve db performance, since it can reduce the number of disk reads.

Saturday, April 21, 2007

What if I could not only change the way things are, but the way things have always been?

I could make the sky blue and pumpkins orange. I could make silly words, like 'snorkel', and silly species, like monkeys. I could make flowers beautiful, and babies cute. I could make you read these words, right now.

Of course it would be very difficult to prove this to anyone, including myself. You see, once I make the sky blue, it has always been that way. Even if I made something really crazy or unlikely, that would just be the way things are. You could challenge me, saying, "Change the calendar, so that one of the months has 28 days," but once I do, it was already like that (and of course there would be some reason), and the challenge would be something else.

It's tempting to think, "There's no way to prove it one way or the other, so it doesn't matter," but that would be saying that the truth does not matter. It would mean that we are closing off all aspects of reality that our mind can not understand or measure.

I'm not telling you to abandon measurement or believe in these abilities -- that would be foolish. Instead, simply ponder the possibilities. This is not about belief -- it's about disbelief. Reality is larger than we can possibly comprehend.

Are you certain of something? If so, is it possible that you aren't seeing the big picture? Perhaps you would change your mind if your understanding were a little broader.

As you consider this, you may begin to sense gaps in your reality. If certainty is gone, then nothing is definitely impossible. Maybe invention is a simple matter of observing what has always existed, and change happens when you notice parts of your self that were there all along. Maybe big ideas are only impractical for those who lack vision and imagination.

Monday, April 16, 2007

And there it is. Microsoft complains about Google being anti-competitive.

Things can change quickly. What was a crazy prediction a few years ago just happened. Which of today's little startups will be the target of anti-trust complaints in a few years?

From today's nyt:
Microsoft, a veteran defendant of epic antitrust battles in the United States and Europe, is urging regulators to consider scuttling Google's plan to buy DoubleClick, an online advertising company.

Microsoft contends that the $3.1 billion deal, announced on Friday, would hurt competition in the fast-growing market for advertising on the Web and raises questions about how much personal information would be collected by Google, already a dominant player in online advertising.

Bradford L. Smith, Microsoft's general counsel, said in an interview yesterday that Google's purchase of DoubleClick would combine the two largest online advertising distributors and thus "substantially reduce competition in the advertising market on the Web."

Sunday, April 15, 2007

1 - 2 + 3 - 4 + 5 - 6 + ... made easy

One of the top links on reddit right now is about the infinite series 1 - 2 + 3 - 4 + 5 - 6 + ...

The Wikipedia entry is interesting, but it involves some math that I don't quite understand. Apparently the other reddit readers don't understand it either, since one of the top comments there says, "Really, there's no sum (the sum is undefined). 1/4 is just some number that mathematicians can use to compare sequences like that."

The answer 1/4 is a little less arbitrary than that!

Instead, try simply doing the arithmetic:
1 = 1
1 - 2 = -1
1 - 2 + 3 = 2
1 - 2 + 3 - 4 = -2
1 - 2 + 3 - 4 + 5 = 3
1 - 2 + 3 - 4 + 5 - 6 = -3

See the pattern? It's 1, -1, 2, -2, 3, -3, 4, -4, 5, -5...

What happens if we plot those points on a graph and then draw separate lines for the positive and negative numbers? (first point at position [0,1], the second at [1,-1], third at [2,2], fourth at [3,-2], etc)

The line of positive numbers is defined by the equation y = x/2 + 1
The line of negative numbers is defined by the equation y = -x/2 - 0.5

Notice that the two lines intersect at y = 0.25, which also happens to define the line midway between the other two lines (since they have slopes of +1/2 and -1/2).

We get the same value if we "average" the two equations, like so:
(y + y) / 2 = ((x/2 + 1) + (-x/2 - 0.5)) / 2
y = (x/2 - x/2 + 1 - 0.5) / 2
y = 1/4


I don't know if a real mathematician would approve of my method, but at least it's easy to understand.
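
For anyone who wants to check the arithmetic, here is a small sketch that recomputes the partial sums, the two lines, their intersection, and the "average". It uses exact fractions to avoid rounding noise. The helper `line_through` and the variable names are made up for illustration.

```python
from fractions import Fraction

# Partial sums of 1 - 2 + 3 - 4 + ...
partial_sums = []
total = 0
for n in range(1, 13):
    total += n if n % 2 else -n
    partial_sums.append(total)

# Points are (position, partial sum), with the first point at position 0.
points = list(enumerate(partial_sums))
positive = [(x, y) for x, y in points if y > 0]
negative = [(x, y) for x, y in points if y < 0]


def line_through(p, q):
    """Return (slope, intercept) of the line through two points."""
    (x1, y1), (x2, y2) = p, q
    slope = Fraction(y2 - y1, x2 - x1)
    return slope, y1 - slope * x1


m_pos, b_pos = line_through(positive[0], positive[1])   # y = x/2 + 1
m_neg, b_neg = line_through(negative[0], negative[1])   # y = -x/2 - 1/2

# Where the two lines cross.
x_cross = (b_neg - b_pos) / (m_pos - m_neg)
y_cross = m_pos * x_cross + b_pos

# "Averaging" the two equations: the slopes cancel, leaving a constant.
avg_slope = (m_pos + m_neg) / 2
avg_intercept = (b_pos + b_neg) / 2

print(partial_sums)
print(x_cross, y_cross)
print(avg_slope, avg_intercept)
```

Both routes land on 1/4. For what it's worth, one standard justification for that value is Abel summation: the power series 1 - 2x + 3x^2 - 4x^3 + ... equals 1/(1+x)^2 for |x| < 1, and that tends to 1/4 as x approaches 1.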

Saturday, April 14, 2007

Google buys DoubleClick, for a double-dose of the advertisin'

I just wanted to say that. ("His name is Upgrayedd, spelled with two D's, for a double dose of the pimpin'... You see gentlemen, a pimp's love is very different from that of a square." Makes for an interesting analogy, eh?)

Here is the new order:
  • Microsoft is the new IBM (which is to say, mostly irrelevant)
  • Google is the new Microsoft (but I like the new one better)
  • Facebook is the new Google
And Yahoo? They are the new Lotus. They'll continue their slide into total irrelevance and will eventually be bought by the new IBM (Microsoft).

Back when Google was tiny and Microsoft was terrifying, I used to tell people that eventually Microsoft would be making anti-trust complaints against Google. Most people felt that I had a very weak grasp on reality. Now I'm hearing grumblings. After all, it's not like Microsoft has a real shot at direct competition -- they are behind and continue to lose ground.

Please note: I do NOT work for Google, or anyone else for that matter.

Update: Some people seem to be taking my "G is the new MS", "FB is the new G", etc statements a little too literally. I'm not suggesting that they are literally becoming those other companies, but rather that they are taking their place in the tech ecosystem, in the sense of X is the new Y.

"Perfect" is the enemy of "good enough", and

I hope that you already know that. Perfectionism is a disease. It stops progress and drives us crazy. Perfect is unreachable.

But a while back I had a strange dream, and when I awoke I realized something else: "Good enough" is the enemy of "At all"!

Now I'm not suggesting that quality doesn't matter -- sometimes "good enough" or even "nearly perfect" is very important (brain surgery comes to mind). But more often, "good enough" isn't actually necessary and gets in our way. "Good enough" stops us from ever getting started in the first place.

Forget about your lack of talent, skills, knowledge, time, resources, or whatever else you need to be "good enough". Start an inane blog, take bad photographs, upload boring videos to YouTube, write bad software, create useless products, play bad music, and make ugly art. Forget "good enough", and then simply indulge in the joy of creation.

What did I experience in my dream? Passive consumption is the boring old form of entertainment. Joyous creation is the future of entertainment!

And who knows, maybe after we break through the static friction of quality, we'll discover that some of our work really is good enough, or maybe even great. But remember, although quality is nice, it's not the point.

The self-styled curmudgeons will continue to complain about all this senseless creation, but don't mind them -- they are simply flinging feces through the bars of their monkey cages. They are annoying but irrelevant -- they are not the ones who create the future.

I like xkcd.

Friday, April 13, 2007

Webserver in bash

And not using perl or any of that fancy stuff. It's the inane things that keep me awake at night.

Getting nc to behave turned out to be the most difficult part. It won't exit until both ends of the connection are closed. Correction, making blogger not mangle this code was the most difficult part.

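# -- http://localhost:9000/hello?world

# FIFO that carries the response back to nc. The path is an assumption;
# the original definition of RESP is missing from this listing.
RESP=/tmp/webresp
[ -p $RESP ] || mkfifo $RESP

while true ; do
( cat $RESP ) | nc -l -p 9000 | (
# Read header lines until the blank line ("\r" sorts before " ").
REQ=`while read L && [ " " "<" "$L" ] ; do echo "$L" ; done`
# Log the request line with a timestamp.
echo "[`date '+%Y-%m-%d %H:%M:%S'`] $REQ" | head -1
# Send the request headers back as the response body.
cat >$RESP <<EOF
HTTP/1.0 200 OK
Cache-Control: private
Content-Type: text/plain
Server: bash/2.0
Connection: Close
Content-Length: ${#REQ}

$REQ
EOF
)
done
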
Update: Fixed script so that it also works on Linux, where tr lacks the -u option.
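
The odd-looking test `[ " " "<" "$L" ]` in the script is what finds the end of the request headers. HTTP header lines end in `\r\n`, so after `read` strips the newline, the blank line that ends the headers is just `"\r"`, which sorts before a space. Here is the same check sketched in Python. The function name `read_request_head` and the sample request are made up for illustration.

```python
def read_request_head(lines):
    """Collect lines until the first one that does not sort after a space.

    Mirrors the shell test [ " " "<" "$L" ]: the line that ends an HTTP
    header block is empty or a lone carriage return, and both compare
    lower than " ".
    """
    head = []
    for line in lines:
        if not (" " < line):
            break
        head.append(line)
    return head


raw = "GET /hello?world HTTP/1.0\r\nHost: localhost:9000\r\n\r\nbody\r\n"
lines = raw.split("\n")     # like `read`, this leaves the trailing "\r"
head = read_request_head(lines)
print(head[0].strip())
```

Everything after the blank line is ignored, which is fine for a server that only ever answers simple GET requests.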