Tuesday, April 15, 2008

How does a lack of joins stop web developers from writing slow apps?

How does a lack of joins stop web developers from writing slow apps?

on a previous post about appengine, noodl asked a good question

How does a lack of joins stop web developers from writing slow apps? Surely having to write more queries or denormalising a schema makes things slower?

I was doing to write a long diatribe here (actually I did, and just deleted it).

by removing joins it forces the developer to work, and stop and think about what he is doing. writing that ‘cool’ 3-5-table join using all inner selects and left joins, while easy to type and runs great on a small dataset but usually sucks in performance. those queries can sometimes be rewritten into a set of SQL statements and a for-loop without much work by a developer. but usually these things are found about 1 week prior to launch (or 1 week after depending on if you ditched that performance QA phase when things were getting tight).

the usuall approach to these kind of things is even more ‘coolness’ you get to add a caching layer on top of your app. so instead of doing the 3-5 table join, you just get to do it for a single user every couple of hours when the cache expires, and add another layer of complexity for your operations team to maintain

The other approach I’ve seen used to great effect is pre-caching. in this example you can do it at the database layer. you take those 3-5 table join statements and just execute them in batch mode and store those results in a de-normalzed table. the web pages then reads that table instead.

so to answer your question directly. lack of joins stops web-developers from writing bad queries that the user have to wait on (and use memcache for) and they can either write the query using multiple selects (which can be faster in some cases, a wash in others, and slower in some as well) or query denormalized tables holding pre-computed results which is the fastest way.

updating the tables may be slower, but if you use async writing mechanisms the user wont be waiting for the write to occur. you may need to be a bit smart when a user has just updated something and your waiting for your writer to finish, but for most requests you can avoid that if you know the results will be there within a second/minute or so. or you can devise some sync/async writer where your update page updates some things so that the user knows it’s been updated, but his friends might see the update a bit later

No comments: