Tweaking EntriesPerRebuild for Faster Rebuilds
Some time ago, before Movable Type 3.2 was released, I wrote about a number of ways to optimize static page publishing performance. I've been spending some time looking through this logic, and I'm happy to report that strides have been made to make MT's static page publishing even faster. Some of those tips are no longer necessary or relevant because of these enhancements. For instance, previous/next entry data is now loaded and cached more efficiently, avoiding the excessive (and slow) number of database lookups that were once required. There is still room for improvement, but things are heading in the right direction. I'm optimistic this will continue as rumors of a new version circulate throughout the community.
While studying the rebuild code in MT 3.2 I happened upon a performance enhancement that can decrease the rebuild times of individual archives. Perhaps this has been known to others and not publicized well enough.
What I realized over the weekend (and should have known, having pored over MT's code dozens of times) is that each batch of individual entries is a separate request to the server. If you are running under CGI, as most are, MT has to be loaded and all data (entries, templates, etc.) has to be fetched from the database on every one of those requests. The fewer times MT has to do this, the less time and fewer resources are needed.
The trick is to increase the number of entries per rebuild (EntriesPerRebuild) in your system configuration file, mt-config.cgi. By default MT rebuilds in batches of 40 entries. If you are like me, you left it at this default and have potentially been missing out on improved rebuild times.
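As a rough illustration, the change is a single line in mt-config.cgi, which uses one directive name followed by its value per line. The value of 100 here is only an example for the sake of illustration, not a recommendation; the right number for your setup depends on the factors discussed below.

```
# mt-config.cgi
# Rebuild individual entries in batches of 100 instead of the default 40.
# (100 is an illustrative value; tune it for your own server and templates.)
EntriesPerRebuild 100
```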
Building individual entries in batches is completely understandable and necessary. It is not a flaw or bug in Six Apart's design. The problem lies in the nature of CGI applications and browser timeouts.
A CGI application is only live while it is handling a request. It starts up, does some processing on the request, returns a response and shuts down. If you have a significant number of entries in your weblog, the processing time required will also be significant. A browser will only wait a certain amount of time (the timeout period) before it gives up, assuming some network problem exists. Once the browser quits, the rebuild process never completes. By breaking the process into small pieces, this condition can be avoided and valuable feedback provided to the user along the way.
The default of 40 is a pretty conservative number, though. If you have optimized templates such as the ones that ship with MT, you can safely increase that number, thereby reducing the number of times MT has to start up and initialize itself. What number to use depends on several factors: your browser timeout, how optimized your templates are and, finally, your server's processing power. Some trial and error may be necessary depending on how far you push this. The danger in this optimization is setting the number so high that a single batch takes longer than your browser timeout. So setting EntriesPerRebuild to 999999 is not advisable, though tempting.
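To get a feel for the trade-off, here is a back-of-the-envelope sketch in Python. It is not anything MT itself does, just a simple cost model: every request pays a fixed startup cost, and every entry costs a fixed amount of time to build. The function name `rebuild_plan` and all of the timing figures are made up for illustration; you would need to measure your own server to get real numbers.

```python
import math


def rebuild_plan(total_entries, entries_per_rebuild, startup_secs, secs_per_entry):
    """Estimate rebuild cost under a simple, hypothetical model.

    Returns (requests, secs_per_request, total_secs), where secs_per_request
    is the time for one full batch, the figure that must stay under the
    browser timeout.
    """
    if total_entries < 0 or entries_per_rebuild <= 0:
        raise ValueError("need total_entries >= 0 and entries_per_rebuild > 0")

    # Each batch is a separate request, so MT starts up once per batch.
    requests = math.ceil(total_entries / entries_per_rebuild)

    # Worst case for a single request: startup plus a full batch of entries.
    secs_per_request = startup_secs + entries_per_rebuild * secs_per_entry

    # Overall time: startup paid once per request, plus the entries themselves.
    total_secs = requests * startup_secs + total_entries * secs_per_entry
    return requests, secs_per_request, total_secs


if __name__ == "__main__":
    # Assumed figures, for illustration only: 1000 entries, 2 seconds of
    # startup per request, 0.1 seconds per entry.
    for batch in (40, 100, 250):
        requests, per_request, total = rebuild_plan(1000, batch, 2.0, 0.1)
        print(f"EntriesPerRebuild {batch}: {requests} requests, "
              f"~{per_request:.1f}s each, ~{total:.1f}s overall")
```

Under this model, larger batches cut the number of startups and so the overall time, but each individual request runs longer. The per-request figure is the one to compare against your browser timeout.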
