Boost your server performance with HTTrack (continued)
All this work had to be done for every page, every time a client requested it. The Web server would gasp under the load, especially with earlier versions of the Microsoft XML parser (more recent versions of MSXML perform much better). This isn't an issue when traffic is light, but under heavy traffic you need very fast servers to keep generating those dynamically constructed pages.
This situation didn't appeal to the part of me that loves high-performance computing applications. Why should the Web servers keep rebuilding the same pages over and over? The underlying content of many Web sites doesn't change that often, so repeatedly building the same page and returning the same result seemed inefficient. I wanted to find a solution that would allow sitegarden/xml Web sites to work under even the highest traffic load.
A helpful utility
After thinking about it for a while, I decided that I needed a utility that would slurp the entire sitegarden/xml site down into static HTML pages. As static HTML, the site could be run on any Web server at all. The original Lotus Domino and Microsoft IIS content management servers wouldn't be needed for production usage and could be used only for creating and maintaining the site content. As static HTML, the site could run on any of the servers listed in the SPECweb99 results, so it could be hosted on the fastest Web server configurations available.
Such a utility is often called a "site mirroring" application. I embarked on an extensive search of the Web for such a utility and found many of them. For two full weeks I downloaded utility after utility and tried them out. None of them were good enough, and they all had one problem or another that made them useless for my purposes. It was frustrating and annoying. Surely it isn't that hard to mirror a Web site.
Site mirroring utilities work by placing a request to the Web server for a page. They then examine the HTML returned by the Web server and re-map all the links in the HTML so that they are local links. Next they grab the images and other resources on the page and store them locally. They then write the HTML page and associated resources out to disk. Next they visit all of the links on the page that point to other pages within this site and repeat the process. Eventually the entire site has been mirrored and can be browsed locally or uploaded to any Web server that can serve HTML pages.
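The loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not how HTTrack or any other mirroring utility is actually implemented. The names `mirror`, `local_name`, and `LinkCollector` are hypothetical, the flat file-naming scheme is an assumption, and the page fetcher is passed in as a function so the example can run without a network. For brevity it follows only `<a href="...">` links written with double quotes and skips images and other resources.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href values of <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def local_name(url):
    """Map a URL to a flat local file name (hypothetical naming scheme)."""
    parsed = urlparse(url)
    raw = parsed.path + ("_" + parsed.query if parsed.query else "")
    safe = re.sub(r"[^A-Za-z0-9]+", "_", raw).strip("_") or "index"
    return safe + ".html"


def mirror(start_url, fetch):
    """Crawl one host breadth-first.

    fetch(url) must return the page's HTML as a string.
    Returns a dict mapping local file names to rewritten HTML.
    """
    host = urlparse(start_url).netloc
    queue, seen, pages = [start_url], {start_url}, {}
    while queue:
        url = queue.pop(0)
        html = fetch(url)
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            target = urljoin(url, href)
            if urlparse(target).netloc != host:
                continue  # external link: leave it untouched
            # Re-map the link so that it points at the local copy.
            html = html.replace('href="%s"' % href,
                                'href="%s"' % local_name(target))
            if target not in seen:
                seen.add(target)
                queue.append(target)
        pages[local_name(url)] = html
    return pages
```

To mirror a live site, `fetch` could wrap `urllib.request.urlopen`, and each entry of the returned dictionary would be written to disk under its local file name.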
Fortunately, I eventually found precisely the utility that I was looking for: HTTrack Website Copier (at http://www.httrack.com). I tried it out, and it ran beautifully, grabbing the home page, converting it to a local HTML file, and then effortlessly working its way through the rest of the site until the entire site had been copied. Most impressive was that HTTrack had no problem with links contained within JavaScript code, something that none of the other site mirroring utilities I tried could handle properly.
Final thoughts
I did have one issue with HTTrack, however: file naming. Sitegarden/xml pages are identified by the Lotus Domino UNID (unique identifier), a 32-character hex number passed as a URL parameter when loading a page. I needed the mirrored HTML file names to include the UNID, but HTTrack didn't do this. I emailed Xavier Roche, one of the HTTrack developers, and asked him if he could include such a feature. Before long, a new beta version was released with exactly this functionality. Impressive stuff.
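To make the naming requirement concrete, here is a minimal Python sketch of deriving a file name from a UNID. It is not HTTrack's implementation or its option syntax. The function name `unid_file_name`, the `page_` prefix, and the example URL are all hypothetical. The only thing taken from the text above is that the UNID is a 32-character hex number that appears in the page URL.

```python
import re

# A run of exactly 32 hex digits, not embedded in a longer hex run.
UNID_RE = re.compile(r"(?<![0-9A-Fa-f])[0-9A-Fa-f]{32}(?![0-9A-Fa-f])")


def unid_file_name(url, prefix="page"):
    """Return a file name containing the UNID found in url, or None."""
    match = UNID_RE.search(url)
    if match is None:
        return None  # no UNID in this URL; caller picks another name
    return "%s_%s.html" % (prefix, match.group(0).upper())
```

Under these assumptions, a URL such as `http://example.test/site.nsf/page?unid=0123456789abcdef0123456789ABCDEF` would map to `page_0123456789ABCDEF0123456789ABCDEF.html`, so each mirrored file can be traced back to its Domino document.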