Problem 1:
Innovation is not only inventing a new device or program. Innovation can involve coming up with an optimal solution to an unusual problem or an unusual and complex combination of ordinary or unusual problems.
The Problem to Solve: Panic!!! The web server’s CPU load is at 80!!!
Shortly after I started at Clickmarks, my CEO and a couple of VPs called me late at night in a panic. An advertiser had mistakenly advertised our web site before the scheduled date, and our web server was not ready for prime time. The CPU load was at 80, one of the highest CPU loads I had ever seen on a Unix server of any kind.
Reducing the Immediate Impact of the Problem
As a stop-gap measure, we immediately modified the web server parameters to cut back the number of concurrent requests allowed on the system. Users were then able to use the web site, though performance was slow. Christine, our VP of Technology, and I then took a new four-processor system to our colocation facility at AboveNet in San Jose and installed it, tying it into our Coyote Point load balancer pair and into the Sun server that held our Oracle database.
Analysis for a Healthy Solution
Back at headquarters in Fremont the next day, we set up a machine to simulate load against a test web server using Silk. We monitored the web server and the database system under that load and narrowed the problem down to a single standard CGI program written in Perl.
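For readers who have not run a load simulation, here is a minimal sketch of the idea in Python. It is not what Silk does internally and not what we ran; the function names run_load and summarize, and the fake_page stand-in, are made up for illustration. It simply calls a "fetch" function many times from several threads and records how long each call takes.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(fetch, total_requests, concurrency):
    """Call fetch() total_requests times using `concurrency` threads.

    Returns a list of per-request latencies in seconds.
    """
    def timed_call(_):
        start = time.perf_counter()
        fetch()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed_call, range(total_requests)))


def summarize(latencies):
    """Reduce a non-empty list of latencies to a few headline numbers."""
    ordered = sorted(latencies)
    return {
        "count": len(ordered),
        "mean": sum(ordered) / len(ordered),
        "max": ordered[-1],
    }


if __name__ == "__main__":
    # Hypothetical stand-in for a slow page; a real test would fetch a URL.
    def fake_page():
        time.sleep(0.01)

    print(summarize(run_load(fake_page, total_requests=50, concurrency=10)))
```

To aim this at a real test server, fetch could wrap something like urllib.request.urlopen(url).read() for a test URL of your own. While the load runs, you watch CPU load on the web server and the database machine to see which one is struggling.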
This Perl program was about 14,000 lines long, including the main file and all the files it referenced. This meant that each time a user requested that web page, the Apache web server would bring Perl into memory, Perl would bring in the CGI program and all the files it referenced, do a just-in-time compile, and then begin running. But before it could do any real work, it would have to log into the database.
Some performance gains could be realized using mod_perl, since Perl would be in the web server, ready to go at all times. But the lion’s share of the load had nothing to do with bringing Perl or even the program into memory, as these tended to stay pinned in virtual memory anyway. The lion’s share of the load was the just-in-time compilation required before the program could begin work.
The Solution: FastCGI
Once the lead developer was made aware of the program causing the severe performance problem, we evaluated both mod_perl and FastCGI and installed both. He then modified the program to work with FastCGI.
This modification is generally very simple. Ordinary CGI programs are started by the web server for each request. Because this was a Perl program, the web server would pull in Perl, which then did a just-in-time compile of the source. In this case, the compile was not at all quick: the program with all its included files was approximately 14,000 lines long! Every time a user requested this web page, the web server had to bring in a copy of the Perl executable, Perl had to load and compile all the files in the 14,000-line program, and only then could the program log into the database and begin the actual work. Such a program is designed to start, log in, do its processing, and die.
Modifying a program like that to work with FastCGI is usually trivial. The programmer merely creates a small loop near the beginning of the program that calls a function which waits for the web server to give it some work. The program processes this work, but instead of dying, it returns to the top of the loop and waits for the next request to come in.
This means the 14,000-line compilation is done once when the web server is started. Also, the database login can be done once, and sometimes database connection pooling can be advantageous.
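The difference between the two designs can be sketched in a few lines. This is a Python simulation, not the original Perl and not real FastCGI; the Worker class and both serve functions are hypothetical names invented for the illustration. The "startup" stands in for the compile plus the database login, and the sketch just counts how often that cost is paid.

```python
import time


class Worker:
    """Stands in for the Perl program; all names here are hypothetical."""

    def __init__(self, startup_cost=0.0):
        # One-time work: compiling the source and logging in to the database.
        time.sleep(startup_cost)
        self.db_session = object()  # placeholder for a real database handle

    def handle(self, request):
        # Per-request work only; a real handler would reuse self.db_session.
        return f"handled {request}"


def serve_cgi_style(requests, startup_cost=0.0):
    """Plain CGI: start, log in, process one request, die."""
    startups = 0
    responses = []
    for request in requests:
        worker = Worker(startup_cost)  # full startup on every request
        startups += 1
        responses.append(worker.handle(request))
    return responses, startups


def serve_persistent(requests, startup_cost=0.0):
    """FastCGI style: start once, then loop waiting for work."""
    worker = Worker(startup_cost)  # startup paid once
    startups = 1
    responses = []
    for request in requests:  # stands in for "wait for the next request"
        responses.append(worker.handle(request))
    return responses, startups


if __name__ == "__main__":
    pages = ["/a", "/b", "/c"]
    print(serve_cgi_style(pages)[1])   # prints 3: one startup per request
    print(serve_persistent(pages)[1])  # prints 1: one startup in total
```

Both functions return the same responses. The only difference is how many times the expensive startup runs, which is exactly where our load was going.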
The Results: A 135-Fold Increase in Performance
We tested the performance after the change and found it had improved by a factor of 135. The new version of the program was deployed to production, and from that time on we needed two servers only to provide fail-over. No load ever came in so heavy that it could not be handled by a small fraction of the power of one server.
Problem 2:
Time to Staff Up!!!
Innovation is not limited to inventing a new device or solving an immediate IT problem. It can involve coming up with a solution to any kind of problem or complex combination of problems in an optimal way.
The company was new. I was employee number 11, and we were in a new building. We had no network connection and had been waiting for a proper T1 installation. I was hired as Director of Information Technology and Operations and was also called Chief Systems Architect. Next door was a company that was gracious enough to lend us a network line temporarily so we could have some connectivity to the outside world, namely our servers in San Jose.
Many companies and investors were excited about what our company had to offer, and with the venture capital we were receiving, there were Linux and Windows systems to purchase, set up, and network. There were firewalls, routers, and VPNs to install for security. System monitoring became vital as the company grew and split out into separate Development, QA, Support, Infrastructure, and Production environments. So my CEO and friend, Umair, came to me and said, “Dan, I’ll tell you what. You can work yourself to death trying to do this on your own, or you can get on the ball and staff up and build the IT department. It’s up to you.”
Well, Umair is a very honest and competent man, but it was not at all up to me. The IT department needed to be built. Desperately. The IT Department spawned the QA Department, and then the Support Department. And the company grew from 11 employees to about 85.
I never stopped being a hands-on manager, stepping into different roles to meet the company’s needs: budgeting, hiring, and, sadly, some firing; dealing with vendors to come up with standard workstation and web server configurations; stepping into a network architect role one moment and a Unix systems architect role the next; and drawing on my Windows administrators and outside network and VoIP experts. We had just begun to move into a little virtualization and Java/JSP work when I experienced a personal tragedy at home. My wife was having an affair and our marriage of ten years was coming to an end.
While dealing with the tragedy and attempting to save my family, I worked with other managers to migrate my managerial duties to other directors and managers. Then I worked to make sure the IT Department was in good shape for another IT Director to step in and take over, and I resigned.
I continued to visit Clickmarks over the next few years while it was still in Fremont. I have very fond memories of the company and stay in contact with some of the VPs and Directors who became close friends of mine during our great time together building a great new company.