code quality vs. economics part II

The last time I wrote about this, the problem I had encountered was an SQL query of the form select * from table where conditions. Then, later, the Perl code would strip out more rows, filtering that could have been done with a LIKE in the SQL.
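
To sketch the pattern (with made-up table and column names and a made-up filter, not the actual code), it looked something like this:

    # Fetch everything matching the broad conditions, then filter
    # by substring in Perl instead of in the SQL.
    use DBI;

    my $dbh = DBI->connect('DBI:mysql:database=appdb;host=dbhost',
                           'user', 'pass', { RaiseError => 1 });

    my $rows = $dbh->selectall_arrayref(
        'SELECT * FROM clients WHERE active = 1',
        { Slice => {} },    # return each row as a hashref
    );

    # The extra filtering that could have been a LIKE in the SQL.
    my @matches = grep { $_->{hostname} =~ /corp/ } @$rows;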

Everybody decided that it would take just as long, and probably longer, for the database server to retrieve the larger dataset, send it across the network, and have it processed on the webserver than if it had just done the LIKE itself.

I will not argue this point. It most likely is faster for a single query to do it the 'right' way.
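
For comparison, the 'right' way would push the filter into the SQL so that only matching rows ever leave the database (same made-up names and $dbh handle as the sketch above):

    # Let the database do the filtering, so only matching rows
    # cross the network.
    my $matches = $dbh->selectall_arrayref(
        'SELECT * FROM clients WHERE active = 1 AND hostname LIKE ?',
        { Slice => {} },
        '%corp%',    # bound value for the LIKE pattern
    );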

The problem here is scale. We have 5000+ corporate clients, all banging away on one MySQL database. The thing to consider is that the database server has many components, and some are easy to add or make faster (RAM, RAIDed hard drives), while others are not: CPUs.

The reason it is more cost effective is that any time you free up the CPU to do other tasks, it's a win. So, with the transfer of data mostly relegated to the hard drives and network, with minimal action required from the CPU, the CPU is freed to perform another task for another queued request.

Basically we are moving CPU cycles from one machine to another, with all the network and drive overhead that entails, because it is not cheap to add another mirrored database server. Load-balanced web servers, however, are much cheaper and easier to come by.

I hope that this clears up the confusion. Is it faster to do it this way? Almost certainly not. 
However, there is an economic benefit. That benefit is difficult to measure, and honestly it is countered by maintenance costs, but that is beyond the scope of this entry at this time.


2 Comments

I still think that this is going to be very slow. You will at the very least be sending all that data over the network from your overworked database to the webserver. Even if your CPU is running at 1%, if the other webservers can't get to the database because it's busy sending huge amounts of data, you are going to have severe performance and probably even stability issues.

But to take a step back, you make it sound like these corporate clients have access to the actual database, and not some form of API; if that's the case they will probably do normal queries (not client-side) and "take down your database." But personally I don't worry about these issues till something actually happens. Databases are fast.

The customers do not have direct access to the database, but the appliance sent to them does. These 5000+ appliances are inserting into/updating the same MySQL database. There is some type of buffering here, which I haven't seen, that delays the inserts by up to 20 minutes but prevents the server from going down.

The access that they have is via the web interface on the webservers in the same datacenter as the database server.

The issue at hand is that the webservers are idle, or can easily be purchased so that they are. There is also enough bandwidth between the web and database servers that the connection does not saturate, so the processing of non-indexed rows gets moved to the webserver. Sure, this is slower for a single query, but it frees up the database processor to start parsing/executing a new query.

What it comes down to is money. Sure, it would be faster/easier to buy another database server so that the string-parsing queries could be handled on the box, but the expense/expertise needed to maintain that setup is much greater than moving the data to the webservers and letting them handle the load.

As you said, "I don't worry about these issues till something actually happens".

I am new at the company, and found it interesting that the long arm of economics made it into my lines of code.
