June 16, 2009

It was the summer of 2000 when we first learned about Sybase IQ and its revolutionary column vector database technology. As long-time Sybase ASE and Oracle DBAs, we were used to database engines that organize data row by row. For quite some time we had difficulty thinking in columns rather than rows.

A column vector database requires a totally different methodology for performance tuning. Nothing is straightforward, and the message that more data volume doesn’t make a difference to query performance is not easy to grasp. For example, a traditional database engine typically uses only one index per table in a given query. Sybase IQ has no such limit. If each column in the query requires a different index, it will use a different index for each. In fact, by default every column is an index.
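To make the "every column is an index" idea more concrete, here is a minimal sketch in Python. It is a toy model and not how Sybase IQ is implemented internally: the table, the column names and the helper functions `build_index` and `sum_amount` are all made up for illustration. It shows a query that combines one index per filtered column and then reads only the single column it needs.

```python
# Toy model of column-oriented storage. This is an illustration only and
# does not reflect Sybase IQ internals; all names here are hypothetical.

# The same small table, first in the familiar row-by-row layout.
rows = [
    {"id": 1, "region": "EU", "status": "open", "amount": 120},
    {"id": 2, "region": "US", "status": "open", "amount": 80},
    {"id": 3, "region": "EU", "status": "closed", "amount": 50},
    {"id": 4, "region": "EU", "status": "open", "amount": 30},
    {"id": 5, "region": "US", "status": "closed", "amount": 200},
]

# Column layout: one list of values per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def build_index(values):
    """Map each distinct value to the set of row positions holding it."""
    index = {}
    for position, value in enumerate(values):
        index.setdefault(value, set()).add(position)
    return index


# One index per column, so a query can use as many as it has predicates.
indexes = {name: build_index(values) for name, values in columns.items()}


def sum_amount(region, status):
    """SUM(amount) WHERE region = ... AND status = ..., using two indexes."""
    matches = indexes["region"].get(region, set()) & indexes["status"].get(status, set())
    # Only the 'amount' column is read; 'id' is never touched.
    return sum(columns["amount"][position] for position in matches)


total = sum_amount("EU", "open")
```

In this sketch the two predicates are answered by intersecting two sets of row positions, and the row-based list is never scanned. A row-oriented engine would have to read every full row that its one chosen index could not rule out.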

Getting our heads around the fact that queries can run up to 1,000 times faster on Sybase IQ than on traditional row-based RDBMS systems is no easy matter either. Of course, similar results can be achieved in an Oracle implementation with the OLAP technology. However, you are paying for the underlying OLTP engine whether or not you’re using it. Sybase IQ doesn’t have this overhead.

One of the key features of Sybase IQ is its data compression. We have worked with Sybase IQ systems that easily exceeded an 80% compression ratio. Every database vendor has since introduced data compression into its database engine, but Sybase IQ remains the undisputed leader in compression ratio.
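Why does column storage compress so well? Values within a single column tend to repeat, and repeated values are cheap to encode. The sketch below uses run-length encoding, a generic textbook technique that we picked for illustration; we are not claiming it is the algorithm Sybase IQ uses, and the column data and function names are made up.

```python
# Run-length encoding of a single column. A generic technique chosen for
# illustration; this is an assumption, not Sybase IQ's actual algorithm.


def run_length_encode(values):
    """Collapse consecutive repeats into [value, count] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


def run_length_decode(runs):
    """Expand [value, count] pairs back into the original column."""
    values = []
    for value, count in runs:
        values.extend([value] * count)
    return values


# A hypothetical 'status' column: few distinct values, long repeats.
status_column = ["open"] * 6 + ["closed"] * 3 + ["open"]

encoded = run_length_encode(status_column)
decoded = run_length_decode(encoded)
```

Here ten stored values shrink to three runs, and decoding restores the column exactly. In a row layout the same values are interleaved with ids, amounts and dates, so such long runs rarely occur.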

This post is not meant to explain how Sybase IQ works or why it is so superior in analytical query processing compared to its row-based counterparts.

We know that there are other data warehouse systems out there that are as fast as Sybase IQ, and some are even faster, but in this article we focus on the Sybase IQ engine and its recent setting of a new TPC-H benchmark record. This record is all about saving money while providing blazing fast performance.

OK, back to what’s new in version 15 of Sybase IQ.

There are two major improvements in the new release that are worth mentioning.

1. The overall query performance was once again dramatically improved and yields an average performance gain of 20% to 50% compared to the previous Sybase IQ release.

What does this mean for your business?

Analytical queries are typically CPU hungry monsters that can eat up your entire processing resources. Producing results faster means more queries will be processed in the same time window.

It also means the hardware upgrade can be postponed for a while. Moving an entire production system to a new hardware platform carries QA requirements that can be very expensive and, combined with the cost of the new hardware, may not be worth the investment. In comparison, a standalone upgrade of the database engine might be worth the effort.

It further means that cheaper server hardware on Linux can be used to build Sybase IQ multiplex systems that produce high end performance results on a slim budget. Due to Sybase IQ’s architecture there are no added network constraints either.

2. Multiple writer nodes in a single multiplex environment.

This is an enormous step forward. Previously, a typical Sybase IQ multiplex was built with one big server acting as the writer node and many smaller servers as the reader nodes. The thinking was to give the best hardware to the CPU-intensive load jobs in order to minimize the load windows. The downside of this architecture was that in a failure situation, one of the smaller servers would take over the writer role and would then be hopelessly overwhelmed if the writer node couldn’t be fixed in time for the next load.

It is also not economical to devote high-end, expensive server hardware to a job that lasts only a fraction of the daily workload. Having multiple writer nodes solves this problem once and for all. Using all the available processing power in a multiplex environment ultimately leads to faster loads, without upgrading the writer node’s server hardware over and over again.

Another data load performance improvement is the new feature of loading data directly from clients. This means that data can be loaded from client-side files using a simple SQL statement, instead of copying data files onto a server and then using the bcp command.

Of course there are other major improvements in security, flexibility and integration support, but the two improvements above are the major contributors to any cost savings or cost avoidance initiative a business is taking on these days.

Sybase also improved its client applications to make it easier to manage Sybase IQ, develop applications for it and monitor it effectively. Once the Achilles heel of Sybase, these tools are now very usable and mature.

From a cost/performance point of view, Sybase IQ is a force to be reckoned with, and due to its column vector architecture there is no other major database engine on the market like it. Sybase’s strong technology is matched by its financial results: the company had its best financial year ever in 2008 and its best quarter on record in Q1 2009.

We hope you enjoyed our brief introduction to Sybase’s data warehouse engine, Sybase IQ, and the features of its latest version 15.