Observing the NoSQL hype through the eyes of an SQL performance consultant is an interesting experience. It is, however, very hard to write about NoSQL because there are so many forms of it. After all, NoSQL is nothing more than a marketing term. A marketing term that works pretty well because it goes to the heart of many developers that struggle with SQL every day.
Archive for the ‘Performance’ Category
There was an interesting index related performance problem on Stack Overflow recently. The problem was to check an input string against a table that holds about 2000 prefix patterns (e.g.,
LIKE 'xyz%'). A fast select is needed that returns one row if any pattern matches the input string, or no row otherwise.
I believe my solution is worth a few extra words to explain it in more detail. Even though it’s a perfect fit for Use The Index, Luke it’s a little early to put it as an exercise there. It is, however, a very good complement to my previous article Analytic Top-N queries—so I put it here.
Although the problem was raised for a MySQL database, my solution applies to all databases that can properly optimize Top-N queries.
Today, I’d like to introduce my new Web-Book Use The Index, Luke. Use The Index, Luke is a guide to Oracle database performance for developers.
I started the book because I noticed that almost every existing book or online document on that topic is stuffed with plenty of information that is not relevant to developers.
One of the more advanced tricks I like to exploit are analytic Top-N queries. Although I am using them for quite a while, I recently discovered a “limitation” that I was not aware of. Actually—to be honest—it’s not a limitation; it is a missing optimization in a rarely used feature that can easily worked around. I must admit that I ask for quite a lot in that case.
The article starts with a general introduction into Top-N queries, applies that technique to analytic queries and explains the case where I miss an optimization. But is is really worth all that efforts? The article concludes with my answer to that question.
As Jonathan Lewis commented on my article Clustering Factor: Row Migrtion’s Victim, there is even more to say about the difference between the good, old, and deprecated
ANALYZE statement and the
DBMS_STATS package. Jonathan mentioned that the CBO is using the
CHAIN_CNT value in the statistics, if present, and suggested to try my “trapQL” after analyzing the base table in the old fashion.
This article describes the effects of a high row migration rate on the clustering factor and the optimizer’s ability to select the best execution plan.
In my previous article—Row Migration and Row Movement—I have demonstrated that the “insert empty, update everything” anti-pattern can lead to 100% row migration. This article continues the research on row migration and unveils surprising effects on the clustering factor. To be precise, the clustering factor can become completely bogus in presence of a very high row migration rate. Once the clustering factor is “wrong”, it’s just a finger exercise to construct an optimizer trap and proof that row migration can affect the query plan.
The Oracle database knows three distinct processes that are easily mixed up: Row Chaining, Row Migration and Row Movement.
For the impatient, I provide some very short definitions:
- Row Chaining
- Distribution of a single table row across multiple data blocks.
- Row Migration
- Relocation of an entire table row to a new place, without updating the indexes.
- Row Movement
- Relocation of an entire table row to a new place and updating the indexes.
This article was inspired by the question if Oracle 11r2 performs Row Movement instead of Row Migration for ordinary
UPDATE statements—that is, in absence of partitions. The short answer is: no, it doesn’t. The long answer is the rest of this article.
I have witnessed a very short talk between a network engineer and a top-level manager at a client’s Christmas party. The network guy explained that the firewall adds about 0.2 milliseconds latency to each round trip between the application server and the database, which adds up to some hours in one particular case. So the manager asked what could be done and the network guy provided two solutions:
- Change the application to make less round trips
- Accept the security risk and don’t put a firewall in between those two tiers
Funny enough the network guy explained that the second option needs the managers signature because it bypasses the corporate security guidelines and somebody must take the responsibility for that.
Consider you are the manager, would you sign that a paper?