<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Software Quality is Quality of Life</title>
	<atom:link href="http://blog.fatalmind.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.fatalmind.com</link>
	<description>Markus Winand&#039;s blog about Performance, Reliability, Maintainability, Scalability and more</description>
	<lastBuildDate>Tue, 10 Jan 2012 12:49:17 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='blog.fatalmind.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Software Quality is Quality of Life</title>
		<link>http://blog.fatalmind.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://blog.fatalmind.com/osd.xml" title="Software Quality is Quality of Life" />
	<atom:link rel='hub' href='http://blog.fatalmind.com/?pushpress=hub'/>
		<item>
		<title>Choosing NoSQL For The Right Reason</title>
		<link>http://blog.fatalmind.com/2011/05/13/choosing-nosql-for-the-right-reason/</link>
		<comments>http://blog.fatalmind.com/2011/05/13/choosing-nosql-for-the-right-reason/#comments</comments>
		<pubDate>Fri, 13 May 2011 07:42:50 +0000</pubDate>
		<dc:creator>Markus Winand</dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[Reliability]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://blog.fatalmind.com/?p=991</guid>
		<description><![CDATA[Observing the NoSQL hype through the eyes of an SQL performance consultant is an interesting experience. It is, however, very hard to write about NoSQL because there are so many forms of it. After all, NoSQL is nothing more than a marketing term. A marketing term that works pretty well because it goes to the [...]]]></description>
			<content:encoded><![CDATA[<p>Observing the <a href="http://en.wikipedia.org/wiki/NoSQL_%28concept%29">NoSQL</a> hype through the eyes of an SQL performance consultant is an interesting experience. It is, however, very hard to write about NoSQL because there are so many forms of it. After all, NoSQL is nothing more than a marketing term. A marketing term that works pretty well because it goes to the heart of many developers that struggle with SQL every day.</p>
<p><span id="more-991"></span>
<p>My unrepresentative observation is that NoSQL is often chosen for performance reasons, probably because SQL performance problems are an everyday experience. NoSQL, on the other hand, is known to &#8220;scale well&#8221;. However, performance is often a bad reason to choose NoSQL, especially if the side effects, like <a href="http://en.wikipedia.org/wiki/Eventual_consistency">eventual consistency</a>, are poorly understood.</p>
<p>Most SQL performance problems result from improper indexing. Again, that is my unrepresentative observation, but I believe it so strongly that I am writing a <a href="http://sql-performance-explained.com">book about SQL indexing</a>. Indexing is not only an SQL topic, however; it applies to NoSQL as well. <a href="http://www.mongodb.org/">MongoDB</a>, for example, claims to support &#8220;<cite>Index[es] on any attribute, just like you&#8217;re used to</cite>&#8221;. It seems there is no way around proper indexing, no matter if you use SQL or NoSQL. The latest release of my book, &#8220;<a href="http://use-the-index-luke.com/sql/testing-scalability/response-time-throughput-scaling-horizontal">Response Time, Throughput and Horizontal Scalability</a>&#8221;, describes that in more detail.</p>
<p>Performance is—almost always—the wrong reason for NoSQL. Still there are cases where NoSQL is a better fit than SQL. As an example, I&#8217;ll describe a NoSQL system that I use almost every day. It is the <a href="http://git-scm.com/">distributed revision control system Git</a>. Wait! Git is not NoSQL? Well, let&#8217;s have a closer look.</p>
<dl>
<dt>Git doesn&#8217;t have an SQL front end</dt>
<dd>
<p>Git has specialized interfaces to interact with the repository, either on the command line or integrated into an IDE. There isn&#8217;t anything that remotely compares to SQL or a relational model. I never missed it.</p>
</dd>
<dt>Git doesn&#8217;t use an SQL back-end</dt>
<dd>
<p>Honestly, if I had to develop a revision control system, I wouldn&#8217;t choose an SQL database as the back-end. There is no benefit in putting BLOBs into a relational model, and handling BLOBs all the time is just too awkward.</p>
</dd>
<dt>Git is distributed</dt>
<dd>
<p>That&#8217;s my favourite Git feature. Working offline is exactly what is meant by &#8216;partition tolerance&#8217; in <a href="http://en.wikipedia.org/wiki/CAP_theorem">Brewer&#8217;s CAP Theorem</a>. I can use all Git features without an Internet connection. Others can, of course, still use the server if they can connect to it. Full functionality on either end. It is partition tolerant.</p>
</dd>
<dt>Conflicts happen anyway</dt>
<dd>
<p>If there is one thing we learned in the 25 years since <a href="http://en.wikipedia.org/wiki/Patch_%28Unix%29">Larry Wall introduced patch</a>, it is that conflicts happen. No matter what. Software development has a very long &#8220;transaction time&#8221; and we are mostly using optimistic locking—conflicts are inevitable. But here comes the famous CAP Theorem again. If we cannot have consistency anyway, let&#8217;s focus on the other two CAP properties: availability and partition tolerance.</p>
<p>Acknowledging inconsistencies means providing methods and tools to find and resolve them. That involves the software (e.g., Git) as well as the user. But here comes one last unrepresentative observation from my side: most NoSQL users just ignore that. They assume that the system magically resolves conflicting writes. It&#8217;s like using a CVS workflow with Git: it works for a while, but you&#8217;ll end up in trouble soon.</p>
</dd>
</dl>
<p>I&#8217;m not aware of a minimum feature set for NoSQL datastores, so it&#8217;s hard to tell whether Git fulfils it or not. However, Git feels to me like using NoSQL for the right reason.</p>
<p>It&#8217;s about choosing the right tool for the job. But I can&#8217;t get rid of the feeling that NoSQL is too often chosen for the wrong reasons, query response time in particular. No doubt, NoSQL is a better fit for some applications. However, an <a href="http://use-the-index-luke.com/">index review</a> would often solve the performance problems within a few days. SQL is no better than NoSQL, nor vice versa, because the question is not what&#8217;s better. The question is what is a better fit for a particular problem.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.fatalmind.com/2011/05/13/choosing-nosql-for-the-right-reason/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/6855feeb83ac8a3e397bc8260bad8294?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">fatalmind</media:title>
		</media:content>
	</item>
		<item>
		<title>Finding the Best Match With a Top-N Query</title>
		<link>http://blog.fatalmind.com/2010/09/29/finding-the-best-match-with-a-top-n-query/</link>
		<comments>http://blog.fatalmind.com/2010/09/29/finding-the-best-match-with-a-top-n-query/#comments</comments>
		<pubDate>Wed, 29 Sep 2010 09:16:21 +0000</pubDate>
		<dc:creator>Markus Winand</dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[oracle]]></category>
		<category><![CDATA[postgresql]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://blog.fatalmind.com/?p=945</guid>
		<description><![CDATA[There was an interesting index related performance problem on Stack Overflow recently. The problem was to check an input string against a table that holds about 2000 prefix patterns (e.g., LIKE 'xyz%'). A fast select is needed that returns one row if any pattern matches the input string, or no row otherwise. I believe my [...]]]></description>
			<content:encoded><![CDATA[<p>There was an interesting index-related performance problem on <a href="http://stackoverflow.com/questions/3778319/how-to-use-index-efficienty-in-mysql-query">Stack Overflow</a> recently. The problem was to check an input string against a table that holds about 2000 prefix patterns (e.g., <code>LIKE 'xyz%'</code>). A fast select is needed that returns one row if any pattern matches the input string, or no row otherwise.</p>
<p>I believe my solution is worth a few extra words to explain it in more detail. Even though it&#8217;s a perfect fit for <a href="http://Use-The-Index-Luke.com/">Use The Index, Luke</a> it&#8217;s a little early to put it as an exercise there. It is, however, a very good complement to my previous article <a href="http://blog.fatalmind.com/2010/07/30/analytic-top-n-queries/">Analytic Top-N queries</a>—so I put it here.</p>
<p>Although the problem was raised for a MySQL database, my solution applies to all databases that can properly optimize Top-N queries.</p>
<p><span id="more-945"></span>
<p>The original SQL statement in the question looked like this:</p>
<blockquote><pre><b>select 1
  from T1
 where 'fxg87698x84' like concat (C1, '%')</b></pre>
</blockquote>
<p><code>T1.C1</code> is the column that holds the prefix patterns—one per row. Although a prefixed <code>LIKE</code> filter can use an index range scan, the problem is that it is the wrong way around: it&#8217;s not searching for a string that matches the pattern, it&#8217;s searching for a pattern that matches the string.</p>
<p><!--Although the patterns themselves are the primary key of the table, multiple patterns could potentially match because their length varies. However, my solution doesn't account for that and I believe that this was not the intention of the query—counting the number of matching patterns is not possible with my approach.-->
<p>The query, as written, must check all the patterns against the string, e.g., by a full table scan or a (fast) full index scan. Either way, it&#8217;s always a <em>full</em> scan. Can that be improved?</p>
<p>Let&#8217;s proceed step by step. The simplest case is that the exact input string is a pattern in the table. An SQL statement to check for the exact pattern is very simple:</p>
<blockquote><pre><b>select C1
  from T1
 where C1 = 'fxg87698x84'</b></pre>
</blockquote>
<p>The next case is that the exact pattern doesn&#8217;t exist in the table, but a prefix pattern that matches the input string does. That pattern must be shorter than the input string, otherwise it cannot match. Because we aim to solve the problem with an index, let&#8217;s imagine the patterns as they would be stored in an index:</p>
<blockquote><pre>axt3
fxg
      <b>&lt;- place where 'fxg87698x84' would be</b>
tru56</pre>
</blockquote>
<p>If the exact pattern doesn&#8217;t exist, the <em>preceding</em> index entry is the best possible match (precondition: <a href="#nooverlap">no overlapping patterns</a> exist). That&#8217;s because a prefix is considered &#8220;smaller&#8221; than any longer string that starts with it, so it sorts in front of the input string. So, let&#8217;s extend the select to find the preceding record if the exact pattern is not in the table:</p>
<blockquote><pre>select C1
  from T1
 where C1 <b>&lt;=</b> 'fxg87698x84'
 <b>order by C1 desc
 limit 1</b></pre>
</blockquote>
<p>The less-than-or-equal condition will match the exact pattern, if it exists, and all patterns that precede it. The reverse <code>ORDER BY</code> clause makes sure that the index is read in descending order. In conjunction with the where clause, it means that the <a href="http://use-the-index-luke.com/anatomy/the-tree">tree traversal</a> is done to find the input string, and the <a href="http://use-the-index-luke.com/anatomy/the-leaf-nodes">leaf node scan</a> continues upwards from there. The <code>LIMIT 1</code> clause is the MySQL way to make a Top-N query so that the leaf node scan aborts after the first record. Voilà, this statement will return the best candidate pattern (or none at all) by performing a very small index range scan.</p>
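<p>To try this on a small scale, here is a minimal sketch. The table definition, the <code>varchar(20)</code> column type and the primary key are assumptions made for illustration; only the names <code>T1</code> and <code>C1</code> and the three sample patterns come from the text above:</p>
<blockquote><pre>-- assumed table layout; the primary key provides the index on C1
create table T1 (
   C1 varchar(20) not null,
   primary key (C1)
);

insert into T1 (C1) values ('axt3'), ('fxg'), ('tru56');

-- the Top-N query from above: returns the closest preceding pattern
select C1
  from T1
 where C1 &lt;= 'fxg87698x84'
 order by C1 desc
 limit 1;</pre>
</blockquote>
<p>With these three rows the query should return <code>fxg</code>. Prefixing the statement with <code>explain</code> is a quick way to check whether MySQL reads the index instead of scanning the whole table.</p>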
<p>The final case we need to take care of is that no pattern matches the input string. There are two sub-variants: (a) the input string sorts before the very first entry in the index, so there is no preceding pattern. In that case the Top-N query will not return any row and we are done; (b) the Top-N query returns a pattern that is not a prefix of the input string. That can be handled by wrapping the Top-N query to filter the result through the original <code>LIKE</code> expression:</p>
<blockquote><pre><b>select 1
  from (</b>
        select C1
          from T1
         where C1 &lt;= 'fxg87698x84'
         order by C1 desc
         limit 1
       <b>) tmp
 where 'fxg87698x84' like concat (C1, '%')</b></pre>
</blockquote>
<p>Done.</p>
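<p>The <code>LIMIT</code> clause is MySQL syntax. As a sketch of how the same statement might look on an Oracle database, the Top-N part can be expressed with <code>ROWNUM</code>. The extra nesting level is there because <code>ROWNUM</code> has to be applied after the <code>ORDER BY</code>. This variant is an illustration only and was not part of the original answer:</p>
<blockquote><pre>select 1
  from (
        select C1
          from (
                -- innermost block: establish the descending order
                select C1
                  from T1
                 where C1 &lt;= 'fxg87698x84'
                 order by C1 desc
               )
         -- middle block: keep only the first row of that order
         where rownum &lt;= 1
       ) tmp
 -- outer block: verify that the candidate really is a prefix
 where 'fxg87698x84' like C1 || '%'</pre>
</blockquote>
<p>Whether the database stops the index scan after the first row depends on its optimizer, so it is worth checking the execution plan before relying on it.</p>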
<p>Simple? With a good understanding of index fundamentals, it is simple! That&#8217;s why I am writing a Web-Book about indexing basics: <a href="http://Use-The-Index-Luke.com/">Use The Index, Luke!</a> Funnily enough, the basics are the same for all databases. We all put our pants on one leg at a time.</p>
<h3 id="nooverlap">Closing Note</h3>
<p>The precondition for all that is that there are no overlapping patterns in the table. E.g., the statement doesn&#8217;t work with the following patterns:</p>
<blockquote><pre>axt3
fxg
<b>fxg1</b>
      <b>&lt;- place where 'fxg87698x84' would be</b>
tru56</pre>
</blockquote>
<p>In that case, the closest entry (<code>FXG1</code>) doesn&#8217;t match although there is a matching entry (<code>FXG</code>). However, the <code>FXG</code> entry matches everything that <code>FXG1</code> can possibly match: the two patterns are overlapping.</p>
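<p>Whether a table meets this precondition can be checked with a self-join. The following is a minimal sketch in MySQL syntax; the column aliases are made up for illustration, and it assumes that the stored patterns contain no <code>%</code> or <code>_</code> characters, which <code>LIKE</code> would treat as wildcards:</p>
<blockquote><pre>-- lists every pair where one pattern is a prefix of another
select a.C1 as shorter_pattern,
       b.C1 as longer_pattern
  from T1 a
  join T1 b
    on b.C1 like concat(a.C1, '%')
   and b.C1 &lt;&gt; a.C1;</pre>
</blockquote>
<p>An empty result means that no two patterns overlap. The join compares every pattern with every other one, which should be acceptable as a one-off check for a table of about 2000 rows.</p>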
<h3>Second Closing Note</h3>
<p>The original problem posted on <a href="http://stackoverflow.com/questions/3778319/how-to-use-index-efficienty-in-mysql-query">Stack Overflow</a> mentioned that this lookup must be performed 1 million times within half an hour. The author did not mention if that target was reached, nor if the process is single-threaded.</p>
<p>However, considering the overall problem, the solution that is <em>most efficient on computing resources</em> would probably be to sort both sets, the patterns and the input strings, and implement a manual merge. But that probably takes much more effort to implement. The index solution is very <em>efficient on human resources</em>. Which solution is <em>best</em> for the business is up to the company to decide.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.fatalmind.com/2010/09/29/finding-the-best-match-with-a-top-n-query/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/6855feeb83ac8a3e397bc8260bad8294?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">fatalmind</media:title>
		</media:content>

		<media:content url="http://Use-The-Index-Luke.com/img/util_cover_free.png" medium="image">
			<media:title type="html">Use The Index, Luke! A Guide to SQL Performance for Developers</media:title>
		</media:content>
	</item>
		<item>
		<title>Use The Index, Luke!</title>
		<link>http://blog.fatalmind.com/2010/08/15/use-the-index-luke/</link>
		<comments>http://blog.fatalmind.com/2010/08/15/use-the-index-luke/#comments</comments>
		<pubDate>Sun, 15 Aug 2010 17:17:08 +0000</pubDate>
		<dc:creator>Markus Winand</dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[oracle book]]></category>

		<guid isPermaLink="false">http://blog.fatalmind.com/?p=937</guid>
		<description><![CDATA[Today, I&#8217;d like to introduce my new Web-Book Use The Index, Luke. Use The Index, Luke is a guide to Oracle database performance for developers. &#160; I started the book because I noticed that almost every existing book or online document on that topic is stuffed with plenty of information that is not relevant to [...]]]></description>
			<content:encoded><![CDATA[<p>Today, I&#8217;d like to introduce my new Web-Book <a href="http://Use-The-Index-Luke.com">Use The Index, Luke</a>. Use The Index, Luke is a guide to Oracle database performance for <em>developers</em>.</p>
<div id="util_cover" align="center">
<a href="http://Use-The-Index-Luke.com"><img src="http://Use-The-Index-Luke.com/img/util_cover_free.png" height="209" width="130" alt="Use The Index, Luke! A Guide to SQL Performance for Developers" border="0"></a>
<p>&nbsp;</p>
</div>
<p>I started the book because I noticed that almost every existing book or online document on that topic is stuffed with plenty of information that is not relevant to developers.</p>
<p><span id="more-937"></span>
<p>Although some knowledge of the database&#8217;s internals is required to get the best performance, <em>Use The Index, Luke</em> keeps that information to a minimum and presents it from a developer&#8217;s perspective.</p>
<p>The book is supplemented by the <a href="http://Ask.Use-The-Index-Luke.com">Ask Use The Index, Luke</a> community, where further questions on the subject are asked, discussed and answered.</p>
<p>The first chapters of Use The Index, Luke are already online. The remaining parts will be published on a bi- or tri-weekly basis. Follow the <a href="http://use-the-index-luke.com/blog/feed">RSS</a> feed so you don&#8217;t miss the new chapters.</p>
<p>I hope you enjoy reading the <a href="http://Use-The-Index-Luke.com/preface">existing chapters</a> in the meantime.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.fatalmind.com/2010/08/15/use-the-index-luke/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/6855feeb83ac8a3e397bc8260bad8294?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">fatalmind</media:title>
		</media:content>

		<media:content url="http://Use-The-Index-Luke.com/img/util_cover_free.png" medium="image">
			<media:title type="html">Use The Index, Luke! A Guide to SQL Performance for Developers</media:title>
		</media:content>
	</item>
		<item>
		<title>Analytic Top-N Queries</title>
		<link>http://blog.fatalmind.com/2010/07/30/analytic-top-n-queries/</link>
		<comments>http://blog.fatalmind.com/2010/07/30/analytic-top-n-queries/#comments</comments>
		<pubDate>Fri, 30 Jul 2010 08:55:05 +0000</pubDate>
		<dc:creator>Markus Winand</dc:creator>
				<category><![CDATA[Maintainability]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[oracle]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://blog.fatalmind.com/?p=896</guid>
		<description><![CDATA[Analytic Top-N queries are one of the more advanced tricks I like to exploit. Although I have been using them for quite a while, I recently discovered a “limitation” that I was not aware of. Actually, to be honest, it&#8217;s not a limitation; it is a missing optimization in a rarely used feature that can easily be worked around. [...]]]></description>
			<content:encoded><![CDATA[<p>Analytic Top-N queries are one of the more advanced tricks I like to exploit. Although I have been using them for quite a while, I recently discovered a “limitation” that I was not aware of. Actually, to be honest, it&#8217;s not a limitation; it is a missing optimization in a rarely used feature that can easily be worked around. I must admit that I ask for quite a lot in that case.</p>
<p>The article starts with a general introduction to Top-N queries, applies that technique to analytic queries and explains the case where I miss an optimization. But is it really worth all that effort? The article concludes with my answer to that question.</p>
<p><span id="more-896"></span>
<p>Please find the <code>CREATE</code> and <code>INSERT</code> statements at the <a href="#create">end of the article</a>.</p>
<h3>Top-N Queries</h3>
<p>Top-N queries are queries for the first N rows according to a specific sort order, e.g., the first three rows like this:</p>
<blockquote><pre><b>select * from (
  select start_key, group_key, junk
    from demo
   where start_key = 'St'
   order by group_key
)
 where rownum &lt;= 3;</b></pre>
</blockquote>
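<p>The nesting is essential in this Oracle syntax. As a counter-example, the following sketch puts <code>ROWNUM</code> into the same query block as the <code>ORDER BY</code>. Oracle assigns <code>ROWNUM</code> before sorting, so this statement picks three arbitrary matching rows and sorts only those:</p>
<blockquote><pre>-- NOT a Top-N query: rownum is evaluated before the order by
select start_key, group_key, junk
  from demo
 where start_key = 'St'
   and rownum &lt;= 3
 order by group_key;</pre>
</blockquote>
<p>The nested form shown first avoids this by sorting in the inner query and limiting in the outer one.</p>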
<p>Top-N queries like this are well known and straightforward. However, the interesting part is performance, as usual. A naïve implementation executes the inner SQL first, that is, it fetches and sorts all the matching records before limiting the result set to the first three rows. In the absence of a useful index, that is exactly what happens:</p>
<blockquote><pre>START_KEY  GROUP_KEY JUNK
---------- --------- ----------
St                 1 junk
St                 3 junk
St                10 junk

3 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 142682949

--------------------------------------------------------------
| Id | Operation               | Name | Rows  | Bytes | Cost |
--------------------------------------------------------------
|  0 | SELECT STATEMENT        |      |     <b>3</b> |  1032 | 8240 |
|* 1 |  COUNT STOPKEY          |      |       |       |      |
|  2 |   VIEW                  |      |   370 |   124K| 8240 |
|* 3 |    <b>SORT ORDER BY STOPKEY</b>|      |   370 | 76960 | 8240 |
|* 4 |     <b>TABLE ACCESS FULL</b>   | DEMO |   <b>370</b> | 76960 | 8239 |
--------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM&lt;=3)
   3 - filter(ROWNUM&lt;=3)
   4 - filter("START_KEY"='St')

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      30370  consistent gets
      30365  physical reads
          0  redo size
        998  bytes sent via SQL*Net to client
        419  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          <b>1  sorts (memory)</b>
          0  sorts (disk)
          3  rows processed</pre>
</blockquote>
<p>A full table scan is performed (more on that in a moment) to retrieve all the rows that match the where clause; about 370 according to the optimizer&#8217;s estimate. The next step sorts the entire result set. Finally the limit is applied by the <code>COUNT STOPKEY</code> step and the number of rows is reduced to three.</p>
<p>The performance problem of this query is obviously the full table scan. Let&#8217;s create an index to make it go away:</p>
<blockquote><pre><b>create index demo_idx on demo (start_key);
exec dbms_stats.gather_index_stats(null, 'DEMO_IDX');</b></pre>
</blockquote>
<p>That&#8217;s much better:</p>
<blockquote><pre>START_KEY  GROUP_KEY JUNK
---------- --------- ----------
St                 1 junk
St                 3 junk
St                10 junk

3 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1129354520

------------------------------------------------------------------------
| Id | Operation                      | Name     | Rows | Bytes | Cost |
------------------------------------------------------------------------
|  0 | SELECT STATEMENT               |          |    3 |  1032 |  372 |
|* 1 |  COUNT STOPKEY                 |          |      |       |      |
|  2 |   VIEW                         |          |  370 |   124K|  372 |
|* 3 |    SORT ORDER BY STOPKEY       |          |  370 | 76960 |  372 |
|  4 |     <b>TABLE ACCESS BY INDEX ROWID</b>| DEMO     |  370 | 76960 |  371 |
|* 5 |      <b>INDEX RANGE SCAN</b>          | DEMO_IDX |  370 |       |    3 |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM&lt;=3)
   3 - filter(ROWNUM&lt;=3)
   5 - access("START_KEY"='St')

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        360  consistent gets
        201  physical reads
          0  redo size
        998  bytes sent via SQL*Net to client
        419  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          3  rows processed</pre>
</blockquote>
<p>You can see that the full table scan was replaced by an index lookup and the corresponding table access. The other steps remain unchanged.</p>
<p>However, this is still a bad execution plan because all matching records are fetched and sorted just to throw most of them away. The following index allows a much better execution plan:</p>
<blockquote><pre><b>drop index demo_idx;
create index demo_idx on demo (start_key, group_key);
exec dbms_stats.gather_index_stats(null, 'DEMO_IDX');</b></pre>
</blockquote>
<p>The new execution plan looks like this:</p>
<blockquote><pre>        ID START_KEY  GROUP_KEY
---------- ---------- ---------
    936196 St                 1
    232303 St                 3
    759212 St                10

3 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1891928015

-----------------------------------------------------------------------
| Id | Operation                     | Name     | Rows | Bytes | Cost |
-----------------------------------------------------------------------
|  0 | SELECT STATEMENT              |          |    3 |   465 |    7 |
|* 1 |  <b>COUNT STOPKEY</b>                |          |      |       |      |
|  2 |   VIEW                        |          |    3 |   465 |    7 |
|  3 |    <b>TABLE ACCESS BY INDEX ROWID</b>| DEMO     |    <b>3</b> |    36 |    7 |
|* 4 |     <b>INDEX RANGE SCAN</b>          | DEMO_IDX |  370 |       |    3 |
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM&lt;=3)
   4 - access("START_KEY"='St')

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        609  bytes sent via SQL*Net to client
        419  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          <b>0  sorts (memory)</b>
          <b>0  sorts (disk)</b>
          3  rows processed</pre>
</blockquote>
<p>Well, that is efficient. The sort operation has vanished entirely because the index definition supports the <code>ORDER BY</code> clause. Even more powerful, the <code>STOPKEY</code> takes effect all the way down to the index range scan. You can see the reduced number of table accesses in the plan. Although not visible in the execution plan, the index range scan is also aborted after fetching the first three records.</p>
<p>That optimization has been in the Oracle database for quite a while, at least since 8i I guess. After that preparation, I can demonstrate what 10R2 has to offer on top of that.</p>
<h3>Analytic Top-N Queries</h3>
<p>It is actually the very same story with a small extension: I don&#8217;t want to retrieve the first N rows, but all the rows where the <code>group_key</code> value is at its minimum for the respective <code>start_key</code>. A very straightforward solution is this:</p>
<blockquote><pre><b>select id, start_key, group_key
  from demo
 where start_key = 'St'
   and group_key = (select min(group_key)
                      from demo
                     where start_key = 'St'
                   );</b></pre>
</blockquote>
<p>That statement is perfectly legitimate, even performance-wise:</p>
<blockquote><pre>        ID START_KEY  GROUP_KEY
---------- ---------- ---------
    936196 St                 1

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1142136980

------------------------------------------------------------------------
| Id | Operation                      | Name     | Rows | Bytes | Cost |
------------------------------------------------------------------------
|  0 | SELECT STATEMENT               |          |    1 |    12 |    8 |
|  1 |  <b>TABLE ACCESS BY INDEX ROWID</b>   | DEMO     |    1 |    12 |    5 |
|* 2 |   <b>INDEX RANGE SCAN</b>             | DEMO_IDX |    1 |       |    3 |
|  3 |    SORT AGGREGATE              |          |    1 |     7 |      |
|  4 |     FIRST ROW                  |          |    1 |     7 |    3 |
|* 5 |      <b>INDEX RANGE SCAN (MIN/MAX)</b>| DEMO_IDX |    1 |     7 |    3 |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("START_KEY"='St'
               AND "GROUP_KEY"= (SELECT MIN("GROUP_KEY") FROM
              "DEMO" "DEMO" WHERE "START_KEY"='St'))
   5 - access("START_KEY"='St')

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          8  consistent gets
          0  physical reads
          0  redo size
        550  bytes sent via SQL*Net to client
        419  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          <b>0  sorts (memory)
          0  sorts (disk)</b>
          1  rows processed</pre>
</blockquote>
<div class="sidenote" id="AnalyticFunctions">
<h6>Analytic functions</h6>
<p>Analytic functions can perform calculations on the basis of multiple rows. However, not to be confused with aggregate functions, analytic functions work without <code>GROUP BY</code>. A very typical use for analytic functions is a running balance; that is, the sum of all the rows up to and including the current row.</p>
<p>The function used in the example (<code>dense_rank</code>) returns the rank of the current row according to the supplied <code>OVER(ORDER BY)</code> clause—that is, in turn, not to be confused with a regular <code>ORDER BY</code>.</p>
<p>orafaq.com has a nice intro <a href="http://www.orafaq.com/node/55">to Oracle analytic functions</a>.</p>
</div>
<p>The first step is to fetch the smallest <code>group_key</code>. Because of the min/max optimization in combination with a suitable index, the database doesn&#8217;t need to sort the data; it just picks the first record from the index, which must be the smallest anyway. The second step is to perform a regular index lookup for the <code>start_key</code> and the <code>group_key</code> that was just retrieved from the sub-query.</p>
<p>Another possible implementation is to use an <a href="#AnalyticFunctions">analytic function</a>:</p>
<blockquote><pre>select * from (
  select id, start_key, group_key,
         <b>dense_rank() OVER (order by group_key) rnk</b>
    from demo
   where start_key = 'St'
)
 where <b>rnk &lt;= 1</b>;</pre>
</blockquote>
<p>Do you recognize the pattern? It is very similar to the traditional Top-N query that was described at the beginning of this article. Instead of limiting on the <code>rownum</code> pseudocolumn we use an analytic function. The execution plan reveals the performance characteristic of that statement:</p>
<blockquote><pre>        ID START_KEY  GROUP_KEY        RNK
---------- ---------- --------- ----------
    936196 St                 1          1

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3221234897

-----------------------------------------------------------------------
| Id | Operation                     | Name     | Rows | Bytes | Cost |
-----------------------------------------------------------------------
|  0 | SELECT STATEMENT              |          |  370 | 62160 |  374 |
|* 1 |  VIEW                         |          |  370 | 62160 |  374 |
|* 2 |   <b>WINDOW NOSORT STOPKEY</b>       |          |  <b>370</b> |  4440 |  374 |
|  3 |    TABLE ACCESS BY INDEX ROWID| DEMO     |  370 |  4440 |  373 |
|* 4 |     <b>INDEX RANGE SCAN</b>          | DEMO_IDX |  <b>370</b> |       |    3 |
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RNK"&lt;=1)
   2 - filter(DENSE_RANK() OVER ( ORDER BY "GROUP_KEY")&lt;=1)
   4 - access("START_KEY"='St')

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
        610  bytes sent via SQL*Net to client
        419  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
<b>          0  sorts (memory)
          0  sorts (disk)</b>
          1  rows processed</pre>
</blockquote>
<p>What was the <code>COUNT STOPKEY</code> operation for the traditional Top-N query has become the <code>WINDOW NOSORT STOPKEY</code> operation for the analytic function. However, the expected number of rows is not known to the optimizer because any number of rows could have the lowest <code>group_key</code> value. Still, the index range scan is aborted once the required rows have been fetched. On the one hand, the consistent gets are even better than with the sub-query statement. On the other hand, the cost value is higher. Whenever you use analytic functions, run a benchmark to know the actual performance.</p>
<p>Let&#8217;s have some thoughts about this optimization. The database knows that the index order corresponds to the <code>OVER (ORDER BY)</code> clause and avoids the sort operation. But even more impressive is that it can abort the range scan when the first value that doesn&#8217;t match the <code>rnk &lt;= 1</code> expression is fetched. That is only possible because the <code>dense_rank()</code> function cannot decrease if the rows are fetched in order of the <code>OVER(ORDER BY)</code> clause. That&#8217;s impressive, isn&#8217;t it?</p>
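<p>The same reasoning should hold for any upper bound on the rank, not only for one. As a sketch that is not part of the tests shown here (the resulting plan is an assumption and needs to be checked on your own system), the following statement returns all rows that carry one of the three smallest <code>group_key</code> values:</p>
<blockquote><pre>select * from (
  select id, start_key, group_key,
         dense_rank() OVER (order by group_key) rnk
    from demo
   where start_key = 'St'
)
 where <b>rnk &lt;= 3</b>;</pre>
</blockquote>
<p>Because the rank never decreases along the index order, the range scan could stop at the first row that would get rank four.</p>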
<h3>Mass Top-N Queries</h3>
<p>The next step towards the issue that made me write this article is to make a mass Top-N query. With the previous statement as a basis, it is actually quite simple; just remove the inner where clause to get the result for all <code>start_key</code> values and add a partition clause to make sure the rank is built individually for each <code>start_key</code>:</p>
<blockquote><pre>select * from (
  select start_key, group_key, junk,
         dense_rank() OVER (<b>partition by start_key</b>
                                order by group_key) rnk
    from demo
)
 where rnk &lt;= 1;</pre>
</blockquote>
<p>Declaring the partition is required to make sure those <code>start_keys</code> that don&#8217;t have a <code>group_key</code> of one will still show up, with their lowest <code>group_key</code> value.</p>
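<p>For contrast, here is the same statement without the partition clause. It is shown only to illustrate the difference and was not run for this article. It ranks all rows of the table together, so it returns only the rows that carry the overall smallest <code>group_key</code> and drops every <code>start_key</code> that doesn&#8217;t reach that value:</p>
<blockquote><pre>select * from (
  select start_key, group_key, junk,
         dense_rank() OVER (order by group_key) rnk
    from demo
)
 where rnk &lt;= 1;</pre>
</blockquote>
<p>The partitioned variant remains the mass Top-N query used in the rest of this article.</p>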
<p>With the mass Top-N query, we have reached the end of the optimizer&#8217;s smartness, as of release 11R2. At first sight, the plan is not surprising:</p>
<blockquote><pre><b>3260 rows selected.</b>

Execution Plan
----------------------------------------------------------
Plan hash value: 1766530486

--------------------------------------------------------------
| Id | Operation                | Name | Rows | Bytes | Cost |
--------------------------------------------------------------
|  0 | SELECT STATEMENT         |      | 1000K|   340M| 8239 |
|* 1 |  VIEW                    |      | 1000K|   340M| 8239 |
|* 2 |   WINDOW SORT PUSHED RANK|      | 1000K|   198M| 8239 |
|  3 |    <b>TABLE ACCESS FULL</b>     | DEMO | 1000K|   198M| 8239 |
--------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RNK"&lt;=1)
   2 - filter(DENSE_RANK() OVER ( PARTITION BY "START_KEY" ORDER BY
              "GROUP_KEY")&lt;=1)

Statistics
----------------------------------------------------------
         22  recursive calls
         20  db block gets
      30370  consistent gets
      33163  physical reads
          0  redo size
      59215  bytes sent via SQL*Net to client
       2806  bytes received via SQL*Net from client
        219  SQL*Net roundtrips to/from client
          0  sorts (memory)
          <b>1  sorts (disk)</b>
       3260  rows processed</pre>
</blockquote>
<p>It&#8217;s a full table scan. However, a “mass” query performs a full table scan for good reason, so that did not catch my attention. What did catch my attention is the following:</p>
<blockquote><pre><b>select * from (</b>
select * from (
   select start_key, group_key, junk,
          dense_rank() OVER (partition by start_key
                                 order by group_key) rnk
     from demo
) where rnk &lt;= 1
<b>) where start_key = 'St';</b></pre>
</blockquote>
<p>It is actually the individual Top-N query again. This time it is built on the basis of the mass Top-N query, which was set up as a view. That way, a single database view can be used for any mass query as well as for individual Top-N queries; that&#8217;s a <a href="/2010/06/12/about-software-quality/">maintainability</a> benefit. If the advanced magic to abort the index range scan were still working, it would be extremely efficient as well. The execution plan proves the opposite:</p>
<blockquote><pre>START_KEY  GROUP_KEY JUNK              RNK
---------- --------- ---------- ----------
St                 1 junk                1

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1309566133

-----------------------------------------------------------------------
| Id | Operation                     | Name     | Rows | Bytes | Cost |
-----------------------------------------------------------------------
|  0 | SELECT STATEMENT              |          |  370 |   128K|  373 |
|* 1 |  VIEW                         |          |  370 |   128K|  373 |
|<b>* 2</b> |   <b>WINDOW NOSORT</b>               |          |  370 | 76960 |  373 |
|  3 |    <b>TABLE ACCESS BY INDEX ROWID</b>| DEMO     |  370 | 76960 |  373 |
|* 4 |     INDEX RANGE SCAN          | DEMO_IDX |  370 |       |    3 |
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RNK"&lt;=1)
<b>   2 - filter(DENSE_RANK() OVER ( PARTITION BY "START_KEY" ORDER BY
              "GROUP_KEY")&lt;=1)</b>
   4 - access("START_KEY"='St')

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        <b>363  consistent gets</b>
          0  physical reads
          0  redo size
        808  bytes sent via SQL*Net to client
        419  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
<b>          0  sorts (memory)
          0  sorts (disk)</b>
          1  rows processed</pre>
</blockquote>
<p>Although no sort is required, the <code>STOPKEY</code> has disappeared from the <code>WINDOW NOSORT</code> operation. That means that the full index range scan will be performed for all 359 rows <code>where start_key='St'</code>. On top of that, the number of consistent gets is rather high. A closer look into the execution plan reveals that the entire row is fetched from the table <em>before</em> the filter on the analytic expression is applied. The <code>junk</code> column that is fetched from the table is not required for the evaluation of this predicate; it would be possible to fetch that column only for those rows that pass the filter.</p>
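<p>Until the optimizer does that on its own, a late table access can be approximated by hand. The following is only a sketch under assumptions and was not part of the tests shown here: it ranks the <code>ROWID</code>s using nothing but the indexed columns and joins back to the table afterwards. The aliases <code>rid</code>, <code>r</code> and <code>d</code> are made up for this example, and whether the inner query is really answered from <code>DEMO_IDX</code> alone is up to the optimizer, so check the plan before relying on it:</p>
<blockquote><pre>select d.start_key, d.group_key, d.junk
  from (
        select rid
          from (
                select rowid rid,
                       dense_rank() OVER (partition by start_key
                                              order by group_key) rnk
                  from demo
               )
         where rnk &lt;= 1
       ) r
  join demo d
    on d.rowid = r.rid;</pre>
</blockquote>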
<p>The “premature table access” is the reason why the full table scan is more efficient for the mass query than an index full scan. Have a look at the (hinted) full index scan execution plan for the mass query:</p>
<blockquote><pre><b>3260 rows selected.</b>

Execution Plan
----------------------------------------------------------
Plan hash value: 1402975529

------------------------------------------------------------------------
| Id | Operation                     | Name     | Rows | Bytes |  Cost |
------------------------------------------------------------------------
|  0 | SELECT STATEMENT              |          | 1000K|   340M| 1002K |
|* 1 |  VIEW                         |          | 1000K|   340M| 1002K |
|<b>* 2</b> |   <b>WINDOW NOSORT</b>               |          | 1000K|   198M| 1002K |
|  3 |    <b>TABLE ACCESS BY INDEX ROWID</b>| DEMO     | 1000K|   198M| <b>1002K</b> |
|  4 |     <b>INDEX FULL SCAN</b>           | DEMO_IDX | 1000K|       | 2504  |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RNK"&lt;=1)
   2 - filter(DENSE_RANK() OVER ( PARTITION BY "START_KEY" ORDER BY
              "GROUP_KEY")&lt;=1)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
    1002692  consistent gets
     817172  physical reads
          0  redo size
      59215  bytes sent via SQL*Net to client
       2806  bytes received via SQL*Net from client
        219  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       3260  rows processed</pre>
</blockquote>
<p>The expensive step in this execution plan is the table access. If the table access were moved up to take place after the window filter, the cost for this step would be 3260 (one for each fetched row). Together with the cost of 2504 for the index full scan, the total cost for the plan would probably stay below 7000; that is, lower than the cost of 8239 for the full table scan plan.</p>
<h3>Conclusion </h3>
<p>Just to re-emphasize the motivation behind the view that can serve both needs; it is about multidimensional optimization—and that has nothing to do with OLAP! </p>
<p>Typically, performance optimization takes only one dimension into account; that is, performance. So far so good, but what about long term maintenance? Very often, performance optimization <em>reduces the maintainability</em> of the software. That&#8217;s not a coincidence, it&#8217;s because maintainability is the only degree of freedom during optimization. Unfortunately, reduced <a href="http://blog.fatalmind.com/2010/06/12/about-software-quality/#maintainability">maintainability is very hard to notice</a>. If it is noticed at all, it is probably years later.</p>
<p>I have been in both worlds for some years—operations and development—and try to optimize for all dimensions whenever possible because all of them are important for the business.</p>
<h3><a id="create"></a>Create and Insert Statements</h3>
<p>To try it yourself:</p>
<blockquote><pre><b>create table demo (
       id          number        not null,
       start_key   varchar2(255) not null,
       group_key   number        not null,
       junk        char(200),
       primary key (id)
);

insert into demo (
       select level,
              dbms_random.string('A', 2) start_key,
              trunc(dbms_random.value(0,1000)),
              'junk'
         from dual
      connect by level &lt;= 1000000
);

commit;

exec DBMS_STATS.GATHER_TABLE_STATS(null, 'DEMO');</b></pre>
</blockquote>
<p>My tests were conducted on 11R2.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.fatalmind.com/2010/07/30/analytic-top-n-queries/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/6855feeb83ac8a3e397bc8260bad8294?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">fatalmind</media:title>
		</media:content>

		<media:content url="http://Use-The-Index-Luke.com/img/util_cover_free.png" medium="image">
			<media:title type="html">Use The Index, Luke! A Guide to SQL Performance for Developers</media:title>
		</media:content>
	</item>
		<item>
		<title>About Software Quality</title>
		<link>http://blog.fatalmind.com/2010/06/12/about-software-quality/</link>
		<comments>http://blog.fatalmind.com/2010/06/12/about-software-quality/#comments</comments>
		<pubDate>Sat, 12 Jun 2010 11:00:54 +0000</pubDate>
		<dc:creator>Markus Winand</dc:creator>
				<category><![CDATA[Maintainability]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Reliability]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Meta]]></category>

		<guid isPermaLink="false">http://blog.fatalmind.com/?p=407</guid>
		<description><![CDATA[You might have noticed that this blog never had a kickoff post that explains what this blog is about. Time has come to spend some words on the topic of software quality—as I see it—and how this blog covers some aspects of software quality. First of all, I&#8217;d like to introduce the two most frequently [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.fatalmind.com&amp;blog=10300405&amp;post=407&amp;subd=myfatalmind&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>You might have noticed that this blog never had a kickoff post that explains what this blog is about. Time has come to spend some words on the topic of software quality—as I see it—and how this blog covers some aspects of software quality.</p>
<p><span id="more-407"></span>
<p>First of all, I&#8217;d like to introduce the two most frequently considered software quality aspects:</p>
<dl>
<dt>Correctness</dt>
<dd>
<p>The software must produce correct results. There is no point in using software that produces results you need to verify before you can trust them.</p>
</dd>
<dt>Usability</dt>
<dd>
<p>If you know the task you want to perform, the software should make it easy to do so. Frequently used functions must be easily accessible and allow you to complete the task in an efficient manner.</p>
</dd>
</dl>
<p>These two aspects belong to the <em>functional</em> cluster of software quality. Today&#8217;s software industry has a rather high awareness of these issues, I guess because users complain immediately otherwise.</p>
<p>The industrialization of those aspects becomes evident when looking at the supporting industries. As of writing, <a href="http://www.google.com/search?q=requirements+engineering+tools">Google shows 8 sponsored links for “requirements engineering tools”</a> and <a href="http://www.google.com/search?q=software+testing+tool">countless software testing tools</a>. There are even education programs to <a href="http://en.wikipedia.org/wiki/Certified_Software_Tester">certify software testers</a>.</p>
<p>On the other hand, there are also aspects that belong to the <em>technical</em> cluster of software quality.</p>
<dl>
<dt><a id="performance"></a>Performance</dt>
<dd>
<p>On my business card, I use the following definitions:</p>
<blockquote><p>Short response time for a given piece of work.</p>
<p>Low utilization of computing resources.</p>
</blockquote>
<p>Please note that testers can observe the effects of the first definition. Sometimes performance issues become so bad that they turn into usability issues. There is a certain amount of <a href="http://www.google.com/search?q=performance+testing+tools">industrial support</a> for performance available.</p>
</dd>
<dt><a id="reliability"></a>Reliability</dt>
<dd>
<p>On my business card, I use the following definition:</p>
<blockquote><p>The probability of failure free operation of a computer program in a specified environment for a specified time.</p>
</blockquote>
<p>The majority of reliability issues are system crashes and system hangs. They are typically caused by <a href="http://en.wikipedia.org/wiki/Memory_leak">memory leaks</a>, <a href="http://en.wikipedia.org/wiki/Deadlock">deadlocks</a> and <a href="http://en.wikipedia.org/wiki/Race_condition#Computing">race conditions</a>.</p>
<p>At this point, industrial support becomes sparse. Reliability issues are usually very hard to reproduce and they often remain unsolved for a long time. The first signs are often regarded as one-time events that are not taken seriously. However, these are the opportunities to handle the problem before it becomes unmanageable. Unfortunately reliability issues are easier to debug once they occur more often.</p>
</dd>
<dt><a id="maintainability"></a>Maintainability</dt>
<dd>
<p>On my business card, I use the following definition:</p>
<blockquote><p>The ease with which a software product can be modified in order to correct defects and meet new requirements.</p>
</blockquote>
<p>The users of a software product cannot directly observe maintainability issues. There is, however, one sign that can be observed from the outside and <em>might</em> indicate maintainability issues: the time to market. If a software vendor can quickly adapt the product to new requirements, it is usually a good sign. The opposite conclusion doesn&#8217;t hold true because the vendor&#8217;s priorities might favor other issues at that time.</p>
<p>Although there are tools to automatically calculate <a href="http://www.virtualmachinery.com/sidebar4.htm">maintainability metrics</a>, I believe the most efficient way to improve maintainability is a review by experienced eyes.</p>
</dd>
<dt>Scalability</dt>
<dd>
<p>On my business card, I use the following definition:</p>
<blockquote><p>The property of a system which indicates its ability to handle growing amounts of work in a graceful manner.</p>
</blockquote>
<p>Scalability is related to performance. Many performance testing tools can be used to analyze the application&#8217;s scalability. The key difference, from my perspective, is that performance questions tend to focus on current load scenarios while scalability questions focus on future business development.</p>
</dd>
</dl>
<p>There are some more software quality topics. You probably noticed that there was not a single word about <em>security</em>. The reason I didn&#8217;t list security is that this area is way too wide to be properly covered by a one-man business like mine.</p>
<p>So, the four technical software quality aspects <em>performance</em>, <em>reliability</em>, <em>maintainability</em> and <em>scalability</em> are my primary area of interest and work—and therefore also the topic of this blog.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.fatalmind.com/2010/06/12/about-software-quality/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/6855feeb83ac8a3e397bc8260bad8294?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">fatalmind</media:title>
		</media:content>

		<media:content url="http://Use-The-Index-Luke.com/img/util_cover_free.png" medium="image">
			<media:title type="html">Use The Index, Luke! A Guide to SQL Performance for Developers</media:title>
		</media:content>
	</item>
		<item>
		<title>Analyze That</title>
		<link>http://blog.fatalmind.com/2010/04/30/analyze-that/</link>
		<comments>http://blog.fatalmind.com/2010/04/30/analyze-that/#comments</comments>
		<pubDate>Fri, 30 Apr 2010 15:03:02 +0000</pubDate>
		<dc:creator>Markus Winand</dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[oracle]]></category>

		<guid isPermaLink="false">http://blog.fatalmind.com/?p=415</guid>
		<description><![CDATA[As Jonathan Lewis commented on my article Clustering Factor: Row Migration&#8217;s Victim, there is even more to say about the difference between the good, old, and deprecated ANALYZE statement and the DBMS_STATS package. Jonathan mentioned that the CBO is using the CHAIN_CNT value in the statistics, if present, and suggested to try my &#8220;trapQL&#8221; after [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.fatalmind.com&amp;blog=10300405&amp;post=415&amp;subd=myfatalmind&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>As <a href="http://jonathanlewis.wordpress.com/">Jonathan Lewis</a> commented on my article <a href="http://blog.fatalmind.com/2010/03/09/clustering-factor-row-migrations-victim/">Clustering Factor: Row Migration&#8217;s Victim</a>, there is even more to say about the difference between the good, old, and deprecated <code>ANALYZE</code> statement and the <code>DBMS_STATS</code> package. Jonathan mentioned that the CBO uses the <code>CHAIN_CNT</code> value in the statistics, if present, and suggested trying my &#8220;trapQL&#8221; after analyzing the base table in the old fashion.</p>
<p><span id="more-415"></span>For easy copy and paste, here is the complete new script to try:</p>
<blockquote><pre><b>CREATE TABLE row_mig1 (
  a CHAR(2000),
  b CHAR(2000),
  c CHAR(2000),
  x NUMBER      NOT NULL,
  filter NUMBER NOT NULL,
  CONSTRAINT row_mig1_pk PRIMARY KEY (x)
) ENABLE ROW MOVEMENT;

BEGIN
   FOR i IN 1..100000 LOOP
      INSERT INTO row_mig1
                    (x, filter)
             VALUES (i, trunc(dbms_random.value(0, 10)));

      UPDATE row_mig1
         SET a ='a', b = 'b', c = 'c'
       WHERE x=i;
   END LOOP;
END;
/
COMMIT;

CREATE INDEX row_mig1_idx ON row_mig1(filter);

ANALYZE TABLE row_mig1 COMPUTE STATISTICS;
ANALYZE INDEX row_mig1_pk COMPUTE STATISTICS;
ANALYZE INDEX row_mig1_idx COMPUTE STATISTICS;

SELECT i.index_name         idx_name
     , t.blocks             table_blocks
     , i.clustering_factor  idx_clust_factor
     , i.num_rows           idx_rows
     , t.chain_cnt          table_chain_cnt
  FROM user_indexes i
  JOIN user_tables t USING (table_name)
 WHERE table_name = 'ROW_MIG1';</b></pre>
</blockquote>
<p>This is the very same table as in my previous article. I just replaced the <code>DBMS_STATS</code> with <code>ANALYZE</code>. As expected, the <code>CHAIN_CNT</code> is set correctly:</p>
<blockquote><pre>IDX_NAME       TABLE_BLOCKS IDX_CLUST_FACTOR IDX_ROWS TABLE_CHAIN_CNT
-------------- ------------ ---------------- -------- ---------------
ROW_MIG1_PK          100103              919   100000           <b>99999</b>
ROW_MIG1_IDX         100103             9187   100000           <b>99999</b></pre>
</blockquote>
<p>So far, so good. Now let&#8217;s create the second table—without row migration:</p>
<blockquote><pre><b>CREATE TABLE row_mig2 (
  a CHAR(2000),
  b CHAR(2000),
  c CHAR(2000),
  x NUMBER      NOT NULL,
  filter NUMBER NOT NULL,
  CONSTRAINT row_mig2_pk PRIMARY KEY (x)
) ENABLE ROW MOVEMENT;

INSERT INTO row_mig2 (x, filter, a, b, c)
     SELECT level, trunc(dbms_random.value(0,17)), 'a', 'b', 'c'
       FROM dual CONNECT BY level &lt;= 100000;
COMMIT;

CREATE INDEX row_mig2_idx ON row_mig2(filter);

ANALYZE TABLE row_mig2 COMPUTE STATISTICS;
ANALYZE INDEX row_mig2_pk COMPUTE STATISTICS;
ANALYZE INDEX row_mig2_idx COMPUTE STATISTICS;

SELECT i.index_name         idx_name
     , t.blocks             table_blocks
     , i.clustering_factor  idx_clust_factor
     , i.num_rows           idx_rows
     , t.chain_cnt          table_chain_cnt
  FROM user_indexes i
  JOIN user_tables t USING (table_name)
 WHERE table_name = 'ROW_MIG2';</b></pre>
</blockquote>
<p>Just to verify that there is no chaining when a plain insert approach is followed:</p>
<blockquote><pre>IDX_NAME       TABLE_BLOCKS IDX_CLUST_FACTOR IDX_ROWS TABLE_CHAIN_CNT
-------------- ------------ ---------------- -------- ---------------
ROW_MIG2_IDX         100877           100000   100000               <b>0</b>
ROW_MIG2_PK          100877           100000   100000               <b>0</b></pre>
</blockquote>
<p>So, let&#8217;s see how my “trapQL” works:</p>
<blockquote><pre><b>SET AUTOTRACE TRACEONLY;
SET TIMING ON;

SELECT /* analyze that! */ *
  FROM row_mig1 d1
  JOIN row_mig2 d2 ON (d1.x = d2.x)
 WHERE d1.filter = 0
   AND d2.filter = 0;

SET TIMING OFF
SET AUTOTRACE OFF</b></pre>
</blockquote>
<p>And the result is:</p>
<blockquote><pre>602 rows selected.

Elapsed: 00:01:00.73

Execution Plan
----------------------------------------------------------
Plan hash value: 3163423506

----------------------------------------------------------------------
| Id  | Operation                     | Name         | Rows  |  Cost |
----------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |  5882 | 17663 |
|   1 |  NESTED LOOPS                 |              |       |       |
|   2 |   NESTED LOOPS                |              |  5882 | 17663 |
|   3 |    TABLE ACCESS BY INDEX ROWID| ROW_MIG2     |  5882 |  <b>5896</b> |
|*  4 |     <b>INDEX RANGE SCAN          | ROW_MIG2_IDX |  5882</b> |    12 |
|*  5 |    INDEX UNIQUE SCAN          | ROW_MIG1_PK  |     1 |     0 |
|*  6 |   TABLE ACCESS BY INDEX ROWID | ROW_MIG1     |     1 |     <b>2</b> |
----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("D2"."FILTER"=0)
   5 - access("D1"."X"="D2"."X")
   6 - filter("D1"."FILTER"=0)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      20929  consistent gets
      12994  physical reads
          0  redo size
      39658  bytes sent via SQL*Net to client
        859  bytes received via SQL*Net from client
         42  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        602  rows processed</pre>
</blockquote>
<p>Surprising, at least to me. The optimizer considers the <code>CHAIN_CNT</code> and produces the best possible execution plan. The cost for the <code>TABLE ACCESS BY INDEX ROWID</code> on <code>ROW_MIG1</code> has grown to two. That&#8217;s a very concise reflection of the additional block that needs to be fetched.</p>
<p>To verify that the difference is the <code>CHAIN_CNT</code>, let&#8217;s check the plan after <code>DBMS_STATS</code> collection:</p>
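<p>As a rough plausibility check of the numbers in the plan above, the row estimate and the total cost can be recomputed by hand. This is only a sketch: it assumes uniformly distributed <code>filter</code> values and the textbook nested-loops approximation (outer cost plus one inner access per outer row), and the function names are made up for illustration. It is not a description of what the CBO does internally.</p>

```python
# Figures copied from the execution plan and the table definition above.
NUM_ROWS = 100_000           # rows in ROW_MIG2
DISTINCT_FILTER_VALUES = 17  # trunc(dbms_random.value(0, 17)) yields 0..16

OUTER_COST = 5896            # TABLE ACCESS BY INDEX ROWID on ROW_MIG2
INNER_COST_PER_ROW = 2       # TABLE ACCESS BY INDEX ROWID on ROW_MIG1


def estimated_rows(num_rows, distinct_values):
    """Equality-predicate estimate, assuming uniformly distributed values."""
    return round(num_rows / distinct_values)


def nested_loops_cost(outer_cost, outer_rows, inner_cost_per_row):
    """Textbook approximation: pay the inner access once per outer row."""
    return outer_cost + outer_rows * inner_cost_per_row


rows = estimated_rows(NUM_ROWS, DISTINCT_FILTER_VALUES)
print(rows)                                                      # 5882
print(nested_loops_cost(OUTER_COST, rows, INNER_COST_PER_ROW))  # 17660
```

<p>The sketch lands within a few units of the reported total of 17663. The small remainder is not explained by this simplified formula.</p>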
<blockquote><pre>SQL&gt; <b>BEGIN
       DBMS_STATS.GATHER_TABLE_STATS(null, 'ROW_MIG1',
                                     CASCADE =&gt; TRUE);
       DBMS_STATS.GATHER_TABLE_STATS(null, 'ROW_MIG2',
                                     CASCADE =&gt; TRUE);
     END;
     /</b>

PL/SQL procedure successfully completed.

SQL&gt; <b>SELECT i.index_name         idx_name
          , t.blocks             table_blocks
          , i.clustering_factor  idx_clust_factor
          , i.num_rows           idx_rows
          , t.chain_cnt          table_chain_cnt
       FROM user_indexes i
       JOIN user_tables t USING (table_name)
      WHERE table_name IN ('ROW_MIG1', 'ROW_MIG2')
      ORDER BY i.index_name;</b>

IDX_NAME       TABLE_BLOCKS IDX_CLUST_FACTOR IDX_ROWS TABLE_CHAIN_CNT
-------------- ------------ ---------------- -------- ---------------
ROW_MIG1_IDX         100622             9180   100000           99999
ROW_MIG1_PK          100622              918   100000           99999
ROW_MIG2_IDX         100239           100000   100000               0
ROW_MIG2_PK          100239           100000   100000               0</pre>
</blockquote>
<p>Well, as a matter of fact, <code>DBMS_STATS</code> just doesn&#8217;t care about the <code>CHAIN_CNT</code>. So, the old values remain there. On the other hand, there are some minor differences in the other stats like the table blocks and the index clustering factor of <code>ROW_MIG1_IDX</code>. However, the execution plan doesn&#8217;t change:</p>
<blockquote><pre>602 rows selected.

Elapsed: 00:01:03.71

Execution Plan
----------------------------------------------------------
Plan hash value: 3163423506

----------------------------------------------------------------------
| Id  | Operation                     | Name         | Rows  |  Cost |
----------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |  6084 | 18270 |
|   1 |  NESTED LOOPS                 |              |       |       |
|   2 |   NESTED LOOPS                |              |  6084 | 18270 |
|   3 |    TABLE ACCESS BY INDEX ROWID| ROW_MIG2     |  6084 |  <b>6098</b> |
|*  4 |     <b>INDEX RANGE SCAN          | ROW_MIG2_IDX |  6084</b> |    12 |
|*  5 |    INDEX UNIQUE SCAN          | ROW_MIG1_PK  |     1 |     0 |
|*  6 |   TABLE ACCESS BY INDEX ROWID | ROW_MIG1     |     1 |     <b>2</b> |
----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("D2"."FILTER"=0)
   5 - access("D1"."X"="D2"."X")
   6 - filter("D1"."FILTER"=0)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      21131  consistent gets
      12911  physical reads
          0  redo size
      39658  bytes sent via SQL*Net to client
        859  bytes received via SQL*Net from client
         42  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        602  rows processed</pre>
</blockquote>
<p>Well, it seems that the <code>CHAIN_CNT</code> is really making all the difference since it is the only value that was not calculated by <code>DBMS_STATS</code>…or is it? The ultimate proof would be to set the <code>CHAIN_CNT</code> manually to zero in the same way as I manually updated the <code>CLUSTERING_FACTOR</code> using <code>DBMS_STATS</code> in my previous post. However, as I already mentioned, <code>DBMS_STATS</code> doesn&#8217;t care about the <code>CHAIN_CNT</code>; consequently there is no parameter in <a href="http://download.oracle.com/docs/cd/E11882_01/appdev.112/e10577/d_stats.htm#i997763">SET_TABLE_STATS</a> to manually update it.</p>
<p>For the sake of science, I have done what nobody should ever do. And I will definitely not post what I&#8217;ve done. However, I did it, and I can tell you: the <code>CHAIN_CNT</code> is making all the difference.</p>
<h3>Summary</h3>
<p>The story from the previous posts is a little bit depressing. Although <code>ANALYZE</code> is deprecated, it provides more information to the CBO than <code>DBMS_STATS</code>. Using the recommended <code>DBMS_STATS</code> alone exposes you to the problem that I demonstrated in my previous post. The depressing part is that this is actually a regression: a useful CBO feature was “removed” by the recommendation to use <code>DBMS_STATS</code>.</p>
<p>Once more I must emphasize that all of that trouble was caused by the excessive use of the “insert empty, update everything” anti-pattern. Obviously everybody should try very hard not to follow that pattern. That solves all the problems.</p>
<h3>Thanks</h3>
<p>Thanks to <a href="http://jonathanlewis.wordpress.com/">Jonathan Lewis</a> for raising the question of how my “trapQL” works with <code>ANALYZE</code>.</p>
<p>Special thanks to Gerhard Kircher.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.fatalmind.com/2010/04/30/analyze-that/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/6855feeb83ac8a3e397bc8260bad8294?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">fatalmind</media:title>
		</media:content>

		<media:content url="http://Use-The-Index-Luke.com/img/util_cover_free.png" medium="image">
			<media:title type="html">Use The Index, Luke! A Guide to SQL Performance for Developers</media:title>
		</media:content>
	</item>
		<item>
		<title>Working for the Community: hatools 2.14</title>
		<link>http://blog.fatalmind.com/2010/03/16/working-for-the-community-hatools-2-14/</link>
		<comments>http://blog.fatalmind.com/2010/03/16/working-for-the-community-hatools-2-14/#comments</comments>
		<pubDate>Tue, 16 Mar 2010 08:36:23 +0000</pubDate>
		<dc:creator>Markus Winand</dc:creator>
				<category><![CDATA[Portability]]></category>
		<category><![CDATA[Reliability]]></category>
		<category><![CDATA[hatools]]></category>
		<category><![CDATA[script]]></category>
		<category><![CDATA[unix]]></category>

		<guid isPermaLink="false">http://blog.fatalmind.com/?p=392</guid>
		<description><![CDATA[This post is about my current work on release 2.14 of my most beloved OSS project: hatools. Just some observations, objectives, rants and an advertisement. Scope My scope for this release was rather limited: implement -v switches to hatimerun and halockrun to make them more communicative. The reason for that is quite simple; hatools will [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.fatalmind.com&amp;blog=10300405&amp;post=392&amp;subd=myfatalmind&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>This post is about my current work on release 2.14 of my most beloved <a href="http://en.wikipedia.org/wiki/Open-source_software">OSS</a> project: <a href="http://www.fatalmind.com/software/hatools/">hatools</a>. Just some observations, objectives, rants and an advertisement.</p>
<h3>Scope</h3>
<p>My scope for this release was rather limited: implement <code>-v</code> switches to <code>hatimerun</code> and <code>halockrun</code> to make them more communicative.</p>
<p><span id="more-392"></span>
<p>The reason for that is quite simple; hatools will be easier to learn and use if they talk more. I also noticed that there are two “new” tools in the ever growing locking tool zoo—not to mention <a href="http://linux.die.net/man/1/flock">flock(1)</a>—that talk more than hatools:</p>
<ul>
<li><a href="http://unixwiz.net/tools/lockrun.html">Steve Friedl&#8217;s lockrun.c</a></li>
<li><a href="http://www.scylla-charybdis.com/tool.php/lockrun">Scylla and Charybdis lockrun</a></li>
</ul>
<p>My original design goal of hatools was to allow easy integration into shell scripts, thus every error is reported via the exitcode. I still believe in that goal because error handling should not parse error messages. UNIX commands should communicate errors to the caller in a way that allows easy handling in scripts.</p>
<p>However, I must admit that my implementation is a little bit harsh. Except for fatal errors, hatools don&#8217;t write anything to STDOUT or STDERR. In particular, “designed” errors such as “lock busy” don&#8217;t produce a message for the user. Today, almost 9 years later, I question the missing verbosity of hatools for two reasons:</p>
<ul>
<li>Just because the exitcode contains all the information doesn&#8217;t mean that a message isn&#8217;t handy.</li>
<li>Steve Friedl&#8217;s <code>lockrun.c</code> has an option that causes a warning if the program takes longer than a specified timeout. He mentions that this is very handy in <code>cron</code> jobs, because cron e-mails that message.</li>
</ul>
<p>I also believe that hatools have become very powerful and a little complex in the last few years, most notably through multiple occurrences of <code>-t</code>, <code>-k</code> and <code>-e</code> in <code>hatimerun</code>. A verbose mode will make debugging much easier.</p>
<p>So, here comes the story of why it took more than one hour to do it.</p>
<h3>More Community</h3>
<p>First of all, I moved the source code repository to <a href="http://github.com/fatalmind/hatools/">GitHub</a> after listening to <a href="http://chaosradio.ccc.de/cre130.html">Tim Pritlove&#8217;s (German) podcast on “Verteilte Versionskontrollsysteme”</a> (distributed version control systems).</p>
<h3>The Implementation</h3>
<p>Although <code>halockrun</code> was quickly done, <code>hatimerun</code> challenged me a little bit.</p>
<p>Finally <code>hatimerun</code> got two verbose modes:</p>
<dl>
<dt><code>-v</code></dt>
<dd>
<p>Will write a message if a timeout has passed by:</p>
<blockquote><pre>$ ./hatimerun <b>-v</b> -t 1 sleep 2
./hatimerun: process 9494 terminated on signal SIGKILL after 1s (sleep 2)</pre>
</blockquote>
</dd>
<dt><code>-vv</code></dt>
<dd>
<p>Writes a message on every timeout:</p>
<blockquote><pre>$ ./hatimerun <b>-vv</b> -t 1 -k hup -t 1 nohup sleep 3
nohup: appending output to `nohup.out'
./hatimerun: Timout #1 after 1s: sending signal SIGHUP to process group -9711 (nohup sleep 3)
./hatimerun: Timout #2 after 2s: sending signal SIGKILL to process group -9711 (nohup sleep 3)
./hatimerun: process 9711 terminated on signal SIGKILL after 2s (nohup sleep 3)</pre>
</blockquote>
</dd>
</dl>
<p>After years of silence, quite a lot of verbosity.</p>
<p>The “hard” part was mapping the signal number to the signal name. In previous releases I had already put a lot of effort into making <code>halockrun -k</code> accept symbolic signal names in a portable manner. That has been there for many years and seems to work quite well, so it would be rather inappropriate to write numbers in the messages. The mapping took me quite a while and required a lot of testing because I touched the “portability layer” that has three different variants.</p>
<p>Special thanks go to Vallo Kallaste and the guys at <a href="http://25th-floor.com">25th-floor</a> for testing. After all, the release was tested on the following platforms:</p>
<ul>
<li>Linux 2.6.22-15-server #1 SMP Wed Aug 20 19:08:24 UTC 2008 i686 GNU/Linux—with gcc and icc</li>
<li>FreeBSD 4.11-STABLE FreeBSD 4.11-STABLE #0: Thu Feb 12 08:04:00 GMT 2009</li>
<li>HP-UX B.11.11 U 9000/800 9000/800 1 HP-UX</li>
<li>SunOS 5.10 Generic_127111-02 sun4u sparc SUNW,UltraSPARC-IIi-cEngine</li>
<li>Darwin 8.11.1 Darwin Kernel Version 8.11.1: Wed Oct 10 18:23:28 PDT 2007; root:xnu-792.25.20~1/RELEASE_I386 i386 i386</li>
<li>aix 5300-09, with xlc (C for AIX version 5.0.2.0) and gcc</li>
</ul>
<h3>Compatibility</h3>
<p>Because the verbose mode was inspired by Steve Friedl&#8217;s <code>lockrun</code>, I checked again if hatools can do what <code>lockrun.c</code> does. Although <code>halockrun</code> provides a very flexible timeout mechanism, it doesn&#8217;t support the same feature as <code>--max-time</code> in Steve&#8217;s <code>lockrun</code>. The focus of <code>hatimerun</code> is to kill the process after a while, whereas the <code>--max-time</code> switch in <code>lockrun.c</code> is just an error-reporting feature. Well, I believe it is perfectly reasonable to get a warning if the program takes too long without killing it automatically.</p>
<p><code>halockrun</code> cannot be used for that purpose because it doesn&#8217;t <code>fork()</code> and therefore cannot do anything after the child program has been started. <code>hatimerun</code> is the tool for timeouts in hatools. As it turned out, <code>hatimerun</code> has been able to “not send a signal” ever since the first release:</p>
<blockquote><pre>$ ./hatimerun -v <b>-k 0</b> -t 1 sleep 2
./hatimerun: process 11957 terminated with status 0 after 2s (sleep 2)</pre>
</blockquote>
<p>The trick is to use “signal” zero; that is, <a href="http://www.opengroup.org/onlinepubs/009695399/functions/kill.html">not a real signal</a>! However, <code>-k 0</code> is rather awkward and most people are not aware of its meaning. So I introduced the symbolic name <code>NONE</code> for that purpose. This allows you to implement a warning level:</p>
<blockquote><pre>$ ./hatimerun -v -t 1:00 <b>-k NONE</b> -t 1:00 -k KILL sleep 130</pre>
</blockquote>
<p>This will wait for a minute (first <code>-t 1:00</code>), then do nothing (<code>-k NONE</code>) but write a warning in the end (<code>-v</code>). After another minute (second <code>-t 1:00</code>) kill the process (<code>-k KILL</code>).</p>
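<p>The “signal” zero behaviour that <code>-k 0</code> and <code>-k NONE</code> rely on can be tried in isolation. The following is a minimal sketch, assuming a POSIX system; the helper name <code>can_signal</code> is made up for illustration and has nothing to do with the hatools source. With signal number 0, <code>kill()</code> only performs its error checking and delivers nothing:</p>

```python
import os


def can_signal(pid):
    """Probe a process with signal 0: nothing is delivered to it."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False   # no such process
    except PermissionError:
        return True    # the process exists but belongs to another user
    return True


# Probing ourselves is harmless: the script keeps running afterwards.
print(can_signal(os.getpid()))
```

<p>The process being probed never notices anything, which is why a timeout with signal zero can act as a pure warning level.</p>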
<h3>Portability</h3>
<p>Because I had already downloaded and tried Steve&#8217;s <code>lockrun.c</code>, I tried it together with <code>halockrun</code>. Unfortunately, they don&#8217;t work together at all: if a lock is held by <code>lockrun</code>, that doesn&#8217;t affect <code>halockrun</code>. The reason is that the two tools use different advisory locking mechanisms. While <code>halockrun</code> uses POSIX <code><a href="http://www.opengroup.org/onlinepubs/009695399/functions/fcntl.html">fcntl(2)</a></code>, <code>lockrun</code> uses BSD <a href="http://www.kernel.org/doc/man-pages/online/pages/man2/flock.2.html">flock(2)</a> or POSIX <a href="http://www.opengroup.org/onlinepubs/009695399/functions/lockf.html">lockf(3)</a>, depending on the platform. No surprise, the BSD <code>flock()</code> doesn&#8217;t care about POSIX locks. The <a href="http://www.kernel.org/doc/man-pages/online/pages/man2/flock.2.html">Linux manpage</a> is quite clear about that:</p>
<blockquote><p>Since kernel 2.0, <code>flock()</code> is implemented as a system call in its own right rather than being emulated in the GNU C library as a call to <code>fcntl(2)</code>. This yields true BSD semantics: there is no interaction between the types of lock placed by <code>flock()</code> and <code>fcntl(2)</code>, and <code>flock()</code> does not detect deadlock.</p>
</blockquote>
<p>However, <a href="http://www.opengroup.org/onlinepubs/009695399/functions/lockf.html">POSIX</a> isn&#8217;t much better, as it doesn&#8217;t define the interaction of <code>fcntl()</code> and <code>lockf()</code>:</p>
<blockquote><p>The interaction between <code>fcntl()</code> and <code>lockf()</code> locks is unspecified.</p>
</blockquote>
<p>AFAIK, most systems implement <code>lockf()</code> in terms of <code>fcntl()</code>. Still there is no guarantee for that and the worst case is that a particular operating system has three different locking mechanisms. Special thanks to the “<a href="http://en.wikipedia.org/wiki/POSIX">Portable Operating System Interface [for Unix]</a>” that explicitly pushes two incompatible variants. I suppose there was a good reason for that decision, but I am not aware of it.</p>
<p>However, <code>halockrun</code> will continue to use <code>fcntl()</code> because it can be queried about the PID that currently holds the lock. <code>halockrun -t</code> hands this feature on to you.</p>
<p>The poor man&#8217;s fix is a note about the incompatibility that I added to the man page.</p>
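<p>The incompatibility described above can be reproduced without either tool. The sketch below is an illustration under assumptions, not code from hatools or <code>lockrun</code>: it assumes a POSIX system with <code>fork()</code> and a local file system, and it uses Python&#8217;s <code>fcntl.flock</code> for the BSD lock and <code>fcntl.lockf</code>, which wraps <code>fcntl()</code> record locks, for the POSIX lock. A child process holds a <code>flock()</code> lock while the parent tries both lock types on the same file:</p>

```python
import fcntl
import os
import tempfile


def try_lock(lock_call, fd):
    """Attempt a non-blocking exclusive lock; report whether it was granted."""
    try:
        lock_call(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        return False


def probe_while_flock_is_held():
    """Hold a flock() lock in a child process and probe it from the parent."""
    with tempfile.NamedTemporaryFile() as tmp:
        ready_r, ready_w = os.pipe()
        done_r, done_w = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: take the BSD lock, report readiness, wait for the parent.
            try:
                fd = os.open(tmp.name, os.O_RDWR)
                fcntl.flock(fd, fcntl.LOCK_EX)
                os.write(ready_w, b"x")
                os.read(done_r, 1)
            finally:
                os._exit(0)
        os.close(ready_w)
        os.read(ready_r, 1)  # returns once the child holds the lock (or died)
        fd = os.open(tmp.name, os.O_RDWR)
        try:
            flock_granted = try_lock(fcntl.flock, fd)
            fcntl_granted = try_lock(fcntl.lockf, fd)
        finally:
            os.close(fd)
            os.write(done_w, b"x")  # let the child exit
            os.waitpid(pid, 0)
            for end in (ready_r, done_r, done_w):
                os.close(end)
        return flock_granted, fcntl_granted


print(probe_while_flock_is_held())
```

<p>The first value should be <code>False</code> because the second <code>flock()</code> request conflicts with the child&#8217;s lock. On Linux I would expect the second value to be <code>True</code>, matching the manpage quote above; other platforms may behave differently, which is exactly the portability problem.</p>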
<h3>An Advertisement: It&#8217;s All About Details</h3>
<p>You might wonder why I write all of that? The point is that I aim to make <code>hatools</code> a piece of quality software. That takes quite a lot of time because quality is about details.</p>
<p>The advertisement is that I am an independent Software Quality Consultant for non-functional issues like performance, reliability, maintainability, scalability and so on. Let <a href="http://blog.fatalmind.com/consulting/">me know</a> if I can help you.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.fatalmind.com/2010/03/16/working-for-the-community-hatools-2-14/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/6855feeb83ac8a3e397bc8260bad8294?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">fatalmind</media:title>
		</media:content>

		<media:content url="http://Use-The-Index-Luke.com/img/util_cover_free.png" medium="image">
			<media:title type="html">Use The Index, Luke! A Guide to SQL Performance for Developers</media:title>
		</media:content>
	</item>
		<item>
		<title>Clustering Factor: Row Migration&#8217;s Victim</title>
		<link>http://blog.fatalmind.com/2010/03/09/clustering-factor-row-migrations-victim/</link>
		<comments>http://blog.fatalmind.com/2010/03/09/clustering-factor-row-migrations-victim/#comments</comments>
		<pubDate>Tue, 09 Mar 2010 10:15:46 +0000</pubDate>
		<dc:creator>Markus Winand</dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[oracle]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://blog.fatalmind.com/?p=285</guid>
		<description><![CDATA[This article describes the effects of a high row migration rate on the clustering factor and the optimizer&#8217;s ability to select the best execution plan. In my previous article—Row Migration and Row Movement—I have demonstrated that the “insert empty, update everything” anti-pattern can lead to 100% row migration. This article continues the research on row [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.fatalmind.com&amp;blog=10300405&amp;post=285&amp;subd=myfatalmind&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>This article describes the effects of a high row migration rate on the clustering factor and the optimizer&#8217;s ability to select the best execution plan.</p>
<p>In my previous article, <a href="http://blog.fatalmind.com/2010/02/23/row-migration-and-row-movement/">Row Migration and Row Movement</a>, I demonstrated that the “insert empty, update everything” anti-pattern can lead to 100% row migration. This article continues the research on row migration and unveils surprising effects on the clustering factor. To be precise, the clustering factor can become <em>completely bogus</em> in the presence of a very high row migration rate. Once the clustering factor is “wrong”, it&#8217;s just a finger exercise to construct an optimizer trap and prove that row migration can affect the query plan.</p>
<p><span id="more-285"></span>
<p>To start off, I create a table similar to the one in the previous article. However, I add one more column and populate it with random values from 0 to 9. I will need this column to build my optimizer trap later on.</p>
<blockquote><pre><b>CREATE TABLE row_mig1 (
  a CHAR(2000),
  b CHAR(2000),
  c CHAR(2000),
  x NUMBER      NOT NULL,
  filter NUMBER NOT NULL,
  CONSTRAINT row_mig1_pk PRIMARY KEY (x)
) ENABLE ROW MOVEMENT;

BEGIN
   FOR i IN 1..100000 LOOP
      INSERT INTO row_mig1
                    (x, filter)
             VALUES (i, trunc(dbms_random.value(0, 10)));

      UPDATE row_mig1
         SET a ='a', b = 'b', c = 'c'
       WHERE x=i;
   END LOOP;
END;
/
COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(null, 'ROW_MIG1', CASCADE=&gt;TRUE);</b></pre>
</blockquote>
<p>Up till now everything is very similar to the last article, except that I used <code>DBMS_STATS</code> instead of <code>ANALYZE TABLE</code>. Although we know exactly that almost every row was migrated, <code>DBMS_STATS</code> doesn&#8217;t care about that:</p>
<blockquote><pre>SQL&gt; <b>SELECT num_rows, chain_cnt
       FROM user_tables
      WHERE table_name='ROW_MIG1';</b>

  NUM_ROWS  CHAIN_CNT
---------- ----------
    100000          0

SQL&gt; </pre>
</blockquote>
<p>However, let&#8217;s have a look at the index:</p>
<blockquote><pre>SQL&gt; <b>SELECT num_rows, leaf_blocks, clustering_factor
       FROM user_indexes
      WHERE index_name = 'ROW_MIG1_PK';</b>

  NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
---------- ----------- -----------------
    100000         187               918

SQL&gt; </pre>
</blockquote>
<p>Well, that&#8217;s an excellent clustering factor. A low clustering factor indicates that the index is in the same sequence as the table. That&#8217;s true in this case because the index column corresponds to the order in which the rows were inserted into the table. However, 918, how can this be? The clustering factor is supposed to lie between the number of table blocks (which indicates a good clustering factor) and the number of index rows (which is the worst case). So, let&#8217;s look at the table size:</p>
<blockquote><pre>SQL&gt; <b>SELECT t.blocks             table_blocks
          , i.clustering_factor  index_clustering_factor
          , i.num_rows           index_rows
       FROM user_indexes i
       JOIN user_tables t USING (table_name)
      WHERE index_name = 'ROW_MIG1_PK';</b>

TABLE_BLOCKS INDEX_CLUSTERING_FACTOR INDEX_ROWS
------------ ----------------------- ----------
      100877                     918     100000

SQL&gt; </pre>
</blockquote>
<p>According to that, the lower bound for the clustering factor is higher than the upper bound. Hmm, let&#8217;s investigate the clustering factor manually and verify the distribution of the rows across the table blocks:</p>
<blockquote><pre>SQL&gt; <b>SELECT * FROM (
       SELECT DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) block_number
            , COUNT(*) rows_in_block
         FROM row_mig1
        GROUP BY DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid)
     ) WHERE ROWNUM &lt;=10;</b>

BLOCK_NUMBER ROWS_IN_BLOCK
------------ -------------
         523           109
         524           113
         525           109
         526           109
         527           110
         536           109
         537           110
         538           109
         539           109
         540           110

10 rows selected.

SQL&gt; </pre>
</blockquote>
<p>There are 109 different <code>ROWIDs</code> that refer to block number 523. But we know that each record takes about 6k, so it is impossible to fit 109 rows into a single 8k block. However, the “insert empty, update everything” anti-pattern makes it possible. The game goes like this:</p>
<ol>
<li>
<p>The very first row is inserted into a new block.</p>
</li>
<li>
<p>The first row is updated, and fits into the same block. The free space in that particular block is still more than a kilobyte.</p>
</li>
<li>
<p>The second row is inserted into the very same block, because there is enough free space available in that block.</p>
</li>
<li>
<p>The update of the second row triggers the migration of that row to a different block.</p>
<p>The row migration changes neither the <code>ROWID</code> nor the index entry. That means that the forwarding address—that is, somehow, the new <code>ROWID</code>—is stored in the original block so that the index can find the row.</p>
</li>
<li>
<p>The third row is inserted into the very first table block, again.</p>
<p>Because the second row was moved to a different block, the very first block still has free space. That block holds only the first row, as a whole, and one forwarding address. So, the insert of the third row can take place there.</p>
</li>
<li>And so on. Until the forwarding addresses fill the block—up to <code>PCTFREE</code>.</li>
</ol>
<p>That&#8217;s a good one, hmm?</p>
<p>We can even verify that:</p>
<blockquote><pre>SQL&gt; <b>SELECT * FROM (
       SELECT MIN(X)                               min_x
            , MAX(x)                               max_x
            , MAX(x) - MIN(x) + 1                  diff_x
            , DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) block_number
            , COUNT(*)            rows_in_block
         FROM row_mig1
        GROUP BY DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid)
        ORDER BY min_x
     ) WHERE ROWNUM &lt;=10;</b>

     MIN_X      MAX_X     DIFF_X BLOCK_NUMBER ROWS_IN_BLOCK
---------- ---------- ---------- ------------ -------------
         1        113        113          524           113
       114        222        109          523           109
       223        332        110          537           110
       333        441        109          538           109
       442        550        109          539           109
       551        660        110          540           110
       661        769        109          541           109
       770        878        109          542           109
       879        987        109          543           109
       988       1096        109          525           109

10 rows selected.

SQL&gt;</pre>
</blockquote>
<p>You see that the first row was inserted into block number 524. All subsequent rows up to the 113<sup>th</sup> were put into the same block. When that block was finally full (holding one row and 112 forwarding addresses), the game started over in the next block. All the <code>INSERT</code> statements took place in just 918 distinct blocks. Because the <code>ROWID</code> is assigned during the <code>INSERT</code>, the subsequent migration of the row due to the <code>UPDATE</code> is not reflected in the <code>ROWID</code>.</p>
<p>Neither <code>DBMS_STATS</code> nor <code>ANALYZE TABLE</code> looks into the table to check whether the row is really there or whether the particular block is just an accumulation of forwarding addresses. From their perspective, all those rows are in the same block, and this is how the clustering factor is calculated. Although technically calculated correctly and up to date, the clustering factor of this index does not reflect the real situation.</p>
<p>The “correct” value for the clustering factor, in the sense that it reflects the data distribution correctly, would be 100,000; that is, the number of rows. If the clustering factor equals the number of rows in the index, which is the worst possible case, it means that there are no two adjacent index entries that refer to the same table block. This is actually the case here because no 8k block can contain two complete 6k rows.</p>
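<p>To see the number that the statistics collection arrives at, one can walk the rows in index order and count how often the block part of the <code>ROWID</code> changes. The following is only a sketch of the idea, not the exact algorithm used by <code>DBMS_STATS</code>; it assumes that the table lives in a single data file (otherwise the file number would have to be compared as well):</p>
<blockquote><pre><b>-- Count the block changes when visiting the rows in primary key order.
-- LAG returns NULL for the first row, so the first row counts as a change.
SELECT COUNT(*) manual_clustering_factor
  FROM (
       SELECT DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) block_number
            , LAG(DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid))
                OVER (ORDER BY x)                  prev_block_number
         FROM row_mig1
       )
 WHERE prev_block_number IS NULL
    OR prev_block_number != block_number;</b></pre>
</blockquote>
<p>Because the query looks only at the <code>ROWID</code>, it follows the forwarding addresses no more than the statistics do, so it should land near the reported value rather than near the number of rows.</p>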
<h3>The Clustering Factor is Wrong, So What?</h3>
<p>So, what&#8217;s the problem if the clustering factor is wrong?</p>
<p>The problem is that the Cost Based Optimizer (CBO) uses the clustering factor in its cost calculation for an <code>INDEX RANGE SCAN</code> (see <a href="http://www.centrexcc.com/Fallacies%20of%20the%20Cost%20Based%20Optimizer.pdf">Fallacies of the Cost Based Optimizer [pdf]</a>). That means the cost of the <code>INDEX RANGE SCAN</code> will be estimated too low, because the clustering factor is way too low. Luckily, there is an effect that makes this less problematic: all indexes on that table are affected.</p>
<p>Let&#8217;s make a second index to verify that:</p>
<blockquote><pre>SQL&gt; <b>CREATE INDEX row_mig1_idx ON row_mig1(filter);</b>

Index created.

SQL&gt; <b>BEGIN
       DBMS_STATS.GATHER_TABLE_STATS(null, 'ROW_MIG1',
                                     CASCADE =&gt; TRUE);
     END;
     /</b>

PL/SQL procedure successfully completed.

SQL&gt; <b>SELECT i.index_name         index_name
          , t.blocks             table_blocks
          , i.clustering_factor  index_clustering_factor
          , i.num_rows           index_rows
       FROM user_indexes i
       JOIN user_tables t USING (table_name)
      WHERE table_name = 'ROW_MIG1';</b>

INDEX_NAME   TABLE_BLOCKS INDEX_CLUSTERING_FACTOR INDEX_ROWS
------------ ------------ ----------------------- ----------
ROW_MIG1_PK        100877                     918     100000
ROW_MIG1_IDX       100877                    9180     100000

SQL&gt; </pre>
</blockquote>
<p>The clustering factor of the new index is bigger than that of the original index, but still way off its real value. The “correct” clustering factor for the new index would be 100,000 because there are no two adjacent index entries that refer to the same table block. Well, according to their <code>ROWID</code>s they actually do, but there is nothing more than the forwarding address in those blocks.</p>
<div class="sidenote" id="TenTimesAsHigh">
<h6>Ten Times as High</h6>
<p>The clustering factor is ten times as high because the filter column has ten distinct values. That means that the adjacent index entries will refer to ten blocks at least, because those entries were not inserted in the same order as they are stored in the index. On the other hand, the index on the <code>filter</code> column is not only sorted by the <code>filter</code> value, but also by the <code>ROWID</code>—as every nonunique index in Oracle—so that the clustering factor is kept at a minimum. Finally, the clustering factor is ten times as high because it refers to ten times as many table blocks, but does not grow above that because the index order keeps the clustering factor at a minimum.</p>
</div>
<p>Although the clustering factor does not correctly reflect the effort needed to fetch the table rows, it correctly reflects the relation between the two indexes. The primary key index has a lower clustering factor because it has the same sequence as the table itself. On the other hand, the index on the <code>filter</code> column has a different order, thus the value is higher. The side note explains why it is <a href="#TenTimesAsHigh">Ten Times as High</a>.</p>
<p>If all indexes are affected, there is hardly any real problem I can see for a single table access. Even for queries that can be executed with either of two indexes, the CBO will most likely not choose wrongly, because both clustering factors are misleading in the same way.</p>
<h3>The Join Trap</h3>
<p>If it&#8217;s not possible to confuse the optimizer with a single table, let&#8217;s use more of them. So, I try to build a trap where the optimizer&#8217;s decision of the join order is influenced by the phony clustering factor so that the optimizer takes the less efficient execution plan.</p>
<p>For that purpose, I build a second table very similar to the first one. There are only two differences:</p>
<ol>
<li>I don&#8217;t follow the “insert empty, update everything” anti-pattern—there will be no row migration.</li>
<li>The selectivity of the <code>filter</code> column is slightly increased (about 6% instead of 10%).</li>
</ol>
<p>Here is the overall script:</p>
<blockquote><pre><b>CREATE TABLE row_mig2 (
  a CHAR(2000),
  b CHAR(2000),
  c CHAR(2000),
  x NUMBER      NOT NULL,
  filter NUMBER NOT NULL,
  CONSTRAINT row_mig2_pk PRIMARY KEY (x)
) ENABLE ROW MOVEMENT;

INSERT INTO row_mig2 (x, filter, a, b, c)
     SELECT level, trunc(dbms_random.value(0,17)), 'a', 'b', 'c'
       FROM dual CONNECT BY level &lt;= 100000;
COMMIT;

CREATE INDEX row_mig2_idx ON row_mig2(filter);

EXEC DBMS_STATS.GATHER_TABLE_STATS(null, 'ROW_MIG2', CASCADE=&gt;TRUE);</b></pre>
</blockquote>
<p>Now let&#8217;s check the statistics:</p>
<blockquote><pre>SQL&gt; <b>SELECT i.index_name         index_name
          , t.blocks             table_blocks
          , i.clustering_factor  index_clustering_factor
          , i.num_rows           index_rows
       FROM user_indexes i
       JOIN user_tables t USING (table_name)
      WHERE table_name = 'ROW_MIG2';</b>

INDEX_NAME   TABLE_BLOCKS INDEX_CLUSTERING_FACTOR INDEX_ROWS
------------ ------------ ----------------------- ----------
ROW_MIG2_PK        100749                  100000     100000
ROW_MIG2_IDX       100749                  100000     100000

SQL&gt; </pre>
</blockquote>
<p>Please note that the clustering factor is at the upper bound of the expected range; that is, it indicates that there are no two adjacent index entries referring to the same table block. That makes sense, considering that every row is in its own table block.</p>
<p>After that preparation, I can present my “trapQL”:</p>
<blockquote><pre><b>SELECT *
  FROM row_mig1 d1
  JOIN row_mig2 d2 ON (d1.x = d2.x)
 WHERE d1.filter = 0
   AND d2.filter = 0;</b></pre>
</blockquote>
<p>It is a rather trivial join on the primary keys. Additionally, each table is filtered on the <code>filter</code> column for one particular value. The trap works because a <code>NESTED LOOPS</code> join is possible in both directions: either filter the first table with an <code>INDEX RANGE SCAN</code> on <code>filter</code> and then fetch the corresponding entry from the second table by a primary key lookup, or vice versa. However, because we (as well as the optimizer) know that the filter on the second table is more selective than the one on the first, the more efficient way to execute the query is to perform the <code>INDEX RANGE SCAN</code> on the second table first and then join in the first one. That way, the number of primary key lookups is reduced and the overall performance will be better. Considering that the first table suffers from heavy row migration, that effect becomes even more relevant. However, the purpose of this discussion is, of course, to prove that the optimizer is doing “wrong”:</p>
<blockquote><pre><b>SET AUTOTRACE TRACEONLY;
SET TIMING ON;

SELECT /* original clustering factor */ *
  FROM row_mig1 d1
  JOIN row_mig2 d2 ON (d1.x = d2.x)
 WHERE d1.filter = 0
   AND d2.filter = 0;

SET TIMING OFF
SET AUTOTRACE OFF</b></pre>
</blockquote>
<p>And the result is:</p>
<blockquote><pre>577 rows selected.

Elapsed: <b>00:01:08.35</b>

Execution Plan
----------------------------------------------------------
Plan hash value: 4162018446

---------------------------------------------------------------------
| Id | Operation                     | Name         | Rows  | Cost  |
---------------------------------------------------------------------
|  0 | SELECT STATEMENT              |              |  5882 | 10941 |
|  1 |  NESTED LOOPS                 |              |       |       |
|  2 |   NESTED LOOPS                |              |  5882 | 10941 |
|  3 |    TABLE ACCESS BY INDEX ROWID| ROW_MIG1     | 10000 |   <b>938</b> |
|* 4 |     <b>INDEX RANGE SCAN          | ROW_MIG1_IDX | 10000</b> |    20 |
|* 5 |    INDEX UNIQUE SCAN          | ROW_MIG2_PK  |     1 |     0 |
|* 6 |   TABLE ACCESS BY INDEX ROWID | ROW_MIG2     |     1 |     1 |
---------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("D1"."FILTER"=0)
   5 - access("D1"."X"="D2"."X")
   6 - filter("D2"."FILTER"=0)

Statistics
----------------------------------------------------------
        913  recursive calls
          0  db block gets
      <b>35584</b>  consistent gets
      21027  physical reads
          0  redo size
      39012  bytes sent via SQL*Net to client
        837  bytes received via SQL*Net from client
         40  SQL*Net roundtrips to/from client
         12  sorts (memory)
          0  sorts (disk)
        577  rows processed</pre>
</blockquote>
<p>The optimizer has chosen to perform the <code>INDEX RANGE SCAN</code> on <code>ROW_MIG1_IDX</code> first. The optimizer is well aware of the fact that the <code>INDEX RANGE SCAN</code> will return about 10000 rows; still it was preferred over the alternative execution plan.</p>
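<p>For comparison, the other join order can also be requested explicitly with hints, without touching the statistics. This is only a sketch: <code>LEADING</code> and <code>USE_NL</code> fix the join order and the join method, but the optimizer is still free to choose the access paths, so the resulting plan may differ from the one expected here:</p>
<blockquote><pre><b>-- Start with ROW_MIG2 (alias d2), then join ROW_MIG1 (alias d1)
-- using a nested loops join.
SELECT /*+ LEADING(d2 d1) USE_NL(d1) */ *
  FROM row_mig1 d1
  JOIN row_mig2 d2 ON (d1.x = d2.x)
 WHERE d1.filter = 0
   AND d2.filter = 0;</b></pre>
</blockquote>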
<p>So what happens if we tell the optimizer the truth about that table&#8217;s indexes?</p>
<blockquote><pre><b>BEGIN
  DBMS_STATS.SET_INDEX_STATS(null, 'ROW_MIG1_PK', clstfct=&gt;100000);
  DBMS_STATS.SET_INDEX_STATS(null, 'ROW_MIG1_IDX',clstfct=&gt;100000);
END;
/

SET AUTOTRACE TRACEONLY;
SET TIMING ON;

SELECT /* updated clustering factor */ *
  FROM row_mig1 d1
  JOIN row_mig2 d2 ON (d1.x = d2.x)
 WHERE d1.filter = 0
   AND d2.filter = 0;

SET TIMING OFF
SET AUTOTRACE OFF</b></pre>
</blockquote>
<p>The only change is that the statistics have been manually updated to the “more correct” clustering factor of 100,000. Unfortunately, neither gathering statistics with <code>DBMS_STATS</code> nor <code>ANALYZE TABLE</code> yields that value, so I set it manually. Please note that the table itself was not changed; most of the rows are still migrated.</p>
<p>And the result is:</p>
<blockquote><pre>577 rows selected.

Elapsed: <b>00:00:59.91</b>

Execution Plan
----------------------------------------------------------
Plan hash value: 3004301745

---------------------------------------------------------------------
| Id | Operation                     | Name         | Rows  | Cost  |
---------------------------------------------------------------------
|  0 | SELECT STATEMENT              |              |  5882 | 11780 |
|  1 |  NESTED LOOPS                 |              |       |       |
|  2 |   NESTED LOOPS                |              |  5882 | 11780 |
|  3 |    TABLE ACCESS BY INDEX ROWID| ROW_MIG2     |  5882 |  <b>5896</b> |
|* 4 |     <b>INDEX RANGE SCAN          | ROW_MIG2_IDX |  5882</b> |    12 |
|* 5 |    INDEX UNIQUE SCAN          | ROW_MIG1_PK  |     1 |     0 |
|* 6 |   TABLE ACCESS BY INDEX ROWID | ROW_MIG1     |     1 |     1 |
---------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("D2"."FILTER"=0)
   5 - access("D1"."X"="D2"."X")
   6 - filter("D1"."FILTER"=0)

Statistics
----------------------------------------------------------
        913  recursive calls
          0  db block gets
      <b>20695</b>  consistent gets
      12817  physical reads
          0  redo size
      39012  bytes sent via SQL*Net to client
        837  bytes received via SQL*Net from client
         40  SQL*Net roundtrips to/from client
         12  sorts (memory)
          0  sorts (disk)
        577  rows processed</pre>
</blockquote>
<p>The more representative clustering factor makes the optimizer take the expected plan. The filtering takes place on the more selective table first, which matches about 5900 rows. The other table is joined in later. The execution is about 12% faster, and the logical and physical reads dropped by about 40%. <em>That</em> makes quite a difference.</p>
<p>The cost of the <code>TABLE ACCESS BY INDEX ROWID</code> that follows the <code>INDEX RANGE SCAN</code> reflects the clustering factor&#8217;s impact. The second query plan has a cost of about 5900 for that step, which means that each fetched row is expected to need a block read. The original execution plan had a cost value of only 938 for that step, so its overall cost value was lower.</p>
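<p>That relation can be reproduced with a commonly cited approximation of the index access cost: <code>blevel + ceil(leaf_blocks * selectivity) + ceil(clustering_factor * selectivity)</code>. This is a simplification, not the optimizer&#8217;s complete formula (CPU costing and other adjustments are ignored), and the selectivity of 1/10 is an assumption derived from the ten distinct <code>filter</code> values:</p>
<blockquote><pre><b>-- Rough cost estimate for the range scan on ROW_MIG1_IDX
-- followed by the table access.
-- 1/10 is the assumed selectivity of "filter = 0".
SELECT index_name
     , blevel + CEIL(leaf_blocks * 1/10)          index_part
     , CEIL(clustering_factor * 1/10)             table_part
     , blevel + CEIL(leaf_blocks * 1/10)
              + CEIL(clustering_factor * 1/10)    estimated_cost
  FROM user_indexes
 WHERE index_name = 'ROW_MIG1_IDX';</b></pre>
</blockquote>
<p>With the originally gathered clustering factor of 9180, the table part comes to 918, which is exactly the difference between the cost of the table access (938) and the cost of the index range scan (20) in the first plan. With the manually set clustering factor of 100000, the table part becomes 10000.</p>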
<p>I must remind the reader that the rows in table <code>ROW_MIG1</code> are still migrated. The performance difference is not caused by the row migration per se, but by the misleading statistics that result from the row migration.</p>
<h3>Correcting the Row Migration</h3>
<p>To complete the exercise, I will correct the row migration, run the statement again, and compare the performance improvement:</p>
<blockquote><pre>SQL&gt; <b>ALTER TABLE row_mig1 MOVE;</b>

Table altered.

SQL&gt; <b>ALTER INDEX row_mig1_pk REBUILD;</b>

Index altered.

SQL&gt; <b>ALTER INDEX row_mig1_idx REBUILD;</b>

Index altered.

SQL&gt; <b>BEGIN
       DBMS_STATS.GATHER_TABLE_STATS(null, 'ROW_MIG1',CASCADE=&gt;TRUE);
     END;
     /</b>

PL/SQL procedure successfully completed.

SQL&gt; <b>SELECT i.index_name         index_name
          , t.blocks             table_blocks
          , i.clustering_factor  index_clustering_factor
          , i.num_rows           index_rows
       FROM user_indexes i
       JOIN user_tables t USING (table_name)
      WHERE table_name = 'ROW_MIG1';</b>

INDEX_NAME     TABLE_BLOCKS INDEX_CLUSTERING_FACTOR INDEX_ROWS
-------------- ------------ ----------------------- ----------
ROW_MIG1_PK          100506                  100000     100000
ROW_MIG1_IDX         100506                  100000     100000

SQL&gt; </pre>
</blockquote>
<p>Then execute the statement again:</p>
<blockquote><pre><b>SET AUTOTRACE TRACEONLY;
SET TIMING ON;

SELECT /* no row migration */ *
  FROM row_mig1 d1
  JOIN row_mig2 d2 ON (d1.x = d2.x)
 WHERE d1.filter = 0
   AND d2.filter = 0;

SET TIMING OFF
SET AUTOTRACE OFF</b></pre>
</blockquote>
<p>Because the statistics are almost identical, the plan doesn&#8217;t change, nor does the cost. What <em>does</em> change is the execution time as well as the number of logical gets:</p>
<blockquote><pre>577 rows selected.

Elapsed: <b>00:00:30.90</b>

Execution Plan
----------------------------------------------------------
Plan hash value: 3004301745

---------------------------------------------------------------------
| Id | Operation                     | Name         | Rows  | Cost  |
---------------------------------------------------------------------
|  0 | SELECT STATEMENT              |              |  5882 | 11780 |
|  1 |  NESTED LOOPS                 |              |       |       |
|  2 |   NESTED LOOPS                |              |  5882 | 11780 |
|  3 |    TABLE ACCESS BY INDEX ROWID| ROW_MIG2     |  5882 |  <b>5896</b> |
|* 4 |     <b>INDEX RANGE SCAN          | ROW_MIG2_IDX |  5882</b> |    12 |
|* 5 |    INDEX UNIQUE SCAN          | ROW_MIG1_PK  |     1 |     0 |
|* 6 |   TABLE ACCESS BY INDEX ROWID | ROW_MIG1     |     1 |     1 |
---------------------------------------------------------------------            

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("D2"."FILTER"=0)
   5 - access("D1"."X"="D2"."X")
   6 - filter("D1"."FILTER"=0)

Statistics
----------------------------------------------------------
        925  recursive calls
          0  db block gets
      <b>14271</b>  consistent gets
      11952  physical reads
          0  redo size
      39012  bytes sent via SQL*Net to client
        837  bytes received via SQL*Net from client
         40  SQL*Net roundtrips to/from client
         13  sorts (memory)
          0  sorts (disk)
        577  rows processed</pre>
</blockquote>
<h3>Summary</h3>
<p>The article investigates the effects of row migration on the clustering factor and the optimizer. A “proof of concept” SQL demonstrates that row migration can affect the optimizer. The lessons from this article are:</p>
<ul>
<li>The “insert empty, update everything” pattern can lead to a very high row migration rate.</li>
<li><code>DBMS_STATS</code> doesn&#8217;t populate the <code>CHAIN_CNT</code> column.</li>
<li>A high row migration rate can cut the clustering factor, even below its theoretical minimum.</li>
<li>A wrong clustering factor can affect the optimizer and result in a suboptimal plan.</li>
</ul>
<p><b>Update 2010-04-30</b>: As <a href="http://blog.fatalmind.com/2010/03/09/clustering-factor-row-migrations-victim/#comments">commented by Jonathan Lewis</a>, I also checked the “trapQL” against tables that were analyzed with <code>ANALYZE</code>—<a href="http://blog.fatalmind.com/2010/04/30/analyze-that/">surprising results</a>.</p>
<h3>Disclaimer</h3>
<p>Please note that this trap was intentionally built, just to prove that row migration can potentially influence the optimizer. There are better ways to tune that particular SQL, foremost a better indexing approach.</p>
<p>I have put many adjectives in quotation marks because they are not in line with the technical definition of the respective noun.</p>
<p>The “insert empty, update everything” anti-pattern was used to create a very high row migration rate. Although that anti-pattern can lead to a 100% migration rate, it does not always do so.</p>
<p>I have successfully verified my results on Oracle 10gR1, 11gR1 and 11gR2.</p>
<h3>Thanks</h3>
<p>Thanks to the guys at <a href="http://www.25th-floor.com">25th-floor</a> who verified my results on Oracle 11gR1.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.fatalmind.com/2010/03/09/clustering-factor-row-migrations-victim/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/6855feeb83ac8a3e397bc8260bad8294?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">fatalmind</media:title>
		</media:content>

		<media:content url="http://Use-The-Index-Luke.com/img/util_cover_free.png" medium="image">
			<media:title type="html">Use The Index, Luke! A Guide to SQL Performance for Developers</media:title>
		</media:content>
	</item>
		<item>
		<title>Row Migration and Row Movement</title>
		<link>http://blog.fatalmind.com/2010/02/23/row-migration-and-row-movement/</link>
		<comments>http://blog.fatalmind.com/2010/02/23/row-migration-and-row-movement/#comments</comments>
		<pubDate>Tue, 23 Feb 2010 10:52:34 +0000</pubDate>
		<dc:creator>Markus Winand</dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[anit-pattern]]></category>
		<category><![CDATA[oracle]]></category>

		<guid isPermaLink="false">http://myfatalmind.wordpress.com/?p=193</guid>
		<description><![CDATA[The Oracle database knows three distinct processes that are easily mixed up: Row Chaining, Row Migration and Row Movement. Luckily all three are well described in excellent articles: The Secrets of Oracle Row Chaining and Migration and Row Movement in Oracle. For the impatient, I provide some very short definitions: Row Chaining Distribution of a [...]]]></description>
			<content:encoded><![CDATA[<p>The Oracle database knows three distinct processes that are easily mixed up: Row Chaining, Row Migration and Row Movement.</p>
<p>Luckily all three are well described in excellent articles: <a href="http://www.akadia.com/services/ora_chained_rows.html">The Secrets of Oracle Row Chaining and Migration</a> and <a href="http://www.databasejournal.com/features/oracle/article.php/3676401/Row-Movement-in-Oracle.htm">Row Movement in Oracle</a>.</p>
<p>For the impatient, I provide some very short definitions:</p>
<dl>
<dt>Row Chaining</dt>
<dd>Distribution of a single table row across multiple data blocks.</dd>
<dt>Row Migration</dt>
<dd>Relocation of an entire table row to a new place, without updating the indexes.</dd>
<dt>Row Movement</dt>
<dd>Relocation of an entire table row to a new place and updating the indexes.</dd>
</dl>
<p>This article was inspired by the question of whether Oracle 11gR2 performs <i>Row Movement</i> instead of <i>Row Migration</i> for ordinary <code>UPDATE</code> statements, that is, in the absence of partitioning. The short answer is: no, it doesn&#8217;t. The long answer is the rest of this article.</p>
<p><span id="more-193"></span></p>
<p>The difference between <i>Row Chaining</i> and <i>Row Migration</i> is fairly easy to understand: if the row doesn&#8217;t fit into a single data block, it must be chained; otherwise it can be migrated as a whole. The limitation of Row Migration is that it does not update the indexes on the table. That means the <code>ROWID</code> that is stored in the index still refers to the old location of the row. An additional block, the new location of the row, must be read to fetch the required data.</p>
<p>The more modern <i>Row Movement</i> is different in that it updates the corresponding indexes, so the <code>ROWID</code> actually changes. This has benefits in the long run, because the additional block read can be avoided in the <i>TABLE ACCESS BY INDEX ROWID</i> operation. In the short run, the actual <code>UPDATE</code> operation is much more complex. I suppose this is the reason why a regular update does not (yet) trigger a Row Movement.</p>
<p>Row Movement was introduced to support an <code>UPDATE</code> on a partition key, so that a table row can be moved from one table partition to another. Meanwhile, it is also used for flashback and space management, as described in the <a href="http://www.databasejournal.com/features/oracle/article.php/3676401/Row-Movement-in-Oracle.htm">above-mentioned article</a>.</p>
<p>As of Oracle 11gR2, Row Movement is optional and disabled by default. It can be enabled per table with a trivial <code>ALTER TABLE</code> statement. The usual reason to enable it is that one of the features that require Row Movement is used: partition key update, table shrink or flashback. There is hardly any reason not to enable row movement. The only side effect is that <code>ROWID</code>s might change, which should not affect well-designed applications.</p>
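<p>As a sketch, enabling and checking the setting could look like this; <code>my_table</code> is just a placeholder name and not a table used in this article:</p>
<blockquote><pre><b>-- my_table is a hypothetical table name
ALTER TABLE my_table ENABLE ROW MOVEMENT;

-- The ROW_MOVEMENT column shows ENABLED or DISABLED
SELECT table_name, row_movement
  FROM user_tables
 WHERE table_name = 'MY_TABLE';</b></pre>
</blockquote>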
<p>Row Migration has some considerable problems, as pointed out later in this article. On the other hand, Row Movement also has a drawback: the update of all the affected indexes can be very expensive. It&#8217;s a trade-off between read and write performance. While Row Movement is more expensive for <code>UPDATE</code> statements, it maintains the best index performance. In contrast, Row Migration favors <code>UPDATE</code> performance over index speed. Although not true for all cases, I believe that most data is read more often than it is written (especially in our modern society where nobody ever deletes data), so that Row Movement is generally the better choice.</p>
<p>The article examines how to get 100% migrated rows with the “insert empty, update everything” anti-pattern, why <code>DBMS_STATS</code> isn&#8217;t a perfect substitute for <code>ANALYZE TABLE</code>, the immunity of migrated rows to <code>alter table ... shrink space</code> and why <code>PCTFREE</code> is still the only rescue from a DBA&#8217;s perspective.</p>
<p>Just to show what I mean, I repeat a modified version of the test originally created by Tom Kyte and reused by Martin Zahn:</p>
<blockquote><pre><b>CREATE TABLE row_migration_demo (
  a CHAR(2000),
  b CHAR(2000),
  c CHAR(2000),
  d CHAR(2000),
  e CHAR(2000),
  x int,
  constraint row_migration_pk primary key (x)
) enable row movement
/</b></pre>
</blockquote>
<p>I have changed the size of the CHAR fields to 2k because my block size is 8k, whereas it was 4k in the original example. I have also put the <code>x</code> column at the end for a better verification of Row Migration versus Row Chaining. Finally, I enable Row Movement, which does not actually change anything but allows me to perform an <code>alter table ... shrink space</code>.</p>
<p>The next step is a classical insert empty, update everything sequence:</p>
<blockquote><pre><b>INSERT INTO row_migration_demo (x) values (1);
UPDATE row_migration_demo set a = 'a', b='b', c='c' where x=1;
COMMIT; </b></pre>
</blockquote>
<p>Up till now everything is perfectly fine and the entire row is in a single block:</p>
<blockquote><pre>SQL&gt; <b>select * from row_migration_demo where x=1;</b>

         X
----------
         1

SQL&gt; <b>SELECT a.name, b.value
       FROM v$statname a, v$mystat b
      WHERE a.statistic# = b.statistic#
        AND lower(a.name) = 'table fetch continued row';</b>

NAME                             VALUE
--------------------------- ----------
table fetch continued row            0</pre>
</blockquote>
<p>So, let&#8217;s do it again and insert a second row:</p>
<blockquote><pre><b>INSERT INTO row_migration_demo (x) values (2);
UPDATE row_migration_demo set a = 'a', b='b', c='c' where x=2;
COMMIT;</b></pre>
</blockquote>
<p>And select it again:</p>
<blockquote><pre>SQL&gt; <b>select * from row_migration_demo where x=2;</b>

         X
----------
         2

SQL&gt; <b>SELECT a.name, b.value
       FROM v$statname a, v$mystat b
      WHERE a.statistic# = b.statistic#
        AND lower(a.name) = 'table fetch continued row';</b>

NAME                             VALUE
--------------------------- ----------
table fetch continued row            1</pre>
</blockquote>
<p>Hooray, the row is migrated:</p>
<blockquote><pre>SQL&gt; <b>ANALYZE TABLE row_migration_demo COMPUTE STATISTICS;</b>

Table analyzed.

SQL&gt; <b>select num_rows, chain_cnt
       from user_tables
      where table_name='ROW_MIGRATION_DEMO';</b>

  NUM_ROWS  CHAIN_CNT
---------- ----------
         2          1</pre>
</blockquote>
<p><b>Note:</b> <code>ANALYZE</code> is deprecated. Yup, I know, but <code>DBMS_STATS.GATHER_TABLE_STATS</code> does not populate the <code>CHAIN_CNT</code> column. <a href="http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1830023856761">Ask Tom!</a></p>
<p>It&#8217;s hard to figure out whether the row is chained or migrated; I am actually not sure whether there is any difference in the data structure. Because the additional <code>table fetch continued row</code> occurs for the very first column, I believe the row is migrated:</p>
<blockquote><pre>SQL&gt;  select length(a) from row_migration_demo where x=2;

 LENGTH(A)
----------
      2000

SQL&gt; <b>SELECT a.name, b.value
       FROM v$statname a, v$mystat b
      WHERE a.statistic# = b.statistic#
        AND lower(a.name) = 'table fetch continued row';</b>

NAME                             VALUE
--------------------------- ----------
table fetch continued row            2</pre>
</blockquote>
<p>To be on the safe side, I will double-check:</p>
<blockquote><pre><b>INSERT INTO row_migration_demo (x, a) values (3, 'a');
UPDATE row_migration_demo
   SET b = 'b', c = 'c', d = 'd', e = 'e'
 WHERE x=3; </b></pre>
</blockquote>
<p>Indeed, the <code>a</code> column is at the place where the row was inserted while the <code>e</code> column is somewhere else:</p>
<blockquote><pre>SQL&gt;  select length(<b>a</b>) from row_migration_demo where x=3;

 LENGTH(A)
----------
      2000

SQL&gt; <b>SELECT a.name, b.value
       FROM v$statname a, v$mystat b
      WHERE a.statistic# = b.statistic#
        AND lower(a.name) = 'table fetch continued row';</b>

NAME                             VALUE
--------------------------- ----------
table fetch continued row            2

SQL&gt;  select length(<b>e</b>) from row_migration_demo where x=3;

 LENGTH(E)
----------
      2000

SQL&gt; <b>SELECT a.name, b.value
       FROM v$statname a, v$mystat b
      WHERE a.statistic# = b.statistic#
        AND lower(a.name) = 'table fetch continued row';</b>

NAME                             VALUE
--------------------------- ----------
table fetch continued row            3</pre>
</blockquote>
<p>To make things even worse, single rows can be migrated <em>and</em> chained. If we insert two more records, it becomes visible:</p>
<blockquote><pre><b>INSERT INTO row_migration_demo (x) values (4);
UPDATE row_migration_demo
   SET a='a', b='b', c='c', d='d', e='e'
 WHERE x=4;

INSERT INTO row_migration_demo (x) values (5);
UPDATE row_migration_demo
   SET a='a', b='b', c='c', d='d', e='e'
 WHERE x=5;

COMMIT;</b></pre>
</blockquote>
<p>Each of these rows must be chained because it needs more space than a single block offers. However, due to migration, the row is actually spread across three blocks:</p>
<blockquote><pre>SQL&gt;  select length(<b>a</b>) from row_migration_demo where x=5;

 LENGTH(A)
----------
      2000

SQL&gt; <b>SELECT a.name, b.value
       FROM v$statname a, v$mystat b
      WHERE a.statistic# = b.statistic#
        AND lower(a.name) = 'table fetch continued row';</b>

NAME                             VALUE
--------------------------- ----------
table fetch continued row            <b>4</b>

SQL&gt;  select length(<b>b</b>) from row_migration_demo where x=5;

 LENGTH(B)
----------
      2000

SQL&gt; <b>SELECT a.name, b.value
       FROM v$statname a, v$mystat b
      WHERE a.statistic# = b.statistic#
        AND lower(a.name) = 'table fetch continued row';</b>

NAME                             VALUE
--------------------------- ----------
table fetch continued row            <b>5</b>

SQL&gt;  select length(<b>c</b>) from row_migration_demo where x=5;

 LENGTH(C)
----------
      2000

SQL&gt; <b>SELECT a.name, b.value
       FROM v$statname a, v$mystat b
      WHERE a.statistic# = b.statistic#
        AND lower(a.name) = 'table fetch continued row';</b>

NAME                             VALUE
--------------------------- ----------
table fetch continued row            <b>7</b></pre>
</blockquote>
<p>Please note that the “continued fetch” count increased by two for the select on column <code>c</code>. The statistics show that all rows but one (the very first row) are chained:</p>
<blockquote><pre>SQL&gt; <b>ANALYZE TABLE row_migration_demo COMPUTE STATISTICS;</b>

Table analyzed.

SQL&gt; <b>select num_rows, chain_cnt
       from user_tables
      where table_name='ROW_MIGRATION_DEMO';</b>

  NUM_ROWS  CHAIN_CNT
---------- ----------
         5          4</pre>
</blockquote>
<p>If we keep playing this game and insert more data in the same way, we get an astonishing chain count:</p>
<blockquote><pre>begin
   for i in 10..1000 loop
      INSERT INTO row_migration_demo (x) values (i);
      UPDATE row_migration_demo set a ='a', b = 'b', c = 'c' where x=i;
   end loop;
end;
/

SQL&gt; <b>ANALYZE TABLE row_migration_demo COMPUTE STATISTICS;</b>

Table analyzed.

SQL&gt; <b>select num_rows, chain_cnt
       from user_tables
      where table_name='ROW_MIGRATION_DEMO';</b>

  NUM_ROWS  CHAIN_CNT
---------- ----------
       996        995</pre>
</blockquote>
<p>All the inserted rows are chained.</p>
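<p>A quick way to look at the physical layout is to count the row heads per block. The following query is only a sketch; it assumes the demo table from above and relies on the fact that a migrated row keeps its original <code>ROWID</code>, so the head of the row still counts for the block into which it was first inserted:</p>
<blockquote><pre>-- Sketch: number of row heads per data block of the demo table.
-- DBMS_ROWID decodes the block address that is embedded in each ROWID.
SELECT dbms_rowid.rowid_relative_fno(rowid)  file_no,
       dbms_rowid.rowid_block_number(rowid)  block_no,
       count(*)                              row_heads
  FROM row_migration_demo
 GROUP BY dbms_rowid.rowid_relative_fno(rowid),
          dbms_rowid.rowid_block_number(rowid)
 ORDER BY file_no, block_no;</pre>
</blockquote>
<p>Every row inserted by the loop needs about 6,000 bytes after the update, so an 8k block can hold at most one of them completely. If a block reports several row heads, the others can only be present as head pieces that point to another block.</p>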
<p>The only way to clean up the mess is to move the table. Shrinking doesn&#8217;t help too much, especially if there was no <code>DELETE</code> or <code>UPDATE</code> that has freed some space:</p>
<blockquote><pre>SQL&gt; <b>alter table row_migration_demo shrink space;</b>

Table altered.

SQL&gt; <b>analyze table row_migration_demo compute statistics;</b>

Table analyzed.

SQL&gt; <b>select num_rows, chain_cnt
       from user_tables
      where table_name='ROW_MIGRATION_DEMO';</b>

  NUM_ROWS  CHAIN_CNT
---------- ----------
       996        995
</pre>
</blockquote>
<p>Even if some space is freed and rows are moved, the chaining is not notably reduced:</p>
<blockquote><pre>SQL&gt; <b>delete from row_migration_demo where x &lt; 200;</b>

195 rows deleted.

SQL&gt; <b>analyze table row_migration_demo compute statistics;</b>

Table analyzed.

SQL&gt; <b>select num_rows, chain_cnt
       from user_tables
      where table_name='ROW_MIGRATION_DEMO';</b>

  NUM_ROWS  CHAIN_CNT
---------- ----------
       801        801

SQL&gt; <b>alter table row_migration_demo shrink space;</b>

Table altered.

SQL&gt; <b>analyze table row_migration_demo compute statistics;</b>

Table analyzed.

SQL&gt; <b>select num_rows, chain_cnt
       from user_tables
      where table_name='ROW_MIGRATION_DEMO';</b>

  NUM_ROWS  CHAIN_CNT
---------- ----------
       801        800</pre>
</blockquote>
<p>Please note that the shrink has corrected one chained row. That proves that a Row Movement assigns a new <code>ROWID</code> and updates the index. However, one &#8220;unchained&#8221; row isn&#8217;t really the correction I would like to see. The only way to correct all chains is to move the table:</p>
<blockquote><pre>SQL&gt; <b>alter table row_migration_demo move;</b>

Table altered.

SQL&gt; <b>alter index row_migration_pk rebuild;</b>

Index altered.

SQL&gt; <b>analyze table row_migration_demo compute statistics;</b>

Table analyzed.

SQL&gt; <b>select num_rows, chain_cnt
       from user_tables
      where table_name='ROW_MIGRATION_DEMO';</b>

  NUM_ROWS  CHAIN_CNT
---------- ----------
       801          0</pre>
</blockquote>
<p>No chaining anymore.</p>
<p>There is actually another way to correct row migration. The <a href="http://download.oracle.com/docs/cd/B12037_01/server.101/b10739/general.htm#sthref2787">Oracle documentation</a> suggests deleting and re-inserting all affected rows. Watch out for your triggers.</p>
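<p>A minimal sketch of that delete and re-insert procedure for the demo table could look as follows. It assumes that the <code>CHAINED_ROWS</code> table has been created with the <code>utlchain.sql</code> script that ships with Oracle and that nobody else modifies the table in the meantime; the name of the interim table (<code>row_migration_tmp</code>) is made up for this example:</p>
<blockquote><pre>-- One-time setup: create the CHAINED_ROWS table in the current schema.
@?/rdbms/admin/utlchain.sql

-- Record the head ROWIDs of all chained and migrated rows.
ANALYZE TABLE row_migration_demo LIST CHAINED ROWS INTO chained_rows;

-- Copy the affected rows into an interim table (hypothetical name).
CREATE TABLE row_migration_tmp AS
SELECT *
  FROM row_migration_demo
 WHERE rowid IN (SELECT head_rowid
                   FROM chained_rows
                  WHERE table_name = 'ROW_MIGRATION_DEMO');

-- Delete the affected rows and insert them again.
DELETE FROM row_migration_demo
 WHERE rowid IN (SELECT head_rowid
                   FROM chained_rows
                  WHERE table_name = 'ROW_MIGRATION_DEMO');

INSERT INTO row_migration_demo
SELECT * FROM row_migration_tmp;

COMMIT;

-- Clean up the interim table and the recorded ROWIDs.
DROP TABLE row_migration_tmp;
DELETE FROM chained_rows WHERE table_name = 'ROW_MIGRATION_DEMO';
COMMIT;</pre>
</blockquote>
<p>Rows that are too big for a single block stay chained, of course; only migrated rows can be repaired this way.</p>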
<p>From an administrator&#8217;s perspective, the only way to <em>prevent</em> row migration in the first place is to increase <code>PCTFREE</code>. Re-running the complete test with <code>PCTFREE</code> increased to 50% “solves” the problem; no chained rows occur anymore:</p>
<blockquote><pre><b>drop table row_migration_demo;
CREATE TABLE row_migration_demo (
  a CHAR(2000),
  b CHAR(2000),
  c CHAR(2000),
  d CHAR(2000),
  e CHAR(2000),
  x int,
  constraint row_migration_pk primary key (x)
) enable row movement pctfree 50
/
begin
   for i in 1..1000 loop
      INSERT INTO row_migration_demo (x) values (i);
      UPDATE row_migration_demo set a ='a', b = 'b', c = 'c' where x=i;
   end loop;
end;
/
commit;
analyze table row_migration_demo compute statistics;
select num_rows, chain_cnt
       from user_tables
      where table_name='ROW_MIGRATION_DEMO';</b>

  NUM_ROWS  CHAIN_CNT
---------- ----------
      1000          0</pre>
</blockquote>
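<p>For an existing table, the same setting can be applied without dropping it. The following is only a sketch, under the assumption that a short outage is acceptable, because the move locks the table and leaves the index unusable until it is rebuilt:</p>
<blockquote><pre>-- Rebuild the table with more free space reserved in each block.
ALTER TABLE row_migration_demo MOVE PCTFREE 50;

-- The move assigns new ROWIDs, so the primary key index must be rebuilt.
ALTER INDEX row_migration_pk REBUILD;</pre>
</blockquote>
<p>A plain <code>ALTER TABLE row_migration_demo PCTFREE 50</code> without <code>MOVE</code> does not touch existing rows; rows that are already migrated stay where they are.</p>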
<p>If this is not acceptable, there is one very last option: change the application.</p>
<p>However, it would be very nice if an operation similar to shrink could move migrated rows. It would be even more compelling if a table option allowed disabling Row Migration in favor of Row Movement.</p>
<p><b>UPDATE 2010-03-09</b>: A <a href="http://blog.fatalmind.com/2010/03/09/clustering-factor-row-migrations-victim/">follow up article describes the impact of Row Migration on the Clustering Factor</a>.</p>
<h3>Appendix</h3>
<p>After some years of professional experience, I always wonder why a particular idea of mine should be unique. So I suspected that this feature might be there, somewhere hidden in Oracle. So I checked for hidden parameters (thanks to a <a href="http://www.adp-gmbh.ch/ora/misc/x.html#ksppi">little documentation on that topic</a>):</p>
<blockquote><pre>SQL&gt; <b>select a.ksppinm parameter, b.kspftctxvl
       from x$ksppi a join x$ksppcv2 b on (a.INDX+1= b.kspftctxpn)
      where a.KSPPDESC like '%movement%';</b>

PARAMETER                                KSPFTCTXVL
---------------------------------------- ----------
_disable_implicit_row_movement           FALSE

SQL&gt; <b>alter session set "_disable_implicit_row_movement" = true;</b>

Session altered.

SQL&gt; <b>select a.ksppinm parameter, b.kspftctxvl
       from x$ksppi a join x$ksppcv2 b on (a.INDX+1= b.kspftctxpn)
      where a.KSPPDESC like '%movement%';</b>

PARAMETER                                KSPFTCTXVL
---------------------------------------- ----------
_disable_implicit_row_movement           TRUE

SQL&gt;</pre>
</blockquote>
<p>As suggested by the name of the parameter, this doesn&#8217;t change anything in the direction I would like.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.fatalmind.com/2010/02/23/row-migration-and-row-movement/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/6855feeb83ac8a3e397bc8260bad8294?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">fatalmind</media:title>
		</media:content>

		<media:content url="http://Use-The-Index-Luke.com/img/util_cover_free.png" medium="image">
			<media:title type="html">Use The Index, Luke! A Guide to SQL Performance for Developers</media:title>
		</media:content>
	</item>
		<item>
		<title>Oracle Trace File Rotation</title>
		<link>http://blog.fatalmind.com/2010/02/01/oracle-trace-file-rotation/</link>
		<comments>http://blog.fatalmind.com/2010/02/01/oracle-trace-file-rotation/#comments</comments>
		<pubDate>Mon, 01 Feb 2010 17:30:11 +0000</pubDate>
		<dc:creator>Markus Winand</dc:creator>
				<category><![CDATA[Maintainability]]></category>
		<category><![CDATA[oracle]]></category>
		<category><![CDATA[trace]]></category>

		<guid isPermaLink="false">http://myfatalmind.wordpress.com/?p=174</guid>
		<description><![CDATA[Under very rare circumstances, I need Oracle SQL trace files from a long period of time. Because trace files usually grow large, especially over several days, there is the need to rotate the trace file during that time so that they can be compressed and put away. The problem is that there is no “rotate tracefile” button [...]]]></description>
			<content:encoded><![CDATA[<p>Under very rare circumstances, I need Oracle SQL trace files covering a long period of time. Because trace files usually grow large, especially over several days, they need to be rotated during that time so that they can be compressed and put away. The problem is that there is no “rotate tracefile” button in Oracle. However, I have found an “undocumented feature” that does exactly that without disabling tracing.</p>
<p>My procedure uses the <code>close_trace</code> call of <code>oradebug</code>. This call closes the currently written trace file for a session. <a href="http://www.oracloid.com/2006/05/closing-trace-file-with-oradebug/">Alex Gorbachev</a> has used this to delete big trace files that are still open. My procedure goes a little bit further and exposes one more side effect of <code>oradebug close_trace</code>.</p>
<p><span id="more-174"></span>
<p><b>Warning: The technique described here uses the undocumented and unsupported <code><a href="http://www.psoug.org/reference/oradebug.html">oradebug</a></code> facility. Although I am not aware of any negative side effects of this procedure, the use of this method takes place at your own risk.</b></p>
<p>The <code>oradebug</code> utility can be used from the SQL*Plus prompt. If you have an account with sufficient rights (sigh) you can attach to any session using the <code>oradebug setorapid</code> command. To obtain the required Oracle PID, just query the <a href="http://download.oracle.com/docs/cd/B19306_01/server.102/b14237/dynviews_2022.htm"><code>v$process</code></a> view:</p>
<blockquote><pre>SQL&gt; <b>SELECT s.sid, s.serial#, p.pid </b>
       <b>FROM v$session s, v$process p</b>
      <b>WHERE s.paddr=p.addr;</b>

       SID    SERIAL#        PID
---------- ---------- ----------
         1          1          2
[... skipped ...]
        20       3126         21

26 rows selected.

SQL&gt; <b>oradebug setorapid 21</b>
Oracle pid: 21, Unix process pid: 145, image: oracle@test.fatalmind.com
SQL&gt;</pre>
</blockquote>
<p>Once attached to a specific process, every <code>oradebug</code> operation is applied to that particular process. The trick to rotate the trace file is very simple and exploits a &#8220;feature&#8221; of UNIX file systems: the name of a file is used only at the moment the file is opened. Once the file is open, all access is managed through the so-called <code>inode</code>, and the name is no longer relevant for read and write operations. The <code>inode</code> system has some more <a href="http://en.wikipedia.org/wiki/Inode#Implications">implications</a>; one of them is frequently used to rotate log files without data loss. Another implication makes it hard to delete open files, as <a href="http://www.oracloid.com/2006/05/closing-trace-file-with-oradebug/">Alex Gorbachev</a> explained.</p>
<p>The log rotation trick is to rename the open file and then cause the writing process to re-open that file. After the rename, all write operations still go to the same file, which is now accessible under the new name. If the writing process re-opens the file under its original name, it creates a new file because there is no file with that name anymore. Every subsequent write operation goes to the new file. The original file will not grow anymore and can be taken away safely.</p>
<p>Although the <code>oradebug</code> command name <code>close_trace</code> doesn&#8217;t suggest that the trace file is reopened, it is—if SQL tracing is enabled. So we have everything to rotate a trace file. Consider the following example:</p>
<blockquote><pre>$ <b>ls -lrt TEST_ora_14457.trc*</b>
-rw-r----- 1 oracle oinstall 14758723 2010-02-01 TEST_ora_14457.trc
$ <b>mv  TEST_ora_14457.trc  TEST_ora_14457.trc.old</b></pre>
</blockquote>
<p>After renaming the file, the file still grows as the SQL trace is written. You can now use <code>oradebug close_trace</code> to actually rotate the file:</p>
<blockquote><pre>SQL&gt; <b>oradebug close_trace</b>
Statement processed.
SQL&gt; </pre>
</blockquote>
<p>The verification in the file system shows that the &#8216;old&#8217; file became bigger, and the new file is also filling up:</p>
<blockquote><pre>$ <b>ls -lrt TEST_ora_14457.trc*</b>
-rw-r----- 1 oracle oinstall 18878704 2010-02-01 TEST_ora_14457.trc.old
-rw-r----- 1 oracle oinstall  1555972 2010-02-01 TEST_ora_14457.trc
$ </pre>
</blockquote>
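<p>Put together, one rotation for a single session could be scripted in SQL*Plus roughly like this. It is only a sketch: the Oracle PID and the file name are taken from the separate examples above and will differ on your system, the <code>host</code> command assumes that SQL*Plus runs on the database server with the trace directory as its working directory, and using <code>gzip</code> for compression is my assumption:</p>
<blockquote><pre>-- Attach to the process that writes the trace file.
oradebug setorapid 21

-- Show the name of the trace file this process is writing.
oradebug tracefile_name

-- Rename the open file; the process keeps writing to it.
host mv TEST_ora_14457.trc TEST_ora_14457.trc.old

-- Write out pending trace data, then close the trace file.
-- Tracing continues in a new file under the original name.
oradebug flush
oradebug close_trace

-- The old file does not grow anymore and can be compressed.
host gzip TEST_ora_14457.trc.old</pre>
</blockquote>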
<p>Because I often need to trace many sessions, I wrote a very tiny script (<code>switch_traces.sql</code>) that creates another script to re-open all trace files:</p>
<blockquote><pre>set echo off
set feedback off
set pages 0
set heading off
set define off

spool oradebug_close_traces.sql

select cmd from (
select pid, 1 o, 'oradebug setorapid ' || pid cmd from v$process
UNION ALL
select pid, 2 o, 'oradebug flush' from v$process
UNION ALL
select pid, 3 o, 'oradebug close_trace' from v$process) order by pid,o;

spool off

prompt
prompt c&amp;p &gt;&gt;&gt; @oradebug_close_traces.sql &lt;&lt;&lt; to switch traces

set feedback on
set pages 1000
set heading on
set echo on</pre>
</blockquote>
<p>This script closes all traces in all sessions; tracing continues to new files if SQL tracing is enabled. You might want to add <code>where</code> clauses to limit the scope, e.g., to skip background processes. The script also issues a <code>flush</code> before the <code>close_trace</code>. I don&#8217;t know whether this is required, but it doesn&#8217;t seem to do any harm. Using the script is very simple:</p>
<blockquote><pre>SQL&gt; <b>@switch_traces.sql</b>
SQL&gt; set echo off
oradebug setorapid 1
oradebug flush
oradebug close_trace

[... skipped ...]

oradebug setorapid 29
oradebug flush
oradebug close_trace

c&amp;p &gt;&gt;&gt; @oradebug_close_traces.sql &lt;&lt;&lt; to switch traces
SQL&gt; <b>@oradebug_close_traces.sql</b>
SQL&gt; oradebug setorapid 1
ORA-00072: process "Unix process pid: 0, image: PSEUDO" is not active
SQL&gt; oradebug flush
Statement processed.
SQL&gt; oradebug close_trace
Statement processed.
SQL&gt; oradebug setorapid 2
Oracle pid: 2, Unix process pid: 137, image: oracle@test.fatalmind.com
SQL&gt; oradebug flush
Statement processed.
SQL&gt; oradebug close_trace
Statement processed.

[... skipped ...]

Oracle pid: 29, Unix process pid: 138, image: oracle@test.fatalmind.com
SQL&gt; oradebug flush
Statement processed.
SQL&gt; oradebug close_trace
Statement processed.
SQL&gt;</pre>
</blockquote>
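<p>The restriction to non-background processes mentioned above could be sketched like this. It is a variant of the query in <code>switch_traces.sql</code> and assumes that the <code>BACKGROUND</code> column of <code>v$process</code> is <code>NULL</code> for ordinary server processes:</p>
<blockquote><pre>select cmd from (
select pid, 1 o, 'oradebug setorapid ' || pid cmd from v$process
 where background is null
UNION ALL
select pid, 2 o, 'oradebug flush' from v$process
 where background is null
UNION ALL
select pid, 3 o, 'oradebug close_trace' from v$process
 where background is null) order by pid,o;</pre>
</blockquote>
<p>The three branches must use the same filter; otherwise the generated script would contain a <code>flush</code> or <code>close_trace</code> without the matching <code>setorapid</code> and would apply it to the previously attached process.</p>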
<p>Provided you have renamed your trace files beforehand, you will now see the new files growing.</p>
<p>See also my previous article <a href="http://blog.fatalmind.com/2009/11/24/to-trace-or-not-to-trace/">To Trace or Not to Trace</a> for information about fine-grained SQL tracing.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.fatalmind.com/2010/02/01/oracle-trace-file-rotation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/6855feeb83ac8a3e397bc8260bad8294?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">fatalmind</media:title>
		</media:content>

		<media:content url="http://Use-The-Index-Luke.com/img/util_cover_free.png" medium="image">
			<media:title type="html">Use The Index, Luke! A Guide to SQL Performance for Developers</media:title>
		</media:content>
	</item>
	</channel>
</rss>
