Why Isn't My Query Using an Index? Databases for Developers: Performance #5

Рет қаралды 12,994

The Magic of SQL

Күн бұрын

Пікірлер: 30

@tanujgupta143 2 жыл бұрын

I am happy that I found this channel

@VEERSINGH-er9so 4 жыл бұрын

If I understood correctly then , on a very basic level, choosing between index and table scan depends on how many blocks both the access methods are likely to traverse for a particular query

@TheMagicofSQL 4 жыл бұрын

Yep, you got it!

@mdo2480 4 жыл бұрын

Very very useful. Big thanks to you Chris. I love the way you explain things.

@TheMagicofSQL 4 жыл бұрын

Thanks, glad you found this useful :)

@nidhikhaitan8114 Жыл бұрын

Thanks for such a wonderful explanation with example. Is it possible to calculate the number of block accesses with and without index. Does last_cr_buffer_gets represent the number of blocks read? I have a table with B-Tree index with height 2, total blocks - 305, leaf blocks - 154. The clustering_factor is equal to the number of blocks. I need to compare the number of blocks reading for fetching 90% of the records. Eg. select * from table_name where pk_column > 100, in table containing 80000 records.

@TheMagicofSQL Жыл бұрын

Thanks. In theory you can calculate the accesses needed with an index. In practice it's difficult because you need to know exactly how many index entries you'll read (and thus index blocks) and how many table block reads this requires. The easiest way is to run the query & get the plan with the buffer information. Does last_cr_buffer_gets represent the number of blocks read? Yes I need to compare the number of blocks reading for fetching 90% of the records. As you have perfect clustering this will be roughly 2 (the height) + 80% of the index leaf blocks + 80% of the table blocks. That said, a full table scan will almost always be more efficient than an index when fetching that large a fraction of the table. I wouldn't even bother trying to make the calculation.

@ppaolucc 11 ай бұрын

Hi Chris, very useful. Thank you. Just one consideration, if it is worth do add it: another factor that could affect the optimizer choice on whether to use FTs or the Index (I'm excluding PK access here) could also be the DB_BLOCK_SIZE in use? I mean, suppose you have a DB Block Size greater than 8K: say 16K or even 32K (even though this last one is more used in DWH), the Optimizer could opt, as a matter of cost, in using Full Table Scan as in few IOs you read few blocks each containing potentially more rows (since the DB Block Size is enough large ans supposing rows to search are as much as contiguous in the blocks potentially red). Could it be? I'm also excluding the handling of DB_FILE_MULTIBLOCK_READ_COUNT (in earlier releases was used to modify it). Does it make sense?

@TheMagicofSQL 11 ай бұрын

Not really. The maximum I/O size depends on the platform - by default Oracle Database determines this. This maximum is the same whatever block size you use. If you have bigger blocks, the database will fetch fewer each time. Think of it like this: say you've got buckets holding coins. Each bucket only stores coins of one value. So there's 1p coins in one bucket and £1 coins in another. Each time you get money from a bucket, the maximum value is £10. So when taking from the penny bucket, you could get up to 1,000 coins each time. From the £1 bucket you could get a maximum of 10 coins each time.

@mexicanmomo Жыл бұрын

Very helpful. One question, when you say avoid using hints in production query, do you specifically say about index hints or other hints like direct path insert etc.

@TheMagicofSQL Жыл бұрын

While you should aim to avoids hints in general, as you say there are hints that affect SQL behaviour such as append (direct path). These types of hints can affect transaction processing (i.e. the code you write), so you should explicitly code them. You should still use these rarely though. You should avoid hints that force a particular access method (e.g. index vs full table scan), join method/order or query transformations. If you're trying to get a specific plan, it's better to use SQL plan management (baselines).

@manjushatonape2099 2 жыл бұрын

Hello thanks for these videos, How Bitmap index works ? If column has ‘yes or no’ values (50%) each.

@TheMagicofSQL 2 жыл бұрын

I discuss the differences between BTree and bitmap indexes at blogs.oracle.com/sql/post/how-to-create-and-use-indexes-in-oracle-database#choose

@pankajjha1550 3 жыл бұрын

Clustering Factor was a riddle to me, not any more. Thank you.

@TheMagicofSQL 3 жыл бұрын

Great :)

@MeerAthil 4 жыл бұрын

Good presentation, to reduce high clustering factor#, do we need to reorg the table and rebuild index?

@TheMagicofSQL 4 жыл бұрын

Rebuilding indexes has NO effect on the clustering factor; to change it you need to move rows in the table

@MeerAthil 4 жыл бұрын

@@TheMagicofSQL thanks for your reply,👍

@Hellscream1830 3 жыл бұрын

Hi Chris! Excellent video on this topic! How about the importance of single block I/O vs multiblock I/O? Is it ever a good idea to play with DB_FILE_MULTIBLOCK_READ_COUNT parameter?

@TheMagicofSQL 3 жыл бұрын

Relatively low - I wouldn't start fiddling with that parameter

@Hellscream1830 3 жыл бұрын

@@TheMagicofSQL Thank you for the reply!

@KenBellows 3 жыл бұрын

So to make sure I understand, it sounds like you're saying that if three matching rows reside in the same block, an index-based query wil get the same block three times, thus performing triple the amount of work as a table scan (within the context of that one block). Is that right? If so, I'm kind of surprised that no hybrid solution is used, something like "when you get a block based on an index hit, also scan it for other matches avoid repeated gets". I'm sure I'm not the first genius to think of this, so there are probably reasons why this is a bad idea... any insight?

@TheMagicofSQL 3 жыл бұрын

Only if the database has to fetch other rows in between these three. Say you have 9 rows with values 1-9, spread across 3 data blocks with three rows in each. Say the rows are stored in the data blocks "in order" - block 1 = 1,2,3, block 2 = 4,5,6, and block 3 = 7,8,9. Consecutive index entries match are also consecutive in the table. So to get the rows 1-3, you only need to access block one. And it only does this once, not three times. Now imagine the rows are in the blocks in this sequence: block 1 = 1,4,7 block 2 = 2,5,8, and block 3 = 3,6,9. To get the values 1-3, you need to visit all three data blocks. To get row 4, the database goes back to block 1 and fetches it again. Does this clarify it for you?

@KenBellows 3 жыл бұрын

@@TheMagicofSQL Ahh I see. So the point is that the db performs gets in the order that it retrieves pointers from the index, so if several successive matches are stored in the same block, it should be able to gather them all at once, correct? How about the case where block 1=1,2,3 but your query is for, e.g., odd numbers, such that you match 1 and 3, but not the 2. Does it matter that 1 and 3 are not directly adjacent within the block? Or as long as the index finds 1 and 3 in succession, will it still be able to use block 1 all at once without needing a repeated get?

@TheMagicofSQL 3 жыл бұрын

Yes, it's the order of retrieving index pointers that affects table block fetches. As long as consecutive pointers from the index reference the same data block, the database only fetches the table block once. So if you have the condition col in ( 1, 3 ), skipping over 2 as your example then the database only fetches the table data block once.

@KenBellows 3 жыл бұрын

@@TheMagicofSQL That makes a ton of sense. Thanks!