Handling large tables

Greg.NET · Jun 18, 2007

I have a few tables that have 2-3 million records each and thousands more records are inserted every day. Queries are starting to slow down a lot and Im wondering what would be the best way to handle this.

Usually a record won't be used again after a few months from its insertion date. But once in a great while one from a year ago will be updated, so its not an option to simply delete old records.

If you have any advice for me I would greatly appreciate it!

I am relatively new to sql or any databasing, but I've been reading about sql 2005's partitioning feature and it sounds like my database could benefit from that.

cjard · Jul 9, 2007

ss7thirty said:
Is there anyway you can better explain what each of the joins do because I am still a bit unclear about all of the variants. My understanding is that there are 4 different types:

RIGHT OUTER
LEFT OUTER
RIGHT INNER
LEFT INNER

I think that this is where the confusion is coming in.

There is no right or left inner. Right and Left determine solidity:

suppose the tables are:
LEFT: 1, 2, 3
RIGHT: 1, 2, 4

INNER (no nulls on either side):
left1, right1
left2, right2

LEFT (left side solid, nulls on right):
left1, right1
left2, right2
left3, null

RIGHT (right side solid, nulls on left)
left1, right1
left2, right2
null, right4

FULL OUTER (both sides can have nulls, effectively a LEFT UNION RIGHT, no dupes):
left1, right1
left2, right2
left3, null
null, right4

Other joins are extensions of the JOIN keyword, not the left or rightness. [LEFT/RIGHT/INNER/FULL OUTER] [CROSS/NATURAL] JOIN

For the full list see your DB manual. All combinations might be accepted, even ones that dont make sense (and the sensible one might be performed):
LEFT INNER JOIN --> inner
OUTER JOIN --> full outer
JOIN --> inner

CROSS and NATURAL talk about the way the join is done.
CROSS join has no condition. All rows are the cartesian product of the elements
NATURAL joins on all columns having the same name. Usually a dumb idea.

JOIN in some DB can use the USING clause instead of the ON clause, if joined tables have columns of the same name. SELECT a JOIN b USING (id)

When talkiing about left and rightness, it is the table order that matters, not the condition order:

SELECT a LEFT JOIN b ON
a.id1 = b.id1 AND
b.id2 = a.id2

a is left joined to b

Greg.NET · Jul 9, 2007

This is good stuff!

SELECT
*
FROM
color
LEFT OUTER JOIN
fruit
ON
fruit.colorid = color.id AND
fruit.price > 100

I think I get it now. the where clause must be scanning the joined result set and not the fruit table, which is why if you put the "fruit.price > 100" in the where clause it will return nothing. but the "ON" clause acually scans the fruit table so thats why you can put the check in there?

cjard · Jul 9, 2007

Greg.NET said:
if you put the "fruit.price > 100" in the where clause it will return nothing.

Well, it doesnt specifically return nothing.. But if any fruit rows are blank because they are null, but the color rows are filled with data (i.e. there is no fruit that is blue in color) then attempting to compare the blank row with a value fails (returns false)

The simplest logic I can think of to explain what you see though, is that of your "two-stage test" notion you posted..

First, assemble all rows that match the criteria in the ON clauses
Rows that have no match, insert a blank row on the "weak" side of the join
Then do the where clause

It doesnt actually happen like that, but it looks like it's that way and thats the easiest explanation

Handling large tables

Greg.NET

Active member

cjard

Well-known member

Greg.NET

Active member

cjard

Well-known member

Similar threads

Share this page

Latest posts