Long threaded SQL Query including Counts...

Arg81

Well-known member
Joined
Mar 11, 2005
Messages
949
Location
Midlands, UK
Programming Experience
1-3
This may not be possible in one hit....

I want to have a grid on a form that splits my customer into it's sites, and then displays counts from another table.

Now...it sounds easy. :D

So far I can do the following;

VB.NET:
SELECT     CustomerSite.CustomerSiteName, COUNT(DWR.DWRNumber) AS TotalDWR
FROM         CustomerSite INNER JOIN
                      DWR ON CustomerSite.CustomerSiteID = DWR.CustomerSiteID
GROUP BY CustomerSite.CustomerSiteName

That works well so I get a total count of all DWR's in the DWR table that match the CustomerSiteID.

...now what I want to do...
I have a StatusID field (where 1 = Active and 2 = Complete). I want to add these counts to the query.

For a count of Active DWRs
VB.NET:
SELECT     CustomerSite.CustomerSiteName, COUNT(DWR.DWRNumber) AS ActiveDWR
FROM         CustomerSite INNER JOIN
                      DWR ON CustomerSite.CustomerSiteID = DWR.CustomerSiteID
WHERE DWR.StatusID = 1
GROUP BY CustomerSite.CustomerSiteName

For a count of Complete DWRs
VB.NET:
SELECT     CustomerSite.CustomerSiteName, COUNT(DWR.DWRNumber) AS CompleteDWR
FROM         CustomerSite INNER JOIN
                      DWR ON CustomerSite.CustomerSiteID = DWR.CustomerSiteID
WHERE DWR.StatusID = 2
GROUP BY CustomerSite.CustomerSiteName

What I don't know how to do is put it all together.

What I want to achieve is something like;

CustomerSiteName ---- TotalDWR ---- ActiveDWR ---- CompleteDWR

CustomerSiteA ---------------10------------ 8----------------- 2
CustomerSiteB ---------------29------------ 9----------------- 20

etc etc.

Any advice greatly appriciated as always!

Regards,
 
Last edited:
OK...I managed to do this!!

VB.NET:
SELECT     CustomerSiteName,
                          (SELECT     COUNT(DWRNumber) AS Expr1
                            FROM          DWR AS T1
                            WHERE      (CustomerSiteID = CustomerSite.CustomerSiteID)) AS TotalDWR,
                          (SELECT     COUNT(DWRNumber) AS Expr1
                            FROM          DWR AS T2
                            WHERE      (CustomerSiteID = CustomerSite.CustomerSiteID) AND (StatusID = 1)) AS ActiveDWR,
                          (SELECT     COUNT(DWRNumber) AS Expr1
                            FROM          DWR AS T3
                            WHERE      (CustomerSiteID = CustomerSite.CustomerSiteID) AND (StatusID = 2)) AS CompleteDWR
FROM         CustomerSite
WHERE CustomerID = @CustomerID
GROUP BY CustomerSiteID, CustomerSiteName
ORDER BY CustomerSiteName

All I forgot to mention above was that I would be passing the CustomerID value, so using @CustomerID works perfectly!

Thanks for the help, urm, me! (And Google Search :D)
 
Erm, just so long as youre sure it wont run the query three times!

You should actually do this:

VB.NET:
SELECT
  id,
  SUM(CASE WHEN status = 'Complete' THEN 1 ELSE 0 END) AS count_complete,
  SUM(CASE WHEN status = 'InProg' THEN 1 ELSE 0 END) AS count_inprog
FROM
...
GROUP BY id

If you need help on why this works, let me know
 
thing with using correlated queries is that some serious artificial intelligence to work out whether the queries overlap. Rather than do that, the server is likely to do one of 2 things:

1) just run the queries, keep them in memory and link them in as tables...

For example, your SQL actually contains 3 queries:
First the server counts the TotalDWR per site by running aquery a bit like this:
SELECT id, count(*) FROM customersite GROUP BY id

Then it runs another query to count the activedwr:
SELECT id, count(*) FROM customersite WHERE statusid = 1 GROUP BY id

Then it runs another query to count the completedwr:
SELECT id, count(*) FROM customersite WHERE statusid = 2 GROUP BY id

Then it links all the IDs together and delivers the result.


-
2) In a worse case scenario, it might run each query once per ID..
i.e. it selects all the rows out of CustomerSite, and then goes through and for each row, runs the following 3 queries:

SELECT count(*) FROM customersite WHERE id = <id of current row being processed>
SELECT count(*) FROM customersite WHERE id = <id of current row being processed> and Status = 1
SELECT count(*) FROM customersite WHERE id = <id of current row being processed> and Status = 2

If your table contains 1000 rows, then potentially, 3001 queries can be run


-

When optimising SQLs we try to do as much as possible in a single pass of the table, and nested correlated subqueries rarely involve a single pass.

The single pass solution looks like what I wrote above..

The server traverses the table just once, doing grouped sums

CASE WHEN test THEN do_if_true ELSE do_if_false END is pretty easy to understand

SUM( CASE WHEN status = 1 THEN 1 ELSE 0 ) as Count_of_Status_1

is easy too:

"If the status = 1, then put a 1, else put a 0. All the status 1 hence count for 1 and anything else counts as 0. When you SUM this column of 0 and 1, you get a number equal to the number of rows where status was 1"

Here is the result set:
status, "CASE WHEN status = 1 ..."
1, 1
2, 0
3, 0
1, 1
2, 0
1, 1
3, 0
2, 0

the sum of 1 + 0 + 0 + 1 + 0 + 1 + 0 + 0 = 3
the number of status 1s we had, was 3
 
Can I do a SUM(CASE...) FROM ... of two tables?

If you look at my query, the CustomerSiteName comes from the CustomerSite table, but the Sum(DWRNumber) is actually coming from the DWR table, with CustomerSiteID linking the 2 together.

I understand what you are saying, but I'm just missing that link. I "think" you are saying that the Sum(Case...) is done on the DWR table, how do I then relate the results back to the CustomerSite table?
 
ya, just do the join..

remember, databases operate on blocks of data, be they tables, views, subqueries or a joined combination of all these. Grouping, Counting ans Summing is done at the end..
 
its why I wrote ... in the FROM block of the original query.. by the time the db gets round to grouping and aggreagte functiuons, there is no such concept of "tables" any more..
 
one thing i have to ask.. is ID the primary key of both these tables? if so, why are they two separate tables?
 
urm, didn't see another reply to this post...whoops.

No the primary key isn't the same, CustomerID is primary for Customer, whilst foreign for both Contacts and Sites.

The reason I've done it that way is because a customer (top level) can have many sites, and a customer (top level) can have many contacts. I did have the contacts related to sites, but a contact can move about which caused dramas in the old system (lots of duplicate data in the contacts table...)
 
PS, I've gone through and made the necessary changes, but in your example
SUM( CASE WHEN status = 1 THEN 1 ELSE 0 ) as Count_of_Status_1

You missed off END :D

Kept getting a function error and saw what it was when I searched Google for SQL CASE(SUM) :)

thanks for the help, that seems to have done the same trick...Does this look better?????

VB.NET:
SELECT     CustomerSite.CustomerID, 
               COUNT(DWR.DWRNumber) AS TotalDWRs, 
               SUM(CASE WHEN StatusID = 1 THEN 1 ELSE 0 END) AS ActiveDWRs, 
               SUM(CASE WHEN StatusID = 0 THEN 1 ELSE 0 END) AS CompleteDWRs
 
FROM         CustomerSite INNER JOIN
                      DWR ON CustomerSite.CustomerSiteID = DWR.CustomerSiteID
 
WHERE     (CustomerSite.CustomerID = @CustomerID)
GROUP BY CustomerSite.CustomerID
 
ha, I now need to take this one stage further :(

Basically I need;

CustomerSiteName (From CustomerSite Table) - DONE
TotalDWRs for that site (From DWR Table) - DONE
ActiveDWRs for that site (From DWR Table WHERE StatusID = 1) - DONE
CompleteDWRs for that site (From DWR Table WHERE StatusID = 0) - DONE

...now need to add (from another table)
CountOfRequestsMade for that site (From SampleRequest Table)

The SampleRequest table is another section of my app, but uses the same customer information from the same customer tables.

So I need the grid to display all of the above.

Never used a table merge, would that be the best option to do here?
 
PS, I've gone through and made the necessary changes, but in your example


You missed off END :D
I usually do. Oops.

Its one of those things thats very easy to detect when running it.. Oracle says something like "Missing keyword" and highlights where END should occur..


VB.NET:
SELECT     CustomerSite.CustomerID, 
               COUNT(DWR.DWRNumber) AS TotalDWRs, 
               SUM(CASE WHEN StatusID = 1 THEN 1 ELSE 0 END) AS ActiveDWRs, 
               SUM(CASE WHEN StatusID = 0 THEN 1 ELSE 0 END) AS CompleteDWRs
 
FROM         CustomerSite INNER JOIN
                      DWR ON CustomerSite.CustomerSiteID = DWR.CustomerSiteID
 
WHERE     (CustomerSite.CustomerID = @CustomerID)
GROUP BY CustomerSite.CustomerID
Yup!
 
youll need to separately aggregate it and join it in...

VB.NET:
SELECT     CustomerSite.CustomerID, 
               COUNT(DWR.DWRNumber) AS TotalDWRs, 
               SUM(CASE WHEN StatusID = 1 THEN 1 ELSE 0 END) AS ActiveDWRs, 
               SUM(CASE WHEN StatusID = 0 THEN 1 ELSE 0 END) AS CompleteDWRs
[B]              SUM(reqs_agg.CountOfReqs) as countofreqs[/B]
 
FROM         CustomerSite INNER JOIN
                      DWR ON CustomerSite.CustomerSiteID = DWR.CustomerSiteID
 
[B]INNER JOIN[/B]
[B](SELECT siteID, count(*) as COuntOfReqs FROM samplerequests GROUP BY siteID) reqs_agg ON customersite.siteid = reqs_agg.siteID[/B]
 
WHERE     (CustomerSite.CustomerID = @CustomerID)
 
GROUP BY CustomerSite.CustomerID
 
because reqs are unrelated (by definition; they are not in the other tables so they are not 1:1 related..) they must be aggregated first to ensure siteid is unique. If any join causes duplication of rows then aggregates will be affected
 
Back
Top