Jump to content

Why can't I GROUP BY emp_name only in this query? But why am I able to GROUP BY orderid alone?

shivajikobardan

I'm on Windows 10, MySQL Workbench. I'm on MySQL 8.0.

Here's the dataset. northwind.sql.

https://pastebin.com/bMgjXvfT

Objective:

Write a query to get the order ID, customer's name, grand total of each order, and the name of the employee who handled each order. See below the expected first four rows.

Output should look like this.

image.thumb.png.e0d3f59fd08dbd9ea6bbc8eca022b427.png

This is the database schema diagram.

https://brucebauer.info/assets/ITEC3610/Northwind/Northwind-Sample-Database-Diagram.pdf

It's northwind.sql database.

This is my query.

SELECT 

    o.orderid,

    c.contactname,

    SUM(od.unitprice * od.quantity),

    CONCAT(e.lastname, ' ', e.firstname) AS emp_name

FROM

    orders o

        INNER JOIN

    customers c ON o.customerid = c.customerid

        INNER JOIN

    orderdetails od ON o.orderid = od.OrderID

        INNER JOIN

    employees e ON o.EmployeeID = e.EmployeeID

GROUP BY emp_name

ORDER BY orderid

LIMIT 4;

But it was producing an error.


Error Code: 1055. Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'northwind.o.OrderID' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

But if I order by orderid alone, it'll work, Why does it work? Should not it be producing the same error as above as not all non-aggregated query aren't present in group by clause ?

What's the computer science(Database concept) behind all this? Can anyone explain it? 

 

Link to comment
Share on other sites

Link to post
Share on other sites

If you want to GROUP BY, then every column in your select list has to either be part of the GROUP BY, or use an aggregate function, such as SUM, AVG, MIN, MAX, …

 

If you select a column that isn't grouped, then which value should be displayed?

LastName | FirstName
--------------------
Smith    | Jack
Smith    | Jane

 

Let's say you select from this table by using the following query

SELECT LastName, FirstName
    FROM Person
    GROUP BY LastName

 

Both entries have the same last name, so both records should obviously be grouped together. But we now have two first names, Jack and Jane. Which one should be displayed? To fix this you'll need to either add FirstName into the group by (making both entries distinct again), or use something like MIN(FirstName), though that generally doesn't make much sense.

 

Alternatively you need to remove the GROUP BY. In that case you'll get both entries returned, as there's no more ambiguity in what should be selected/returned.

 

Note that (older versions?) of MariaDB/MySQL actually let you write such broken select statements. They'll simply pick whatever value they first encounter for the column that isn't grouped. I would call it bad practice to use it, since the query won't work on any other database and the value is effectively random and therefore useless.

 

So in your particular case, you'll need to rewrite your query as

SELECT 
    o.orderid,
    c.contactname,
    SUM(od.unitprice * od.quantity),
    CONCAT(e.lastname, ' ', e.firstname) AS emp_name
FROM orders o
  INNER JOIN customers c ON o.customerid = c.customerid
  INNER JOIN orderdetails od ON o.orderid = od.OrderID
  INNER JOIN employees e ON o.EmployeeID = e.EmployeeID
    GROUP BY o.orderid, c.contactname, emp_name
    ORDER BY o.orderid
    LIMIT 4;

—or— remove the GROUP BY, so that you get all records as they are

Remember to either quote or @mention others, so they are notified of your reply

Link to comment
Share on other sites

Link to post
Share on other sites

That's not my question. The query works even if I don't use c.contactname, emp_name to group by. It works if I only use o.orderid. That's my question

Link to comment
Share on other sites

Link to post
Share on other sites

12 hours ago, shivajikobardan said:

That's not my question. The query works even if I don't use c.contactname, emp_name to group by. It works if I only use o.orderid. That's my question

See my note:

Note that (older versions?) of MariaDB/MySQL actually let you write such broken select statements. They'll simply pick whatever value they first encounter for the column that isn't grouped. I would call it bad practice to use it, since the query won't work on any other database and the value is effectively random and therefore useless.

Remember to either quote or @mention others, so they are notified of your reply

Link to comment
Share on other sites

Link to post
Share on other sites

13 hours ago, shivajikobardan said:

That's not my question. The query works even if I don't use c.contactname, emp_name to group by. It works if I only use o.orderid. That's my question

Well I mean he pretty much did answer your question by explaining how it works, ultimately he was explaining how group by works because it was a necessity to express what was going wrong.

 

Similar to asking why a matrix A * B = I, and B * A = I, but C * D = E but not D * C = E...essentially you just got lucky that it worked correctly in one case but not the other.

 

If you must know why your exact one appeared to work, and I really couldn't bother looking at the database to check, my guess is it that each order only has a singular employee, so effectively the group by means nothing in that case so there aren't any ambiguities that it needs to resolve since it is a 1 to 1.  Compared to grouping by employee name where they could have multiple orders and you now have a many to 1 situation (so it can't pick, so it just complains)

3735928559 - Beware of the dead beef

Link to comment
Share on other sites

Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×