SQL (Teradata) Help

leegbr · August 27, 2024

Hi,

I'm not a coder, I'll get that out the way quickly, I''m stuck on the below, any help appreciated.

Addresses are assigned a type. i.e. 1 - residential, 5 - correspondence, the system allows you to have multiple type 5 addresses, they can be the same actual address or different.

example

address type. address

5 1, smith street, se1 1se

5 24, John street, sw1 1sw

I'm trying to find records where ALL type 5 addresses are the same, I want to exclude any where one of them is different.

The main table has the type 1 addresses by default, table 2 allows you to select the type of address you want.

select *
 from customer
 inner join
 ( select addresses.*
 from addresses
 where addr_type = '5'
 ) as addr
 on (customer.id = addr.id
 and customer.pcode = addr.pcode)

The code above will pull all people who have a type 5 address where the post code matches the customer table post code, but it will also pull them through if they have another type 5 address that is completely different. So it's half working.

But how can I get it working so that the only results are where ALL of their type 5 addresses are the same?

I appreciate that this looks like it was written by an ape who was given a computer and you'd be near enough correct.

Eigenvektor · August 27, 2024

It sounds like you want GROUP BY to ensure identical entries are grouped. Then you want to add an additional filter that says HAVING COUNT(*) > 1, meaning you only get entries where the group contains more than one entry.

SELECT
    a.addr_type,
    c.address
    FROM customer c
    JOIN addresses a ON a.id = c.id AND a.pcode = c.pcode
    WHERE a.addr_type = '5'
    GROUP BY a.addr_type, c.address
    HAVING COUNT(*) > 1

Note: If a customer only has one type 5 address they are "all the same", but this query would exclude it. If you exclude the HAVING clause, you will get multiple entries per customer though, as long as they are unique and a single if they are all identical.

~edit: Can you provide more examples of the data? E.g. do the addresses that belong to the same customer all have the same Id, or a different one? Ideally provide examples of customers and addresses and what the expected output for them would be.

leegbr · August 27, 2024

1 hour ago, Eigenvektor said:
It sounds like you want GROUP BY to ensure identical entries are grouped. Then you want to add an additional filter that says HAVING COUNT(*) > 1, meaning you only get entries where the group contains more than one entry.
SELECT
    a.addr_type,
    c.address
    FROM customer c
    JOIN addresses a ON a.id = c.id AND a.pcode = c.pcode
    WHERE a.addr_type = '5'
    GROUP BY a.addr_type, c.address
    HAVING COUNT(*) > 1
Note: If a customer only has one type 5 address they are "all the same", but this query would exclude it. If you exclude the HAVING clause, you will get multiple entries per customer though, as long as they are unique and a single if they are all identical.

~edit: Can you provide more examples of the data? E.g. do the addresses that belong to the same customer all have the same Id, or a different one? Ideally provide examples of customers and addresses and what the expected output for them would be.

Cheers.

iirc every address should have its own ID, however I’ll have to check if the exact same address holds the same ID. The other problem is some addresses are the same , but formatted slightly different, which will create a new address ID.

Eigenvektor · August 27, 2024

3 minutes ago, leegbr said:

iirc every address should have its own ID, however I’ll have to check if the exact same address holds the same ID. The other problem is some addresses are the same , but formatted slightly different, which will create a new address ID.

If that's the case, then this join you had shouldn't work:

customer.id = addr.id

This join can only work if all addresses of this customer match its id, assuming a single customer can only exist once and you don't have duplicate customers as well as duplicate addresses.

Any chance you can provide realistic examples of the data you're working with (using anonymized addresses, of course)?

leegbr · August 27, 2024

2 hours ago, Eigenvektor said:
If that's the case, then this join you had shouldn't work:
customer.id = addr.id
This join can only work if all addresses of this customer match its id, assuming a single customer can only exist once and you don't have duplicate customers as well as duplicate addresses.

Any chance you can provide realistic examples of the data you're working with (using anonymized addresses, of course)?

The .id is the unique customer id. I do believe that each address entered on the system is given its own address id too, in the address table. The id is the primary and foreign key between the two tables. .

you’ve given me an idea to look at the actual address ID, I’ll have another go tomorrow.

cheers.

Eigenvektor · August 28, 2024

14 hours ago, leegbr said:

The id is the primary and foreign key between the two tables.

That wouldn't work. A primary key must be unique. If the primary key of the address table is also its foreign key (i.e. it matches customer id), then you can only ever have a 1:1 relationship.

The customer table should have an id, the address table should have both an id (its primary key) and a customer_id (foreign key that references a customer) if you want a 1:n relationship.

So either you have a 1:1 relationship, which means you can't have more than one type 5 address per customer and you really have both duplicate customers and addresses, or you have a 1:n relationship and you need to join on some other column. Or you added an id and didn't actually set it as a primary key, meaning it has no unique constraint.