Select max value from column for every value in other two columns
Select max value from column for every value in other two columns
I'm working on a webapp that tracks tvshows, and I need to get all episodes id's that are season finales, which means, the highest episode number from all seasons, for all tvshows.
This is a simplified version of my "episodes" table.
id tvshow_id season epnum
---|-----------|--------|-------
1 | 1 | 1 | 1
2 | 1 | 1 | 2
3 | 1 | 1 | 3
4 | 1 | 2 | 1
5 | 1 | 2 | 2
6 | 2 | 1 | 1
7 | 2 | 1 | 2
8 | 2 | 1 | 3
9 | 2 | 1 | 4
10 | 2 | 2 | 1
11 | 2 | 2 | 2
The expect output:
id
---|
3 |
5 |
9 |
11 |
I've managed to get this working for the latest season but I can't make it work for all seasons.
I've also tried to take some ideas from this but I can't seem to find a way to add the tvshow_id in there.
I'm using Postgres v10
It's in the end of the question: "I'm using Postgres v10"
– nip
Jun 29 at 21:33
If you say you already got parts of it working, it'd be good to see your source code. It's easier to fix a small error than writing it all anew.
– Matthias Bö
Jun 29 at 21:34
I don't think it will be useful because its limited to the latest season and the tvshow_id is hardcoded, so the correct answer is unlikely to use any of this.. but here you go pastebin.com/DdVQMTx2
– nip
Jun 29 at 21:38
3 Answers
3
You can use the below SQL to get your result, using GROUP BY with sub-subquery as:
select id from tab_x
where (tvshow_id,season,epnum) in (
select tvshow_id,season,max(epnum)
from tab_x
group by tvshow_id,season)
SELECT Id from
(Select *, Row_number() over (partition by tvshow_id,season order by epnum desc) as ranking from tbl)c
Where ranking=1
How would this relate to @Vivek 's answer in terms of performance ? I got the same result (both output and operation time) and the test table that im using is quite small (~150 rows), so I dont have an idea.
– nip
Jun 29 at 21:49
@nip I'm not sure about PostgreS, but window functions are usually very performant in most databases I've used. Plus it keeps things set-based. Also,
WHERE x IN (1,2,3)
is pretty much WHERE x = 1 OR x=2 OR x=3
. For large lists, that can get pretty slow.– Shawn
Jun 29 at 22:25
WHERE x IN (1,2,3)
WHERE x = 1 OR x=2 OR x=3
@Shawn Well, now I'm concerced because this table is going to get somewhat big and I've added another WHERE IN to filter the tvshow_id's, since I only want the season finales from a certain number of tvshows. How would this work (the filtering) with this answer without using WHERE IN ? Is this even possible ?
– nip
Jun 29 at 23:21
@nip I'd use the
ROW_NUMBER()
method. Use Ajay Gupta's answer. postgresqltutorial.com/postgresql-row_number– Shawn
13 hours ago
ROW_NUMBER()
Below is the simple query to get desired result. Below query is also good in performance with help of using distinct on() clause
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
in which database?
– Vivek
Jun 29 at 21:32