Optimizing mysqli query for calculating Median using GROUP BY & ORDER BY
Optimizing mysqli query for calculating Median using GROUP BY & ORDER BY
Please see below for a sample portion of my MySQL table:
table name: eb_tickets
I want to output the response like so, calculating the median value with distinct values ASC:
Currently I am achieving this with the following query:
Unfortunately this query is taking around 1 second per record, and I am querying thousands of records with the possibility of 20 different values for the ticket_type column.
I am asking for help in seeing if it is possible to optimize my current query in attempt to make the query perform faster.
Thank you for your help!
1 Answer
1
Could you try like this ?
SELECT counter.ticket_type,AVG(time_first_response) AS median FROM
(
SELECT
IF(@type = type, @ctr := @ctr + 1, @ctr := 1) AS rownum,
@type := ticket_type AS ticket_type,
time_first_response
FROM eb_tickets
ORDER BY ticket_type,time_first_response
) AS counter,
(
SELECT ticket_type, COUNT(*) AS rows
FROM eb_tickets
GROUP BY ticket_type
) AS types
WHERE types.ticket_type = counter.ticket_type AND
CASE rows % 2
WHEN 1 THEN rownum IN (ROUND(rows / 2),ROUND(rows / 2))
ELSE rownum IN (ROUND(rows / 2),ROUND(rows / 2) + 1)
END
GROUP BY counter.ticket_type
Initially while I was experimenting on the SQL I used shorter names for the columns - type
instead of ticket_type
and the subquery counter
was left with the wrong column name type
instead of ticket_type
type
ticket_type
counter
type
ticket_type
I also edited
GROUP BY ticket_type
to be GROUP BY ticket_type,time_first_response
and now I am getting the full result set with close accuracy to my original query. Now I am not sure which query is accurate :/ However this saved a massive amount of time in the query. 118 seconds for 10 records VS 1 second for 10 records.– TaylorSmolik
Jun 28 at 17:02
GROUP BY ticket_type
GROUP BY ticket_type,time_first_response
Doing some testing to see which one is more accurate :)
– TaylorSmolik
Jun 28 at 17:09
Yours is more accurate and faster (with the above adjustments). Thank you sir!
– TaylorSmolik
Jun 28 at 17:18
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
I used counter.type instead instead of counter.ticket_type because I was getting an error about counter.ticket_type not existing, and it is only returning partial results (although the results are correct). So it is returning the ticket types and the correct medians, however it is not including all of the ticket types in the results. Was changing counter.ticket_type to counter.type the right move?
– TaylorSmolik
Jun 28 at 16:57