Optimizing mysqli query for calculating Median using GROUP BY & ORDER BY


Optimizing mysqli query for calculating Median using GROUP BY & ORDER BY



Please see below for a sample portion of my MySQL table:



table name: eb_tickets



I want to output the response like so, calculating the median value with distinct values ASC:



Currently I am achieving this with the following query:



Unfortunately this query is taking around 1 second per record, and I am querying thousands of records with the possibility of 20 different values for the ticket_type column.



I am asking for help in seeing if it is possible to optimize my current query in attempt to make the query perform faster.



Thank you for your help!




1 Answer
1



Could you try like this ?


SELECT counter.ticket_type,AVG(time_first_response) AS median FROM
(
SELECT
IF(@type = type, @ctr := @ctr + 1, @ctr := 1) AS rownum,
@type := ticket_type AS ticket_type,
time_first_response
FROM eb_tickets
ORDER BY ticket_type,time_first_response
) AS counter,
(
SELECT ticket_type, COUNT(*) AS rows
FROM eb_tickets
GROUP BY ticket_type
) AS types
WHERE types.ticket_type = counter.ticket_type AND
CASE rows % 2
WHEN 1 THEN rownum IN (ROUND(rows / 2),ROUND(rows / 2))
ELSE rownum IN (ROUND(rows / 2),ROUND(rows / 2) + 1)
END
GROUP BY counter.ticket_type



Initially while I was experimenting on the SQL I used shorter names for the columns - type instead of ticket_type and the subquery counter was left with the wrong column name type instead of ticket_type


type


ticket_type


counter


type


ticket_type





I used counter.type instead instead of counter.ticket_type because I was getting an error about counter.ticket_type not existing, and it is only returning partial results (although the results are correct). So it is returning the ticket types and the correct medians, however it is not including all of the ticket types in the results. Was changing counter.ticket_type to counter.type the right move?
– TaylorSmolik
Jun 28 at 16:57





I also edited GROUP BY ticket_type to be GROUP BY ticket_type,time_first_response and now I am getting the full result set with close accuracy to my original query. Now I am not sure which query is accurate :/ However this saved a massive amount of time in the query. 118 seconds for 10 records VS 1 second for 10 records.
– TaylorSmolik
Jun 28 at 17:02


GROUP BY ticket_type


GROUP BY ticket_type,time_first_response





Doing some testing to see which one is more accurate :)
– TaylorSmolik
Jun 28 at 17:09





Yours is more accurate and faster (with the above adjustments). Thank you sir!
– TaylorSmolik
Jun 28 at 17:18






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Comments

Popular posts from this blog

paramiko-expect timeout is happening after executing the command

how to run turtle graphics in Colaboratory

Export result set on Dbeaver to CSV