Retrieving Records with Max Values using TOP

I know I know, it’s a horrible title. If anyone has a better suggestion please let me know!

Here’s what I’m talking about:

I have two tables; the first table contains unique records describing games and the second contains game ids, scores, and user ids. Game id is joining the first table to the second in a one to many, something like this:

Games
======
gameID
gameName

GameScores
=========
gameID
userID
score

MY MISSION: Retrieve the user with the high score for each game in one query.

After much trial, error, and gnashing of teeth, the solution, as usual, proved to be ridiculously simple:


SELECT (select top 1 userID from GameScores where gameID = Games.gameID order by score desc) as topScoreUser,
                Games.gameName,Games.gameID, s.score
FROM Games
JOIN   GameScores s on s.gameID = Games.gameID

What this does is return one record for each game. Since I’m only returning the top record in the query and I’m ordering by score in descending order, I’m getting the top score and player for each game.

I’ve run into quite a few instances now where TOP has proved invaluable. Unfortunately it always seems to take me an hour of futzing around before I remember to use it!

-rG

2 thoughts on “Retrieving Records with Max Values using TOP

  1. Understand though that using correlated subqueries like that can be quite inefficient, depending on the database platform, schema, and the amount of data. Correlated subqueries are evaluated once for every matching record. Which means that the subquery can potentially run for every cartesian product created in the join between GameScores and Games. Different databases have different approaches to trying to optimize this, but it is definitely something to be aware of.

    Consider this alternative:

    select gs.scoreID, v.maxScore, g.gameName
    from
    GameScores gs
    inner join
    (
    select
    max(score) maxScore,
    gameID
    from
    GameScores
    group by
    gameID
    ) v
    on v.maxScore = gs.score and v.gameID = gs.gameID
    inner join
    Games g
    on gs.gameID = g.gameID

    This should give the same results, but it uses an inline view that has no correlation to the rest of the query, which means that it will be more efficient than a correlated subquery.

    Note that the original query you posted also suffers from another flaw: what happens if there are two records that tie for the high score for a game? Using Top 1 will essentially drop one of the two records. Probably not a big deal for high scores for a game, but if you were calculating which sales rep had the highest sales for the quarter this might be a much bigger issue. 😉

Comments are closed.