Nuxtstop

For all things nuxt.js

Digging into Postgresql and DEV

42 7

Earlier I was reviewing the draft Community Wellness badge pull request; and with my head deep in SQL these days, I thought I’d give a go at crafting a query to create this logic.

The following query finds the user IDs and weeks since today in which a user has written at least two comments that don’t have a negative moderator reaction.

  • user_id: The user’s database ID.
  • weeks_ago: The number of weeks (since today) in which we’re grouping comments
  • number_of_comments_with_positive_reaction: How many positive reaction comments did they have for the weeks_ago.
SELECT user_id,
    COUNT(user_id) as number_of_comments_with_positive_reaction,
    /* Get the number of weeks, since today for posts */
    (trunc((extract(epoch FROM (current_timestamp- created_at))) / 604800)) AS weeks_ago
FROM comments
/* Only select comments from the last 32 weeks that
   don't have a negative moderator reaction */
INNER JOIN
    /* Find all comments in the last 32 weeks */
    (SELECT DISTINCT reactable_id
    FROM reactions
    WHERE reactable_type = 'Comment'
    AND created_at > (now() - interval '224' day)
    /* Omit any comments that got a negative moderator reaction */
    EXCEPT
      SELECT DISTINCT reactable_id
      FROM reactions
      WHERE reactable_type = 'Comment'
      AND created_at > (now() - interval '224' day)
      AND category IN ('thumbsdown', 'vomit')) AS positve_reactions
    ON comments.id = positve_reactions.reactable_id
INNER JOIN
    /* Find the users who have at least two comments in the last week */
    (SELECT count(id) AS number_of_comments,
        user_id AS comment_counts_user_id
      FROM comments
      WHERE created_at >= (now() - interval '7' day)
      GROUP BY user_id) AS comment_counts
      ON comments.user_id = comment_counts_user_id
      AND comment_counts.number_of_comments > 1
/* Don’t select anything older than 224 days ago, or 32 weeks ago */
WHERE created_at > (now() - interval '224' day)
GROUP BY user_id, weeks_ago
Enter fullscreen mode Exit fullscreen mode

The above query creates multiple rows per user id. Which is fine, but if you want to loop through things, you’ll need to bust out some temporary variable magic.

I was wondering if I'd be able to get this down to one query. With the help of some folks at Forem, I wrote the following query aggregates that information for you; but you need to do some assembly work.

The columns are:

  • user_id: The user’s database ID.
  • serialized_weeks_ago: A comma separated list of the weeks in which we had comments.
  • weeks_ago_array: An array of integers that is the non-string representation of serialized_weeks_ago; we want to see how ActiveRecord handles this array of integers. It's a the simpler version of the serialized_weeks_ago.
  • serialized_comment_counts: A comma separated list of the number of comments.

The first number of the serialized_weeks_ago maps to the first number of the serialized_comment_counts. And you get one row per user.

SELECT user_id,
       /* A comma separated string of "weeks_ago" */
       array_to_string(array_agg(weeks_ago), ',') AS serialized_weeks_ago,
       /* Will active record convert this to an array of integers? */
       array_agg(weeks_ago) AS weeks_ago_array,
       /* A comma separated string of comment counts.  The first value in this string
       happens on the week that is the first value in serialized_weeks_ago */
       array_to_string(array_agg(number_of_comments_with_positive_reaction), ',') AS serialized_comment_counts
FROM
/* This is the same query as the first example query */
(SELECT user_id,
    COUNT(user_id) as number_of_comments_with_positive_reaction,
    /* Get the number of weeks, since today for posts */
    (trunc((extract(epoch FROM (current_timestamp- created_at))) / 604800)) AS weeks_ago
FROM comments
/* Only select comments from the last 32 weeks that
   don't have a negative moderator reaction */
INNER JOIN
    /* Find all comments in the last 32 weeks */
    (SELECT DISTINCT reactable_id
    FROM reactions
    WHERE reactable_type = 'Comment'
    AND created_at > (now() - interval '224' day)
    /* Omit any comments that got a negative moderator reaction */
    EXCEPT
      SELECT DISTINCT reactable_id
      FROM reactions
      WHERE reactable_type = 'Comment'
      AND created_at > (now() - interval '224' day)
      AND category IN ('thumbsdown', 'vomit')) AS positve_reactions
    ON comments.id = positve_reactions.reactable_id
INNER JOIN
    /* Find the users who have at least two comments in the last week */
    (SELECT count(id) AS number_of_comments,
        user_id AS comment_counts_user_id
      FROM comments
      WHERE created_at >= (now() - interval '7' day)
      GROUP BY user_id) AS comment_counts
      ON comments.user_id = comment_counts_user_id
      AND comment_counts.number_of_comments > 1
/* Don’t select anything older than 224 days ago, or 32 weeks ago */
WHERE created_at > (now() - interval '224' day)
GROUP BY user_id, weeks_ago
) AS user_comment_counts_by_week GROUP BY user_id
Enter fullscreen mode Exit fullscreen mode

I am eager to share these Postgresql approaches as they can help circumvent running lots of smaller queries. I also had the chance to pair up with two folks to make sure we wrote the correct logic and it was performant enough.