Fl4m3Ph03n1x

How to get the top X results of a given category using Ecto?

Background

I have two queries that each return a colossal amount of data. I cannot use Repo.all, as doing so would materialize the results in memory, and the machine would quickly run out of it.

So I am trying to push as much work as I can to the PostgreSQL database and have it do the heavy lifting.

My issue starts with 2 queries.

This one counts fruits and veggies and aggregates everything into a neat map:

all_counts =
      table_A
      |> join(:left, [item_A], item_B in table_B,
        on:
          item_A.home_id == item_B.home_id and
            item_A.path == item_B.path
      )
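      # The bindings must match the names used in the join above.
      |> select([item_A, item_B], %{
        path: item_A.path,
        item_fruits_count: coalesce(item_A.fruits, 0),
        item_veggies_count: coalesce(item_B.veggies, 0),
        # Exposed as home_id so the later join on c.home_id can find it.
        home_id: item_A.home_id
      })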
      |> subquery()

The second one joins two tables as well (items and photos), with nothing fancy:

file_info =
      table_C
      |> join(:inner, [item], file in table_D,
        on:
          item.id == file.item_id and not file.deleted
      )
      |> select([item, file], %{
        item_id: item.id,
        home_id: item.home_id,
        path: item.path,
        photo_key: file.photo_key
      })
      |> subquery()

Problem

Now the problem is that I need to merge these 2 together.
At first, one would think to do something like this:

result =
      all_counts
      |> join(:inner, [c], f in ^file_info, on: c.home_id == f.home_id and c.path == f.path)
      |> select([c, f], %{
        item_id: f.item_id,
        home_id: f.home_id,
        path: f.path,
        photo_key: f.photo_key,
        # ... you get the idea
      })
      |> Repo.all()

But this creates an issue: it returns so much data that the machines run out of memory.

Approach

The approach I am using to solve this problem is to group items by home_id and path (since that pair is unique for each destination) and then return only the portion of the data I need, let's say the top 3 items ordered by id.

Here is where my difficulties begin.
I cannot write PostgreSQL queries directly; I must use Ecto (for reasons beyond this post).

Normally I would use row_number(), either inside a CTE or inside a derived table:

With a CTE:

 WITH cte AS
  ( SELECT name, value,
           ROW_NUMBER() OVER (PARTITION BY name
                              ORDER BY value DESC
                             )
             AS rn
    FROM t
  )
SELECT name, value, rn
FROM cte
WHERE rn <= 3
ORDER BY name, rn ;

With a derived table:

SELECT name, value, rn
FROM 
  ( SELECT name, value,
           ROW_NUMBER() OVER (PARTITION BY name
                              ORDER BY value DESC
                             )
             AS rn
    FROM t
  ) tmp 
WHERE rn <= 3
ORDER BY name, rn ;

However, I am not familiar enough with Ecto to know how to use them.

With CTEs, I understand I should avoid them, as they serve no purpose in Ecto:
https://hexdocs.pm/ecto/Ecto.Query.html#with_cte/3

With row_number() I would need to partition by both home_id and path (2 fields) instead of one:
https://hexdocs.pm/ecto/Ecto.Query.WindowAPI.html#row_number/0

Question

How do I return only the top 3 results per group, grouping by home_id and path and ordering by item_id, using Ecto?

2023-02-13 11:04:13 UTC

Marked As Solved

Fl4m3Ph03n1x

Solution(s)

From what I gathered, there are two decent possible solutions to this conundrum.

`row_number` + `over`

One of them is using row_number() with over() from Ecto:
https://hexdocs.pm/ecto/Ecto.Query.WindowAPI.html#row_number/0

Assuming file_info and all_counts have already been joined into a single query, file_info_with_counts, I can then apply the technique from the SO post mentioned in my question:

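ranked =
  file_info_with_counts
  |> select([fi], %{
    # Number the rows within each (home_id, path) group, lowest item_id first.
    rn: over(row_number(), partition_by: [fi.home_id, fi.path], order_by: [asc: fi.item_id]),
    item_id: fi.item_id
    # you get the idea ...
  })
  |> subquery()

# A window result can only be filtered from an outer query, hence the subquery.
ranked |> where([c], c.rn <= 3) |> Repo.all() |> IO.inspect()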

Which prints what I wanted.
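
For completeness, here is a minimal sketch of the whole pipeline in one place. It assumes the two queries from the question are passed in, that the counts query exposes a home_id field, and that both are safe to wrap in subquery/1. The module and function names (TopPerGroup, top_n/3) are hypothetical and only there for illustration.

```elixir
defmodule TopPerGroup do
  import Ecto.Query

  # Hypothetical helper: joins the two queries from the question, numbers the
  # rows inside each (home_id, path) group and keeps the first `n` of each group.
  def top_n(all_counts, file_info, n) do
    ranked =
      from c in subquery(all_counts),
        join: f in subquery(file_info),
        on: c.home_id == f.home_id and c.path == f.path,
        select: %{
          item_id: f.item_id,
          home_id: f.home_id,
          path: f.path,
          photo_key: f.photo_key,
          # Numbering restarts at 1 for every (home_id, path) pair.
          rn:
            over(row_number(),
              partition_by: [f.home_id, f.path],
              order_by: [asc: f.item_id]
            )
        }

    # The row number cannot be filtered in the same SELECT that computes it,
    # so the filter is applied on top of a subquery.
    from r in subquery(ranked),
      where: r.rn <= ^n,
      order_by: [r.home_id, r.path, r.rn]
  end
end
```

Calling TopPerGroup.top_n(all_counts, file_info, 3) only builds the query. Nothing reaches the database until the result is handed to the repo, so the amount of data loaded stays bounded by 3 rows per group.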

Lateral inner joins

However, as mentioned by some people in the community, this solution is rather old, and these days lateral joins seem to also cover this use case.

tops =
  from top in "file_info_with_counts",
    where: top.home_id == parent_as(:parent).home_id,
    where: top.path == parent_as(:parent).path,
    order_by: [asc: top.item_id],
    limit: 3,
    select: %{item_id: top.item_id}

# One row per (home_id, path) pair, so the lateral subquery runs once per group.
groups =
  from g in "file_info_with_counts",
    distinct: true,
    select: %{home_id: g.home_id, path: g.path}

from parent in subquery(groups),
  as: :parent,
  inner_lateral_join: top in subquery(tops),
  on: true,
  select: %{home_id: parent.home_id, path: parent.path, item_id: top.item_id}

This solution is not without merit; however, given my familiarity with row_number, I opted for that one instead.

Unless there is a considerable performance difference between the two in favor of lateral joins, I will keep the previous solution.
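
One way to check whether such a performance difference exists is to compare the query plans PostgreSQL produces for both approaches. The sketch below assumes a repo backed by an Ecto SQL adapter, which exposes explain/3. The module name PlanCompare and its function are hypothetical, and the two query arguments stand for the queries built by each solution above.

```elixir
defmodule PlanCompare do
  # Hypothetical helper: `repo` is any Ecto SQL repo module (for example MyApp.Repo);
  # the two query arguments are the row_number and lateral join queries.
  def print_plans(repo, row_number_query, lateral_query) do
    for {label, query} <- [row_number: row_number_query, lateral: lateral_query] do
      # analyze: true asks PostgreSQL to actually run the query and report timings.
      plan = repo.explain(:all, query, analyze: true)
      IO.puts("== #{label} ==\n#{plan}")
    end

    :ok
  end
end
```

Because analyze: true executes each query, it is best run against a dataset of realistic size, where the difference between the two plans, if any, becomes visible.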

For more info, here is the original source where I got these answers:

Post #3

Also Liked

andrea

Will Ecto be able to support NoSQL databases in the future?

Post #4

Fl4m3Ph03n1x

While I am not an expert, I believe there is a package called mongodb_ecto:

So if you use MongoDB, this might be your thing.

Post #5
