Fl4m3Ph03n1x

Fl4m3Ph03n1x

Ecto multiple streams in 1 transaction

Background

PS: the following situation describes an hypothetical scenario, where I own a company that sells things to customers.

I have an Ecto query that is so big, that my machine cannot handle it. With billions of results returned, there is probably not enough RAM in the world that can handle it.

The solution here (or so my research indicates) is to use streams. Streams were made for potentially infinite sets of results, which would fit my use case.

https://hexdocs.pm/ecto/Ecto.Repo.html#c:stream/2

Problem

So lets imagine that I want to delete All users that bought a given item. Maybe that item was not really legal in their country, and now me, the poor guy in IT, has to fix things so the world doesn’t come down crashing.

Naive way:

item_id = "123asdasd123"

purchase_ids =
      Purchases
      |> where([p], p.item_id == ^item_id)
      |> select([p], p.id)
      |> Repo.all()

Users
    |> where([u], u.purchase_id in ^purchase_ids)
    |> Repo.delete_all()

This is the naive way. I call it naive, because of 2 issues:

  • We have so many purchases, that the machine’s memory will overflow (looking at purchase_ids query)
  • purchase_ids will likely have more than 100K ids, so the second query (where we delete things) will fail as it hits Postgres parameters limit of 32K: https://stackoverflow.com/a/42251312/1337392

What can I say, our product is highly addictive and very well priced!
Our customers simply cant get enough of it. Don’t know why. Nope. No reason comes to mind. None at all.

With these problems in mind, I cannot help my customers and grow my empire, I mean, little home owned business.

I did find this possible solution:

Stream way:

item_id = "123asdasd123"

purchase_ids =
      Purchases
      |> where([p], p.item_id == ^item_id)
      |> select([p], p.id)

stream = Repo.stream(purchase_ids)

Repo.transacion(fn -> 
  ids = Enum.to_list(stream)

  Users
    |> where([u], u.purchase_id in ^ids)
    |> Repo.delete_all()
end)

Questions

However, I am not convinced this will work:

  • I am using Enum.to_list and saving everything into a variable, placing everything into memory again. So I am not gaining any advantage by using Repo.stream.
  • I still have too many ids for my Repo.delete_all to work without blowing up

I guess the one advantage here is that this now a transaction, so either everything goes or nothing goes.

So, the following questions arise:

  • How do I properly make use of streams in this scenario?
  • Can I delete items by streaming parameters (ids) or do I have to manually batch them?
  • Can I stream ids to Repo.delete_all ?

Marked As Solved

Fl4m3Ph03n1x

Fl4m3Ph03n1x

Every question post created here creates an entry in a dedicated thread in the official forum iirc. Nonetheless, I still post my questions in both places. And when I find an answer, I add it to both places as well.

I do this mainly for visibility, both for the community, and for the question itself, although the later one is less impactful due to the mentioned DevChat thread the official forum has.

Solutions

In regards to the question, there are two possible solutions.

One suggested by benwilson:

query = from u in Users,
  join: p in assoc(u, :purchase),
  where: p.item_id == ^item_id

Repo.delete_all(query)

And the other by Aleksei Matiushkin:

Repo.transacion(fn ->
  max_rows = 500

  purchase_ids
  |> Repo.stream(max_rows: max_rows)
  |> Stream.chunk_every(max_rows)
  |> Stream.each(fn ids ->
     Users
     |> where([u], u.purchase_id in ^ids)
     |> Repo.delete_all()
  end)
  |> Stream.run()
end, timeout: :infinity)

My pick

The first solution is great, but it requires the User Schema to have a belongs_to :purchase, Purchase definition in its schema. Unfortunately for me, this was a deal breaker, since changing any schemas in the project where I am working in is either not allowed or would result in a lengthy approval process.

So I went with the second solution that is self contained. It requires no changes to any schemas and it can work with the data as is.

Also Liked

jaeyson

jaeyson

hi @Fl4m3Ph03n1x, this might be spammy but, have you tried to ask this via ElixirForums or slack? so other people can see this.

Where Next?

Popular Backend topics Top

dimitarvp
Hey everyone, I resumed work on my Elixir <=> SQLite library (which uses a Rust NIF underneath) and I am in a need of small and we...
New
Jsdr3398
Are there any databases that require no setup (can be shipped in a small zip together with the project)?
New
Fl4m3Ph03n1x
Background I am trying to up my Functional Programming (FP) skills and one of the things that newcomers first learn in FP is the Option T...
New
Fl4m3Ph03n1x
Background I am trying out polymorphic typing with dialyzer. As an example I am using the famous Option type (aka, Maybe Monad) that is n...
New
Fl4m3Ph03n1x
Background I am trying to find a cheap and easy way to create New Types in Elixir, and Records seem to be just what I would need. Probl...
New
Fl4m3Ph03n1x
Background I have recently been delving into more functional code. My objective right now is to get something similar to the IO Monad (in...
New
Fl4m3Ph03n1x
Background I am moving towards defined data structures in my application, and I find that TypedStruct is quite useful. Questions Howeve...
New
Fl4m3Ph03n1x
Background When trying to execute mix release on a Windows 11 machine for a Phoenix project I get the following error: * assembling mark...
New
pillaiindu
What is the difference between using :references and :belongs_to in the following command? bin/rails generate scaffold LineItem product:...
New
apoorv-2204
Anyone know how to get in golang? I am from elixir background?.
New

Other popular topics Top

wolf4earth
@AstonJ prompted me to open this topic after I mentioned in the lockdown thread how I started to do a lot more for my fitness. https://f...
New
AstonJ
We have a thread about the keyboards we have, but what about nice keyboards we come across that we want? If you have seen any that look n...
New
AstonJ
I’ve been hearing quite a lot of comments relating to the sound of a keyboard, with one of the most desirable of these called ‘thock’, he...
New
AstonJ
Thanks to @foxtrottwist’s and @Tomas’s posts in this thread: Poll: Which code editor do you use? I bought Onivim! :nerd_face: https://on...
New
AstonJ
In case anyone else is wondering why Ruby 3 doesn’t show when you do asdf list-all ruby :man_facepalming: do this first: asdf plugin-upd...
New
DevotionGeo
The V Programming Language Simple language for building maintainable programs V is already mentioned couple of times in the forum, but I...
New
AstonJ
Continuing the discussion from Thinking about learning Crystal, let’s discuss - I was wondering which languages don’t GC - maybe we can c...
New
PragmaticBookshelf
Author Spotlight Jamis Buck @jamis This month, we have the pleasure of spotlighting author Jamis Buck, who has written Mazes for Prog...
New
Help
I am trying to crate a game for the Nintendo switch, I wanted to use Java as I am comfortable with that programming language. Can you use...
New
New