Using Common Table Expressions for Temporary Tables in Elixir (Ecto)

Last updated on December 6, 2020

Ocassionally, when you've got lots of business logic defined, you may need to perform some heavy calculations outside of your main SQL query and then join back the calculated result set into your final query to perform some statistical final calculation. Usually, the calculated result set would be in the form of a list of maps or list of tuples.

Thankfully, we can use Ecto.Query.with_cte/3 to help with this. With the help of the PostgreSQL function unnest, we can interpolate arrays into the query while also defining the data type for that temporary column.

There are 3 main steps with this technique:

  1. Prepare the data into separate lists
  2. Create the CTE query
  3. Join on the CTE child query as a subquery in the main query

Step 1: Prepare the Data

We need to get the data into a format which we can then interpolate easily as lists. We also need to convert them to a data type that PostgreSql can understand. For example:

iex> data = [test: 1, id: 2] |> Enum.unzip()
iex> data
{[:test, :testing], [1, 2]}
iex> {string_col, int_col} = data
iex> string_col = Enum.map(string_col, &Atom.to_string/1)

iex> string_col
["test", "testing"]

iex> int_col
[1, 2]

Creating the Common Table Expression Ecto Query

Creating the query requires the use of fragments, as well as specifying the data type for each interpolated column. We will also need to provide the CTE with a name. Note that the name must be a compile-time string literal, as noted in the docs. This means that dynamic table names are not possible.

scores_query = with_cte("names", 
  as: fragment("""
    select name, val from unnest(?::text[], ?::integer[]) t (name, val)
    """,
    ^string_col,
    ^int_col
  )
)
|> select([n], %{name: n.name, val: n.val})

The fragment calls the unnest sql array function, and creates two columns that we can then name name and val. Within the fragment, we also select the name of the columns that we want.

Thereafter, we use an Ecto.Query.select/2 function to help make this query understandable to Ecto. This helps when we utilize this query in dynamic query compositions.

Joining on the CTE

Since we have created our CTE query in something that Ecto can understand, we can finally use it in our main query.

from(s in Student
  join: sc in subquery(scores_query),
  on: sc.name == s.name,
  select: {s.name, sc.val}
)
|> Repo.all()

In this scenario, the name column in the students table is the primary key, and we join on that to allow us to select the respective scores of each student.

Hope this helps!