4 posts tagged with "ecto"

Running Ecto Migrations on Startup with Elixir Releases

May 26, 2024 · 3 min read

All-round Awesome Dude

It has been a long time since my initial post on using Distillery to manage Ecto migrations on startup, and I'm super happy that the Elixir core team has worked on making deployments a breeze now.

The absolute simplest way to achieve migrations on startup is now as follows:

Write the migrator function (see the example below)
Add a startup script in the release overlays folder
Add the startup script to your Dockerfile CMD

Writing the Migrator Function

This is going to be largely lifted from the Phoenix documentation, but the core aspects that you need are all here:

defmodule MyApp.Release do
  @app :my_app

  def migrate do
    load_app()

    for repo <- repos() do
      {:ok, _, _} = Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, :up, all: true))
    end
  end

  defp repos do
    Application.fetch_env!(@app, :ecto_repos)
  end

  defp load_app do
    Application.load(@app)
  end
end

I have left out the rollback/2 function, but you can include it if you really think that you will need it. It's more likely that in reality you'll just add a new migration to fix a bad one, so it is up to personal preference.
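For reference, a rollback function along the lines of the one in the Phoenix release documentation might look like the sketch below. It is shown as a standalone module to keep it short; in practice the function would sit next to migrate/0, and it assumes the same :my_app application name and the ecto_sql dependency used above.

```elixir
defmodule MyApp.Release do
  @app :my_app

  # Rolls a single repo back down to the given migration version.
  def rollback(repo, version) do
    load_app()
    {:ok, _, _} = Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, :down, to: version))
  end

  # Loads the application so its configuration is available.
  defp load_app do
    Application.load(@app)
  end
end
```

It would be invoked through eval in the same way as migrate, passing a repo module and a migration version of your own (both depend on your project).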

Adding the Startup Script

With Elixir releases, we now have a convenient way to copy our files into our releases automatically, without having to do multiple COPY commands in our Dockerfiles. Neat! Anything to save a few lines of code!

Create your startup file here:

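#!/bin/sh
# rel/overlays/bin/startup.sh
./my_app eval "MyApp.Release.migrate"
# optionally, start the app
./my_app start

This assumes that the working directory is the release's bin directory, next to the my_app executable, which we will set in our Dockerfile. Because everything under rel/overlays is copied into the release with its directory structure intact, the script lives in rel/overlays/bin so that it lands in bin as well. It also needs to be executable (chmod +x). If you wish to couple the migrations together with the app startup, you can add the optional ./my_app start portion. However, you can also decouple it so that you don't end up in a bootloop in the event of a bad migration. As always, it really depends on your situation.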

And then in your release configuration:

# mix.exs
releases: [
    my_app: [
        include_executables_for: [:unix],
        # the important part!
        overlays: ["rel/overlays"],
        applications: [my_app: :permanent]
    ]
]

This will then copy your files under the rel/overlays directory over to the built release.

Add the Startup Script to Dockerfile

Let's run the startup script now from our Dockerfile like so:

WORKDIR ".../my_app/bin"
CMD ["./startup.sh"]

The three dots are for illustration purposes; adjust the WORKDIR to the actual directory that you copied your release binaries to. If you coupled your app startup together with the startup script, the above CMD will run the migrations and then start up the app.

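If you wish to decouple the migrations from the startup, you can do the following. Note that the exec (JSON array) form of CMD does not run through a shell, so the && has to be handed to sh -c explicitly:

WORKDIR "..."
CMD ["sh", "-c", "./startup.sh && ./my_app start"]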

Wrap Up

An important mention is that Phoenix now comes with mix phx.gen.release, which also has a --docker option for bootstrapping Docker-based release workflows. The migration files are also generated for you automatically. However, you wouldn't want to use the helper if you aren't doing any Phoenix stuff, and the above walkthrough will work for any generic Elixir release.

Thanks for reading!

Using Embedded Schemas for Easy Peasy Ecto Preloading

December 28, 2020 · 2 min read

Structs in Elixir are great: they let you define a data structure and do all sorts of nifty stuff like default values. But what if you want to use this particular struct inside of an Ecto query and then preload associations based on a given field?

An Example Problem

We have a struct called Deal, which is built dynamically from an SQL query. This means that there is no table associated with our Deal structs.

We initially define it as so:

defmodule MyApp.Deal do
    defstruct total_price: nil,
              product_id: nil,
              product: nil
end

The SQL query then populates the product_id field with the id of a product that is currently on sale, as so:

from(p in Products,
  ...
  select: %Deal{total_price: sum(p.price), product_id: p.id}
)
|> Repo.all()

If we were to query this, we would get back a Deal struct as so (for example):

# for product with id=5
[%Deal{total_price: 123.20, product_id: 5}]

All smooth sailing so far... or is it?

The Preload

What if we wanted to load our product information onto the struct? Could we perhaps use Repo.preload/3 to help?

from(p in Products,
  ...
  select: %Deal{total_price: sum(p.price), product_id: p.id}
)
|> Repo.all()
|> Repo.preload([:product])

Trying out this updated query will give us this error:

 function MyApp.Deal.__schema__/2 is undefined or private

D'oh! Seems like our Deal struct does not have the schema metadata that is used by Repo.preload/3. It seems like we'll have to ditch the struct and implement a full schema backed by a table...
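The missing metadata can be checked directly, without Ecto involved at all. The sketch below uses a hypothetical Sketch.PlainDeal module that mirrors the Deal struct and simply asks whether the reflection function named in the error exists.

```elixir
defmodule Sketch.PlainDeal do
  # A plain struct, mirroring the original Deal definition.
  defstruct total_price: nil, product_id: nil, product: nil
end

# defstruct only generates the struct helpers...
IO.inspect(function_exported?(Sketch.PlainDeal, :__struct__, 0))
# prints true

# ...but not the __schema__/2 reflection function that the error complains about.
IO.inspect(function_exported?(Sketch.PlainDeal, :__schema__, 2))
# prints false
```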

The Solution: Embedded Schemas To The Rescue

The post's title kind of gave it away, but we're going to use Ecto's embedded schemas to declare our schema without actually having a table backing our schema. This allows us to declare associations with other tables, and we can then use Repo.preload/3 to load these associations automatically for us! 🤯 I know right?

Refactoring our code for our Deal struct into an embedded schema gives us this:

defmodule MyApp.Deal do
  alias MyApp.Product
  use Ecto.Schema

  embedded_schema do
    field(:total_price, :float)
    belongs_to(:product, Product)
  end
end

Note that we don't have to specify both product_id and product fields, as they are automatically created with the Ecto.Schema.belongs_to/2 macro.

Now, preloading our product information works perfectly fine!

Using Common Table Expressions for Temporary Tables in Elixir (Ecto)

December 6, 2020 · 3 min read

Occasionally, when you've got lots of business logic defined, you may need to perform some heavy calculations outside of your main SQL query and then join the calculated result set back into your final query to perform a final statistical calculation. Usually, the calculated result set would be in the form of a list of maps or a list of tuples.

Thankfully, we can use Ecto.Query.with_cte/3 to help with this. With the help of the PostgreSQL function unnest, we can interpolate arrays into the query while also defining the data type for that temporary column.

There are 3 main steps with this technique:

Prepare the data into separate lists
Create the CTE query
Join on the CTE child query as a subquery in the main query

Step 1: Prepare the Data

We need to get the data into a format which we can then interpolate easily as lists. We also need to convert them to a data type that PostgreSQL can understand. For example:

iex> data = [test: 1, testing: 2] |> Enum.unzip()
iex> data
{[:test, :testing], [1, 2]}
iex> {string_col, int_col} = data
iex> string_col = Enum.map(string_col, &Atom.to_string/1)

iex> string_col
["test", "testing"]

iex> int_col
[1, 2]
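When the calculated result set is a list of maps rather than a keyword list, the same split can be done with one extra mapping step. This is a minimal sketch; the Sketch.Columns module and the :name and :score keys are made up for illustration.

```elixir
defmodule Sketch.Columns do
  # Splits a list of %{name: ..., score: ...} maps into two parallel lists,
  # one per column that will later be interpolated into the query.
  def split(rows) do
    rows
    |> Enum.map(fn %{name: name, score: score} -> {name, score} end)
    |> Enum.unzip()
  end
end

Sketch.Columns.split([%{name: "alice", score: 90}, %{name: "bob", score: 75}])
# {["alice", "bob"], [90, 75]}
```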

Creating the Common Table Expression Ecto Query

Creating the query requires the use of fragments, as well as specifying the data type for each interpolated column. We will also need to provide the CTE with a name. Note that the name must be a compile-time string literal, as noted in the docs. This means that dynamic table names are not possible.

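# with_cte/3 takes the query as its first argument, so we start from the CTE's name as the source
scores_query =
  "names"
  |> with_cte("names",
    as:
      fragment(
        """
        select name, val from unnest(?::text[], ?::integer[]) t (name, val)
        """,
        ^string_col,
        ^int_col
      )
  )
  |> select([n], %{name: n.name, val: n.val})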

The fragment calls the unnest SQL array function, which expands the two arrays into two columns that we alias as name and val. Within the fragment, we also select the columns that we want.

Thereafter, we use Ecto.Query.select/3 to give the query a shape that Ecto understands. This helps when we utilize this query in dynamic query compositions.
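To picture the rows that unnest produces from the two arrays, here is a rough plain-Elixir analogy using Enum.zip/2. It is only an illustration of the resulting shape (the Sketch.Unnest module is hypothetical) and it assumes both lists have the same length: PostgreSQL pads the shorter array with NULLs, while Enum.zip/2 stops at the shorter list.

```elixir
defmodule Sketch.Unnest do
  # Pairs the nth name with the nth value, one map per resulting "row".
  def rows(names, vals) do
    names
    |> Enum.zip(vals)
    |> Enum.map(fn {name, val} -> %{name: name, val: val} end)
  end
end

Sketch.Unnest.rows(["test", "testing"], [1, 2])
# [%{name: "test", val: 1}, %{name: "testing", val: 2}]
```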

Joining on the CTE

Since we have created our CTE query in something that Ecto can understand, we can finally use it in our main query.

from(s in Student,
  join: sc in subquery(scores_query),
  on: sc.name == s.name,
  select: {s.name, sc.val}
)
|> Repo.all()

In this scenario, the name column in the students table is the primary key, and we join on that to allow us to select the respective scores of each student.

Hope this helps!

Using Recursive Functions for Ecto Query Building

September 16, 2020 · 6 min read

Composability is the name of the game when it comes to writing Ecto queries. With such a beautifully designed DSL given to us, we should make full use of its design to build our queries.

A Brief on Recursive Functions

Recursive functions are functions that call themselves. There is the risk of functions becoming infinite loops if they are not given exit conditions.

Here's an example:

def eat(food) when food == "biscuits" do
   eat(food, "water") 
end

def eat(food) do
   eat(food, nil) 
end

# return the final digestible result
def eat(food, drink) do
   {food, drink}
end

In the example, we keep calling the eat function until we end up with a digestible form of input: a food and a drink.
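For a function that calls itself on a smaller input each time, the exit condition is usually its own clause. A minimal sketch, using a hypothetical Sketch.Recursion module that sums a list:

```elixir
defmodule Sketch.Recursion do
  # Exit condition: an empty list sums to 0, so the recursion stops here.
  def sum([]), do: 0

  # Recursive step: add the head, then call sum/1 again on the shorter tail.
  def sum([head | tail]), do: head + sum(tail)
end

Sketch.Recursion.sum([1, 2, 3])
# 6
```

Without the first clause, the call on the empty list would have no matching clause and would raise instead of returning.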

Building Queries Based on Flags

When we want to build a query, the simplest way (as an end user) would be to use flags. This allows the user to describe the type of data that they want returned.

For example, if I wanted to specify a filter, all I would need to do is add an option where_id: 5 to filter the query to all records where the id is 5.

Hence, ideally, we should interface with our function like so:

...
food = list_food(is_dry: true, origin: "us")

Using `Enum.reduce/3` for a Naive Implementation

Utilizing Enum.reduce/3 is an extremely simple way to build up your query, as it will iterate over all your options and call a function for each option.

def list_food(opts \\ []) do
   # we give some default options, in this case we limit the output to 5 by default
   opts = Enum.into(opts, %{limit: 5})
   base_query = from(f in Food)
   
   # the base query is the accumulator, and we constantly call the function for each option pair 
   Enum.reduce(opts, base_query, &build_query/2)
   |> Repo.all()
end

# the accumulator is always the second argument
# the map key-value pairs are passed as tuples
# runtime values must be pinned with ^ inside query expressions
def build_query({:limit, value}, query), do: limit(query, ^value)

# to filter by origin, a string column
def build_query({:origin, loc}, query) when is_binary(loc), do: where(query,[f], f.origin == ^loc)

# to filter by dryness
def build_query({:is_dry, true}, query), do: where(query,[f], f.type == "dry")

# this is for unrecognized options
def build_query(_, query), do: query

This implementation will work for simple situations, but what if we have an option that depends on another option? Or what if we need to access multiple options at once?

Ah, these issues are not so simple to solve when using Enum.reduce/3 as the backbone for our recursive query building.
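To make the limitation concrete, here is a sketch of the same reduce pattern without Ecto, collecting plain tuples instead of query clauses. The Sketch.NaiveBuilder module and its clause names are made up for illustration. Each call to step/2 sees exactly one {key, value} pair plus the accumulator, and the order in which a map hands out its pairs is not something we control.

```elixir
defmodule Sketch.NaiveBuilder do
  # Merges the defaults, then folds every {key, value} pair into a list of clauses.
  def build(opts) do
    opts
    |> Enum.into(%{limit: 5})
    |> Enum.reduce([], &step/2)
  end

  # Each clause only knows about its own option and the accumulator.
  defp step({:limit, n}, clauses), do: [{:limit, n} | clauses]
  defp step({:origin, loc}, clauses) when is_binary(loc), do: [{:where_origin, loc} | clauses]

  # Unrecognized options are skipped.
  defp step(_unknown, clauses), do: clauses
end

# Returns the :limit and :where_origin clauses; their order depends on map iteration.
Sketch.NaiveBuilder.build(origin: "us", colour: "red")
```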

What other methods can we use, then, for managing the recursive nature of our function? Why, the head-tail recursion technique, of course!

Head-Tail Recursion Implementation

Let's re-implement the function, but this time, addressing our new list of concerns.

This function needs to:

access all options at the same time
control the option execution order

# assumes `import Ecto.Query` in the enclosing module
def list_food(opts \\ []) do
   # our defaults
   opts = Enum.into(opts, %{limit: 5})

   from(f in Food)
   |> build_query(opts)
   |> Repo.all()
end

# we expect the 2 arity function to always receive the options as a map.
# We then convert the map to a list of keys, and use it as the 3rd parameter
def build_query(query, opts), do: build_query(query, opts, Map.keys(opts))

# match for the :limit option
def build_query(q, %{limit: value} = opts, [:limit | t]) do
  limit(q, ^value)
  |> build_query(opts, t)
end

# match for the :origin option
def build_query(q, %{origin: loc} = opts, [:origin | t]) when is_binary(loc) do
  where(q, [f], f.origin == ^loc)
  |> build_query(opts, t)
end

# match for the :is_dry option
def build_query(q, %{is_dry: true} = opts, [:is_dry | t]) do
  where(q, [f], f.type == "dry")
  |> build_query(opts, t)
end

# control the option execution stack as needed
# to process this option, we need to have a join with brands first
def build_query(q, %{country: iso} = opts, [:country | t]) when is_binary(iso) do
  if has_named_binding?(q, :brands) do
    # we utilize the named binding to filter by the country's ISO abbreviation
    where(q, [f, brands: b], b.iso == ^iso)
    |> build_query(opts, t)
  else
    # add the :brands key to the front of the stack, then add country again, then the remaining tail end.
    build_query(q, opts, [:brands, :country] ++ t)
  end
end

# we add this join on demand, as not every query needs it
def build_query(q, opts, [:brands | t]) do
   join(q, :left, [f], b in Brands, on: f.brand_id == b.id, as: :brands)
   |> build_query(opts, t)
end

# this is for unrecognized options, we skip over it
def build_query(q, opts, [_ | t]), do: build_query(q, opts, t)

# no more options to process, let's exit the function now
def build_query(q, _, []), do: q

Let's break this down:

We call the function with a map of our options
We convert these options into a list of keys, and pass it as the 3rd parameter of our recursive function.
For each option handler, we match on our required option keys in the map (on the 2nd parameter), and on the head of the list. Essentially, we are popping off an option from the stack of option keys we need to process.
If an option has prerequisites, we can add to or reorder the option stack to ensure the prerequisite option is processed before the current one (see the :country option handler and how it checks for the :brands join before adding a where clause to the query).
Call the function again with the tail-end of the option stack, ensuring that each option in the stack is processed.
Exit the recursive function when all options are processed and an empty list is remaining.

Although more verbose, this gives you the ultimate control over the flow of query building, as well as allowing you full access to all options at the same time.
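The control flow in the steps above can be seen in isolation with a sketch that records plain "steps" instead of building an Ecto query. Everything here (the Sketch.OptionStack module, the :join_brands and :where_country markers) is hypothetical. Only the shape of the recursion is the point: the :country handler pushes :brands onto the front of the stack when its prerequisite has not run yet.

```elixir
defmodule Sketch.OptionStack do
  # Entry point: turn the option map into a stack of keys to process.
  def plan(opts) when is_map(opts), do: plan([], opts, Map.keys(opts))

  # :country needs the brands join first. If it is missing, push :brands in
  # front of :country and try again.
  defp plan(steps, %{country: iso} = opts, [:country | rest]) do
    if :join_brands in steps do
      plan(steps ++ [{:where_country, iso}], opts, rest)
    else
      plan(steps, opts, [:brands, :country | rest])
    end
  end

  # The join is only added on demand.
  defp plan(steps, opts, [:brands | rest]), do: plan(steps ++ [:join_brands], opts, rest)

  # Unrecognized options are skipped.
  defp plan(steps, opts, [_unknown | rest]), do: plan(steps, opts, rest)

  # Empty stack: nothing left to process, so return what was collected.
  defp plan(steps, _opts, []), do: steps
end

Sketch.OptionStack.plan(%{country: "SG"})
# [:join_brands, {:where_country, "SG"}]
```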

Other Benefits

Besides composability at outer layers of your application, you can also utilize this to build your subqueries easily.

For example, you can use this technique to build queries for other tables, then use the resultant query in a subquery.

A super simplified example:

...
food_query =
  from(f in Food)
  # build_query/2 expects the options as a map
  |> Foods.build_query(%{is_dry: true})

# to get all dry restaurant food
from(r in RestrauntFoods,
  join: f in subquery(food_query),
  on: r.food_id == f.id
)
|> Repo.all()
...

I usually use the above in situations where the subqueries are complex aggregations and would benefit from the re-usability.

The head-tail recursion method is very common in Elixir, and having it in your Ecto query building toolbox would save you from re-writing many queries from scratch.
