Translating Data Types Between Elixir and Python (when using Erlport/Export)

Last updated on October 31, 2020

As explained in my original post on connecting Python with Elixir, there are some data translation issues that we will have to deal with when interfacing between the two languages.

Thankfully, the Erlport documentation kindly provides a full table of the data type mappings, which we can now use to make our lives easier.

Translating On The Python Side

When calling Python from Elixir, we pass our arguments like so:

    # the py variable refers to the python process reference (pid)
    result = Python.call(py, "my_module", "my_func", [arg1, arg2])

arg1 and arg2 are positional arguments. On the Python side, they are passed to the my_func function defined in the my_module Python module. Take a few minutes to read the setup post if this is unfamiliar to you.

However, the data types get mapped to the custom erlport classes Atom, Map, and List, which are rather cumbersome to deal with when we want to use native Python functions on our arguments.

As such, we'll have to write a translation function that converts the data types accordingly:


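from erlport.erlterms import Atom, Map, List
import codecs

def translate(target):
    # Recursively convert erlport terms into native Python values.
    if isinstance(target, List):
        # An empty List simply yields an empty list.
        return [translate(i) for i in target]
    elif isinstance(target, Map):
        res = dict(target)
        new = {}
        for k, v in res.items():
            # Keys are erlport terms too, so translate them as well.
            new[translate(k)] = translate(v)
        return new
    elif isinstance(target, Atom):
        return codecs.decode(target)
    elif isinstance(target, bytes):
        # Binaries arrive as bytes; decode them to str (UTF-8 by default).
        return codecs.decode(target)
    else:
        # Everything else is returned as-is. This includes str, because
        # codecs.decode() raises a TypeError on a Python 3 str.
        return target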

Let's break this function down:

  1. We import the main classes that we want to convert from the erlport.erlterms module.
  2. We use a chain of if/elif checks to convert each class conditionally. If there is nested data, we recursively call the translate function until all data types are converted.
    • The List class is translated with a list comprehension.
    • The Map class is converted to a dict first, but its keys and values are still erlport terms (atoms, for example), hence we need to translate those too.
    • The Atom class is converted to a string directly. Atoms and binary strings arrive from erlport as bytes, so we use codecs.decode() to decode them into a Python string (UTF-8 is the default encoding).

To translate our arguments, all we need to do is pass each one to the translate function and we're done!

def my_func(arg1, arg2):
    translated1 = translate(arg1)
    translated2 = translate(arg2)
    pass
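
If many functions are exposed to Elixir, calling translate by hand in each one gets repetitive. Here is a minimal sketch of one way around that, using a decorator. The names translating, decode_bytes, and greet are made up for this example, and decode_bytes is only a stand-in so the snippet runs without erlport installed; in real code you would pass in the translate function defined above.

```python
import functools


def translating(translate_fn):
    """Build a decorator that runs translate_fn over every positional argument."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            return func(*(translate_fn(arg) for arg in args))
        return wrapper
    return decorator


def decode_bytes(value):
    # Stand-in translator for this demo only: decodes bytes, leaves the rest.
    return value.decode("utf-8") if isinstance(value, bytes) else value


@translating(decode_bytes)
def greet(name, punctuation):
    # By the time we get here, both arguments are native Python strings.
    return "hello " + name + punctuation


print(greet(b"world", b"!"))  # hello world!
```

Only positional arguments are handled, which matches the argument list passed from the Elixir side.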

Note that this translation function does not handle improper lists or tuples.
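
Tuples are not hard to add, though. The sketch below extends the same idea with a tuple branch. To keep it runnable without erlport, it defines stand-in Atom, Map, and List classes, on the assumption that the real ones in erlport.erlterms subclass bytes, dict, and list respectively; swap in the real import when using it. Improper lists are still left out.

```python
import codecs


# Stand-ins for the erlport.erlterms classes (assumed to subclass these builtins).
class Atom(bytes):
    pass


class Map(dict):
    pass


class List(list):
    pass


def translate(target):
    if isinstance(target, tuple):
        # Erlang tuples arrive as Python tuples; translate each element.
        return tuple(translate(item) for item in target)
    if isinstance(target, List):
        return [translate(item) for item in target]
    if isinstance(target, Map):
        return {translate(k): translate(v) for k, v in target.items()}
    if isinstance(target, bytes):
        # Covers binaries and, through the stand-in, atoms as well.
        return codecs.decode(target)
    return target


payload = (Atom(b"ok"), Map({Atom(b"ids"): List([1, 2, 3]), b"name": b"caf\xc3\xa9"}))
result = translate(payload)
print(result)  # ('ok', {'ids': [1, 2, 3], 'name': 'café'})
```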

Translating On The Elixir Side

When we receive our result from our python call, there are a few issues:

  1. Map keys are charlists
  2. String values are charlists
  3. Instead of translating Python's None to Elixir's nil, we get :undefined.

Here's my take on the data type translation for the above issues:


  def translate_from_python(:undefined), do: nil

  def translate_from_python(%{} = target) do
    for {k, v} <- target, into: %{} do
      {List.to_string(k), translate_from_python(v)}
    end
  end

  def translate_from_python(target) when is_list(target) do
    if Enum.all?(target, &is_integer/1) do
      List.to_string(target)
    else
      for v <- target, into: [], do: translate_from_python(v)
    end
  end

  def translate_from_python(v), do: v

Let's break it down:

  1. The first function definition catches any values that are :undefined and returns nil.
  2. The second function definition converts all maps into string-keyed maps, and recursively calls the translation function for the map values.
  3. The next function catches all list data types. Lists can either be charlists or lists of other data types. Charlists are simply lists of integer code points representing characters. Hence, before converting a charlist into a string, we check that the list is made up entirely of integers.
  4. The final function returns every other value unchanged, as those data types are of no concern to us.

A limitation of this method is that when you expect the result to contain a list of integers (for example, a list of ids) such as [97, 97], it will be converted into "aa" instead. For the same reason, an empty list comes back as an empty string, since Enum.all?/2 returns true for an empty list. This cannot be avoided on the Elixir side, as there is simply no canonical way to determine whether a list is a charlist. is_charlist?/1 would definitely be a good addition to the list of guards in the Kernel module.
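
One possible way to sidestep the ambiguity is to deal with it before the data leaves Python. The sketch below assumes, going by the Erlport data type mapping table, that Python bytes are sent to Elixir as binaries (which are ordinary Elixir strings) while Python str is sent as a list of integers. The helper name to_elixir is made up for this example. If you go this route, the Elixir translation function would need adjusting, since map keys would then arrive as binaries rather than charlists and the integer-list check would no longer be needed.

```python
def to_elixir(value):
    """Recursively encode str values as UTF-8 bytes before returning to Elixir."""
    if isinstance(value, str):
        return value.encode("utf-8")
    if isinstance(value, dict):
        return {to_elixir(k): to_elixir(v) for k, v in value.items()}
    if isinstance(value, list):
        return [to_elixir(item) for item in value]
    if isinstance(value, tuple):
        return tuple(to_elixir(item) for item in value)
    # Integers, floats, None and anything else pass through untouched.
    return value


prepared = to_elixir({"name": "aa", "ids": [97, 97]})
print(prepared)  # {b'name': b'aa', b'ids': [97, 97]}
```

The string "aa" and the id list [97, 97] now leave Python as different types, so nothing has to be guessed on the receiving side.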