Noah Watkins

Lua bindings for RADOS object class handlers

The Ceph distributed file system is built on top of a scalable object store called RADOS, which also serves as the basis for several other products, including RADOS Gateway and RBD. One feature of RADOS is the Object Class system, which lets developers define new object behavior by writing C++ plugins that execute within the context of the storage system nodes and operate on object data using arbitrary functions.

One downside of object classes is the way new functionality is injected into the system. A compiled C++ plugin must be delivered to, and dynamically loaded into, each OSD process. This becomes more complicated if a cluster is composed of multiple target architectures, and it makes updating functionality on the fly difficult.

One approach to addressing these problems is to embed a language runtime within the OSD and write handlers in an interpreted language. In this way new functionality can be easily injected into the system. Lua, with its runtime (either the standard Lua VM or LuaJIT), is considered to be among the fastest, most lightweight extension languages available. In the remainder of this post we show the design of Lua Object Class Handlers, which let programmers write new object class methods in Lua rather than C++.

Default Error Handling

When performing an I/O operation (e.g. reading an extent from an object) it is common to simply return the error to the caller, because there is little that can be done to recover from most errors. In the following object class handler snippet the goal is to create the object referenced by the operation, and to return an error if it already exists (i.e. exclusive creation). Note that the true parameter to cls_cxx_create requests exclusive creation semantics.

int handle(cls_method_context_t hctx, bufferlist *in, bufferlist *out)
{
  int ret = cls_cxx_create(hctx, true);
  if (ret < 0)
    return ret;
  ...
  return 0;
}

This is a common pattern. When a negative value is returned from an object class handler, the current transaction is aborted and the return value is passed back to the client. When the handler completes successfully, a return value of zero commits the transaction. In Lua this error handling pattern is fully managed by the bindings. In the previous example, if the object already exists, then the handler aborts and -EEXIST is returned automagically. The following handler named handle shows an example of this pattern in Lua. Note that the return statement won’t be reached if an exception is thrown in objclass.create.

function handle(input, output)
  objclass.create(true);
  return 0;
end

function handle2(input, output)
  objclass.create(true);
end

Explicitly returning zero is optional. A handler that returns without providing an explicit return value will default to the same behavior as if zero had been returned explicitly. The two handlers shown above have identical semantics.
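
To make the return-value convention concrete, here is a minimal C++ sketch of how a dispatch layer could implement it: a raised error becomes the handler's return value, a missing return value counts as zero, and a negative result discards the handler's changes. It is a self-contained model, not the actual Lua bindings or OSD code. HandlerError, Store, create_exclusive, and dispatch are hypothetical names, and committing on any non-negative return value is an assumption.

```cpp
#include <cassert>
#include <cerrno>
#include <functional>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for the error raised inside a binding such as
// objclass.create; it carries a negative errno-style code.
struct HandlerError {
  int code;
};

// Hypothetical in-memory object store: object name -> data.
using Store = std::map<std::string, std::string>;

// A handler mutates a staged copy and may return nothing (meaning success).
using Handler = std::function<std::optional<int>(Store&)>;

// Exclusive create: raise -EEXIST if the object is already present.
void create_exclusive(Store& staged, const std::string& name) {
  if (staged.count(name))
    throw HandlerError{-EEXIST};
  staged[name] = "";
}

// Run a handler against a staged copy of the store.
// Negative result: the copy is discarded (abort).
// Zero or missing return value: the copy is published (commit).
// Assumption: positive return values also commit.
int dispatch(Store& store, const Handler& handler) {
  assert(handler);  // an empty std::function cannot be invoked
  Store staged = store;
  int ret;
  try {
    ret = handler(staged).value_or(0);  // no explicit return means zero
  } catch (const HandlerError& e) {
    ret = e.code;  // the raised error becomes the return value
  }
  if (ret >= 0)
    store = staged;
  return ret;
}
```

Under these assumptions, dispatching a handler that calls create_exclusive twice on the same name succeeds the first time and yields -EEXIST the second time, leaving the store unchanged after the failed call.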

Explicit Error Handling

Some operations return error codes that we may want to handle directly. For example, when retrieving a value from the object map, ENOENT is used to indicate that the given key was not found. If the handler code can deal with this case (e.g. creating and initializing a new key), then it is simple enough to just return all other error codes.

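int handle(cls_method_context_t hctx, bufferlist *in, bufferlist *out)
{
  // decode the key from the input buffer
  string key;
  bufferlist::iterator it = in->begin();
  ::decode(key, it);

  // look up the key; ENOENT is handled below, other errors are returned
  bufferlist bl;
  int ret = cls_cxx_map_get_val(hctx, key, &bl);
  if (ret < 0 && ret != -ENOENT)
    return ret;
  if (ret == -ENOENT) {
    /* initialize new key */
  }
  ...
  return 0;
}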

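To capture the return value in Lua, we have to call the RADOS interface in protected mode (similar to catching an exception). This is done using the standard pcall function provided by Lua, which returns a status flag followed by either the results of the call (on success) or the error value (on failure).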

function handle(input, output)
  local key = input:str()
  -- pcall returns true and the value, or false and the error code
  local ok, ret_or_val = pcall(objclass.map_get_val, key)
  local val
  if ok then
    val = ret_or_val
  elseif ret_or_val ~= -objclass.ENOENT then
    return ret_or_val
  else
    -- initialize new key
  end
  ...
  return 0
end
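
For readers more at home in C++, the protected-call pattern used in the Lua handler above can be modelled roughly as follows. This is only an analogy under stated assumptions: HandlerError, OMap, Protected, protected_call, map_get_val, and get_or_init are hypothetical names for this sketch, the object map is a plain std::map, and the placeholder default "0" is invented for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <map>
#include <string>

// Hypothetical error type carrying a negative errno-style code.
struct HandlerError {
  int code;
};

// Hypothetical in-memory object map: key -> value.
using OMap = std::map<std::string, std::string>;

// Hypothetical binding: raise -ENOENT when the key is missing.
std::string map_get_val(const OMap& omap, const std::string& key) {
  auto it = omap.find(key);
  if (it == omap.end())
    throw HandlerError{-ENOENT};
  return it->second;
}

// Result of a protected call, mirroring what pcall hands back:
// a status flag plus either the value or the error code.
struct Protected {
  bool ok;
  int err;
  std::string val;
};

// Call f and convert a raised HandlerError into a returned status.
template <typename F>
Protected protected_call(F&& f) {
  try {
    return {true, 0, f()};
  } catch (const HandlerError& e) {
    return {false, e.code, {}};
  }
}

// Look up a key, initializing it when missing; any other error is returned.
int get_or_init(OMap& omap, const std::string& key, std::string& out) {
  Protected r = protected_call([&] { return map_get_val(omap, key); });
  if (!r.ok) {
    assert(r.err < 0);  // errors are expected to be negative codes
    if (r.err != -ENOENT)
      return r.err;
    omap[key] = "0";  // placeholder default, an assumption of this sketch
    r.val = omap[key];
  }
  out = r.val;
  return 0;
}
```

The design point is the same as in Lua: errors propagate by default, and a handler opts into inspecting them only at the call sites where it can actually recover.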

These two error handling patterns cover many cases, but there are still ways to improve the interface. Currently an error is logged in the OSD using a generic message from strerror. It would be useful to let the handler define context-dependent error messages that provide more information.
