Creating a Varnish module
Besides being a rock-solid HTTP cache, Varnish is an invaluable source of architecture best practices. Here, I’m of course talking about Varnish 3, since it’s the first major version providing a modular architecture. So, let’s learn how to create a VMOD.
A word on Varnish’s architecture
Varnish’s famous design principle is “work with the kernel, not against it”. Several aspects of the architecture serve that goal. The child process (which does the caching) is isolated from the management process, which among other things compiles the VCL. System calls are kept to a minimum thanks to a workspace-oriented memory model for the threads processing requests. These are only examples and there is a lot more to Varnish, but this is what we need to understand in order to create a module.
Varnish is written in C, and so are VMODs. I am actually a Java developer, so I know what our usual readers might think about this. It isn’t that hard to switch from Java to C, and Varnish’s memory model actually has similarities with the Java Memory Model. You don’t even need to know C if you want to attend the VMOD training course.
The Varnish Configuration Language
If you know Varnish, you already know about the Varnish Configuration Language. As I mentioned earlier, VCL is compiled by the management process. Varnish is therefore configured with actual compiled code, which allows greater performance than an interpreted configuration. VCL is very limited as a programming language: you can’t write loops, for instance. This makes configuration limited, but also simpler and safer. You can inline C code in the VCL to overcome these limitations or, depending on your needs, create a VMOD.
Where to start?
Varnish Software provides a helloworld module with autotools configuration, an implementation, and a test case out of the box. The easiest way to bootstrap a VMOD is simply by forking libvmod-example.
git clone https://github.com/varnish/libvmod-example.git
It’s then easy to find which files you need to edit to rename the VMOD or the hello function.
cd libvmod-example
git grep -e example -e hello
Building the module
In order to build the module, you actually need to build Varnish from source. You can download a source distribution of Varnish or clone the git repository and follow the build instructions:
git clone https://github.com/varnish/Varnish-Cache.git
cd Varnish-Cache
git checkout varnish-3.0.2
./autogen.sh
./configure
make
Once Varnish is built, you can include its headers, link to its libraries and invoke varnishtest (yes, I’m talking about TDD). Building from source is required by the libvmod-example skeleton, but it isn’t mandatory in general[1].
Your module can be built with:
# go back to the libvmod-example repo
./autogen.sh
./configure VARNISHSRC=/path/to/varnish/sources
make
Developing the module
You only need two things to develop your module:
- declare the functions
- write the functions
You also need to know the mapping between C types and VCL types. This is documented here: https://www.varnish-cache.org/docs/3.0/reference/vmod.html#vcl-and-c-data-types.
Declaring the functions
This part is quite easy: you declare the module name, an initialization function and your module’s functions. In this case, we have a single function, hello, that takes a string and returns a new string. If you’re creating stateless functions, you don’t need to care about the initialization function.
Module example
Init init_function
Function STRING hello(STRING)
About the STRING type: please note that it is immutable. You are not supposed to edit a string, but rather to create a new one (that’s one thing I love about the java.lang.String class). It is mapped to const char *.
Implementing the functions
The hello function is very simple: hello("world") returns "Hello, world". If you know your standard C library, you can do that very easily:
const char *
vmod_hello(struct sess *sp, const char *name)
{
    unsigned length;
    char* result;

    /* strlen("Hello, ") + strlen(name) + trailing '\0' */
    length = 7 + strlen(name) + 1;
    result = malloc(length);
    if (result == NULL) {
        return NULL;
    }
    strcpy(result, "Hello, ");
    strcat(result, name);
    return result;
}
The problem here is that Varnish can’t know how or when to free the string. It is actually possible to return a manually allocated value along with a matching free function, but you don’t want to do that unless you’re dealing with a third-party API that cannot integrate with Varnish’s memory model.
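To make the ownership problem concrete, here is a small standalone sketch in plain C, with no Varnish involved. The names hello_malloc and hello_length are mine, purely for illustration. Whoever receives the pointer has to call free() at the right moment, and that is the step Varnish cannot perform on its own for a string returned this way.

```c
#include <stdlib.h>
#include <string.h>

/* Same idea as the malloc()-based hello above, outside of Varnish.
 * hello_malloc is a hypothetical name used for this sketch only. */
static char *
hello_malloc(const char *name)
{
    size_t length = strlen("Hello, ") + strlen(name) + 1; /* + '\0' */
    char *result = malloc(length);

    if (result == NULL)
        return NULL;
    strcpy(result, "Hello, ");
    strcat(result, name);
    return result;
}

/* A caller that owns the result must remember to release it. */
static size_t
hello_length(const char *name)
{
    char *greeting = hello_malloc(name);
    size_t n;

    if (greeting == NULL)
        return 0;
    n = strlen(greeting);
    free(greeting); /* the call nobody makes when the pointer goes back to VCL */
    return n;
}
```

In a regular C program the caller simply frees the buffer once it is done with it. A VMOD function hands its result back to generated code, which is why the rest of this article allocates from the workspace instead.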
The workspace memory model
In order to understand the underlying API of Varnish, I had to read the actual source code. Do I miss my usual javadoc? Sometimes, yes. Varnish’s code base is not easy, but the workspace API is not that hard to understand.
Headers to include
First of all, let’s take a look at the includes:
#include "vrt.h"
#include "bin/varnishd/cache.h"
#include "vcc_if.h"
The vrt.h header provides various VCL data structures and functions, such as regexp functions. In cache.h you will find many data structures and functions, including the workspace functions.
/* cache_ws.c */
void WS_Init(struct ws *ws, const char *id, void *space, unsigned len);
unsigned WS_Reserve(struct ws *ws, unsigned bytes);
void WS_Release(struct ws *ws, unsigned bytes);
void WS_ReleaseP(struct ws *ws, char *ptr);
void WS_Assert(const struct ws *ws);
void WS_Reset(struct ws *ws, char *p);
char *WS_Alloc(struct ws *ws, unsigned bytes);
char *WS_Dup(struct ws *ws, const char *);
char *WS_Snapshot(struct ws *ws);
unsigned WS_Free(const struct ws *ws);
The vcc_if.h header is generated as part of the build from your function declarations and contains the C declarations of your VMOD’s functions. This is how you know your functions’ signatures.
How it works
Every worker thread has its own workspace, where it can allocate at will in virtual memory (up to a maximum size, of course). Worker threads are the ones that receive and answer requests. The workspace is a “large” contiguous char array, described by the following structure:
struct ws {
    unsigned        magic;
#define WS_MAGIC    0x35fac554
    unsigned        overflow;   /* workspace overflowed */
    const char      *id;        /* identity */
    char            *s;         /* (S)tart of buffer */
    char            *f;         /* (F)ree pointer */
    char            *r;         /* (R)eserved length */
    char            *e;         /* (E)nd of buffer */
};
The magic field must contain the WS_MAGIC value to assert that we are pointing to an actual workspace (this is one of the sanity checks used by the workspace functions). The overflow field is a counter of buffer overflows (but an overflow won’t make you write past the end of the buffer, sanity checks again). The important part for a workspace user lies in the SFRE fields.
The start (s) and end (e) fields point to the actual start and theoretical end (remember this is virtual memory) of the char array. The free (f) field points to the currently available memory. It can be viewed as a head that moves forward every time memory is allocated. If you try to allocate too much memory, the free pointer would have to move past the end boundary, so the allocation fails instead. As for the reserved (r) field, it allows incremental allocation within the workspace. A reservation also locks all the workspace allocation functions until you actually release it, so you must make sure it always gets released!
A workspace allocation simply moves the free pointer…
…unless there isn’t enough free space in the workspace.
Many thanks to Guillaume Gaulard, who made my original illustrations look a lot better.
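As a rough illustration of the two cases described above, here is a toy workspace in plain C. This is not Varnish’s code: every toy_ name is hypothetical, the structure drops the id and overflow fields, and I ignore details such as alignment. It only shows the idea of moving the free pointer forward and refusing an allocation that would cross the end.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_WS_MAGIC 0x35fac554u

/* Simplified stand-in for struct ws (hypothetical, for illustration only). */
struct toy_ws {
    unsigned magic; /* sanity marker */
    char *s;        /* (S)tart of buffer */
    char *f;        /* (F)ree pointer */
    char *r;        /* (R)eserved end, NULL when nothing is reserved */
    char *e;        /* (E)nd of buffer */
};

static void
toy_ws_init(struct toy_ws *ws, char *space, size_t len)
{
    ws->magic = TOY_WS_MAGIC;
    ws->s = space;
    ws->f = space;
    ws->r = NULL;
    ws->e = space + len;
}

/* Number of bytes still available between f and e. */
static size_t
toy_ws_avail(const struct toy_ws *ws)
{
    return (size_t)(ws->e - ws->f);
}

/* Bump allocation: hand out the bytes at f, then move f forward. */
static char *
toy_ws_alloc(struct toy_ws *ws, size_t bytes)
{
    char *p;

    assert(ws->magic == TOY_WS_MAGIC); /* is this really a workspace? */
    assert(ws->r == NULL);             /* no reservation in progress */
    if (bytes > toy_ws_avail(ws))
        return NULL;                   /* not enough free space, f stays put */
    p = ws->f;
    ws->f += bytes;
    return p;
}
```

With a 16-byte buffer, allocating 10 bytes returns the start of the buffer and leaves 6 bytes available. A following request for 7 bytes returns NULL without moving the free pointer. Note that nothing is ever freed individually, which is what makes this kind of allocation so cheap.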
Using the Varnish API
It looks like we have to break the “hiding internals” principle in order to use the Varnish API. I actually read the whole workspace implementation (surprisingly, just a few lines of code) and learned quite a few tricks from it. So now we can get rid of the obvious memory leak we introduced earlier by simply replacing malloc with WS_Alloc:
const char *
vmod_hello(struct sess *sp, const char *name)
{
    unsigned length;
    char* result;

    /* strlen("Hello, ") + strlen(name) + trailing '\0' */
    length = 7 + strlen(name) + 1;
    result = WS_Alloc(sp->wrk->ws, length);
    if (result == NULL) {
        return NULL;
    }
    strcpy(result, "Hello, ");
    strcat(result, name);
    return result;
}
Why am I still not satisfied with that? It does a safe allocation in the workspace and produces the expected output, right? Let’s look at the VMOD’s real implementation:
const char *
vmod_hello(struct sess *sp, const char *name)
{
    char *p;
    unsigned u, v;

    u = WS_Reserve(sp->wrk->ws, 0); /* Reserve some work space */
    p = sp->wrk->ws->f;             /* Front of workspace area */
    v = snprintf(p, u, "Hello, %s", name);
    v++;
    if (v > u) {
        /* No space, reset and leave */
        WS_Release(sp->wrk->ws, 0);
        return (NULL);
    }
    /* Update work space with what we've used */
    WS_Release(sp->wrk->ws, v);
    return (p);
}
It uses the WS_Reserve function (a length of 0 means “reserve all the available space”) instead of WS_Alloc. This is interesting when you can’t predict the final size you need to allocate. If pre-computing the required space is costly, you might want to change your algorithm. Remember that a workspace allocation or reservation has constant time and space complexity (it is almost free). If you do exceed the end of the workspace, you’ll have wasted some CPU instructions writing to your workspace, but those few CPU cycles are not your main problem if you run into a workspace overflow.
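To see the reserve, write, release sequence in isolation, here is a minimal sketch using a toy workspace in plain C. Again, this is my own simplification and not Varnish’s implementation: the toy_ names are hypothetical, the reservation always takes all the remaining space, and the “lock” is modelled with a plain assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy workspace (hypothetical): only the S, F, R and E pointers. */
struct toy_ws {
    char *s, *f, *r, *e;
};

static void
toy_ws_init(struct toy_ws *ws, char *space, size_t len)
{
    ws->s = ws->f = space;
    ws->r = NULL;
    ws->e = space + len;
}

/* Reserve all the remaining space and return how many bytes that is. */
static size_t
toy_ws_reserve(struct toy_ws *ws)
{
    assert(ws->r == NULL); /* only one reservation at a time */
    ws->r = ws->e;
    return (size_t)(ws->r - ws->f);
}

/* Keep the first `bytes` of the reservation and unlock the workspace. */
static void
toy_ws_release(struct toy_ws *ws, size_t bytes)
{
    assert(ws->r != NULL);
    assert(bytes <= (size_t)(ws->r - ws->f));
    ws->f += bytes;
    ws->r = NULL;
}

/* Build "Hello, <name>" in place without pre-computing its length. */
static const char *
toy_hello(struct toy_ws *ws, const char *name)
{
    size_t avail = toy_ws_reserve(ws);
    char *p = ws->f;
    int n = snprintf(p, avail, "Hello, %s", name);

    if (n < 0 || (size_t)n + 1 > avail) {
        toy_ws_release(ws, 0); /* did not fit: give everything back */
        return NULL;
    }
    toy_ws_release(ws, (size_t)n + 1); /* keep the string and its '\0' */
    return p;
}
```

Every path through toy_hello ends with a release, including the failure path. That is the discipline the real API expects from you: a reservation left open blocks every later allocation in the same workspace.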
Conclusion
Writing a module for Varnish 3 could have been a real pain, but the libvmod-example module makes it possible to create a working project rapidly. The C language can be a barrier, but if you have a good knowledge of programming and know a little about how a CPU works, it is only a matter of learning the syntax and the API. The tooling is also a problem when you are not familiar with C programming in Unix environments (I am still using a plain text editor with syntax highlighting for source editing and bash for make usage). Consider the VMOD training course to learn more about Varnish hacking; we even offer training in French!
Last but not least, Varnish comes with a nice test framework. On top of the VCL, varnishtest uses another DSL called VTC (Varnish test case). You can do TDD with Varnish, and next time I’ll explain how to use it!
Notes
[1] For instance, openSUSE has a varnish-devel package which provides what’s needed to build a module, but then you have to rewrite your autotools configuration.