Tuesday, July 8, 2014

ORDER OF EXECUTION OF NGINX FILTER MODULES



HOW NGINX DETERMINES THE ORDER OF INVOCATION OF NGINX FILTER MODULES

As the use of nginx grows, people are extending its capabilities by adding a variety of modules. As Emiller's guide to writing nginx modules states, nginx modules can take on three roles:


  1. handler, which handles the HTTP request,
  2. filter, which processes the HTTP response, and
  3. load balancer, which chooses the back-end server to send the HTTP request to.



In this article, we will discuss the order of execution of nginx filters.

Sometimes, when writing a new filter, you need to know where it sits in the order of execution of the modules. For example, you may wish to always process decompressed response chunks. For that, you need to be sure that the nginx gunzip body filter decompresses each response chunk before your module processes it.

Order of Execution of Nginx Modules

The order of filters is derived from the order of execution of nginx modules. The order of execution of nginx modules is implemented within the file auto/modules in the nginx source code.

 If you look inside the file, you will see that nginx divides the modules into various classes. For example, the statements

modules="$CORE_MODULES $EVENT_MODULES"
if [ $HTTP = YES ]; then
    modules="$modules $HTTP_MODULES $HTTP_FILTER_MODULES \
             $HTTP_HEADERS_FILTER_MODULE \
             $HTTP_AUX_FILTER_MODULES \
             $HTTP_COPY_FILTER_MODULE \
             $HTTP_RANGE_BODY_FILTER_MODULE \
             $HTTP_NOT_MODIFIED_FILTER_MODULE"
fi
show that nginx has module classes such as CORE_MODULES, EVENT_MODULES, HTTP_MODULES, HTTP_FILTER_MODULES, HTTP_AUX_FILTER_MODULES, etc.
The classes that have the term FILTER in their names hold the filter modules, and these are what we discuss from now on.

For example, the gunzip filter module is classified under HTTP_FILTER_MODULES, while the Lua module is classified under HTTP_AUX_FILTER_MODULES.
Each of these classes of filters is then invoked in a particular order, determined by the listing in the auto/modules code above.
Within each class, the relative order of the filters is determined by the order of their appearance in the file.

Now we need to figure out how the modules are invoked. The file auto/modules puts the list of modules in the $modules variable. The code at the bottom of this file (reproduced below) then generates the ngx_modules.c file. Nginx then invokes the filter modules starting from the bottom of the list in ngx_modules.c; the next section explains why.

cat << END                                    > $NGX_MODULES_C
#include <ngx_config.h>
#include <ngx_core.h>

$NGX_PRAGMA
END

for mod in $modules
do
    echo "extern ngx_module_t  $mod;"         >> $NGX_MODULES_C
done
echo                                          >> $NGX_MODULES_C
echo 'ngx_module_t *ngx_modules[] = {'        >> $NGX_MODULES_C
for mod in $modules
do
    echo "    &$mod,"                         >> $NGX_MODULES_C
done
cat << END                                    >> $NGX_MODULES_C
    NULL
};
END
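
To make the shape of the generated file concrete, here is a minimal, self-contained sketch of what ngx_modules.c amounts to. It rests on several assumptions: ngx_module_t is replaced by a stand-in struct that holds only a name, the modules are defined in the same file instead of being declared extern, and the four modules listed are only an illustrative subset of a real build. The helper module_index is hypothetical and is not part of nginx.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for nginx's ngx_module_t, reduced to a name for illustration. */
typedef struct {
    const char *name;
} ngx_module_t;

/* In the generated file these are "extern" declarations; they are defined
 * here only so that the sketch compiles on its own. */
ngx_module_t ngx_core_module = { "ngx_core_module" };
ngx_module_t ngx_http_module = { "ngx_http_module" };
ngx_module_t ngx_http_write_filter_module = { "ngx_http_write_filter_module" };
ngx_module_t ngx_http_header_filter_module = { "ngx_http_header_filter_module" };

/* Same shape as the array emitted by the second "for mod in $modules" loop:
 * one pointer per module, in list order, terminated by NULL. */
ngx_module_t *ngx_modules[] = {
    &ngx_core_module,
    &ngx_http_module,
    &ngx_http_write_filter_module,
    &ngx_http_header_filter_module,
    NULL
};

/* Hypothetical helper: walk the NULL-terminated array and return the
 * position of the named module, or -1 if it is not in the list. */
int module_index(const char *name)
{
    int i;

    for (i = 0; ngx_modules[i] != NULL; i++) {
        if (strcmp(ngx_modules[i]->name, name) == 0) {
            return i;
        }
    }
    return -1;
}
```

Comparing the positions of two filters in this array tells you which one is initialized first, and therefore which one ends up lower in the calling stack.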

Building call hierarchy for filter modules

Nginx builds the call hierarchy for the filter modules at configuration time.

For this, it performs the following steps:

  1. Initialise a global pointer to the top of the calling stack. Let’s call it ngx_top_filter.
  2. Each filter module maintains a local variable that holds the address of the next filter to be invoked. Let’s call this variable ngx_next_filter.
  3. At configuration time, nginx invokes the init function of each filter module in turn, in the order in which the modules are listed in ngx_modules.c:

      1. The init function of module 1 is invoked first. It sets ngx_next_filter = NULL and ngx_top_filter = module1.
         The call hierarchy looks like this: ngx_top_filter -> module 1

      2. The init function of module 2 is invoked next. It sets ngx_next_filter = ngx_top_filter (i.e. module1) and then ngx_top_filter = module2, so module 2 places itself on top of the calling stack with module 1 after it.
         The call hierarchy looks like this: ngx_top_filter -> module 2 -> module 1

      3. When the init function of module 3 is invoked, it places itself on top of the calling stack, with module 2 in the middle and module 1 at the bottom.
         The call hierarchy looks like this: ngx_top_filter -> module 3 -> module 2 -> module 1

Thus the module whose init function is invoked first ends up at the bottom of the calling stack, while the module whose init function is invoked last ends up at the top.
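
The steps above can be sketched as a small, self-contained C program. This is a simulation under stated assumptions, not nginx code: the filter_pt type, the record helper and the build_and_run function are hypothetical, and the filters only write their names into a trace string. In nginx each module lives in its own source file and can use the same name for its static "next" variable; here the three modules share one file, so the variables are named module1_next, module2_next and module3_next.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for nginx's filter function pointer type. */
typedef int (*filter_pt)(char *trace, size_t size);

/* Global pointer to the top of the calling stack. */
static filter_pt ngx_top_filter = NULL;

/* Append a module name to the trace so the call order can be inspected. */
static void record(char *trace, size_t size, const char *name)
{
    size_t used = strlen(trace);

    if (used < size) {
        snprintf(trace + used, size - used, "%s%s", used ? " " : "", name);
    }
}

/* Module 1: initialized first, so it ends up at the bottom of the stack. */
static filter_pt module1_next;

static int module1_filter(char *trace, size_t size)
{
    record(trace, size, "module1");
    return module1_next ? module1_next(trace, size) : 0;
}

static void module1_init(void)
{
    module1_next = ngx_top_filter;      /* NULL: nothing was registered yet */
    ngx_top_filter = module1_filter;
}

/* Module 2: saves the current top (module 1) and places itself on top. */
static filter_pt module2_next;

static int module2_filter(char *trace, size_t size)
{
    record(trace, size, "module2");
    return module2_next ? module2_next(trace, size) : 0;
}

static void module2_init(void)
{
    module2_next = ngx_top_filter;
    ngx_top_filter = module2_filter;
}

/* Module 3: initialized last, so it ends up at the top of the stack. */
static filter_pt module3_next;

static int module3_filter(char *trace, size_t size)
{
    record(trace, size, "module3");
    return module3_next ? module3_next(trace, size) : 0;
}

static void module3_init(void)
{
    module3_next = ngx_top_filter;
    ngx_top_filter = module3_filter;
}

/* Initialise the modules in list order, then run the chain from the top. */
int build_and_run(char *trace, size_t size)
{
    ngx_top_filter = NULL;
    trace[0] = '\0';

    module1_init();
    module2_init();
    module3_init();

    return ngx_top_filter(trace, size);
}
```

Calling build_and_run with a buffer of 64 bytes leaves the names in the trace in the order module3, module2, module1, which is the reverse of the initialization order.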


Run time invocation

Once the calling stack has been built at configuration time, each filter is expected to do its processing and then invoke the next filter through its ngx_next_filter variable.

Additional implementation note

Nginx actually maintains two calling hierarchies: one for the header filters and the other for the body filters.

Typically, a header filter processes the HTTP response headers, while a body filter processes the HTTP response body. Accordingly, the variables that you will see in the nginx source are ngx_http_top_header_filter and ngx_http_next_header_filter for headers, and ngx_http_top_body_filter and ngx_http_next_body_filter for the body.

Most of the filters available today stick to the aforementioned convention of invoking the next header filter; however, nothing obliges a filter writer to follow this rule.

For example, a filter can choose not to invoke the next header filter in line under certain situations. This is done, for example, by ModSecurity. Breaking the convention is usually not a good idea, but if you are using a number of third-party filters and getting unexpected results, you may wish to check the calling hierarchy using a debugger or the nginx logs.
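
The effect of a filter that does not call the next one can be illustrated with a minimal sketch. Everything in it is hypothetical: the request_t type, the filter names and run_header_chain are made up for illustration, and the code is not taken from ModSecurity or from nginx. It only shows that filters lower in the stack never see a response that a filter above them chose to stop.

```c
/* Hypothetical request type: just enough state to observe the chain. */
typedef struct {
    int blocked;      /* set when the guard filter should stop the response */
    int reached_end;  /* set by the last filter if the chain ran to the end */
} request_t;

typedef int (*header_filter_pt)(request_t *r);

/* Stand-in for nginx's global pointer to the top header filter. */
static header_filter_pt top_header_filter;

/* Bottom of the chain: records that the response got this far. */
static int last_header_filter(request_t *r)
{
    r->reached_end = 1;
    return 0;
}

/* The guard filter's saved pointer to the next filter in line. */
static header_filter_pt guard_next_header_filter;

/* Guard filter: breaks the convention by returning without calling next. */
static int guard_header_filter(request_t *r)
{
    if (r->blocked) {
        return -1;    /* filters below never see this response */
    }
    return guard_next_header_filter(r);
}

/* Build the chain (last filter first, guard on top) and run it once.
 * Returns 1 if the bottom filter was reached, 0 otherwise. */
int run_header_chain(int blocked)
{
    request_t r = { blocked, 0 };

    top_header_filter = last_header_filter;

    guard_next_header_filter = top_header_filter;
    top_header_filter = guard_header_filter;

    top_header_filter(&r);
    return r.reached_end;
}
```

With blocked set to 0 the bottom filter runs as usual; with blocked set to 1 it is skipped. This is the kind of behaviour to look for when a third-party filter placed above yours appears to swallow responses.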