Nautilus: An Exploration of Balancing Performance and Maintainability in JIT Query Compilation
Paper: Query Compilation Without Regrets
Paper link: https://dl.acm.org/doi/10.1145/3654968
GitHub repository: https://github.com/nebulastream/nautilus
This paper, published in SIGMOD 2024, proposes and implements Nautilus, a new database query execution framework that attempts to address the engineering complexity of query compilation without sacrificing runtime performance.
Technical Background
JIT-based query compilation is essentially a set of trade-offs:
- Interpretation versus compilation: for small data volumes, interpreting the query directly is faster than compiling it first, because compilation latency dominates total runtime.
- Development complexity: developers must master both database internals and compiler technology, and the JIT-generated code is difficult to debug.
- Software maintainability: chasing low-level code generation too aggressively leads to high engineering cost and poor maintainability.
- Adaptive compilation complexity: different compilation strategies trade compilation speed against execution speed, and supporting multiple compilation backends makes the problem even more complex.
JIT-based query compilation is only sustainable if these four concerns are balanced properly.
Trace-based JIT originates from programming-language runtimes: PyPy, for example, dynamically detects hotspots (frequently executed functions and loops), records their operations as a trace, and compiles that trace to gain performance. A minimal sketch of the hotspot-detection idea is shown below.
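The following is a small, self-contained sketch of the hotspot-counting idea behind trace-based JITs. The class names (`HotLoopDetector`, `TraceRecorder`) and the string-based "trace" are invented for illustration and do not correspond to PyPy's or Nautilus' actual implementation.

```cpp
// Conceptual sketch of hotspot detection in a trace-based JIT: the interpreter
// counts how often a loop header is reached; once a threshold is crossed, one
// iteration is recorded as a linear trace that a backend could then compile.
#include <cstdint>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

struct TraceRecorder {
    std::vector<std::string> ops;                 // recorded operations (stand-in for real IR)
    void record(const std::string& op) { ops.push_back(op); }
};

class HotLoopDetector {
public:
    explicit HotLoopDetector(uint32_t threshold) : threshold_(threshold) {}

    // Returns true once the loop (identified by an id) has become hot.
    bool onLoopHeader(uint64_t loopId) { return ++counters_[loopId] >= threshold_; }

private:
    uint32_t threshold_;
    std::unordered_map<uint64_t, uint32_t> counters_;
};

int main() {
    HotLoopDetector detector(/*threshold=*/1000);
    TraceRecorder recorder;
    bool tracing = false;

    int64_t sum = 0;
    for (int64_t i = 0; i < 2000; ++i) {
        if (!tracing && detector.onLoopHeader(/*loopId=*/0)) {
            tracing = true;                       // loop became hot: start recording
        }
        if (tracing && recorder.ops.empty()) {
            recorder.record("load i");            // one iteration is recorded as a trace
            recorder.record("add sum, i");
            recorder.record("jump loop_header");
        }
        sum += i;                                 // interpretation continues meanwhile
    }
    std::cout << "sum=" << sum << ", trace length=" << recorder.ops.size() << "\n";
}
```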
Implementation Details
The paper proposes trace-based JIT query compilation: the input logical plan is broken into pipelines, and each pipeline is either executed directly by an interpreter or symbolically executed to record a trace. The trace is then converted into Nautilus IR, from which code is generated by different compilation backends (LLVM, MLIR, Flounder, or even C++) to match different latency/throughput requirements. A simplified sketch of how symbolic execution records a trace follows.
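The sketch below illustrates the tracing side with an invented `TracedInt` wrapper whose overloaded operators both compute a concrete value and append an instruction to a trace, so the same operator code can serve the interpreter and the code generator. The names and the textual IR are assumptions for illustration, in the spirit of the paper's approach rather than the actual Nautilus API.

```cpp
// Minimal sketch of symbolic tracing via an operator-overloaded value wrapper.
// Every arithmetic operation on TracedInt records an instruction into a trace
// while also computing the concrete result.
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

struct Trace {
    std::vector<std::string> instructions;
    int nextId = 0;
    int emit(const std::string& op, int lhs, int rhs) {
        int id = nextId++;
        instructions.push_back("%" + std::to_string(id) + " = " + op + " %" +
                               std::to_string(lhs) + ", %" + std::to_string(rhs));
        return id;
    }
};

static Trace gTrace;  // the trace being recorded for the current pipeline

struct TracedInt {
    int64_t value;  // concrete value, used when interpreting
    int id;         // SSA-style id of this value inside the trace

    static TracedInt constant(int64_t v) {
        int id = gTrace.nextId++;
        gTrace.instructions.push_back("%" + std::to_string(id) + " = const " + std::to_string(v));
        return {v, id};
    }
    TracedInt operator+(const TracedInt& o) const {
        return {value + o.value, gTrace.emit("add", id, o.id)};
    }
    TracedInt operator*(const TracedInt& o) const {
        return {value * o.value, gTrace.emit("mul", id, o.id)};
    }
};

// An "operator" written once against the wrapper type: interpreting it yields
// a result, tracing it yields IR that a backend (MLIR, LLVM, C++) could compile.
TracedInt computePrice(TracedInt quantity, TracedInt unitPrice, TracedInt tax) {
    return quantity * unitPrice + tax;
}

int main() {
    TracedInt q = TracedInt::constant(3), p = TracedInt::constant(10), t = TracedInt::constant(2);
    TracedInt result = computePrice(q, p, t);
    std::cout << "interpreted result: " << result.value << "\n";
    for (const auto& inst : gTrace.instructions) std::cout << inst << "\n";
}
```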
In other words, a query is decomposed into pipelines, and each pipeline is traced and compiled as a unit.
User-defined functions (UDFs) can also be compiled into the pipeline, so they end up fused with the surrounding operators rather than called across an opaque boundary; see the sketch after this paragraph.
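As a rough illustration, this is what a compiled pipeline for "scan -> map(udf) -> filter -> sum" could look like after tracing and code generation: one tight loop over the input column with the UDF inlined. The concrete fusion strategy and the shape of the generated code are assumptions for illustration, not the paper's exact output.

```cpp
// Sketch of a fused pipeline: scan, a user-defined map function, a filter, and
// an aggregation collapse into a single loop with no per-operator dispatch and
// no intermediate materialization.
#include <cstdint>
#include <iostream>
#include <vector>

// A user-defined function; because it is written against the same value
// abstraction as built-in operators, tracing can inline it into the pipeline.
static int64_t udf(int64_t x) { return x * 2 + 1; }

static int64_t runPipeline(const std::vector<int64_t>& column) {
    int64_t sum = 0;
    for (int64_t x : column) {   // scan
        int64_t mapped = udf(x); // map via UDF
        if (mapped > 10) {       // filter
            sum += mapped;       // aggregate
        }
    }
    return sum;
}

int main() {
    std::vector<int64_t> column = {1, 4, 6, 7, 9};
    std::cout << "sum = " << runPipeline(column) << "\n";
}
```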
Experimental Results
The results show that no single backend solves the problem perfectly: each backend trades off performance, latency, and implementation complexity differently across workload types.
- In short-query or debugging scenarios, MIR acts as a lightweight JIT backend that balances throughput and latency.
- In long-query or batch-processing scenarios, backends with higher compilation cost but faster generated code pay off, since compilation latency is amortised over execution.
With tracing, compilation time does not grow dramatically as the number of selections in a query increases.
On the TPC-H queries, the MLIR backend outperforms DuckDB, but only beats Umbra on a few queries.
Conclusion
The trace-based JIT query compilation scheme proposed in the paper decouples operator implementation from compilation logic and dynamically optimises execution hotspots, offering a new way to balance database performance and development complexity.
However, the problems raised at the beginning of the paper are ones that every developer building a JIT-based query execution engine has to face. Unfortunately, the paper offers little concrete experience or guidance on them, and these questions still need further exploration by researchers.