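A Walkthrough of the Apache Arrow Gandiva Project
I first ran into Apache Arrow Gandiva by accident while browsing the Arrow project. The words "LLVM" and "JIT" on the project page were enough to make me install and run it on Ubuntu, but I could not work out what scenario it was actually meant for, so I dropped it 😂
A few days ago, after reading the NoisePage paper and part of its source, I kept feeling that I had seen Arrow combined with LLVM somewhere before. It was Apache Arrow Gandiva. So this time I read its source as well, to find out what this thing really is.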
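Related material
If you search Bing for "Apache Arrow Gandiva" today, the second result is a Zhihu post titled "Apache Arrow Gandiva:远大理想与尴尬现实" (roughly "Apache Arrow Gandiva: grand ideals and an awkward reality"), which was the main reason I gave up back then. What I want to say today is: why process Arrow data from Java at all? 😂 If I work in Rust or C++, Gandiva is not awkward in the slightest; on the contrary, it is quite interesting. Apache Arrow Gandiva has done a lot of work connecting the LLVM and Arrow ecosystems, which leaves researchers plenty of room to explore.
Chinese-language documentation
Material from Dremio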
Introducing the Gandiva Initiative for Apache Arrow
Adding a User Defined Function to Gandiva
Gandiva Initiative: Improving SQL Performance by 70x
Community blog posts
Project history and current status
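Dremio donated the project to Apache Arrow in 2018, and it is now one of Apache Arrow's subprojects (source: Gandiva: A LLVM-based Analytical Expression Compiler for Apache Arrow). If you dig further, you will find that quite a few Arrow contributors now work at Dremio, that Dremio itself uses Apache Arrow, and that Gandiva is described as part of Dremio's execution engine.
Gandiva's main highlight is that it relies on LLVM auto-vectorization to process Arrow data in vectorized form. On top of the LLVM layer it also implements Project and Filter. Add Join and Aggregation and most SQL operations would be covered; count NoisePage in as well and you could even put together a complete, purely LLVM-based CRUD pipeline over Arrow.
Rumor has it that the project was abandoned (which is simply not true 😅). In fact Gandiva has kept receiving maintenance commits, and it was updated quickly after LLVM 20 came out this year.
Gandiva currently ships C and C++ libraries, but the Rust implementation of Arrow does not seem to be supported: Interfaces for gandiva bindings.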
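Source code walkthrough
The code was downloaded on 2025-06-24. Nearly all of it sits flat in a single directory; only precompiled/ and tests/ are subdirectories.
Gandiva source: https://github.com/apache/arrow/tree/main/cpp/src/gandiva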
|-- CMakeLists.txt|-- GandivaConfig.cmake.in|-- annotator.cc|-- annotator.h|-- annotator_test.cc|-- arrow.h|-- basic_decimal_scalar.h|-- bitmap_accumulator.cc|-- bitmap_accumulator.h|-- bitmap_accumulator_test.cc|-- cache.cc|-- cache.h|-- cache_test.cc|-- cast_time.cc|-- compiled_expr.h|-- condition.h|-- configuration.cc|-- configuration.h|-- context_helper.cc|-- date_utils.cc|-- date_utils.h|-- decimal_ir.cc|-- decimal_ir.h|-- decimal_scalar.h|-- decimal_type_util.cc|-- decimal_type_util.h|-- decimal_type_util_test.cc|-- decimal_xlarge.cc|-- decimal_xlarge.h|-- dex.h|-- dex_visitor.h|-- encrypt_utils.cc|-- encrypt_utils.h|-- encrypt_utils_test.cc|-- engine.cc|-- engine.h|-- engine_llvm_test.cc|-- eval_batch.h|-- execution_context.h|-- exported_funcs.cc|-- exported_funcs.h|-- exported_funcs_registry.cc|-- exported_funcs_registry.h|-- exported_funcs_registry_test.cc|-- expr_decomposer.cc|-- expr_decomposer.h|-- expr_decomposer_test.cc|-- expr_validator.cc|-- expr_validator.h|-- expression.cc|-- expression.h|-- expression_cache_key.h|-- expression_registry.cc|-- expression_registry.h|-- expression_registry_test.cc|-- external_c_functions.cc|-- field_descriptor.h|-- filter.cc|-- filter.h|-- formatting_utils.h|-- func_descriptor.h|-- function_holder.h|-- function_holder_maker_registry.cc|-- function_holder_maker_registry.h|-- function_ir_builder.cc|-- function_ir_builder.h|-- function_registry.cc|-- function_registry.h|-- function_registry_arithmetic.cc|-- function_registry_arithmetic.h|-- function_registry_common.h|-- function_registry_datetime.cc|-- function_registry_datetime.h|-- function_registry_hash.cc|-- function_registry_hash.h|-- function_registry_math_ops.cc|-- function_registry_math_ops.h|-- function_registry_string.cc|-- function_registry_string.h|-- function_registry_test.cc|-- function_registry_timestamp_arithmetic.cc|-- function_registry_timestamp_arithmetic.h|-- function_signature.cc|-- function_signature.h|-- function_signature_test.cc|-- 
gandiva.pc.in|-- gandiva_aliases.h|-- gandiva_object_cache.cc|-- gandiva_object_cache.h|-- gdv_function_stubs.cc|-- gdv_function_stubs.h|-- gdv_function_stubs_test.cc|-- gdv_hash_function_stubs.cc|-- gdv_string_function_stubs.cc|-- hash_utils.cc|-- hash_utils.h|-- hash_utils_test.cc|-- in_holder.h|-- interval_holder.cc|-- interval_holder.h|-- interval_holder_test.cc|-- literal_holder.cc|-- literal_holder.h|-- llvm_generator.cc|-- llvm_generator.h|-- llvm_generator_test.cc|-- llvm_includes.h|-- llvm_types.cc|-- llvm_types.h|-- llvm_types_test.cc|-- local_bitmaps_holder.h|-- lru_cache.h|-- lru_cache_test.cc|-- lvalue.h|-- make_precompiled_bitcode.py|-- native_function.h|-- node.h|-- node_visitor.h|-- precompiled| |-- CMakeLists.txt| |-- arithmetic_ops.cc| |-- arithmetic_ops_test.cc| |-- bitmap.cc| |-- bitmap_test.cc| |-- decimal_ops.cc| |-- decimal_ops.h| |-- decimal_ops_test.cc| |-- decimal_wrapper.cc| |-- epoch_time_point.h| |-- epoch_time_point_test.cc| |-- extended_math_ops.cc| |-- extended_math_ops_test.cc| |-- hash.cc| |-- hash_test.cc| |-- print.cc| |-- string_ops.cc| |-- string_ops_test.cc| |-- testing.h| |-- time.cc| |-- time_constants.h| |-- time_fields.h| |-- time_test.cc| |-- timestamp_arithmetic.cc| `-- types.h|-- precompiled_bitcode.cc.in|-- projector.cc|-- projector.h|-- random_generator_holder.cc|-- random_generator_holder.h|-- random_generator_holder_test.cc|-- regex_functions_holder.cc|-- regex_functions_holder.h|-- regex_functions_holder_test.cc|-- regex_util.cc|-- regex_util.h|-- selection_vector.cc|-- selection_vector.h|-- selection_vector_impl.h|-- selection_vector_test.cc|-- simple_arena.h|-- simple_arena_test.cc|-- symbols.map|-- tests| |-- CMakeLists.txt| |-- binary_test.cc| |-- boolean_expr_test.cc| |-- date_time_test.cc| |-- decimal_single_test.cc| |-- decimal_test.cc| |-- external_functions| | |-- CMakeLists.txt| | |-- multiply_by_two.cc| | `-- multiply_by_two.h| |-- filter_project_test.cc| |-- filter_test.cc| |-- generate_data.h| |-- 
hash_test.cc| |-- huge_table_test.cc| |-- if_expr_test.cc| |-- in_expr_test.cc| |-- literal_test.cc| |-- micro_benchmarks.cc| |-- null_validity_test.cc| |-- projector_build_validation_test.cc| |-- projector_test.cc| |-- test_util.cc| |-- test_util.h| |-- timed_evaluate.h| |-- to_string_test.cc| `-- utf8_test.cc|-- to_date_holder.cc|-- to_date_holder.h|-- to_date_holder_test.cc|-- tree_expr_builder.cc|-- tree_expr_builder.h|-- tree_expr_test.cc|-- value_validity_pair.h`-- visibility.h
The code base is very large, so only selected parts are analyzed here.
node
The definitions of the nodes in the expression tree.
namespace gandiva {
class FieldNode;
class FunctionNode;
class IfNode;
class LiteralNode;
class BooleanNode;
template <typename Type>
class InExpressionNode;

/// \brief Visitor for nodes in the expression tree.
class GANDIVA_EXPORT NodeVisitor {
 public:
  virtual ~NodeVisitor() = default;

  virtual Status Visit(const FieldNode& node) = 0;
  virtual Status Visit(const FunctionNode& node) = 0;
  virtual Status Visit(const IfNode& node) = 0;
  virtual Status Visit(const LiteralNode& node) = 0;
  virtual Status Visit(const BooleanNode& node) = 0;
  virtual Status Visit(const InExpressionNode<int32_t>& node) = 0;
  virtual Status Visit(const InExpressionNode<int64_t>& node) = 0;
  virtual Status Visit(const InExpressionNode<float>& node) = 0;
  virtual Status Visit(const InExpressionNode<double>& node) = 0;
  virtual Status Visit(const InExpressionNode<gandiva::DecimalScalar128>& node) = 0;
  virtual Status Visit(const InExpressionNode<std::string>& node) = 0;
};
} // namespace gandiva
tree_expr
Files: tree_expr_test.cc, tree_expr_builder.cc, tree_expr_builder.h
These deal with the expression tree for a computation such as 4*5+3; the tree itself is constructed through TreeExprBuilder.
TEST_F(TestExprTree, TestField) { Annotator annotator;
auto n0 = TreeExprBuilder::MakeField(i0_); EXPECT_EQ(n0->return_type(), int32());
auto n1 = TreeExprBuilder::MakeField(b0_); EXPECT_EQ(n1->return_type(), boolean());
ExprDecomposer decomposer(*registry_, annotator); ValueValidityPairPtr pair; auto status = decomposer.Decompose(*n1, &pair); DCHECK_EQ(status.ok(), true) << status.message();
auto value = pair->value_expr(); auto value_dex = std::dynamic_pointer_cast<VectorReadFixedLenValueDex>(value); EXPECT_EQ(value_dex->FieldType(), boolean());
EXPECT_EQ(pair->validity_exprs().size(), 1); auto validity = pair->validity_exprs().at(0); auto validity_dex = std::dynamic_pointer_cast<VectorReadValidityDex>(validity); EXPECT_NE(validity_dex->ValidityIdx(), value_dex->DataIdx());}
Function overloading together with the visitor pattern is what drives traversal and transformation of the tree.
class GANDIVA_EXPORT TreeExprBuilder {
 public:
  /// \brief create a node on a literal.
  static NodePtr MakeLiteral(bool value);
  static NodePtr MakeLiteral(uint8_t value);
  static NodePtr MakeLiteral(uint16_t value);
  static NodePtr MakeLiteral(uint32_t value);
  static NodePtr MakeLiteral(uint64_t value);
  static NodePtr MakeLiteral(int8_t value);
  static NodePtr MakeLiteral(int16_t value);
  static NodePtr MakeLiteral(int32_t value);
  static NodePtr MakeLiteral(int64_t value);
  static NodePtr MakeLiteral(float value);
  static NodePtr MakeLiteral(double value);
  static NodePtr MakeStringLiteral(const std::string& value);
  static NodePtr MakeBinaryLiteral(const std::string& value);
  static NodePtr MakeDecimalLiteral(const DecimalScalar128& value);
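To make the overload-plus-visitor idea concrete, here is a minimal stand-alone sketch that builds and evaluates the 4*5+3 tree mentioned earlier. It is not Gandiva code: ToyVisitor, ToyNode, Literal, BinaryOp, EvalVisitor and EvalExample are all hypothetical names, and the sketch evaluates the tree directly instead of generating LLVM IR.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Literal;
struct BinaryOp;

// Hypothetical visitor interface: one Visit overload per node type,
// in the same spirit as gandiva::NodeVisitor.
struct ToyVisitor {
  virtual ~ToyVisitor() = default;
  virtual void Visit(const Literal& node) = 0;
  virtual void Visit(const BinaryOp& node) = 0;
};

struct ToyNode {
  virtual ~ToyNode() = default;
  virtual void Accept(ToyVisitor& visitor) const = 0;
};
using ToyNodePtr = std::shared_ptr<ToyNode>;

struct Literal : ToyNode {
  explicit Literal(int v) : value(v) {}
  void Accept(ToyVisitor& visitor) const override { visitor.Visit(*this); }
  int value;
};

struct BinaryOp : ToyNode {
  BinaryOp(char o, ToyNodePtr l, ToyNodePtr r)
      : op(o), left(std::move(l)), right(std::move(r)) {}
  void Accept(ToyVisitor& visitor) const override { visitor.Visit(*this); }
  char op;  // '+' or '*'
  ToyNodePtr left, right;
};

// Evaluates the tree bottom-up; overload resolution picks the right Visit.
struct EvalVisitor : ToyVisitor {
  void Visit(const Literal& node) override { result = node.value; }
  void Visit(const BinaryOp& node) override {
    node.left->Accept(*this);
    const int lhs = result;
    node.right->Accept(*this);
    const int rhs = result;
    assert(node.op == '+' || node.op == '*');
    result = (node.op == '+') ? lhs + rhs : lhs * rhs;
  }
  int result = 0;
};

// Builds the tree for 4*5+3 and evaluates it.
int EvalExample() {
  auto mul = std::make_shared<BinaryOp>('*', std::make_shared<Literal>(4),
                                        std::make_shared<Literal>(5));
  auto add = std::make_shared<BinaryOp>('+', mul, std::make_shared<Literal>(3));
  EvalVisitor visitor;
  add->Accept(visitor);
  return visitor.result;
}
```

Adding a new operation over the tree (printing, type checking, code generation) only needs another ToyVisitor subclass; the node classes stay untouched, which is presumably why Gandiva structures its tree this way.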
to_date_holder
Converts strings into dates.
TEST_F(TestToDateHolder, TestSimpleDateTime) {
  EXPECT_OK_AND_ASSIGN(auto to_date_holder, ToDateHolder::Make("YYYY-MM-DD HH:MI:SS", 1));
auto& to_date = *to_date_holder; bool out_valid; std::string s("1986-12-01 01:01:01"); int64_t millis_since_epoch = to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid); EXPECT_EQ(millis_since_epoch, 533779200000);
s = std::string("1986-12-01 01:01:01.11"); millis_since_epoch = to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid); EXPECT_EQ(millis_since_epoch, 533779200000);
s = std::string("1986-12-01 01:01:01 +0800"); millis_since_epoch = to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid); EXPECT_EQ(millis_since_epoch, 533779200000);
#if 0
  // TODO : this fails parsing with date::parse and strptime on linux
  s = std::string("1886-12-01 00:00:00");
  millis_since_epoch =
      to_date(&execution_context_, s.data(), (int) s.length(), true, &out_valid);
  EXPECT_EQ(out_valid, true);
  EXPECT_EQ(millis_since_epoch, -2621894400000);
#endif
s = std::string("1886-12-01 01:01:01"); millis_since_epoch = to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid); EXPECT_EQ(millis_since_epoch, -2621894400000);
s = std::string("1986-12-11 01:30:00"); millis_since_epoch = to_date(&execution_context_, s.data(), (int)s.length(), true, &out_valid); EXPECT_EQ(millis_since_epoch, 534643200000);}
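The expected values in this test can be checked by hand. They all correspond to midnight UTC of the given date, so the time-of-day part of the input ("01:01:01", "01:30:00") is evidently dropped. The sketch below reproduces them with the well-known days_from_civil algorithm; DaysFromCivil and DateToMillis are hypothetical helper names, and this is not how Gandiva parses dates internally.

```cpp
#include <cassert>
#include <cstdint>

// Days between 1970-01-01 and the given proleptic Gregorian date
// (the days_from_civil algorithm published by Howard Hinnant).
constexpr int64_t DaysFromCivil(int64_t y, unsigned m, unsigned d) {
  y -= (m <= 2) ? 1 : 0;  // treat Jan/Feb as months 11/12 of the previous year
  const int64_t era = (y >= 0 ? y : y - 399) / 400;
  const unsigned yoe = static_cast<unsigned>(y - era * 400);         // [0, 399]
  const unsigned mp = (m > 2) ? m - 3 : m + 9;                       // March = 0
  const unsigned doy = (153 * mp + 2) / 5 + d - 1;                   // [0, 365]
  const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;        // [0, 146096]
  return era * 146097 + static_cast<int64_t>(doe) - 719468;
}

constexpr int64_t kMillisPerDay = 86400000;

// Milliseconds since the Unix epoch at midnight UTC of the given date.
constexpr int64_t DateToMillis(int64_t y, unsigned m, unsigned d) {
  return DaysFromCivil(y, m, d) * kMillisPerDay;
}
```

For example, DateToMillis(1986, 12, 1) gives 533779200000, which is exactly 6178 days after the epoch and matches the value the test expects for "1986-12-01 01:01:01".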
simple_arena
I did not fully follow this one. It appears to be an arena-style memory allocator that hands out memory in units of chunks.
TEST_F(TestSimpleArena, TestAlloc) {
  int64_t chunk_size = 4096;
  SimpleArena arena(arrow::default_memory_pool(), chunk_size);

  // Small allocations should come from the same chunk.
  int64_t small_size = 100;
  for (int64_t i = 0; i < 20; ++i) {
    auto p = arena.Allocate(small_size);
    EXPECT_NE(p, nullptr);

    EXPECT_EQ(arena.total_bytes(), chunk_size);
    EXPECT_EQ(arena.avail_bytes(), chunk_size - (i + 1) * small_size);
  }

  // large allocations require separate chunks
  int64_t large_size = 100 * chunk_size;
  auto p = arena.Allocate(large_size);
  EXPECT_NE(p, nullptr);
  EXPECT_EQ(arena.total_bytes(), chunk_size + large_size);
  EXPECT_EQ(arena.avail_bytes(), 0);
}
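The behavior this test pins down (small requests are bumped out of the current chunk, an oversized request gets a chunk of its own) can be sketched in a few lines. ToyArena below is a hypothetical stand-in written against the standard library only; it is an assumption about the general technique, not a copy of SimpleArena, which allocates through an arrow::MemoryPool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical bump allocator: memory is freed only when the arena dies.
class ToyArena {
 public:
  explicit ToyArena(int64_t min_chunk_size) : min_chunk_size_(min_chunk_size) {}

  uint8_t* Allocate(int64_t size) {
    assert(size >= 0);
    if (size > avail_bytes_) {
      // Not enough room left: start a new chunk that is large enough.
      const int64_t chunk_size = std::max(size, min_chunk_size_);
      chunks_.push_back(
          std::make_unique<uint8_t[]>(static_cast<size_t>(chunk_size)));
      total_bytes_ += chunk_size;
      next_ = chunks_.back().get();
      avail_bytes_ = chunk_size;
    }
    uint8_t* out = next_;  // bump the pointer within the current chunk
    next_ += size;
    avail_bytes_ -= size;
    return out;
  }

  int64_t total_bytes() const { return total_bytes_; }
  int64_t avail_bytes() const { return avail_bytes_; }

 private:
  int64_t min_chunk_size_;
  int64_t total_bytes_ = 0;
  int64_t avail_bytes_ = 0;
  uint8_t* next_ = nullptr;
  std::vector<std::unique_ptr<uint8_t[]>> chunks_;
};
```

The attraction of this design for expression evaluation is that many short-lived allocations (for example temporary strings) cost a pointer bump each and are all released together.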
selection_vector
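Implements the selection vector on top of Arrow-format storage.
Some background on selection vectors is needed here.
A selection vector is a technique used in data processing systems to record which rows of a batch are selected (valid), so that irrelevant rows are never operated on. It is common in columnar databases and vectorized execution engines (for example Apache Arrow, Dremio, Gandiva), where it is used to improve performance.
A selection vector is essentially an index array that stores the positions of the selected rows within the original batch.
Avoids copying data: only the index vector is manipulated; the original data stays where it is.
Efficient filtering: rows that do not satisfy the condition are skipped quickly.
Supports vectorized execution: combined with batch processing, it helps SIMD performance.
In concrete terms, the indices are either set one at a time (the Set test) or filled in from a bitmap.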
TEST_F(TestSelectionVector, TestInt16Set) { int max_slots = 10;
std::shared_ptr<SelectionVector> selection; auto status = SelectionVector::MakeInt16(max_slots, pool_, &selection); EXPECT_EQ(status.ok(), true) << status.message();
selection->SetIndex(0, 100); EXPECT_EQ(selection->GetIndex(0), 100);
selection->SetIndex(1, 200); EXPECT_EQ(selection->GetIndex(1), 200);
selection->SetNumSlots(2); EXPECT_EQ(selection->GetNumSlots(), 2);
  // ToArray() should return an array with 100,200
  auto array_raw = selection->ToArray();
  const auto& array = dynamic_cast<const arrow::UInt16Array&>(*array_raw);
  EXPECT_EQ(array.length(), 2) << array_raw->ToString();
  EXPECT_EQ(array.Value(0), 100) << array_raw->ToString();
  EXPECT_EQ(array.Value(1), 200) << array_raw->ToString();
}
The selection vector can also be populated from a bitmap.
TEST_F(TestSelectionVector, TestInt64PopulateFromBitMap) { int max_slots = 200;
std::shared_ptr<SelectionVector> selection; auto status = SelectionVector::MakeInt64(max_slots, pool_, &selection); EXPECT_EQ(status.ok(), true) << status.message();
int bitmap_size = RoundUpNumi64(max_slots) * 8; std::vector<uint8_t> bitmap(bitmap_size);
arrow::bit_util::SetBit(&bitmap[0], 0); arrow::bit_util::SetBit(&bitmap[0], 5); arrow::bit_util::SetBit(&bitmap[0], 121); arrow::bit_util::SetBit(&bitmap[0], 220);
status = selection->PopulateFromBitMap(&bitmap[0], bitmap_size, max_slots - 1); EXPECT_EQ(status.ok(), true) << status.message();
EXPECT_EQ(selection->GetNumSlots(), 3); EXPECT_EQ(selection->GetIndex(0), 0); EXPECT_EQ(selection->GetIndex(1), 5); EXPECT_EQ(selection->GetIndex(2), 121);}
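The bitmap-to-index conversion exercised by this test can be written out as a short sketch. IndicesFromBitmap and SetBit are hypothetical helpers that assume Arrow's least-significant-bit-first bit order; the real PopulateFromBitMap is more elaborate and processes the bitmap a word at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sets bit i, LSB-first within each byte (the bit order Arrow bitmaps use).
inline void SetBit(std::vector<uint8_t>& bitmap, int64_t i) {
  bitmap[static_cast<size_t>(i / 8)] |= static_cast<uint8_t>(1u << (i % 8));
}

// Collects the positions of all set bits in [0, max_index] into an index
// array, which is exactly what a selection vector stores.
std::vector<int64_t> IndicesFromBitmap(const std::vector<uint8_t>& bitmap,
                                       int64_t max_index) {
  std::vector<int64_t> selected;
  const int64_t num_bits = static_cast<int64_t>(bitmap.size()) * 8;
  for (int64_t i = 0; i <= max_index && i < num_bits; ++i) {
    if ((bitmap[static_cast<size_t>(i / 8)] >> (i % 8)) & 1) {
      selected.push_back(i);
    }
  }
  return selected;
}
```

With bits 0, 5, 121 and 220 set and a maximum index of 199, only 0, 5 and 121 are selected; bit 220 lies beyond the limit, which is why the test above expects three slots.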
regex_functions/util
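Regular expression support. It appears to recognize the special characters of SQL patterns so that they can be mapped onto regular expressions. This part uses Google's re2 library and follows PCRE (Perl Compatible Regular Expressions) syntax.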
const std::set<char> RegexUtil::pcre_regex_specials_ = { '[', ']', '(', ')', '|', '^', '-', '+', '*', '?', '{', '}', '$', '\\', '.'};
The tests mostly revolve around simple strings.
There are even tests on Chinese characters, which is a rare sight. UTF-8 handling in C++ has always baffled me 😂
input_string = "路%c$大"; extract_index = 2; // Retrieve all matched string
ret = extract_numbers(&execution_context_, input_string.c_str(), static_cast<int32_t>(input_string.length()), extract_index, &out_length); ret_as_str = std::string(ret, out_length); EXPECT_EQ(out_length, 1); EXPECT_EQ(ret_as_str, "c");
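A set like pcre_regex_specials_ is typically what you need when translating a SQL LIKE pattern into a regular expression: every regex metacharacter in the input has to be escaped, while the LIKE wildcards are rewritten. The sketch below shows that idea under stated assumptions ('%' becomes ".*", '_' becomes ".", no custom escape character); LikeToRegex is a hypothetical function and may differ from what RegexUtil actually does.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical LIKE-to-regex translation.
// Assumed semantics: '%' matches any run of characters, '_' matches exactly
// one character, and every regex metacharacter is escaped with a backslash.
std::string LikeToRegex(const std::string& like) {
  static const std::set<char> specials = {'[', ']', '(', ')', '|', '^', '-', '+',
                                          '*', '?', '{', '}', '$', '\\', '.'};
  std::string out;
  for (char c : like) {
    if (c == '%') {
      out += ".*";
    } else if (c == '_') {
      out += '.';
    } else {
      if (specials.count(c) > 0) out += '\\';  // escape regex metacharacters
      out += c;
    }
  }
  return out;
}
```

For instance, the LIKE pattern ab%c_ becomes the regex ab.*c. and a literal dot in a.b is escaped so that it no longer matches arbitrary characters.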
random_generator
The random number generator, including how the seed is handled.
namespace gandiva {
/// Function Holder for 'random'
class GANDIVA_EXPORT RandomGeneratorHolder : public FunctionHolder {
 public:
  ~RandomGeneratorHolder() override = default;

  static Result<std::shared_ptr<RandomGeneratorHolder>> Make(const FunctionNode& node);

  double operator()() { return distribution_(generator_); }

 private:
  explicit RandomGeneratorHolder(int seed) : distribution_(0, 1) {
    int64_t seed64 = static_cast<int64_t>(seed);
    seed64 = (seed64 ^ 0x00000005DEECE66D) & 0x0000ffffffffffff;
    generator_.seed(static_cast<uint64_t>(seed64));
  }

  RandomGeneratorHolder() : distribution_(0, 1) {
    generator_.seed(::arrow::internal::GetRandomSeed());
  }

  std::mt19937_64 generator_;
  std::uniform_real_distribution<> distribution_;
};
} // namespace gandiva
project
This is the code for how Gandiva handles projection (Project) over Apache Arrow data.
/// \brief projection using expressions.
///
/// A projector is built for a specific schema and vector of expressions.
/// Once the projector is built, it can be used to evaluate many row batches.
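The implementation holds the LLVM generator, the schema, the output fields, the code generation configuration, and a flag recording whether the projector was built from the cache.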
std::unique_ptr<LLVMGenerator> llvm_generator_; SchemaPtr schema_; FieldVector output_fields_; std::shared_ptr<Configuration> configuration_; bool built_from_cache_;};
It also contains the code that allocates the output data buffers.
Status Projector::AllocArrayData(const DataTypePtr& type, int64_t num_records,
                                 arrow::MemoryPool* pool,
                                 ArrayDataPtr* array_data) const {
  arrow::Status astatus;
  std::vector<std::shared_ptr<arrow::Buffer>> buffers;

  // The output vector always has a null bitmap.
  int64_t size = arrow::bit_util::BytesForBits(num_records);
  ARROW_ASSIGN_OR_RAISE(auto bitmap_buffer, arrow::AllocateBuffer(size, pool));
  buffers.push_back(std::move(bitmap_buffer));

  // String/Binary vectors have an offsets array.
  auto type_id = type->id();
  if (arrow::is_binary_like(type_id)) {
    auto offsets_len = arrow::bit_util::BytesForBits((num_records + 1) * 32);

    ARROW_ASSIGN_OR_RAISE(auto offsets_buffer, arrow::AllocateBuffer(offsets_len, pool));
    buffers.push_back(std::move(offsets_buffer));
  }

  // The output vector always has a data array.
  int64_t data_len;
  if (arrow::is_primitive(type_id) || type_id == arrow::Type::DECIMAL) {
    const auto& fw_type = static_cast<const arrow::FixedWidthType&>(*type);
    data_len = arrow::bit_util::BytesForBits(num_records * fw_type.bit_width());
  } else if (arrow::is_binary_like(type_id)) {
    // we don't know the expected size for varlen output vectors.
    data_len = 0;
  } else {
    return Status::Invalid("Unsupported output data type " + type->ToString());
  }
  ARROW_ASSIGN_OR_RAISE(auto data_buffer, arrow::AllocateResizableBuffer(data_len, pool));

  // This is not strictly required but valgrind gets confused and detects this
  // as uninitialized memory access. See arrow::util::SetBitTo().
  if (type->id() == arrow::Type::BOOL) {
    memset(data_buffer->mutable_data(), 0, data_len);
  }
  buffers.push_back(std::move(data_buffer));

  *array_data = arrow::ArrayData::Make(type, num_records, std::move(buffers));
  return Status::OK();
}
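Slightly odd: there is no projector_test.cc next to this file. The projector tests live in tests/projector_test.cc instead.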
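The buffer sizing in AllocArrayData is easier to follow with numbers plugged in. The sketch below redoes only the arithmetic; ToyBufferSizes and SizesFor are hypothetical, and BytesForBits here is a local re-implementation of round-up-to-bytes rather than a call into Arrow.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed to hold `bits` bits, rounded up to a whole byte.
constexpr int64_t BytesForBits(int64_t bits) { return (bits + 7) / 8; }

struct ToyBufferSizes {
  int64_t validity_bytes;  // one bit per record
  int64_t offsets_bytes;   // 32-bit offsets, num_records + 1 of them; 0 if fixed-width
  int64_t data_bytes;      // 0 for variable-length output (grown later)
};

// bit_width > 0 means a fixed-width column; bit_width == 0 stands for a
// string/binary column in this sketch.
constexpr ToyBufferSizes SizesFor(int64_t num_records, int64_t bit_width) {
  return bit_width > 0
             ? ToyBufferSizes{BytesForBits(num_records), 0,
                              BytesForBits(num_records * bit_width)}
             : ToyBufferSizes{BytesForBits(num_records),
                              BytesForBits((num_records + 1) * 32), 0};
}
```

For 10 records of int32 this gives a 2-byte validity bitmap and 40 bytes of data; for 10 strings it gives the same 2-byte bitmap, 44 bytes of offsets, and an initially empty data buffer, which matches the resizable buffer the original code allocates for the data.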
lru_cache
An LRU cache adapted from Boost. Because the code is a template, you cannot tell from this file alone what it stores.
// modified from boost LRU cache -> the boost cache supported only an
// ordered map.
namespace gandiva {
// a cache which evicts the least recently used item when it is full
template <class Key, class Value>
class LruCache {
 public:
  using key_type = Key;
  using value_type = Value;
  using list_type = std::list<key_type>;
The test simply uses strings as values.
TEST_F(TestLruCache, TestLruBehavior) {
  cache_.insert(TestCacheKey(1), "hello");
  cache_.insert(TestCacheKey(2), "hello");
  cache_.get(TestCacheKey(1));
  cache_.insert(TestCacheKey(3), "hello");
  // should have evicted key 2.
  ASSERT_EQ(*cache_.get(TestCacheKey(1)), "hello");
}
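The eviction order this test relies on follows from the usual list-plus-hash-map construction. ToyLruCache below is a hypothetical minimal version written for illustration; it assumes a capacity of 2 in the usage and is not the Gandiva template itself.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical LRU cache: a list keeps keys in recency order (front = most
// recent) and a hash map points from each key to its value and list position.
template <class Key, class Value>
class ToyLruCache {
 public:
  explicit ToyLruCache(size_t capacity) : capacity_(capacity) {
    assert(capacity > 0);
  }

  void insert(const Key& key, const Value& value) {
    auto it = map_.find(key);
    if (it != map_.end()) {  // existing key: update and mark as recent
      it->second.first = value;
      touch(it->second.second);
      return;
    }
    if (map_.size() >= capacity_) evict();
    order_.push_front(key);
    map_.emplace(key, std::make_pair(value, order_.begin()));
  }

  std::optional<Value> get(const Key& key) {
    auto it = map_.find(key);
    if (it == map_.end()) return std::nullopt;
    touch(it->second.second);  // a read also counts as a use
    return it->second.first;
  }

  bool contains(const Key& key) const { return map_.count(key) > 0; }

 private:
  using ListIt = typename std::list<Key>::iterator;

  // Moves an entry to the front; splice keeps the iterator valid.
  void touch(ListIt pos) { order_.splice(order_.begin(), order_, pos); }

  // Drops the least recently used entry (the back of the list).
  void evict() {
    map_.erase(order_.back());
    order_.pop_back();
  }

  size_t capacity_;
  std::list<Key> order_;
  std::unordered_map<Key, std::pair<Value, ListIt>> map_;
};
```

Replaying the test sequence on a cache of capacity 2 (insert 1, insert 2, get 1, insert 3) leaves keys 1 and 3 in place and evicts key 2, because the get on key 1 made key 2 the least recently used entry.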
llvm_types
LLVMTypes centralizes the creation of LLVM types and maps Arrow types onto them. NoisePage contains similar code.
class GANDIVA_EXPORT LLVMTypes { public: explicit LLVMTypes(llvm::LLVMContext& context);
llvm::Type* void_type() { return llvm::Type::getVoidTy(context_); }
llvm::Type* i1_type() { return llvm::Type::getInt1Ty(context_); }
llvm::Type* i8_type() { return llvm::Type::getInt8Ty(context_); }
llvm::Type* i16_type() { return llvm::Type::getInt16Ty(context_); }
llvm::Type* i32_type() { return llvm::Type::getInt32Ty(context_); }
llvm::Type* i64_type() { return llvm::Type::getInt64Ty(context_); }
llvm::Type* i128_type() { return llvm::Type::getInt128Ty(context_); }
llvm::StructType* i128_split_type() {
  // struct with high/low bits (see decimal_ops.cc:DecimalSplit)
  return llvm::StructType::get(context_, {i64_type(), i64_type()}, false);
}
Plus a few simple constant helpers:
llvm::Constant* i128_zero() { return i128_constant(0); } llvm::Constant* i128_one() { return i128_constant(1); }
The related tests:
TEST_F(TestLLVMTypes, TestFound) { EXPECT_EQ(types_->IRType(arrow::Type::BOOL), types_->i1_type()); EXPECT_EQ(types_->IRType(arrow::Type::INT32), types_->i32_type()); EXPECT_EQ(types_->IRType(arrow::Type::INT64), types_->i64_type()); EXPECT_EQ(types_->IRType(arrow::Type::FLOAT), types_->float_type()); EXPECT_EQ(types_->IRType(arrow::Type::DOUBLE), types_->double_type()); EXPECT_EQ(types_->IRType(arrow::Type::DATE64), types_->i64_type()); EXPECT_EQ(types_->IRType(arrow::Type::TIME64), types_->i64_type()); EXPECT_EQ(types_->IRType(arrow::Type::TIMESTAMP), types_->i64_type());
EXPECT_EQ(types_->DataVecType(arrow::boolean()), types_->i1_type()); EXPECT_EQ(types_->DataVecType(arrow::int32()), types_->i32_type()); EXPECT_EQ(types_->DataVecType(arrow::int64()), types_->i64_type()); EXPECT_EQ(types_->DataVecType(arrow::float32()), types_->float_type()); EXPECT_EQ(types_->DataVecType(arrow::float64()), types_->double_type()); EXPECT_EQ(types_->DataVecType(arrow::date64()), types_->i64_type()); EXPECT_EQ(types_->DataVecType(arrow::time64(arrow::TimeUnit::MICRO)), types_->i64_type()); EXPECT_EQ(types_->DataVecType(arrow::timestamp(arrow::TimeUnit::MILLI)), types_->i64_type());}
TEST_F(TestLLVMTypes, TestNotFound) { EXPECT_EQ(types_->IRType(arrow::Type::SPARSE_UNION), nullptr); EXPECT_EQ(types_->IRType(arrow::Type::DENSE_UNION), nullptr); EXPECT_EQ(types_->DataVecType(arrow::null()), nullptr);}
llvm_includes
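The MSVC warning suppression at the top is worth noting, since this is the first time I have come across it. It also shows that Gandiva is meant to build and run on Windows.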
#if defined(_MSC_VER)
#  pragma warning(push)
#  pragma warning(disable : 4141)
#  pragma warning(disable : 4146)
#  pragma warning(disable : 4244)
#  pragma warning(disable : 4267)
#  pragma warning(disable : 4291)
#  pragma warning(disable : 4624)
#endif
It even accounts for differences between LLVM versions.
#if LLVM_VERSION_MAJOR >= 10
#  define LLVM_ALIGN(alignment) (llvm::Align((alignment)))
#else
#  define LLVM_ALIGN(alignment) (alignment)
#endif
llvm_generator
The core of the project: LLVM code generation.
The generator appears to make good use of caching.
class GANDIVA_EXPORT LLVMGenerator {
 public:
  /// \brief Factory method to initialize the generator.
  static Result<std::unique_ptr<LLVMGenerator>> Make(
      const std::shared_ptr<Configuration>& config, bool cached,
      std::optional<std::reference_wrapper<GandivaObjectCache>> object_cache =
          std::nullopt);

  /// \brief Get the cache to be used for LLVM ObjectCache.
  static std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>
  GetCache();
It stores the SelectionVector::Mode in use.
SelectionVector::Mode selection_vector_mode() { return selection_vector_mode_; }
build
Takes expressions as input and generates code for them.
/// \brief Build the code for the expression trees for default mode with a LLVM
/// ObjectCache. Each element in the vector represents an expression tree
Status Build(const ExpressionVector& exprs, SelectionVector::Mode mode);

/// \brief Build the code for the expression trees for default mode. Each
/// element in the vector represents an expression tree
Status Build(const ExpressionVector& exprs);
execute
Feeds Arrow data into the functions compiled from LLVM IR.
/// \brief Execute the built expression against the provided arguments for
/// default mode.
Status Execute(const arrow::RecordBatch& record_batch,
               const ArrayDataVector& output_vector) const;

/// \brief Execute the built expression against the provided arguments for
/// all modes. Only works on the records specified in the selection_vector.
Status Execute(const arrow::RecordBatch& record_batch,
               const SelectionVector* selection_vector,
               const ArrayDataVector& output_vector) const;
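The usual LLVMContext and IRBuilder are naturally present. What surprised me is that creating a global string here does not check for duplicates; I cannot tell whether that is an oversight or whether an earlier layer already checks 😂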
llvm::LLVMContext* context() { return engine_->context(); } llvm::IRBuilder<>* ir_builder() { return engine_->ir_builder(); } llvm::Constant* CreateGlobalStringPtr(const std::string& string) { return engine_->CreateGlobalStringPtr(string); }
A visitor then walks the decomposed expression tree once more.
class Visitor : public DexVisitor { public: Visitor(LLVMGenerator* generator, llvm::Function* function, llvm::BasicBlock* entry_block, llvm::Value* arg_addrs, llvm::Value* arg_local_bitmaps, llvm::Value* arg_holder_ptrs, std::vector<llvm::Value*> slice_offsets, llvm::Value* arg_context_ptr, llvm::Value* loop_var);
void Visit(const VectorReadValidityDex& dex) override; void Visit(const VectorReadFixedLenValueDex& dex) override; void Visit(const VectorReadVarLenValueDex& dex) override; void Visit(const LocalBitMapValidityDex& dex) override; void Visit(const TrueDex& dex) override; void Visit(const FalseDex& dex) override; void Visit(const LiteralDex& dex) override; void Visit(const NonNullableFuncDex& dex) override; void Visit(const NullableNeverFuncDex& dex) override; void Visit(const NullableInternalFuncDex& dex) override; void Visit(const IfDex& dex) override; void Visit(const BooleanAndDex& dex) override; void Visit(const BooleanOrDex& dex) override; void Visit(const InExprDexBase<int32_t>& dex) override; void Visit(const InExprDexBase<int64_t>& dex) override; void Visit(const InExprDexBase<float>& dex) override; void Visit(const InExprDexBase<double>& dex) override; void Visit(const InExprDexBase<gandiva::DecimalScalar128>& dex) override; void Visit(const InExprDexBase<std::string>& dex) override; template <typename Type> void VisitInExpression(const InExprDexBase<Type>& dex);
LValuePtr result() { return result_; }
bool has_arena_allocs() { return has_arena_allocs_; }
There are also helpers dedicated to generating LLVM functions and function calls.
std::vector<llvm::Value*> BuildParams(int holder_idx,
                                      const ValueValidityPairVector& args,
                                      bool with_validity, bool with_context);

// Generate code to invoke a function call.
LValuePtr BuildFunctionCall(const NativeFunction* func, DataTypePtr arrow_return_type,
                            std::vector<llvm::Value*>* params);

// Generate code for an if-else condition.
LValuePtr BuildIfElse(llvm::Value* condition, std::function<LValuePtr()> then_func,
                      std::function<LValuePtr()> else_func,
                      DataTypePtr arrow_return_type);
Calls to precompiled LLVM IR functions are added through this interface.
/// Generate code to make a function call (to a pre-compiled IR function) which takes
/// 'args' and has a return type 'ret_type'.
llvm::Value* AddFunctionCall(const std::string& full_name, llvm::Type* ret_type,
                             const std::vector<llvm::Value*>& args);
The cache implementation in detail:
std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>LLVMGenerator::GetCache() { static std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>> shared_cache = std::make_shared< Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>();
return shared_cache;}
Status LLVMGenerator::SetLLVMObjectCache(GandivaObjectCache& object_cache) { return engine_->SetLLVMObjectCache(object_cache);}
Part of the Build implementation:
Status LLVMGenerator::Build(const ExpressionVector& exprs, SelectionVector::Mode mode) {
  selection_vector_mode_ = mode;

  for (auto& expr : exprs) {
    auto output = annotator_.AddOutputFieldDescriptor(expr->result());
    ARROW_RETURN_NOT_OK(Add(expr, output));
  }

  // Compile and inject into the process' memory the generated function.
  ARROW_RETURN_NOT_OK(engine_->FinalizeModule());

  // setup the jit functions for each expression.
  for (auto& compiled_expr : compiled_exprs_) {
    auto fn_name = compiled_expr->GetFunctionName(mode);
    ARROW_ASSIGN_OR_RAISE(auto fn_ptr, engine_->CompiledFunction(fn_name));
    auto jit_fn = reinterpret_cast<EvalFunc>(fn_ptr);
    compiled_expr->SetJITFunction(selection_vector_mode_, jit_fn);
  }

  return Status::OK();
}
The details here are worth a closer read when time allows. As for tests, the sample given is a vector addition that LLVM auto-vectorizes.
TEST_F(TestLLVMGenerator, TestAdd) {
  // Setup LLVM generator to do an arithmetic add of two vectors
  ASSERT_OK_AND_ASSIGN(auto generator,
                       LLVMGenerator::Make(TestConfigWithIrDumping(), false));
  Annotator annotator;
auto field0 = std::make_shared<arrow::Field>("f0", arrow::int32()); auto desc0 = annotator.CheckAndAddInputFieldDescriptor(field0); auto validity_dex0 = std::make_shared<VectorReadValidityDex>(desc0); auto value_dex0 = std::make_shared<VectorReadFixedLenValueDex>(desc0); auto pair0 = std::make_shared<ValueValidityPair>(validity_dex0, value_dex0);
auto field1 = std::make_shared<arrow::Field>("f1", arrow::int32()); auto desc1 = annotator.CheckAndAddInputFieldDescriptor(field1); auto validity_dex1 = std::make_shared<VectorReadValidityDex>(desc1); auto value_dex1 = std::make_shared<VectorReadFixedLenValueDex>(desc1); auto pair1 = std::make_shared<ValueValidityPair>(validity_dex1, value_dex1);
DataTypeVector params{arrow::int32(), arrow::int32()}; auto func_desc = std::make_shared<FuncDescriptor>("add", params, arrow::int32()); FunctionSignature signature(func_desc->name(), func_desc->params(), func_desc->return_type()); const NativeFunction* native_func = generator->function_registry_->LookupSignature(signature);
std::vector<ValueValidityPairPtr> pairs{pair0, pair1}; auto func_dex = std::make_shared<NonNullableFuncDex>( func_desc, native_func, FunctionHolderPtr(nullptr), -1, pairs);
auto field_sum = std::make_shared<arrow::Field>("out", arrow::int32()); auto desc_sum = annotator.CheckAndAddInputFieldDescriptor(field_sum);
  // LLVM 10 doesn't like the expr function name to be the same as the module name when
  // LLJIT is used
  std::string fn_name = "llvm_gen_test_add_expr";
ASSERT_OK(generator->engine_->LoadFunctionIRs()); ASSERT_OK(generator->CodeGenExprValue(func_dex, 4, desc_sum, 0, fn_name, SelectionVector::MODE_NONE));
ASSERT_OK(generator->engine_->FinalizeModule()); auto const& ir = generator->engine_->ir(); EXPECT_THAT(ir, testing::HasSubstr("vector.body"));
ASSERT_OK_AND_ASSIGN(auto fn_ptr, generator->engine_->CompiledFunction(fn_name)); ASSERT_TRUE(fn_ptr);
auto eval_func = reinterpret_cast<EvalFunc>(fn_ptr); constexpr size_t kNumRecords = 4; std::array<uint32_t, kNumRecords> a0{1, 2, 3, 4}; std::array<uint32_t, kNumRecords> a1{5, 6, 7, 8}; uint64_t in_bitmap = 0xffffffffffffffffull;
std::array<uint32_t, kNumRecords> out{0, 0, 0, 0}; uint64_t out_bitmap = 0;
std::array<uint8_t*, 6> addrs{ reinterpret_cast<uint8_t*>(a0.data()), reinterpret_cast<uint8_t*>(&in_bitmap), reinterpret_cast<uint8_t*>(a1.data()), reinterpret_cast<uint8_t*>(&in_bitmap), reinterpret_cast<uint8_t*>(out.data()), reinterpret_cast<uint8_t*>(&out_bitmap), }; std::array<int64_t, 6> addr_offsets{0, 0, 0, 0, 0, 0};
eval_func(addrs.data(), addr_offsets.data(), nullptr, nullptr, nullptr, 0 /* dummy context ptr */, kNumRecords);
EXPECT_THAT(out, testing::ElementsAre(6, 8, 10, 12)); EXPECT_EQ(out_bitmap, 0ULL);}
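GPT's analysis of this part is worth a look:
The TestAdd test case exercises the complete flow of the LLVM code generator producing a vector addition. In detail:
Test goal
Verify that Gandiva can correctly generate and execute LLVM code that adds two int32 vectors.
Test flow
1. Initialize the LLVM generator
ASSERT_OK_AND_ASSIGN(auto generator, LLVMGenerator::Make(TestConfigWithIrDumping(), false));
- Creates an LLVM code generator with IR dumping enabled
2. Set up the input vector descriptors
// First input vector f0 (int32)
auto field0 = std::make_shared<arrow::Field>("f0", arrow::int32());
auto desc0 = annotator.CheckAndAddInputFieldDescriptor(field0);
auto validity_dex0 = std::make_shared<VectorReadValidityDex>(desc0);  // validity bitmap
auto value_dex0 = std::make_shared<VectorReadFixedLenValueDex>(desc0);  // data values
auto pair0 = std::make_shared<ValueValidityPair>(validity_dex0, value_dex0);
// Second input vector f1 (int32): set up the same way
3. Define the add function
DataTypeVector params{arrow::int32(), arrow::int32()};
auto func_desc = std::make_shared<FuncDescriptor>("add", params, arrow::int32());
- Describes an add function that takes two int32 parameters and returns int32
4. Create the function expression
auto func_dex = std::make_shared<NonNullableFuncDex>(func_desc, native_func, FunctionHolderPtr(nullptr), -1, pairs);
- Binds the input value/validity pairs to the add function
5. Generate the LLVM code
ASSERT_OK(generator->CodeGenExprValue(func_dex, 4, desc_sum, 0, fn_name, SelectionVector::MODE_NONE));
- Generates LLVM IR for the expression
- The 4 here appears to be the number of input buffers (two data buffers plus two validity bitmaps) rather than a record count; the record count, kNumRecords, is passed later when the compiled function is called
6. Compile and obtain the function pointer
ASSERT_OK_AND_ASSIGN(auto fn_ptr, generator->engine_->CompiledFunction(fn_name));
auto eval_func = reinterpret_cast<EvalFunc>(fn_ptr);
- Compiles the LLVM IR to machine code
- Obtains a callable function pointer
7. Prepare the test data
std::array<uint32_t, kNumRecords> a0{1, 2, 3, 4};  // first vector
std::array<uint32_t, kNumRecords> a1{5, 6, 7, 8};  // second vector
uint64_t in_bitmap = 0xffffffffffffffffull;  // all valid
std::array<uint32_t, kNumRecords> out{0, 0, 0, 0};  // output vector
8. Execute the generated function
eval_func(addrs.data(), addr_offsets.data(), nullptr, nullptr, nullptr, 0 /* dummy context ptr */, kNumRecords);
- Calls the compiled function to perform the vector addition
9. Verify the result
EXPECT_THAT(out, testing::ElementsAre(6, 8, 10, 12));  // 1+5, 2+6, 3+7, 4+8
EXPECT_EQ(out_bitmap, 0ULL);  // the output bitmap is still 0
- In an Arrow validity bitmap a set bit means "valid", so a bitmap of 0 does not mean "all valid". More likely it shows that the generated value function never writes the output validity bitmap, which Gandiva seems to compute in a separate step.
Key technical points
1. Checking the vectorization
EXPECT_THAT(ir, testing::HasSubstr("vector.body"));
- Checks that the generated LLVM IR contains a vectorized loop body
- Confirms that LLVM's vectorizer actually ran
2. Memory layout
std::array<uint8_t*, 6> addrs{
reinterpret_cast<uint8_t*>(a0.data()),  // first vector: data
reinterpret_cast<uint8_t*>(&in_bitmap),  // first vector: validity bitmap
reinterpret_cast<uint8_t*>(a1.data()),  // second vector: data
reinterpret_cast<uint8_t*>(&in_bitmap),  // second vector: validity bitmap
reinterpret_cast<uint8_t*>(out.data()),  // output vector: data
reinterpret_cast<uint8_t*>(&out_bitmap)  // output vector: validity bitmap
};
Why the test matters
This test verifies Gandiva's core functionality:
- Expression representation: a high-level expression is turned into the internal representation
- Code generation: efficient LLVM IR is generated
- Compilation and execution: the IR is compiled to machine code and run
- Vectorization: the generated code is confirmed to contain a vectorized loop, so it can use SIMD instructions
- Correctness: the computed result is checked
It is an end-to-end integration test that makes sure the whole code generation and execution pipeline works.
The testing::HasSubstr used here is a GMock matcher.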
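One detail of TestAdd is easy to misread: out_bitmap is still 0 after the call. My understanding, which is an assumption rather than something this test proves, is that the generated function computes values only, and that validity is derived separately by intersecting the input bitmaps. The plain C++ sketch below mimics that split; AddValues and IntersectValidity are hypothetical stand-ins, not Gandiva functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for the JIT-compiled value function: element-wise add over the
// data buffers. It never reads or writes a validity bitmap.
template <size_t N>
void AddValues(const std::array<uint32_t, N>& a, const std::array<uint32_t, N>& b,
               std::array<uint32_t, N>& out) {
  for (size_t i = 0; i < N; ++i) out[i] = a[i] + b[i];
}

// Separate validity step for a function that yields null whenever any input
// is null: a result slot is valid only if both inputs are valid (bit = 1).
inline uint64_t IntersectValidity(uint64_t lhs_bitmap, uint64_t rhs_bitmap) {
  return lhs_bitmap & rhs_bitmap;
}
```

A simple, branch-free loop like AddValues is exactly the shape that a loop vectorizer handles well, which fits the test's check for "vector.body" in the IR. Keeping null handling out of the loop is plausibly what makes that possible.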
Here you can see that a C function can be registered directly.
TEST_F(TestLLVMGenerator, VerifyExtendedCFunctions) { VerifyFunctionMapping("multiply_by_three_int32", [](auto registry) { return TestConfigWithCFunction(std::move(registry)); });
// test_util.cc
std::shared_ptr<Configuration> TestConfigWithCFunction(
    std::shared_ptr<FunctionRegistry> registry) {
  return BuildConfigurationWithRegistry(std::move(registry), [](auto reg) {
    return reg->Register(GetTestExternalCFunction(),
                         reinterpret_cast<void*>(multiply_by_three));
  });
}
static int64_t multiply_by_three(int32_t value) { return value * 3; }
literal_holder
The uniform way Gandiva represents and handles constant values of all types.
namespace gandiva {
using LiteralHolder = std::variant<bool, float, double, int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t, uint32_t, uint64_t, std::string, DecimalScalar128>;
GANDIVA_EXPORT std::string ToString(const LiteralHolder& holder);
} // namespace gandiva
std::variant is a type-safe union introduced in C++17. At run time it holds exactly one value out of a fixed set of types, without the hazards of a traditional union. Rust's enum is a more powerful version of std::variant.
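A ToString over such a variant is usually written with std::visit. The sketch below uses a reduced, hypothetical ToyLiteral with four alternatives to show the pattern; the real LiteralHolder has more alternatives and its ToString may format values differently.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical, reduced version of a literal holder.
using ToyLiteral = std::variant<bool, int32_t, double, std::string>;

// std::visit dispatches on whichever alternative is currently stored;
// `if constexpr` selects the formatting branch at compile time.
std::string ToyToString(const ToyLiteral& holder) {
  return std::visit(
      [](const auto& value) -> std::string {
        using T = std::decay_t<decltype(value)>;
        if constexpr (std::is_same_v<T, bool>) {
          return value ? "true" : "false";
        } else if constexpr (std::is_same_v<T, std::string>) {
          return value;
        } else {
          return std::to_string(value);  // numeric alternatives
        }
      },
      holder);
}
```

If an alternative is added to the variant and the visitor cannot handle it, the code stops compiling, which is the type safety a plain union lacks.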
interval_holder
Handles the various kinds of time intervals.
// Pass only years and days to cast
data = "P12Y15D";
response = cast_interval_day(&execution_context_, data.data(), 7, true, &out_valid);
qty_days_in_response = 15;
qty_millis_in_response = 0;
EXPECT_TRUE(out_valid);
EXPECT_FALSE(execution_context_.has_error());
EXPECT_EQ(response, (qty_millis_in_response << 32) | qty_days_in_response);
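The expected value in this test shows how a day-time interval is packed into a single 64-bit integer: milliseconds in the upper 32 bits and days in the lower 32 bits. The helpers below are hypothetical and only mirror that layout as read off the test expression.

```cpp
#include <cassert>
#include <cstdint>

// Packs days into the low 32 bits and milliseconds into the high 32 bits.
// Unsigned arithmetic avoids shifting a negative signed value.
constexpr int64_t PackDayTime(int32_t days, int32_t millis) {
  return static_cast<int64_t>(
      (static_cast<uint64_t>(static_cast<uint32_t>(millis)) << 32) |
      static_cast<uint64_t>(static_cast<uint32_t>(days)));
}

constexpr int32_t UnpackDays(int64_t packed) {
  return static_cast<int32_t>(static_cast<uint32_t>(packed & 0xFFFFFFFF));
}

constexpr int32_t UnpackMillis(int64_t packed) {
  return static_cast<int32_t>(static_cast<uint64_t>(packed) >> 32);
}
```

For "P12Y15D" the test expects 15 days and 0 milliseconds, so the packed value is simply 15; the year component does not appear in the day-time result.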
hash_utils
The hash component is built on OpenSSL and mainly covers the SHA family and MD5.
GANDIVA_EXPORT
const char* gdv_sha512_hash(int64_t context, const void* message, size_t message_length,
                            int32_t* out_length) {
  constexpr int sha512_result_length = 128;
  return gdv_hash_using_openssl(context, message, message_length, EVP_sha512(),
                                sha512_result_length, out_length);
}

/// Hashes a generic message using the SHA256 algorithm
GANDIVA_EXPORT
const char* gdv_sha256_hash(int64_t context, const void* message, size_t message_length,
                            int32_t* out_length) {
  constexpr int sha256_result_length = 64;
  return gdv_hash_using_openssl(context, message, message_length, EVP_sha256(),
                                sha256_result_length, out_length);
}

/// Hashes a generic message using the SHA1 algorithm
GANDIVA_EXPORT
const char* gdv_sha1_hash(int64_t context, const void* message, size_t message_length,
                          int32_t* out_length) {
  constexpr int sha1_result_length = 40;
  return gdv_hash_using_openssl(context, message, message_length, EVP_sha1(),
                                sha1_result_length, out_length);
}

GANDIVA_EXPORT
const char* gdv_md5_hash(int64_t context, const void* message, size_t message_length,
                         int32_t* out_length) {
  constexpr int md5_result_length = 32;
  return gdv_hash_using_openssl(context, message, message_length, EVP_md5(),
                                md5_result_length, out_length);
}
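The result lengths 128, 64, 40 and 32 match the size of each digest once it is written as hexadecimal text, where every byte becomes two characters. The sketch below shows that relationship without OpenSSL. ToHexSketch and HexLengthForBits are hypothetical helpers, and the lowercase digits are a choice made here, not a statement about Gandiva's output.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hex-encodes raw digest bytes: every byte becomes two hex characters.
std::string ToHexSketch(const std::vector<uint8_t>& digest) {
  static const char kHexDigits[] = "0123456789abcdef";
  std::string hex;
  hex.reserve(digest.size() * 2);
  for (uint8_t byte : digest) {
    hex.push_back(kHexDigits[byte >> 4]);    // high nibble
    hex.push_back(kHexDigits[byte & 0x0F]);  // low nibble
  }
  return hex;
}

// Hex length for a digest of the given bit width, e.g. 512 bits -> 128 chars.
constexpr std::size_t HexLengthForBits(std::size_t bits) { return bits / 8 * 2; }
```

SHA-512 produces 512 bits, or 64 bytes, which gives 128 hex characters. SHA-256, SHA-1 (160 bits) and MD5 (128 bits) work out the same way.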
gandiva_object_cache
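For a call such as result1 = evaluate("column1 + column2 * 3"); this component caches the compiled object code of the expression (the output of compilation, not the evaluation result), so that an identical expression does not have to be compiled again. The class derives from llvm::ObjectCache and keeps the cached code in an llvm::MemoryBuffer: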
class GandivaObjectCache : public llvm::ObjectCache {
 public:
  explicit GandivaObjectCache(
      std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>>&
          cache,
      ExpressionCacheKey key);

  ~GandivaObjectCache() {}

  void notifyObjectCompiled(const llvm::Module* M, llvm::MemoryBufferRef Obj);

  std::unique_ptr<llvm::MemoryBuffer> getObject(const llvm::Module* M);

 private:
  ExpressionCacheKey cache_key_;
  std::shared_ptr<Cache<ExpressionCacheKey, std::shared_ptr<llvm::MemoryBuffer>>> cache_;
};
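The two methods form a simple protocol: ask the cache before compiling, and tell the cache after compiling. Below is a minimal sketch of that protocol with standard containers only. ToyObjectCache is hypothetical, the key is the raw expression text instead of ExpressionCacheKey, and a std::string stands in for llvm::MemoryBuffer.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a compiled object file held in memory.
using ObjectBytes = std::string;

// Toy cache keyed by expression text.
class ToyObjectCache {
 public:
  // Called after a successful compile: remember the object code.
  void NotifyObjectCompiled(const std::string& expr_key, ObjectBytes object) {
    cache_[expr_key] = std::make_shared<ObjectBytes>(std::move(object));
  }

  // Called before compiling: nullptr means a miss, so the caller must compile.
  std::shared_ptr<const ObjectBytes> GetObject(const std::string& expr_key) const {
    auto it = cache_.find(expr_key);
    if (it == cache_.end()) {
      return nullptr;
    }
    return it->second;
  }

 private:
  std::unordered_map<std::string, std::shared_ptr<ObjectBytes>> cache_;
};
```

What gets stored is machine code for an expression, so a hit skips IR generation, optimization and code emission, while the evaluation itself still runs on every batch.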
function_signature
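Computes a hash over a function signature. My first guess was that this is for cache bookkeeping, but given the SignatureMap member in FunctionRegistry (see the next section), it more likely serves signature-keyed function lookup. The tests also show that function names are compared case-insensitively: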
EXPECT_EQ(FunctionSignature("extract_month", {arrow::date32()}, arrow::int64()),
          FunctionSignature("extract_month", {local_date32_type_}, local_i64_type_));

TEST_F(TestFunctionSignature, TestHash) {
  FunctionSignature f1("add", {arrow::int32(), arrow::int32()}, arrow::int64());
  FunctionSignature f2("add", {local_i32_type_, local_i32_type_}, local_i64_type_);
  EXPECT_EQ(f1.Hash(), f2.Hash());

  FunctionSignature f3("extractDay", {arrow::int64()}, arrow::int64());
  FunctionSignature f4("extractday", {arrow::int64()}, arrow::int64());
  EXPECT_EQ(f3.Hash(), f4.Hash());
}
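The property tested above (extractDay and extractday hash to the same value) can be reproduced with a small sketch that lowercases the name before hashing and then folds in the parameter and return types. ToySignature, HashCombine and HashSignature are hypothetical names, and type names are plain strings here rather than arrow::DataType objects.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical simplified signature: types are represented by their names.
struct ToySignature {
  std::string name;
  std::vector<std::string> param_types;
  std::string return_type;
};

// Lowercases ASCII letters so that names compare case-insensitively.
std::string ToLowerAscii(std::string s) {
  for (char& c : s) {
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  }
  return s;
}

// Folds one more value into a running hash (the well-known hash_combine mix).
void HashCombine(std::size_t* seed, const std::string& value) {
  *seed ^= std::hash<std::string>{}(value) + 0x9e3779b9 + (*seed << 6) + (*seed >> 2);
}

// Hashes the lowercased name, then each parameter type, then the return type.
std::size_t HashSignature(const ToySignature& sig) {
  std::size_t seed = 0;
  HashCombine(&seed, ToLowerAscii(sig.name));
  for (const auto& type : sig.param_types) {
    HashCombine(&seed, type);
  }
  HashCombine(&seed, sig.return_type);
  return seed;
}
```

A hash like this only works as a map key if equality follows the same rules, so the comparison has to ignore case in the name as well.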
function_registry
class GANDIVA_EXPORT FunctionRegistry {
 public:
  using iterator = const NativeFunction*;
  using FunctionHolderMaker =
      std::function<arrow::Result<std::shared_ptr<FunctionHolder>>(
          const FunctionNode& function_node)>;

  FunctionRegistry();
  FunctionRegistry(const FunctionRegistry&) = delete;
  FunctionRegistry& operator=(const FunctionRegistry&) = delete;

  /// Lookup a pre-compiled function by its signature.
  const NativeFunction* LookupSignature(const FunctionSignature& signature) const;

  /// \brief register a set of functions into the function registry from a given bitcode
  /// file
  arrow::Status Register(const std::vector<NativeFunction>& funcs,
                         const std::string& bitcode_path);

  /// \brief register a set of functions into the function registry from a given bitcode
  /// buffer
  arrow::Status Register(const std::vector<NativeFunction>& funcs,
                         std::shared_ptr<arrow::Buffer> bitcode_buffer);

  /// \brief register a C function into the function registry
  /// @param func the registered function's metadata
  /// @param c_function_ptr the function pointer to the
  /// registered function's implementation
  /// @param function_holder_maker this will be used as the function holder if the
  /// function requires a function holder
  arrow::Status Register(
      NativeFunction func, void* c_function_ptr,
      std::optional<FunctionHolderMaker> function_holder_maker = std::nullopt);

  /// \brief get a list of bitcode memory buffers saved in the registry
  const std::vector<std::shared_ptr<arrow::Buffer>>& GetBitcodeBuffers() const;

  /// \brief get a list of C functions saved in the registry
  const std::vector<std::pair<NativeFunction, void*>>& GetCFunctions() const;

  const FunctionHolderMakerRegistry& GetFunctionHolderMakerRegistry() const;

  iterator begin() const;
  iterator end() const;
  iterator back() const;

  friend arrow::Result<std::shared_ptr<FunctionRegistry>> MakeDefaultFunctionRegistry();

 private:
  std::vector<NativeFunction> pc_registry_;
  SignatureMap pc_registry_map_;
  std::vector<std::shared_ptr<arrow::Buffer>> bitcode_memory_buffers_;
  std::vector<std::pair<NativeFunction, void*>> c_functions_;
  FunctionHolderMakerRegistry holder_maker_registry_;

  Status Add(NativeFunction func);
};

/// \brief get the default function registry
GANDIVA_EXPORT std::shared_ptr<FunctionRegistry> default_function_registry();

}  // namespace gandiva
function_ir_builder
A very general-purpose IR builder (cue the "why didn't I think of this before" meme). It can even generate the basic-block branching for an if-else:
class FunctionIRBuilder {
 public:
  explicit FunctionIRBuilder(Engine* engine) : engine_(engine) {}
  virtual ~FunctionIRBuilder() = default;

 protected:
  LLVMTypes* types() { return engine_->types(); }
  llvm::Module* module() { return engine_->module(); }
  llvm::LLVMContext* context() { return engine_->context(); }
  llvm::IRBuilder<>* ir_builder() { return engine_->ir_builder(); }
  llvm::Constant* CreateGlobalStringPtr(const std::string& string) {
    return engine_->CreateGlobalStringPtr(string);
  }

  /// Build an if-else block.
  llvm::Value* BuildIfElse(llvm::Value* condition, llvm::Type* return_type,
                           std::function<llvm::Value*()> then_func,
                           std::function<llvm::Value*()> else_func);

  struct NamedArg {
    std::string name;
    llvm::Type* type;
  };

  /// Build llvm fn.
  llvm::Function* BuildFunction(const std::string& function_name,
                                llvm::Type* return_type, std::vector<NamedArg> in_args);

 private:
  Engine* engine_;
};
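BuildIfElse takes two callbacks that each emit one branch and return its value. One common way to lower that shape is three basic blocks (then, else, merge) joined by a phi node. The sketch below mimics the callback style with a toy emitter that writes LLVM-like text. It does not use the LLVM C++ API, ToyIrEmitter is a hypothetical name, and whether Gandiva's implementation is structured exactly this way is an assumption.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>

// Toy emitter that writes LLVM-like text; not the LLVM C++ API.
// Block labels are fixed, so it supports one if-else per function.
class ToyIrEmitter {
 public:
  // Emits then/else/merge blocks and joins the branch values with a phi.
  // then_func / else_func emit their own instructions and return the name
  // of the value they produced.
  std::string BuildIfElse(const std::string& condition, const std::string& type,
                          const std::function<std::string()>& then_func,
                          const std::function<std::string()>& else_func) {
    out_ << "  br i1 " << condition << ", label %then, label %else\n";
    out_ << "then:\n";
    const std::string then_value = then_func();
    out_ << "  br label %merge\n";
    out_ << "else:\n";
    const std::string else_value = else_func();
    out_ << "  br label %merge\n";
    out_ << "merge:\n";
    out_ << "  %result = phi " << type << " [ " << then_value << ", %then ], [ "
         << else_value << ", %else ]\n";
    return "%result";
  }

  // Appends one instruction to the current block.
  void Emit(const std::string& line) { out_ << "  " << line << "\n"; }

  std::string Text() const { return out_.str(); }

 private:
  std::ostringstream out_;
};
```

Passing the branches as callbacks lets the builder control where the insertion point is when each branch emits its instructions, which is why the caller never has to manage blocks directly.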
filter
This part is also implemented on top of LLVM and looks much like Project:
 private:
  std::unique_ptr<LLVMGenerator> llvm_generator_;
  SchemaPtr schema_;
  std::shared_ptr<Configuration> configuration_;
  bool built_from_cache_;
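The main difference from a projection is the output. A projection produces a new column, while a filter reports which rows passed, typically as a selection vector of row indices. Here is a minimal sketch of that idea for a fixed condition. FilterGreaterThan is a hypothetical function, and the condition is hard-coded where Gandiva would compile it from an expression.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical filter for the condition "value > threshold".
// Instead of copying rows, it returns the indices of the rows that pass.
std::vector<uint32_t> FilterGreaterThan(const std::vector<int32_t>& column,
                                        int32_t threshold) {
  std::vector<uint32_t> selection;
  for (std::size_t i = 0; i < column.size(); ++i) {
    if (column[i] > threshold) {
      selection.push_back(static_cast<uint32_t>(i));
    }
  }
  return selection;
}
```

Returning indices keeps the input batch untouched, and a later projection can then evaluate only the selected rows.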
To add caching, simply call SetLLVMObjectCache:
Status Engine::SetLLVMObjectCache(GandivaObjectCache& object_cache) {
  auto cached_buffer = object_cache.getObject(nullptr);
  if (cached_buffer) {
    auto error = lljit_->addObjectFile(std::move(cached_buffer));
    if (error) {
      return Status::CodeGenError("Failed to add cached object file to LLJIT: ",
                                  llvm::toString(std::move(error)));
    }
  }
  return Status::OK();
}
Optimization passes are attached through the PassManager:
static void OptimizeModuleWithNewPassManager(llvm::Module& module,
                                             llvm::TargetIRAnalysis target_analysis) {
  // Setup an optimiser pipeline
  llvm::PassBuilder pass_builder;
  llvm::LoopAnalysisManager loop_am;
  llvm::FunctionAnalysisManager function_am;
  llvm::CGSCCAnalysisManager cgscc_am;
  llvm::ModuleAnalysisManager module_am;

  function_am.registerPass([&] { return target_analysis; });

  // Register required analysis managers
  pass_builder.registerModuleAnalyses(module_am);
  pass_builder.registerCGSCCAnalyses(cgscc_am);
  pass_builder.registerFunctionAnalyses(function_am);
  pass_builder.registerLoopAnalyses(loop_am);
  pass_builder.crossRegisterProxies(loop_am, function_am, cgscc_am, module_am);

  pass_builder.registerPipelineStartEPCallback([&](llvm::ModulePassManager& module_pm,
                                                   llvm::OptimizationLevel Level) {
    module_pm.addPass(llvm::ModuleInlinerPass());

    llvm::FunctionPassManager function_pm;
    function_pm.addPass(llvm::InstCombinePass());
    function_pm.addPass(llvm::PromotePass());
    function_pm.addPass(llvm::GVNPass());
    function_pm.addPass(llvm::NewGVNPass());
    function_pm.addPass(llvm::SimplifyCFGPass());
    function_pm.addPass(llvm::LoopVectorizePass());
    function_pm.addPass(llvm::SLPVectorizerPass());
    module_pm.addPass(llvm::createModuleToFunctionPassAdaptor(std::move(function_pm)));

    module_pm.addPass(llvm::GlobalOptPass());
  });
engine
Nearly all of the LLVM engine configuration lives in engine.h, engine.cc and engine_llvm_test.cc. The engine can also load precompiled LLVM IR:
  /// load pre-compiled IR modules from precompiled_bitcode.cc and merge them into
  /// the main module.
  Status LoadPreCompiledIR();

  // load external pre-compiled bitcodes into module
  Status LoadExternalPreCompiledIR();

  // Create and add mappings for cpp functions that can be accessed from LLVM.
  arrow::Status AddGlobalMappings();

  // Remove unused functions to reduce compile time.
  Status RemoveUnusedFunctions();

  std::unique_ptr<llvm::LLVMContext> context_;
  std::unique_ptr<llvm::orc::LLJIT> lljit_;
  std::unique_ptr<llvm::IRBuilder<>> ir_builder_;
  std::unique_ptr<llvm::Module> module_;
  LLVMTypes types_;

  std::vector<std::string> functions_to_compile_;

  bool optimize_ = true;
  bool module_finalized_ = false;
  bool cached_;
  bool functions_loaded_ = false;
  std::shared_ptr<FunctionRegistry> function_registry_;
  std::string module_ir_;
  std::unique_ptr<llvm::TargetMachine> target_machine_;
  const std::shared_ptr<Configuration> conf_;
};
encrypt
Gandiva also contains encryption helpers (although I could not find any documentation on how to use them). The AES implementation comes from OpenSSL as well:
GANDIVA_EXPORT
int32_t aes_encrypt(const char* plaintext, int32_t plaintext_len, const char* key,
                    int32_t key_len, unsigned char* cipher);

/**
 * Decrypt data using aes algorithm
 **/
GANDIVA_EXPORT
int32_t aes_decrypt(const char* ciphertext, int32_t ciphertext_len, const char* key,
                    int32_t key_len, unsigned char* plaintext);
The corresponding test (excerpt):
TEST(TestShaEncryptUtils, TestAesEncryptDecrypt) {
  // 16 bytes key
  auto* key = "12345678abcdefgh";
  auto* to_encrypt = "some test string";

  auto key_len = static_cast<int32_t>(strlen(reinterpret_cast<const char*>(key)));
  auto to_encrypt_len =
      static_cast<int32_t>(strlen(reinterpret_cast<const char*>(to_encrypt)));
  unsigned char cipher_1[64];

  int32_t cipher_1_len =
      gandiva::aes_encrypt(to_encrypt, to_encrypt_len, key, key_len, cipher_1);

  unsigned char decrypted_1[64];
  int32_t decrypted_1_len =
      gandiva::aes_decrypt(reinterpret_cast<const char*>(cipher_1), cipher_1_len, key,
                           key_len, decrypted_1);

  EXPECT_EQ(std::string(reinterpret_cast<const char*>(to_encrypt), to_encrypt_len),
            std::string(reinterpret_cast<const char*>(decrypted_1), decrypted_1_len));
decimal_ir
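Code generation for decimals (i128 values that carry a precision and a scale, as opposed to floating-point numbers) gets special treatment here, so this area is evidently full of pitfalls 😂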
class DecimalIR : public FunctionIRBuilder {
 public:
  explicit DecimalIR(Engine* engine)
      : FunctionIRBuilder(engine), enable_ir_traces_(false) {}

  /// Build decimal IR functions and add them to the engine.
  static Status AddFunctions(Engine* engine);

  void EnableTraces() { enable_ir_traces_ = true; }

  llvm::Value* CallDecimalFunction(const std::string& function_name,
                                   llvm::Type* return_type,
                                   const std::vector<llvm::Value*>& args);

 private:
  /// The intrinsic fn for divide with small divisors is about 10x slower, so not
  /// using these.
  static const bool kUseOverflowIntrinsics = false;

  // Holder for an i128 value, along with its with scale and precision.
  class ValueFull {
   public:
    ValueFull(llvm::Value* value, llvm::Value* precision, llvm::Value* scale)
        : value_(value), precision_(precision), scale_(scale) {}

    llvm::Value* value() const { return value_; }
    llvm::Value* precision() const { return precision_; }
    llvm::Value* scale() const { return scale_; }

   private:
    llvm::Value* value_;
    llvm::Value* precision_;
    llvm::Value* scale_;
  };

  // Holder for an i128 value, and a boolean indicating overflow.
  class ValueWithOverflow {
   public:
    ValueWithOverflow(llvm::Value* value, llvm::Value* overflow)
        : value_(value), overflow_(overflow) {}

    // Make from IR struct
    static ValueWithOverflow MakeFromStruct(DecimalIR* decimal_ir, llvm::Value* dstruct);

    // Build a corresponding IR struct
    llvm::Value* AsStruct(DecimalIR* decimal_ir) const;

    llvm::Value* value() const { return value_; }
    llvm::Value* overflow() const { return overflow_; }

   private:
    llvm::Value* value_;
    llvm::Value* overflow_;
  };
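ValueFull pairs an integer with a scale, and ValueWithOverflow pairs a result with an overflow flag. The sketch below shows why both are needed, using a scaled-down decimal addition: the operands are first brought to a common scale, and every step that could exceed the integer range reports overflow instead of wrapping. It uses int64_t instead of i128 and ignores precision; ToyDecimal and AddDecimals are hypothetical names, not Gandiva's algorithm.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

constexpr int64_t kMax = std::numeric_limits<int64_t>::max();
constexpr int64_t kMin = std::numeric_limits<int64_t>::min();

// Represents value / 10^scale, e.g. {175, 2} is 1.75.
struct ToyDecimal {
  int64_t value;
  int32_t scale;
};

// Result plus an overflow flag, in the spirit of ValueWithOverflow.
struct ToyDecimalWithOverflow {
  ToyDecimal decimal;
  bool overflow;
};

// Multiplies by 10, returning false instead of overflowing.
bool MulBy10Checked(int64_t in, int64_t* out) {
  if (in > kMax / 10 || in < kMin / 10) return false;
  *out = in * 10;
  return true;
}

// Adds two values, returning false instead of overflowing.
bool AddChecked(int64_t a, int64_t b, int64_t* out) {
  if ((b > 0 && a > kMax - b) || (b < 0 && a < kMin - b)) return false;
  *out = a + b;
  return true;
}

// Raises the scale of d to target_scale by repeated multiplication by 10.
bool Rescale(ToyDecimal* d, int32_t target_scale) {
  while (d->scale < target_scale) {
    if (!MulBy10Checked(d->value, &d->value)) return false;
    ++d->scale;
  }
  return true;
}

// Adds two decimals at the larger of the two scales.
ToyDecimalWithOverflow AddDecimals(ToyDecimal x, ToyDecimal y) {
  const int32_t scale = x.scale > y.scale ? x.scale : y.scale;
  int64_t sum = 0;
  const bool ok =
      Rescale(&x, scale) && Rescale(&y, scale) && AddChecked(x.value, y.value, &sum);
  return {{ok ? sum : 0, scale}, !ok};
}
```

For example, 1.5 stored as {15, 1} plus 0.25 stored as {25, 2} is first rescaled to {150, 2} and then summed to {175, 2}.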
Appendix: Mapping between Arrow types and C function types
Gandiva type (Arrow data type) | C function type |
---|---|
int8 | int8_t |
int16 | int16_t |
int32 | int32_t |
int64 | int64_t |
uint8 | uint8_t |
uint16 | uint16_t |
uint32 | uint32_t |
uint64 | uint64_t |
float32 | float |
float64 | double |
boolean | bool |
date32 | int32_t |
date64 | int64_t |
timestamp | int64_t |
time32 | int32_t |
time64 | int64_t |
interval_month | int32_t |
interval_day_time | int64_t |
utf8 (as parameter type) | const char*, uint32_t |
utf8 (as return type) | int64_t context, const char*, uint32_t* |
binary (as parameter type) | const char*, uint32_t |
binary (as return type) | int64_t context, const char*, uint32_t* |
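As a small illustration of the string rows, the sketch below is a C-style function that takes one utf8 argument as a pointer plus a length, following the table. The function count_spaces_utf8 is hypothetical and is not part of Gandiva. Registering it would additionally need NativeFunction metadata, which is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical external function with one utf8 parameter.
// Per the table, a utf8 argument arrives as (const char*, uint32_t).
// Arrow string data is not NUL-terminated, so the length must be honored.
int32_t count_spaces_utf8(const char* data, uint32_t data_len) {
  int32_t count = 0;
  for (uint32_t i = 0; i < data_len; ++i) {
    if (data[i] == ' ') {
      ++count;
    }
  }
  return count;
}
```

Functions such as strlen must not be used on these arguments, because the bytes of the next row usually follow immediately in the same buffer.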
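Summary
Apart from the puzzling decision to keep every file in a single flat directory, the code quality is high. The comments at the key points and the test cases are enough to understand what Gandiva does, which makes it an interesting read.
There is a guide for Gandiva external functions, but to see how they are actually used you still have to read the test cases.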