Now that I think about it, I have not updated the blog in a while. On one hand, I spent March and April tinkering with new paper-related work, raising Openclaw / Hermes, and preparing for (and then actually attending) my graduation defense. By the way, I am very happy to report that I passed my defense at the end of April 😁. On the other hand, I have been deep into Vibe Coding: I handed most of the engineering work over to Claude Code, but the engineering side moved so fast that I had to keep learning and recharging just to keep up, which left me even less time to organize notes 😨

But I really have to talk about MLIR environment setup. As everyone knows, MLIR is now a subproject of LLVM, and LLVM itself is a massive monorepo: every installation pulls in a pile of unrelated files, eats disk space, and builds slowly. So today I want to properly work through the problem of setting up an MLIR environment.

Option 1: Build from Source

Honestly, building everything from scratch is the most complete approach, and MLIR's getting-started guide also recommends starting from zero. But if your machine is only average, such as a laptop or an ordinary PC, or your network is poor (a common situation in China), just cloning the repository and compiling it is painful enough 😰. When I once deployed an MLIR environment on a company's internal network, I had to try no fewer than three or four sets of build parameters, and even then I could not guarantee that every setup would compile and run successfully. Since then, my rule for MLIR environments has been: use binary distributions whenever possible, and avoid building by hand.
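For reference, the canonical from-source build looks roughly like this (the flags below follow MLIR's getting-started guide; the shallow clone and the exact target list are my own choices, so adjust them to your machine):

```shell
# Clone the monorepo (a shallow clone saves a lot of bandwidth and disk).
git clone --depth 1 https://github.com/llvm/llvm-project.git
mkdir llvm-project/build && cd llvm-project/build

# Configure: build only the MLIR project, only for the native target.
cmake -G Ninja ../llvm \
  -DLLVM_ENABLE_PROJECTS=mlir \
  -DLLVM_TARGETS_TO_BUILD="Native" \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_ASSERTIONS=ON

# Build and run the MLIR test suite.
cmake --build . --target check-mlir
```

Even with a Release build restricted to MLIR and the native target, expect this to take a long time on an average laptop, which is exactly the pain point above.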

My rating for this option: NPC-tier.

Option 2: Binary Distribution from Debian Sid Repositories

Containers were my shortcut when I was learning MLIR. Debian Sid's repositories ship a binary distribution of MLIR by default, so you only need to pull a Debian Sid container and install the packages directly, which also makes distribution very convenient.
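A minimal sketch of this setup (Debian ships LLVM as versioned packages; the "-19" suffix here is an assumption, so check what current Sid actually packages):

```dockerfile
# Minimal MLIR dev image based on Debian Sid.
# NOTE: the "-19" version suffix is an assumption; run `apt search mlir`
# in current Sid to find the version actually packaged.
FROM debian:sid
RUN apt-get update && apt-get install -y \
        libmlir-19-dev \
        mlir-19-tools \
        clang-19 \
        cmake \
        ninja-build \
    && rm -rf /var/lib/apt/lists/*
```

The whole environment then comes down to one `docker build`, which is what makes this option so easy to hand to teammates.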

The only problem is that running perf inside a container is inconvenient. Still, this option deserves a solid thumbs-up. Dev Containers are the right answer for team collaboration.

BTW, even by the time I left that project, I still had not found a good technical partner 😂. The barrier to entry for MLIR is still too high, so environment deployment should be simplified wherever it can be simplified.

Because of well-known historical reasons, C++ dependency management is a deep pit, and C++ package managers come in all shapes and sizes. The approaches I know of already include Conan, vcpkg, manual script management, and apt repository management, and I have even seen people manage C++ dependencies with Python's venv virtual environments.

Among these, I had only heard of Conan but had not used it. At the time, distributing through Debian containers already solved most problems, and I was working either in Windows WSL or on remote Linux machines, so I did not try Conan 🤔.

As for vcpkg, when I tried it in 2024, the dependency chain was long and unpleasant. It consumed a lot of disk space and built slowly. It was quite useful for some small projects, but it became very slow when it encountered a large project like MLIR, so I did not try it again afterwards 😅.

As for managing C++ dependencies with Python's venv virtual environments, I unexpectedly think this is a pretty good approach 😂. Demonstration-style projects like llvm/lighthouse even use uv together with edusl package distribution. The downside is that binaries cannot be shared between environments this way, though perhaps uv's caching can approximate that effect? And if you are writing a pure C++ project, seeing a .venv directory still feels awkward. The corresponding alternative is Conda, but using Anaconda inside a company can carry compliance (licensing) risks, so I gave up on it.

I know Google uses Bazel to manage MLIR environments, but that thing is too complex and not suitable for personal use.

Option 3: pixi

Then, while working on a project in April 2026, I discovered pixi.

This package management mechanism feels excellent to me. It uses conda-forge by default, so it does not have Anaconda’s compliance risk. For users in China, it can also be switched to Tsinghua’s Conda mirror to solve dependency download problems.
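As a sketch, a minimal pixi manifest for an MLIR project might look like this (treating `mlir` as a conda-forge package name is my assumption, so verify it with `pixi search mlir`; the platform list is also just an example):

```toml
# pixi.toml -- hedged sketch, not a verified configuration.
[project]
name = "mlir-playground"
channels = ["conda-forge"]
platforms = ["linux-64", "osx-arm64"]

[dependencies]
mlir = "*"
cmake = "*"
ninja = "*"
```

The Tsinghua-mirror switch mentioned above does not live in this file: as I understand it, pixi's global configuration accepts a mirrors table that maps the conda-forge channel URL to a mirror URL, so check pixi's configuration documentation for the exact syntax.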

I tried converting the dependencies of my previous MLIR project to pixi (mlir-hello-world), and it smoothly handled dependency management on both macOS and Linux. This is probably the most suitable distribution approach I have seen so far 🤩. As a bonus, Vibe Coding also upgraded the project from LLVM 20 to LLVM 22.
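If you want to try the same conversion yourself, the workflow is roughly this (the subcommands are pixi's standard CLI; whether `mlir-opt` ends up on the environment's path depends on what the conda-forge `mlir` package actually ships, which I have not verified here):

```shell
pixi init mlir-demo && cd mlir-demo
pixi add mlir cmake ninja      # resolves from conda-forge by default
pixi run mlir-opt --version    # commands run inside the pixi environment
```

The lockfile pixi writes alongside the manifest is what makes the macOS/Linux story reproducible across machines.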

For the PPoPP 2026 tutorial, I also put together a pixi-managed setup (mocusez/mlir-tutor). Otherwise, using edusl dependencies in China is way too troublesome. However, in actual testing, the macOS package of mlir-python-bindings distributed by conda-forge has issues and reports "LLVM ERROR: Attempting to attach an interface to an unregistered operation", so for now it only supports Linux.

Conclusion

In the end, this is still a C++ dependency problem. I think pixi, within the Conda ecosystem, is the best current solution for cross-platform distribution.

I recommend pixi-based MLIR environment setup to any beginner, or anyone who simply wants to try MLIR without spending time configuring the environment 👍