Cross-Compiling Rust CLI Tools for macOS and Linux
How we build stout for 4 targets (macOS ARM64, macOS Intel, Linux x86_64, Linux ARM64) in CI — cross-compilation, static linking, and release automation.
stout ships as a single binary for four targets: macOS ARM64 (Apple Silicon), macOS Intel (x86_64), Linux x86_64, and Linux ARM64. Every release produces four self-contained binaries. The Linux builds are fully statically linked with no runtime dependencies; the macOS builds still link against the system libraries that ship with macOS, since macOS does not support fully static executables. This article covers the cross-compilation setup, the static linking decisions, the CI pipeline, and the specific problems we solved along the way.
Target matrix
Rust’s target triple system identifies each platform:
| Target | OS | Architecture | Use case |
|---|---|---|---|
| aarch64-apple-darwin | macOS | ARM64 | Apple Silicon Macs (M1+) |
| x86_64-apple-darwin | macOS | Intel | Pre-2021 Macs, Rosetta |
| x86_64-unknown-linux-gnu | Linux | x86_64 | Servers, CI, WSL |
| aarch64-unknown-linux-gnu | Linux | ARM64 | AWS Graviton, Raspberry Pi |
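To make the naming scheme concrete, here is a small POSIX shell sketch that splits a triple into its fields. The function name describe_triple is hypothetical (it is not part of stout or rustup), and it assumes the common arch-vendor-os[-env] layout, which covers the triples in this article but not every target Rust supports.

```shell
# Hypothetical helper: split a Rust target triple into its fields.
# Assumes the arch-vendor-os[-env] layout; the env field is optional.
describe_triple() {
  triple=$1
  arch=${triple%%-*}      # text before the first '-'
  rest=${triple#*-}
  vendor=${rest%%-*}
  rest=${rest#*-}
  os=${rest%%-*}
  case "$rest" in
    *-*) env=${rest#*-} ;; # a fourth field such as gnu or musl
    *)   env=none ;;
  esac
  printf 'arch=%s vendor=%s os=%s env=%s\n' "$arch" "$vendor" "$os" "$env"
}

describe_triple aarch64-apple-darwin
describe_triple x86_64-unknown-linux-gnu
```

The macOS triples have three fields, while the Linux triples carry a fourth that names the C library (gnu or musl), which matters later when choosing between dynamic and static linking.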
Adding a target to a Rust project is straightforward:
rustup target add aarch64-apple-darwin
rustup target add x86_64-apple-darwin
rustup target add x86_64-unknown-linux-gnu
rustup target add aarch64-unknown-linux-gnu
Compiling for a target is one flag:
cargo build --release --target aarch64-apple-darwin
The reality is more complex. Cross-compilation requires a linker and system libraries for the target platform. This is where things get interesting.
macOS: universal binaries from CI
macOS cross-compilation is the easiest case because Apple provides toolchains for both ARM64 and Intel on the same machine. A GitHub Actions runner on macos-14 (Apple Silicon) can build for both targets natively:
jobs:
  build-macos:
    runs-on: macos-14
    strategy:
      matrix:
        target:
          - aarch64-apple-darwin
          - x86_64-apple-darwin
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          targets: ${{ matrix.target }}
      - run: cargo build --release --target ${{ matrix.target }}
      - uses: actions/upload-artifact@v4
        with:
          name: stout-${{ matrix.target }}
          path: target/${{ matrix.target }}/release/stout
The Intel build does not involve Rosetta 2. The compiler runs natively on the ARM64 runner and simply emits x86_64 code; Rosetta only comes into play when an x86_64 binary is executed on Apple Silicon. The output is a native x86_64 binary.
We also produce a universal binary (fat binary) that contains both architectures:
  universal-macos:
    needs: build-macos
    runs-on: macos-14
    steps:
      - uses: actions/download-artifact@v4
      - run: |
          lipo -create \
            stout-aarch64-apple-darwin/stout \
            stout-x86_64-apple-darwin/stout \
            -output stout-universal
The universal binary is 2x the size of a single-architecture binary, but it works on any Mac without the user knowing or caring about their CPU architecture. This is what gets installed by the default install script.
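The install script itself is not shown in this article, so the following is only a sketch of how such a script might pick an artifact. The function name artifact_for is hypothetical. The Linux file names follow the naming used by the packaging step later in this article, and the macOS name is taken from the lipo output above; the names actually published on the release page may differ.

```shell
# Hypothetical helper: map `uname -s` / `uname -m` output to an artifact name.
# Artifact names are assumptions based on the build steps in this article.
artifact_for() {
  os=$1
  arch=$2
  case "$os" in
    Darwin)
      # One universal binary covers both Apple Silicon and Intel.
      echo "stout-universal"
      ;;
    Linux)
      case "$arch" in
        x86_64|amd64)  echo "stout-x86_64-unknown-linux-musl.tar.gz" ;;
        aarch64|arm64) echo "stout-aarch64-unknown-linux-musl.tar.gz" ;;
        *) echo "unsupported architecture: $arch" >&2; return 1 ;;
      esac
      ;;
    *)
      echo "unsupported OS: $os" >&2
      return 1
      ;;
  esac
}

# Typical call site inside an install script:
#   artifact=$(artifact_for "$(uname -s)" "$(uname -m)") || exit 1
```

On macOS the architecture argument is ignored entirely, which is the practical benefit of shipping a universal binary.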
Linux: static linking with musl
Linux cross-compilation is harder because of glibc. A binary linked against glibc 2.39 (Ubuntu 24.04) may fail to start on a system with glibc 2.31 (Ubuntu 20.04), because it can reference symbol versions that the older library does not provide. In practice, the glibc version on the build machine becomes the minimum supported version.
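The compatibility rule can be sketched as a version comparison. The helper names below (version_ge, max_glibc_required, host_glibc) are hypothetical. The sketch assumes a sort that supports -V (GNU coreutils does) and, for the inspection helpers, that objdump and getconf are installed on a glibc-based system. Strictly speaking, the requirement comes from the newest glibc symbol version the binary references, which can be lower than the build machine's glibc version.

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
# Relies on `sort -V` (version sort), which is an extension, not POSIX.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: newest GLIBC_x.y symbol version a dynamic binary needs.
max_glibc_required() {
  objdump -T "$1" | grep -o 'GLIBC_[0-9.]*' | sed 's/GLIBC_//' | sort -uV | tail -n 1
}

# Hypothetical helper: glibc version of the current host, e.g. "2.31".
host_glibc() {
  getconf GNU_LIBC_VERSION | awk '{ print $2 }'
}

# A host with glibc 2.31 cannot satisfy a binary that needs 2.34 symbols.
if version_ge 2.31 2.34; then
  echo "compatible"
else
  echo "host glibc too old"
fi
```

On a real system the check would read `version_ge "$(host_glibc)" "$(max_glibc_required ./stout)"`.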
The solution is musl libc, a lightweight libc implementation that supports full static linking. A musl-linked binary has zero runtime dependencies — it runs on any Linux kernel 3.2+, regardless of the distribution or installed libraries.
rustup target add x86_64-unknown-linux-musl
cargo build --release --target x86_64-unknown-linux-musl
Verify static linking:
file target/x86_64-unknown-linux-musl/release/stout
# stout: ELF 64-bit LSB executable, x86-64, statically linked
ldd target/x86_64-unknown-linux-musl/release/stout
# not a dynamic executable
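The same check can be automated so that a release fails if a dynamically linked binary slips through. This is a sketch rather than part of stout's pipeline: the function names are hypothetical, and it assumes the file utility is installed. Depending on the target and toolchain version, file may report "static-pie linked" instead of "statically linked", so both are accepted.

```shell
# Hypothetical helper: decide from `file` output whether a binary is static.
is_static_description() {
  case "$1" in
    *"statically linked"*|*"static-pie linked"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: fail with a message if the given binary is not static.
check_static() {
  desc=$(file -b "$1") || return 1
  if is_static_description "$desc"; then
    echo "ok: $1 is statically linked"
  else
    echo "error: $1 is not statically linked: $desc" >&2
    return 1
  fi
}

# Example call in CI:
#   check_static target/x86_64-unknown-linux-musl/release/stout
```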
The SQLite complication
stout depends on rusqlite with the bundled feature, which compiles SQLite from C source. When targeting musl, the C compiler must also target musl. On Ubuntu, this requires the musl-tools package:
sudo apt-get install musl-tools
This provides musl-gcc, which cc (the Rust build script crate) will use when it detects a musl target. Without it, the build fails with linker errors about missing glibc symbols.
For ARM64 Linux cross-compilation from an x86_64 host, we need a full cross-compilation toolchain:
sudo apt-get install gcc-aarch64-linux-gnu musl-tools
And a Cargo configuration to tell Cargo which linker to use for each target:
# .cargo/config.toml
[target.aarch64-unknown-linux-musl]
linker = "aarch64-linux-gnu-gcc"
rustflags = ["-C", "target-feature=+crt-static"]
[target.x86_64-unknown-linux-musl]
rustflags = ["-C", "target-feature=+crt-static"]
The OpenSSL problem (and how to avoid it)
The most common cross-compilation pain point in Rust is OpenSSL. Many crates depend on openssl-sys, which links against the system OpenSSL. Cross-compiling OpenSSL for a different target is a multi-step ordeal involving downloading source, configuring for the target architecture, and setting environment variables.
stout avoids this entirely by using reqwest with the rustls-tls feature instead of native-tls:
[dependencies]
reqwest = { version = "0.12", default-features = false, features = [
"rustls-tls",
"stream",
"gzip",
] }
rustls is a pure-Rust TLS implementation. It compiles for any target without system dependencies. The performance difference from OpenSSL is negligible for a CLI tool, and the cross-compilation simplification is substantial.
Similarly, ed25519-dalek is pure Rust, and zstd compiles its C source via the cc crate, which selects a C compiler for the target and so handles cross-compilation as long as that compiler is installed.
CI pipeline: GitHub Actions
The complete CI pipeline builds all four targets in parallel:
name: Release

on:
  push:
    tags: ['v*']

jobs:
  build:
    strategy:
      matrix:
        include:
          - target: aarch64-apple-darwin
            os: macos-14
          - target: x86_64-apple-darwin
            os: macos-14
          - target: x86_64-unknown-linux-musl
            os: ubuntu-24.04
          - target: aarch64-unknown-linux-musl
            os: ubuntu-24.04
            cross: true
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          targets: ${{ matrix.target }}
      - name: Install cross-compilation tools
        if: matrix.cross
        run: |
          sudo apt-get update
          sudo apt-get install -y gcc-aarch64-linux-gnu musl-tools
      - name: Install musl tools
        if: contains(matrix.target, 'musl') && !matrix.cross
        run: sudo apt-get install -y musl-tools
      - name: Build
        run: cargo build --release --target ${{ matrix.target }}
      - name: Strip binary
        run: |
          if [[ "${{ matrix.target }}" == *"linux"* ]]; then
            strip target/${{ matrix.target }}/release/stout || true
          fi
      - name: Package
        run: |
          cd target/${{ matrix.target }}/release
          tar czf stout-${{ matrix.target }}.tar.gz stout
          shasum -a 256 stout-${{ matrix.target }}.tar.gz > \
            stout-${{ matrix.target }}.tar.gz.sha256
      - uses: actions/upload-artifact@v4
        with:
          name: stout-${{ matrix.target }}
          path: target/${{ matrix.target }}/release/stout-${{ matrix.target }}.*
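The .sha256 file written by the Package step is only useful if something checks it. Here is a sketch of how a downstream consumer might verify a downloaded tarball. The function name verify_release is hypothetical, and it assumes the tarball and its .sha256 file sit in the same directory, because the checksum file records a bare file name. It uses sha256sum where available and falls back to shasum.

```shell
# Hypothetical helper: verify <dir>/<tarball> against <dir>/<tarball>.sha256.
# Runs in a subshell so the caller's working directory is unchanged.
verify_release() {
  dir=$1
  tarball=$2
  (
    cd "$dir" || exit 1
    if command -v sha256sum >/dev/null 2>&1; then
      sha256sum -c "$tarball.sha256" >/dev/null 2>&1
    else
      shasum -a 256 -c "$tarball.sha256" >/dev/null 2>&1
    fi
  )
}

# Example:
#   verify_release ./downloads stout-x86_64-unknown-linux-musl.tar.gz \
#     && tar xzf ./downloads/stout-x86_64-unknown-linux-musl.tar.gz
```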
Binary stripping
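Release builds omit debug info for your own code by default, but the resulting binary still carries symbol tables and debug sections from the precompiled standard library. Stripping reduces size significantly: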
| Target | Before strip | After strip |
|---|---|---|
| macOS ARM64 | 14.2 MB | 5.8 MB |
| macOS Intel | 15.1 MB | 6.2 MB |
| Linux x86_64 (musl) | 16.8 MB | 6.5 MB |
| Linux ARM64 (musl) | 15.4 MB | 6.1 MB |
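As a quick check of what the table implies, the relative saving can be computed per row. The helper name pct_saved is hypothetical; the inputs are the sizes in MB from the table above.

```shell
# Hypothetical helper: percentage of the original size removed by stripping.
pct_saved() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

pct_saved 14.2 5.8   # macOS ARM64   -> 59.2
pct_saved 15.1 6.2   # macOS Intel   -> 58.9
pct_saved 16.8 6.5   # Linux x86_64  -> 61.3
pct_saved 15.4 6.1   # Linux ARM64   -> 60.4
```

Across all four targets, stripping removes roughly 60% of the binary size.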
We also set optimization flags in Cargo.toml:
[profile.release]
opt-level = 3
lto = "fat"
codegen-units = 1
strip = true
panic = "abort"
lto = "fat" enables link-time optimization across all crates, which typically reduces binary size by 10-20% and improves runtime performance. codegen-units = 1 improves optimization at the cost of compile time — acceptable for release builds. panic = "abort" removes unwinding tables, saving another ~200KB.
Testing cross-compiled binaries
Building is only half the problem. We need to verify that binaries work on their target platforms. For macOS, the CI runner can execute both ARM64 and Intel binaries (ARM64 natively, Intel via Rosetta). For Linux ARM64, we use QEMU user-mode emulation:
  test-linux-arm64:
    needs: build
    runs-on: ubuntu-24.04
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: stout-aarch64-unknown-linux-musl
      - name: Install QEMU
        run: sudo apt-get install -y qemu-user-static
      - name: Test binary
        run: |
          # The build job uploads the tarball, so unpack it first.
          tar xzf stout-aarch64-unknown-linux-musl.tar.gz
          chmod +x stout
          qemu-aarch64-static ./stout --version
          qemu-aarch64-static ./stout search ripgrep
QEMU user-mode emulation translates ARM64 instructions to x86_64 at runtime and forwards system calls to the host kernel. It is slow (5-10x overhead) but sufficient for smoke testing. Full integration tests run on native ARM64 hardware via self-hosted runners.
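Because the same smoke test may run on both emulated and native ARM64 machines, it can help to choose the runner from the host architecture. The following is a sketch with hypothetical function names; it assumes qemu-aarch64-static is on the PATH whenever the host is not ARM64.

```shell
# Hypothetical helper: print the command prefix needed to run an aarch64
# binary on a host with the given `uname -m` value (empty when native).
arm64_runner() {
  case "$1" in
    aarch64|arm64) echo "" ;;
    *) echo "qemu-aarch64-static" ;;
  esac
}

# Hypothetical helper: run `<binary> --version` natively or under QEMU.
smoke_test() {
  runner=$(arm64_runner "$(uname -m)")
  # $runner is deliberately unquoted so that an empty prefix disappears.
  $runner "$1" --version
}

# Example:
#   smoke_test ./stout
```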
Release automation
When we push a git tag (v0.3.2), the CI pipeline builds all targets, runs tests, and creates a GitHub Release with all artifacts. A separate job updates the Homebrew tap formula so that users who install stout via brew install neullabs/tap/stout get the new version automatically.
The entire pipeline — from git tag to published release with four binaries — runs in approximately 8 minutes. Most of that time is cargo build --release with LTO enabled. Without LTO, builds complete in 3 minutes, but the binaries are 15-20% larger.
Cross-compilation in Rust is not zero-configuration. OpenSSL avoidance, musl toolchains, and linker configuration require upfront work. But once the pipeline is set up, every release produces four self-contained binaries from a single codebase with no manual intervention. That is a capability that Rust’s toolchain makes practical in a way that few other languages match.