Cross-compiling for the Onion Omega

Tomas Carnecky
Feb 7, 2016

TLDR: I figured out how to build a C and Haskell cross-compiler for the Onion Omega; instructions are in my GitHub repository.

I ordered four Onion Omega devices late last year and they arrived mid January 2016. They are tiny, nicely built, and the HTML GUI is gorgeous. Sadly there are not that many apps available yet. I was hoping that there would be at least node available for it so I could run small JavaScript apps on it.

Anyway, the first thing I did was give them names (mauve, scarlet, sienna, umber) and do the basic setup (Wi-Fi name, password, etc.). Then I started looking into how to build software for them.

First challenge: the device has only 16MB of flash, half of which is used up by the OS (OpenWrt). There is about 7MB of usable persistent storage, a bit more if you place files in tmpfs, but then they do not survive a reboot. Any reasonably sized application will push that limit. The node binary on my Mac alone is 23MB!

# df -h
Filesystem                Size      Used Available Use% Mounted on
rootfs                    7.5M    480.0K      7.0M   6% /
/dev/root                 7.3M      7.3M         0 100% /rom
tmpfs                    29.9M     15.9M     14.0M  53% /tmp
/dev/mtdblock3            7.5M    480.0K      7.0M   6% /overlay
overlayfs:/overlay        7.5M    480.0K      7.0M   6% /
tmpfs                   512.0K         0    512.0K   0% /dev

Building the C toolchain

The next step was to build the C toolchain. For that I needed to know the exact target configuration: What kind of CPU it is, which ISA it supports, what libc the system uses, the ABI etc. If you compile an application for the wrong target it may behave unexpectedly at runtime (if you use the wrong ABI) or fail hard with an Illegal instruction exception (if the binary was compiled for a different ISA).

Unfortunately there isn’t much information yet on the internet about the exact capabilities of the device, and /proc/cpuinfo isn’t of much help.

system type: Atheros AR9330 rev 1
machine: Onion Omega
processor: 0
cpu model: MIPS 24Kc V7.4
BogoMIPS: 265.42
wait instruction: yes
microsecond timers: yes
tlb_entries: 16
extra interrupt vector: yes
hardware watchpoint: yes, count: 4, address/irw mask: [0x0ffc, 0x0ffc, 0x0ffb, 0x0ffb]
isa: mips1 mips2 mips32r1 mips32r2
ASEs implemented: mips16
shadow register sets: 1
kscratch registers: 0
package: 0
core: 0
VCED exceptions: not available
VCEI exceptions: not available

It’s a MIPS 24Kc CPU and supports up to the mips32r2 ISA. So far so good. But what you don’t see is that the chip doesn’t have an FPU. So while a primitive Hello World C application runs just fine, once you start using floating-point (FP) instructions it’ll crash with the aforementioned Illegal instruction exception.

The OS uses an archaic libc (uClibc 0.9.33.2) which isn’t maintained anymore, is not standards-conformant, and is thus incompatible with more recent userland tools. In particular, GHC, the Haskell compiler, can’t be built against it. I settled on musl libc, which is a tiny libc optimised for static linking.

I built the toolchain using crosstool-NG and the following defconfig. Since I’m using the compiler only for that specific target, I set both march and mtune to the exact CPU model.

CT_ARCH_ARCH="24kc"
CT_ARCH_TUNE="24kc"
CT_ARCH_FLOAT_SW=y
CT_ARCH_mips=y
CT_KERNEL_linux=y
CT_LIBC_musl=y

That gave me a working C compiler which I was able to use to generate static binaries, copy them over to the device and run.

Haskell / GHC

Next up was GHC. It is a huge, complicated beast, and it takes forever to build. Google Cloud provided most of the hardware during this time. With just a few clicks I could spin up a 32-core machine and repeatedly compile GHC. It saved me a lot of time.

MIPS is not a Tier 1 platform, which means that the GHC maintainers don’t guarantee that it works. GHC does know that the MIPS architecture exists, but can’t actually emit working machine code for it. It looked like I was in for a lot of work (actually, I’d have given up if I had to write a complete MIPS assembler codegen). In the end though, not much was needed to get it working.

GHC has a mode where it generates LLVM IR code; the rest is taken care of by the LLVM compiler. And fortunately the LLVM compiler knows how to build MIPS binaries. In all, I had to make four changes:

  • GHC needs to write a short prelude into each LLVM IR file. It describes the architecture (how many registers it has of each type, how wide pointers are etc) and is used by the LLVM optimiser to optimise the code for that particular target. It consists of two lines which I copied from the Clang repository. Easy.
  • There is a bug in GHC where it generates wrong LLVM IR when the host and target have different endianness. I adapted the code specifically for my situation, where the host is little endian (x86) and target big endian (MIPS).
  • Since the chip on the Onion Omega has no FPU, I needed to pass an option to the LLVM compiler to not generate FP instructions. The GHC source code doesn’t distinguish between soft-float and hard-float for architectures other than ARM. So I hardcoded it in the source code.
  • libffi fails to build for a soft-float target. This is a known issue; patches are available but have not been integrated into the official repository (libffi does not seem to be maintained anymore, the repository on GitHub has tons of open issues and pull requests).

One other thing which took me a while to figure out was that I had to disable stripping in mk/build.mk. Either there is a bug in binutils or the linker which prevents the final binary from being assembled, or strip needs to be given additional options (so that it does not strip too much out of the object files).

After figuring out all these necessary changes, it took just a few hours to actually build GHC. And then, finally, I had a shiny new GHC cross-compiler: mips-unknown-linux-musl-ghc. With it, I was able to build a Haskell Hello World app, copy it over to the Omega and run. Yay.

I should note though that the Hello World binary, even stripped, is 8.6MB, and as such doesn’t fit into the free space available on the Onion Omega flash. So I had to run it from tmpfs.

Building the toolchains on Mac OS X

The instructions in my GitHub repository are for Linux (Docker containers based on Ubuntu 16.04, to be exact). I tried to replicate them on my Mac OS X laptop, but that worked only for the C toolchain.

You can install crosstool-NG from Homebrew and use pretty much the same defconfig. That gives you a working C toolchain. But I had to give up at compiling GHC. There are too many subtle differences between Linux and Mac OS X. Part of the problem is also GHC’s build system, which seems to mix up build/host/target flags (CPP/C/LD) in various makefile rules, leading to failures in the middle of the build process.
