I'm a bit worried about the amount of pointer chasing here. Presumably on modern architectures that would kill cache locality? Or am I missing something?
IIRC the Treadmill has always had appalling constant factors; its appeal is that, unlike any simpler or earlier GC algorithm, it has a bounded worst-case execution time.
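To make both points concrete, here is a rough sketch of the kind of list surgery the Treadmill relies on: cells live on a circular doubly linked list, and moving a cell is a fixed number of pointer writes no matter how large the heap is. That is where the bounded per-step cost comes from, and also where the pointer chasing comes from. The `Cell` layout and the helper names `unsnap`, `snap_after`, and `ring_names` are my own for illustration, not taken from any real implementation, and the colour and segment bookkeeping is left out entirely.

```python
class Cell:
    """One heap cell on a circular doubly linked list (hypothetical layout)."""

    def __init__(self, name):
        self.name = name
        # A freshly created cell is a ring of one.
        self.prev = self
        self.next = self


def unsnap(cell):
    """Remove cell from its ring in O(1): a fixed number of pointer writes."""
    cell.prev.next = cell.next
    cell.next.prev = cell.prev
    cell.prev = cell.next = cell


def snap_after(anchor, cell):
    """Insert cell right after anchor in O(1)."""
    cell.prev = anchor
    cell.next = anchor.next
    anchor.next.prev = cell
    anchor.next = cell


def ring_names(start):
    """Walk the ring once; every step is a dependent load (a pointer chase)."""
    names, cur = [], start
    while True:
        names.append(cur.name)
        cur = cur.next
        if cur is start:
            return names


# Build the ring a -> b -> c -> d, then move d to sit right after a.
a, b, c, d = (Cell(n) for n in "abcd")
snap_after(a, b)
snap_after(b, c)
snap_after(c, d)
unsnap(d)
snap_after(a, d)
order = ring_names(a)
print(order)
```

The move of `d` costs the same handful of writes whether the ring has four cells or four million, but each of those writes touches a neighbour that may be anywhere in memory, which is the locality problem raised above.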