Yeah, we're doing some work to overhaul the module code in op2ext. Once that's done, we'll probably write up the patch as an internal module within op2ext.
As for the likelihood of getting that 0 value, it's probably not all that unlikely. Based on experience, it came up in roughly 1 in 15 tries. I haven't yet analysed the exact details, but the math involves division by the unit size, which is 120 bytes. I suspect there is some modulo effect, which is why we were typically seeing small values in a limited range. Couple that with the memory alignment requirement for the unit array, which is likely at least 4 bytes, and the possible values are limited further. Actually, if the alignment requirement were 8 bytes, which would be quite reasonable for an array of large objects, and that worked in conjunction with a modulus of 120, we might expect a given outcome to take on any of 120 / 8 = 15 values. If those are all about equally likely, 0 would come up roughly 1 time in 15, which matches what we saw. Highly suspicious.
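To sanity check the 120 / 8 = 15 estimate above, here is a minimal sketch that just counts the remainders. It assumes the value in question depends on an aligned address taken modulo the 120 byte unit size, which is my guess and not something I've confirmed in the game code. The function name is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical helper (not part of op2ext or the game): counts the distinct
// values of (address % unitSize) when address is a multiple of `alignment`.
std::size_t CountDistinctRemainders(std::size_t alignment, std::size_t unitSize)
{
    assert(alignment > 0 && unitSize > 0);

    std::set<std::size_t> remainders;
    // The remainders repeat after at most unitSize aligned steps,
    // so a single sweep of that length sees every possible value.
    for (std::size_t step = 0; step < unitSize; ++step) {
        remainders.insert((step * alignment) % unitSize);
    }
    return remainders.size();
}
```

With an 8 byte alignment and a 120 byte unit size this returns 15 (the remainders 0, 8, 16, ..., 112), and with a 4 byte alignment it returns 30. So if each remainder is equally likely, 8 byte alignment fits the roughly 1 in 15 rate much better than 4 byte alignment would.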
As things like memory alignment requirements are architecture specific, yes, there would be some degree of hardware dependence, though that choice would be made at compile time, and I wouldn't expect it to vary across end users' machines. Granted, memory allocators sometimes defer responsibility to the OS's memory allocator, so there could be some difference in alignment possibilities if a particular OS/version imposes a larger minimum alignment than the language requires.
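If anyone wants to check what their own allocator actually hands back, something like the following sketch could help. It is only an illustration using standard C++ calls. The helper names are made up, and it says nothing about how the game itself allocates the unit array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: largest power of two that divides a non-zero address.
std::uintptr_t LargestPowerOfTwoDivisor(std::uintptr_t address)
{
    return address & (~address + 1);  // isolates the lowest set bit
}

// Hypothetical helper: allocates one block with std::malloc and reports how
// strongly its address happens to be aligned (0 if the allocation failed).
std::uintptr_t ObservedMallocAlignment(std::size_t size)
{
    assert(size > 0);

    void* block = std::malloc(size);
    if (block == nullptr) {
        return 0;
    }
    const auto address = reinterpret_cast<std::uintptr_t>(block);
    std::free(block);
    return LargestPowerOfTwoDivisor(address);
}
```

A single call can report a larger alignment than the allocator guarantees, purely by chance, so the smallest result over many allocations is the more telling number. Comparing that against `alignof(std::max_align_t)` shows whether the platform is giving more alignment than the language minimum.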