The uzp1/uzp2 implementation has a number of problems, so I had to rewrite it. It doesn't get the shifting/masking right. It gets input1 and input2 wrong. It checks one bit instead of two for the size field. It doesn't fail for the non-full size==3 (1d) case, which should be an unallocated instruction.

The new testcase passes with the patch, and fails without it. The GCC C testsuite failures go from 2269 to 2227 (-42).

Jim
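
For reference, here is a minimal standalone sketch of the behaviour the points above are about. It is not the simulator code from the patch: the function name uzp_model, its parameters, and the byte-array register representation are all hypothetical, chosen only for illustration. It assumes registers are stored as little-endian byte arrays, that the first source supplies the low half of the concatenated input, and that a 64-bit result clears the upper half of the destination.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical reference model of UZP1/UZP2, not the simulator's code.
   full: 1 for a 128-bit operation, 0 for a 64-bit one.
   size: the two-bit element size field (0=b, 1=h, 2=s, 3=d).
   part: 0 for UZP1 (even elements), 1 for UZP2 (odd elements).
   Returns 0 on success, -1 for an unallocated encoding.  */
int
uzp_model (int full, unsigned size, int part,
           const uint8_t vn[16], const uint8_t vm[16], uint8_t vd[16])
{
  /* Both bits of size matter: size==3 is only valid for the full (2d)
     form, the 1d form is unallocated.  */
  if (size > 3 || (size == 3 && !full))
    return -1;

  unsigned ebytes = 1u << size;          /* bytes per element */
  unsigned datasize = full ? 16 : 8;     /* bytes per source operand */
  unsigned elements = datasize / ebytes; /* elements in the result */

  /* Concatenate the sources with vn in the low half and vm in the
     high half.  */
  uint8_t concat[32];
  memcpy (concat, vn, datasize);
  memcpy (concat + datasize, vm, datasize);

  /* Pick every second element, starting at index 'part'.  The result is
     built in a temporary so vd may alias vn or vm.  */
  uint8_t result[16] = { 0 };
  for (unsigned e = 0; e < elements; e++)
    memcpy (result + e * ebytes, concat + (2 * e + part) * ebytes, ebytes);

  /* Assumed here: a 64-bit result leaves the upper 64 bits zero.  */
  memcpy (vd, result, 16);
  return 0;
}
```

Working on whole elements with memcpy sidesteps the shifting and masking entirely, which makes a model like this handy for cross-checking a shift-based implementation.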