Hi Joern,

Here are some benchmark results. Following your advice, I took the arith-rand test, increased its main loop count until it took around 10 seconds to run on my test machine, and ran it with each of the eight optimization combinations that the gcc torture test uses [see methodology notes attached].

I found that my change increased the runtime by 1.5 to 2 percent (even when I added the new instructions that I'm working on). That didn't seem too bad to me, but I took some advice from Alex Oliva and tried simply changing the sh_jmp_table from char to short. This much simpler change increased the runtime by only 0.5 to 1 percent, at the cost of 64k more data space.

So I'll withdraw my previous patch and submit the following instead: