Thanks for pointing me in the right direction. I had thought that I was using the output buffer values directly from the "get_....." and had forgotten that I was taking an average of them. A closer look made it obvious that summing the output buffer values gave a result too big for the default I16 data type. Converting the values to I32 before the summation fixes it, and it now works fine.
Sorry to bother you with what turned out to be my own bad programming.
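
In case anyone else hits the same thing, here is a rough sketch of the problem in Python. It is not my actual code: the sample values and the helper names (wrap_signed, average) are made up for illustration, and I have assumed that the I16 sum wraps around on overflow, which may not match what your environment does.

```python
def wrap_signed(value, bits):
    """Wrap an integer into the signed range of the given bit width.

    Assumption: overflow wraps around (two's complement). Some
    environments saturate or raise an error instead.
    """
    half = 1 << (bits - 1)
    return (value + half) % (1 << bits) - half


def average(samples, bits):
    """Average the samples using an accumulator of the given bit width."""
    total = 0
    for sample in samples:
        # Each partial sum is forced back into the accumulator's range.
        total = wrap_signed(total + sample, bits)
    return total / len(samples)


# Hypothetical buffer values; each one fits in I16 (max 32767) on its own.
samples = [30000, 30000, 30000, 30000]

print(average(samples, 16))  # -2768.0, the sum overflowed the 16-bit range
print(average(samples, 32))  # 30000.0, the sum (120000) fits in 32 bits
```

Every individual value is a legal I16, so nothing looks wrong until the running sum passes 32767. Widening the accumulator before summing is what cures it.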