It's a benchmark, hopefully run on the same version of the source code on both devices. Each system has a different compiler with different optimizations. Also, since this is a graphics benchmark, it may be more a comparison of GPUs and graphics libraries than of CPUs. Perhaps the libraries on Honeycomb are not optimized for the larger displays. I know Apple has updated its graphics libraries several times over the last year, with some substantial performance increases.
The only thing we know is that the iPad 2 did better than the XOOM on these specific tests. If you do not do similar tasks in everyday usage, the tests do not necessarily relate to your experience. Nor do the results mean that a future firmware or software update could not reverse the roles.
I actually think CNet's web site loading test may be the more relevant one, since loading web pages is a more likely task for a "typical" tablet owner.