Ideally, to test a set of markers properly you need to minimize all the external variables, if only to head off any later debunking of the results, even though the effect of any individual variable might be negligible.
The tests would need to be run indoors in an environmentally controlled room, eliminating the effects of wind and temperature. Instead of paint you'd want to use reball, minimizing any issues with paint consistency. Instead of a hopper you'd want a mechanism that gives a truly constant and consistent feed rate, such as a single column of balls (although this has its own problems).
A common barrel design should give the best comparative results, although it would be open to claims that a certain marker works best with a certain barrel, so additional tests should be run with the barrels commonly used on each marker. Ideally two-piece barrels, such as Freaks, should be avoided, because wear and tear or user error can leave the two halves of the barrel misaligned.
Each round of tests should be done with the same air system on all markers. The tests should then be repeated with different designs of air system, again to avoid claims that certain markers work better with certain air systems. In addition, every time the air system is refilled for the next marker it should be left to sit until its temperature has settled before being used, so that fill temperature is eliminated as a variable.
Preferably all markers would be brought for testing by the manufacturer or a trusted agent after having been tuned to their optimum performance, thus avoiding later claims that setup issues were at fault for poor results.
Finally, the marker being tested should be clamped into a test rig and fired mechanically rather than by hand, so that consistent trigger pull rates can be tested.
Pushing the idea to the extreme, you'd also want to repeat each and every test with more than one example of the equipment being used, whether marker, barrel or air system. This avoids claims that a particular item skewed results because it was one of the supposedly rare bad samples.
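To get a feel for how quickly that adds up, here is a rough sketch in Python of the full test matrix: every marker with every barrel and every air system, repeated for more than one sample of each. All the names and counts are made-up placeholders, and it simplifies things by treating the barrels as one shared list rather than a common barrel plus each marker's usual barrel.

```python
from itertools import product

# Placeholder equipment names, purely for illustration.
markers = ["marker_A", "marker_B"]
barrels = ["common_barrel", "usual_barrel"]
air_systems = ["air_system_1", "air_system_2"]
samples_per_item = 2  # assumed: two examples of each piece of equipment


def build_test_matrix(markers, barrels, air_systems, samples_per_item):
    """List every marker/barrel/air system combination, for every sample of each."""
    sample_ids = range(1, samples_per_item + 1)
    runs = []
    for marker, barrel, air in product(markers, barrels, air_systems):
        # Repeat the combination for each sample of marker, barrel and air system.
        for m_sample, b_sample, a_sample in product(sample_ids, repeat=3):
            runs.append({
                "marker": marker, "marker_sample": m_sample,
                "barrel": barrel, "barrel_sample": b_sample,
                "air_system": air, "air_sample": a_sample,
            })
    return runs


matrix = build_test_matrix(markers, barrels, air_systems, samples_per_item)
print(len(matrix))  # 2 * 2 * 2 combinations, times 2**3 sample pairings = 64
```

Even with just two of everything that is 64 separate test runs, before you repeat any of them or vary the trigger pull rate.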
It would be an interesting series of tests, and would no doubt still be lambasted by some manufacturers if it turned out that their designs didn't quite do what they say on the tin. For example, while all manufacturers boast rates of fire that far exceed what is humanly achievable, I still laugh at those who claim rates that might be mechanically achievable when dry firing but have no chance of being matched with paint actually being fired.
Personally I suspect that most designs would come out effectively identical as far as performance goes, with any differences more than masked by real-world variables in actual play, especially the effect of the person firing.
Reliability, efficiency and maintainability would probably show far greater variance between the various designs, although the last of these is somewhat subjective. For example, I've always found Angels easy to work on, but Automags, supposedly a far simpler design to maintain, baffle me to this day.