There might be something simple that I'm overlooking, but I'm curious about this...
We all know that speakers are measured using microphones, and that's how we get our nice FR graphs, etc. However, how do they go about measuring the microphones? I remember buying a cheap one once for karaoke purposes, and it had an awful response curve; but I am curious as to how that curve was obtained, given that speakers are an imperfect tool for reproducing sound.
But if you know the errors in the source, you can calculate the errors in the receiving mechanism. Many of the frequency response errors in loudspeakers come from the way a consumer speaker has to be built to work in a domestic location, under conditions more difficult than a simple sine wave sweep. Eliminate the requirements for music playback in a home environment and you can build a far more accurate speaker to use as a reference in a laboratory. As an example, home loudspeakers require both low and high SPL capability. A reference speaker used in a lab probably will not be required to perform at such extremes, depending on the tests being conducted. Knowing that alone makes the speaker designer's task that much simpler. Consider also the proximity of most microphones to their source vs. the same condition for a consumer speaker.
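To put the first point in concrete terms, here is a minimal Python sketch. It assumes responses are expressed in dB at matching frequencies, so the measured curve is simply the speaker's curve plus the microphone's curve. The function name and the numbers are made up for illustration; they are not real measurements.

```python
def mic_response_db(measured_db, speaker_db):
    """Subtract the known speaker response from the measured response.

    Both lists hold dB values at the same frequencies, so
    measured = speaker + mic, and therefore mic = measured - speaker.
    """
    if len(measured_db) != len(speaker_db):
        raise ValueError("curves must cover the same frequency points")
    return [round(m - s, 2) for m, s in zip(measured_db, speaker_db)]


# Hypothetical data at 100 Hz, 1 kHz and 10 kHz.
speaker_db = [-1.0, 0.0, 2.0]    # known deviation of the reference speaker
measured_db = [-4.0, 0.5, -1.0]  # what the mic under test reported

print(mic_response_db(measured_db, speaker_db))
```

Whatever is left after the subtraction is attributed to the microphone, which is only as good as your knowledge of the speaker's own curve.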
How many microphones do you have? How many speakers? Take the average response curve and filter out the errors. Once you've arrived at a reference, there is no need for more averaging. And the reference was established a long, long time ago.
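A rough sketch of the averaging idea, again in Python. It assumes several microphones have measured the same speaker at the same frequencies, and that their individual errors roughly cancel in the average. That is a big assumption: if every mic shares the same flaw, averaging will not reveal it. The names and numbers are hypothetical.

```python
from statistics import mean


def consensus_and_deviations(curves):
    """Average several dB curves and report each mic's deviation from it.

    curves maps a microphone name to a list of dB values, all taken
    at the same frequency points from the same source.
    """
    names = list(curves)
    n_points = len(curves[names[0]])
    if any(len(curve) != n_points for curve in curves.values()):
        raise ValueError("all curves must cover the same frequency points")
    # Point-by-point average across all microphones.
    consensus = [mean(curves[n][i] for n in names) for i in range(n_points)]
    # How far each microphone sits from that average at each point.
    deviations = {
        n: [round(curves[n][i] - consensus[i], 2) for i in range(n_points)]
        for n in names
    }
    return consensus, deviations


# Hypothetical readings from three mics at 100 Hz, 1 kHz and 10 kHz.
curves = {
    "mic_a": [2.0, 1.0, -1.0],
    "mic_b": [1.0, 1.0, 1.0],
    "mic_c": [0.0, 1.0, 3.0],
}
consensus, deviations = consensus_and_deviations(curves)
print(consensus)
print(deviations)
```

Note that this only gives each microphone's error relative to the group, not an absolute calibration.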