2009 Volume 22 Issue 1 Pages 1_43-1_56
A concept of matchability of survey data is introduced, based on decompositions of the joint probability density functions. This definition naturally leads to restrictions on the joint distributions in the form of various conditional independence relations. Partial matchability is defined as global matchability with respect to a subset of the underlying variables. Global matchability does not imply partial matchability, nor does partial matchability imply global matchability; this is one aspect of Simpson's paradox. A numerical experiment is carried out to show possible merits of algorithms based on partial matchability. We also show numerically that when the ideal assumption of matchability holds only approximately, estimation accuracy is still preserved to some extent. An extension to the problem of matching three files is also briefly discussed.