In Bayesian decision theory, there has been a great amount of research into theoretical frameworks and information– theoretic quantities that can be used to provide lower and upper bounds for the Bayes error. These include well-known bounds such as Chernoff, Battacharrya, and J-divergence. Part of the challenge of utilizing these various metrics in practice is (i) whether they are ”loose” or ”tight” bounds, (ii) how they might be estimated via either parametric or non-parametric methods, and (iii) how accurate the estimates are for limited amounts of data. In general what is desired is a methodology for generating relatively tight lower and upper bounds, and then an approach to estimate these bounds efficiently from data. In this paper, we explore the so-called triangle divergence which has been around for a while, but was recently made more prominent in some recent research on non-parametric estimation of information metrics. Part of this work is motivated by applications for quantifying fundamental information content in SAR/LIDAR data, and to help in this, we have developed a flexible multivariate modeling framework based on multivariate Gaussian copula models which can be combined with the triangle divergence framework to quantify this information, and provide approximate bounds on Bayes error. In this paper we present an overview of the bounds, including those based on triangle divergence and verify that under a number of multivariate models, the upper and lower bounds derived from triangle divergence are significantly tighter than the other common bounds, and often times, dramatically so. We also propose some simple but effective means for computing the triangle divergence using Monte Carlo methods, and then discuss estimation of the triangle divergence from empirical data based on Gaussian Copula models.
|