Search Tree

search_tree<T, allocator>

Consider a random variable $Z(\mathbf{u})$ that can take $K$ different values $z_1, \ldots, z_K$. The aim of a search tree is to infer the conditional cumulative distribution function (ccdf) of $Z(\mathbf{u})$ from a known realization of $Z(\mathbf{u})$: $z(\mathbf{u}_{\alpha_1}), \ldots, z(\mathbf{u}_{\alpha_N})$, called the "training image".

Call $T = \{\mathbf{h}_1, \ldots, \mathbf{h}_t\}$ the family of vectors defining a geometric template (or "window") of $t$ locations. These vectors are also called the nodes of the template. A data event implied by $T$ at location $\mathbf{u}$ is the sequence of values:

$$D_T(\mathbf{u}) = \{Z(\mathbf{u} + \mathbf{h}_1), \ldots, Z(\mathbf{u} + \mathbf{h}_t)\}$$

$\mathbf{u}$ is called the central node of the template. Provided the training image is stationary, the same data event (i.e. the same sequence of values) can be observed at different locations.

Guardiano and, later, Strebelle proposed modeling the probability

$$P\Big(Z(\mathbf{u}) = z_k \mid \{z(\mathbf{u} + \mathbf{h}_1), \ldots, z(\mathbf{u} + \mathbf{h}_t)\}\Big)$$

by the frequency of occurrence in the training image of the event

$$z(\mathbf{u}_\alpha) = z_k \mid \{z(\mathbf{u}_\alpha + \mathbf{h}_1), \ldots, z(\mathbf{u}_\alpha + \mathbf{h}_t)\}$$

($\mathbf{u}_\alpha$ is a location of the training image.) If, for a given data event $d_T$, there are $n$ locations $\mathbf{u}_i$ in the training image such that

$$D_T(\mathbf{u}_i) = d_T, \quad i = 1, \ldots, n$$

and among these $n$ locations, $n_k$ are such that the central pixel value is $z(\mathbf{u}_j) = z_k$ ($j = 1, \ldots, n_k$), then the probability $P\big(Z(\mathbf{u}) = z_k \mid d_T\big)$ is modeled by

$$P\Big(Z(\mathbf{u}) = z_k \mid d_T\Big) = \frac{n_k}{n}$$
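The frequency estimate above can be illustrated by a direct scan of a training image. The following is only a minimal sketch under simplifying assumptions: the training image is one-dimensional, categories are coded as integers 0 to K-1, and the function name `scan_ccdf` and its parameters are hypothetical, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the library: estimates
// P(Z(u) = k | d_T) on a 1-D training image by scanning every location.
// offsets[i] plays the role of h_i and data_event[i] is the value
// required at u + h_i. Image values are assumed to lie in [0, K).
// Returns the K proportions n_k / n, or an empty vector when the data
// event never occurs in the image (n == 0).
std::vector<double> scan_ccdf(const std::vector<int>& image,
                              const std::vector<int>& offsets,
                              const std::vector<int>& data_event,
                              int K) {
    assert(offsets.size() == data_event.size());
    std::vector<int> counts(K, 0);  // counts[k] accumulates n_k
    int n = 0;                      // number of replicates of d_T
    const int size = static_cast<int>(image.size());
    for (int u = 0; u < size; ++u) {
        bool match = true;
        for (std::size_t i = 0; i < offsets.size() && match; ++i) {
            const int v = u + offsets[i];
            // A template node outside the image cannot match.
            match = v >= 0 && v < size && image[v] == data_event[i];
        }
        if (!match) continue;
        assert(image[u] >= 0 && image[u] < K);
        ++n;
        ++counts[image[u]];
    }
    if (n == 0) return {};
    std::vector<double> p(K);
    for (int k = 0; k < K; ++k) p[k] = static_cast<double>(counts[k]) / n;
    return p;
}
```

For the image 0 1 1 0 1 0 0 1 with the single offset -1 and data event {0}, four locations have a left neighbor equal to 0, and three of them hold category 1, so the sketch returns {0.25, 0.75}. Each query rescans the whole image, which is the cost a search tree is meant to avoid.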

In some cases, data event $d_T$ cannot be found in the training image. Call $T_{-1}$ the subset of $T$ obtained by dropping one of the vectors (nodes) of $T$ and, similarly, $T_{-j}$ the subset of $T$ obtained by dropping $j$ vectors of $T$, for any $j \in \{0, \ldots, t - 1\}$, with $T_{-0} = T$. If data event $d_T$ cannot be found in the training image, template $T$ is recursively simplified into $T_{-1}, \ldots, T_{-j}$, until $d_{T_{-j}}$ can be found. Typically, the nodes are dropped according to the amount of information they bring to estimating the probability distribution of $Z(\mathbf{u})$, the least informative nodes being dropped first. The probability $P\big(Z(\mathbf{u}) = z_k \mid d_T\big)$ is then approximated by

$$P\Big(Z(\mathbf{u}) = z_k \mid d_T\Big) \simeq P\Big(Z(\mathbf{u}) = z_k \mid d_{T_{-j}}\Big)$$

A search tree is a data structure that stores all the data events $d_{T_{-j}}$ ($j = 0, \ldots, t - 1$) present in the training image, along with the corresponding frequencies of occurrence of $z_k$ ($k = 1, \ldots, K$) at the central node.
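One way to picture such a structure is a prefix tree over the template nodes. The sketch below is illustrative only and is not the library's `search_tree`: it assumes a one-dimensional training image, integer categories 0 to K-1, and a fixed dropping order in which the last template nodes are dropped first. The names `TreeNode`, `ToySearchTree`, and `lookup` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Hypothetical sketch. A node at depth d represents the data event
// restricted to the first d template nodes (h_1 .. h_d).
struct TreeNode {
    std::vector<int> counts;  // counts[k] = n_k for this data event
    std::map<int, std::unique_ptr<TreeNode>> children;  // key: value at next node
    explicit TreeNode(int K) : counts(K, 0) {}
};

class ToySearchTree {
public:
    // Scans the training image once and records every data event,
    // together with all its simplified versions (prefixes).
    ToySearchTree(const std::vector<int>& image,
                  const std::vector<int>& offsets, int K)
        : offsets_(offsets), K_(K), root_(K) {
        const int size = static_cast<int>(image.size());
        for (int u = 0; u < size; ++u) {
            assert(image[u] >= 0 && image[u] < K_);
            TreeNode* node = &root_;
            ++node->counts[image[u]];  // depth 0: marginal counts
            for (int h : offsets_) {
                const int v = u + h;
                if (v < 0 || v >= size) break;  // template leaves the image
                auto& child = node->children[image[v]];
                if (!child) child = std::make_unique<TreeNode>(K_);
                node = child.get();
                ++node->counts[image[u]];
            }
        }
    }

    // Returns the counts n_k for the longest prefix of data_event found
    // in the tree; nodes_used reports how many template nodes were kept.
    std::vector<int> lookup(const std::vector<int>& data_event,
                            std::size_t& nodes_used) const {
        assert(data_event.size() <= offsets_.size());
        const TreeNode* node = &root_;
        nodes_used = 0;
        for (int value : data_event) {
            auto it = node->children.find(value);
            if (it == node->children.end()) break;  // drop remaining nodes
            node = it->second.get();
            ++nodes_used;
        }
        return node->counts;
    }

private:
    std::vector<int> offsets_;
    int K_;
    TreeNode root_;
};
```

Dividing the returned counts by their sum gives the estimated probabilities. In this sketch a query costs at most one map lookup per template node, regardless of the training image size, and the simplification from $T$ to $T_{-j}$ happens implicitly when the walk down the tree stops early.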

Where Defined

In header file <cdf_estimators.h>

Template Parameters

`T`		is the type used to represent a category. It is a "discrete type", e.g. `int` or `bool`.
`allocator`		is an object that manages memory allocation for the search tree. Two models of allocator are available: a pool allocator and a standard allocator. The standard allocator allocates memory only when it is needed. This keeps memory use as low as possible, but may not be efficient in terms of speed: memory allocation is a slow operation, and a significant speed improvement can be achieved by reducing the number of times memory is allocated. This is what the pool allocator does: memory is allocated in large chunks, or pools, and when new space is needed, it is taken from the pool without asking the system for new memory. The standard allocator should be used when memory is an issue, while the pool allocator is useful to improve performance. The pool allocator is the default allocator.
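The pooling idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's allocator: the class name `ToyPool`, its interface, and the fixed chunk size are all hypothetical, and objects are never returned to the pool individually.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical sketch of a pool: storage is requested from the system
// one chunk at a time and handed out one object at a time.
template <class Node>
class ToyPool {
public:
    explicit ToyPool(std::size_t chunk_size) : chunk_size_(chunk_size) {
        assert(chunk_size > 0);
    }

    // Returns a pointer to one value-initialized Node taken from the
    // current chunk; a new chunk is allocated only when it is full.
    Node* allocate() {
        if (chunks_.empty() || used_ == chunk_size_) {
            chunks_.push_back(std::make_unique<Node[]>(chunk_size_));
            used_ = 0;
        }
        return &chunks_.back()[used_++];
    }

    // Number of times memory was actually requested from the system.
    std::size_t system_allocations() const { return chunks_.size(); }

private:
    std::size_t chunk_size_;
    std::size_t used_ = 0;  // slots consumed in the current chunk
    std::vector<std::unique_ptr<Node[]>> chunks_;  // all chunks, freed together
};
```

With a chunk size of 4, ten calls to `allocate()` trigger only three system allocations instead of ten. The price is the one mentioned in the text: up to a full chunk of memory may sit unused.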

Model of

Single Variable Cdf Estimator

Members

`template<class window_neighborhood, class forward_iterator> search_tree::search_tree(forward_iterator begin, forward_iterator end, window_neighborhood& neighbors, int window_size, int nb_of_categories)`
Constructs a search_tree.
forward_iterator is a model of Forward Iterator. It iterates over a set of geo-values, the training image.
window_neighborhood is a model of Window Neighborhood. It is used to define the data event associated with each geo-value in the range [begin,end).
window_size is the maximum number of geo-values that neighbors can contain. nb_of_categories is the number of categories (e.g. "sand", "mud") to work with.
`template<class location, class non_param_cdf> int search_tree::operator()(const location& u, const window_neighborhood& neighbors, non_param_cdf& ccdf)`
Function call operator. It estimates the parameters of the non-parametric cdf and modifies ccdf accordingly. location is a model of Location, and window_neighborhood is a model of Window Neighborhood.
u is the location at which the conditional cdf is estimated.
neighbors is the neighborhood of location u. It must be the same object as the one used in the search tree constructor, or a different object with the same characteristics. The order in which the neighbors are stored inside the neighborhood is important. Denote by $\mathbf{u} + \mathbf{h}_i$, $i = 1, \ldots, N$, the locations of the $N$ neighbors of $\mathbf{u}$. The algorithm assumes that the $i$-th geo-value in the neighborhood is $Z(\mathbf{u} + \mathbf{h}_i)$. The function returns 0 if no problem occurred.

nicolas
2002-05-07