pairwisedist.pairwisedist.ys1_distance

pairwisedist.pairwisedist.ys1_distance(data: ndarray, omega1: float = 0.5, omega2: float = 0.25, omega3: float = 0.25, rowvar: bool = True, similarity: bool = False) → ndarray

Calculates the pairwise YS1 distance matrix for a given array of n samples by p features, as described in (Son YS, Baek J 2008, Pattern Recognition Letters). The YS1 dissimilarity ranges between 0 and 1. It takes into account the Spearman rank correlation between the samples (S* i,j), the position of the minimal and maximal values of each sample (M i,j), and the agreement of their slopes (A i,j). The final score (YS1 i,j) is a weighted average of these three parameters:

YS1 i,j = omega1 * (S* i,j) + omega2 * (A i,j) + omega3 * (M i,j)
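The weighted average above can be sketched for a single pair of samples as follows. This is only an illustration of the arithmetic, not the library's implementation: `combine_ys1` is a hypothetical helper, and the three component values are made-up numbers assumed to lie in [0, 1].

```python
def combine_ys1(s_star, a, m, omega1=0.5, omega2=0.25, omega3=0.25):
    """Weighted average of the three YS1 components for one pair (i, j).

    s_star, a and m are assumed to be precomputed values in [0, 1];
    this hypothetical helper does not compute them.
    """
    return omega1 * s_star + omega2 * a + omega3 * m


# Made-up component values for one pair of samples.
score = combine_ys1(s_star=0.8, a=0.6, m=1.0)
print(score)
```

With the default weights, these inputs give 0.5 * 0.8 + 0.25 * 0.6 + 0.25 * 1.0 = 0.8, up to floating-point rounding.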

Parameters

data (np.ndarray) – an n-by-p NumPy array of n samples by p features, on which to calculate pairwise distances.
omega1 (float between 0 and 1 (default=0.5)) – Relative weight of the correlation (S* i,j) component of the YS1 distance. All three relative weights (omega1-3) must add up to exactly 1.0.
omega2 (float between 0 and 1 (default=0.25)) – Relative weight of the slope concordance (A i,j) component of the YS1 distance. All three relative weights (omega1-3) must add up to exactly 1.0.
omega3 (float between 0 and 1 (default=0.25)) – Relative weight of the minimum-maximum similarity (M i,j) component of the YS1 distance. All three relative weights (omega1-3) must add up to exactly 1.0.
rowvar (bool (default=True)) – If True, calculates the pairwise distance between the rows of ‘data’. If False, calculates the pairwise distance between the columns of ‘data’.
similarity (bool (default=False)) – If False, returns a pairwise distance matrix (0 means closest, 1 means furthest). If True, returns a pairwise similarity matrix (1 means most similar, 0 means most different).
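Because the three weights must each lie between 0 and 1 and must sum to 1.0, it can be convenient to check them before calling the function. The sketch below is one possible pre-check, not part of pairwisedist: `check_weights` and its `tol` parameter are hypothetical, and the tolerance is an assumption added because weights produced by floating-point arithmetic may miss 1.0 by a rounding error. How strictly the library itself compares the sum is not stated here.

```python
import math


def check_weights(omega1, omega2, omega3, tol=1e-9):
    """Hypothetical pre-check for the YS1 weights (not a pairwisedist function)."""
    weights = (omega1, omega2, omega3)
    # Each weight is documented as a float between 0 and 1.
    if any(not 0.0 <= w <= 1.0 for w in weights):
        raise ValueError("each weight must be between 0 and 1")
    # Assumption: allow a small tolerance around the documented sum of 1.0.
    if not math.isclose(sum(weights), 1.0, abs_tol=tol):
        raise ValueError(f"weights must sum to 1.0, got {sum(weights)}")
    return weights


# The documented defaults pass the check.
omega1, omega2, omega3 = check_weights(0.5, 0.25, 0.25)
```

The validated values can then be passed as the `omega1`, `omega2` and `omega3` arguments of `ys1_distance`.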

Returns

an n-by-n NumPy array of pairwise YS1 dissimilarity scores (or similarity scores, if similarity is True).

Return type

np.ndarray
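The effect of the `rowvar` and `similarity` flags on the result can be illustrated with a small stand-in written in plain Python. Everything in this sketch is hypothetical: `pairwise_matrix` and `toy_distance` are not pairwisedist functions, the toy distance has nothing to do with YS1, and treating similarity as `1 - distance` is an assumption suggested by the 0-to-1 ranges described above. It also assumes that with `rowvar=False` the result has one row and one column per column of the input.

```python
def pairwise_matrix(data, pair_distance, rowvar=True, similarity=False):
    """Illustrative stand-in, not the library's implementation.

    data: list of equal-length rows.
    pair_distance: any function of two samples returning a value in [0, 1].
    """
    # With rowvar=False the columns are treated as the samples.
    samples = [list(r) for r in data] if rowvar else [list(c) for c in zip(*data)]
    k = len(samples)
    dist = [[pair_distance(samples[i], samples[j]) for j in range(k)]
            for i in range(k)]
    if similarity:
        # Assumption: similarity is the complement of the distance.
        return [[1.0 - d for d in row] for row in dist]
    return dist


def toy_distance(x, y):
    """Placeholder distance in [0, 1]: fraction of positions that differ."""
    return sum(a != b for a, b in zip(x, y)) / len(x)


data = [[1, 2, 3], [1, 2, 4]]  # 2 samples by 3 features
by_rows = pairwise_matrix(data, toy_distance)                # 2-by-2
by_cols = pairwise_matrix(data, toy_distance, rowvar=False)  # 3-by-3
as_sim = pairwise_matrix(data, toy_distance, similarity=True)
```

In this sketch the diagonal of a distance matrix is 0 and the diagonal of a similarity matrix is 1, matching the "0 means closest" and "1 means most similar" conventions described for the `similarity` parameter.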