Data, Analytics, Intelligence

Tsallis entropy

Gini impurity is Tsallis entropy with q = 2, and Boltzmann–Gibbs entropy is the q -> 1 limit of Tsallis entropy.

Given a discrete set of probabilities \{p_i\} with the condition \sum_i p_i = 1, and q any real number, the Tsallis entropy is defined as

S_q(\{p_i\}) = \frac{k}{q-1} \left(1 - \sum_i p_i^q\right),


where q is a real parameter sometimes called the entropic index. In the limit q \to 1, the usual Boltzmann–Gibbs entropy is recovered, namely

S_{BG} = S_1(p) = -k \sum_i p_i \ln p_i.
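A minimal Python sketch of the definition (the distribution p and the choice k = 1 are assumed for illustration), showing both the q -> 1 Boltzmann–Gibbs limit and the q = 2 case:

```python
import math

def tsallis_entropy(p, q, k=1.0):
    """Discrete Tsallis entropy S_q = k/(q-1) * (1 - sum_i p_i^q)."""
    if q == 1:
        # Boltzmann-Gibbs limit: S_1 = -k * sum_i p_i * ln(p_i)
        return -k * sum(pi * math.log(pi) for pi in p if pi > 0)
    return k / (q - 1) * (1 - sum(pi ** q for pi in p))

p = [0.5, 0.3, 0.2]                     # assumed example distribution
s_bg = tsallis_entropy(p, 1)            # Boltzmann-Gibbs (Shannon) entropy
s_near1 = tsallis_entropy(p, 1 + 1e-6)  # numerical q -> 1 limit
s_gini = tsallis_entropy(p, 2)          # q = 2: Gini impurity 1 - sum p_i^2
```

As q approaches 1, the numerical value converges to the Boltzmann–Gibbs entropy, and at q = 2 the formula reduces to the Gini impurity.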

For continuous probability distributions, the entropy is defined as

S_q[p] = \frac{1}{q-1} \left(1 - \int (p(x))^q \, dx\right),

where p(x) is a probability density function.
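A quick numeric check of the continuous definition, using an assumed uniform density p(x) = 1/a on [0, a], for which the integral has the closed form \int p(x)^q dx = a^{1-q}:

```python
# Assumed example: uniform density p(x) = 1/a on [0, a], with q = 2.
a, q = 3.0, 2.0
n = 100_000
dx = a / n
# Riemann sum of p(x)^q over [0, a]
integral = sum((1.0 / a) ** q * dx for _ in range(n))
s_q_numeric = (1 - integral) / (q - 1)
s_q_exact = (1 - a ** (1 - q)) / (q - 1)   # closed form for the uniform density
```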

Tsallis entropy has been used along with the principle of maximum entropy to derive the Tsallis distribution.

Various relationships

The discrete Tsallis entropy satisfies

S_q = -\lim_{x \to 1} D_q \sum_i p_i^x,

where D_q is the q-derivative with respect to x. This may be compared to the standard entropy formula:

S = -\lim_{x \to 1} \frac{d}{dx} \sum_i p_i^x
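A numeric sketch of this relationship, using the Jackson q-derivative D_q f(x) = (f(qx) - f(x)) / ((q - 1) x) and an assumed example distribution; because \sum_i p_i^x is smooth, evaluating at x = 1 already gives the limit:

```python
p = [0.5, 0.3, 0.2]   # assumed example distribution
q = 2.0

def qderiv(f, x, q):
    """Jackson q-derivative: D_q f(x) = (f(q*x) - f(x)) / ((q - 1) * x)."""
    return (f(q * x) - f(x)) / ((q - 1) * x)

f = lambda x: sum(pi ** x for pi in p)                   # sum_i p_i^x
s_q_via_deriv = -qderiv(f, 1.0, q)                       # -D_q sum_i p_i^x at x = 1
s_q_direct = (1 - sum(pi ** q for pi in p)) / (q - 1)    # S_q with k = 1
```

Both routes give S_2 = 1 - \sum_i p_i^2 = 0.62, i.e. the Gini impurity of p.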




Gini impurity and Entropy

Gini impurity is Tsallis entropy with q = 2, and Boltzmann–Gibbs entropy is its q -> 1 limit.


\sum_i p_i^2 is the probability of obtaining two identical outputs in two independent draws, regardless of category.

Gini impurity, 1 - \sum_i p_i^2, is the chance that the two draws are not identical in any category: it measures how often a randomly chosen element from the set would be incorrectly labeled if it were labeled randomly according to the distribution of labels in the subset.
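This interpretation can be checked by simulation (a sketch; the label distribution p is an assumed example): draw an element and an independent random label, both according to p, and count how often they disagree.

```python
import random

p = [0.5, 0.3, 0.2]                   # assumed label distribution in the subset
gini = 1 - sum(pi ** 2 for pi in p)   # Gini impurity = 0.62 for this p

# Monte Carlo check: draw an element and an independent random label,
# both according to p; Gini is the probability they disagree.
random.seed(0)
labels = [0, 1, 2]
trials = 200_000
mismatches = sum(
    random.choices(labels, weights=p)[0] != random.choices(labels, weights=p)[0]
    for _ in range(trials)
)
estimate = mismatches / trials
```

The mismatch frequency converges to the Gini impurity as the number of trials grows.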

Gini is intended for continuous attributes, and entropy for attributes that occur in classes.

Gini aims to minimize misclassification; entropy is suited to exploratory analysis.

Generally, performance will not change whether you use Gini impurity or entropy.

Laura Elena Raileanu and Kilian Stoffel compared both in “Theoretical comparison between the gini index and information gain criteria”. The most important remark was:

  • It only matters in 2% of the cases whether you use Gini impurity or entropy.

For the case of a variable with two values, appearing with fractions f and (1 - f), the Gini and entropy are given by:

gini = 2 * f * (1 - f)
entropy = f * ln(1/f) + (1 - f) * ln(1/(1 - f))

These measures are very similar when scaled to a maximum of 1.0 (plotting 2*gini and entropy/ln(2)):
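The scaled comparison can be reproduced numerically (a sketch; the sample points for f are arbitrary):

```python
import math

def gini2(f):
    """Two-class Gini: 2 * f * (1 - f), maximum 0.5 at f = 0.5."""
    return 2 * f * (1 - f)

def entropy2(f):
    """Two-class entropy: f*ln(1/f) + (1-f)*ln(1/(1-f)), maximum ln(2)."""
    return f * math.log(1 / f) + (1 - f) * math.log(1 / (1 - f))

# Scale both to a maximum of 1.0 and compare across a range of f values.
pairs = [(2 * gini2(f), entropy2(f) / math.log(2))
         for f in (0.1, 0.25, 0.5, 0.75, 0.9)]
max_gap = max(abs(g - e) for g, e in pairs)
```

Both scaled curves reach exactly 1.0 at f = 0.5, and the gap between them stays small over the whole range.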

[Figure: Gini (y4, purple) and entropy (y3, green) values scaled for comparison]