The Transmuted Kumaraswamy Pareto Distribution

Generalizing probability distributions to improve their flexibility in capturing the shape and tail behavior of disparate data sets has become one of the most active areas of statistical research. We developed a new generalization of the Pareto distribution by applying the quadratic rank transmutation map to the Kumaraswamy Pareto distribution, which is itself an extension of the Pareto distribution obtained with the Kumaraswamy generator. This method is called transmutation. The mathematical properties of the new generalized distribution were presented through its quartiles, moments, entropy, order statistics, mean deviation and the maximum likelihood method of parameter estimation. The new generalized distribution was applied to real-life data on the exceedances of flood peaks (in m³/s), where it was observed to be superior to its sub-models in terms of goodness-of-fit measures such as the log-likelihood, the Akaike Information Criterion (AIC) and the Kolmogorov-Smirnov (K-S) statistic.


Introduction
In model development, probability distributions have been used in real-life applications in finance, medicine, engineering, agriculture and environmental sciences, to mention but a few. These distributions include the Gamma, Pareto, Weibull, Exponential, Lindley and Kumaraswamy distributions and their generalizations.
In some cases, classical probability distributions do not provide an adequate fit to real-life data in terms of goodness-of-fit measures; for instance, the normal distribution cannot adequately describe the patterns of asymmetric or skewed data (Urama et al. [25]). Kumaraswamy [19] studied the Kumaraswamy distribution for double-bounded random processes and applied it in hydrology.
Azzalini [7] studied a class of new distributions that includes the normal ones. This work gave rise to new methods for generalizing univariate continuous probability distributions.
Oguntunde et al. [22] remarked that attention has been shifted to comparing the performance of compound distributions to that of standard theoretical distributions.

The quadratic rank transmutation map (QRTM)
Let $F_1$ and $F_2$ be the cumulative distribution functions of two distributions that have a common sample space, and let $F_1^{-1}$ and $F_2^{-1}$ be their inverse functions, also called the quantile functions. The general rank transmutation given by Shaw and Buckley [24] can be defined as $G_{R12}(u) = F_2(F_1^{-1}(u))$ and $G_{R21}(u) = F_1(F_2^{-1}(u))$. The first expression is called the sample transmutation map while the latter is known as the quadratic transmutation map. The functions $G_{R12}(u)$ and $G_{R21}(u)$ both map the unit interval $[0, 1]$ into itself. In the quadratic case, $G_{R12}(u) = u + \lambda u(1 - u)$, $|\lambda| \le 1$, from which it follows that the cumulative distribution functions (cdfs) satisfy the relationship $F_2(x) = (1 + \lambda)F_1(x) - \lambda[F_1(x)]^2$. Differentiation then yields $f_2(x) = f_1(x)[(1 + \lambda) - 2\lambda F_1(x)]$, where $f_1$ and $f_2$ are the probability density functions associated with the cumulative distribution functions $F_1$ and $F_2$ respectively. Therefore, a random variable $X$ is said to have a transmuted distribution if its cumulative distribution function (cdf) and probability density function (pdf) satisfy the following relationships:

$$F(x) = (1 + \lambda)G(x) - \lambda[G(x)]^2, \quad x > 0, \quad -1 < \lambda < 1 \qquad (1.1)$$

and

$$f(x) = g(x)[(1 + \lambda) - 2\lambda G(x)], \qquad (1.2)$$

where $g(x)$ and $f(x)$ are the probability density functions associated with the cumulative distribution functions $G(x)$ and $F(x)$ respectively.
Generally, the quadratic rank transmutation map has been used as a convenient way of constructing new distributions, in particular survival time models. According to Shaw and Buckley [24], transmutation maps comprise the functional composition of the cumulative distribution function of one distribution with the inverse cumulative distribution (quantile) function of another. Differentiating equation (1.1) gives the corresponding probability density function of the transmuted distribution, $f(x) = g(x)[(1 + \lambda) - 2\lambda G(x)]$, where $\lambda$ is the transmutation parameter, $G(x)$ is the baseline distribution function and $g(x)$ is its density. Equations (1.1) and (1.2) reduce to the distribution function and probability density function of the baseline distribution when the transmutation parameter takes the value zero.
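The construction in equations (1.1) and (1.2) can be sketched in a few lines of code. This is a minimal illustration rather than the authors' implementation: the helper name `transmute` and the unit exponential baseline are assumptions chosen only to make the example concrete.

```python
import math


def transmute(G, g, lam):
    """Return (F, f), the transmuted cdf and pdf built from a baseline.

    G and g are the baseline cdf and pdf (callables of x); lam is the
    transmutation parameter, restricted here to |lam| <= 1.
    """
    if abs(lam) > 1:
        raise ValueError("lam must satisfy |lam| <= 1")

    def F(x):
        # Equation (1.1): F = (1 + lam) * G - lam * G^2
        return (1 + lam) * G(x) - lam * G(x) ** 2

    def f(x):
        # Equation (1.2): f = g * [(1 + lam) - 2 * lam * G]
        return g(x) * ((1 + lam) - 2 * lam * G(x))

    return F, f


# Hypothetical baseline used only for illustration: unit exponential.
def exp_cdf(x):
    return 1 - math.exp(-x)


def exp_pdf(x):
    return math.exp(-x)


F, f = transmute(exp_cdf, exp_pdf, lam=0.5)
```

Calling `transmute` with `lam=0.0` returns functions that agree with the baseline, which mirrors the reduction noted for equations (1.1) and (1.2).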

The transmuted Kumaraswamy Pareto distribution
In this study we propose a four-parameter probability distribution called the transmuted Kumaraswamy Pareto distribution. The Kumaraswamy Pareto distribution is an extension of the Pareto distribution that takes the Kumaraswamy distribution function as the generator. It is not common in the literature and has not been fully exploited, although its performance in modeling datasets exhibiting extreme value properties was remarked by Bourguignon et al. [9]. They observed that the Kumaraswamy Pareto distribution provided a good fit to exceedances of flood peaks data; however, much still needs to be done to improve its goodness of fit by adding an extra parameter to the existing parameters of the model. According to Bourguignon et al. [8], a random variable $X$ is distributed as Kumaraswamy Pareto if its probability density function is defined as follows, with the corresponding distribution function given as
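For reference, the Kumaraswamy Pareto density and distribution function are commonly written in the form below. The symbols are an assumption and may differ from the authors' notation: $\beta > 0$ is the scale (location) parameter of the Pareto baseline and $a, b, k > 0$ are shape parameters.

```latex
g(x) = \frac{a\,b\,k\,\beta^{k}}{x^{k+1}}
       \left[1 - \left(\frac{\beta}{x}\right)^{k}\right]^{a-1}
       \left\{1 - \left[1 - \left(\frac{\beta}{x}\right)^{k}\right]^{a}\right\}^{b-1},
       \qquad x \ge \beta,

G(x) = 1 - \left\{1 - \left[1 - \left(\frac{\beta}{x}\right)^{k}\right]^{a}\right\}^{b},
       \qquad x \ge \beta.
```

Differentiating $G(x)$ with the chain rule recovers $g(x)$, and setting $a = b = 1$ gives back the Pareto distribution function $1 - (\beta/x)^k$.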

Applying the quadratic rank transmutation map to a baseline cdf $G(x)$ gives the cdf of the transmuted distribution,

$$F(x) = (1 + \lambda)G(x) - \lambda[G(x)]^2. \qquad (1.5)$$

Differentiating equation (1.5) yields the density function of the transmuted distribution, $f(x) = g(x)[(1 + \lambda) - 2\lambda G(x)]$, where $G(x)$ is the cdf of the baseline distribution, and $f(x)$ and $g(x)$ are the probability density functions (pdfs) associated with $F(x)$ and $G(x)$ respectively. It is important to note that if $\lambda = 0$, equation (1.5) becomes $G(x)$, the distribution function of the baseline random variable; otherwise it is the distribution function of the transmuted random variable. Similarly, the probability density function of the transmuted distribution equals that of the baseline distribution when $\lambda = 0$. The density of the transmuted Kumaraswamy Pareto distribution in equation (1.7) applies for $x > \beta$; otherwise, $f(x) = 0$.
Note: If $\lambda = 0$, equation (1.7) reduces to the probability density function of the Kumaraswamy Pareto distribution.
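The reduction to the Kumaraswamy Pareto case can be checked numerically with a small sketch. It assumes the Kumaraswamy Pareto cdf in the commonly cited form $G(x) = 1 - \{1 - [1 - (\beta/x)^k]^a\}^b$ for $x > \beta$; the function names and the parameter names `a`, `b`, `k`, `beta`, `lam` are illustrative choices, not the authors' code.

```python
def kp_cdf(x, a, b, k, beta):
    """Kumaraswamy Pareto cdf (assumed form); zero for x <= beta."""
    if x <= beta:
        return 0.0
    p = 1.0 - (beta / x) ** k            # Pareto cdf
    return 1.0 - (1.0 - p ** a) ** b


def kp_pdf(x, a, b, k, beta):
    """Kumaraswamy Pareto pdf (assumed form); zero for x <= beta."""
    if x <= beta:
        return 0.0
    p = 1.0 - (beta / x) ** k
    pareto_pdf = k * beta ** k / x ** (k + 1)
    return a * b * pareto_pdf * p ** (a - 1) * (1.0 - p ** a) ** (b - 1)


def tkp_cdf(x, a, b, k, beta, lam):
    """Transmuted cdf: (1 + lam) * G - lam * G^2 with G the baseline cdf."""
    G = kp_cdf(x, a, b, k, beta)
    return (1.0 + lam) * G - lam * G * G


def tkp_pdf(x, a, b, k, beta, lam):
    """Transmuted pdf: g * [(1 + lam) - 2 * lam * G]."""
    G = kp_cdf(x, a, b, k, beta)
    return kp_pdf(x, a, b, k, beta) * ((1.0 + lam) - 2.0 * lam * G)
```

With `lam=0.0` the two `tkp_*` functions coincide with the `kp_*` functions, and with `a = b = 1` as well they reduce to the Pareto cdf and pdf.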
Akinsete et al. [1] studied the beta-Pareto distribution and obtained the cumulative distribution function of the beta-Pareto random variable, in which $\alpha$ and $\beta$ are the shape parameters.
Akinsete et al. [1] found that the beta-Pareto distribution is unimodal and exhibits either a decreasing or a unimodal hazard rate. The authors derived expressions for the mean, mean deviation, variance, skewness, kurtosis and entropies. Maximum likelihood estimation was proposed for the parameters of the model, and the model was applied to a flood dataset to show that it can model heavy-tailed distributions.
Aryal and Tsokos [2] studied the transmuted generalized extreme value distribution by applying the transmutation map technique to the cumulative distribution function of the generalized extreme value (GEV) distribution. The cdf and the corresponding density function of a GEV random variable $X$ are parameterized by a location parameter $-\infty < \mu < \infty$, a scale parameter $\sigma$ and a shape parameter $-\infty < \xi < \infty$, where $\xi$ determines the tail behaviour of the distribution.
A particular case of the GEV distribution, for $\xi = 0$ and $-\infty < x < \infty$, is the Gumbel distribution. For $\xi > 0$ and $\xi < 0$ the GEV distribution becomes the Fréchet and the negative Weibull distributions respectively.
The authors studied the transmuted Gumbel distribution, whose cdf is equation (1.10); differentiating it yields equation (1.11), the probability density function (pdf) of a transmuted Gumbel random variable.
Aryal and Tsokos [2] further used the transmuted Gumbel distribution to model snowfall data, and it was observed to be a very good model for the data.

Aryal and Tsokos [2] developed the analytical framework of the transmuted extreme value probability distribution, derived expressions for basic statistical measures and provided the maximum likelihood equations for the parameters of the distribution. They also illustrated the usefulness and effectiveness of the transmuted Gumbel probability distribution by applying it to the snowfall data. The goodness-of-fit tests they carried out reveal that the data are well described by this distribution.
Aryal and Tsokos [3] defined and studied the transmuted Weibull distribution through its cumulative distribution function, defined for $x > 0$, $\eta, \sigma > 0$ and $|\lambda| \le 1$, where $X$ is the transmuted Weibull random variable and $\sigma$, $\eta$ and $\lambda$ are the parameters of the model. They provided a comprehensive description of the mathematical properties of the proposed distribution together with its reliability behavior. The authors also used two real-world datasets to illustrate the usefulness of this distribution for modeling reliability data and compared the results with those of the Exponentiated Weibull distribution through maximum likelihood estimates and log-likelihood values. It was also observed that the transmuted Weibull can be a good competitor to other generalized Weibull distributions.
Merovci [20] proposed a generalization of the Lindley distribution, using the quadratic rank transmutation map of Shaw and Buckley [24] to construct the transmuted Lindley distribution, which is defined through its cumulative distribution function (cdf), where $X$ is the transmuted Lindley random variable. The author studied several mathematical properties of the new distribution along with its reliability behaviour. The author also observed that the transmuted Lindley distribution is an extended probability model that can analyze more complex data and generalizes some widely used distributions. An uncensored data set of the remission times (in months) of a random sample of 128 bladder cancer patients was modeled using the transmuted Lindley, Exponential and Lindley distributions. According to the performance measures reported by the author, the transmuted Lindley is a better model than the Lindley distribution.
Elbatal and Elgarhy [15] introduced the Transmuted Quasi-Lindley distribution, which is a generalized form of the quasi-Lindley distribution. They used the quasi-Lindley distribution as the baseline distribution and defined the cumulative distribution function of the Transmuted Quasi-Lindley probability model. They derived the moments and moment generating function of the distribution, as well as the least squares, weighted least squares and maximum likelihood estimates of its parameters.
They derived various structural properties of the distribution, such as the moments, quantiles and mean deviations, and also estimated the model parameters by the maximum likelihood method. They suggested that this distribution can be used to model reliability data.
Elbatal [11] defined the transmuted modified inverse Weibull distribution through its cdf. The author examined structural properties of the distribution in view of its real-life applications.
In a similar fashion, Elbatal [12] also defined the Transmuted Generalized Inverted Exponential distribution through its cumulative distribution function (cdf). The author discussed some properties of the distribution, deriving its moments and other statistics, estimated the parameters of the model by the maximum likelihood method and also derived the information matrix.
Ashour and Eltehiwy [6] studied the Transmuted Exponentiated Modified Weibull distribution, taking the Exponentiated Modified Weibull as the baseline distribution, and obtained its cumulative distribution function (cdf), where $X$ is the Transmuted Exponentiated Modified Weibull random variable while $\alpha$, $\beta$ and $\lambda$ are model parameters. They derived various structural properties, including explicit expressions for the moments, quantiles and moment generating function of the distribution, and proposed the method of least squares for estimating the model parameters. They concluded that the distribution can be used to model reliability data.
Elbatal et al. [14] investigated a four-parameter generalized distribution called the Transmuted Generalized Linear Exponential distribution, defined through its cumulative distribution function (cdf). They derived the following mathematical properties of the distribution: a closed-form expression for the density, the relative distribution, quantiles, median, moments and moment generating function. They also estimated the unknown parameters by the maximum likelihood method. They compared the goodness of fit of some selected distributions with that of the Transmuted Generalized Linear Exponential distribution, using the Kolmogorov-Smirnov (K-S) distance statistic and its corresponding P-value on two real datasets, and found that the Transmuted Generalized Linear Exponential distribution competes favourably with some well-known distributions in modeling lifetime data.
Elbatal and Aryal [13] proposed and studied the Transmuted Additive Weibull distribution, which extends the Additive Weibull distribution and other distributions, and defined its cumulative distribution function (cdf). They derived explicit expressions for the moments, random number generation and order statistics of the distribution, and applied the maximum likelihood estimation procedure to obtain the unknown parameters of the model. They also used the analytical results to model real-life data. They compared the model with the Weibull and Additive Weibull distributions, and the results indicated that the Transmuted Additive Weibull distribution has the lowest Akaike Information Criterion (AIC), so it fits the real-life data better than the other models.
Khan and King [17] defined the Transmuted Modified Weibull distribution through its cdf. The authors studied several properties of the distribution and proposed both the method of least squares and the maximum likelihood method for estimating its parameters. The Transmuted Modified Weibull distribution was applied to a real dataset and compared with the Transmuted Exponential, Transmuted Weibull, Modified Weibull and Weibull distributions, and it performed better than the rest.

Methodology
Urama et al. [25] developed a new family of probability distributions called the transmuted Kumaraswamy Pareto distribution (TKPD), defined through its cdf and the corresponding pdf. The graphs of the hazard function are given in Figures 1 and 2, and the graphs of the survival function in Figures 3 and 4. An examination of the hazard function in Figures 1 and 2 shows that it possesses an upside-down bathtub shape, which is rare among survival and reliability models. This suggests that the TKPD can be used effectively to model processes whose failure rate initially increases and then decreases, as with many engineering devices subjected to work-hardening or biological systems that acquire immunity over time.

Deductions from the TKPD Distribution
We shall deduce the following cases from the above distribution:
1. The TKPD becomes the Kumaraswamy Pareto distribution when the transmutation parameter $\lambda = 0$.
Setting $F(x) = u$ in the cdf of the transmuted family and solving the resulting quadratic equation in $G(x)$, we have $G(x) = \frac{(1 + \lambda) - \sqrt{(1 + \lambda)^2 - 4\lambda u}}{2\lambda}$ for $\lambda \ne 0$, and taking the inverse of both sides gives $x = G^{-1}\left(\frac{(1 + \lambda) - \sqrt{(1 + \lambda)^2 - 4\lambda u}}{2\lambda}\right)$; hence the proof is established.
From Property 1, a random variate $X$ can be simulated from the transmuted family of distributions by evaluating this expression at $u = U$, where $U$ is a uniform random variable defined on the interval (0,1).
From the quantile property of the transmuted family of distributions of Shaw and Buckley, we deduce the following: the quantile function of the TKPD is given as

(2.11)
and hence the proof is established.
The first three quartiles of the TKPD are obtained by evaluating (2.11) at $u = 1/4$, $1/2$ and $3/4$. The second quartile $Q(1/2)$ corresponds to the median of the distribution. Thus

(2.12)
Hence, a random variate $X$ can be simulated from the TKPD by evaluating the quantile function (2.11) at $U$, where $U$ is a uniform random variable defined on the interval (0,1).
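The inversion steps described above can be sketched as follows. The sketch assumes the Kumaraswamy Pareto baseline in the commonly cited form $G(x) = 1 - \{1 - [1 - (\beta/x)^k]^a\}^b$, which inverts in closed form; the function names, the parameter names and the use of Python's `random` module are illustrative assumptions, not the authors' procedure.

```python
import math
import random


def kp_quantile(p, a, b, k, beta):
    """Inverse of the assumed Kumaraswamy Pareto cdf, for 0 <= p < 1."""
    inner = 1.0 - (1.0 - p) ** (1.0 / b)       # equals [1 - (beta/x)^k]^a
    return beta * (1.0 - inner ** (1.0 / a)) ** (-1.0 / k)


def tkp_quantile(u, a, b, k, beta, lam):
    """Quantile of the transmuted distribution at probability u."""
    if lam == 0.0:
        p = u                                  # no transmutation
    else:
        # Root of lam*G^2 - (1 + lam)*G + u = 0 that lies in [0, 1]
        p = ((1 + lam) - math.sqrt((1 + lam) ** 2 - 4 * lam * u)) / (2 * lam)
    return kp_quantile(p, a, b, k, beta)


def tkp_sample(n, a, b, k, beta, lam, seed=None):
    """Draw n variates by applying the quantile function to uniforms."""
    rng = random.Random(seed)
    return [tkp_quantile(rng.random(), a, b, k, beta, lam) for _ in range(n)]
```

The three quartiles follow from `tkp_quantile` evaluated at 0.25, 0.5 and 0.75, the middle one being the median.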

Moments of the TKPD distribution
The $r$th non-central moment of the TKPD random variable $X$ is given by $E(X^r) = \int_{\beta}^{\infty} x^r f(x)\,dx$. The $r$th non-central moment can be expressed as

(2.15)
Observe that the integral represents the $r$th non-central moment of the Pareto distribution with location parameter $\beta$ and a shape parameter that depends on $k$. Thus the $r$th non-central moment of the TKPD is an infinite linear combination of the $r$th non-central moments of the Pareto distribution. This is another useful and major result of this study.
Now if $Y$ is a Pareto random variable, the $r$th non-central moment of $Y$ was given by Bourguignon et al. [8] as

(2.16)
It follows that the $r$th non-central moment of the TKPD is obtained by substituting (2.16) into (2.15). The first four moments of the TKPD are given respectively by

(2.21)
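Series expressions for moments can be cross-checked numerically through the identity $E(X^r) = \int_0^1 [Q(u)]^r\,du$, which holds whenever the moment exists and the quantile function $Q$ is available. The sketch below is an illustration under that assumption; `raw_moment` and `pareto_quantile` are hypothetical helper names, and the plain Pareto case is used because its $r$th moment has the well-known closed form $\text{shape} \cdot \beta^r / (\text{shape} - r)$ for $r < \text{shape}$.

```python
def raw_moment(quantile, r, n=200_000):
    """Approximate E[X^r] as the integral of Q(u)^r over (0, 1), midpoint rule."""
    h = 1.0 / n
    return h * sum(quantile((i + 0.5) * h) ** r for i in range(n))


# Illustration with a plain Pareto quantile (beta = 1, shape = 5).
def pareto_quantile(u, beta=1.0, shape=5.0):
    return beta * (1.0 - u) ** (-1.0 / shape)


mean = raw_moment(pareto_quantile, 1)
second = raw_moment(pareto_quantile, 2)
variance = second - mean ** 2
```

Passing a TKPD quantile function in place of `pareto_quantile` would give numerical values to compare against the series moments, bearing in mind that the midpoint rule converges slowly for heavy tails.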
The $r$th central moments ($\mu_r$) and cumulants ($\kappa_r$) of the TKPD can be obtained from (2.15). The mean ($\mu$), variance ($\sigma^2$), skewness and kurtosis of the TKPD are obtained from (2.21). The quantile function can also be used in evaluating the skewness and kurtosis of a distribution, particularly when the quantile function exists in closed form. Galton [16] proposed a quantile-based approach for evaluating skewness.

(2.29)
Since the quantile function of the TKPD exists in closed form as given in (2.11), the above expressions can also be used in evaluating its skewness and kurtosis.
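Quantile-based shape measures are simple to compute once a quantile function is available. The sketch below uses Galton's skewness (often also called Bowley's) and, as an assumption about which kurtosis measure is intended, Moors' octile-based kurtosis; the function names are illustrative.

```python
import math


def galton_skewness(Q):
    """Galton (Bowley) skewness from a quantile function Q on (0, 1)."""
    return (Q(0.75) - 2 * Q(0.5) + Q(0.25)) / (Q(0.75) - Q(0.25))


def moors_kurtosis(Q):
    """Moors kurtosis based on octiles of the quantile function Q."""
    return (Q(7 / 8) - Q(5 / 8) + Q(3 / 8) - Q(1 / 8)) / (Q(6 / 8) - Q(2 / 8))


# Illustration with the unit exponential quantile function.
def exp_quantile(u):
    return -math.log(1.0 - u)


skew = galton_skewness(exp_quantile)
kurt = moors_kurtosis(exp_quantile)
```

Both measures depend only on a handful of quantiles, so they remain defined even when the corresponding moments do not exist.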

Moment generating function of the TKPD distribution
The moment generating function (mgf) of the TKPD can, by definition, be expressed as $M_X(t) = E(e^{tX}) = \int_{\beta}^{\infty} e^{tx} f(x)\,dx$.

Mean deviation of the TKPD
The dispersion and spread of a population about its center are often measured by the deviation from the mean and the deviation from the median. The absolute mean deviation about the mean, $D(\mu)$, and the absolute mean deviation about the median, $D(M)$, for the TKPD are defined as $D(\mu) = \int_{\beta}^{\infty} |x - \mu| f(x)\,dx$ and $D(M) = \int_{\beta}^{\infty} |x - M| f(x)\,dx$.

Order Statistics of the TKPD
Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from the TKPD and let $X_{1:n} < X_{2:n} < \cdots < X_{n:n}$ denote the corresponding order statistics. The pdf of the $i$th order statistic can be expressed as $f_{i:n}(x) = \frac{1}{B(i, n - i + 1)} f(x)[F(x)]^{i-1}[1 - F(x)]^{n-i}$, where $B(\cdot, \cdot)$ is the beta function.
Using the binomial expansion, this pdf can be written as a series in which the expectations are evaluated using standard procedures.
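The density of an order statistic can be evaluated directly from any cdf and pdf pair using the standard formula $f(x)[F(x)]^{i-1}[1 - F(x)]^{n-i} / B(i, n - i + 1)$. The sketch below is generic; the function name is illustrative and the uniform distribution is used only as a simple check case.

```python
import math


def order_stat_pdf(x, i, n, cdf, pdf):
    """pdf of the i-th order statistic (1 <= i <= n) in a sample of size n."""
    F = cdf(x)
    # log B(i, n - i + 1) via log-gamma to avoid large factorials
    log_beta = math.lgamma(i) + math.lgamma(n - i + 1) - math.lgamma(n + 1)
    return pdf(x) * F ** (i - 1) * (1.0 - F) ** (n - i) / math.exp(log_beta)


# Illustration: sample minimum and maximum of three standard uniforms.
def unif_cdf(x):
    return min(max(x, 0.0), 1.0)


def unif_pdf(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


min_density = order_stat_pdf(0.2, 1, 3, unif_cdf, unif_pdf)
max_density = order_stat_pdf(0.2, 3, 3, unif_cdf, unif_pdf)
```

Supplying the TKPD cdf and pdf as the `cdf` and `pdf` arguments gives the order-statistic density for the proposed distribution.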

Maximum likelihood estimation
Suppose $\delta$ is a column vector containing all the parameters of the TKPD. For a complete random sample $x_1, x_2, \ldots, x_n$ of size $n$ from the TKPD, the total log-likelihood function is given by

(2.55)
Observe that the estimate of the parameter $\beta$ corresponds to the first order statistic of the TKPD sample. That is, $\hat{\beta} = \min(X)$. We therefore only need to find the maximum likelihood estimates of the parameters $a$, $b$, $k$ and $\lambda$.
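A log-likelihood of this kind is straightforward to code as the sum of log-densities. The sketch below assumes the commonly cited Kumaraswamy Pareto form for the baseline and the transmuted density $g(x)[(1 + \lambda) - 2\lambda G(x)]$; the function name, the argument order and the toy data are hypothetical. In this sketch the density is taken as zero for $x \le \beta$, so `beta` is passed explicitly and any observation at or below it returns negative infinity.

```python
import math


def tkp_loglik(data, a, b, k, lam, beta):
    """Total log-likelihood of the TKPD under the assumed baseline form."""
    total = 0.0
    for x in data:
        if x <= beta:
            return -math.inf               # density is zero outside x > beta
        p = 1.0 - (beta / x) ** k          # Pareto cdf
        G = 1.0 - (1.0 - p ** a) ** b      # assumed Kumaraswamy Pareto cdf
        log_g = (math.log(a * b * k) + k * math.log(beta)
                 - (k + 1) * math.log(x)
                 + (a - 1) * math.log(p)
                 + (b - 1) * math.log(1.0 - p ** a))
        total += log_g + math.log((1 + lam) - 2 * lam * G)
    return total


# Hypothetical toy data, not the flood data analysed in the paper.
toy = [1.5, 2.0, 3.7]
value = tkp_loglik(toy, a=2.0, b=1.5, k=1.2, lam=0.3, beta=1.0)
```

A function like this would normally be handed to a numerical optimizer over `a`, `b`, `k` and `lam`, with `lam` kept strictly inside (-1, 1) so that the logarithm of the transmutation factor stays finite.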
Let $\Theta = (a, b, k, \lambda)^T$ be the unknown parameter vector of the TKPD. Then the associated score function is given by

(2.60)
The maximum likelihood estimate of $\Theta$, $\hat{\Theta}$, can be obtained by solving the non-linear system of equations obtained by setting the score function equal to zero. Since the resulting system of equations is not in closed form (an equation is in closed form if it can be expressed in a simple algebraic form), the solutions can be found numerically using an iterative scheme such as a Newton-Raphson type algorithm. The R software was used to derive the maximum likelihood parameter estimates for the real-life dataset in Table 3.1, which are displayed in Table 3.2. The Fisher information matrix (FIM) is the $4 \times 4$ matrix $[J_{i,j}(\Theta)]$, whose elements are determined by the second-order partial derivatives $\partial^2 \mathcal{L} / \partial\theta_i\,\partial\theta_j$. Thus, the elements of the FIM can be obtained by considering the second-order partial derivatives of the log-likelihood function with respect to the parameters. These elements can be obtained numerically using the R software. The total FIM, $I(\Theta)$, can be approximated by its value at the estimate, $I(\hat{\Theta})$. For real data, $I(\hat{\Theta})$ can be obtained once the maximum likelihood estimate of $\Theta$ is found, which requires convergence of the iterative numerical procedure used to find the estimate.
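One common way to obtain such second-order derivatives numerically is by central finite differences of the log-likelihood. The sketch below is written in Python rather than the R software used in the paper; the function name and step size are assumptions, and `loglik` stands for any function that maps a parameter list to a log-likelihood value.

```python
def observed_information(loglik, theta, h=1e-4):
    """Negative Hessian of loglik at theta by central finite differences."""
    p = len(theta)
    info = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            def shifted(di, dj):
                # Evaluate loglik with coordinates i and j moved by +/- h
                t = list(theta)
                t[i] += di * h
                t[j] += dj * h
                return loglik(t)
            second = (shifted(1, 1) - shifted(1, -1)
                      - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
            info[i][j] = -second
    return info


# Illustration with a quadratic "log-likelihood" whose Hessian is known.
def toy_loglik(t):
    return -(t[0] ** 2 + 3 * t[0] * t[1] + 2 * t[1] ** 2)


info = observed_information(toy_loglik, [1.0, 2.0])
```

Inverting the resulting matrix at the maximum likelihood estimate gives an approximate variance-covariance matrix for the parameters.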
Suppose $\hat{\Theta}$ is the maximum likelihood estimate of $\Theta$. Under the usual regularity conditions, and provided the parameters lie in the interior of the parameter space and not on its boundary, we have $\sqrt{n}(\hat{\Theta} - \Theta) \xrightarrow{d} N_4(\mathbf{0}, I^{-1}(\Theta))$, where $I^{-1}(\Theta)$ is the inverse of the expected FIM, which also corresponds to the variance-covariance matrix of the parameters. The asymptotic behavior remains valid if $I^{-1}(\Theta)$ is replaced by the inverse of the observed information matrix evaluated at $\hat{\Theta}$, that is $I^{-1}(\hat{\Theta})$. The multivariate normal distribution with mean vector $\mathbf{0} = (0, 0, 0, 0)^T$ and covariance matrix $I^{-1}(\Theta)$ can be used to construct confidence intervals for the TKPD parameters. The approximate $100(1 - \gamma)\%$ two-sided confidence intervals for the parameters $a$, $b$, $k$ and $\lambda$ are given by $\hat{a} \pm z_{\gamma/2}\sqrt{I^{-1}_{aa}(\hat{\Theta})}$, $\hat{b} \pm z_{\gamma/2}\sqrt{I^{-1}_{bb}(\hat{\Theta})}$, $\hat{k} \pm z_{\gamma/2}\sqrt{I^{-1}_{kk}(\hat{\Theta})}$ and $\hat{\lambda} \pm z_{\gamma/2}\sqrt{I^{-1}_{\lambda\lambda}(\hat{\Theta})}$, where $z_{\gamma/2}$ is the upper $(\gamma/2)$th percentile of a standard normal distribution.
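Given an estimate and the matching diagonal element of the inverse information matrix, each interval is a one-line computation. The sketch below uses Python's standard library for the normal percentile; the function name and the numerical inputs are hypothetical and are not results from the paper.

```python
from statistics import NormalDist


def wald_interval(estimate, variance, gamma=0.05):
    """Approximate 100(1 - gamma)% two-sided interval: estimate +/- z * sd."""
    z = NormalDist().inv_cdf(1.0 - gamma / 2.0)   # upper gamma/2 point
    half_width = z * variance ** 0.5
    return estimate - half_width, estimate + half_width


# Hypothetical estimate 2.0 with asymptotic variance 0.25.
lower, upper = wald_interval(2.0, 0.25)
```

Such intervals are symmetric about the estimate, so for a bounded parameter such as $\lambda$ they may extend beyond the admissible range when the sample is small.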

Application
For the application, we apply the proposed TKPD to real-life data. The data correspond to the exceedances of flood peaks (in m³/s) of the Wheaton River near Carcross in Yukon Territory, Canada. The data consist of 72 exceedances for the years 1958-1984, rounded to one decimal place. They were analyzed by Choulakian and Stephens [10] and are listed in Table 3.1. To aid the fitting, the data were scaled by 100; since scaling does not change the shape of a distribution, the result is the same as that obtained with the unscaled data. The purpose of scaling here is to improve the convergence rate of the numerical procedure used in obtaining the maximum likelihood estimates. The density plots and the histogram in Figure 5 show that the data are highly skewed to the right. We fit the TKPD to the data and compare the results with those of its sub-models, namely the Pareto (P) and Kumaraswamy Pareto (KP) distributions. The parameter estimates for all the distributions were computed using the R software. The maximum likelihood estimates of the parameters of all the fitted distributions, alongside the log-likelihood value (loglik), the Akaike Information Criterion (AIC) and the Kolmogorov-Smirnov (K-S) statistic, are reported in Table 3.3. Observe from Table 3.3 that the maximum likelihood estimate of the parameter $\beta$ corresponds to the minimum value of the data. This is so since $x \ge \beta$. Plots of the densities of the fitted distributions alongside the histogram of the data are given in Figure 5, the cdf plots of all the fitted distributions in Figure 6 and the P-P plots in Figure 7. Table 3.3 clearly reveals the superiority of the proposed TKPD over the other distributions, which are its sub-models. This is evident from the fact that the proposed TKPD possesses the smallest K-S statistic.
The graphs of the fitted distributions in Figure 5 also show that the TKPD density fits the histogram of the data better than the others. The cdf plots in Figure 6 and the P-P plots in Figure 7 further display the superiority of the TKPD over its sub-models. This supports the view that the generalization of the Pareto distribution contained in the newly proposed distribution is a useful one, particularly when the Pareto distribution fails to fit the data.
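The two comparison criteria used above can be computed with a few lines of code. This is a generic sketch, written in Python rather than the R software used for the analysis; the function names and the toy numbers are assumptions and do not reproduce the values in the tables.

```python
def aic(loglik, n_params):
    """Akaike Information Criterion: 2 * (number of parameters) - 2 * loglik."""
    return 2 * n_params - 2 * loglik


def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical cdf and a fitted cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        F = cdf(x)
        # Largest gap just after and just before each jump of the empirical cdf
        d = max(d, i / n - F, F - (i - 1) / n)
    return d


# Hypothetical illustration: three points against the standard uniform cdf.
d_toy = ks_statistic([0.1, 0.4, 0.7], lambda x: min(max(x, 0.0), 1.0))
aic_toy = aic(-10.0, 4)
```

Smaller values of both criteria indicate a better fit, which is the sense in which the fitted models are ranked.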

Conclusions
The major conclusion that can be drawn from the results is that combining two or more distributions to form a compound distribution is an effective tool for handling a wider range of real-life datasets, especially when the population has many characteristics and requires many parameters to describe the pattern and behaviour of a random phenomenon. In this study, the transmutation mapping technique was used to generate a univariate continuous probability distribution known as the Transmuted Kumaraswamy Pareto Distribution. The baseline distribution of the proposed distribution is a combination of two classical distributions, the Kumaraswamy distribution and the Pareto distribution. The Kumaraswamy Pareto distribution can also be seen as a generalized family of the Pareto distribution that uses the Kumaraswamy distribution as the generator.