Headrick and Kowalchuk (2007) outlined a general method for comparing a simulated distribution $\Large Y$ to a given theoretical distribution $\Large Y^*$. Note that these could easily be modified for comparison to an empirical vector of data:
Obtain the standardized cumulants (skewness,
kurtosis, fifth, and sixth) for $\Large
Y^*$. This can be done using calc_theory
along with
either the distribution name (plus up to 4 parameters) or the pdf fx
(plus support bounds). In the case of an empirical vector of data, use
calc_moments
or calc_fisherk
.
Obtain the constants for $\Large Y$. This can be done using
find_constants
or by simulating the distribution with
nonnormvar1
.
Determine whether these constants produce a valid power
method pdf. The results of find_constants
or
nonnormvar1
indicate whether the constants yield an invalid
or valid pdf. The constants may also be checked using
pdf_check
. If the constants generate an invalid pdf, the
user should check if the kurtosis falls above the lower bound (using
calc_lower_skurt
). If yes, a vector of sixth cumulant
correction values should be used in find_constants
or
nonnormvar1
to find the smallest correction that produces
valid pdf constants.
Select a critical value from $\Large Y^*$, i.e. $\Large y^*$ such that $\Large Pr(Y^* \ge y^*) = \alpha$. This can
be done using the appropriate quantile function and $\Large 1 - \alpha$ value
(i.e. qexp(1 - 0.05)
).
Solve $\Large m_{2}^{1/2} * p(z') + m_{1} - y^* = 0$ for $\Large z'$, where $\Large m_{1}$ and $\Large m_{2}$ are the 1st and 2nd moments of $\Large Y^*$.
Calculate $\Large 1 - \Phi(z')$, the corresponding probability for the approximation $\Large Y$ to $\Large Y^*$ (i.e. $\Large 1 - \Phi(z') = 0.05$) and compare to the target value $\Large \alpha$.
Plot a parametric graph of $\Large Y^*$ and $\Large Y$. This can be done with a set of
constants using plot_pdf_theory
(overlay
=
TRUE) or with a simulated vector of data using
plot_sim_pdf_theory
(overlay
= TRUE). If
comparing to an empirical vector of data, use plot_pdf_ext
or plot_sim_pdf_ext
.
Use these steps to compare a simulated exponential(mean = 2)
variable to the theoretical exponential(mean = 2) density.
(Note that the printr
package is invoked to display the
tables.)
In R, the exponential parameter is
rate <- 1/mean
.
Note that calc_theory
returns the standard deviation,
not the variance. The simulation functions require variance as the
input.
H_exp <- nonnormvar1("Polynomial", means = stcums[1], vars = stcums[2]^2,
skews = stcums[3], skurts = stcums[4],
fifths = stcums[5], sixths = stcums[6], n = 10000,
seed = 1234)
## Constants: Distribution 1
##
## Constants calculation time: 0 minutes
## Total Simulation time: 0 minutes
Look at the power method constants.
c0 | c1 | c2 | c3 | c4 | c5 |
---|---|---|---|---|---|
-0.3077396 | 0.8005605 | 0.318764 | 0.0335001 | -0.0036748 | 0.0001587 |
Look at a summary of the target distribution.
as.matrix(round(H_exp$summary_targetcont[, c("Distribution", "mean", "sd",
"skew", "skurtosis", "fifth",
"sixth")], 5), nrow = 1, ncol = 7,
byrow = TRUE)
Distribution | mean | sd | skew | skurtosis | fifth | sixth | |
---|---|---|---|---|---|---|---|
mean | 1 | 2 | 2 | 2 | 6 | 24 | 120 |
Compare to a summary of the simulated distribution.
as.matrix(round(H_exp$summary_continuous[, c("Distribution", "mean", "sd",
"skew", "skurtosis", "fifth",
"sixth")], 5), nrow = 1, ncol = 7,
byrow = TRUE)
Distribution | mean | sd | skew | skurtosis | fifth | sixth | |
---|---|---|---|---|---|---|---|
X1 | 1 | 1.99987 | 2.0024 | 2.03382 | 6.18067 | 23.74145 | 100.3358 |
Let $\Large \alpha = 0.05$.
## [1] 5.991465
Since the exponential(2) distribution has a mean and standard deviation equal to 2, solve $\Large 2 * p(z') + 2 - y_star = 0$ for $\Large z'$. Here, $\Large p(z') = c0 + c1 * z' + c2 * z'^2 + c3 * z'^3 + c4 * z'^4 + c5 * z'^5$.
f_exp <- function(z, c, y) {
return(2 * (c[1] + c[2] * z + c[3] * z^2 + c[4] * z^3 + c[5] * z^4 +
c[6] * z^5) + 2 - y)
}
z_prime <- uniroot(f_exp, interval = c(-1e06, 1e06),
c = as.numeric(H_exp$constants), y = y_star)$root
z_prime
## [1] 1.644926
## [1] 0.04999249
This is approximately equal to the $\Large \alpha$ value of 0.05, indicating the method provides a good approximation to the actual distribution.
We can also plot the empirical cdf and show the cumulative probability up to y_star.