For years I've been angry that Maple isn't capable of formally manipulating random vectors (aka multivariate random variables).
For the record Mathematica does.
The problem I'm concerned with is to create a vector W such that
type(W, RandomVariable)
will return true.
Of course defining W from its components w1, .., wN, where each wn is a random variable is easy, even if these components are correlated or, more generally dependent ( the two concepts being equivalent iif all the wn are gaussian random variables).
But one looses the property that W is no longer a (multivariate) random variable.
See a simple example here: NoRandomVectorsInMaple.mw
This is the reason why I've developped among years several pieces of code to build a few multivariate random variable (multinormal, Dirichlet, Logistic-Normal, Skew Multivariate Normal, ...).
In the framework of my activities, they are of great interest and the purpose of this post is to share what I have done on this subject by presenting the most classic example: the multivariate gaussian random variable.
My leading idea was (is) to build a package named MVStatistics on the image of the Statistics package but devoted to Multi Variate random variables.
I have already construct such a package aggregating about fifty different procedures. But this latter doesn't merit the appellation of "Maple package" because I'm not qualified to write something like this which would be at the same time perennial, robust, documented, open and conflict-free with the Statistics package.
In case any of you are interested in pursuing this work (because I'm about to change jobs), I can provide it all the different procedures I built to construct and manipulate multivariate random variables.
To help you understand the principles I used, here is the most iconic example of a multivariate gaussian random variable.
The attached file contains the following procedures
MVNormal
Constructs a gaussian random vector whose components can be mutually correlated
The statistics defined in Distribution are: (this list could be extended to other
statistics, provided they are "recognized" statitics, see at the end of this
post):
PDF
Mode
Mean
Variance
StandardDeviation = add(s[k]*x[k], k=1..K)
RandomSample
DispersionEllipse
Builds and draws the dispersion ellipses of a bivariate gaussia, random vector
DispersionEllipsoid
Builds and draws the dispersion ellipsoids of a trivariate gaussia, random vector
MVstat
Computes several statistics of a random vector (Mean, Variance, ...)
Iserlis
Computes the moments of any order of a gaussian random vector
MVCentralMoment
Computes the central moments of a gaussian random vector
Conditional
Builds the conditional random vector of a gaussian random vector wrt some of its components
the moments of any order of a gaussian random vector.
Note: the result has type RandomVariable.
MarginalizeAgainst
Builds the marginal random vector of a gaussian random vector wrt some of its components
the moments of any order of a gaussian random vector.
Note: the result has type RandomVariable.
MardiaNormalityTest
The multi-dimensional analogue of the Shapiro-Wilks normality test
HZNormalityTest
Henze-Zirkler test for Multivariate Normality
MVWaldWolfowitzTest
A multivariate version of the non-parametrix Wald-Folfowitz test
Do not hesitate to ask me any questions that might come to mind.
In particular, as Maple introduces limitations on the type of some attributes (for instance Mean must be of algebraic type), I've been forced to lure it by transforming vector or matrix quantities into algebraic ones.
An example is
Mean = add(m[k]*x[k], k=1..K)
where m[k] is the expectation of the kth component of this random vector.
This implies using the procedure MVstat to "decode", for instance, what Mean returns and write it as a vector.
MultivariateNormal.mw
About the statistics ths Statistics:-Distribution constructor recognizes:
To get them one can do this (the Normal distribution seems to be the continuous one with the most exhaustive list os statistics):
restart
with(Statistics):
X := RandomVariable(Normal(a, b)):
attributes(X);
protected, RandomVariable, _ProbabilityDistribution
map(e -> printf("%a\n", e), [exports(attributes(X)[3])]):
Conditions
ParentName
Parameters
CharacteristicFunction
CDF
CGF
HodgesLehmann
Mean
Median
MGF
Mode
PDF
RousseeuwCrouxSn
StandardDeviation
Support
Variance
CDFNumeric
QuantileNumeric
RandomSample
RandomSampleSetup
RandomVariate
MaximumLikelihoodEstimate
Unfortunately it happens that for some unknown reason a few statistics cannot be set by the user.
This is for instance the case of Parameters serious consequences in certain situations.
Among the other statistics that cannot be set by the user one finds:
- ParentName,
- QuantileNumeric whose role is not very clear, at least for me, but which I suspect is a procedure which "inverts" the CDF to give a numerical estimation of a quantile given its probability.
If it is so accessing QuantileNumeric would be of great interest for distributions whose the quantiles have no closed form expressions.
- CDFNumeric (same remark as above)
Finally, the statistics Conditions, which enables defining the conditions the elements of Parameters must verify are not at all suited for multivariate random variables.
It is for instance impossible to declare that the variance matrix (or the correlation matrix) is a square symmetric positive definite matrix).