The package offers multiple methods to discretize multivariate continuous data using a grid that captures the joint distribution by preserving clusters in the original data (Wang, Kumar, and Song 2020). Joint grid discretization is applicable as a data transformation step before using other methods to infer association, function, or causality without assuming a parametric model.
Most available discretization methods, such as those in the ‘Ckmeans.1d.dp’ package, process one variable at a time. Because discretizing each variable independently may misrepresent patterns that arise from the joint distribution of multiple variables, joint discretization can be beneficial. The methods can handle both unlabeled and labeled data.
To install the package from CRAN, run the following in R:
install.packages("GridOnClusters")
See the Examples vignette of the package.
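As a minimal sketch of joint discretization on unlabeled data, the snippet below builds three related continuous variables and discretizes them together. It assumes the function `discretize.jointly()` with a `k` argument (the number of clusters) and a returned list holding the discretized data in `D` and the grid lines in `grid`; argument names and defaults may differ between package versions, so check the package documentation for the version you have installed. The toy data are made up for illustration only.

```r
library(GridOnClusters)

set.seed(1)  # make the toy data reproducible

# Three continuous variables that share a joint pattern through x
x <- rnorm(100)
data <- cbind(x = x, y = sin(x), z = cos(x))

# Discretize all variables jointly; k = 5 clusters is an arbitrary choice here
result <- discretize.jointly(data, k = 5)

# Assumed return structure: D = discretized data, grid = cut points per variable
discretized <- result$D
grid_lines <- result$grid

dim(discretized)  # one row per observation, one column per variable
```

Each column of `discretized` holds the discrete level assigned to the corresponding variable, and `grid_lines` holds the decision boundaries chosen along each dimension.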
Wang J, Kumar S, Song M (2020). “Joint Grid Discretization for Biological Pattern Discovery.” In Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. Article no. 57. doi: 10.1145/3388440.3412415 (URL: https://doi.org/10.1145/3388440.3412415).