These are helper functions included in the package.
Generating background noise
The gen_bkgnoise() function allows users to generate
multivariate Gaussian noise to serve as background data in
high-dimensional spaces.
# Example: Generate 4D background noise
bkg_data <- gen_bkgnoise(n = 500, p = 4,
                         m = c(0, 0, 0, 0), s = c(2, 2, 2, 2))
head(bkg_data)
#> # A tibble: 6 × 4
#> x1 x2 x3 x4
#> <dbl> <dbl> <dbl> <dbl>
#> 1 -2.80 -1.42 -0.673 -1.16
#> 2 0.511 1.32 -0.432 -0.338
#> 3 -4.87 0.582 1.24 -3.84
#> 4 -0.0111 0.396 -2.57 -3.07
#> 5 1.24 -2.41 -2.60 -2.23
#> 6 2.30 -0.0796 -0.754 3.20
The generated data has independent dimensions with the specified means
(m) and standard deviations (s).
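As a rough illustration of what "independent dimensions" means here, the following base R sketch draws each column separately from its own normal distribution. The helper name sketch_bkgnoise() is hypothetical and this is not the package's implementation; it only mirrors the m and s arguments described above.

```r
# Hypothetical helper for illustration only (not the package's code)
sketch_bkgnoise <- function(n, m, s) {
  stopifnot(length(m) == length(s))
  p <- length(m)
  # Draw each column independently from N(m[j], s[j]^2)
  cols <- lapply(seq_len(p), function(j) rnorm(n, mean = m[j], sd = s[j]))
  names(cols) <- paste0("x", seq_len(p))
  as.data.frame(cols)
}

set.seed(1)
noise <- sketch_bkgnoise(n = 500, m = rep(0, 4), s = rep(2, 4))
dim(noise)                    # 500 rows, 4 columns
round(sapply(noise, sd), 1)   # each sample sd should be close to 2
```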
Randomizing rows
randomize_rows() shuffles the rows of the input data into a random
order.
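Row shuffling of this kind can be sketched in base R by indexing with a random permutation of the row numbers. The toy data frame below is made up for illustration, and the sketch is an assumption about the general technique rather than the package's own code.

```r
# Toy data frame for illustration
toy <- data.frame(id = 1:6, value = letters[1:6])

set.seed(42)
# sample(nrow(toy)) returns a random permutation of the row indices
shuffled <- toy[sample(nrow(toy)), , drop = FALSE]

# Same rows as before, only the order has changed
setequal(shuffled$id, toy$id)
```

With the package, the same step is a single call: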
randomized_data <- randomize_rows(bkg_data)
head(randomized_data)
#> # A tibble: 6 × 4
#> x1 x2 x3 x4
#> <dbl> <dbl> <dbl> <dbl>
#> 1 -1.84 -0.905 1.11 -1.04
#> 2 2.36 -3.49 -2.98 -0.247
#> 3 -0.492 2.86 0.381 -2.50
#> 4 -0.424 -3.13 -3.56 -0.177
#> 5 3.79 1.10 0.0810 -0.861
#> 6 1.02 -0.397 3.29 3.21
Relocating clusters
relocate_clusters() allows users to translate clusters
in any dimension(s). This is achieved by centering each cluster
(subtracting its mean) and then adding a translation vector from a
provided matrix (vert_mat).
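The center-then-translate step can be sketched in base R as follows. The sketch assumes that row i of the translation matrix belongs to cluster i, which is consistent with the example output in this section but is not confirmed as the package's internal logic. The names toy, shift, and moved are illustrative.

```r
set.seed(1)
toy <- data.frame(
  x1 = rnorm(8), x2 = rnorm(8),
  cluster = rep(1:2, each = 4)
)
# Assumed convention: row i of `shift` is the new center of cluster i
shift <- matrix(c(5, 0,
                  0, 5), nrow = 2, byrow = TRUE)

coord_cols <- c("x1", "x2")
moved <- toy
for (cl in unique(toy$cluster)) {
  idx <- toy$cluster == cl
  block <- as.matrix(toy[idx, coord_cols])
  # Subtract the cluster mean, then add the translation vector
  centered <- sweep(block, 2, colMeans(block), "-")
  moved[idx, coord_cols] <- sweep(centered, 2, shift[cl, ], "+")
}

# Cluster means now equal the rows of `shift` (up to floating point)
aggregate(cbind(x1, x2) ~ cluster, data = moved, FUN = mean)
```

The package example below applies the same idea to three clusters in four dimensions: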
df <- tibble::tibble(
  x1 = rnorm(12),
  x2 = rnorm(12),
  x3 = rnorm(12),
  x4 = rnorm(12),
  cluster = rep(1:3, each = 4)
)
vert_mat <- matrix(c(
  5, 0, 0, 0,
  0, 5, 0, 0,
  0, 0, 5, 0
), nrow = 3, byrow = TRUE)
relocated_df <- relocate_clusters(df, vert_mat)
head(relocated_df)
#> # A tibble: 6 × 5
#> x1 x2 x3 x4 cluster
#> <dbl> <dbl> <dbl> <dbl> <int>
#> 1 5.86 -0.435 -0.339 0.243 1
#> 2 -0.794 5.35 -0.0382 0.133 2
#> 3 1.01 -0.652 5.06 -0.592 3
#> 4 2.93 0.728 0.218 -1.56 1
#> 5 0.821 4.67 -0.266 -0.773 2
#> 6 0.0601 4.58 -0.269 1.40 2
Generating rotation matrices
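The gen_rotation() function creates a rotation matrix in
high-dimensional space for given planes and angles. Angles are given in
degrees: in the example below, the 60-degree rotation in the (1, 2)
plane puts cos(60°) = 0.5 on the diagonal.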
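One common way to build such a matrix is to start from the identity, fill in a 2 × 2 rotation for each chosen pair of axes, and multiply the results. The sketch below uses a hypothetical helper, plane_rotation(), and is an assumption about the general technique, not the package's source.

```r
# Hypothetical helper: rotation by `angle_deg` degrees in the plane of axes i and j
plane_rotation <- function(p, i, j, angle_deg) {
  theta <- angle_deg * pi / 180   # degrees to radians
  R <- diag(p)
  R[i, i] <- cos(theta); R[i, j] <- -sin(theta)
  R[j, i] <- sin(theta); R[j, j] <- cos(theta)
  R
}

# Compose two plane rotations by matrix multiplication
R <- plane_rotation(4, 1, 2, 60) %*% plane_rotation(4, 3, 4, 90)

# A rotation matrix is orthogonal with determinant 1
all.equal(crossprod(R), diag(4))
all.equal(det(R), 1)
```

The package takes the planes and angles as a list: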
rotations_4d <- list(
  list(plane = c(1, 2), angle = 60),
  list(plane = c(3, 4), angle = 90)
)
rot_mat <- gen_rotation(p = 4, planes_angles = rotations_4d)
rot_mat
#> [,1] [,2] [,3] [,4]
#> [1,] 0.5000000 -0.8660254 0.000000e+00 0.000000e+00
#> [2,] 0.8660254 0.5000000 0.000000e+00 0.000000e+00
#> [3,] 0.0000000 0.0000000 6.123234e-17 -1.000000e+00
#> [4,] 0.0000000 0.0000000 1.000000e+00 6.123234e-17
Normalizing data
When combining clusters or transforming data geometrically,
magnitudes can differ drastically. The normalize_data()
function rescales the entire dataset to fit within [-1, 1] by dividing
every value by the dataset's maximum absolute value.
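A minimal base R sketch of this kind of scaling is shown below, using made-up data. Because a single global factor is used, columns keep their relative magnitudes instead of each being stretched to the full range. The variable names are illustrative only.

```r
set.seed(1)
toy <- data.frame(x1 = rnorm(5, sd = 10), x2 = rnorm(5, sd = 0.1))

# One global scale factor for the whole dataset
max_abs <- max(abs(as.matrix(toy)))
scaled <- toy / max_abs

# All values now lie in [-1, 1], and the largest magnitude is 1
range(as.matrix(scaled))
```

Applied to the background noise generated earlier: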
norm_data <- normalize_data(bkg_data)
head(norm_data)
#> x1 x2 x3 x4
#> 1 -0.450293102 -0.22884604 -0.10829694 -0.18606977
#> 2 0.082117097 0.21297520 -0.06942023 -0.05439019
#> 3 -0.783892057 0.09363561 0.19977349 -0.61727879
#> 4 -0.001791881 0.06366882 -0.41297879 -0.49346293
#> 5 0.199908717 -0.38710048 -0.41814604 -0.35853835
#> 6 0.369361251 -0.01280627 -0.12117958 0.51390085
Generating cluster locations
To place clusters in different positions, gen_clustloc()
generates cluster centers in a simplex-like arrangement, so that the
centers are as close to mutually equidistant as possible. In the
example below, p = 4 and k = 5 give a 4 × 5 matrix with one column per
cluster center.
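For intuition, one classical way to obtain exactly equidistant centers when k ≤ p + 1 is to take the vertices of a regular simplex. The sketch below builds them from the centered identity matrix and its singular value decomposition. It is only an illustration of the geometric idea; the package's own construction may differ, and centers_sketch is a hypothetical name.

```r
k <- 5
E <- diag(k)                          # k points in R^k, pairwise distance sqrt(2)
C <- sweep(E, 2, colMeans(E), "-")    # center the points at the origin

# The centered points span a (k - 1)-dimensional subspace, so express
# them in k - 1 coordinates without changing pairwise distances
sv <- svd(C)
centers_sketch <- sv$u[, 1:(k - 1)] %*% diag(sv$d[1:(k - 1)])

# Rows are centers; every pair is the same distance apart
d <- as.vector(dist(centers_sketch))
range(d)
```

The package function is called with the dimension and the number of clusters: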
centers <- gen_clustloc(p = 4, k = 5)
head(centers)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0.9586298 -0.7847655 -0.67325802 -0.5253323 1.0247260
#> [2,] 0.2409763 -0.2831468 -1.86798385 -0.5513311 2.4614855
#> [3,] -0.3976958 1.2621977 -0.43202643 -0.5647853 0.1323098
#> [4,] 0.1294284 -0.6167717 0.01451185 0.3221455 0.1506859
Numeric generators
Two helper functions, gen_nproduct() and
gen_nsum(), generate numeric vectors of positive integers
that match a user-specified target product (approximately) or sum
(exactly), respectively.
The function gen_nsum(n, k) divides a total sum
n into k positive integers. It first assigns
an equal base value to each element and then randomly distributes any
remainder, ensuring the elements sum exactly to n.
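The base-plus-remainder scheme described above can be sketched with integer division. The helper name sketch_nsum() is hypothetical, and the sketch assumes n ≥ k so that every element is positive; it is not the package's code.

```r
# Hypothetical helper for illustration only (assumes n >= k)
sketch_nsum <- function(n, k) {
  base <- n %/% k        # equal share for every element
  remainder <- n %% k    # units left over after the equal split
  parts <- rep(base, k)
  # Hand out the leftover units to randomly chosen positions
  lucky <- sample.int(k, remainder)
  parts[lucky] <- parts[lucky] + 1
  parts
}

set.seed(1)
parts <- sketch_nsum(100, 3)
sum(parts)   # 100
```

The package function is used in the same way: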
gen_nsum(n = 100, k = 3)
#> [1] 34 33 33
The function gen_nproduct(n, p) aims to produce
p positive integers whose product is approximately
n. It starts with all elements equal to the rounded
p-th root of n and iteratively adjusts elements up or down in a
randomized manner until the product is within a small tolerance of
n. This accommodates the fact that an exact factorization of n into
p positive integers of similar size often does not exist.
gen_nproduct(n = 500, p = 4)
#> [1] 4 5 5 5
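The search described above can be sketched as a simple randomized adjustment loop. The helper name sketch_nproduct(), the 10% relative tolerance, and the iteration cap are all assumptions made for illustration; the package's actual tolerance and update rule may differ.

```r
# Hypothetical helper for illustration only
sketch_nproduct <- function(n, p, tol = 0.1, max_iter = 1000) {
  # Start with every element at the rounded p-th root of n
  x <- rep(max(1, round(n^(1 / p))), p)
  for (iter in seq_len(max_iter)) {
    # Stop once the relative error is within the tolerance
    if (abs(prod(x) - n) / n <= tol) break
    i <- sample.int(p, 1)        # pick a random element to adjust
    if (prod(x) < n) {
      x[i] <- x[i] + 1           # product too small: nudge up
    } else if (x[i] > 1) {
      x[i] <- x[i] - 1           # product too large: nudge down, stay positive
    }
  }
  x
}

set.seed(1)
res <- sketch_nproduct(n = 500, p = 4)
prod(res)   # 500
```

For n = 500 and p = 4 the starting vector is four 5s (product 625), so a single decrement of any element already gives 4 × 5 × 5 × 5 = 500.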