Fits a high-dimensional model using hexagonal bins. Options customize the modeling process: the choice of bin centroids or bin means, the removal of low-density hexagons, and the averaging of the high-dimensional data.


fit_highd_model(
  training_data,
  emb_df,
  bin1 = 4,
  r2,
  q = 0.1,
  is_bin_centroid = TRUE,
  is_rm_lwd_hex = FALSE,
  benchmark_to_rm_lwd_hex = NULL,
  col_start_highd = "x"
)



training_data
A tibble that contains the training high-dimensional data.

emb_df
A tibble that contains the embedding with a unique identifier.

bin1
Number of bins along the x axis (default is 4).

r2
The ratio of the ranges of the original embedding components.

q
The buffer amount as a proportion of the data range (default is 0.1).

is_bin_centroid
Logical, indicating whether to use bin centroids rather than bin means (default is TRUE).

is_rm_lwd_hex
Logical, indicating whether to remove low-density hexagons (default is FALSE).

benchmark_to_rm_lwd_hex
The benchmark value used to remove low-density hexagons (default is NULL).

col_start_highd
The text prefix for columns in the high-dimensional data (default is "x").
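
As a rough illustration of the r2 and q arguments, the sketch below computes the range ratio for a toy embedding and pads the x range by a buffer. The toy values and the names emb, range_x, range_y, and x_limits are hypothetical, and the way the buffer is applied here (padding both ends of the range) is an assumption, not necessarily what fit_highd_model does internally.

```r
# Toy 2D embedding (hypothetical values, for illustration only)
emb <- data.frame(
  UMAP1 = c(-2.0, -0.5, 1.0, 3.0),
  UMAP2 = c(0.5, 1.5, 2.0, 3.0)
)

# r2: range of the second embedding component divided by
# the range of the first, as in the example call on this page
range_x <- diff(range(emb$UMAP1))
range_y <- diff(range(emb$UMAP2))
r2 <- range_y / range_x

# q: buffer expressed as a proportion of the data range.
# Assumption: the buffer pads the range on both sides before binning.
q <- 0.1
x_limits <- c(min(emb$UMAP1) - q * range_x,
              max(emb$UMAP1) + q * range_x)
```

With q = 0.1 the padded limits extend 10% of the x range beyond each end of the data.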


A list with two elements: df_bin, a data frame of high-dimensional coordinates for the 2D bin centroids, and df_bin_centroids, a data frame with information about the hexagonal bin centroids in 2D.
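
To make the shape of df_bin concrete, here is a minimal base R sketch of averaging high-dimensional columns within each hexagon. It assumes that each observation has already been assigned a hexagon identifier and that the per-bin coordinates are plain column means; the toy data and the names toy, highd_cols, and df_bin_sketch are hypothetical and not part of the package.

```r
# Toy training data with a hexagon id per observation (hypothetical)
toy <- data.frame(
  hb_id = c(5L, 5L, 6L, 6L, 6L),
  x1 = c(1, 3, 0, 3, 6),
  x2 = c(2, 4, 1, 1, 1)
)

# Pick the high-dimensional columns by prefix,
# mirroring what col_start_highd = "x" describes
highd_cols <- names(toy)[startsWith(names(toy), "x")]

# Average each high-dimensional column within each hexagon
df_bin_sketch <- aggregate(
  toy[highd_cols],
  by = list(hb_id = toy$hb_id),
  FUN = mean
)
```

The result has one row per hexagon and one column per high-dimensional variable, which matches the layout of df_bin.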


r2 <- diff(range(s_curve_noise_umap$UMAP2)) / diff(range(s_curve_noise_umap$UMAP1))
fit_highd_model(training_data = s_curve_noise_training,
                emb_df = s_curve_noise_umap_scaled, bin1 = 4, r2 = r2,
                col_start_highd = "x")
#> $df_bin
#> # A tibble: 11 × 8
#>    hb_id      x1     x2       x3        x4         x5       x6         x7
#>    <int>   <dbl>  <dbl>    <dbl>     <dbl>      <dbl>    <dbl>      <dbl>
#>  1     5 -0.458  1.13   -1.78     0.00444  -0.0000532 -0.0265  -0.00143  
#>  2     6  0.491  1.51   -1.86     0.0114   -0.0106    -0.0291  -0.000367 
#>  3     9 -0.340  0.0682 -1.90    -0.000169  0.00304   -0.0130  -0.00288  
#>  4    10  0.798  0.537  -1.44     0.00424  -0.00131    0.0395   0.00177  
#>  5    14  0.734  1.25   -0.358   -0.00214   0.00479    0.0230  -0.0000563
#>  6    15  0.0705 1.65   -0.00255  0.00553  -0.00839   -0.0248   0.00926  
#>  7    19 -0.498  1.03    0.269    0.00358   0.00210    0.00889  0.000509 
#>  8    23 -0.747  0.805   1.58     0.00287  -0.00408    0.0159   0.000514 
#>  9    27  0.0106 1.54    1.90     0.00762   0.00272   -0.0161   0.00209  
#> 10    28  0.552  0.465   1.69    -0.00901   0.00561   -0.0155  -0.00369  
#> 11    31  0.901  1.55    1.38     0.000970  0.00252    0.00784 -0.00202  
#> $df_bin_centroids
#> # A tibble: 11 × 6
#>    hexID     c_x   c_y bin_counts std_counts drop_empty
#>    <int>   <dbl> <dbl>      <int>      <dbl> <lgl>     
#>  1     5  0.0915 0.130         14      1     FALSE     
#>  2     6  0.474  0.130          3      0.214 FALSE     
#>  3     9 -0.1    0.461          3      0.214 FALSE     
#>  4    10  0.283  0.461          6      0.429 FALSE     
#>  5    14  0.474  0.793          9      0.643 FALSE     
#>  6    15  0.857  0.793          2      0.143 FALSE     
#>  7    19  0.666  1.12          11      0.786 FALSE     
#>  8    23  0.857  1.46           4      0.286 FALSE     
#>  9    27  0.666  1.79           9      0.643 FALSE     
#> 10    28  1.05   1.79           9      0.643 FALSE     
#> 11    31  0.857  2.12           5      0.357 FALSE     
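
The std_counts values printed above are consistent with each bin count divided by the largest bin count (for example 3 / 14 ≈ 0.214). The sketch below reproduces that column from the printed bin_counts and then applies a threshold in the spirit of benchmark_to_rm_lwd_hex. The benchmark value 0.25 and the use of >= are assumptions for illustration only; the package may define the cut-off differently.

```r
# bin_counts copied from the df_bin_centroids output above
bin_counts <- c(14L, 3L, 3L, 6L, 9L, 2L, 11L, 4L, 9L, 9L, 5L)

# Standardise by the maximum count so the densest bin equals 1
std_counts <- bin_counts / max(bin_counts)

# Hypothetical benchmark: keep hexagons whose standardised
# count is at least 0.25 (assumed comparison, not confirmed)
benchmark <- 0.25
keep <- std_counts >= benchmark

# Number of hexagons that would remain under this assumption
n_kept <- sum(keep)
```

Under this assumed rule, the three hexagons with std_counts of 0.214, 0.214, and 0.143 would be treated as low-density.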