
Maitra, Ranjan

Normalized to: Maitra, R.

5 article(s) in total. 3 co-authors, from 1 to 3 common article(s). Median position in authors list is 2.0.

[1] oai:arXiv.org:2003.05777 [pdf] - 2064571

Characterising hot stellar systems with confidence

Chattopadhyay, Souradeep; Maitra, Ranjan

Comments: 9 pages; 9 figures

Submitted: 2020-03-12, last modified: 2020-03-16

Hot stellar systems (HSS) are a collection of stars bound together by gravitational attraction. These systems hold clues to many mysteries of outer space, so understanding their origin, evolution and physical properties is important but remains a huge challenge. We used multivariate $t$-mixtures model-based clustering to analyze 13456 hot stellar systems from Misgeld & Hilker (2011) that included 12763 candidate globular clusters and found eight homogeneous groups using the Bayesian Information Criterion (BIC). A nonparametric bootstrap procedure was used to estimate the confidence of each of our clustering assignments. The eight obtained groups can be characterized in terms of the correlation, mass, effective radius and surface density. Using conventional correlation-mass-effective radius-surface density notation, the largest group, Group 1, can be described as having positive-low-low-moderate characteristics. The other groups, numbered in order of decreasing size, are similarly characterised, with Group 2 having positive-low-low-high characteristics, Group 3 displaying positive-low-low-moderate characteristics, Group 4 having positive-low-low-high characteristics, Group 5 displaying positive-low-moderate-moderate characteristics and Group 6 showing positive-moderate-low-high characteristics. The smallest group (Group 8) shows negative-low-moderate-moderate characteristics. Group 7 has no candidate clusters and so cannot be similarly labeled, but the mass-effective radius correlation for these non-candidates indicates that they are larger than typical globular clusters. Assertions drawn for each group are ambiguous for a few HSS having low confidence in classification. Our analysis identifies distinct kinds of HSS with varying confidence and provides novel insight into their physical and evolutionary properties.
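
The abstract above selects the number of groups with the BIC. The following is a minimal sketch of that selection step only, not the authors' procedure. It assumes the convention BIC = -2 log L + p log n, under which smaller is better (some software uses the opposite sign and maximizes). The log-likelihoods, parameter counts and sample size are hypothetical placeholders, not values from the paper.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """BIC = -2 log L + p log n (smaller is better under this sign convention)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical maximized log-likelihoods and free-parameter counts
# for fitted mixture models with K = 1, 2, 3 groups (illustrative only).
candidates = {1: (-520.0, 5), 2: (-450.0, 11), 3: (-447.0, 17)}
n_obs = 200  # hypothetical sample size

# Score every candidate K and keep the one with the smallest BIC.
scores = {k: bic(ll, p, n_obs) for k, (ll, p) in candidates.items()}
best_k = min(scores, key=scores.get)
print(best_k)
```

In this toy example, the small likelihood gain from K = 2 to K = 3 does not offset the penalty for the additional parameters.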

[2] oai:arXiv.org:1904.09609 [pdf] - 1885634

TiK-means: $K$-means clustering for skewed groups

Berry, Nicholas S.; Maitra, Ranjan

Comments: 15 pages, 6 figures, to appear in Statistical Analysis and Data Mining - The ASA Data Science Journal

Submitted: 2019-04-21

The $K$-means algorithm is extended to allow for partitioning of skewed groups. Our algorithm is called TiK-Means and contributes a $K$-means type algorithm that assigns observations to groups while estimating their skewness-transformation parameters. The resulting groups and transformation reveal general-structured clusters that can be explained by inverting the estimated transformation. Further, a modification of the jump statistic chooses the number of groups. Our algorithm is evaluated on simulated and real-life datasets and then applied to a long-standing astronomical dispute regarding the distinct kinds of gamma ray bursts.

[3] oai:arXiv.org:1802.08363 [pdf] - 1747013

An efficient $k$-means-type algorithm for clustering datasets with incomplete records

Lithio, Andrew; Maitra, Ranjan

Comments: 21 pages, 12 figures, 3 tables, in press, Statistical Analysis and Data Mining -- The ASA Data Science Journal, 2018

Submitted: 2018-02-22, last modified: 2018-09-08

The $k$-means algorithm is arguably the most popular nonparametric clustering method but cannot generally be applied to datasets with incomplete records. The usual practice then is to either impute missing values under an assumed missing-completely-at-random mechanism or to ignore the incomplete records, and apply the algorithm on the resulting dataset. We develop an efficient version of the $k$-means algorithm that allows for clustering in the presence of incomplete records. Our extension is called $k_m$-means and reduces to the $k$-means algorithm when all records are complete. We also provide initialization strategies for our algorithm and methods to estimate the number of groups in the dataset. Illustrations and simulations demonstrate the efficacy of our approach in a variety of settings and patterns of missing data. Our methods are also applied to the analysis of activation images obtained from a functional Magnetic Resonance Imaging experiment.
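
To illustrate the general idea of clustering without imputing or discarding incomplete records, here is a minimal sketch of a Lloyd-style iteration that uses only the observed coordinates of each record. It is an assumption-laden illustration and not the $k_m$-means algorithm of the paper: the function names, the use of `None` for missing values and the toy data are all hypothetical. When no values are missing, it reduces to ordinary $k$-means iterations.

```python
def partial_sq_dist(record, center):
    """Squared distance over the coordinates observed in `record` (None = missing)."""
    return sum((x - c) ** 2 for x, c in zip(record, center) if x is not None)


def kmeans_incomplete(data, centers, n_iter=20):
    """Lloyd-style iterations that skip missing coordinates (illustration only)."""
    centers = [list(c) for c in centers]
    labels = [0] * len(data)
    for _ in range(n_iter):
        # Assignment step: nearest center, using only observed coordinates.
        labels = [
            min(range(len(centers)), key=lambda k: partial_sq_dist(rec, centers[k]))
            for rec in data
        ]
        # Update step: each center coordinate becomes the mean of the values
        # observed for that coordinate among records assigned to the group.
        for k in range(len(centers)):
            for j in range(len(centers[k])):
                vals = [rec[j] for rec, lab in zip(data, labels)
                        if lab == k and rec[j] is not None]
                if vals:  # keep the previous value if nothing was observed
                    centers[k][j] = sum(vals) / len(vals)
    return labels, centers


# Hypothetical toy data: two well-separated groups with some missing entries.
data = [(0.0, 0.1), (0.2, None), (None, 0.0),
        (5.0, 5.1), (5.2, None), (None, 4.9)]
labels, centers = kmeans_incomplete(data, centers=[(0.0, 0.0), (5.0, 5.0)])
print(labels, centers)
```

Because the distance sums over fewer terms for records with more missing values, this simple version does not put records with different missingness patterns on a common scale; it is meant only to show the mechanics.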

[4] oai:arXiv.org:1712.08123 [pdf] - 1757643

Multivariate $t$-Mixtures-Model-based Cluster Analysis of BATSE Catalog Establishes Importance of All Observed Parameters, Confirms Five Distinct Ellipsoidal Sub-populations of Gamma Ray Bursts

Chattopadhyay, Souradeep; Maitra, Ranjan

Comments: 16 pages, 11 figures, 7 tables, in press, Monthly Notices of the Royal Astronomical Society, 2018

Submitted: 2017-12-21, last modified: 2018-09-08

Determining the kinds of gamma-ray bursts (GRBs) has been of interest to astronomers for many years. We analyzed 1599 GRBs from the Burst and Transient Source Experiment (BATSE) 4Br catalogue using $t$-mixtures-model-based clustering on all nine observed parameters ($T_{50}$, $T_{90}$, $F_1$, $F_2$, $F_3$, $F_4$, $P_{64}$, $P_{256}$, $P_{1024}$) and found evidence of five types of GRBs. Our results further refine the findings of Chattopadhyay and Maitra (2017) by providing groups that are more distinct. Using the Mukherjee et al. (1998) classification scheme, also used by Chattopadhyay and Maitra (2017), of duration, total fluence ($F_t = F_1 + F_2 + F_3 + F_4$) and spectrum (using the hardness ratio $H_{321} = F_3/(F_1 + F_2)$), our five groups are classified as long-intermediate-intermediate, short-faint-intermediate, short-faint-soft, long-bright-hard, and long-intermediate-hard. We also classify 374 GRBs in the BATSE catalogue that have incomplete information in some of the observed variables (mainly the four time-integrated fluences $F_1$, $F_2$, $F_3$ and $F_4$) to the five groups obtained using the 1599 GRBs having complete information in all the observed variables. Our classification scheme puts 138 GRBs in the first group, 52 GRBs in the second group, 33 GRBs in the third group, 127 GRBs in the fourth group and 24 GRBs in the fifth group.
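
The two derived quantities defined in the abstract above, $F_t = F_1 + F_2 + F_3 + F_4$ and $H_{321} = F_3/(F_1 + F_2)$, can be computed as in the short sketch below. The function names and fluence values are hypothetical and chosen only for illustration; they are not BATSE measurements.

```python
def total_fluence(f1, f2, f3, f4):
    """Total fluence F_t = F_1 + F_2 + F_3 + F_4."""
    return f1 + f2 + f3 + f4


def hardness_ratio_321(f1, f2, f3):
    """Spectral hardness ratio H_321 = F_3 / (F_1 + F_2)."""
    denom = f1 + f2
    if denom == 0:
        raise ValueError("F_1 + F_2 must be nonzero")
    return f3 / denom


# Hypothetical time-integrated fluences in the four channels (illustrative only).
burst = {"F1": 1.0, "F2": 3.0, "F3": 8.0, "F4": 4.0}
f_t = total_fluence(burst["F1"], burst["F2"], burst["F3"], burst["F4"])
h_321 = hardness_ratio_321(burst["F1"], burst["F2"], burst["F3"])
print(f_t, h_321)
```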

[5] oai:arXiv.org:1703.07338 [pdf] - 1582053

Gaussian-Mixture-Model-based Cluster Analysis Finds Five Kinds of Gamma Ray Bursts in the BATSE Catalog

Chattopadhyay, Souradeep; Maitra, Ranjan

Comments: 17 pages, 12 figures, 6 tables

Submitted: 2017-03-21, last modified: 2017-05-02

Clustering methods are an important tool to enumerate and describe the different coherent kinds of Gamma Ray Bursts (GRBs). However, their performance can be affected by a number of factors, such as the choice of clustering algorithm and its inherent assumptions, the variables included in the clustering, the nature of the initialization methods used, the iterative algorithm, or the criterion used to judge the optimal number of groups supported by the data. We analyzed GRBs from the BATSE 4Br catalog using $k$-means and Gaussian Mixture Models-based clustering methods and found that, after accounting for all the above factors, all six variables, different subsets of which have been used in the literature, contain information on clustering. These variables are the flux duration variables ($T_{50}$, $T_{90}$), the peak flux ($P_{256}$) measured in 256-millisecond bins, the total fluence ($F_t$) and the spectral hardness ratios ($H_{32}$ and $H_{321}$). Further, our analysis found evidence of five different kinds of GRBs and that these groups have different kinds of dispersions in terms of shape, size and orientation. In terms of duration, fluence and spectrum, the five types of GRBs were characterized as intermediate/faint/intermediate, long/intermediate/soft, intermediate/intermediate/intermediate, short/faint/hard and long/bright/intermediate.