Statistics

Statistical research must strive towards determinism, or towards reducing the error of ignorance. One must regard model building as an attempt to recognize the most accurate deterministic pattern, and one must not settle for a model that contains more error than is present in the underlying data-generating process. This approach calls for shifting the statistical focus from hypothesis testing with a possibly misspecified model to proper specification of the model, so as to reduce the errors due to ignorance. This requires a considerable amount of research on newer statistical distributions and their applications, and on newer distribution-free methods. Theoretical and applied statistics must go hand in hand with data collection and analysis. This will be another major thrust of the statistics group at AIMSCS.

One of the areas in which India has a comparative advantage in terms of research potential and cost is the development, laboratory testing and clinical trials of new drugs. Given the important role that India is expected to play in this field, AIMSCS will promote basic research in biometrics and its applications in the country. In the last century, statistical theory was developed on the assumption that the probability model generating the observed data is specified. Often, the normal distribution was assumed. Research in the past 25 years has revealed that the theory based on the normal distribution is not robust, in the sense that slight deviations from normality make the methods based on the normal distribution highly inefficient. Further, the methods suggested for model choice based on observed data have not been found to be useful. Recent advances in technology and new discoveries have created new problems, which require new statistical tools to be forged for solving them. AIMSCS will concentrate on research on some of these new problems.
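
The loss of efficiency under slight departures from normality can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not a method described by AIMSCS: it compares the sampling variance of the sample mean and the sample median, first on standard normal data and then on data in which a small fraction of observations comes from a much wider normal distribution. The function name, the 5% contamination rate, the outlier standard deviation of 10, the sample size and the number of replications are all choices made for this example.

```python
import random
import statistics


def simulate_estimator_variances(contamination, n=50, reps=2000,
                                 outlier_sd=10.0, seed=0):
    """Return (variance of sample mean, variance of sample median).

    Each observation is drawn from N(0, 1) with probability
    1 - contamination and from N(0, outlier_sd**2) otherwise.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [
            rng.gauss(0.0, outlier_sd if rng.random() < contamination else 1.0)
            for _ in range(n)
        ]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    # Variance of each estimator across the simulated samples
    return statistics.pvariance(means), statistics.pvariance(medians)


# Exactly normal data: the mean is the more efficient estimator.
clean_mean_var, clean_median_var = simulate_estimator_variances(0.0)

# 5% of observations from a wider normal: the mean degrades sharply.
dirty_mean_var, dirty_median_var = simulate_estimator_variances(0.05)

print("normal data:       mean", clean_mean_var, "median", clean_median_var)
print("contaminated data: mean", dirty_mean_var, "median", dirty_median_var)
```

In this setup the mean, which is optimal under exact normality, should show a much larger sampling variance than the median once the contamination is introduced, which is the kind of non-robustness referred to above.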

Some specific areas of research are as follows:

  • Development of model-free methodology for discrimination and prediction
  • Nonparametric Bayesian methods. Although Bayes' theorem, which uses a priori information, was published in 1763, its potential for solving problems was recognized only recently, with the development of Markov chain Monte Carlo techniques. It is predicted that Bayesian techniques will dominate the use of statistical methods in the 21st century. Considerable research exists on the use of Bayes' theorem under a given specification of the probability model for the observations. The Institute will concentrate its research on using Bayes' theorem under model-free assumptions.
  • New areas of research such as bioinformatics have raised new problems for statistical research, especially in multivariate analysis where a large number of measurements is available on only a few samples. Further, there is difficulty in interpreting a large number of tests of hypotheses made on the same data, as in the study of microarray data in genetics. Some of these problems will be investigated.
  • New statistical problems have arisen in the analysis of what is called transactional data, relating to grocery store sales, credit card bills, insurance claims, medical bills, etc. Statistical methods, under the name of data mining, are being developed to extract information of interest to business and industry from such data. The Institute will undertake research in this new area.
  • Bagging and boosting form a new area of research in model-free techniques in which, starting with any given statistical procedure, the performance of that procedure is improved through a series of computational steps. Further research needed in this area will be undertaken.
  • Probability theory and applications: uncertainty, randomness and the creation of new knowledge. Principles and strategies of data analysis, with emphasis on “cross-examination of data, weighted and clouded distributions, etc.”
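
As an illustration of the bagging idea mentioned in the list above, the following minimal sketch starts from a simple base procedure (a one-split regression "stump" on a single predictor) and averages the predictions of many copies fitted to bootstrap resamples of the data. It is a sketch under stated assumptions, not the Institute's methodology: the function names, the choice of base procedure, the toy data and the number of resamples are all hypothetical, and boosting, which reweights the data sequentially, is not shown.

```python
import random
import statistics


def fit_stump(xs, ys):
    """Fit a one-split regression stump.

    Returns (threshold, left_mean, right_mean), choosing the split
    that minimises the residual sum of squares.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between equal x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        lm, rm = statistics.fmean(left), statistics.fmean(right)
        sse = (sum((y - lm) ** 2 for y in left)
               + sum((y - rm) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2, lm, rm)
    if best is None:  # all x values equal: predict the overall mean
        m = statistics.fmean(ys)
        return (float("inf"), m, m)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def bagged_stumps(xs, ys, n_models=50, seed=0):
    """Fit one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagged(models, x):
    """Average the predictions of all bootstrap-fitted stumps."""
    return statistics.fmean([predict_stump(m, x) for m in models])


# Toy data (an assumption for this example): y = x^2 on a regular grid.
xs = [i / 10 for i in range(30)]
ys = [x * x for x in xs]
models = bagged_stumps(xs, ys)
print(predict_bagged(models, 0.0), predict_bagged(models, 2.9))
```

A single stump can only produce two distinct predicted values, whereas the bagged average varies more smoothly with x because the resampled stumps place their splits at different points.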