Overcoming bias and incompleteness in astronomy: statistical methods for the big data era
Stat Methods in Big Data
An exhibition of innovative statistical techniques used to tackle incompleteness, bias, and data-combination problems in astronomy in light of upcoming large-scale surveys and simulations.
Astronomy, more so than any other science, is concerned with inference based on incomplete, missing, and biased datasets.
Unlike other sciences, which can rerun experiments, astronomy must contend with limited samples and imperfect instrumentation. This raises several questions:
1) How do we infer model parameters in the presence of incompleteness and bias?
2) How accurately can we extend our models into unobserved regions?
3) How can we properly combine data from surveys with varied selection functions?
4) With what conditions should we initiate expensive simulations?
How we tackle these issues becomes ever more important as the volume of available data grows, especially with the advent of instruments and surveys such as JWST, SKA, LSST, and Euclid. There is also a large archive of legacy data (e.g. SDSS, H-ATLAS, HST) that can be mined for new discoveries using novel methods.
High-performance computing has enabled a diverse range of Bayesian, deep-learning, and other inference techniques that would not otherwise have been possible. Yet how do we avoid drowning in data?
This session will explore novel approaches to statistical inference in astronomy, with an emphasis on the consistent combination of large, diverse datasets, the detection and mitigation of bias in Bayesian models and neural networks, and high-performance statistical methods for big data.
We will host a diverse range of talks representing the current state of the art in statistical data-mining methods for astronomy, with the goal of raising awareness that such tools exist and are ready to use.
Schedule:
09:00 Devina Mohan “Bayesian Deep Learning for Radio Galaxy Classification”
09:15 Tom J Wilson “Generalising the Astrometric Uncertainty Function in the Era of the Rubin Observatory's LSST”
09:30 Christopher Lovell “An orientation-bias in the selection of submillimetre galaxies”
09:45 Anne Buckner “INDICATE: The novel spatial analysis tool for which incompleteness is not a problem”
10:00 Ralph Schoenrich “Statistics in Galactic Dynamics”
10:15 Shaun Read “Non-Gaussian extreme deconvolution with neural-network enhanced Gaussian mixture models”
Organisers: Shaun Read, Garreth Martin
Time: Thursday morning
All attendees are expected to show respect and courtesy to other attendees and staff, and to adhere to the NAM Code of Conduct.