Other

Significantly reducing the processing times of high-speed photometry data sets using a distributed computing model

Paul Doyle, Technological University Dublin
Fredrick Mtenzi, Technological University Dublin
Niall Smith, Munster Technological University
Adrian Collins, Munster Technological University
Brendan O'Shea, Technological University Dublin

Author ORCID Identifier

https://orcid.org/0000-0003-3877-7432

Document Type

Conference Paper

Rights

Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence

Disciplines

Computer Sciences

Publication Details

Conference: Software and Cyberinfrastructure for Astronomy II, International Society for Optics and Photonics

Abstract

The scientific community is in the midst of a data analysis crisis. The increasing capacity and falling cost of scientific CCD instrumentation are contributing to an explosive generation of raw photometric data. This data must go through a process of cleaning and reduction before it can be used for high-precision photometric analysis. Many existing data processing pipelines either assume a relatively small dataset or are batch processed by a High Performance Computing centre. A radical overhaul of these processing pipelines is required so that terabyte-sized datasets can be cleaned and reduced at near capture rates using an elastic processing architecture. The ability to access computing resources and to allow them to grow and shrink as demand fluctuates is essential, as is exploiting the parallel nature of the datasets. A distributed data processing pipeline is required. It should incorporate lossless data compression, allow for data segmentation, and support processing of data segments in parallel. Academic institutes can collaborate and provide an elastic computing model without the requirement for large centralized High Performance Computing data centres. This paper demonstrates how an order of magnitude (base 10) improvement in overall processing time has been achieved using the "ACN pipeline", a distributed pipeline spanning multiple academic institutes.

DOI

https://doi.org/10.1117/12.924863

Recommended Citation

Paul Doyle, Fred Mtenzi, Niall Smith, Adrian Collins, Brendan O'Shea, "Significantly reducing the processing times of high-speed photometry data sets using a distributed computing model," Proc. SPIE 8451, Software and Cyberinfrastructure for Astronomy II, 84510C (24 September 2012); DOI: 10.1117/12.924863

Included in

Other Computer Sciences Commons, Software Engineering Commons