Data

The simulated datacube, before noise and instrumental effects are added. Covering a sky area of 20 square degrees and featuring nearly a quarter of a million galaxies, the cube represents an SKA observation of neutral hydrogen — or "HI" — emission.

SDC2 dataset

Details of the simulated SKA HI data product are the following:

  • 20 square degrees area;

  • 7 arcsec beam size, sampled with 2.8 x 2.8 arcsec pixels;

  • 950–1150 MHz bandwidth, sampled at 30 kHz resolution, corresponding to a redshift interval z = 0.235–0.495;

  • noise consistent with a 2000 hour total observation;

  • systematics include imperfect continuum subtraction, simulated RFI flagging and excess noise due to RFI flagging of some of the data.
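
The figures in the list above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the SDC2 tooling: it uses the HI rest frequency of 1420.40575 MHz and assumes a square field and 4-byte floating-point voxels (neither is stated here). All function names are illustrative.

```python
import math

HI_REST_MHZ = 1420.40575  # rest frequency of the HI 21 cm line


def redshift(freq_mhz):
    """Redshift at which HI emission is observed at freq_mhz."""
    return HI_REST_MHZ / freq_mhz - 1.0


def n_channels(f_lo_mhz, f_hi_mhz, chan_khz):
    """Number of channels needed to cover the band at the given width."""
    return math.ceil((f_hi_mhz - f_lo_mhz) * 1000.0 / chan_khz)


def pixels_per_side(area_sq_deg, pixel_arcsec):
    """Pixels along one side, ASSUMING a square field of the given area."""
    side_arcsec = math.sqrt(area_sq_deg) * 3600.0
    return math.ceil(side_arcsec / pixel_arcsec)


z_min = redshift(1150.0)   # upper band edge gives the lowest redshift
z_max = redshift(950.0)    # lower band edge gives the highest redshift
channels = n_channels(950.0, 1150.0, 30.0)
side = pixels_per_side(20.0, 2.8)
pixels_per_beam = 7.0 / 2.8

# ASSUMPTION: 4 bytes per voxel (single-precision floats), 1 GB = 1e9 bytes.
size_gb = side * side * channels * 4 / 1e9

print(f"redshift interval: {z_min:.3f} to {z_max:.3f}")
print(f"channels: {channels}, pixels per side: {side}")
print(f"pixels across the beam: {pixels_per_beam:.1f}")
print(f"approximate cube size: {size_gb:.0f} GB")
```

Under these assumptions the estimated cube size comes out at several hundred GB, the same order as the size quoted for the HI cube.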

The HI cube is around 1 TB in size and, during the challenge, was made available at designated processing facilities only. Each team deployed and ran its data analysis pipeline at a processing facility of its choice among those made available. The final challenge score was computed on the full SDC2 dataset.

The dataset also includes maps of continuum emission over the same frequency range and at the same spatial resolution as the HI products.

The SDC2 full dataset is complemented by the following ancillary data:

  • two smaller downloadable 'development' datasets, with which initial data inspection and pipeline development can be performed. The development datasets are accompanied by a truth catalogue, for training and validation purposes.

The full description of the data and the challenge is available here

Download the SDC2 development datasets

development dataset: 1 sq deg (40 GB) non-blind dataset

development dataset (smaller version): 0.25 sq deg (10 GB) non-blind dataset, for slower connections or limited storage. This dataset is the central portion of the development dataset with larger FoV.

Download the SDC2 full dataset

Now that the challenge has closed, the full challenge dataset is available for download, along with the accompanying continuum image and truth catalogue:

full dataset: 20 sq deg (851 GB), non-blind dataset
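
Before starting a download, it can help to check whether a dataset fits on the local disk and roughly how long the transfer would take. The sketch below uses the dataset sizes listed on this page; the 100 Mbit/s link speed, the dictionary keys and the function names are illustrative assumptions, and 1 GB is taken as 1e9 bytes.

```python
import shutil

# Sizes in GB as listed on this page; the keys are illustrative labels.
DATASET_SIZES_GB = {
    "development": 40,
    "development_small": 10,
    "full": 851,
}


def download_hours(size_gb, link_mbit_per_s):
    """Idealised transfer time in hours, ignoring protocol overhead."""
    bits = size_gb * 1e9 * 8
    return bits / (link_mbit_per_s * 1e6) / 3600.0


def fits_on_disk(size_gb, path="."):
    """True if the filesystem holding `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= size_gb * 1e9


for name, size in DATASET_SIZES_GB.items():
    hours = download_hours(size, 100.0)  # ASSUMPTION: 100 Mbit/s link
    print(f"{name}: {size} GB, about {hours:.1f} h, "
          f"fits here: {fits_on_disk(size)}")
```

Real transfer times will be longer than this idealised estimate, so the smaller development dataset is the safer starting point on a slow connection.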

The truth catalogues can be used to score your own performance on each SDC2 dataset using the SDC2 scoring module available on the scoring code page.


During the challenge, the SDC2 data archive used the facilities of the Italian Center for Astronomical Archives (IA2), operated by INAF.