Pierre Auger Observatory Open Data

October 2021 release

The Pierre Auger 2021 Open Data is the public release of 10% of the Pierre Auger Observatory cosmic-ray data presented at the 36th International Cosmic Ray Conference held in 2019 in Madison, USA, following the Auger Collaboration Open Data Policy. The release also includes 100% of weather and space-weather data collected until 31 December 2020.
This website hosts the datasets for download. A brief overview of the Pierre Auger Observatory and of the Auger Open Data is set out below. An online event display for exploring the released cosmic-ray events is provided, together with example analysis codes. An outreach section dedicated to the general public is also available.

About the Pierre Auger Observatory

The Pierre Auger Observatory, located on a vast, high-altitude plain in the Province of Mendoza, Argentina, is the world's largest cosmic-ray observatory and is used to study the extensive air-showers produced by cosmic rays above ~10¹⁷ eV. The intensity of high-energy cosmic rays (those above about 10¹⁴ eV) is only a few particles per square meter per day and thus too low to allow for direct measurement with satisfactory statistical precision. Above 10¹⁹ eV the rate is only about 1 per km² per year. The phenomenon of extensive air-showers must therefore be exploited to study cosmic rays at very high energies. An air-shower is a cascade of particles created by the interaction of a single cosmic ray with the Earth's atmosphere. Air-showers can be observed by telescopes that pick up the fluorescence radiation emitted by nitrogen molecules excited as the shower crosses the atmosphere, while the particles reaching the ground can be sampled by large arrays of detectors. The properties of these extensive air-showers are measured to determine the energy and arrival direction of each cosmic ray and to provide a statistical determination of the distribution of primary masses.


Schematic map of the Pierre Auger Observatory

The Observatory features an array of 1600 water-Cherenkov particle detector stations (SD, black dots in the map) spread over 3000 km² on a 1500 m triangular grid, overlooked by 24 air-fluorescence telescopes (FD, blue dots in the map). These are located at Los Leones, Los Morados, Loma Amarilla and Coihueco. The Observatory is at a mean altitude of about 1400 m, corresponding to an atmospheric overburden of about 875 g cm⁻². The site is located between latitudes 35.0°S and 35.3°S and between longitudes 69.0°W and 69.4°W. Data-taking started on 1 January 2004 with 154 water-Cherenkov detectors and one fluorescence detector in operation. Installation was completed in June 2008 and running has been on-going since that date. The hole visible in the array map, south-east of Loma Amarilla, is due to difficulties with a local landowner.
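The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is a back-of-the-envelope illustration only: it ignores detector efficiency, zenith-angle acceptance and array downtime, and the variable names are chosen here for illustration. It estimates the naive number of cosmic rays above 10¹⁹ eV landing on the array per year, the area served by one station on a 1500 m triangular grid, and the ground pressure implied by the overburden (pressure = column mass × g).

```python
import math

# Values quoted in the text
ARRAY_AREA_KM2 = 3000.0          # area covered by the surface detector
FLUX_PER_KM2_YR = 1.0            # approximate rate above 1e19 eV
GRID_SPACING_KM = 1.5            # spacing of the triangular grid
OVERBURDEN_G_CM2 = 875.0         # atmospheric overburden at the site

# Standard gravity (assumption: the local value differs slightly)
G_M_S2 = 9.80665

# Naive number of particles above 1e19 eV hitting the array each year
expected_per_year = ARRAY_AREA_KM2 * FLUX_PER_KM2_YR

# On a triangular lattice each station serves an area of (sqrt(3)/2) * d^2
cell_area_km2 = math.sqrt(3) / 2 * GRID_SPACING_KM ** 2
stations_needed = ARRAY_AREA_KM2 / cell_area_km2

# 1 g/cm^2 = 10 kg/m^2; pressure in Pa is column mass times g; 100 Pa = 1 hPa
pressure_hpa = OVERBURDEN_G_CM2 * 10.0 * G_M_S2 / 100.0

print(f"naive events per year above 1e19 eV: {expected_per_year:.0f}")
print(f"area per station: {cell_area_km2:.2f} km^2 (about {stations_needed:.0f} stations)")
print(f"ground pressure implied by overburden: {pressure_hpa:.0f} hPa")
```

The station count obtained this way is of the same order as the 1600 stations quoted above; exact agreement is not expected, since the outline of the array is irregular.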

Each water-Cherenkov station is filled to a depth of 1.2 m with highly-purified water enclosed within a diffusively-reflective liner. The water is viewed from above by three 9-inch photomultiplier tubes (PMTs) in contact with it. These detect Cherenkov light emitted by charged particles that enter the detectors. Each PMT provides two signals which are tagged with the GPS time stamps to an absolute time accuracy of 12 ns and are digitized using 40 MHz, 10-bit Flash Analog-to-Digital Converters (FADCs). A low-gain signal is taken directly from the anode of the PMT, while a high-gain signal is provided by the last dynode and amplified to be nominally 32 times larger than the low-gain signal, enhancing the total dynamic range to span more than three orders of magnitude in integrated signal.
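The digitization figures above translate directly into a sampling period and an idealized amplitude range. The following is a minimal sketch of that arithmetic, not a description of the actual electronics: it ignores the baseline (pedestal), noise and the overlap needed to match the two gain channels, so the usable range is smaller than the ceiling computed here.

```python
import math

# Values quoted in the text
FADC_RATE_HZ = 40e6      # sampling frequency
ADC_BITS = 10            # resolution of each FADC
GAIN_RATIO = 32          # nominal high-gain / low-gain amplification

# Duration of one FADC sample (time bin)
sample_period_ns = 1e9 / FADC_RATE_HZ

# Number of distinct codes of a single 10-bit channel
adc_levels = 2 ** ADC_BITS

# Idealized ceiling: smallest high-gain step up to largest low-gain value
combined_levels = adc_levels * GAIN_RATIO
orders_of_magnitude = math.log10(combined_levels)

print(f"sample period: {sample_period_ns:.0f} ns")
print(f"levels per channel: {adc_levels}")
print(f"idealized combined range: {combined_levels} (~{orders_of_magnitude:.1f} decades)")
```

This ceiling refers to single-sample amplitude and is only an upper bound; the "more than three orders of magnitude" stated above refers to the integrated signal actually achieved.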

Information about the time and the amplitudes of station signals above a trigger level is sent, via a purpose-built communications network, at a rate of about 20 Hz to a computer at the Central Campus. If spatial and temporal coincidences are identified, data from the triggered stations are recorded and an event is reconstructed from the temporal and signal information.
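The idea of a spatial and temporal coincidence can be illustrated with a toy search. The sketch below is not the Observatory's trigger algorithm, which is considerably more elaborate. The function name, the dictionary keys, the time window and the distance cut are all assumptions made for illustration. Triggers are sorted in time, and any trigger that has at least two neighbours close in both time and position is treated as the seed of a candidate event.

```python
import math

def find_coincidences(triggers, max_dt_ns, max_dist_m, min_stations=3):
    """Toy coincidence search (hypothetical, for illustration only).

    triggers: list of dicts with keys "id", "time_ns" and "pos_m" (x, y).
    Returns a list of candidate events, each a list of station ids.
    """
    ordered = sorted(triggers, key=lambda t: t["time_ns"])
    used = set()      # indices already assigned to a candidate event
    events = []
    for i, seed in enumerate(ordered):
        if i in used:
            continue
        group = [i]
        for j in range(i + 1, len(ordered)):
            other = ordered[j]
            # Triggers are time-ordered, so stop once outside the window
            if other["time_ns"] - seed["time_ns"] > max_dt_ns:
                break
            close = math.dist(seed["pos_m"], other["pos_m"]) <= max_dist_m
            if j not in used and close:
                group.append(j)
        if len(group) >= min_stations:
            used.update(group)
            events.append([ordered[k]["id"] for k in group])
    return events

# Synthetic triggers: A, B, C are neighbours on a 1500 m grid and close in time;
# D fires much later and E is far away.
triggers = [
    {"id": "A", "time_ns": 0, "pos_m": (0.0, 0.0)},
    {"id": "B", "time_ns": 2000, "pos_m": (1500.0, 0.0)},
    {"id": "C", "time_ns": 3500, "pos_m": (750.0, 1299.0)},
    {"id": "D", "time_ns": 500000, "pos_m": (0.0, 1500.0)},
    {"id": "E", "time_ns": 1000, "pos_m": (30000.0, 0.0)},
]
candidates = find_coincidences(triggers, max_dt_ns=10000, max_dist_m=3000.0)
print(candidates)
```

The 10 µs window used here is merely of the order of the light travel time between neighbouring stations (about 5 µs for 1500 m); it is not a documented parameter.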

The data from the fluorescence emission are collected by a set of six telescopes at each of the FD sites, covering 30° in elevation from the ground up and 6 × 30° in azimuth over the array. Each telescope has a camera with 440 photomultipliers (pixels), recording the ultraviolet light received in each 100 ns time interval. At each site, an event is recorded whenever there are several pixels with signals above the night-sky background light, compatible with the image of a line. The GPS time is used to connect the fluorescence event to those seen simultaneously in other FDs and with SD stations that have signals.

The Auger Observatory is operated by a Collaboration of more than 400 scientists, engineers, technicians and students from more than 90 institutions in 18 countries. You can find further information about the Observatory and the Collaboration in Nucl.Instrum.Meth.A 798 (2015) 172-213 (arXiv) and on the official Auger website https://www.auger.org/.

Atmospheric monitoring at the Pierre Auger Observatory

The Observatory makes use of the atmosphere as a giant calorimeter. This has required the implementation of an extensive program to monitor the atmosphere above the site, as detailed knowledge of the atmosphere is required for the accurate reconstruction of air showers observed by both fluorescence and surface detectors.

The atmospheric state-variables, such as temperature, pressure and humidity, are needed to reconstruct the amount of fluorescence light emitted by the air showers and hence to discover their longitudinal development. The measurements with the surface detector are also affected by the changes of atmospheric conditions. Varying air densities close to the ground modify the lateral spread of the electromagnetic component of the extensive air-showers. Varying air pressure affects the trigger probability and the rate of events detected above a fixed energy. The atmospheric conditions at the ground are monitored every five to ten minutes by a series of weather stations placed at the sites of each of the four fluorescence detectors, and at the center of the surface detector array.
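The dependence of a counting rate on air pressure is commonly handled with a linear (barometric) correction. The sketch below illustrates that generic technique on synthetic numbers; it is not the Collaboration's procedure. The function names, the coefficient and the pressure values are all invented for illustration. A least-squares line is fitted to rate versus pressure, and each rate is then rescaled to the mean pressure.

```python
def fit_barometric_coefficient(pressures, rates):
    """Least-squares fit of rate vs pressure.

    Returns (beta, p_ref, r_ref), where beta is the fractional rate change
    per unit pressure and p_ref, r_ref are the mean pressure and mean rate.
    """
    n = len(pressures)
    p_ref = sum(pressures) / n
    r_ref = sum(rates) / n
    cov = sum((p - p_ref) * (r - r_ref) for p, r in zip(pressures, rates))
    var = sum((p - p_ref) ** 2 for p in pressures)
    slope = cov / var                 # rate change per unit pressure
    return slope / r_ref, p_ref, r_ref

def correct_rates(pressures, rates, beta, p_ref):
    """Rescale each rate to the value it would have at pressure p_ref."""
    return [r / (1.0 + beta * (p - p_ref)) for p, r in zip(pressures, rates)]

# Synthetic example: rate falls by 0.3% per hPa around 860 hPa
pressures_hpa = [855.0, 858.0, 860.0, 862.0, 865.0]
rates = [100.0 * (1.0 - 0.003 * (p - 860.0)) for p in pressures_hpa]

beta, p_ref, r_ref = fit_barometric_coefficient(pressures_hpa, rates)
corrected = correct_rates(pressures_hpa, rates, beta, p_ref)
print(f"beta = {beta:.4f} per hPa, reference pressure = {p_ref:.1f} hPa")
print([round(r, 3) for r in corrected])
```

After the correction the synthetic rates are flat, which is the behaviour one looks for before interpreting any remaining variation as non-atmospheric.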

Aerosols and clouds are also monitored, as they affect the atmospheric transmission of the optical signal from the air shower to the fluorescence detectors. The optical transmission must be taken into account in order to reconstruct the light generated along the shower axis accurately, starting from the light recorded at the telescopes. At central positions within the surface detector array, two laser facilities are installed (CLF and XLF on the map). These instruments are used to fire beams of UV light into the atmosphere every 15 minutes, to measure the aerosol attenuation of the fluorescence light in the line-of-sight of each telescope. Four infra-red cloud cameras mounted on "pan-and-tilt" platforms are used to scan the night sky for clouds and provide the cloud coverage. In addition to the cloud cameras, satellite data (GOES) are also used to infer the cloud coverage above the array. The cloud obscuration deduced from these measurements is used in the analysis of the FD data. Finally, elastic lidars operate at the fluorescence detector sites to measure the altitude of the cloud layers and the uniformity of the aerosol distribution horizontally across the Observatory, and one Raman-lidar receiver is located at the CLF to provide the vertical aerosol attenuation and other measurements of the atmospheric properties.

Space-weather monitoring at the Pierre Auger Observatory

Although the Pierre Auger Observatory was conceived to study cosmic rays at the highest energies, it can also be used to monitor the so-called "space weather": the phenomena that take place in the space surrounding the Earth, influenced by the variability of the Sun over periods ranging from hours to years.

The solar activity, in particular, changes the intensity of cosmic rays of low energy arriving at the Earth. This intensity can be measured at the Observatory by exploiting the so-called 'single-particle technique'. This consists of counting all of the particles hitting the individual water-Cherenkov detectors, independently of whether they belong to a large shower or are the lonely survivors of small showers. Most of the events detected by the single detectors are, in fact, due to solitary particles, the residuals of showers generated by cosmic rays with energy between 10¹⁰ eV and 10¹² eV.

The flux of cosmic rays at these energies is modulated by solar activity because, after propagation in our Galaxy, they reach the heliosphere and interact with the magnetic field of the Sun embedded in the solar wind. The conditions of the latter are variable, due to the solar cycle modulation and to transient eruptions of solar ejecta. The solar magnetic field intensity and polarity change with time, following the cyclic solar activity: when this is high, the flux of cosmic rays in the heliosphere is low; in turn, when the Sun is in a quiet state, the flux of cosmic rays at Earth is at its maximum. The solar activity can thus be monitored by analyzing the rate of particles detected by the individual water-Cherenkov detectors, which exhibits time variations such as those due to the short-term Forbush Decreases and to the longer-term modulations over 11- and 22-year solar cycles.
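A short-term dip of the kind described above can be searched for by comparing each rate sample with a quiet-time baseline. The sketch below is a deliberately simple illustration on synthetic numbers: the function name, the baseline choice (median of the first samples) and the 2% threshold are assumptions made here. A real analysis would first correct the rates for atmospheric pressure and use a more robust baseline.

```python
from statistics import median

def flag_decreases(rates, baseline_samples, threshold):
    """Return indices of samples lying below the baseline by more than
    the given fractional threshold (hypothetical helper for illustration)."""
    baseline = median(rates[:baseline_samples])
    return [
        i for i, r in enumerate(rates)
        if (r - baseline) / baseline < -threshold
    ]

# Synthetic counting rates (arbitrary units): a quiet period, then a dip
rates = [100.0, 101.0, 99.0, 100.0, 96.0, 95.0, 97.0, 100.0]
dip_indices = flag_decreases(rates, baseline_samples=4, threshold=0.02)
print(dip_indices)
```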

About the Auger Open Data

Data and analysis tools

The following are provided through this portal:

Downloadable datasets

  • Cosmic-ray data:
    • Pseudo-raw data: for each event, a list of SD stations, with their relevant PMT traces, is available. If an event is detected simultaneously with the SD and FD it is called a hybrid event and a list of FD telescopes with a camera view is also provided. The main parameters from the SD and FD reconstruction are also given. The 'ready-to-use' Event Display is a good way to become familiar with the Open Data.
    • Reconstructed data: for each event, only 'high-level' information is provided. Different parameters are extracted from the pseudo-raw dataset to be used in physics analysis. Examples on how to use them can be found in the Analysis page.

  • Atmospheric data:
    • Pseudo-raw data: the values of different atmospheric state-variables, recorded using each of the five weather stations, are available.
    • Processed data: the values of the different atmospheric state-variables, obtained by merging the information from the different weather stations, are provided.

  • Scaler data:
    the counting rate of the surface detectors over 15 minutes, averaged over the active detectors, is provided.

  • Auxiliary data:
    these are additional data necessary for a full physics analysis but that are not extracted directly from the raw data. They include the position of the SD stations, the position of the FD pixels, the SD exposure, the FD acceptance.

Pseudo-raw and reconstructed cosmic-ray data are provided in JSON format. Reconstructed cosmic-ray data are also available in CSV format, representing a summary of the JSON files and containing the information that is needed for analysis. Similarly, atmospheric, scaler and auxiliary data are in CSV format.
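Both formats can be read with the Python standard library alone. The sketch below shows the general pattern on tiny in-memory stand-ins. Every field and column name in it (id, sd, energy_EeV, zenith_deg and so on) is hypothetical and chosen for illustration, so the documentation shipped with each dataset should be consulted for the real names and units.

```python
import csv
import io
import json

# Hypothetical miniature stand-ins for a per-event JSON file and a CSV summary
json_text = '{"id": "example-1", "sd": {"energy_EeV": 12.3, "zenith_deg": 41.0}}'
csv_text = (
    "id,sd_energy_EeV,sd_zenith_deg\n"
    "example-1,12.3,41.0\n"
    "example-2,4.7,18.5\n"
)

# JSON: one nested object per event
event = json.loads(json_text)
print(event["id"], event["sd"]["energy_EeV"])

# CSV: one row per event; values arrive as strings and must be converted
rows = list(csv.DictReader(io.StringIO(csv_text)))
energies = [float(row["sd_energy_EeV"]) for row in rows]

# Example selection: events above a (hypothetical) energy threshold of 10 EeV
high_energy_ids = [row["id"] for row in rows if float(row["sd_energy_EeV"]) > 10.0]
print(energies, high_energy_ids)
```

With real files, `io.StringIO(csv_text)` would be replaced by an opened file and `json.loads` by `json.load` on a file object.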

Tools

  • 'Ready-to-use' event display
  • Simple software, reading the JSON and CSV files and producing examples of basic histograms of different data parameters.
  • Analysis software demonstrating how to read the data and how to analyse them. The intention is to offer insights as to how our results have been obtained.

Other Auger Open Data

  • All Auger publications are available as Open Access. Some of them also include Open Data in the form of additional tables, plots, graphs.

Disclaimer

  • The Open Data are released under the CC BY-SA 4.0 International License.
  • All datasets have a unique DOI that you are requested to cite in any applications or publications.
  • The current release should be cited as: Pierre Auger Collaboration (2021), Auger Open Data release 2021, DOI:10.5281/zenodo.4487612
  • The Auger Collaboration does not endorse any work, scientific or otherwise, produced using these data, even if available on, or linked from, this portal.
  • The spreadsheet-based datasets allow the user to undertake basic analyses. More complex analyses however require some knowledge of the underlying physics and of the instruments.
  • The analysis methods, including the reconstruction of the data, have evolved over time, and will continue to evolve. The reconstructed Open Data are processed with the most up-to-date software. Updates are thus foreseen, for either the reconstructed data or the software needed to analyse them. These will be detailed in later releases.
  • If you are interested in joining or working with the Auger Collaboration, please contact auger-join-request@auger.unam.mx.

Technical information

The Pierre Auger private raw data are kept in ROOT format at the Observatory site and moved to the IN2P3 computing center in Lyon for storage. They are then analyzed using different software frameworks. The data are then converted into JSON format for public distribution and the summary files are extracted. The different notebooks were validated by the corresponding analysis tasks within the Pierre Auger Collaboration.

This website runs on PHP and uses Bootstrap 4.5 for its responsive styling, aos and animate for some visual effects, and d3.js for the data visualization. The comment system of this page is based on codeshack. All the Data Release code is kept in GitLab and the analysis notebooks are Jupyter Python notebooks that are uploaded to Kaggle for online running. The Auger nightsky pictures are from Steven Saffi (CC BY-SA 2.0) and available at full resolution here and here.

Policy

The policy of the Auger Collaboration on Data Release and Open Access can be found here.