Search and Retrieve NOAO Data
The Query page lets you search for data in the NOAO Science Archive. The Archive holds data from many different combinations of telescopes and instruments, including the NOAO facilities at KPNO and CTIO, and NOAO data from consortium facilities such as WIYN, SOAR and SMARTS. All raw data from these telescopes and instruments are archived, as well as pipeline-reduced data products from the DECam, Mosaic and NEWFIRM imagers on the NOAO 4m telescopes. Principal Investigators and authorized co-investigators of NOAO observing programs who have registered with the NOAO Archive can retrieve their proprietary data using the Query form. Any user (registered or not) can search for and retrieve non-proprietary data as well.

The process of finding and accessing NOAO data follows several basic steps:



The Search Form

  • At top left (under Search NOAO data), there are four tabs marked Query Form, Advanced Query Form, Results, and Staging Area. These let you switch back and forth between the search form (or the advanced query form), the search results, and the place where you will stage selected data for ftp retrieval.
  • The Reset button resets all search form fields to their default values (usually blank).
  • The Search button executes your query and takes you to the Results page.
  • The Search Type box has two options in a pull-down menu:
    • Search All Data
    • Search My Data
    If you are a registered PI or an authorized co-I, you can use Search My Data to search for and retrieve data from your own NOAO observing programs. This is discussed in more detail below.
You may fill in as few or as many fields in the form as you wish in order to restrict any search. Most fields should be self-explanatory and are not discussed here in detail. Moving your mouse over any of the search field labels will bring up a box describing the search parameter and giving examples of how to use it.

Caution: The archive metadata that can be searched are not always complete or perfect! The data come from many telescopes and instruments, each with its own data-taking system, usually written without archival legacy in mind. Useful information is not always stored in FITS header keywords in a standardized way. Some information may occasionally be incorrect or missing (e.g., object coordinates) due to problems with the telescope, instrument, or data-taking systems. Other parameters (e.g., object name) depend on observer input and thus may be missing, misleading or wrong.

How to find your own NOAO data

If you are a registered user who is the PI for NOAO observing programs, or a co-I who has been granted authorized access by the PI, select Search My Data from the Search Type menu. Then enter any other constraints that you wish (e.g., observing calendar date, telescope and instrument, etc.) and click Search. If you have not already logged in, you will be directed to the Login page, then back to this form.

You may only retrieve and download proprietary data if you are the program PI or an authorized co-I, and have signed in with your NOAO Archive username!

For information on administering co-I access to proprietary data, please see the tutorial on PI/Co-I data access.


How to search all data in the NOAO Archive

To search for any NOAO data (not just those from your own observing programs), select Search All Data from the Search Type menu. Any user may search all NOAO data, without any need for an Archive registration or username. However, proprietary data can only be staged and retrieved by their owner, who must be signed in as a registered user.

Search form parameters


Target information

This section lets you search for data by object name or coordinates.
  • Object name: Entering a common astronomical object name and clicking Resolve will call the Sesame name resolver, which in turn draws upon Simbad, NED and VizieR. If the object name is recognized, it will fill in the Coordinate fields with the object's RA and Dec. If you do not click Resolve, the search will match the name string to the object keyword in the archive database.

    Note: In most cases, the object name in data FITS headers was entered by observers and may not always be correct or reliable. Moreover, the string matching used to search this field (when the astronomical name resolver is not invoked) is a simple, case-insensitive substring match.

  • Coordinates: Right ascension and declination, in either decimal degrees or sexagesimal (hh:mm:ss dd:mm:ss) format.

    Note: Coordinates in raw data FITS headers can sometimes be incorrect or unavailable due to errors in telescope pointing, failed communication between telescope and instrument control systems, or other issues. The coordinates in pipeline-reduced DECam, Mosaic and NEWFIRM data have been calibrated whenever possible to a standard astrometric reference system (e.g., USNO or 2MASS), and should be reliable except when such calibration has failed.

  • Search box size: A coordinate search will match any archived data whose RA and Dec fall within a search box centered on the specified coordinates. The box width (in arcmin) can be specified here. The default is 30 arcminutes.

Note: The search region is defined by constant RA and Dec limits that are one half the search box size away from the specified RA and Dec. This simplistic definition therefore distorts from a square box as one moves toward the celestial poles.
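
The coordinate formats and the search box definition above can be sketched in a few lines of Python. This is only an illustration of the description given here, not the Archive's own code: the function names are made up for this example, and the sketch ignores the wrap-around of RA at 0/360 degrees.

```python
import math


def sexagesimal_to_degrees(ra_hms, dec_dms):
    """Convert 'hh:mm:ss' and '[+-]dd:mm:ss' strings to decimal degrees."""
    h, m, s = (float(x) for x in ra_hms.split(":"))
    ra_deg = 15.0 * (h + m / 60.0 + s / 3600.0)  # 1 hour of RA = 15 degrees

    # Take the sign from the string so that '-00:30:00' stays negative.
    sign = -1.0 if dec_dms.strip().startswith("-") else 1.0
    d, dm, ds = (abs(float(x)) for x in dec_dms.split(":"))
    dec_deg = sign * (d + dm / 60.0 + ds / 3600.0)
    return ra_deg, dec_deg


def search_box(ra_deg, dec_deg, box_arcmin=30.0):
    """Constant RA/Dec limits, half a box width either side of the center."""
    half = box_arcmin / 2.0 / 60.0  # half width in degrees
    return (ra_deg - half, ra_deg + half, dec_deg - half, dec_deg + half)


def on_sky_width_arcmin(dec_deg, box_arcmin=30.0):
    """True east-west extent of the box on the sky at a given declination."""
    return box_arcmin * math.cos(math.radians(dec_deg))


ra, dec = sexagesimal_to_degrees("12:30:00", "-00:30:00")
print(search_box(ra, dec))         # RA and Dec limits in degrees
print(on_sky_width_arcmin(60.0))   # the 30' box is only about 15' wide at Dec = +60
```

The last function shows why the box is not square near the poles: the limits are fixed in RA, but a given RA interval covers less sky as the declination increases.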


Observation

This section lets you search by information about the observing program, principal investigator, observing date, or filename.
  • Program number: The NOAO proposal identifier for data in the archive. Generally, this has a format like "2010A-9876", i.e., the observing semester and a 4-digit proposal number. The search is case-insensitive, and operates by substring matching, i.e., entering '2010A' will find data from all 2010A programs, while '9876' will find all programs from any year with that number. Some data are owned by the observatories, and have program numbers like 'NOAO', 'WIYN', 'SOAR' or 'SMARTS'.
  • Principal Investigator: Searches by Principal Investigator (PI) use case-insensitive substring matching to the name information in the archive database.

    Note: The PI name in the archive database is generally reliable, but there may be variations in the inclusion of middle names or initials, titles, etc. A search by PI last name may be the safest approach, but of course may sometimes match more than one person. E.g., searching by 'smith' could match several different observers, as well as 'Smithson', etc.

  • Observing calendar date: The calendar date (in YYYY-MM-DD format) at the start of the observing night. This may differ from the UT date of the observation. You may select operators "=", "<", and ">" to search on, before, or after the specified date, or "BETWEEN" to search a range of dates (inclusive). Clicking on the box will bring up a calendar selection tool.
  • Original filename: The filename originally assigned to a data set at the telescope by the observer or the data-taking system. The search is a case-insensitive substring match. Thus, a search for files named 'object' will match images with names such as 'object0123.fits' or 'Test_Object.imh'. File extensions (e.g., '.fits') need not be specified.
  • Archive filename: The filename used in the NOAO Archive. Files stored in the Archive are assigned new, unique filenames, which can be queried here. As examples, raw data from KPNO or CTIO are often named 'kp??????' or 'ct???????', DECam data are named 'dec???????', and pipeline-reduced data are named 'tu??????'. The search is a case-insensitive substring match. Thus, a search for files named 'kp0336' will match images with names such as 'kp033611.fits.fz', 'kp033612.fits.fz', etc. File extensions (e.g., '.fits' or '.fits.gz') need not be specified.

Note: All FITS files are renamed to a standard nomenclature when they are stored in the archive. The original filenames assigned by the user, however, are recorded in a FITS header keyword, and the Original filename search matches against those recorded names.
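
The case-insensitive substring matching used by the program number, PI, and filename fields behaves roughly like the following Python sketch. The function name is hypothetical and the Archive's actual implementation may differ in detail; the example strings are the ones used in the descriptions above.

```python
def substring_match(query, value):
    """True if `query` occurs anywhere in `value`, ignoring case."""
    return query.lower() in value.lower()


# Examples taken from the field descriptions above.
print(substring_match("object", "Test_Object.imh"))   # True
print(substring_match("kp0336", "kp033611.fits.fz"))  # True
print(substring_match("2010A", "2010A-9876"))         # True
print(substring_match("smith", "Smithson"))           # True, so PI searches can over-match
print(substring_match("kp0337", "kp033611.fits.fz"))  # False
```

Because the match is on any substring, a short query returns more results than you might expect. Use a longer string (or combine several fields) to narrow the search.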


Telescope & Instrument

This section lets you select telescopes and instruments to search, and limit searches by exposure time.
  • Telescope and Instrument: A pull-down menu of telescope + instrument combinations for data stored in the archive. You may select multiple choices from this list (generally by using some combination of shift or control keys and mouse clicks, depending on your computer and browser). Selecting nothing will search all telescopes and instruments.

    Note: In a few cases, notably at the KPNO Mayall 4m, the instrument may be unknown, as the same data-taking computer is used for multiple instruments, and header information does not permit these to be easily distinguished.

  • Exposure time: The exposure time for an observation, in seconds. You may select operators "=", "<", and ">", or "BETWEEN" to search a range of exposure times (inclusive).

    Note: In some cases the exposure information may be unreliable or unavailable, particularly for some instruments where it is recorded in the headers in non-standard keywords or formats.
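
The comparison operators offered for the exposure time and date fields can be summarized with a short Python sketch. The function below is a hypothetical illustration of the semantics described in this tutorial (in particular, that "BETWEEN" includes both end points); it is not the Portal's query code.

```python
from datetime import date


def matches(value, op, low, high=None):
    """Apply one of the search form operators to a single value."""
    if op == "=":
        return value == low
    if op == "<":
        return value < low
    if op == ">":
        return value > low
    if op == "BETWEEN":
        if high is None:
            raise ValueError("BETWEEN needs two bounds")
        return low <= value <= high  # inclusive at both ends
    raise ValueError(f"unknown operator: {op!r}")


# Exposure times in seconds.
print(matches(30.0, "BETWEEN", 30.0, 60.0))                # True (end point included)
print(matches(300.0, ">", 120.0))                          # True

# Observing calendar dates work the same way.
print(matches(date(2010, 3, 1), "<", date(2010, 3, 2)))    # True
```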


Data products

This section provides a set of checkboxes that let you select what sort of data products to search, most notably, raw or pipeline-reduced data, and to select various types of pipeline data products. You may select more than one type of product. If you check no boxes, the search will return all types of raw and reduced data products. The data product types are:
  • All instruments:

    • Raw: Data as taken at the telescope, with no processing.
  • Pipeline-reduced data for DECam, Mosaic and NEWFIRM:
    See the NOAO Data Handbook for details about each data product type.

    • Calibrated images: Individual exposures in multi-extension FITS format, with basic instrumental signatures removed, e.g., bias or dark subtraction, flat fielding, etc. Whenever possible, a world coordinate system and approximate photometric zeropoint are recorded in the headers. These are accompanied by data quality masks.
    • Reprojected images: Individual calibrated exposures, geometrically rectified and projected onto the sky plane, combining data from all camera detectors into a single-extension FITS image. These are accompanied by data quality masks.
    • Stacked images: Coadded images constructed from a sequence of overlapping sky exposures, with masking of static and transient defects such as cosmic rays. These are accompanied by exposure maps and data quality masks.
    • Master calibration files: Combined and processed calibration images such as bias frames, darks, dome or sky flat fields, pupil images, etc. These may be accompanied by data quality masks.
  • For NEWFIRM only:

    • Sky subtracted images: Individually calibrated exposures in multi-extension FITS format, with basic instrumental signatures removed (as for 'calibrated images' above) and with astrometric and photometric calibration where possible, and also with two-dimensional background subtraction. These images are accompanied by data quality masks.
  • Public release date: Data in the archive may have a proprietary period. Before that period expires, the data are only available for retrieval by the registered Principal Investigator or an authorized co-I. This field lets you search for data by public release date (in YYYY-MM-DD format), using operators '<', '>', or 'BETWEEN' as desired. Clicking on the box will bring up a calendar selection tool.


Search Results and Data Selection

The results of a query appear in a table that will often be longer and wider than can be displayed in your browser, and there are vertical and horizontal scroll bars that can be used to view the whole table. A query may return many archived data sets (up to a current maximum of 1000 files for unregistered users, or 10000 for registered, signed-in users), but displays only 20 rows of results per page. You may navigate through the results by clicking the page numbers that appear near the top left side of the Results table.
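
As a quick check on how many pages a large query will produce, the arithmetic can be written out in Python. The limits (20 rows per page, 1000 or 10000 files) come from the paragraph above; the function names are just for this example.

```python
import math

ROWS_PER_PAGE = 20


def page_count(n_rows, rows_per_page=ROWS_PER_PAGE):
    """Number of result pages needed to show n_rows."""
    return math.ceil(n_rows / rows_per_page)


def page_of_row(row_index, rows_per_page=ROWS_PER_PAGE):
    """1-based page number that holds the row with this 0-based index."""
    return row_index // rows_per_page + 1


print(page_count(1000))    # 50 pages at the unregistered-user limit
print(page_count(10000))   # 500 pages at the registered-user limit
print(page_of_row(20))     # the 21st row is the first row on page 2
```

With hundreds of pages possible, it is usually easier to narrow the search or use the filtering tools than to page through the full table.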

Sorting, Filtering and Categorizing

The header of the Results table shows the names of each column. Mousing over the column name gives information about the data in that column. Clicking on the column name will sort the results according to the values in that column.

In the box labeled Refine that appears above the search results, there is a pulldown menu labeled "Filter by". This lets you further restrict the search results to a subset defined by one of the data parameters. For example, you can filter by Telescope, PI, Observing date, etc. Type the value of the parameter into the box to the right of the menu; the Portal will assist you by giving you a list of options that match the string you are typing. Click Go to select only results matching that parameter value; you can revert to the full list of results by clicking Reset.

There is also a pulldown menu labeled "Categorize by". This lets you select certain parameters by which to sort your results into separate "categories". Select one of these parameters, and you will see a new set of tabs appear above the Results table, with all of the parameter values from your search.

In the example shown below, the user has categorized by Proposal ID, and a series of tabs appear at top organizing the search results by their NOAO program numbers. You can undo this and go back to the full results set by clicking the Uncategorize button.
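
If it helps to think of the Results table as a list of records, sorting, "Filter by" and "Categorize by" correspond roughly to the operations in this Python sketch. The column names and the rows are invented for illustration, and treating the filter as an exact match on the chosen value is an assumption.

```python
from collections import defaultdict

# Hypothetical rows; real Results tables have many more columns.
rows = [
    {"archive_file": "kp033611.fits.fz", "proposal_id": "2010A-9876", "telescope": "kp4m"},
    {"archive_file": "kp033612.fits.fz", "proposal_id": "2010A-9876", "telescope": "kp4m"},
    {"archive_file": "ct0123456.fits.fz", "proposal_id": "NOAO", "telescope": "ct4m"},
]


def sort_by(rows, column):
    """Clicking a column name: order the rows by that column."""
    return sorted(rows, key=lambda r: r[column])


def filter_by(rows, column, value):
    """'Filter by': keep only rows whose column equals the chosen value."""
    return [r for r in rows if r[column] == value]


def categorize_by(rows, column):
    """'Categorize by': one group (tab) per distinct value of the column."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[column]].append(r)
    return dict(groups)


tabs = categorize_by(rows, "proposal_id")
print(list(tabs))                               # ['2010A-9876', 'NOAO']
print(len(filter_by(rows, "telescope", "kp4m")))  # 2
```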



Selecting Data for Retrieval

There are several ways to select data sets for retrieval from the archive. If you own proprietary data and have signed in, you may retrieve your data. For unregistered users, only public (i.e., non-proprietary) data may be selected for retrieval. The public release date for any data set is shown in a column at the right hand side of the table (see right). If a data set is public, the Access column will show a link marked Retrieve. Clicking on this link will immediately start a data download for that file.

The leftmost column of the Results table (see left) includes checkboxes that you may use to select individual data sets.

You can also use the "Selection" pulldown menu, located in the Download box at the upper right (see below). This provides options for selecting or de-selecting all visible rows on the current page of Results, or all rows on all pages of the Results table.


Staging Data for Retrieval

Once you have selected the data that you want, click the button marked "Stage selected rows" in the Download box at upper right. A pop-up box will tell you the volume of data that you are staging, and ask you to confirm that you wish to proceed. This starts the process of staging your data for ftp retrieval.


The Staging Area

After clicking on "Stage selected rows", you will be automatically taken to the Staging Area, where you can retrieve your data. You can also go there any time by clicking on the blue Staging Area tab in the upper portion of the window.

The Staging Area is shown below, seen in the process of staging some data. The panels at left give information about the staging status, the available storage space, and instructions and tips for retrieving your data. At right there is a list of the data files that are being staged; this may continue on multiple pages if you are staging more than 20 files. There are several buttons and selection menus across the top of the page which let you control the staging process. Staging is generally quite rapid, even for large numbers of files.

The image staging status is summarized in a box at the upper left, showing how many files have been queued for staging, have been successfully staged, or have encountered errors. The staging status codes have the following meanings:
  • Not Enqueued: Initial state, where the requested data product is waiting to enter the Portal staging queue
  • Enqueued: Requested data product is now in the Portal staging queue
  • Staging: Requested data product is being transferred to the ftp area
  • Staged: Data product is ready for user download from the ftp area
  • Canceled: User has canceled staging for the data
  • Timeout: Connection from Portal to Archive has been lost
  • Error: Data product was not successfully transferred to the ftp area; user must re-request data

The list at right also gives the staging status for each individual data set. The file names are shown in bold face when they have been staged successfully.
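
If you keep your own notes on a large staging request, the status codes above map naturally onto a small Python enumeration. This is a bookkeeping sketch only: the grouping of Canceled, Timeout and Error as states that need a restart is an assumption based on the descriptions above, and nothing here talks to the Portal.

```python
from enum import Enum


class StagingStatus(Enum):
    NOT_ENQUEUED = "Not Enqueued"
    ENQUEUED = "Enqueued"
    STAGING = "Staging"
    STAGED = "Staged"
    CANCELED = "Canceled"
    TIMEOUT = "Timeout"
    ERROR = "Error"


# Still on their way to the ftp area.
PENDING = {StagingStatus.NOT_ENQUEUED, StagingStatus.ENQUEUED, StagingStatus.STAGING}


def summarize(statuses):
    """Count files that are staged, still pending, or need to be re-requested."""
    summary = {"staged": 0, "pending": 0, "failed": 0}
    for status in statuses:
        if status is StagingStatus.STAGED:
            summary["staged"] += 1
        elif status in PENDING:
            summary["pending"] += 1
        else:  # Canceled, Timeout or Error (assumed to need a restart)
            summary["failed"] += 1
    return summary


def all_staged(statuses):
    """True only when every file is ready for download."""
    return all(s is StagingStatus.STAGED for s in statuses)


# Status text as displayed in the Staging Area can be looked up by value.
seen = [StagingStatus("Staged"), StagingStatus("Staging"), StagingStatus("Error")]
print(summarize(seen))   # {'staged': 1, 'pending': 1, 'failed': 1}
print(all_staged(seen))  # False
```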

FTP Retrieval

When the Imaging Staging Status reports that all of your files have been staged, you may retrieve them by ftp from nvo.noao.edu. The username, password, and the location of your staging area are given at left.
  • Unregistered users may stage non-proprietary data for retrieval by anonymous ftp.
  • Registered users who have signed in will stage data to a password-controlled area.

The Download tips in the Staging Area also give instructions for using lftp, which provides faster parallel transfer and can considerably speed up download times for large volumes of data. You may need to install lftp, which is available from most standard software repositories, and to download an lftp configuration file and save it as ~/.lftprc. Then, follow the instructions provided in the Download tips.

Note: You must use plain ftp (not sftp) to download your data from the staging area. Be sure to select binary file transfer, or your data may be unreadable!
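
For scripted downloads, the same plain-ftp, binary-mode retrieval can be done with Python's standard ftplib module. The sketch below assumes that all staged files sit directly in one staging directory; the username, password and directory shown in the usage comment are placeholders that you must replace with the values given in your Staging Area.

```python
from ftplib import FTP
from pathlib import Path

ARCHIVE_FTP_HOST = "nvo.noao.edu"  # ftp host named in this tutorial


def download_staged(ftp, remote_dir, dest_dir):
    """Fetch every file in remote_dir over an open ftp connection.

    `ftp` is a logged-in ftplib.FTP object. Files are transferred with
    retrbinary, i.e. in binary mode, so FITS data are not corrupted.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    ftp.cwd(remote_dir)

    saved = []
    for name in ftp.nlst():  # assumes a flat directory of files
        target = dest / Path(name).name
        with open(target, "wb") as fh:
            ftp.retrbinary(f"RETR {name}", fh.write)
        saved.append(target)
    return saved


# Usage (placeholders, not real credentials):
#
#   with FTP(ARCHIVE_FTP_HOST) as ftp:
#       ftp.login(user="YOUR_USERNAME", passwd="YOUR_PASSWORD")
#       download_staged(ftp, "YOUR_STAGING_DIRECTORY", "noao_data")
#
# For anonymous ftp (unregistered users), call ftp.login() with no arguments.
```

This transfers one file at a time. For large volumes of data, the lftp approach described in the Download tips is likely to be faster.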

Note: Data in the archive staging area have a limited shelf life, and may be deleted after one week if the staging area nears its maximum capacity. Therefore, prompt retrievals are advised.

If you have problems with staging

Occasionally, some data sets may fail to stage properly. If so, you may need to try to restart staging. At the top of the Staging Area page there are several buttons that can be used to control the staging process:
  • Restart staging: Click here if some files fail to stage properly. A pop-up window should appear asking if you wish to re-stage all images that were not previously staged. Click 'OK' and hopefully everything will work properly this time...
  • Stop staging: Click here to halt staging if necessary.
  • Clear staging area: Remove all files from the staging area.

Cleaning up your FTP staging area

When you have finished downloading your data, we suggest that you click Clear my staging area to empty the ftp area and free up staging disk space. Staged data will automatically be deleted after approximately one week.


Retrieving data directly from the Archive with cURL

As an alternative to staging your data for FTP retrieval, you can fetch data using cURL, a command-line tool that allows batch downloads directly from the Archive. In certain circumstances, this may be faster or more robust than standard FTP.

Working with your downloaded data

Data compression:

FITS data stored in the NOAO archive are compressed in order to save space and to speed up downloads. Before 2010, data were compressed using gzip. From semester 2010A on, the Archive uses "tile compression", a method of compressing data within the FITS standard (rather than externally compressing existing FITS files). The NOAO Archive now uses the Rice compression algorithm to create tile-compressed FITS images; this is substantially faster than standard gzip and achieves greater compression factors. The tile-compressed data files are recognizable by their ".fz" extension.
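
A downloaded file's extension tells you which of the two schemes was used. The Python sketch below sorts files accordingly and unpacks the gzip case with the standard library. Tile-compressed ".fz" files are left alone here: they remain valid FITS files and can be read by software that supports FITS tile compression, or unpacked with a tool such as the funpack utility distributed with CFITSIO. The function names are made up for this example.

```python
import gzip
import shutil
from pathlib import Path


def compression_kind(filename):
    """Classify a downloaded file by its extension."""
    name = str(filename).lower()
    if name.endswith(".fz"):
        return "tile"   # FITS tile compression (Rice)
    if name.endswith(".gz"):
        return "gzip"   # externally gzip-compressed FITS
    return "none"


def gunzip(path):
    """Write an uncompressed copy of a .gz file next to it and return its path."""
    src = Path(path)
    dst = src.with_suffix("")  # 'x.fits.gz' -> 'x.fits'
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


print(compression_kind("kp033611.fits.fz"))  # tile
print(compression_kind("kp012345.fits.gz"))  # gzip
```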

Archive filenames:

Data from NOAO telescopes and instruments are assigned unique filenames when they are stored in the NOAO Science Archive. However, the new archive filenames are generally not very informative. If you are the PI of an observing program, you may find it useful to rename the data files that you retrieve from the archive to the names they had at the telescope.