2. MAROON-X DRP: Using the Reduce class API

2.1. Complete Reduction Workflow Using the Python API

This guide walks through the end-to-end reduction of a single MAROON-X science exposure using the DRAGONS Python API (the Reduce class and dataselect.select_data). The example reduces an exposure of object HD 3651 taken on 2025-07-17. The final output is a fully reduced, wavelength-calibrated spectrum with science quality.

The same workflow expressed through the DRAGONS command-line tools is in MAROON-X DRP: Using the CLI tools. Both tutorials reduce the same files and produce the same outputs; they differ only in interface.

2.1.1. Overview of Reduction Steps

The full reduction proceeds in nine steps, executed in order:

Debundle - split raw GOA bundles into per-arm FITS files
Master Darks - combine raw darks per exposure time and arm
Dark Coefficients - fit dark scaling vs. exposure time per arm
Master Flats - build per-arm master flats from DFFFD + DDDDF inputs
Wavelength Calibration - fit the dynamic etalon wavelength solution
Synthetic Darks - interpolate a dark matched to each science exposure
Science Reduction - extract and wavelength-calibrate science spectra
Barycentric Correction - compute the BERV shift of the reduced spectra
Export Bundle - combine BLUE and RED arms into a single output FITS

Note

MAROON-X has two independent arms (BLUE and RED). Every calibration and science step is run separately per arm.

Important

MAROON-X is currently an external instrument package, so every reduction script must (a) import maroonx_instruments to register the AstroData tags, and (b) set myreduce.drpkg = 'maroonxdr' on every Reduce instance to load the maroonxdr recipe library. These two lines will no longer be needed once MAROON-X is integrated into DRAGONS.

2.2. Prerequisites

Before beginning the reduction, ensure you have:

A working maroonxdr installation in a Python virtual environment, with the DRAGONS framework available on the same environment. See the installation guide of the User Manual for details.
All raw FITS files for the example (darks, flats, wavecal, science) downloaded from the Gemini Observatory Archive (GOA) into a single working directory. We call this directory science_dir/ throughout the tutorial.

2.2.1. Directory Structure

The tutorial assumes a single working directory holding every raw bundle:

science_dir/
└── N*.fits                   # Raw GOA bundle files

As the reduction proceeds, Reduce.runr() writes its outputs into the working directory and registers calibrations under a calibrations/ subtree managed by caldb:

science_dir/
├── N*.fits                          # Raw GOA bundle files
├── *_b_*.fits, *_r_*.fits           # Debundled per-arm files and
│                                    # downstream reduction products
├── calibrations/
│   ├── processed_dark/              # Master darks (per exptime, per arm)
│   ├── processed_dark_coeff/        # Dark scaling coefficients (per arm)
│   ├── processed_flat/              # Master flats (per arm)
│   └── processed_wavecal/           # Dynamic etalon wavelength solutions
└── reduce_*.log                     # Reduction log files

Note

Calibration files produced by Reduce.runr() are written twice: once in science_dir/ and once in the matching calibrations/<caltype>/ subdirectory. The copy under calibrations/ is what caldb indexes and serves to later steps.

Note

This tutorial assumes caldb is configured and initialised. See Calibration Database Setup for a one-time setup walkthrough.

2.2.2. Dataset

This example reduces a single 300s exposure of HD3651 taken on 2025-07-17, against calibrations taken on 2025-07-01, 2025-07-17, and 2025-07-21.

All files can be downloaded from the Gemini Observatory Archive (GOA). The table below lists the raw GOA bundle ranges; each N…M*.fits bundle contains one BLUE and one RED exposure.

Type	GOA bundle range	Notes
Darks	`N20250717M6066` - `N20250721M6059`	DDDDE pattern, BLUE and RED, at exposure times 60, 120, 300, 600, 900, 1200, 1800 s
Flats	`N20250701M6126` - `N20250701M6271`	DFFFD and DDDDF illumination patterns, BLUE and RED
Wavecal	`N20250717M5948`	DEEEE etalon frame, BLUE and RED
Science	`N20250717M5299`	HD 3651, 300 s, SOOOE pattern, BLUE and RED

The dark exposure times cover the range needed by the dark-coefficient fit (Step 3), which interpolates a synthetic dark matched to the science exposure time.

2.3. Reduction Steps

2.3.1. Step 0: Environment Setup

Import the libraries used throughout the tutorial, configure logging, and define a small get_files helper plus the per-arm and per-exposure-time tag lists that the loops in later steps iterate over.

import os
import itertools as it
from pathlib import Path

from gempy.adlibrary import dataselect
from gempy.utils import logutils
from recipe_system.reduction.coreReduce import Reduce

import astrodata
import maroonx_instruments  # noqa - registers MAROON-X AstroData tags

# Configure logging. Use mode='debug' for verbose output when troubleshooting.
logutils.config(file_name='reduce.log', mode='standard')

# Working directory. Change the path here to suit your installation.
science_dir = Path('/path/to/science_dir')
os.chdir(science_dir)

def get_files(pattern='*.fits'):
    """Return a sorted list of files in science_dir matching pattern."""
    return sorted(str(p) for p in science_dir.glob(pattern))

# Tag lists used by the per-arm and per-exposure-time loops below
exptime_tags = ['60s', '120s', '300s', '600s', '900s', '1200s', '1800s']
arm_tags = ['BLUE', 'RED']

Note

get_files is the Python analogue of the CLI’s *.fits glob. Later steps reuse it with narrower patterns (*_reduced.fits, *_barycor.fits) to mirror the CLI’s targeted dataselect calls.

2.3.2. Step 1: Debundle Raw Data

Purpose: split each raw GOA bundle into one BLUE and one RED FITS file.

MAROON-X data arrives from GOA as bundle files (tagged BUNDLE) that contain both arms in a single FITS. The default processBundle recipe picks up every BUNDLE-tagged file and writes a per-arm FITS for each one. For example, the science bundle N20250717M5299.fits yields 20250717T144308Z_SOOOE_b_0300.fits and 20250717T144308Z_SOOOE_r_0004.fits.

Select every bundle in the working directory and run Reduce:

# Select bundles
selected_bundles = dataselect.select_data(get_files(), tags=['BUNDLE'])

# Debundle (no recipename set: processBundle is the default for BUNDLE)
myreduce = Reduce()
myreduce.files.extend(selected_bundles)
myreduce.drpkg = 'maroonxdr'
myreduce.runr()

After this step science_dir/ contains the original N*.fits bundles plus the per-arm debundled files used by every subsequent step.

2.3.3. Step 2: Master Darks

Purpose: combine the raw darks at each exposure time into a master dark, per arm.

For every (exposure time, arm) pair, dataselect.select_data finds the matching raw darks (DDDDE pattern, tagged RAW,DARK) and Reduce runs the default makeProcessedDark recipe. The dataset has 7 exposure times × 2 arms, so 14 master darks total will be created.

For a single combination of exposure time and arm, the process is:

# Process darks for a specific exposure time and arm, 300s and BLUE
selected_darks = dataselect.select_data(get_files(), tags=['RAW', 'DARK', '300s', 'BLUE'])

myreduce = Reduce()
myreduce.files.extend(selected_darks)
myreduce.drpkg = 'maroonxdr'
myreduce.runr()

Run the full loop over every exposure time and arm:

# Process darks for each exposure time and arm
for exptime, arm in it.product(exptime_tags, arm_tags):

    selected_darks = dataselect.select_data(
        get_files(), tags=['RAW', 'DARK', exptime, arm])

    myreduce = Reduce()
    myreduce.files.extend(selected_darks)
    myreduce.drpkg = 'maroonxdr'
    myreduce.runr()

Each output is a single master dark file named after the first raw frame in the stack with a _dark suffix - for example 20250721T170049Z_DDDDE_b_0300_dark.fits. The master darks land in calibrations/processed_dark/ (and as duplicates in science_dir/).

Verify processed darks:

# List all processed darks
print(dataselect.select_data(get_files(), tags=['PROCESSED', 'DARK']))

Note the presence of the PROCESSED tag, which serves to distinguish the processed darks from the raw darks that are present in the directory.

Note

Processed calibration files are saved twice: in the current working directory science_dir/ and in the matching calibrations/<caltype>/ subdirectory. The copy under calibrations/ is the one caldb indexes and serves to later steps.

2.3.4. Step 3: Dark Coefficients

Purpose: fit, per arm, the dark scaling as a function of exposure time so a synthetic dark can later be interpolated for any science exposure.

The makeDarkCoefficients recipe takes every master dark for one arm (one per exposure time) and produces a single dark-coefficient file. The input selector takes all PROCESSED,DARK files for the arm from the working directory and excludes the DARK_COEFF and DARK_SYNTH tags so previous coefficient or synthetic-dark outputs in the same directory are not re-ingested.

Run once per arm:

for arm in arm_tags:

    # Select all master darks for this arm, excluding any previous
    # coefficient or synthetic-dark outputs
    selected_darks = dataselect.select_data(
        get_files(),
        tags=['PROCESSED', 'DARK', arm],
        xtags=['DARK_COEFF', 'DARK_SYNTH'])

    # Fit the dark-coefficient model
    myreduce = Reduce()
    myreduce.files.extend(selected_darks)
    myreduce.drpkg = 'maroonxdr'
    myreduce.recipename = 'makeDarkCoefficients'
    myreduce.runr()

Each call writes one *_darkCoefficients.fits file (named after the first input master dark) into science_dir/, and a copy under the processed_dark_coeff caltype in calibrations/processed_dark_coeff/. The output carries the DARK_COEFF tag, which is how later steps locate it.

Verify dark-coefficient files:

# List dark coefficient files
print(dataselect.select_data(get_files(), tags=['DARK_COEFF']))

2.3.5. Step 4: Master Flats

Purpose: build a master flat per arm.

MAROON-X flats come in two illumination patterns: DFFFD (flat on fibers 2-4) and DDDDF (flat on fiber 5 only). The dataset for this tutorial contains both patterns and uses the makeProcessedFlatDFFFF recipe, which combines the two into a single DFFFF master flat per arm.

Run the reduction per arm:

for arm in arm_tags:

    # Select every raw flat for this arm (both DFFFD and DDDDF)
    selected_flats = dataselect.select_data(get_files(), tags=['RAW', 'FLAT', arm])

    # Build the master flat
    myreduce = Reduce()
    myreduce.files.extend(selected_flats)
    myreduce.drpkg = 'maroonxdr'
    myreduce.recipename = 'makeProcessedFlatDFFFF'
    myreduce.runr()

Each call writes one *_flat.fits master flat into science_dir/ and a copy under the processed_flat caltype in calibrations/processed_flat/. The output carries the PROCESSED,FLAT tag set.

Note

The default recipe makeProcessedFlat combines DFFFD and FDDDF patterns into an FFFFF master flat. When the dataset has DDDDF instead of FDDDF (as in this tutorial), use the makeProcessedFlatDFFFF variant explicitly via recipename.

Verify processed flats:

# List all processed flats
print(dataselect.select_data(get_files(), tags=['PROCESSED', 'FLAT']))

2.3.6. Step 5: Wavelength Calibration

Purpose: fit, per arm, a dynamic wavelength solution from the etalon calibration frame.

The Fabry-Perot etalon is the primary wavelength calibration file for MAROON-X. Anchored to an initial ThAr solution, the etalon provides the refined wavelength and drift solution that the science reduction (Step 7) applies to extracted spectra.

The makeDynamicWavecal recipe takes the raw etalon frame (DEEEE pattern, tagged RAW,ETALON) for one arm, identifies the peaks in every order, and fits the per-pixel wavelength solution. The getPeaksAndPolynomials:multithreading=True override parallelizes the fit and is recommended on multi-core machines.

Run once per arm:

for arm in arm_tags:

    # Select the raw etalon wavecal frame for this arm
    selected_wavecal = dataselect.select_data(get_files(), tags=['RAW', 'ETALON', arm])

    # Fit the dynamic wavelength solution
    myreduce = Reduce()
    myreduce.files.extend(selected_wavecal)
    myreduce.drpkg = 'maroonxdr'
    myreduce.recipename = 'makeDynamicWavecal'
    myreduce.uparms = {'getPeaksAndPolynomials:multithreading': True}
    myreduce.runr()

Each call writes one *_wavecal.fits file into science_dir/ and a copy under the processed_wavecal caltype in calibrations/processed_wavecal/. The output carries the PROCESSED,WAVECAL tag set.

Verify wavelength calibrations:

# List processed wavecal files
print(dataselect.select_data(get_files(), tags=['PROCESSED', 'WAVECAL']))

2.3.7. Step 6: Synthetic Darks

Purpose: interpolate, per science frame, a synthetic dark matched to its exact exposure time using the dark coefficients fit in Step 3.

The makeSyntheticDark recipe takes the raw science frames and uses the DARK_COEFF calibration to build a dark frame at the science exposure time. The synthetic darks are stored under the processed_dark caltype with the DARK_SYNTH tag.

Run once per arm:

for arm in arm_tags:

    # Select raw science frames for this arm
    selected_sci = dataselect.select_data(get_files(), tags=['RAW', 'SCI', arm])

    # Build the synthetic dark for each science frame
    myreduce = Reduce()
    myreduce.files.extend(selected_sci)
    myreduce.drpkg = 'maroonxdr'
    myreduce.recipename = 'makeSyntheticDark'
    myreduce.runr()

2.3.8. Step 7: Science Data Reduction

Purpose: extract and calibrate the science spectra, applying the master flat, the dynamic wavelength solution, and the synthetic dark matched to the science exposure time.

The default recipe for RAW,SCI files runs the full echelle reduction: dark and flat correction, straylight removal, stripe extraction, optimal extraction, wavelength assignment, and fiber combination. caldb resolves each input calibration automatically from the work done in Steps 2-6.

The three uparms overrides below are the recommended defaults for the current pipeline: extractStripes:straylight_removal_fibers=[5] applies the straylight correction to the calibration fiber, getPeaksAndPolynomials runs the per-order fits in parallel, and combineFibers:max_clips=20 sets the outlier rejection threshold when combining the science fibers.

Run once per arm:

for arm in arm_tags:

    # Select raw science frames for this arm
    selected_sci = dataselect.select_data(get_files(), tags=['RAW', 'SCI', arm])

    # Full echelle reduction
    myreduce = Reduce()
    myreduce.files.extend(selected_sci)
    myreduce.drpkg = 'maroonxdr'
    myreduce.uparms = {
        'extractStripes:straylight_removal_fibers': [5],
        'getPeaksAndPolynomials:multithreading': True,
        'combineFibers:max_clips': 20,
    }
    myreduce.runr()

Each call writes a *_reduced.fits file per input science frame into science_dir/. The reduced products carry the PROCESSED_SCIENCE tag and are the input to the barycentric-correction step (Step 8).

Verify reduced science frames:

# List all reduced science frames
print(dataselect.select_data(get_files(), tags=['PROCESSED_SCIENCE']))

2.3.9. Step 8: Barycentric Correction

Purpose: compute the solar-system barycentric correction.

The applyBarycentricCorrection recipe queries SIMBAD to resolve the target’s coordinates, computes the barycentric Earth radial velocity (BERV), and writes it to *_barycor.fits.

The barycentricCorrection primitive accepts two related parameters:

target_name — substring match against the FITS OBJECT header. When set, only files whose OBJECT contains this string are processed; when None, every input is processed using its own OBJECT value for the SIMBAD lookup.
simbad_target_name — explicit SIMBAD-resolvable name. Use when the FITS OBJECT value (e.g. HD3651) is not directly resolvable and you need to override the lookup string (e.g. HD 3651).

For this tutorial the FITS header has OBJECT='HD3651' (no space), which is already SIMBAD-resolvable, so only target_name is needed.

Run once per arm:

for arm in arm_tags:

    # Select the reduced science file for this arm
    selected_reduced = dataselect.select_data(
        get_files('*_reduced.fits'), tags=['PROCESSED_SCIENCE', arm])

    # Apply the barycentric correction
    myreduce = Reduce()
    myreduce.files.extend(selected_reduced)
    myreduce.drpkg = 'maroonxdr'
    myreduce.recipename = 'applyBarycentricCorrection'
    myreduce.uparms = {'barycentricCorrection:target_name': 'HD3651'}
    myreduce.runr()

Each call writes one *_barycor.fits file into science_dir/. The output carries the BARYCOR tag.

Verify barycor-corrected files:

# List all barycor-corrected files
print(dataselect.select_data(get_files(), tags=['BARYCOR']))

2.3.10. Step 9: Export Reduced Science Bundle

Purpose: combine the BLUE and RED barycor-corrected spectra of each science observation into a single multi-extension FITS.

The exportReducedBundle recipe groups its inputs by the original GOA bundle name (ARCHNAME), then merges each (BLUE, RED) pair into one output. Both arms for a given observation must be passed in the same Reduce call - the recipe raises an error if either arm is missing for any ARCHNAME.

Select every barycor output and run the recipe once:

# Select all barycor-corrected files (both arms)
selected_barycor = dataselect.select_data(
    get_files('*_barycor.fits'), tags=['PROCESSED', 'BARYCOR'])

# Combine BLUE + RED into the final bundle
myreduce = Reduce()
myreduce.files.extend(selected_barycor)
myreduce.drpkg = 'maroonxdr'
myreduce.recipename = 'exportReducedBundle'
myreduce.runr()

The output is a single <ARCHNAME>_reduced.fits per observation in science_dir/ - for this tutorial, N20250717M5299_reduced.fits. This is the science-ready product.