gnssrefl.extract_arcs module

extract_arcs.py - Standalone module for extracting satellite arcs from SNR data.

This module provides a clean API for detecting and extracting satellite arcs from Signal-to-Noise Ratio (SNR) data files. It refactors arc detection logic from gnssir_v2.py into reusable functions.

An “arc” represents a continuous satellite pass (rising or setting) across the sky. Arcs are split when:

  1. Time gap > 600 seconds (10 minutes)

  2. Elevation angle direction reverses (rising <-> setting)
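
The two split rules above can be illustrated with a minimal sketch. This is not the module's implementation: the function name split_into_arcs and its arguments are hypothetical, and it assumes the samples belong to one satellite and are already sorted by time.

```python
import numpy as np

def split_into_arcs(seconds, ele, max_gap=600.0):
    """Return one index array per arc (illustrative sketch only).

    Hypothetical helper: assumes a single satellite, sorted by time.
    """
    seconds = np.asarray(seconds, dtype=float)
    ele = np.asarray(ele, dtype=float)
    n = seconds.size
    if n == 0:
        return []

    starts = [0]     # index of the first sample of each arc
    direction = 0    # +1 rising, -1 setting, 0 not yet known
    for i in range(1, n):
        step = np.sign(ele[i] - ele[i - 1])
        if seconds[i] - seconds[i - 1] > max_gap:
            # Rule 1: a time gap longer than max_gap starts a new arc
            starts.append(i)
            direction = 0
        elif step != 0:
            # Rule 2: a change of elevation direction starts a new arc
            if direction != 0 and step != direction:
                starts.append(i)
            direction = step

    bounds = starts + [n]
    return [np.arange(a, b) for a, b in zip(bounds[:-1], bounds[1:])]
```

In this sketch the turning-point sample stays with the earlier arc; whether the module assigns it the same way is not stated here.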

gnssrefl.extract_arcs.apply_refraction(snr_array, station_config, year, doy, verbose=True)

Apply refraction correction to SNR elevation angles.

Returns a copy of snr_array with corrected elevations; rows where the correction is invalid (e.g. ele < 1.5 for NITE/MPF) are removed.

gnssrefl.extract_arcs.attach_gnssir_processing_results(arcs, results, time_tolerance=0.17)

Attach gnssir processing results to extracted arcs.

For each arc, finds the matching row in the gnssir result file based on satellite number, frequency, rise/set direction, and UTC time proximity. Sets metadata['gnssir_processing_results'] to a dict or None.
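
The matching rule described above can be sketched as follows. This is a hedged illustration, not the module's code: the function name match_result, the row layout (dicts with sat, freq, rising, utc keys), and the reading of time_tolerance as hours are all assumptions.

```python
def match_result(arc_key, rows, time_tolerance=0.17):
    """Return the closest matching row within tolerance, or None.

    Hypothetical layout: arc_key and each row are dicts with the keys
    sat, freq, rising (bool) and utc (assumed to be hours).
    """
    best, best_dt = None, None
    for row in rows:
        same_track = (
            row["sat"] == arc_key["sat"]
            and row["freq"] == arc_key["freq"]
            and row["rising"] == arc_key["rising"]
        )
        if not same_track:
            continue
        dt = abs(row["utc"] - arc_key["utc"])
        # Keep the nearest row in time, provided it is within tolerance
        if dt <= time_tolerance and (best_dt is None or dt < best_dt):
            best, best_dt = row, dt
    return best
```

Returning None when nothing matches mirrors the documented behaviour of setting the metadata entry to a dict or None.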

gnssrefl.extract_arcs.attach_phase_processing_results(arcs, results, time_tolerance=0.17)

Attach phase processing results to extracted arcs.

For each arc, finds the matching row in the phase result file based on satellite number, frequency, and UTC time proximity. Sets metadata['phase_processing_results'] to a dict or None.

gnssrefl.extract_arcs.attach_vwc_track_results(arcs, station, year, doy, extension='', az_tolerance=5.0, time_tolerance=0.25)

Attach VWC track results to extracted arcs.

Matches track file rows to arcs by sat, freq suffix, azimuth, and hour. Sets metadata['vwc_track_results'] to a dict or None.

gnssrefl.extract_arcs.check_azimuth_compliance(az_min_ele: float, azlist: List[float]) bool

Check if azimuth is within allowed regions.

Parameters:
  • az_min_ele (float) – Azimuth angle (degrees) at the lowest elevation point of the arc

  • azlist (list of float) – Azimuth regions as pairs [az1_start, az1_end, az2_start, az2_end, …] e.g., [0, 90, 180, 270] means 0-90 and 180-270 degrees

Returns:

True if azimuth is within any of the allowed regions

Return type:

bool
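
As an illustration of the paired azlist layout, here is a minimal sketch of such a check. The name azimuth_in_regions is hypothetical, and treating both region bounds as inclusive is an assumption rather than documented behaviour.

```python
from typing import List

def azimuth_in_regions(az: float, azlist: List[float]) -> bool:
    """Return True if az lies inside any (start, end) pair of azlist.

    Hypothetical helper; inclusive bounds are an assumption.
    """
    if len(azlist) % 2 != 0:
        raise ValueError("azlist must contain an even number of values")
    # Walk the flat list two values at a time: (start, end), (start, end), ...
    for start, end in zip(azlist[0::2], azlist[1::2]):
        if start <= az <= end:
            return True
    return False
```

With azlist = [0, 90, 180, 270], an azimuth of 45 or 200 degrees passes this check and 135 degrees does not.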

gnssrefl.extract_arcs.extract_arcs(snr_array: ndarray, freq: int | List[int] | None = None, e1: float = 5.0, e2: float = 25.0, ellist: List[float] | None = None, azlist: List[float] | None = None, sat_list: List[int] | None = None, min_pts: int = 20, polyV: int = 4, pele: List[float] | None = None, dbhz: bool = False, screenstats: bool = False, detrend: bool = True, split_arcs: bool = True, filter_to_day: bool = True, year: int | None = None, doy: int | None = None, dec: int = 1) List[Tuple[Dict[str, Any], Dict[str, ndarray]]]

Extract satellite arcs from SNR data array.

Parameters:
  • snr_array (np.ndarray) – 2D array with columns: [sat, ele, azi, seconds, edot, snr1, snr2, …]

  • freq (int, list of int, or None) – Frequency code(s). Default: None (auto-detect)

  • e1 (float) – Minimum elevation angle (degrees). Default: 5.0

  • e2 (float) – Maximum elevation angle (degrees). Default: 25.0

  • ellist (list of floats, optional) – Multiple elevation angle ranges as pairs. Overrides e1/e2.

  • azlist (list of floats, optional) – Azimuth regions as pairs. Default: [0, 360]

  • sat_list (list of int, optional) – Specific satellites to process. Default: all satellites in data

  • min_pts (int) – Minimum points required per arc. Default: 20

  • polyV (int) – Polynomial order for DC removal. Default: 4

  • pele (list of float, optional) – Elevation angle range [min, max] for polynomial fit. Default: [e1, e2]

  • dbhz (bool) – If True, keep SNR in dB-Hz. Default: False

  • screenstats (bool) – If True, print debug information. Default: False

  • detrend (bool) – If True (default), remove DC component via polynomial fit.

  • split_arcs (bool) – If True (default), split data into separate arcs.

  • filter_to_day (bool) – If True (default), only return arcs within the principal day (0-24h).

  • year (int, optional) – Year, used for L2C/L5 satellite list lookup.

  • doy (int, optional) – Day of year, used with year.

  • dec (int) – Decimation factor. Default: 1 (no decimation).

Returns:

Each arc is represented as:

  • metadata: dict with keys: sat, freq, arc_num, arc_type, ele_start, ele_end, az_min_ele, az_avg, time_start, time_end, arc_timestamp, num_pts, delT, edot_factor, cf

  • data: dict with keys: ele, azi, snr, seconds, edot (all np.ndarray)

Return type:

list of (metadata, data) tuples
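
To show how the returned list of (metadata, data) tuples might be consumed, the sketch below iterates over a hand-built stand-in. The values are synthetic, only a few of the documented keys are filled in, and points_per_satellite is a hypothetical helper.

```python
from collections import defaultdict

import numpy as np

# Synthetic stand-in for the return value: a list of (metadata, data) tuples.
# Only a subset of the documented keys is filled in; all values are made up.
arcs = [
    ({"sat": 5, "freq": 1, "num_pts": 3},
     {"ele": np.array([5.0, 6.0, 7.0]), "snr": np.array([0.1, -0.2, 0.1])}),
    ({"sat": 5, "freq": 1, "num_pts": 2},
     {"ele": np.array([9.0, 8.0]), "snr": np.array([0.3, -0.1])}),
    ({"sat": 12, "freq": 1, "num_pts": 2},
     {"ele": np.array([10.0, 11.0]), "snr": np.array([-0.4, 0.2])}),
]

def points_per_satellite(arcs):
    """Sum the number of samples per satellite across all arcs."""
    totals = defaultdict(int)
    for meta, data in arcs:
        totals[meta["sat"]] += data["ele"].size
    return dict(totals)
```

Unpacking each tuple as meta, data keeps the scalar summary separate from the per-sample arrays.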

gnssrefl.extract_arcs.extract_arcs_from_file(obsfile: str, freq: int | List[int] | None = None, buffer_hours: float = 2, **kwargs) List[Tuple[Dict[str, Any], Dict[str, ndarray]]]

Extract satellite arcs from an SNR file.

Loads the file with read_snr() and extracts arcs in one call.

Parameters:
  • obsfile (str) – Path to the SNR observation file.

  • freq (int, list of int, or None) – Frequency code(s). Default: None (auto-detect)

  • buffer_hours (float) – Hours of data from adjacent days. Default: 2

  • **kwargs – Additional keyword arguments passed to extract_arcs()

Returns:

See extract_arcs() for format details.

Return type:

list of (metadata, data) tuples

Raises:
  • FileNotFoundError – If obsfile does not exist.

  • RuntimeError – If read_snr() fails to load the file.

gnssrefl.extract_arcs.extract_arcs_from_station(station: str, year: int, doy: int, freq: int | List[int] | None = None, snr_type: int = 66, buffer_hours: float = 2, attach_results: bool | List[str] = False, extension: str = '', station_config: Dict[str, Any] | None = None, gzip: bool = True, track_file: str | Path | None = None, track_cache: Dict[str, Any] | None = None, tag_with_legacy_apriori: bool = False, refraction_verbose: bool = True, **kwargs) List[Tuple[Dict[str, Any], Dict[str, ndarray]]]

Extract satellite arcs for a station/year/day.

Resolves the SNR file path, loads SNR data, optionally applies refraction correction and decimation, extracts arcs, and optionally saves arc files and attaches processing results.

Parameters:
  • station (str) – Station name (4 characters, e.g. 'mchl')

  • year (int) – Full year (e.g. 2025)

  • doy (int) – Day of year (1-366)

  • freq (int, list of int, or None) – Frequency code(s). Default: None (auto-detect)

  • snr_type (int) – SNR file type (66, 77, 88, etc.). Default: 66

  • buffer_hours (float) – Hours of data from adjacent days for midnight-crossing arcs. Default: 2

  • attach_results (bool or list of str) – If True, attach gnssir/phase/vwc results to arc metadata. A list (e.g. ['gnssir']) names the result types to attach. Default: False

  • extension (str) – Strategy extension for result file paths. Default: '' (empty string)

  • station_config (dict, optional) – Station analysis parameters. When provided, enables refraction correction (if station_config['refraction']) and savearcs (if station_config['savearcs']).

  • gzip (bool) – If True, gzip the SNR file after reading. Default: True

  • track_file (path-like, optional) – Path to a tracks-shaped JSON file (tracks.json from build_tracks, or vwc_tracks.json from vwc_input). When supplied, each arc’s metadata is tagged via tracks.attach_track_id with track_id, track_epoch, track_azim, and (if present in the epoch dict) apriori_RH. Arcs that don’t match any track get -1/None.

  • track_cache (dict, optional) – Shared dict for reusing the same tracks JSON across many calls. Pass the same dict on each call; the JSON is loaded and indexed on the first call and reused thereafter.

  • tag_with_legacy_apriori (bool) –

    When True, tag arcs from the legacy GPS apriori_rh_{fr}.txt file via tracks.attach_legacy_apriori (sets apriori_RH / track_azim on each arc by (sat, azimuth-within-3 deg) matching). Mutually exclusive with track_file. Default: False.

    When neither track_file nor tag_with_legacy_apriori is provided, arcs are returned without track_id / track_epoch / track_azim / apriori_RH tagging.

  • refraction_verbose (bool) – Forwarded as verbose to apply_refraction so batch callers can silence the per-day refraction prints. Default: True.

  • **kwargs – Additional keyword arguments passed to extract_arcs()

Returns:

See extract_arcs() for format details.

Return type:

list of (metadata, data) tuples

Raises:
  • FileNotFoundError – If the SNR file does not exist and cannot be decompressed, or if track_file is supplied but does not exist.

  • ValueError – If both track_file and tag_with_legacy_apriori=True are set.

gnssrefl.extract_arcs.extract_arcs_from_tracks(tracks_json)

Walk active-epoch days in tracks_json and return tagged (meta, data) arcs.

Robust SNR-walk entry for consumers that need the full per-arc SNR payload tagged against a (possibly QC-edited) in-memory tracks_json. Station and extension come from tracks_json['metadata']; tagging happens via a temp-file round-trip through extract_arcs_from_station’s track_file kwarg. Arcs with no matching track are dropped.

Returns a flat list of (metadata, data) tuples in the standard extract_arcs format, concatenated across all active-epoch days.

For the fast summary-only path (results/ + failQC/ with no SNR walk), use load_gnssir_results_from_tracks instead.

gnssrefl.extract_arcs.load_gnssir_results_from_tracks(tracks_json)

Fast-path summary DataFrame from results/ + failQC/ artifacts.

Walks active-epoch days in tracks_json, reads the gnssir results file and its failQC sibling for each day via load_results_with_failqc, and tags each row against tracks_json via lookup_arc. Requires a prior gnssir run; missing failQC siblings raise FileNotFoundError. Rows with no matching track are dropped.

Returns a DataFrame with columns mjd, azim, constellation, RH, match_T, track_id, track_epoch. match_T is always NaN so tracks.fit_segment falls back to the constellation’s default repeat interval via the constellation column.

For the robust SNR-walk that returns full (meta, data) tuples, use extract_arcs_from_tracks instead.

gnssrefl.extract_arcs.load_results_with_failqc(station, year, doy, extension, require_failqc)

Load the combined results+failQC ndarray for one day.

Reads results/{station}/[{extension}/]{doy:03d}.txt and the sibling failQC/ file written by retrieve_rh. Both files share the RESULT_COLUMNS layout; failQC rows have their RH column overwritten with NaN so that downstream consumers filtering on RH.notna() separate pass from fail without a second column.

Parameters:
  • station, year, doy – Identifier tuple used by FileManagement.

  • extension (str) – Strategy extension string ('' for the default strategy).

  • require_failqc (bool) – When True, raise FileNotFoundError if the results file has at least one row but the failQC sibling does not exist. Empty results files (zero-row, e.g. from days where the SNR file had no data) are tolerated because gnssir writes no failQC file in that case. The fast-path tracks loader sets this. When False, missing failQC is silently tolerated (used by the attach_results=['gnssir'] branch of extract_arcs_from_station).

Returns:

Combined 2-D array with the same column layout as RESULT_COLUMNS, or None when neither file exists.

Return type:

np.ndarray or None
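
The combination step described above (stack both files, blank the RH column of the failQC rows) can be sketched as follows. This is not the module's code: combine_results is a hypothetical helper, and RH_COL is a placeholder index, since the real position of RH is defined by RESULT_COLUMNS.

```python
import numpy as np

# Placeholder column index for RH; the real index comes from RESULT_COLUMNS.
RH_COL = 2

def combine_results(results, failqc, rh_col=RH_COL):
    """Stack passing and failing rows, marking failures with RH = NaN.

    Hypothetical helper; either input may be None when its file is missing.
    """
    if results is None and failqc is None:
        return None
    parts = []
    if results is not None:
        parts.append(np.atleast_2d(np.asarray(results, dtype=float)))
    if failqc is not None:
        failed = np.atleast_2d(np.asarray(failqc, dtype=float)).copy()
        failed[:, rh_col] = np.nan  # failed rows carry no valid RH
        parts.append(failed)
    return np.vstack(parts)
```

A consumer can then separate pass from fail with np.isnan on the RH column, without a second status column.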

gnssrefl.extract_arcs.move_arc_to_failqc(meta, station, year, doy, extension='')

Move a saved arc file from arcs/ to arcs/failQC/.

gnssrefl.extract_arcs.remove_dc_component(ele: ndarray, snr: ndarray, polyV: int, dbhz: bool, pele: List[float] | None = None) ndarray

Remove direct signal component via polynomial fit.

Parameters:
  • ele (np.ndarray) – Elevation angles (degrees)

  • snr (np.ndarray) – Raw SNR values

  • polyV (int) – Polynomial order for DC removal

  • dbhz (bool) – If True, keep SNR in dB-Hz; if False, convert to linear units first

  • pele (list of float, optional) – Elevation angle range [min, max] for polynomial fit. If provided, the polynomial is fit on data within this range but evaluated (and removed) over the full arc.

Returns:

Detrended SNR data

Return type:

np.ndarray
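
A minimal sketch of this kind of polynomial detrending is given below. It is an illustration under stated assumptions, not the module's implementation: detrend_snr is a hypothetical name, and the dB-Hz to linear conversion 10**(snr/20) is assumed rather than taken from this page.

```python
import numpy as np

def detrend_snr(ele, snr, poly_order, dbhz, pele=None):
    """Fit a polynomial in elevation angle and subtract it from the SNR.

    Hypothetical helper. The fit uses samples inside pele (if given),
    but the polynomial is evaluated and removed over the full arc.
    """
    ele = np.asarray(ele, dtype=float)
    snr = np.asarray(snr, dtype=float)
    if not dbhz:
        # Assumed conversion from dB-Hz to linear units
        snr = 10.0 ** (snr / 20.0)
    if pele is None:
        fit_mask = np.ones(ele.shape, dtype=bool)
    else:
        fit_mask = (ele >= pele[0]) & (ele <= pele[1])
    coeffs = np.polyfit(ele[fit_mask], snr[fit_mask], poly_order)
    return snr - np.polyval(coeffs, ele)
```

Fitting on a restricted elevation range while subtracting over the whole arc lets the trend be estimated from the cleanest part of the data without discarding samples outside that range.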

gnssrefl.extract_arcs.save_arc(meta, data, sdir, station, year, doy, savearcs_format='txt')

Save a single arc file to sdir.

gnssrefl.extract_arcs.setup_arcs_directory(station, year, doy, extension='', nooverwrite=False)

Create arcs directory, optionally clearing old contents.