gnssrefl.tracks module

tracks.py: multi-GNSS track ground-truth for gnssrefl.

This module owns both sides of the multi-GNSS tracks.json artifact:

Build side: build_tracks walks SNR files via gnssrefl.extract_arcs, folds arcs into per-(sat, freq) periodic ground tracks, fits a single epoch per track, and writes the JSON.

Runtime side: load_tracks_json, build_lookup_index, lookup_arc, and attach_track_id consume the JSON to tag arcs at extract time with their (track_id, track_epoch).

Why

gnssrefl currently identifies VWC tracks by clustering arcs on azimuth alone (+/-3 deg from cluster center). This works for GPS because GPS ground tracks repeat every sidereal day, so a sat’s daily passes have stable azimuths. For non-GPS the repeat period is multi-day:

GPS: 1 sidereal day (2 orbits/day)
GLONASS: 8 sidereal days (17 orbits / 8 sid days)
Galileo: 10 sidereal days (17 orbits / 10 sid days)
BeiDou MEO: 7 sidereal days (13 orbits / 7 sid days)

Within one repeat period a single sat traces 10-17 distinct ground tracks across the sky. Pure az clustering can’t separate them (they cover the whole horizon), so the existing VWC code restricts to GPS. This module produces a clean ground-truth track set for the other constellations.
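To make the repeat periods listed above concrete, the following sketch converts them from sidereal days to solar days. The names SIDEREAL_DAY_S, REPEAT_SIDEREAL_DAYS, and repeat_period_solar_days are illustrative assumptions, not gnssrefl identifiers.

```python
# Mean sidereal day in SI seconds (about 23 h 56 min 4.09 s).
SIDEREAL_DAY_S = 86164.0905
SOLAR_DAY_S = 86400.0

# Nominal ground-track repeat periods in sidereal days, as listed above.
REPEAT_SIDEREAL_DAYS = {"GPS": 1, "GLONASS": 8, "Galileo": 10, "BeiDou MEO": 7}


def repeat_period_solar_days(constellation):
    """Return the nominal ground-track repeat period in solar days."""
    return REPEAT_SIDEREAL_DAYS[constellation] * SIDEREAL_DAY_S / SOLAR_DAY_S


for name in REPEAT_SIDEREAL_DAYS:
    print(f"{name}: {repeat_period_solar_days(name):.4f} solar days")
```

Because a sidereal day is about four minutes shorter than a solar day, each repeat pass arrives slightly earlier in UTC than a whole number of solar days after the previous one.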

This is the minimal (MVP) variant: each track has exactly one stable epoch (epoch_id == 0). The schema reserves epochs as a list so future changepoint-detection work can add entries without breaking readers.

Identity scope

There are two layers of identity in the tracks system:

track_id is forever. The same id appears in every artifact derived from any tracks-shaped JSON (tracks.json, vwc_tracks.json, the per-day phase files, the vwc output files).
epoch_id equals the epoch’s list index (0..N-1) within one saved tracks_json. It is regenerated on every save_tracks and renumbered on every structural mutation (split_epoch, merge_epochs).

gnssrefl.tracks.active_epoch_days(tracks_json): Set of (year, doy) pairs spanning the union of active epoch windows.

gnssrefl.tracks.assign_tracks(df_freq, T_candidates_solar)

Walk arcs forward in MJD and assign track ids.

Returns (df_sorted, track_ids, match_T) where match_T[i] is the candidate T (in solar days) used to extend arc i, or NaN if arc i seeded a new track.
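A simplified sketch of the kind of forward walk described above, not the module's implementation: arcs are visited in MJD order, and each arc either extends an existing track whose last arc lies one candidate period T earlier, or seeds a new track. The function name, the tuple-based input, and both tolerances are assumptions for illustration; the real function operates on a DataFrame and may apply different matching rules (for example, allowing skipped passes).

```python
import math


def assign_tracks_sketch(arcs, T_candidates_solar, time_tol_d=0.02, az_tol=5.0):
    """arcs: list of (mjd, azim) tuples for one (sat, freq).

    Returns (arcs_sorted, track_ids, match_T).
    """
    arcs_sorted = sorted(arcs)   # walk forward in MJD
    last = []                    # last (mjd, azim) seen for each track id
    track_ids, match_T = [], []
    for mjd, az in arcs_sorted:
        hit = None
        for tid, (t_last, az_last) in enumerate(last):
            # Circular azimuth difference in degrees.
            d_az = abs((az - az_last + 180.0) % 360.0 - 180.0)
            if d_az > az_tol:
                continue
            for T in T_candidates_solar:
                # Accept if the gap to the track's last arc is one period T.
                if abs((mjd - t_last) - T) <= time_tol_d:
                    hit = (tid, T)
                    break
            if hit is not None:
                break
        if hit is None:
            last.append((mjd, az))          # seed a new track
            track_ids.append(len(last) - 1)
            match_T.append(math.nan)        # NaN marks a seeding arc
        else:
            tid, T = hit
            last[tid] = (mjd, az)           # extend the matched track
            track_ids.append(tid)
            match_T.append(T)
    return arcs_sorted, track_ids, match_T
```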

gnssrefl.tracks.attach_legacy_apriori(arcs, station, extension='')

Tag arcs with legacy GPS-only per-freq apriori_rh_{fr}.txt entries.

Groups arcs by frequency, loads apriori_rh_{fr}.txt once per freq, and matches each arc to a track by (satellite, circular azimuth distance <= 3 deg), the same rule used historically by the legacy VWC pipeline.

Sets meta['apriori_RH'], meta['track_azim'], meta['track_id'], and meta['track_epoch'] on every arc. apriori_RH / track_azim are None on miss; track_id / track_epoch are -1 on miss. track_epoch is always 0 on match (the legacy path has only one epoch per track).
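The circular azimuth distance used by the matching rule above can be sketched as follows. The helper names are hypothetical, and choosing the closest track within tolerance (rather than the first) is an assumption; the source only states the 3-degree rule.

```python
def circular_az_distance(az1, az2):
    """Smallest absolute angle between two azimuths, in degrees (0..180)."""
    return abs((az1 - az2 + 180.0) % 360.0 - 180.0)


def match_track_by_azimuth(arc_az, track_azims, tol_deg=3.0):
    """Return the index of the closest track within tol_deg, or -1 on miss."""
    best, best_d = -1, tol_deg
    for i, track_az in enumerate(track_azims):
        d = circular_az_distance(arc_az, track_az)
        if d <= best_d:
            best, best_d = i, d
    return best
```

The modulo form matters near north: azimuths of 359 and 1 degree are 2 degrees apart, not 358.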

gnssrefl.tracks.attach_track_id(arcs, track_file_path, year, doy, track_cache=None)

Tag each arc’s metadata with track info from tracks-shaped JSON.

Works against both tracks.json (the station-wide catalog) and vwc_tracks.json (the VWC-eligible filtered subset, which adds a per-epoch apriori_RH field).

Each metadata dict gets these new keys: track_id, track_epoch (both -1 on no match), track_azim (az_avg_minel of the matched epoch, or None), apriori_RH (matched epoch’s apriori_RH, or None; only present in vwc_tracks.json).

Parameters:

arcs (list of (metadata, data) tuples) – Output of extract_arcs, modified in place.
track_file_path (path-like) – Path to a tracks-shaped JSON file (tracks.json from build_tracks, or vwc_tracks.json from vwc_input).
year (int) – Year of the arcs. Used together with doy and each arc’s arc_timestamp to compute MJD for the lookup.
doy (int) – Day of year of the arcs. Used together with year and each arc’s arc_timestamp to compute MJD for the lookup.
track_cache (dict, optional) – Path-keyed cache for reusing prebuilt lookup indexes across many calls. When omitted, a module-level default (TRACK_INDEX_CACHE) is used, so repeated calls transparently reuse the same index across stations, extensions, and tracks.json vs vwc_tracks.json. Cache entries are never invalidated; restart the process if a tracks file is rewritten on disk.

Returns:

The same arcs list (modified in place), for chaining.

Return type:

list
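The path-keyed caching behaviour described for track_cache can be illustrated with a generic sketch. The names _INDEX_CACHE and get_cached are hypothetical and the build callable stands in for whatever loads and indexes a tracks file; this is not the module's code.

```python
_INDEX_CACHE = {}  # hypothetical module-level default, keyed by file path


def get_cached(path, build, cache=None):
    """Return build(path) once per path; later calls reuse the stored result."""
    if cache is None:
        cache = _INDEX_CACHE
    key = str(path)
    if key not in cache:
        cache[key] = build(key)  # stored once, never invalidated afterwards
    return cache[key]
```

A cache like this never notices that a file changed on disk, which is why the documentation advises restarting the process after a tracks file is rewritten.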

gnssrefl.tracks.build_lookup_index(tracks_json)

Build a (sat, freq) -> [(track_id, track_epoch, def_dict), …] index.

Each def_dict carries the fields needed by lookup_arc: rise, repeat_interval_d, anchor_mjd, az_avg_minel, az_drift_rate, first_mjd, last_mjd, epoch_type
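A minimal sketch of how such an index could be assembled. The JSON layout assumed here (a top-level "tracks" list whose items carry "track_id", "sat", "freq", and an "epochs" list) is a guess for illustration only; the actual tracks.json schema may differ.

```python
from collections import defaultdict


def build_index_sketch(tracks_json):
    """Group every epoch of every track under its (sat, freq) key."""
    index = defaultdict(list)
    for track in tracks_json["tracks"]:  # assumed top-level key
        key = (track["sat"], track["freq"])
        # The epoch id is taken to be the position in the epochs list.
        for epoch_id, epoch in enumerate(track["epochs"]):
            index[key].append((track["track_id"], epoch_id, epoch))
    return dict(index)
```

Grouping by (sat, freq) up front means a lookup only has to scan the handful of candidates for one satellite and frequency rather than the whole catalog.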

gnssrefl.tracks.build_tracks(station, year, year_end=None, extension='', snr_type=66, source='auto')

Build tracks.json for a station over [year .. year_end].

Collects per-arc geometry (sat, freq, mjd, azim, rise), folds arcs into periodic tracks, drops fragment tracks below the per-freq filter threshold (10 percent of per-freq median arcs per track), fits a single periodic epoch per surviving track, and writes the JSON via FileManagement.
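The fragment filter described above (drop tracks with fewer arcs than 10 percent of the per-frequency median) can be sketched as follows. The function name and input shape are hypothetical, and whether the real comparison is inclusive at the threshold is an assumption.

```python
from collections import Counter, defaultdict
from statistics import median


def drop_fragment_tracks(arc_labels, fraction=0.10):
    """arc_labels: list of (freq, track_id), one entry per arc.

    Returns the set of (freq, track_id) pairs that survive the filter.
    """
    counts = Counter(arc_labels)  # arcs per (freq, track)
    per_freq = defaultdict(list)
    for (freq, _tid), n in counts.items():
        per_freq[freq].append(n)
    keep = set()
    for (freq, tid), n in counts.items():
        # Threshold is relative to the median track size on this frequency.
        threshold = fraction * median(per_freq[freq])
        if n >= threshold:
            keep.add((freq, tid))
    return keep
```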

Two arc sources are supported, selected by source:

'snr': walk SNR files via load_arcs. Authoritative and slow; covers every frequency the SNR file contains.
'results': read results/ + failQC/ via load_arcs_from_results. Fast, but only covers frequencies gnssir was run with.
'auto' (default): prefer 'results' if any results/ dir is populated in range; else fall back to 'snr'.

Parameters:

station (str) – 4-char station name (lowercase)
year (int) – Start year
year_end (int, optional) – End year inclusive. Defaults to year.
extension (str) – Strategy extension subdirectory. Default ''.
snr_type (int) – SNR file type for the SNR-walk path. Default 66.
source (str) – Arc source: 'auto' | 'results' | 'snr'. Default 'auto'.

Returns:

tracks_json (dict) – In-memory tracks_json matching the on-disk JSON.
arcs_df (pandas.DataFrame or None) – Per-arc DataFrame in the extract_arcs_gnssir_results schema (mjd, azim, constellation, RH, match_T, track_id, track_epoch), or None when source='snr' (no RH available).

gnssrefl.tracks.doy_hour_to_mjd(year, doy, hours): Convert (year, doy, fractional hours UTC) to MJD.
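One way to perform this conversion with the standard library is sketched below. It is an illustration under the usual convention that MJD 0 is 1858-11-17 00:00 UTC and that leap seconds are ignored; the function name is hypothetical and the module's own implementation may differ.

```python
from datetime import datetime, timedelta, timezone

MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)  # MJD 0


def doy_hour_to_mjd_sketch(year, doy, hours):
    """Convert (year, day of year, fractional UTC hours) to MJD."""
    start_of_year = datetime(year, 1, 1, tzinfo=timezone.utc)
    t = start_of_year + timedelta(days=doy - 1, hours=hours)
    return (t - MJD_EPOCH) / timedelta(days=1)
```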

gnssrefl.tracks.fit_segment(arcs)

Fit T and azimuth model for a single track’s arcs.

Returns (T_fit, anchor_mjd, az_avg_minel, az_drift_rate). az_drift_rate is only nonzero for BeiDou (secular azimuth drift).

gnssrefl.tracks.iso_to_mjd(iso_str): ISO 8601 ‘YYYY-MM-DDTHH:MM:SSZ’ UTC string -> MJD float.
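A standard-library sketch of this conversion for the exact format shown, assuming MJD 0 is 1858-11-17 00:00 UTC and ignoring leap seconds. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)  # MJD 0


def iso_to_mjd_sketch(iso_str):
    """Parse 'YYYY-MM-DDTHH:MM:SSZ' (UTC) and return MJD as a float."""
    t = datetime.strptime(iso_str, "%Y-%m-%dT%H:%M:%SZ")
    t = t.replace(tzinfo=timezone.utc)
    return (t - MJD_EPOCH) / timedelta(days=1)
```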

gnssrefl.tracks.load_arcs(station, year, year_end, extension, snr_type=66, fast=False)

Collect per-arc geometry for station across [year..year_end] into a DataFrame.

Unified SNR-walk and results-walk entry point. Both paths return the same schema (year, doy, sat, freq, mjd, azim, rise); the results path also carries RH (useful for later stats).

fast=False (default): walk SNR files day by day via extract_arcs_from_station. Authoritative; covers every frequency the SNR file contains. Slow.
fast=True: read the gnssir results/ + failQC/ artifacts via load_results_with_failqc. Orders of magnitude faster, but only covers frequencies gnssir was configured to run. Requires a prior gnssir run with save_failqc=True.

BeiDou GEO/IGSO PRNs in BEIDOU_NON_MEO_SATS are skipped here so the rest of the pipeline never sees them.

gnssrefl.tracks.load_tracks_json(path): Load a tracks.json file from disk and return the tracks_json dict.

gnssrefl.tracks.lookup_arc(sat, freq, obs_time_mjd, obs_az_minel, track_lookup_index, az_tol=5.0, time_tol_min=30)

Look up the (track_id, track_epoch, epoch_entry) for a single arc.

Parameters:

sat (int) – Satellite number. Together with freq, identifies the candidate list.
freq (int) – Frequency code. Together with sat, identifies the candidate list.
obs_time_mjd (float) – Arc observation time in MJD.
obs_az_minel (float) – Arc azimuth at minimum elevation (degrees). Compared against each candidate’s drift-corrected expected azimuth.
track_lookup_index (dict) – Pre-built (sat, freq) -> [(track_id, track_epoch, entry_dict), …] candidate index produced by build_lookup_index(tracks_json). Each entry_dict carries the epoch’s matching parameters (first_mjd, last_mjd, anchor_mjd, repeat_interval_d, az_avg_minel, az_drift_rate, epoch_type, ignored_ranges).
Active matches require the query to fall inside the track's [first_mjd, last_mjd] interval AND fit the periodic model within time_tol_min and az_tol.

Returns:

(track_id, track_epoch, entry) – (-1, -1, None) if no track def covers this arc. entry is the matched epoch dict from build_lookup_index (keys include az_avg_minel and apriori_RH) when the match succeeds.

Return type:

tuple
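The matching rule for a single candidate can be sketched as below. This is an illustration, not the module's code: it assumes time_tol_min is in minutes, az_tol is in degrees, az_drift_rate is in degrees per day measured from anchor_mjd, and that the periodic model predicts passes at anchor_mjd plus integer multiples of repeat_interval_d. The function name is hypothetical.

```python
def matches_epoch(obs_time_mjd, obs_az_minel, entry, az_tol=5.0, time_tol_min=30):
    """Return True if the arc fits one candidate epoch entry."""
    # Active window check: the query must lie inside [first_mjd, last_mjd].
    if not (entry["first_mjd"] <= obs_time_mjd <= entry["last_mjd"]):
        return False
    T = entry["repeat_interval_d"]
    dt = obs_time_mjd - entry["anchor_mjd"]
    # Offset from the nearest predicted pass, in days.
    resid_d = dt - round(dt / T) * T
    if abs(resid_d) * 1440.0 > time_tol_min:
        return False
    # Drift-corrected expected azimuth (drift assumed to be in deg/day).
    az_expected = entry["az_avg_minel"] + entry["az_drift_rate"] * dt
    d_az = abs((obs_az_minel - az_expected + 180.0) % 360.0 - 180.0)
    return d_az <= az_tol
```

A full lookup would run a test like this over the candidates stored for (sat, freq) and return (-1, -1, None) when none matches.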

gnssrefl.tracks.mjd_to_iso_ceil(mjd): MJD -> ISO 8601 Z UTC string, rounded UP to the nearest second.

gnssrefl.tracks.mjd_to_iso_floor(mjd): MJD -> ISO 8601 Z UTC string, rounded DOWN to the nearest second.
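The two rounding directions can be sketched with the standard library as follows, assuming MJD 0 is 1858-11-17 00:00 UTC and ignoring leap seconds. The function names are hypothetical.

```python
import math
from datetime import datetime, timedelta, timezone

MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)  # MJD 0


def _seconds_to_iso(seconds):
    """Format whole seconds since MJD 0 as an ISO 8601 Z string."""
    t = MJD_EPOCH + timedelta(seconds=seconds)
    return t.strftime("%Y-%m-%dT%H:%M:%SZ")


def mjd_to_iso_floor_sketch(mjd):
    """Round DOWN to the whole second."""
    return _seconds_to_iso(math.floor(mjd * 86400.0))


def mjd_to_iso_ceil_sketch(mjd):
    """Round UP to the whole second."""
    return _seconds_to_iso(math.ceil(mjd * 86400.0))
```

Rounding a window's start down and its end up guarantees that the serialized window never shrinks below the original floating-point bounds, which is one plausible reason for providing both variants.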

gnssrefl.tracks.results_dir_has_files(station, year, year_end, extension): True if any year in the range has a populated results/ dir for station.

gnssrefl.tracks.unwrap_az(az): Bring all azimuths into a single ±180° window centered on az[0].
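A minimal sketch of this kind of unwrapping on a plain list. The function name is hypothetical, and treating the window as half-open, [az[0] - 180, az[0] + 180), is an assumption.

```python
def unwrap_az_sketch(az):
    """Shift each azimuth by a multiple of 360 so it lies within
    180 degrees of the first value."""
    ref = az[0]
    return [ref + ((a - ref + 180.0) % 360.0 - 180.0) for a in az]
```

For example, with a reference of 350 degrees an azimuth of 10 degrees becomes 370, so a track that straddles north can be averaged or fitted without a 360-degree jump.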

gnssrefl.tracks.warn_legacy_apriori_and_exit(station, missing_file, extension='')

If any GPS apriori_rh_{fr}.txt exists, print a -legacy T hint and exit.

Called from modern-path entry points when missing_file (e.g. vwc_tracks.json) is absent, to nudge users who still have artifacts from a legacy GPS-only run toward passing -legacy T.

gnssrefl.tracks.write_tracks_json(tracks_json, f): Write tracks_json to file handle f with custom indentation for readability: indent=2 at the structural level, but each inner ignored_ranges pair [mjd_start, mjd_end] is collapsed onto a single line so the range reads as one atomic value.
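One way to get this layout is to serialize with indent=2 and then collapse two-number arrays with a regular expression. This is a sketch with hypothetical names, not the module's implementation; note that it collapses every two-number array, not only those under ignored_ranges, and it would also rewrite a string value that happened to contain such a pattern.

```python
import json
import re

_NUM = r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
# A JSON array holding exactly two numbers, possibly spread over lines.
_PAIR = re.compile(r"\[\s*(" + _NUM + r"),\s*(" + _NUM + r")\s*\]")


def write_json_collapsed_pairs(obj, f):
    """Write obj as indent=2 JSON, with [a, b] number pairs on one line."""
    text = json.dumps(obj, indent=2)
    f.write(_PAIR.sub(r"[\1, \2]", text))
    f.write("\n")
```

The output remains valid JSON, so any standard reader loads it unchanged.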