Tracks

This page describes the tracks module and the tracks.json artifact it produces. It was written for version 4.1.3 and may be out of date. Please visit the function definition page to view the most recent API reference.

Overview

A track is one of the repeating sky paths a satellite traces through the elevation mask. A single satellite produces many distinct tracks. An arc (see the extract_arcs page) is one observation from a track; arcs from different days which share a track can be tagged with a common track_id using this module. The figure below shows tracks produced by a single satellite from each constellation at MCHL, Australia.

../_images/tracks_skyplot.png

Each colored segment is one track that will produce observations (‘arcs’) at the repeat rate.

tracks.json is a derived catalog of the tracks at a station. It is the ground-truth reference that downstream tools (phase, vwc) use to associate arcs with a track.

Two entry points cover the most common workflows for generating a tracks.json:

| Function | Use when you have… |
|---|---|
| generate_tracks/vwc_input <station> <year> | CLI access, a station name and a multi-year window; writes tracks.json/vwc_tracks.json |
| build_tracks(station, year, ...) | programmatic access to the same builder; returns (tracks_json, arcs_df) |

Background

Legacy gnssrefl (pre v4.1.4) identified tracks by azimuth clustering. This works for GPS because GPS has a ground-track repeat of 1 sidereal day, with 2 orbits per repeat. A station can therefore observe at most 4 distinct tracks per satellite (a rising and a setting track for each of the 2 orbits), and these are generally well separated in azimuth. Non-GPS constellations have much longer repeat periods, and thus produce many more tracks:

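| Constellation | Repeat period \(T_{\text{sid}}\) (sidereal days) | Repeat period (solar days) | Orbits per repeat |
|---|---|---|---|
| GPS | 1 | 0.99727 | 2 |
| GLONASS | 8 | 7.97816 | 17 |
| Galileo | 10 | 9.97270 | 17 |
| BeiDou MEO | 7 | 6.98089 | 13 |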
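The solar-day column follows from the length of a sidereal day (about 86164.09 s). The short sketch below reproduces that column and also derives the orbital period implied by each row. The function names are illustrative and are not part of gnssrefl.

```python
SIDEREAL_DAY_S = 86164.0905   # mean sidereal day, in seconds
SOLAR_DAY_S = 86400.0

# (repeat period in sidereal days, orbits per repeat), copied from the table
REPEAT = {
    "GPS": (1, 2),
    "GLONASS": (8, 17),
    "Galileo": (10, 17),
    "BeiDou MEO": (7, 13),
}

def repeat_solar_days(t_sid):
    """Ground-track repeat period converted from sidereal to solar days."""
    return t_sid * SIDEREAL_DAY_S / SOLAR_DAY_S

def orbital_period_hours(t_sid, orbits):
    """Duration of one orbit, given the repeat period and orbits per repeat."""
    return t_sid * SIDEREAL_DAY_S / orbits / 3600.0

for name, (t_sid, orbits) in REPEAT.items():
    print(f"{name:10s} repeat = {repeat_solar_days(t_sid):8.5f} solar days, "
          f"orbit = {orbital_period_hours(t_sid, orbits):5.2f} h")
```

The derived orbital periods (roughly 11.97 h for GPS, 11.26 h for GLONASS, 14.08 h for Galileo, 12.89 h for BeiDou MEO) are a quick sanity check on the orbits-per-repeat column.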

Because tracks of non-GPS constellations can overlap in azimuth, we adopt a time-matching approach to track association.

The matching rule

An arc at observation time \(t_{\text{obs}}\) belongs to a given track when it satisfies both a periodic time condition and an azimuth condition against that track's fitted parameters:

\[ \left| \, (t_{\text{obs}} - t_{\text{anchor}}) - n \, T_{\text{repeat}} \, \right| < \tau_t \quad\text{where}\quad n = \operatorname{round}\!\left(\frac{t_{\text{obs}} - t_{\text{anchor}}}{T_{\text{repeat}}}\right) \]
\[ \left| \, \text{az}_{\text{obs}} - (\text{az}_{\text{avg\_minel}} + \text{az\_drift\_rate} \cdot (t_{\text{obs}} - t_{\text{anchor}})) \, \right| < \tau_{\text{az}} \]

Key properties:

  • \(T_{\text{repeat}}\) is the track-specific fitted repeat period, not the nominal constellation value. fit_segment produces a linear fit to recover a per-track period that accounts for individual satellite drift. This matters for satellites in anomalous orbits (for example Galileo E14 and E18) where the constellation repeat is wrong by several minutes per cycle.

  • az_drift_rate is zero for GPS, GLONASS, and Galileo. For BeiDou MEO it captures the J2-driven westward drift of the ground track. See the final section of this page.

  • tau_t defaults to 30 minutes (TIME_TOL_MIN) and tau_az defaults to 5 degrees (AZ_TOL).

  • A satellite broadcasting many frequencies will have a unique ‘track’ for each frequency.
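The two conditions can be sketched as a single predicate. This is a minimal illustration of the rule as written above, not the module's lookup_arc. The function name and argument layout are hypothetical, times are assumed to be in days (for example MJD), and the wrapped azimuth difference is an added assumption so that 359 and 1 degrees count as 2 degrees apart.

```python
TIME_TOL_MIN = 30.0   # tau_t default quoted above, in minutes
AZ_TOL = 5.0          # tau_az default quoted above, in degrees

def matches_epoch(t_obs, az_obs, t_anchor, t_repeat, az_avg_minel,
                  az_drift_rate=0.0, time_tol_min=TIME_TOL_MIN, az_tol=AZ_TOL):
    """Return (matched, n) for one arc against one set of fitted parameters.

    Times are in days, azimuths in degrees, az_drift_rate in deg/day.
    """
    dt = t_obs - t_anchor
    n = round(dt / t_repeat)                        # nearest whole repeat cycle
    time_resid_min = abs(dt - n * t_repeat) * 1440.0
    az_model = az_avg_minel + az_drift_rate * dt
    # Assumption: wrap the difference into [-180, 180) before comparing.
    az_resid = abs((az_obs - az_model + 180.0) % 360.0 - 180.0)
    return (time_resid_min < time_tol_min and az_resid < az_tol), n

# A GPS-like track: the arc arrives 5 minutes late on the 10th repeat.
t_obs = 60000.0 + 10 * 0.99727 + 5.0 / 1440.0
print(matches_epoch(t_obs, 140.3, 60000.0, 0.99727, 140.1))
```

Because n is recomputed for every arc, the time condition tolerates missed cycles: only the residual to the nearest predicted pass matters.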

Track Identity

Two layers of identity are used throughout the pipeline:

  • track_id is a geometric identifier

  • track_epoch is an additional identifier (0..N-1) that can be used to split a track into regions of time that may have logically distinct a priori RH values

A track starts life with a single epoch (epoch_id == 0). Later user-specified operations may split it into multiple epochs representing periods of different hardware, orbital, or environmental states. Tracks can be active or inactive, and only arcs in active epochs are used in downstream processing. Within an active epoch, smaller ignored_ranges can be added to remove specific outlier arcs.

Building a Track Catalog

Build the catalog directly from the command line:

generate_tracks mchl 2023 -year_end 2025

or generate it automatically with vwc_input, which also writes vwc_tracks.json:

vwc_input mchl 2023 -year_end 2025

Typical output:

tracks source: auto-detected 'results'
loading arcs for mchl 2023-2025 from results/+failQC/ (fast path)
done: 1099 days processed, 0 missing, 273,482 arcs in 12.3s
tolerances: az +/-5.0 deg, time +/-30 min, max_gap 15 cycles
processing frequencies: [1, 5, 20, 101, 102, 201, 205, 206, 207, 208, 301, 302, 305, 306]
building tracks_json over 259,181 kept arcs (9,494 unique track_ids)
wrote /.../Files/mchl/tracks.json
  file size:   4.79 MB
  tracks:      9494

Programmatic access returns the same JSON plus the per-arc DataFrame used to build it:

from gnssrefl.tracks import build_tracks

tracks_json, arcs_df = build_tracks('mchl', 2023, year_end=2025)
print(f"{tracks_json['metadata']['n_tracks']} tracks over {tracks_json['metadata']['duration_d']} days")
# 9494 tracks over 1095 days

Parameter References

Input Reference

Parameters accepted by build_tracks (and the corresponding generate_tracks CLI flags):

| Parameter | Default | Description |
|---|---|---|
| year / year_end | required / year | Inclusive year window |
| extension | '' | Strategy extension subdirectory |
| snr_type | 66 | SNR file type for the SNR-walk path |
| source | 'auto' | Arc source: 'auto', 'results' (fast), or 'snr' (slow, authoritative) |

The two arc sources differ in speed and coverage:

  • 'results' reads results/ plus failQC/ for each day. Orders of magnitude faster, but only covers frequencies gnssir was configured to run. Requires a prior gnssir run with save_failqc=True (which is the default as of v4.1.3).

  • 'snr' reads SNR files directly via extract_arcs. Slow, but covers every frequency present in the SNR file.

  • 'auto' prefers 'results' if any year in range has a populated results/ directory; otherwise falls back to 'snr'.
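The 'auto' rule can be sketched as a directory check over the inclusive year window. The helper below is hypothetical, and the layout <refl_code>/<year>/results/<station>/ is an assumption made for illustration; the builder's own detection may differ.

```python
import os

def pick_source(refl_code, station, year, year_end=None):
    """Return 'results' if any year in the window has a populated results dir."""
    year_end = year if year_end is None else year_end
    for y in range(year, year_end + 1):              # inclusive window
        results_dir = os.path.join(refl_code, str(y), "results", station)
        # "Populated" is taken to mean: the directory exists and is non-empty.
        if os.path.isdir(results_dir) and os.listdir(results_dir):
            return "results"
    return "snr"
```

A single populated year is enough to select 'results', so a window with partial gnssir coverage still takes the fast path.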

Matching tolerances are module-level constants in tracks.py, shared between the builder and the runtime lookup:

| Constant | Default | Scope |
|---|---|---|
| AZ_TOL | 5.0 deg | build + runtime |
| TIME_TOL_MIN | 30 min | build + runtime |
| MAX_GAP_CYCLES | 15 | build only (max missed cycles to bridge a single match) |

Output Reference

tracks.json has two top-level keys:

| Key | Type | Description |
|---|---|---|
| metadata | dict | Station identity, data time range, totals, build history |
| tracks | dict | Map from track_id string to per-track record |

Each per-track record holds the satellite identity plus a list of epochs:

| Key | Type | Description |
|---|---|---|
| constellation | str | 'GPS', 'GLONASS', 'Galileo', or 'BeiDou' |
| sat | int | Satellite PRN |
| freq | int | Frequency code (1, 20, 5, 101, 201, 301, …) |
| rise | int | 1 for rising track, -1 for setting |
| epochs | list of dict | One entry per epoch of this track (see below) |

Each epoch ideally describes a region of time with a comparable measurement environment:

| Key | Type | Description |
|---|---|---|
| epoch_id | int | Index within this track’s epoch list (0..N-1) |
| epoch_type | str | 'active' or 'inactive' |
| start_time / end_time | str | ISO 8601 Z UTC time of first/last observed arc |
| anchor_time | str | ISO 8601 Z UTC; used as \(t_{\text{anchor}}\) |
| repeat_interval_d | float | Fitted \(T_{\text{repeat}}\) in days |
| az_avg_minel | float | Mean azimuth at minimum elevation (deg) |
| az_drift_rate | float | deg/day; absent or zero for non-BeiDou |
| ignored_ranges | list of [mjd_start, mjd_end] | QC masks on this epoch |
| n_arcs | int | Number of arcs within the epoch window (excluding ignored ranges) |
| n_qc_arcs | int | Only in vwc_tracks.json: arcs that also passed phase QC |
| apriori_RH / RH_std | float | Only in vwc_tracks.json: a priori reflector height and its std |
| duration_d | float | end_time - start_time in days |

tracks.json vs vwc_tracks.json: both share this schema; vwc_tracks.json is the filtered subset used by the VWC pipeline, containing only tracks at the requested frequencies and adding apriori_RH, RH_std, and n_qc_arcs per epoch.
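As an illustration of the schema, the sketch below walks a tracks-shaped dict and yields only the active epochs, with a helper that tests an MJD against ignored_ranges. Both helpers are hypothetical, the example dict is hand-written rather than real output, and treating the range endpoints as inclusive is an assumption.

```python
def active_epochs(tracks_json, constellation=None):
    """Yield (track_id, epoch) for every active epoch, optionally filtered."""
    for track_id, track in tracks_json["tracks"].items():
        if constellation is not None and track["constellation"] != constellation:
            continue
        for epoch in track["epochs"]:
            if epoch["epoch_type"] == "active":
                yield track_id, epoch

def is_ignored(epoch, mjd):
    """True if mjd falls inside any ignored range (endpoints assumed inclusive)."""
    return any(lo <= mjd <= hi for lo, hi in epoch.get("ignored_ranges", []))

# Hand-written, heavily trimmed example; real records carry more keys.
example = {
    "metadata": {},
    "tracks": {
        "16": {"constellation": "GPS", "sat": 15, "freq": 1, "rise": 1,
               "epochs": [
                   {"epoch_id": 0, "epoch_type": "inactive", "ignored_ranges": []},
                   {"epoch_id": 1, "epoch_type": "active",
                    "ignored_ranges": [[60300.0, 60310.0]]},
               ]},
        "352": {"constellation": "GLONASS", "sat": 103, "freq": 101, "rise": -1,
                "epochs": [
                    {"epoch_id": 0, "epoch_type": "active", "ignored_ranges": []},
                ]},
    },
}

for track_id, epoch in active_epochs(example):
    print(track_id, epoch["epoch_id"], epoch["ignored_ranges"])
```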

Example Code

Labelling arcs at runtime

Once tracks.json is built, pass it to extract_arcs_from_station via the track_file kwarg and the returned arcs come back already tagged:

from gnssrefl.extract_arcs import extract_arcs_from_station
import os

tracks_path = f"{os.environ['REFL_CODE']}/Files/mchl/tracks.json"

arcs = extract_arcs_from_station('mchl', 2024, 180, track_file=tracks_path)

for meta, data in arcs[:3]:
    print(f"sat {meta['sat']:3d} freq {meta['freq']:3d}  "
          f"track_id={meta['track_id']:4d}  epoch={meta['track_epoch']}  "
          f"az={meta['az_min_ele']:5.1f}  track_azim={meta['track_azim']}")
# sat  15 freq   1  track_id=  16  epoch=0  az=140.3  track_azim=140.1
# sat  15 freq   1  track_id=  17  epoch=0  az= 33.2  track_azim=33.0
# sat 103 freq 101  track_id= 352  epoch=0  az=153.7  track_azim=153.3

Tagging adds four keys to each arc’s metadata dict:

| Key | Type | Description |
|---|---|---|
| track_id | int | Matched track id, or -1 on no match |
| track_epoch | int | Matched epoch id, or -1 on no match |
| track_azim | float or None | az_avg_minel of the matched epoch |
| apriori_RH | float or None | Matched epoch’s a priori RH; only populated from vwc_tracks.json |

Point track_file at vwc_tracks.json instead to pick up apriori_RH. Internally the kwarg calls tracks.attach_track_id, which loads the JSON once, builds a (sat, freq) lookup via build_lookup_index, and matches each arc with lookup_arc. A module-level cache (tracks.TRACK_INDEX_CACHE) keyed by absolute path reuses prebuilt indexes across calls transparently, so repeated calls don’t reload the JSON. Cache entries are never invalidated; restart the process if the tracks file is changed on disk.
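The caching behaviour described above can be illustrated with a generic path-keyed cache. This is a simplified stand-in, not the code behind tracks.TRACK_INDEX_CACHE: the names are hypothetical and it caches the parsed JSON rather than a (sat, freq) index.

```python
import json
import os

_CACHE = {}   # absolute path -> parsed file contents

def load_cached(path):
    """Parse the file on first use, then serve the same object on later calls."""
    key = os.path.abspath(path)       # normalise so relative paths share an entry
    if key not in _CACHE:
        with open(key) as f:
            _CACHE[key] = json.load(f)
    return _CACHE[key]                # never re-read, even if the file changed
```

Because nothing compares the cached entry against the file on disk, an edited file is not picked up until the cache is emptied, which here means starting a new process.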

Loading all arcs in a tracks_json

Two entry points cover multi-day workflows that load all arcs from a tracks_json.

extract_arcs_from_tracks(tracks_json) is the robust version, and can bootstrap a tracks_json when gnssir results do not exist. Returns a flat list of (metadata, data) tuples in the same format as extract_arcs_from_station, tagged against the in-memory tracks_json (including any QC edits). Use when you need the full per-arc SNR payload (ele, snr, seconds).

from gnssrefl.extract_arcs import extract_arcs_from_tracks
from gnssrefl.tracks import load_tracks_json

tracks_json = load_tracks_json(tracks_path)
arcs = extract_arcs_from_tracks(tracks_json)  # [(meta, data), ...]

load_gnssir_results_from_tracks(tracks_json): fast summary read from results/ + failQC/ files. Returns a tagged DataFrame with columns mjd, azim, constellation, RH, match_T, track_id, track_epoch. Much faster because no SNR is read; requires a prior gnssir run so the sibling failQC/ files exist, and raises FileNotFoundError if any are missing.


df = load_gnssir_results_from_tracks(tracks_json)

Track Level Quality Control

The companion module tracks_qc provides the operations used to edit a tracks-shaped JSON. All edits mutate the in-memory dict and only become self-consistent once save_tracks recalculates statistics.

QC primitives

Low-level single-operation edits on a track or epoch.

  • split_epoch: subdivide one active epoch into two at a chosen MJD. Both halves inherit the original fit parameters; fresh values come from the save_tracks refit. Existing ignored_ranges are partitioned across the split.

  • merge_epochs: combine two adjacent active epochs. The window becomes the union, ignored_ranges are concatenated, and repeat_interval_d must match.

  • ignore_range / unignore_range: add or subtract a time window on an epoch’s ignored_ranges. Arcs inside any ignored range are excluded from the refit and the n_arcs / n_qc_arcs counts on save.

  • deactivate_epoch: flag an epoch inactive. The refit and stats passes skip it and downstream tools ignore it, but the window and history are preserved.

  • delete_track: remove a whole track from the JSON.

  • save_tracks: append a history entry, refit every active epoch via fit_segment, refresh every derived field (for vwc_tracks.json also apriori_RH, RH_std, n_qc_arcs), and write atomically.
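To make the split semantics concrete, here is a minimal sketch of splitting one epoch dict at an MJD. It is not the tracks_qc implementation: the function names are hypothetical, cutting a straddling ignored range in two is an assumed behaviour, and the ISO timestamp format is a guess at the style used in tracks.json.

```python
import copy
from datetime import datetime, timedelta, timezone

MJD0 = datetime(1858, 11, 17, tzinfo=timezone.utc)   # MJD 0

def mjd_to_iso(mjd):
    """Convert an MJD to an ISO 8601 UTC string with a trailing Z."""
    return (MJD0 + timedelta(days=mjd)).strftime("%Y-%m-%dT%H:%M:%SZ")

def split_epoch_sketch(epoch, split_mjd):
    """Return two epochs covering the original window on either side of split_mjd."""
    first, second = copy.deepcopy(epoch), copy.deepcopy(epoch)  # inherit fit params
    first["end_time"] = second["start_time"] = mjd_to_iso(split_mjd)
    first["ignored_ranges"], second["ignored_ranges"] = [], []
    for lo, hi in epoch.get("ignored_ranges", []):
        if hi <= split_mjd:
            first["ignored_ranges"].append([lo, hi])
        elif lo >= split_mjd:
            second["ignored_ranges"].append([lo, hi])
        else:
            # Assumed behaviour: a range that straddles the split is cut in two.
            first["ignored_ranges"].append([lo, split_mjd])
            second["ignored_ranges"].append([split_mjd, hi])
    second["epoch_id"] = epoch["epoch_id"] + 1
    return first, second
```

Both halves keep the parent's repeat_interval_d and az_avg_minel until a refit replaces them, which mirrors the inherit-then-refit order described for split_epoch and save_tracks.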

QC Functions

Composite policies built on the primitives. vwc_cl and vwc_input surface these as CLI flags; the flag is the main entry point.

  • -auto_removal T on vwc: drop tracks and epochs at the current frequency that did not pass the run’s QC, where “passed” means the track’s phase RMS was under -warning_value (default 5.5 deg). Bad epochs are flagged via deactivate_epoch; tracks whose active epochs are all bad go via delete_track. Tracks on other frequencies are untouched.
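The policy can be sketched directly on a tracks-shaped dict. This is an illustration only: the function name and the phase_rms mapping keyed by (track_id, epoch_id) are hypothetical, per-epoch RMS values are an assumption, and epochs with no RMS value are left untouched here. The real flag goes through deactivate_epoch and delete_track rather than editing the dict inline.

```python
def auto_removal_sketch(tracks_json, phase_rms, freq, warning_value=5.5):
    """Deactivate failing epochs at one frequency; delete tracks left with none."""
    deleted = []
    for track_id in list(tracks_json["tracks"]):      # copy keys: we may delete
        track = tracks_json["tracks"][track_id]
        if track["freq"] != freq:
            continue                                  # other frequencies untouched
        for epoch in track["epochs"]:
            if epoch["epoch_type"] != "active":
                continue
            rms = phase_rms.get((track_id, epoch["epoch_id"]))
            if rms is not None and rms >= warning_value:
                epoch["epoch_type"] = "inactive"      # stands in for deactivate_epoch
        if not any(e["epoch_type"] == "active" for e in track["epochs"]):
            del tracks_json["tracks"][track_id]       # stands in for delete_track
            deleted.append(track_id)
    return deleted
```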

Extra Notes

Legacy GPS-only path

Stations processed before the multi-GNSS refactor may still have per-frequency apriori_rh_{fr}.txt files from the legacy azimuth-matching workflow. The vwc pipeline (vwc_input/phase/vwc) retains this path behind the -legacy T flag (with a 2027-01-01 deprecation notice), and attach_legacy_apriori labels arcs under that scheme. Under the legacy path track_epoch is always 0 on match (legacy tracks have only one epoch each).

New stations should use the default multi-GNSS path. If tracks.json is missing but a legacy apriori_rh file exists, vwc_input prints a hint pointing at -legacy T.

Accuracy of the fitted model

For one satellite per constellation, the figure below plots the timing residual (blue, left y-axis) and azimuth residual (red, right y-axis) of every real arc evaluated against its track’s fitted model, across the 3-year window at mchl:

../_images/tracks_residuals.png
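A timing residual of this kind can be reproduced with a plain least-squares line through arc time versus cycle number. The sketch below is not fit_segment: the helper names are hypothetical and the data are synthetic, built with a repeat period 3 minutes longer than the nominal Galileo value to mimic a satellite in an anomalous orbit.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def timing_residuals_min(times_mjd, t_nominal_d):
    """Fit a per-track repeat period and return (T_repeat, t_anchor, residuals)."""
    t0 = times_mjd[0]
    # Label each arc with its cycle number using the nominal period.
    cycles = [round((t - t0) / t_nominal_d) for t in times_mjd]
    t_repeat, t_anchor = fit_line(cycles, times_mjd)
    resid = [(t - (t_anchor + n * t_repeat)) * 1440.0   # days -> minutes
             for t, n in zip(times_mjd, cycles)]
    return t_repeat, t_anchor, resid

# Synthetic arcs with a few missed cycles.
true_period = 9.97270 + 3.0 / 1440.0
times = [60000.0 + n * true_period for n in range(30) if n % 7 != 3]
t_repeat, t_anchor, resid = timing_residuals_min(times, 9.97270)
print(f"fitted period exceeds nominal by {(t_repeat - 9.97270) * 1440.0:.2f} min")
```

Labelling cycles with the nominal period only works while the accumulated drift stays well under half a period, which holds comfortably for this example.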

BeiDou orbital drift

BeiDou MEO is the only constellation whose ground tracks do not maintain a static azimuth over multi-year windows. J2 perturbation drives a westward drift of the ground track, so a constant-azimuth track model would accumulate several degrees of error per year. fit_segment detects constellation BeiDou and adds a linear az_drift_rate term to the azimuth model; the other three constellations carry az_drift_rate = 0.

The figure below shows BeiDou sat 325 track 7139 at mchl, 2023 to 2025. Red is the azimuth residual under a constant-azimuth model; blue is the residual after the fitted az_drift_rate term is applied.

../_images/tracks_beidou_drift.png
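The effect of the drift term can be sketched on synthetic data by comparing a constant-azimuth model with a linear one. The numbers below are invented for illustration (a drift of -0.01 deg/day with an arbitrary sign, one arc every 7 days) and the least-squares helper is a stand-in for whatever fit_segment does internally.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Synthetic track: azimuth at minimum elevation drifts linearly over ~3 years.
t_anchor = 60000.0
true_rate = -0.01                                    # deg/day, invented value
times = [t_anchor + 7.0 * k for k in range(156)]
az = [250.0 + true_rate * (t - t_anchor) for t in times]

dt = [t - t_anchor for t in times]
az_drift_rate, az_avg_minel = fit_line(dt, az)

mean_az = sum(az) / len(az)
const_resid = [a - mean_az for a in az]              # constant-azimuth model
drift_resid = [a - (az_avg_minel + az_drift_rate * d) for a, d in zip(az, dt)]

print(f"max |residual|, constant model: {max(abs(r) for r in const_resid):.2f} deg")
print(f"max |residual|, drift model:    {max(abs(r) for r in drift_resid):.2e} deg")
```

With this invented drift rate the constant model ends up more than 5 degrees off at the ends of the window, which is the size of AZ_TOL, while the linear model absorbs the trend.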