gnssrefl.phase_functions module
- gnssrefl.phase_functions.apply_vwc_leveling(vwc_values, tmin, years=None, doys=None, mjd=None, level_doys=None, polyorder=-99, **kwargs)
Apply global baseline leveling to time-averaged VWC estimates.
This is the FINAL leveling step applied to time-binned VWC values. Uses per-year polynomial fitting with dry season nodes for baseline calculation.
- Parameters:
vwc_values (array-like) – Time-averaged VWC values in percentage units (0-60 range) These should be AFTER per-track processing and time-averaging
tmin (float) – Minimum soil texture value (0.05 typical)
years (array-like, optional) – Year values (required, or auto-derived from mjd)
doys (array-like, optional) – Day of year values (required, or auto-derived from mjd)
mjd (array-like, optional) – Modified Julian Day values (alternative to years/doys, will be auto-converted)
level_doys (list of int, optional) – [start_doy, end_doy] for dry season baseline (required)
polyorder (int, optional) – Polynomial order override (default: -99 = auto)
**kwargs (dict) – Optional parameters: - station : str (for plot labels) - plot_debug : bool (generate diagnostic plots) - plt2screen : bool (show plots on screen) - extension : str (extension for subdirectory, e.g. ‘test1’ → ‘station/test1’) - fr : int (frequency code for file naming) - bin_hours : int (temporal binning for file naming) - bin_offset : int (temporal offset for file naming)
- Returns:
leveled_vwc (numpy.ndarray) – Leveled VWC values in decimal units (0.0-0.6 range)
info (dict) – Additional information about the leveling: - ‘method’: str - Always ‘polynomial’ - ‘nodes’: numpy.ndarray or None - Level nodes used - ‘baseline_points’: int or None - Number of baseline points used
Examples
- leveled, info = apply_vwc_leveling(vwc, tmin=0.05,
years=yrs, doys=dys, level_doys=[152, 244])
- gnssrefl.phase_functions.apriori_file_exist(station, fr, extension='')
reads in the a priori RH results
- Parameters:
station (string) – station name
fr (integer) – frequency
extension (str, optional) – analysis extension for finding files
- Return type:
boolean as to whether the apriori file exists
- gnssrefl.phase_functions.calculate_avg_phase(vxyz, bin_hours=24, bin_offset=0, minvalperbin=10)
Calculate time-averaged phase statistics from track-level data
This is a pure calculation function that bins vxyz data by time and computes statistics without any file I/O.
- Parameters:
vxyz (numpy array) –
Track-level phase observations (16 columns) Columns: [year, doy, phase, azimuth, sat, rh, norm_amp_LSP, norm_amp_LS,
hour, raw_LSP, raw_LS, apriori_RH, quadrant, delRH, vegMask, MJD]
bin_hours (int, optional) – Time bin size in hours (default: 24 for daily)
bin_offset (int, optional) – Bin timing offset in hours (default: 0)
minvalperbin (int, optional) – Minimum observations required per bin (default: 10)
- Returns:
avg_phase – Averaged phase data with columns: [Year, DOY, Phase, PhaseSig, NormAmp, FracYear, Month, Day, [BinStart]] For daily: 8 columns (no BinStart) For subdaily: 9 columns (with BinStart)
- Return type:
numpy array (N x 8 or N x 9)
- gnssrefl.phase_functions.get_bin_schedule_info(bin_hours, bin_offset=0)
Helper function to generate bin schedule information strings
- Parameters:
bin_hours (int) – Time bin size in hours
bin_offset (int, optional) – Bin timing offset in hours. Default is 0
- Returns:
Bin schedule description line
- Return type:
str
- gnssrefl.phase_functions.get_temporal_suffix(fr, bin_hours=24, bin_offset=0, include_time=True)
Generate consistent suffix for all output files with temporal resolution and offset
- Parameters:
fr (int) – Frequency (1, 5, 20, etc.)
bin_hours (int, optional) – Time bin size in hours. Default is 24 (daily)
bin_offset (int, optional) – Bin timing offset in hours. Default is 0
include_time (bool, optional) – Whether to include bin timing suffix (e.g. _24hr+0). Default is True. Set to False for track files which contain per-observation data.
- Returns:
Suffix string like “_L1_6hr+0”, “_L2_24hr+1”, etc. Or just “_L1”, “_L2” if include_time=False.
- Return type:
str
- gnssrefl.phase_functions.get_vwc_frequency(station: str, extension: str, fr_cmd: str | None = None, station_config: dict | None = None)
Determines the frequency to use for VWC workflows. Priority is: command line -> json file -> default (20).
- Parameters:
station (str) – 4-character station name.
extension (str) – Analysis extension name.
fr_cmd (str, optional) – Frequency provided from the command line (e.g., ‘1’, ‘20’), by default None.
station_config (dict, optional) – Pre-loaded json dict. Pass to skip re-reading the json file.
- Returns:
A list of frequencies to be used for analysis.
- Return type:
list
- gnssrefl.phase_functions.help_debug(rt, xdir, station)
Takes the input of phase files, read by other functions, and writes out a file to help with debugging (comparion of matlab and python codes)
- rtnumpy array of floats
contents of the phase files stored in a numpy array
- xdirstr
where the otuput should be written
- stationstr
name of the station
- gnssrefl.phase_functions.kinda_qc(satellite, track_avg_az, y, t, new_phase, avg_date, avg_phase, warning_value, remove_bad_tracks, avg_exist)
- Parameters:
satellite (int) – satellite number
track_avg_az (float) – yearly average azimuth of the track, degrees
y (numpy array of ints) – year
t (numpy array of ints) – day of year
new_phase (numpy array of floats) – phase values for a given satellite track
avg_date (numpy array of floats) – y + doy/365.25
avg_phase (numpy array of floats) – average phase, in degrees
warning_value (float) – phase noise value
remove_bad_tracks (bool) – whether you write out new tracks with bad ones removed
avg_exist (bool) – whether you have previous solution to compare to
- Returns:
keepit – whether this track should be kept
- Return type:
bool
- gnssrefl.phase_functions.load_avg_phase(station, fr, bin_hours=24, extension='', bin_offset=0)
loads a previously computed averaged phase solution with matching temporal resolution. this is NOT the same as the multi-track phase results. This file is now stored in station subdirectory in $REFL_CODE/Files/
- Parameters:
station (str) – 4 character station ID, lowercase
fr (int) – frequency
bin_hours (int, optional) – time bin size in hours (1,2,3,4,6,8,12,24). Default is 24 (daily). Only compares against files with exact same temporal resolution.
extension (str, optional) – analysis extension for finding files, default is ‘’
- Returns:
avg_exist (bool) – whether the necessary file exists
avg_date (list of floats) – fractional year, i.e. year + doy/365.25
avg_phase (list of floats) – average phase for the given temporal resolution
- gnssrefl.phase_functions.load_phase_filter_out_snow(station, year1, year2, fr, snowmask, extension='')
Load all phase data and attempt to remove outliers from snow if snowmask provided.
- Parameters:
station (str) – four character station name
year1 (int) – starting year
year2 (int) – ending year
fr (int) – frequency, i.e. 1 or 20
snowmask (str) – name/location of the snow mask file None if this value is not going to be used
- Returns:
dataexist (bool) – whether phase data were found
year (numpy array of int) – calendar years
doy (numpy array of int) – day of year
hr (numpy array of floats) – UTC hour of measurement
ph (numpy array of floats) – LS phase estimates
azdata (numpy array of floats) – average azimuth, degrees
ssat (numpy array of int) – satellite number
rh (numpy array of floats) – reflector height, meters
amp_lsp (numpy array of floats) – lomb scargle periodogram amplitude
amp_ls (numpy array of floats) – least squares amplitude
ap_rh (numpy array of floats) – apriori rh
results_trans (numpy array) – all phase results concatenated into numpy array plus column for quadrant and unwrapped phase
- gnssrefl.phase_functions.load_sat_phase(station, year, year_end, freq, extension='')
Picks up the phase estimates from local (REFL_CODE) results section and returns most of the information from those files
- Parameters:
station (str) – four character station name
year (integer) – beginning year
year_end (integer) – ending year
freq (integer) – GPS frequency (1,20 allowed)
- Returns:
dataexist (bool) – whether data found?
results (numpy array of floats) – basically one variable with everything in the original columns from the daily phase files
- gnssrefl.phase_functions.low_pct(amp, basepercent)
emulated amp_normK code from PBO H2O inputs are the amplitudes and a percentage used to define the bottom level. returns normalized amplitudes
this is meant to be used by individual tracks (I think) in this case they are the top values, not the bottom … ugh
- gnssrefl.phase_functions.make_snow_filter(station, medfilter, ReqTracks, year1, year2)
Runs daily_avg code to make a snow mask file. This is so you have some idea of when the soil moisture products are contaminated by snow. Make a file with these years and doys saved. The user can edit if they feel the suggestsions are poor (i.e. days in the summer might show up as “snow”)
If snow mask file exists, it does not overwrite it.
- Parameters:
station (str) – 4 ch station name
medfilter (float) – how much you allow the individual tracks to deviate from the daily median (meters)
ReqTracks (int) – number of tracks to compute trustworthy daily average
year1 (int) – starting year
year2 (int) – ending year
- Returns:
snowmask_exists (bool) – whether file was created
snow_file (str) – name of the snow mask file
Creates output file into a file $REFL_CODE/Files/{ssss}/snowmask_{ssss}.txt
where ssss is the station name
- gnssrefl.phase_functions.mjd_to_file_columns(mjd_list)
Convert MJD values to year, doy, month, day for file writing.
- Parameters:
mjd_list (list or array) – Modified Julian Day values
- Returns:
years (np.ndarray) – Year values
doys (np.ndarray) – Day of year values
months (np.ndarray) – Month values
days (np.ndarray) – Day of month values
- gnssrefl.phase_functions.normAmp(amp, basepercent)
emulated amp_normK code from PBO H2O inputs are the amplitudes and a percentage used to define the bottom level. returns normalized amplitudes this is meant to be used by individual tracks (I think) in this case they are the top values, not the bottom … ugh
- gnssrefl.phase_functions.old_quad(azim)
calculates oldstyle quadrants from PBO H2O
- Parameters:
azim (float) – azimuth, dgrees
q (int) – old quadrant system used in pboh2o
- gnssrefl.phase_functions.phase_tracks(station, year, doy, snr_type, fr_list, station_config, extension='')
This does the main work of estimating phase and other parameters from the SNR files it uses tracks that were predefined by the apriori.py code
- Parameters:
name (station) – 4 char id, lowercase
year (int) – calendar year
doy (int) – day of year
snr_type (int) – SNR file extension (i.e. 99, 66 etc)
fr_list (list of integers) – frequency, [1], [20] or [1,20]
station_config (dict) – station analysis parameters (from json, with CLI overrides applied)
extension (str, optional) – analysis extension for json file. Default is ‘’.
track. (Only GPS frequencies are allowed because this relies on the repeating ground) –
- gnssrefl.phase_functions.plot_baseline_leveling(station, years, doys, vwc_values, new_level, nodes, plot_path, tmin, level_doys, plt2screen=True)
Plot baseline leveling results - before/after style like vegetation correction
- gnssrefl.phase_functions.prepare_track_dir(station, extension='')
Create (or clear) the individual_tracks directory for a station.
- Parameters:
station (str) – Station name
extension (str) – Optional subdirectory extension
- Returns:
Absolute path to the individual_tracks directory
- Return type:
str
- gnssrefl.phase_functions.read_apriori_rh(station, fr, extension='')
read the track dependent a priori reflector heights needed for phase & thus soil moisture.
- Parameters:
station (str) – four character ID, lowercase
fr (int) – frequency (e.g. 1,20)
- Returns:
results – column 1 is just a number (1,2,3,4, etc)
column 2 is RH in meters
column 3 is satellite number
column 4 is azimuth of the track (degrees)
column 5 is number of values used in average
column 6 is minimum azimuth degrees for the quadrant
column 7 is maximum azimuth degrees for the quadrant
- Return type:
numpy array
- gnssrefl.phase_functions.rename_vals(year_sat_phase, doy, hr, phase, azdata, ssat, amp_lsp, amp_ls, rh, ap_rh, ii)
this is just trying to clean up vwc.py send indices ii - and return renamed variables.
calculates and also returns mjd
- Parameters:
year_sat_sat –
doy (numpy array) – day of year (int)
hr (numpy array) – floats (UTC hour)
phase (numpy array of floats) – estimated phase (deg)
azdata (numpy array of floats) – azimuth (deg)
ssat (numpy array of int) – satellite numbers
amp_lsp (numpy array of floats) – lomb scargle periodogram amplitude
amp_ls (numpy array of floats) – least squares amplitude
rh (numpy array of floats) – reflector heights (m)
ap_rh – apriori reflector height (m)
ii –
- Returns:
y (numpy array of int) – year
t (numpy array of int) – day of year
h (numpy array of floats) – hour of the day (UTC)
x (numpy array of floats) – phase, degrees
azd (numpy array of floats) – azimuth for the track
s (numpy array of int)
amps_lsp (numpy array of floats) – LSP amplitude
amps_ls (numpy array of floats) – least squares amplitude
rhs (numpy array of floats) – estimated RH (m)
ap_rhs (numpy array of floats) – a priori RH (m)
mjds (numpy array of int) – Modified julian day
- gnssrefl.phase_functions.save_vwc_plot(fig, pngfile)
- Parameters:
fig (matplotlib figure) – the figure definition you define when you open a figure
pngfile (str) – name of the png file to be saved
- gnssrefl.phase_functions.set_parameters(station, level_doys, minvalperday, tmin, tmax, min_req_pts_track, fr, year, year_end, plt, auto_removal, warning_value, extension, bin_hours=None, minvalperbin=None, bin_offset=None)
the goal of this code is to pick up the relevant parameters used in vwc.
- Parameters:
station (str) – 4 character station name
level_doys (list of int) – option doy inputs for defining leveling period. can be in the json or on the command line
minvalperday (int) – number of phase values required each day
tmin (float) – min soil texture
tmax (float) – max soil texture
min_req_pts_track (int) – minimum number of phase values per year per track
freq (int) – frequency to use (1,20 allowed)
year (int) – first year to analyze
year_end (int) – last year to analyze
plt (bool) – whether you want plots to come to the screen
auto_removal (bool) – whther tracks should be removed when they fail QC
warning_value (float) – phase RMS needed to trigger warning
extension (str) – extension used for inputs and outputs (i.e. test1 in ssss.test1.json)
bin_hours – needs documentation
? (bin_offset) – needs documentation
? – needs documentation
- Returns:
minvalperday (int) – number of phase values required to create a daily average
tmin (float) – min soil texture
tmax (float) – max soil texture
min_req_pts_track (int) – minimum number of phase values per year per track
freq (int) – frequency to use (1,20 allowed)
year_end (int) – last year to analyze
plt (bool) – whether you want plots to come to the screen
auto_removal (bool) – whther tracks should be removed when they fail QC
warning_value (float) – phase RMS needed to trigger warning
extension (str) – extra name for the json file
return_level_doys (list) – start and end day of years for leveling
vegetation_model (int) – vegetation correction model: 1 (simple), 2 (advanced)
- gnssrefl.phase_functions.subdaily_phase_plot(station, fr, vxyz, xdir, subdir, hires_figs, bin_hours=24, bin_offset=0, minvalperbin=10, plt2screen=True)
makes a plot of daily or subdaily averaged phase for vwc code
- Parameters:
station (str) – 4 char station name
fr (int) – frequency of signal
vxyz (numpy array) – Track-level phase data
xdir (str) – location of the results (environment variable REFL_CODE)
subdir (str) – subdirectory in Files
hires_figs (bool) – whether you want eps instead of png files
bin_hours (int) – time bin size in hours (default: 24 for daily)
bin_offset (int) – bin timing offset in hours (default: 0)
minvalperbin (int) – minimum observations required per bin (default: 10)
plt2screen (bool) – whether to display plot to screen (default: True)
- gnssrefl.phase_functions.test_func(x, a, b, rh_apriori)
This is least squares for estimating a sine wave given a fixed frequency, freqLS
- gnssrefl.phase_functions.test_func_new(x, a, b, rh_apriori, freq)
This is least squares for estimating a sine wave given a fixed frequency, freqLS now freq is input so it is not hardwired for L2
- Parameters:
x (numpy array of floats) – sine(elevation angle) I think
a (float) – amplitude - estimated
b (float) – phase - estimated
rh_apriori (float) – reflector height (m)
freq (int) – frequency
- gnssrefl.phase_functions.vwc_plot(station, t_datetime, vwcdata, plot_path, circles, plt2screen=True)
makes a plot of volumetric water content
- Parameters:
station (string) – 4 ch station name
t_datetime (datetime) – observation times for measurements
vwcdata (numpy array of floats (I think)) – volumetric water content
plot_path (Saves a plot to) – full name of the plot file
circles (boolean) – circles in the plot. default is a line (really .-)
plot_path –
- gnssrefl.phase_functions.write_all_phase(v, fname)
writes out preliminary phase values and other metrics for advanced vegetation option. This is in the hope that it can be used in clara chew’s dissertation algorithm.
File is written to $REFL_CODE/Files/station/station_all_phase.txt I think
- Parameters:
v (numpy of floats as defined in vwc_cl) – TBD year, doy, phase, azimuth, satellite number estimated RH, LSP amplitude, LS amplitude, UTC hours raw LSP amp, raw LS amp
fname (str) – name of the output file
filestatus (int) – 1, open the file 2, write to file (well, really any value)
rhtrack (float) – apriori reflector height for the given track, meters
- Returns:
allrh
- Return type:
fileID
- gnssrefl.phase_functions.write_apriori_rh(filepath, tracks, station, year, tmin, tmax)
Write apriori RH track list file. Used by both vwc_input and vwc auto_removal.
A track is defined by satellite number and the azimuth at minimum elevation angle (az_min_ele) of each arc. Arcs are clustered into tracks when their az_min_ele values fall within 3 degrees of the cluster center. The track’s mean azimuth (track_avg_az) is the circular mean of az_min_ele over e.g. a year of arcs in the cluster.
- Parameters:
filepath (path-like) –
tracks (list) – each element is [track_num, rh, satellite, track_avg_az, nvals, az_min, az_max]
station (str) –
year (int) –
tmin (float) – minimum soil texture (header metadata only)
tmax (float) – maximum soil texture (header metadata only)
- gnssrefl.phase_functions.write_avg_phase(station, fr, avg_phase, extension='', bin_hours=24, bin_offset=0)
writes averaged phase results to a text file
- Parameters:
station (string) –
fr (int) – frequency
avg_phase (numpy array) – output of calculate_avg_phase columns: [Year, DOY, Phase, PhaseSig, NormAmp, FracYear, Month, Day, [BinStart]]
extension (str, optional) – analysis extension for directory organization
bin_hours (int, optional) – Time bin size in hours (default: 24). Used for filename suffix and header.
bin_offset (int, optional) – Bin timing offset in hours (default: 0). Used for filename suffix and header.
- gnssrefl.phase_functions.write_out_raw_phase(v, fname)
write daily phase values used in vwc to a new consolidated file I added columns for quadrant and unwrapped phase
- Parameters:
v (numpy array) – phase results read for multiple years. could be with snow filter applied
fname (str) – filename for output
- Returns:
newv – original variable v with columns added for quadrant (1-4) and unwrapped phase
- Return type:
numpy array
- gnssrefl.phase_functions.write_phase_for_advanced(filename, vxyz)
Writes out a file of interim phase results for advanced models developed by Clara Chew
File now written to $REFL_CODE/Files/<station>/<station>_all_phase.txt
September 8, 2025: Added a new column for MJD
- Parameters:
filename (str) – name for output file
vxyz (numpy array of floats) – as defined in vwc_cl.py
- gnssrefl.phase_functions.write_rolling_vwc_output(station, vwc_data, fr, bin_hours, extension='', vegetation_model=2)
Write hourly rolling VWC output file.
This writes a combined file with all hourly rolling measurements (e.g., bins starting at 0:00, 1:00, 2:00, etc.) sorted chronologically.
- Parameters:
station (str) – 4-character station name
vwc_data (dict) – Must contain: - ‘mjd’: Modified Julian Day values - ‘vwc’: VWC values (already leveled) - ‘datetime’: datetime objects - ‘bin_starts’: list of bin start hours
fr (int) – Frequency code (1=L1, 5=L5, 20=L2C)
bin_hours (int) – Time bin size in hours (e.g., 6 for 6-hour windows)
extension (str, optional) – Extension for file naming (default: ‘’)
vegetation_model (int, optional) – 1=simple, 2=advanced (for header documentation, default: 2)
- gnssrefl.phase_functions.write_track_file(track_dir, station, year, sat_num, avg_az, rows, fr, extra_headers=None)
Write a single track file with standard header and row format.
- Parameters:
track_dir (str) – Directory to write the track file in
station (str) – Station name
year (int) – Year for the track
sat_num (int) – Satellite number
avg_az (float) – Track average azimuth (degrees)
rows (numpy.ndarray) – (N, 17) array of track data columns
fr (int) – Frequency code
extra_headers (list of str, optional) – Additional header lines inserted before the column legend
- gnssrefl.phase_functions.write_vwc_output(station, vwc_data, year, fr, bin_hours, bin_offset, extension='', vegetation_model=1)
Write VWC output file in standard gnssrefl format. Works for both vegetation models.
- Parameters:
station (str) – 4-character station name
vwc_data (dict) – Must contain: - ‘mjd’: Modified Julian Day values - ‘vwc’: VWC values - ‘datetime’: datetime objects - ‘bin_starts’: list of bin start hours (subdaily) or empty list (daily)
year (int) – Year for file naming
fr (int) – Frequency code (1=L1, 5=L5, 20=L2C)
bin_hours (int) – Time bin size in hours (default: 24)
bin_offset (int) – Bin timing offset in hours (default: 0)
extension (str, optional) – Extension for file naming (default: ‘’)
vegetation_model (int, optional) – 1=simple, 2=advanced (for header documentation, default: 1)