gnssrefl.phase_functions module

gnssrefl.phase_functions.apply_vwc_leveling(vwc_values, tmin, years=None, doys=None, mjd=None, level_doys=None, polyorder=-99, **kwargs)

Apply global baseline leveling to time-averaged VWC estimates.

This is the FINAL leveling step applied to time-binned VWC values. Uses per-year polynomial fitting with dry season nodes for baseline calculation.

Parameters:
  • vwc_values (array-like) – Time-averaged VWC values in percentage units (0-60 range) These should be AFTER per-track processing and time-averaging

  • tmin (float) – Minimum soil texture value (0.05 typical)

  • years (array-like, optional) – Year values (required, or auto-derived from mjd)

  • doys (array-like, optional) – Day of year values (required, or auto-derived from mjd)

  • mjd (array-like, optional) – Modified Julian Day values (alternative to years/doys, will be auto-converted)

  • level_doys (list of int, optional) – [start_doy, end_doy] for dry season baseline (required)

  • polyorder (int, optional) – Polynomial order override (default: -99 = auto)

  • **kwargs (dict) – Optional parameters: - station : str (for plot labels) - plot_debug : bool (generate diagnostic plots) - plt2screen : bool (show plots on screen) - extension : str (extension for subdirectory, e.g. ‘test1’ → ‘station/test1’) - fr : int (frequency code for file naming) - bin_hours : int (temporal binning for file naming) - bin_offset : int (temporal offset for file naming)

Returns:

  • leveled_vwc (numpy.ndarray) – Leveled VWC values in decimal units (0.0-0.6 range)

  • info (dict) – Additional information about the leveling: - ‘method’: str - Always ‘polynomial’ - ‘nodes’: numpy.ndarray or None - Level nodes used - ‘baseline_points’: int or None - Number of baseline points used

Examples

leveled, info = apply_vwc_leveling(vwc, tmin=0.05,

years=yrs, doys=dys, level_doys=[152, 244])

gnssrefl.phase_functions.apriori_file_exist(station, fr, extension='')

reads in the a priori RH results

Parameters:
  • station (string) – station name

  • fr (integer) – frequency

  • extension (str, optional) – analysis extension for finding files

Return type:

boolean as to whether the apriori file exists

gnssrefl.phase_functions.calculate_avg_phase(vxyz, bin_hours=24, bin_offset=0, minvalperbin=10)

Calculate time-averaged phase statistics from track-level data

This is a pure calculation function that bins vxyz data by time and computes statistics without any file I/O.

Parameters:
  • vxyz (numpy array) –

    Track-level phase observations (16 columns) Columns: [year, doy, phase, azimuth, sat, rh, norm_amp_LSP, norm_amp_LS,

    hour, raw_LSP, raw_LS, apriori_RH, quadrant, delRH, vegMask, MJD]

  • bin_hours (int, optional) – Time bin size in hours (default: 24 for daily)

  • bin_offset (int, optional) – Bin timing offset in hours (default: 0)

  • minvalperbin (int, optional) – Minimum observations required per bin (default: 10)

Returns:

avg_phase – Averaged phase data with columns: [Year, DOY, Phase, PhaseSig, NormAmp, FracYear, Month, Day, [BinStart]] For daily: 8 columns (no BinStart) For subdaily: 9 columns (with BinStart)

Return type:

numpy array (N x 8 or N x 9)

gnssrefl.phase_functions.get_bin_schedule_info(bin_hours, bin_offset=0)

Helper function to generate bin schedule information strings

Parameters:
  • bin_hours (int) – Time bin size in hours

  • bin_offset (int, optional) – Bin timing offset in hours. Default is 0

Returns:

Bin schedule description line

Return type:

str

gnssrefl.phase_functions.get_temporal_suffix(fr, bin_hours=24, bin_offset=0, include_time=True)

Generate consistent suffix for all output files with temporal resolution and offset

Parameters:
  • fr (int) – Frequency (1, 5, 20, etc.)

  • bin_hours (int, optional) – Time bin size in hours. Default is 24 (daily)

  • bin_offset (int, optional) – Bin timing offset in hours. Default is 0

  • include_time (bool, optional) – Whether to include bin timing suffix (e.g. _24hr+0). Default is True. Set to False for track files which contain per-observation data.

Returns:

Suffix string like “_L1_6hr+0”, “_L2_24hr+1”, etc. Or just “_L1”, “_L2” if include_time=False.

Return type:

str

gnssrefl.phase_functions.get_vwc_frequency(station: str, extension: str, fr_cmd: str | None = None, station_config: dict | None = None)

Determines the frequency to use for VWC workflows. Priority is: command line -> json file -> default (20).

Parameters:
  • station (str) – 4-character station name.

  • extension (str) – Analysis extension name.

  • fr_cmd (str, optional) – Frequency provided from the command line (e.g., ‘1’, ‘20’), by default None.

  • station_config (dict, optional) – Pre-loaded json dict. Pass to skip re-reading the json file.

Returns:

A list of frequencies to be used for analysis.

Return type:

list

gnssrefl.phase_functions.help_debug(rt, xdir, station)

Takes the input of phase files, read by other functions, and writes out a file to help with debugging (comparion of matlab and python codes)

rtnumpy array of floats

contents of the phase files stored in a numpy array

xdirstr

where the otuput should be written

stationstr

name of the station

gnssrefl.phase_functions.kinda_qc(satellite, track_avg_az, y, t, new_phase, avg_date, avg_phase, warning_value, remove_bad_tracks, avg_exist)
Parameters:
  • satellite (int) – satellite number

  • track_avg_az (float) – yearly average azimuth of the track, degrees

  • y (numpy array of ints) – year

  • t (numpy array of ints) – day of year

  • new_phase (numpy array of floats) – phase values for a given satellite track

  • avg_date (numpy array of floats) – y + doy/365.25

  • avg_phase (numpy array of floats) – average phase, in degrees

  • warning_value (float) – phase noise value

  • remove_bad_tracks (bool) – whether you write out new tracks with bad ones removed

  • avg_exist (bool) – whether you have previous solution to compare to

Returns:

keepit – whether this track should be kept

Return type:

bool

gnssrefl.phase_functions.load_avg_phase(station, fr, bin_hours=24, extension='', bin_offset=0)

loads a previously computed averaged phase solution with matching temporal resolution. this is NOT the same as the multi-track phase results. This file is now stored in station subdirectory in $REFL_CODE/Files/

Parameters:
  • station (str) – 4 character station ID, lowercase

  • fr (int) – frequency

  • bin_hours (int, optional) – time bin size in hours (1,2,3,4,6,8,12,24). Default is 24 (daily). Only compares against files with exact same temporal resolution.

  • extension (str, optional) – analysis extension for finding files, default is ‘’

Returns:

  • avg_exist (bool) – whether the necessary file exists

  • avg_date (list of floats) – fractional year, i.e. year + doy/365.25

  • avg_phase (list of floats) – average phase for the given temporal resolution

gnssrefl.phase_functions.load_phase_filter_out_snow(station, year1, year2, fr, snowmask, extension='')

Load all phase data and attempt to remove outliers from snow if snowmask provided.

Parameters:
  • station (str) – four character station name

  • year1 (int) – starting year

  • year2 (int) – ending year

  • fr (int) – frequency, i.e. 1 or 20

  • snowmask (str) – name/location of the snow mask file None if this value is not going to be used

Returns:

  • dataexist (bool) – whether phase data were found

  • year (numpy array of int) – calendar years

  • doy (numpy array of int) – day of year

  • hr (numpy array of floats) – UTC hour of measurement

  • ph (numpy array of floats) – LS phase estimates

  • azdata (numpy array of floats) – average azimuth, degrees

  • ssat (numpy array of int) – satellite number

  • rh (numpy array of floats) – reflector height, meters

  • amp_lsp (numpy array of floats) – lomb scargle periodogram amplitude

  • amp_ls (numpy array of floats) – least squares amplitude

  • ap_rh (numpy array of floats) – apriori rh

  • results_trans (numpy array) – all phase results concatenated into numpy array plus column for quadrant and unwrapped phase

gnssrefl.phase_functions.load_sat_phase(station, year, year_end, freq, extension='')

Picks up the phase estimates from local (REFL_CODE) results section and returns most of the information from those files

Parameters:
  • station (str) – four character station name

  • year (integer) – beginning year

  • year_end (integer) – ending year

  • freq (integer) – GPS frequency (1,20 allowed)

Returns:

  • dataexist (bool) – whether data found?

  • results (numpy array of floats) – basically one variable with everything in the original columns from the daily phase files

gnssrefl.phase_functions.low_pct(amp, basepercent)

emulated amp_normK code from PBO H2O inputs are the amplitudes and a percentage used to define the bottom level. returns normalized amplitudes

this is meant to be used by individual tracks (I think) in this case they are the top values, not the bottom … ugh

gnssrefl.phase_functions.make_snow_filter(station, medfilter, ReqTracks, year1, year2)

Runs daily_avg code to make a snow mask file. This is so you have some idea of when the soil moisture products are contaminated by snow. Make a file with these years and doys saved. The user can edit if they feel the suggestsions are poor (i.e. days in the summer might show up as “snow”)

If snow mask file exists, it does not overwrite it.

Parameters:
  • station (str) – 4 ch station name

  • medfilter (float) – how much you allow the individual tracks to deviate from the daily median (meters)

  • ReqTracks (int) – number of tracks to compute trustworthy daily average

  • year1 (int) – starting year

  • year2 (int) – ending year

Returns:

  • snowmask_exists (bool) – whether file was created

  • snow_file (str) – name of the snow mask file

  • Creates output file into a file $REFL_CODE/Files/{ssss}/snowmask_{ssss}.txt

  • where ssss is the station name

gnssrefl.phase_functions.mjd_to_file_columns(mjd_list)

Convert MJD values to year, doy, month, day for file writing.

Parameters:

mjd_list (list or array) – Modified Julian Day values

Returns:

  • years (np.ndarray) – Year values

  • doys (np.ndarray) – Day of year values

  • months (np.ndarray) – Month values

  • days (np.ndarray) – Day of month values

gnssrefl.phase_functions.normAmp(amp, basepercent)

emulated amp_normK code from PBO H2O inputs are the amplitudes and a percentage used to define the bottom level. returns normalized amplitudes this is meant to be used by individual tracks (I think) in this case they are the top values, not the bottom … ugh

gnssrefl.phase_functions.old_quad(azim)

calculates oldstyle quadrants from PBO H2O

Parameters:
  • azim (float) – azimuth, dgrees

  • q (int) – old quadrant system used in pboh2o

gnssrefl.phase_functions.phase_tracks(station, year, doy, snr_type, fr_list, station_config, extension='')

This does the main work of estimating phase and other parameters from the SNR files it uses tracks that were predefined by the apriori.py code

Parameters:
  • name (station) – 4 char id, lowercase

  • year (int) – calendar year

  • doy (int) – day of year

  • snr_type (int) – SNR file extension (i.e. 99, 66 etc)

  • fr_list (list of integers) – frequency, [1], [20] or [1,20]

  • station_config (dict) – station analysis parameters (from json, with CLI overrides applied)

  • extension (str, optional) – analysis extension for json file. Default is ‘’.

  • track. (Only GPS frequencies are allowed because this relies on the repeating ground) –

gnssrefl.phase_functions.plot_baseline_leveling(station, years, doys, vwc_values, new_level, nodes, plot_path, tmin, level_doys, plt2screen=True)

Plot baseline leveling results - before/after style like vegetation correction

gnssrefl.phase_functions.prepare_track_dir(station, extension='')

Create (or clear) the individual_tracks directory for a station.

Parameters:
  • station (str) – Station name

  • extension (str) – Optional subdirectory extension

Returns:

Absolute path to the individual_tracks directory

Return type:

str

gnssrefl.phase_functions.read_apriori_rh(station, fr, extension='')

read the track dependent a priori reflector heights needed for phase & thus soil moisture.

Parameters:
  • station (str) – four character ID, lowercase

  • fr (int) – frequency (e.g. 1,20)

Returns:

results – column 1 is just a number (1,2,3,4, etc)

column 2 is RH in meters

column 3 is satellite number

column 4 is azimuth of the track (degrees)

column 5 is number of values used in average

column 6 is minimum azimuth degrees for the quadrant

column 7 is maximum azimuth degrees for the quadrant

Return type:

numpy array

gnssrefl.phase_functions.rename_vals(year_sat_phase, doy, hr, phase, azdata, ssat, amp_lsp, amp_ls, rh, ap_rh, ii)

this is just trying to clean up vwc.py send indices ii - and return renamed variables.

calculates and also returns mjd

Parameters:
  • year_sat_sat

  • doy (numpy array) – day of year (int)

  • hr (numpy array) – floats (UTC hour)

  • phase (numpy array of floats) – estimated phase (deg)

  • azdata (numpy array of floats) – azimuth (deg)

  • ssat (numpy array of int) – satellite numbers

  • amp_lsp (numpy array of floats) – lomb scargle periodogram amplitude

  • amp_ls (numpy array of floats) – least squares amplitude

  • rh (numpy array of floats) – reflector heights (m)

  • ap_rh – apriori reflector height (m)

  • ii

Returns:

  • y (numpy array of int) – year

  • t (numpy array of int) – day of year

  • h (numpy array of floats) – hour of the day (UTC)

  • x (numpy array of floats) – phase, degrees

  • azd (numpy array of floats) – azimuth for the track

  • s (numpy array of int)

  • amps_lsp (numpy array of floats) – LSP amplitude

  • amps_ls (numpy array of floats) – least squares amplitude

  • rhs (numpy array of floats) – estimated RH (m)

  • ap_rhs (numpy array of floats) – a priori RH (m)

  • mjds (numpy array of int) – Modified julian day

gnssrefl.phase_functions.save_vwc_plot(fig, pngfile)
Parameters:
  • fig (matplotlib figure) – the figure definition you define when you open a figure

  • pngfile (str) – name of the png file to be saved

gnssrefl.phase_functions.set_parameters(station, level_doys, minvalperday, tmin, tmax, min_req_pts_track, fr, year, year_end, plt, auto_removal, warning_value, extension, bin_hours=None, minvalperbin=None, bin_offset=None)

the goal of this code is to pick up the relevant parameters used in vwc.

Parameters:
  • station (str) – 4 character station name

  • level_doys (list of int) – option doy inputs for defining leveling period. can be in the json or on the command line

  • minvalperday (int) – number of phase values required each day

  • tmin (float) – min soil texture

  • tmax (float) – max soil texture

  • min_req_pts_track (int) – minimum number of phase values per year per track

  • freq (int) – frequency to use (1,20 allowed)

  • year (int) – first year to analyze

  • year_end (int) – last year to analyze

  • plt (bool) – whether you want plots to come to the screen

  • auto_removal (bool) – whther tracks should be removed when they fail QC

  • warning_value (float) – phase RMS needed to trigger warning

  • extension (str) – extension used for inputs and outputs (i.e. test1 in ssss.test1.json)

  • bin_hours – needs documentation

  • ? (bin_offset) – needs documentation

  • ? – needs documentation

Returns:

  • minvalperday (int) – number of phase values required to create a daily average

  • tmin (float) – min soil texture

  • tmax (float) – max soil texture

  • min_req_pts_track (int) – minimum number of phase values per year per track

  • freq (int) – frequency to use (1,20 allowed)

  • year_end (int) – last year to analyze

  • plt (bool) – whether you want plots to come to the screen

  • auto_removal (bool) – whther tracks should be removed when they fail QC

  • warning_value (float) – phase RMS needed to trigger warning

  • extension (str) – extra name for the json file

  • return_level_doys (list) – start and end day of years for leveling

  • vegetation_model (int) – vegetation correction model: 1 (simple), 2 (advanced)

gnssrefl.phase_functions.subdaily_phase_plot(station, fr, vxyz, xdir, subdir, hires_figs, bin_hours=24, bin_offset=0, minvalperbin=10, plt2screen=True)

makes a plot of daily or subdaily averaged phase for vwc code

Parameters:
  • station (str) – 4 char station name

  • fr (int) – frequency of signal

  • vxyz (numpy array) – Track-level phase data

  • xdir (str) – location of the results (environment variable REFL_CODE)

  • subdir (str) – subdirectory in Files

  • hires_figs (bool) – whether you want eps instead of png files

  • bin_hours (int) – time bin size in hours (default: 24 for daily)

  • bin_offset (int) – bin timing offset in hours (default: 0)

  • minvalperbin (int) – minimum observations required per bin (default: 10)

  • plt2screen (bool) – whether to display plot to screen (default: True)

gnssrefl.phase_functions.test_func(x, a, b, rh_apriori)

This is least squares for estimating a sine wave given a fixed frequency, freqLS

gnssrefl.phase_functions.test_func_new(x, a, b, rh_apriori, freq)

This is least squares for estimating a sine wave given a fixed frequency, freqLS now freq is input so it is not hardwired for L2

Parameters:
  • x (numpy array of floats) – sine(elevation angle) I think

  • a (float) – amplitude - estimated

  • b (float) – phase - estimated

  • rh_apriori (float) – reflector height (m)

  • freq (int) – frequency

gnssrefl.phase_functions.vwc_plot(station, t_datetime, vwcdata, plot_path, circles, plt2screen=True)

makes a plot of volumetric water content

Parameters:
  • station (string) – 4 ch station name

  • t_datetime (datetime) – observation times for measurements

  • vwcdata (numpy array of floats (I think)) – volumetric water content

  • plot_path (Saves a plot to) – full name of the plot file

  • circles (boolean) – circles in the plot. default is a line (really .-)

  • plot_path

gnssrefl.phase_functions.write_all_phase(v, fname)

writes out preliminary phase values and other metrics for advanced vegetation option. This is in the hope that it can be used in clara chew’s dissertation algorithm.

File is written to $REFL_CODE/Files/station/station_all_phase.txt I think

Parameters:
  • v (numpy of floats as defined in vwc_cl) – TBD year, doy, phase, azimuth, satellite number estimated RH, LSP amplitude, LS amplitude, UTC hours raw LSP amp, raw LS amp

  • fname (str) – name of the output file

  • filestatus (int) – 1, open the file 2, write to file (well, really any value)

  • rhtrack (float) – apriori reflector height for the given track, meters

Returns:

allrh

Return type:

fileID

gnssrefl.phase_functions.write_apriori_rh(filepath, tracks, station, year, tmin, tmax)

Write apriori RH track list file. Used by both vwc_input and vwc auto_removal.

A track is defined by satellite number and the azimuth at minimum elevation angle (az_min_ele) of each arc. Arcs are clustered into tracks when their az_min_ele values fall within 3 degrees of the cluster center. The track’s mean azimuth (track_avg_az) is the circular mean of az_min_ele over e.g. a year of arcs in the cluster.

Parameters:
  • filepath (path-like) –

  • tracks (list) – each element is [track_num, rh, satellite, track_avg_az, nvals, az_min, az_max]

  • station (str) –

  • year (int) –

  • tmin (float) – minimum soil texture (header metadata only)

  • tmax (float) – maximum soil texture (header metadata only)

gnssrefl.phase_functions.write_avg_phase(station, fr, avg_phase, extension='', bin_hours=24, bin_offset=0)

writes averaged phase results to a text file

Parameters:
  • station (string) –

  • fr (int) – frequency

  • avg_phase (numpy array) – output of calculate_avg_phase columns: [Year, DOY, Phase, PhaseSig, NormAmp, FracYear, Month, Day, [BinStart]]

  • extension (str, optional) – analysis extension for directory organization

  • bin_hours (int, optional) – Time bin size in hours (default: 24). Used for filename suffix and header.

  • bin_offset (int, optional) – Bin timing offset in hours (default: 0). Used for filename suffix and header.

gnssrefl.phase_functions.write_out_raw_phase(v, fname)

write daily phase values used in vwc to a new consolidated file I added columns for quadrant and unwrapped phase

Parameters:
  • v (numpy array) – phase results read for multiple years. could be with snow filter applied

  • fname (str) – filename for output

Returns:

newv – original variable v with columns added for quadrant (1-4) and unwrapped phase

Return type:

numpy array

gnssrefl.phase_functions.write_phase_for_advanced(filename, vxyz)

Writes out a file of interim phase results for advanced models developed by Clara Chew

File now written to $REFL_CODE/Files/<station>/<station>_all_phase.txt

September 8, 2025: Added a new column for MJD

Parameters:
  • filename (str) – name for output file

  • vxyz (numpy array of floats) – as defined in vwc_cl.py

gnssrefl.phase_functions.write_rolling_vwc_output(station, vwc_data, fr, bin_hours, extension='', vegetation_model=2)

Write hourly rolling VWC output file.

This writes a combined file with all hourly rolling measurements (e.g., bins starting at 0:00, 1:00, 2:00, etc.) sorted chronologically.

Parameters:
  • station (str) – 4-character station name

  • vwc_data (dict) – Must contain: - ‘mjd’: Modified Julian Day values - ‘vwc’: VWC values (already leveled) - ‘datetime’: datetime objects - ‘bin_starts’: list of bin start hours

  • fr (int) – Frequency code (1=L1, 5=L5, 20=L2C)

  • bin_hours (int) – Time bin size in hours (e.g., 6 for 6-hour windows)

  • extension (str, optional) – Extension for file naming (default: ‘’)

  • vegetation_model (int, optional) – 1=simple, 2=advanced (for header documentation, default: 2)

gnssrefl.phase_functions.write_track_file(track_dir, station, year, sat_num, avg_az, rows, fr, extra_headers=None)

Write a single track file with standard header and row format.

Parameters:
  • track_dir (str) – Directory to write the track file in

  • station (str) – Station name

  • year (int) – Year for the track

  • sat_num (int) – Satellite number

  • avg_az (float) – Track average azimuth (degrees)

  • rows (numpy.ndarray) – (N, 17) array of track data columns

  • fr (int) – Frequency code

  • extra_headers (list of str, optional) – Additional header lines inserted before the column legend

gnssrefl.phase_functions.write_vwc_output(station, vwc_data, year, fr, bin_hours, bin_offset, extension='', vegetation_model=1)

Write VWC output file in standard gnssrefl format. Works for both vegetation models.

Parameters:
  • station (str) – 4-character station name

  • vwc_data (dict) – Must contain: - ‘mjd’: Modified Julian Day values - ‘vwc’: VWC values - ‘datetime’: datetime objects - ‘bin_starts’: list of bin start hours (subdaily) or empty list (daily)

  • year (int) – Year for file naming

  • fr (int) – Frequency code (1=L1, 5=L5, 20=L2C)

  • bin_hours (int) – Time bin size in hours (default: 24)

  • bin_offset (int) – Bin timing offset in hours (default: 0)

  • extension (str, optional) – Extension for file naming (default: ‘’)

  • vegetation_model (int, optional) – 1=simple, 2=advanced (for header documentation, default: 1)