Methods

Several methods exist to help combine multiple data sets and convert between equivalent indices.

DMSP

Supports the Defense Meteorological Satellite Program instruments by providing common custom routines alongside reference and acknowledgement information.

Methods supporting the Defense Meteorological Satellite Program (DMSP).

pysatMadrigal.instruments.methods.dmsp.add_drift_unit_vectors(inst)[source]

Add unit vectors for expressing plasma motion at high latitudes.

Parameters:
inst : pysat.Instrument

DMSP IVM Instrument object

pysatMadrigal.instruments.methods.dmsp.add_drifts_polar_cap_x_y(inst, rpa_flag_key=None, rpa_vel_key='ion_v_sat_for', cross_vel_key='ion_v_sat_left')[source]

Add polar cap drifts in cartesian coordinates.

Parameters:
inst : pysat.Instrument

DMSP IVM Instrument object

rpa_flag_key : str or NoneType

RPA flag key, if None will not select any data. The UTD RPA flag key is 'rpa_flag_ut' (default=None)

rpa_vel_key : str

RPA velocity data key (default='ion_v_sat_for')

cross_vel_key : str

Cross-track velocity data key (default='ion_v_sat_left')

pysatMadrigal.instruments.methods.dmsp.references(name)[source]

Provide references for the DMSP instruments and experiments.

Parameters:
name : str

Instrument name

Returns:
refs : str

String providing reference guidance for the DMSP data

pysatMadrigal.instruments.methods.dmsp.smooth_ram_drifts(inst, rpa_flag_key=None, rpa_flag_max=1, rpa_vel_key='ion_v_sat_for', smooth_key=None, roll_window=15, roll_kwargs=None)[source]

Smooth the ram drifts using a rolling mean.

Parameters:
inst : pysat.Instrument

DMSP IVM Instrument object

rpa_flag_key : str or NoneType

RPA flag key, if None then no data is selected. The UTD RPA flag key is 'rpa_flag_ut' (default=None)

rpa_flag_max : int

Maximum allowable RPA flag (default=1)

rpa_vel_key : str

RPA velocity data key (default='ion_v_sat_for')

smooth_key : str or NoneType

If None, will fill old RPA data with smoothed data; otherwise assigns the smoothed data to this new variable (default=None)

roll_window : int, offset, or BaseIndexer subclass

Size of the moving window (default=15)

roll_kwargs : dict or NoneType

Keyword args for rolling mean window. If None, uses {'min_periods': 5} (default=None)

Raises:
KeyError

If unknown values are used for rpa_flag_key or rpa_vel_key
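The flag selection and rolling mean described above can be pictured with a small stand-alone sketch. It does not call pysatMadrigal or pandas: `rolling_mean` is a hypothetical helper that assumes a trailing window, treats samples whose flag exceeds `flag_max` as missing, and returns NaN when fewer than `min_periods` valid samples are available. The actual routine may window and mask the data differently.

```python
import math


def rolling_mean(values, flags, window=15, min_periods=5, flag_max=1):
    """Trailing rolling mean over samples whose flag is <= flag_max.

    Hypothetical helper for illustration only; not part of pysatMadrigal.
    """
    smoothed = []
    for i in range(len(values)):
        # Trailing window: the current sample and the window - 1 before it
        start = max(0, i - window + 1)
        good = [val for val, flag in zip(values[start:i + 1],
                                         flags[start:i + 1])
                if flag <= flag_max and not math.isnan(val)]

        # Too few valid samples gives NaN, similar to pandas' min_periods
        if len(good) >= min_periods:
            smoothed.append(sum(good) / len(good))
        else:
            smoothed.append(math.nan)
    return smoothed


# Made-up drift values; the third sample has a bad quality flag
drifts = [10.0, 12.0, 500.0, 11.0, 9.0, 10.0, 8.0]
flags = [0, 1, 4, 0, 1, 0, 0]
result = rolling_mean(drifts, flags, window=5, min_periods=3)
```

In this sketch the flagged outlier never enters an average, and the first few outputs are NaN because too few valid samples have accumulated.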

See also

pandas.rolling

pysatMadrigal.instruments.methods.dmsp.update_DMSP_ephemeris(inst, ephem=None)[source]

Update DMSP instrument data with improved DMSP ephemeris.

Parameters:
inst : pysat.Instrument

DMSP IVM Instrument object

ephem : pysat.Instrument or NoneType

DMSP IVM_EPHEM instrument object

GNSS

Supports the Global Navigation Satellite System instruments by providing reference and acknowledgement information, specialised load functions, and supporting information for probing the line-of-sight (LoS) files.

Methods supporting the Global Navigation Satellite System platform.

pysatMadrigal.instruments.methods.gnss.acknowledgements(name)[source]

Provide the acknowledgements for different GNSS instruments.

Parameters:
name : str

Instrument name

Returns:
ackn : str

Acknowledgement information to provide in studies using this data

pysatMadrigal.instruments.methods.gnss.get_los_receiver_sites(los_fnames)[source]

Retrieve an array of unique receiver names for the desired LoS files.

Parameters:
los_fnames : list

List of filenames

Returns:
sites : np.array

Array of strings containing GNSS receiver names with data in the files

pysatMadrigal.instruments.methods.gnss.get_los_times(los_fnames)[source]

Retrieve an array of unique times for the desired LoS files.

Parameters:
los_fnames : list

List of filenames

Returns:
all_times : np.array

Array of datetime objects with data in the files

pysatMadrigal.instruments.methods.gnss.load_los(fnames, los_method, los_value, gnss_network='all')[source]

Load the GNSS slant TEC data.

Parameters:
fnames : list

List of filenames

los_method : str

For 'los' tag only, load data for a unique GNSS receiver site ('site') or at a unique time ('time')

los_value : str or dt.datetime

For 'los' tag only, load data at this unique site or time

gnss_network : str

Limit data by GNSS network, if not 'all'. Currently supports 'all', 'gps', and 'glonass' (default='all')

Returns:
data : xarray.Dataset

Object containing satellite data

meta : pysat.Meta

Object containing metadata such as column names and units

lat_keys : list

Latitude key names

lon_keys : list

Longitude key names

pysatMadrigal.instruments.methods.gnss.load_site(fnames)[source]

Load the GNSS TEC site data.

Parameters:
fnames : list

List of filenames

Returns:
data : xarray.Dataset

Object containing satellite data

meta : pysat.Meta

Object containing metadata such as column names and units

lat_keys : list

Latitude key names

lon_keys : list

Longitude key names

pysatMadrigal.instruments.methods.gnss.load_vtec(fnames)[source]

Load the GNSS vertical TEC data.

Parameters:
fnames : list

List of filenames

Returns:
data : xarray.Dataset

Object containing satellite data

meta : pysat.Meta

Object containing metadata such as column names and units

lat_key : list

Latitude key names

lon_key : list

Longitude key names

pysatMadrigal.instruments.methods.gnss.references(name)[source]

Provide suggested references for the specified data set.

Parameters:
name : str

Instrument name

Returns:
refs : str

Suggested Instrument reference(s)

JRO

Supports the Jicamarca Radio Observatory instruments by providing common custom routines alongside reference and acknowledgement information.

Methods supporting the Jicamarca Radio Observatory (JRO) platform.

pysatMadrigal.instruments.methods.jro.acknowledgements()[source]

Provide acknowledgements for the JRO instruments and experiments.

Returns:
ackn : str

String providing acknowledgement text for studies using JRO data

pysatMadrigal.instruments.methods.jro.calc_measurement_loc(inst)[source]

Calculate the instrument measurement location in geographic coordinates.

Parameters:
inst : pysat.Instrument

JRO ISR Instrument object

Raises:
ValueError

If no appropriate azimuth and elevation angles are found, if no range variable is found, or if multiple range variables are found.

pysatMadrigal.instruments.methods.jro.references()[source]

Provide references for the JRO instruments and experiments.

Returns:
refs : str

String providing reference guidance for the JRO experiments

General

Supports the Madrigal data access.

General routines for integrating CEDAR Madrigal instruments into pysat.

pysatMadrigal.instruments.methods.general.build_madrigal_datetime_index(mad_data)[source]

Create a datetime index using standard Madrigal variables.

Parameters:
mad_data : pds.DataFrame

Madrigal data, expects time variables 'year', 'month', 'day', 'hour', 'min', and 'sec'

Returns:
data_time

Datetime index for use by pysat

Raises:
ValueError

If expected time variables are missing
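As a rough illustration of the behaviour described above, the sketch below builds datetimes from the six standard time variables and raises a ValueError when any are missing. `build_times` and `TIME_KEYS` are hypothetical names, and plain dicts stand in for the pandas DataFrame that the real routine expects.

```python
import datetime as dt

# The standard Madrigal time variables named in the parameter description
TIME_KEYS = ('year', 'month', 'day', 'hour', 'min', 'sec')


def build_times(records):
    """Build datetimes from dicts holding the standard time variables.

    Hypothetical stand-in for illustration only; the documented routine
    takes a pandas DataFrame and returns a datetime index.
    """
    times = []
    for rec in records:
        missing = [key for key in TIME_KEYS if key not in rec]
        if missing:
            raise ValueError(f'missing time variables: {missing}')
        times.append(dt.datetime(*[int(rec[key]) for key in TIME_KEYS]))
    return times


rows = [{'year': 2010, 'month': 1, 'day': 2, 'hour': 3, 'min': 4, 'sec': 5}]
times = build_times(rows)
```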

pysatMadrigal.instruments.methods.general.cedar_rules()[source]

General acknowledgement statement for Madrigal data.

Returns:
ackn : str

String with general acknowledgement for all CEDAR Madrigal data

pysatMadrigal.instruments.methods.general.convert_pandas_to_xarray(xarray_coords, data, time_ind)[source]

Convert Madrigal HDF5/simple data from pandas to xarray.

Parameters:
xarray_coords : list or NoneType

List of keywords to use as coordinates if xarray output is desired instead of a Pandas DataFrame. Can build an xarray Dataset that has different coordinate dimensions by providing a dict inside the list instead of coordinate variable name strings. Each dict will have a tuple of coordinates as the key and a list of variable strings as the value. Empty list if None. For example, xarray_coords=[{('time',): ['year', 'doy'], ('time', 'gdalt'): ['data1', 'data2']}]. (default=None)

data : pds.DataFrame

Data to be converted into the xarray format

time_ind : pds.DatetimeIndex or NoneType

Time index for the data or None for no time index

Returns:
data : xr.Dataset

Data in the dataset format.

pysatMadrigal.instruments.methods.general.download(date_array, inst_code=None, kindat=None, data_path=None, user=None, password=None, url='http://cedar.openmadrigal.org', file_type='hdf5', **kwargs)[source]

Download data from Madrigal.

Parameters:
date_array : array-like

List of datetimes to download data for. The sequence of dates need not be contiguous.

inst_code : str

Madrigal instrument code(s), cast as a string. If multiple are used, separate them with commas. (default=None)

kindat : str

Madrigal experiment code(s), cast as a string. If multiple are used, separate them with commas. (default=None)

data_path : str

Path to directory to download data to. (default=None)

user : str

User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error if user is not supplied. (default=None)

password : str

Password for data download. (default=None)

url : str

URL for Madrigal site (default='http://cedar.openmadrigal.org')

file_type : str

File format for Madrigal data. Load routines currently only accept 'hdf5' and 'netCDF4', but any of the Madrigal options may be used here. (default='hdf5')

**kwargs : dict

Additional kwarg catch, allows general use when tag/inst_id are not needed for a given instrument.

Raises:
ValueError

If the specified input type or Madrigal experiment codes are unknown

pysatMadrigal.instruments.methods.general.filter_data_single_date(inst)[source]

Filter data to a single date.

Parameters:
inst : pysat.Instrument

Instrument object to which this routine should be attached

Warning

For the best performance, this function should be added first in the queue. This may be ensured by setting the default function in a pysat instrument file to this one.

To do this, within platform_name.py set preprocess at the top level.

preprocess = pysatMadrigal.instruments.methods.general.filter_data_single_date

Examples

This routine is intended to be added to the Instrument nanokernel processing queue via

import pysat
from pysatMadrigal.instruments.methods.general import filter_data_single_date
inst = pysat.Instrument()
inst.custom_attach(filter_data_single_date)

This function will then be automatically applied to the Instrument object data on every load by the pysat nanokernel.
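As a rough picture of what filtering to a single date means, the stand-alone sketch below keeps only samples that fall within one day. `filter_single_date` is a hypothetical helper operating on plain lists, and the half-open interval (start of day inclusive, start of next day exclusive) is an assumption; the exact bounds used by the library routine are not stated here.

```python
import datetime as dt


def filter_single_date(times, values, date):
    """Keep samples with date <= time < date + 1 day.

    Hypothetical helper for illustration only; the documented routine
    acts on a pysat Instrument rather than on plain lists.
    """
    end = date + dt.timedelta(days=1)
    return [(time, val) for time, val in zip(times, values)
            if date <= time < end]


day = dt.datetime(2010, 1, 2)
times = [dt.datetime(2010, 1, 1, 23, 59), dt.datetime(2010, 1, 2, 0, 0),
         dt.datetime(2010, 1, 2, 12, 0), dt.datetime(2010, 1, 3, 0, 0)]
kept = filter_single_date(times, [1.0, 2.0, 3.0, 4.0], day)
```

Here only the two samples stamped on 2 January 2010 survive the filter.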

pysatMadrigal.instruments.methods.general.get_remote_filenames(inst_code=None, kindat='', user=None, password=None, web_data=None, url='http://cedar.openmadrigal.org', start=datetime.datetime(1900, 1, 1, 0, 0), stop=datetime.datetime(2024, 3, 15, 20, 57, 52, 257182), date_array=None)[source]

Retrieve the remote filenames for a specified Madrigal experiment.

Parameters:
inst_code : str or NoneType

Madrigal instrument code(s), cast as a string. If multiple are used, separate them with commas. (default=None)

kindat : str

Madrigal experiment code(s), cast as a string. If multiple are used, separate them with commas. If not supplied, all will be returned. (default='')

data_path : str or NoneType

Path to directory to download data to. (default=None)

user : str or NoneType

User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error if user is not supplied. (default=None)

password : str or NoneType

Password for data download. (default=None)

web_data : MadrigalData or NoneType

Open connection to Madrigal database or None (will initiate using url) (default=None)

url : str

URL for Madrigal site (default='http://cedar.openmadrigal.org')

start : dt.datetime or NoneType

Starting time for file list, None reverts to default (default=dt.datetime(1900, 1, 1))

stop : dt.datetime or NoneType

Ending time for the file list, None reverts to default (default=dt.datetime.utcnow())

date_array : dt.datetime or NoneType

Array of datetimes to download data for. The sequence of dates need not be contiguous and will be used instead of start and stop if supplied. (default=None)

Returns:
files : madrigalWeb.madrigalWeb.MadrigalExperimentFile

Madrigal file object that contains remote experiment file data

Raises:
ValueError

If unexpected date_array input is supplied
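The interplay between start, stop, and date_array described above might be sketched as follows. `resolve_time_range` is a hypothetical helper, not the library implementation; it assumes that a supplied date_array simply replaces start and stop with its earliest and latest entries, and that an empty date_array counts as unexpected input.

```python
import datetime as dt


def resolve_time_range(start=None, stop=None, date_array=None):
    """Return the (start, stop) limits for a file search.

    Hypothetical helper for illustration only; not part of pysatMadrigal.
    """
    if date_array is not None:
        dates = list(date_array)
        if len(dates) == 0:
            raise ValueError('date_array must contain at least one datetime')
        # The supplied dates take precedence over start and stop
        return min(dates), max(dates)

    if start is None:
        start = dt.datetime(1900, 1, 1)
    if stop is None:
        # Resolved when the function is called, as a naive UTC datetime
        stop = dt.datetime.now(dt.timezone.utc).replace(tzinfo=None)
    return start, stop


default_start, default_stop = resolve_time_range()
first, last = resolve_time_range(date_array=[dt.datetime(2010, 1, 5),
                                             dt.datetime(2010, 1, 2)])
```

The fixed 2024 timestamp shown for stop in the signature above is the value the default had when this page was built. Resolving None at call time, as in this sketch, avoids freezing the end time.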

pysatMadrigal.instruments.methods.general.good_exp(exp, date_array=None)[source]

Determine if a Madrigal experiment has good data for specified dates.

Parameters:
exp : MadrigalExperimentFile

MadrigalExperimentFile object

date_array : list-like or NoneType

List of datetimes to download data for. The sequence of dates need not be contiguous. If None, then any valid experiment will be assumed to be valid. (default=None)

Returns:
gflag : boolean

True if good, False if bad

pysatMadrigal.instruments.methods.general.known_madrigal_inst_codes(pandas_format=None)[source]

Supply known Madrigal instrument codes with a brief description.

Parameters:
pandas_format : bool or NoneType

Separate instrument codes by time-series (True) or multi-dimensional data types (False) if a boolean is supplied, or supply all if NoneType (default=None)

Returns:
inst_codes : dict

Dictionary with string instrument code values as keys and a brief description of the corresponding instrument as the value.

pysatMadrigal.instruments.methods.general.list_files(tag, inst_id, data_path, format_str=None, supported_tags=None, file_cadence=datetime.timedelta(days=1), two_digit_year_break=None, delimiter=None, file_type=None)[source]

Create a Pandas Series of every file for chosen Instrument data.

Parameters:
tag : str

Denotes type of file to load. Accepts strings corresponding to the appropriate Madrigal Instrument tags.

inst_id : str

Specifies the instrument ID to load. Accepts strings corresponding to the appropriate Madrigal Instrument inst_ids.

data_path : str

Path to data directory.

format_str : str or NoneType

User specified file format. If None is specified, the default formats associated with the supplied tags are used. (default=None)

supported_tags : dict or NoneType

Keys are inst_id, each containing a dict keyed by tag where the values are file format template strings. (default=None)

file_cadence : dt.timedelta or pds.DateOffset

pysat assumes a daily file cadence, but some instrument data files contain longer periods of time. This parameter allows the specification of regular file cadences greater than or equal to a day (e.g., weekly, monthly, or yearly). (default=dt.timedelta(days=1))

two_digit_year_break : int or NoneType

If filenames only store two digits for the year, then '1900' will be added for years >= two_digit_year_break and '2000' will be added for years < two_digit_year_break. If None, then four-digit years are assumed. (default=None)

delimiter : str or NoneType

Delimiter string upon which files will be split (e.g., '.'). If None, filenames will be parsed presuming a fixed width format. (default=None)

file_type : str or NoneType

File format for Madrigal data. Load routines currently accept 'hdf5', 'simple', and 'netCDF4', but any of the Madrigal options may be used here. If None, will look for all known file types. (default=None)

Returns:
out : pds.Series

A pandas Series containing the verified available files
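The two_digit_year_break rule described above can be written out as a short sketch. `expand_two_digit_year` is a hypothetical helper shown only to make the rule concrete; in practice the filename parsing is handled by pysat.

```python
def expand_two_digit_year(year, two_digit_year_break=None):
    """Expand a two-digit year using the documented break rule.

    Hypothetical helper for illustration only; not part of pysatMadrigal.
    """
    if two_digit_year_break is None:
        # Four-digit years are assumed, so nothing is added
        return year
    if year >= two_digit_year_break:
        return 1900 + year
    return 2000 + year


late_century = expand_two_digit_year(99, two_digit_year_break=50)
early_century = expand_two_digit_year(7, two_digit_year_break=50)
```

With a break of 50, a filename year of 99 is read as 1999 and a filename year of 07 is read as 2007.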

pysatMadrigal.instruments.methods.general.list_remote_files(tag, inst_id, inst_code=None, kindats=None, user=None, password=None, supported_tags=None, url='http://cedar.openmadrigal.org', two_digit_year_break=None, start=datetime.datetime(1900, 1, 1, 0, 0), stop=datetime.datetime(2024, 3, 15, 20, 57, 52, 257187))[source]

List files available from Madrigal.

Parameters:
tag : str

Denotes type of file to load. Accepts strings corresponding to the appropriate Madrigal Instrument tags.

inst_id : str

Specifies the instrument ID to load. Accepts strings corresponding to the appropriate Madrigal Instrument inst_ids.

inst_code : str or NoneType

Madrigal instrument code(s), cast as a string. If multiple are used, separate them with commas. (default=None)

kindats : dict

Madrigal experiment codes, in a dict of dicts with inst_ids as top level keys and tags as second level keys with Madrigal experiment code(s) as values. These should be strings, with multiple codes separated by commas. (default=None)

data_path : str or NoneType

Path to directory to download data to. (default=None)

user : str or NoneType

User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error if user is not supplied. (default=None)

password : str or NoneType

Password for data download. (default=None)

supported_tags : dict or NoneType

Keys are inst_id, each containing a dict keyed by tag where the values are file format template strings. (default=None)

url : str

URL for Madrigal site (default='http://cedar.openmadrigal.org')

two_digit_year_break : int or NoneType

If filenames only store two digits for the year, then '1900' will be added for years >= two_digit_year_break and '2000' will be added for years < two_digit_year_break. (default=None)

start : dt.datetime

Starting time for file list. (default=01-01-1900)

stop : dt.datetime

Ending time for the file list. (default=time of run)

Returns:
pds.Series

A series of filenames, see pysat.utils.files.process_parsed_filenames for more information.

Raises:
ValueError

For missing kwarg input

KeyError

For dictionary input missing requested tag/inst_id
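The nested kindats layout and the two error cases listed above might look like the sketch below. `select_kindat` is a hypothetical helper and the experiment codes are made-up placeholders, not real Madrigal codes.

```python
def select_kindat(kindats, inst_id, tag):
    """Look up the experiment code string for an inst_id and tag.

    Hypothetical helper for illustration only; not part of pysatMadrigal.
    """
    if kindats is None:
        raise ValueError('kindats must be supplied')
    try:
        return kindats[inst_id][tag]
    except KeyError:
        raise KeyError(
            f'no experiment code for inst_id={inst_id!r}, tag={tag!r}'
        ) from None


# Dict of dicts: inst_ids at the top level, tags at the second level.
# The codes are placeholders; multiple codes are separated by commas.
madrigal_tag = {'f15': {'utd': '1001', '': '1001,1002'}}
codes = select_kindat(madrigal_tag, 'f15', '').split(',')
```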

Examples

This method is intended to be set in an instrument support file at the top level using functools.partial

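import functools
from pysatMadrigal.instruments.methods import general as mad_meth
# supported_tags, madrigal_inst_code, and madrigal_tag are defined earlier
# in the instrument support file
list_remote_files = functools.partial(mad_meth.list_remote_files,
                                      supported_tags=supported_tags,
                                      inst_code=madrigal_inst_code,
                                      kindats=madrigal_tag)

pysatMadrigal.instruments.methods.general.load(fnames, tag='', inst_id='', xarray_coords=None)[source]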

Load data from Madrigal into Pandas or XArray.

Parameters:
fnames : array-like

Iterable of filename strings, full path, to data files to be loaded. This input is nominally provided by pysat itself.

tag : str

Tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself and is not used here. (default='')

inst_id : str

Instrument ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself, and is not used here. (default='')

xarray_coords : list or NoneType

List of keywords to use as coordinates if xarray output is desired instead of a Pandas DataFrame. Can build an xarray Dataset that has different coordinate dimensions by providing a dict inside the list instead of coordinate variable name strings. Each dict will have a tuple of coordinates as the key and a list of variable strings as the value. Empty list if None. For example, xarray_coords=[{('time',): ['year', 'doy'], ('time', 'gdalt'): ['data1', 'data2']}]. (default=None)

Returns:
data : pds.DataFrame or xr.Dataset

A pandas DataFrame or xarray Dataset holding the data from the file

meta : pysat.Meta

Metadata from the file, as well as default values from pysat

Raises:
ValueError

If data columns expected to create the time index are missing or if coordinates are not supplied for all data columns.
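For the dict form of xarray_coords, the requirement that every data column be assigned coordinates can be checked ahead of time with a sketch like the one below. `unassigned_columns` is a hypothetical helper, and the column names are the placeholders used in the parameter description plus one extra ('data3') added to show a failure case.

```python
def unassigned_columns(coord_spec, columns):
    """Return the data columns that no coordinate tuple accounts for.

    Hypothetical helper for illustration only; coord_spec is one dict
    taken from the xarray_coords list.
    """
    covered = set()
    for coords, variables in coord_spec.items():
        # Coordinates themselves and the variables that use them are covered
        covered.update(coords)
        covered.update(variables)
    return sorted(set(columns) - covered)


xarray_coords = [{('time',): ['year', 'doy'],
                  ('time', 'gdalt'): ['data1', 'data2']}]
columns = ['year', 'doy', 'gdalt', 'data1', 'data2', 'data3']
missing = unassigned_columns(xarray_coords[0], columns)
```

A non-empty result, as here for 'data3', points at the kind of input that would lead to the ValueError described above.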

pysatMadrigal.instruments.methods.general.madrigal_file_format_str(inst_code, strict=False, verbose=True)[source]

Supply the known file format string for a Madrigal instrument code.

Parameters:
inst_code : int

Madrigal instrument code as an integer

strict : bool

If True, returns only file formats that will definitely not have a problem being parsed by pysat. If False, will return any file format. (default=False)

verbose : bool

If True raises logging warnings, if False does not log any warnings. (default=True)

Returns:
fstr : str

File formatting string that may or may not be parsable by pysat

Raises:
ValueError

If file formats with problems would be returned and strict is True.

pysatMadrigal.instruments.methods.general.sort_file_formats(fnames)[source]

Separate filenames by file format type.

Parameters:
fnames : array-like

Iterable of filename strings, full path, to data files to be loaded. This input is nominally provided by pysat itself.

Returns:
load_file_types : dict

A dictionary with file types as keys and a list of filenames for each file type.
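The shape of the returned dictionary can be illustrated with a stand-alone sketch. `sort_by_format` is a hypothetical helper, and both the extension-to-type mapping in `EXT_TO_TYPE` and the ValueError for unrecognised files are assumptions; the library may identify file types differently.

```python
import os

# Assumed mapping from filename extension to file type label
EXT_TO_TYPE = {'.hdf5': 'hdf5', '.nc': 'netCDF4', '.txt': 'simple'}


def sort_by_format(fnames):
    """Group filenames into a dict keyed by file type.

    Hypothetical helper for illustration only; not part of pysatMadrigal.
    """
    sorted_files = {ftype: [] for ftype in EXT_TO_TYPE.values()}
    for fname in fnames:
        ext = os.path.splitext(fname)[1]
        if ext not in EXT_TO_TYPE:
            raise ValueError(f'unknown file type: {fname}')
        sorted_files[EXT_TO_TYPE[ext]].append(fname)
    return sorted_files


files = sort_by_format(['/data/a.hdf5', '/data/b.nc', '/data/c.hdf5'])
```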

pysatMadrigal.instruments.methods.general.update_meta_with_hdf5(file_ptr, meta)[source]

Get meta data from a Madrigal HDF5 file.

Parameters:
file_ptr : h5py._hl.files.File

Pointer to an open HDF5 file

meta : pysat.Meta

Existing Meta class to be updated

Returns:
file_labels : list

List of metadata available