Methods
Several methods exist to help combine multiple data sets and convert between equivalent indices.
DMSP
Supports the Defense Meteorological Satellite Program instruments by providing common custom routines alongside reference and acknowledgement information.
Methods supporting the Defense Meteorological Satellite Program (DMSP).
- pysatMadrigal.instruments.methods.dmsp.add_drift_unit_vectors(inst)
Add unit vectors for expressing plasma motion at high latitudes.
- Parameters:
- inst : pysat.Instrument
DMSP IVM Instrument object
- pysatMadrigal.instruments.methods.dmsp.add_drifts_polar_cap_x_y(inst, rpa_flag_key=None, rpa_vel_key='ion_v_sat_for', cross_vel_key='ion_v_sat_left')
Add polar cap drifts in Cartesian coordinates.
- Parameters:
- inst : pysat.Instrument
DMSP IVM Instrument object
- rpa_flag_key : str or NoneType
RPA flag key; if None, no data is selected. The UTD RPA flag key is 'rpa_flag_ut'. (default=None)
- rpa_vel_key : str
RPA velocity data key (default='ion_v_sat_for')
- cross_vel_key : str
Cross-track velocity data key (default='ion_v_sat_left')
- pysatMadrigal.instruments.methods.dmsp.references(name)
Provide references for the DMSP instruments and experiments.
- Parameters:
- name : str
Instrument name
- Returns:
- refs : str
String providing reference guidance for the DMSP data
- pysatMadrigal.instruments.methods.dmsp.smooth_ram_drifts(inst, rpa_flag_key=None, rpa_flag_max=1, rpa_vel_key='ion_v_sat_for', smooth_key=None, roll_window=15, roll_kwargs=None)
Smooth the ram drifts using a rolling mean.
- Parameters:
- inst : pysat.Instrument
DMSP IVM Instrument object
- rpa_flag_key : str or NoneType
RPA flag key; if None, no data is selected. The UTD RPA flag key is 'rpa_flag_ut'. (default=None)
- rpa_flag_max : int
Maximum allowable RPA flag (default=1)
- rpa_vel_key : str
RPA velocity data key (default='ion_v_sat_for')
- smooth_key : str or NoneType
If None, the existing RPA data is overwritten with the smoothed data; otherwise the smoothed data is assigned to this new variable (default=None)
- roll_window : int, offset, or BaseIndexer subclass
Size of the moving window (default=15)
- roll_kwargs : dict or NoneType
Keyword args for the rolling mean window. If None, uses {'min_periods': 5} (default=None)
- Raises:
- KeyError
If unknown values are used for rpa_flag_key or rpa_vel_key
See also
pandas.DataFrame.rolling
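The interplay between the window size and a `min_periods` setting can be illustrated without pysat or pandas. The sketch below is a plain-Python trailing rolling mean; the helper name `rolling_mean`, the trailing (rather than centred) window, and the sample values are all assumptions made for illustration, not the routine's actual implementation.

```python
import math


def rolling_mean(values, window=15, min_periods=5):
    """Trailing rolling mean that skips NaN and needs `min_periods` samples."""
    out = []
    for i in range(len(values)):
        # Collect the finite samples in the trailing window ending at index i
        chunk = [val for val in values[max(0, i - window + 1):i + 1]
                 if not math.isnan(val)]
        if len(chunk) >= min_periods:
            out.append(sum(chunk) / len(chunk))
        else:
            # Too few valid samples: leave a gap instead of a noisy mean
            out.append(math.nan)
    return out


# Hypothetical ram drift values with one missing sample
drifts = [10.0, 12.0, math.nan, 11.0, 13.0, 9.0, 10.0]
smoothed = rolling_mean(drifts, window=3, min_periods=2)
```

A larger `min_periods` leaves more gaps near flagged or missing data, while a smaller one accepts means built from fewer samples.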
GNSS
Supports the Global Navigation Satellite System instruments by providing reference and acknowledgement information, specialised load functions, and supporting information for probing the line-of-sight (LoS) files.
Methods supporting the Global Navigation Satellite System platform.
- pysatMadrigal.instruments.methods.gnss.acknowledgements(name)
Provide the acknowledgements for different GNSS instruments.
- Parameters:
- name : str
Instrument name
- Returns:
- ackn : str
Acknowledgement information to provide in studies using this data
- pysatMadrigal.instruments.methods.gnss.get_los_receiver_sites(los_fnames)
Retrieve an array of unique receiver names for the desired LoS files.
- Parameters:
- los_fnames : list
List of filenames
- Returns:
- sites : np.array
Array of strings containing GNSS receiver names with data in the files
- pysatMadrigal.instruments.methods.gnss.get_los_times(los_fnames)
Retrieve an array of unique times for the desired LoS files.
- Parameters:
- los_fnames : list
List of filenames
- Returns:
- all_times : np.array
Array of datetime objects with data in the files
- pysatMadrigal.instruments.methods.gnss.load_los(fnames, los_method, los_value, gnss_network='all')
Load the GNSS slant TEC data.
- Parameters:
- fnames : list
List of filenames
- los_method : str
For 'los' tag only, load data for a unique GNSS receiver site ('site') or at a unique time ('time')
- los_value : str or dt.datetime
For 'los' tag only, load data at this unique site or time
- gnss_network : str
Limit data by GNSS network, if not 'all'. Currently supports 'all', 'gps', and 'glonass' (default='all')
- Returns:
- data : xarray.Dataset
Object containing satellite data
- meta : pysat.Meta
Object containing metadata such as column names and units
- lat_keys : list
Latitude key names
- lon_keys : list
Longitude key names
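The two selection modes described by `los_method` and `los_value` can be pictured with a small stand-alone sketch. The record layout, the site names, and the `select_los` helper are hypothetical; the real routine reads the LoS files and returns an xarray Dataset rather than a list of dicts.

```python
import datetime as dt


def select_los(records, los_method, los_value):
    """Select slant TEC records for one receiver site or one time."""
    if los_method not in ("site", "time"):
        # Assumed behaviour for unknown methods, for illustration only
        raise ValueError("unknown los_method: {:}".format(los_method))
    return [rec for rec in records if rec[los_method] == los_value]


# Hypothetical records; the real LoS files hold many more variables
records = [
    {"site": "abcd", "time": dt.datetime(2020, 1, 1, 0, 0), "tec": 12.1},
    {"site": "efgh", "time": dt.datetime(2020, 1, 1, 0, 0), "tec": 9.8},
    {"site": "abcd", "time": dt.datetime(2020, 1, 1, 0, 5), "tec": 12.4},
]
by_site = select_los(records, "site", "abcd")
by_time = select_los(records, "time", dt.datetime(2020, 1, 1, 0, 0))
```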
- pysatMadrigal.instruments.methods.gnss.load_site(fnames)
Load the GNSS TEC site data.
- Parameters:
- fnames : list
List of filenames
- Returns:
- data : xarray.Dataset
Object containing satellite data
- meta : pysat.Meta
Object containing metadata such as column names and units
- lat_keys : list
Latitude key names
- lon_keys : list
Longitude key names
- pysatMadrigal.instruments.methods.gnss.load_vtec(fnames)
Load the GNSS vertical TEC data.
- Parameters:
- fnames : list
List of filenames
- Returns:
- data : xarray.Dataset
Object containing satellite data
- meta : pysat.Meta
Object containing metadata such as column names and units
- lat_key : list
Latitude key names
- lon_key : list
Longitude key names
JRO
Supports the Jicamarca Radio Observatory instruments by providing common custom routines alongside reference and acknowledgement information.
Methods supporting the Jicamarca Radio Observatory (JRO) platform.
- pysatMadrigal.instruments.methods.jro.acknowledgements()
Provide acknowledgements for the JRO instruments and experiments.
- Returns:
- ackn : str
String providing acknowledgement text for studies using JRO data
- pysatMadrigal.instruments.methods.jro.calc_measurement_loc(inst)
Calculate the instrument measurement location in geographic coordinates.
- Parameters:
- inst : pysat.Instrument
JRO ISR Instrument object
- Raises:
- ValueError
If no appropriate azimuth and elevation angles are found, if no range variable is found, or if multiple range variables are found.
General
Supports Madrigal data access.
General routines for integrating CEDAR Madrigal instruments into pysat.
- pysatMadrigal.instruments.methods.general.build_madrigal_datetime_index(mad_data)
Create a datetime index using standard Madrigal variables.
- Parameters:
- mad_data : pds.DataFrame
Madrigal data, expects time variables 'year', 'month', 'day', 'hour', 'min', and 'sec'
- Returns:
- data_time
Datetime index for use by pysat
- Raises:
- ValueError
If expected time variables are missing
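The idea of combining the six Madrigal time variables into one timestamp per sample, and failing when any of them is absent, can be sketched with the standard library alone. The `build_datetimes` helper and the dict-of-lists input are stand-ins chosen for illustration; the real routine works on a pandas DataFrame and returns an index suited to pysat.

```python
import datetime as dt

TIME_KEYS = ("year", "month", "day", "hour", "min", "sec")


def build_datetimes(mad_data):
    """Build a list of datetimes from Madrigal-style time columns."""
    missing = [key for key in TIME_KEYS if key not in mad_data]
    if missing:
        raise ValueError("missing time variables: {:}".format(missing))

    # Walk the columns row by row and assemble one datetime per sample
    columns = [mad_data[key] for key in TIME_KEYS]
    return [dt.datetime(*(int(val) for val in row)) for row in zip(*columns)]


# Hypothetical two-sample data set
sample = {"year": [2010, 2010], "month": [1, 1], "day": [1, 1],
          "hour": [0, 0], "min": [0, 1], "sec": [30, 0]}
times = build_datetimes(sample)
```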
- pysatMadrigal.instruments.methods.general.cedar_rules()
General acknowledgement statement for Madrigal data.
- Returns:
- ackn : str
String with general acknowledgement for all CEDAR Madrigal data
- pysatMadrigal.instruments.methods.general.convert_pandas_to_xarray(xarray_coords, data, time_ind)
Convert Madrigal HDF5/simple data from pandas to xarray.
- Parameters:
- xarray_coords : list or NoneType
List of keywords to use as coordinates if xarray output is desired instead of a pandas DataFrame. An xarray Dataset whose variables have different coordinate dimensions can be built by providing a dict inside the list instead of coordinate variable name strings. Each dict will have a tuple of coordinates as the key and a list of variable strings as the value. Empty list if None. For example, xarray_coords=[{('time',): ['year', 'doy'], ('time', 'gdalt'): ['data1', 'data2']}]. (default=None)
- data : pds.DataFrame
Data to be converted into the xarray format
- time_ind : pds.DatetimeIndex or NoneType
Time index for the data or None for no time index
- Returns:
- data : xr.Dataset
Data in the dataset format.
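The dict form of `xarray_coords` maps a tuple of coordinate names to the variables that share those dimensions. As a rough illustration of how that structure reads, the sketch below inverts it into a per-variable lookup; `dims_by_variable` is a hypothetical helper, not part of pysatMadrigal.

```python
# Same structure as the documented example
xarray_coords = [{("time",): ["year", "doy"],
                  ("time", "gdalt"): ["data1", "data2"]}]


def dims_by_variable(coords):
    """Map each data variable name to its tuple of coordinate names."""
    var_dims = {}
    for item in coords:
        # Only the dict entries carry a coordinates-to-variables mapping
        if isinstance(item, dict):
            for dims, var_names in item.items():
                for name in var_names:
                    var_dims[name] = tuple(dims)
    return var_dims


var_dims = dims_by_variable(xarray_coords)
```

Here 'year' and 'doy' vary only with time, while 'data1' and 'data2' vary with both time and 'gdalt'.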
- pysatMadrigal.instruments.methods.general.download(date_array, inst_code=None, kindat=None, data_path=None, user=None, password=None, url='http://cedar.openmadrigal.org', file_type='hdf5', **kwargs)
Download data from Madrigal.
- Parameters:
- date_array : array-like
List of datetimes to download data for. The sequence of dates need not be contiguous.
- inst_code : str
Madrigal instrument code(s), cast as a string. If multiple are used, separate them with commas. (default=None)
- kindat : str
Madrigal experiment code(s), cast as a string. If multiple are used, separate them with commas. (default=None)
- data_path : str
Path to directory to download data to. (default=None)
- user : str
User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error if user is not supplied. (default=None)
- password : str
Password for data download. (default=None)
- url : str
URL for Madrigal site (default='http://cedar.openmadrigal.org')
- file_type : str
File format for Madrigal data. Load routines currently only accept 'hdf5' and 'netCDF4', but any of the Madrigal options may be used here. (default='hdf5')
- **kwargs : dict
Additional kwarg catch, allows general use when tag/inst_id are not needed for a given instrument.
- Raises:
- ValueError
If the specified input type or Madrigal experiment codes are unknown
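Both `inst_code` and `kindat` accept several codes in a single comma-separated string. A minimal sketch of how such a string might be unpacked is shown below; the `split_codes` helper, the conversion to integers, and the code values are assumptions for illustration only.

```python
def split_codes(codes):
    """Split a comma-separated code string into a list of integers."""
    if codes is None:
        return []
    # Skip empty entries so a trailing comma does not break the conversion
    return [int(code) for code in codes.split(",") if code.strip()]


# Hypothetical code strings, not real Madrigal codes
single = split_codes("1234")
several = split_codes("1234, 5678,")
```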
- pysatMadrigal.instruments.methods.general.filter_data_single_date(inst)
Filter data to a single date.
- Parameters:
- inst : pysat.Instrument
Instrument object to which this routine should be attached
Warning
For the best performance, this function should be added first in the queue. This may be ensured by setting the default function in a pysat instrument file to this one.
To do this, within platform_name.py set preprocess at the top level.
preprocess = pysatMadrigal.instruments.methods.general.filter_data_single_date
Examples
This routine is intended to be added to the Instrument nanokernel processing queue via
inst = pysat.Instrument()
inst.custom_attach(filter_data_single_date)
This function will then be automatically applied to the Instrument object data on every load by the pysat nanokernel.
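What "filter data to a single date" amounts to can be sketched independently of pysat. In the sketch below, the `keep_single_date` helper and the sample values are hypothetical; the real routine operates on the Instrument object in place and takes the date from it.

```python
import datetime as dt


def keep_single_date(times, values, date):
    """Keep only the samples whose timestamp falls on `date`."""
    start = dt.datetime(date.year, date.month, date.day)
    stop = start + dt.timedelta(days=1)
    # Half-open interval: midnight of the next day belongs to the next date
    kept = [(tval, val) for tval, val in zip(times, values)
            if start <= tval < stop]
    return [tval for tval, _ in kept], [val for _, val in kept]


# A hypothetical file that spills over both edges of 2 January 2010
times = [dt.datetime(2010, 1, 1, 23, 59), dt.datetime(2010, 1, 2, 0, 0),
         dt.datetime(2010, 1, 2, 12, 0), dt.datetime(2010, 1, 3, 0, 0)]
values = [1.0, 2.0, 3.0, 4.0]
day_times, day_values = keep_single_date(times, values, dt.date(2010, 1, 2))
```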
- pysatMadrigal.instruments.methods.general.get_remote_filenames(inst_code=None, kindat='', user=None, password=None, web_data=None, url='http://cedar.openmadrigal.org', start=datetime.datetime(1900, 1, 1, 0, 0), stop=datetime.datetime(2024, 3, 15, 21, 9, 18, 994590), date_array=None)
Retrieve the remote filenames for a specified Madrigal experiment.
- Parameters:
- inst_code : str or NoneType
Madrigal instrument code(s), cast as a string. If multiple are used, separate them with commas. (default=None)
- kindat : str
Madrigal experiment code(s), cast as a string. If multiple are used, separate them with commas. If not supplied, all will be returned. (default='')
- data_path : str or NoneType
Path to directory to download data to. (default=None)
- user : str or NoneType
User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error if user is not supplied. (default=None)
- password : str or NoneType
Password for data download. (default=None)
- web_data : MadrigalData or NoneType
Open connection to Madrigal database or None (will initiate using url) (default=None)
- url : str
URL for Madrigal site (default='http://cedar.openmadrigal.org')
- start : dt.datetime or NoneType
Starting time for the file list; None reverts to the default (default=dt.datetime(1900, 1, 1))
- stop : dt.datetime or NoneType
Ending time for the file list; None reverts to the default (default=dt.datetime.utcnow())
- date_array : dt.datetime or NoneType
Array of datetimes to download data for. The sequence of dates need not be contiguous and will be used instead of start and stop if supplied. (default=None)
- Returns:
- files : madrigalWeb.madrigalWeb.MadrigalExperimentFile
Madrigal file object that contains remote experiment file data
- Raises:
- ValueError
If unexpected date_array input is supplied
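The fixed 2024 timestamp shown for `stop` in the signature appears to be the value of "now" captured when the module was imported for this documentation build. Since None is documented as reverting to the default, passing `stop=None` is one way to ask for a current end time. The general pattern can be sketched as follows; `resolve_time_range` is a hypothetical helper, not the library's code.

```python
import datetime as dt


def resolve_time_range(start=None, stop=None):
    """Return (start, stop), filling None with the documented defaults."""
    if start is None:
        start = dt.datetime(1900, 1, 1)
    if stop is None:
        # Evaluated on every call, so the value never goes stale
        stop = dt.datetime.now(dt.timezone.utc).replace(tzinfo=None)
    return start, stop


start, stop = resolve_time_range()
```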
- pysatMadrigal.instruments.methods.general.good_exp(exp, date_array=None)
Determine if a Madrigal experiment has good data for specified dates.
- Parameters:
- exp : MadrigalExperimentFile
MadrigalExperimentFile object
- date_array : list-like or NoneType
List of datetimes to download data for. The sequence of dates need not be contiguous. If None, then any valid experiment will be assumed to be valid. (default=None)
- Returns:
- gflag : boolean
True if good, False if bad
- pysatMadrigal.instruments.methods.general.known_madrigal_inst_codes(pandas_format=None)
Supply known Madrigal instrument codes with a brief description.
- Parameters:
- pandas_format : bool or NoneType
Separate instrument codes by time-series (True) or multi-dimensional data types (False) if a boolean is supplied, or supply all if NoneType (default=None)
- Returns:
- inst_codes : dict
Dictionary with string instrument code values as keys and a brief description of the corresponding instrument as the value.
- pysatMadrigal.instruments.methods.general.list_files(tag, inst_id, data_path, format_str=None, supported_tags=None, file_cadence=datetime.timedelta(days=1), two_digit_year_break=None, delimiter=None, file_type=None)
Create a pandas Series of every file for chosen Instrument data.
- Parameters:
- tag : str
Denotes type of file to load. Accepts strings corresponding to the appropriate Madrigal Instrument tags.
- inst_id : str
Specifies the instrument ID to load. Accepts strings corresponding to the appropriate Madrigal Instrument inst_ids.
- data_path : str
Path to data directory.
- format_str : str or NoneType
User specified file format. If None is specified, the default formats associated with the supplied tags are used. (default=None)
- supported_tags : dict or NoneType
Keys are inst_id, each containing a dict keyed by tag where the values are file format template strings. (default=None)
- file_cadence : dt.timedelta or pds.DateOffset
pysat assumes a daily file cadence, but some instrument data files contain longer periods of time. This parameter allows the specification of regular file cadences greater than or equal to a day (e.g., weekly, monthly, or yearly). (default=dt.timedelta(days=1))
- two_digit_year_break : int or NoneType
If filenames only store two digits for the year, then '1900' will be added for years >= two_digit_year_break and '2000' will be added for years < two_digit_year_break. If None, then four-digit years are assumed. (default=None)
- delimiter : str or NoneType
Delimiter string upon which files will be split (e.g., '.'). If None, filenames will be parsed presuming a fixed width format. (default=None)
- file_type : str or NoneType
File format for Madrigal data. Load routines currently accept 'hdf5', 'simple', and 'netCDF4', but any of the Madrigal options may be used here. If None, will look for all known file types. (default=None)
- Returns:
- out : pds.Series
A pandas Series containing the verified available files
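The `two_digit_year_break` rule is simple arithmetic, and a short sketch makes the boundary behaviour explicit. The `expand_two_digit_year` helper and the break value of 50 are hypothetical choices for illustration.

```python
def expand_two_digit_year(year, two_digit_year_break):
    """Expand a two-digit year using the documented break rule."""
    if year >= two_digit_year_break:
        return 1900 + year
    return 2000 + year


# With a break of 50, '98' and '50' are read as 19xx, '49' and '07' as 20xx
years = [expand_two_digit_year(yy, 50) for yy in (98, 50, 49, 7)]
```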
- pysatMadrigal.instruments.methods.general.list_remote_files(tag, inst_id, inst_code=None, kindats=None, user=None, password=None, supported_tags=None, url='http://cedar.openmadrigal.org', two_digit_year_break=None, start=datetime.datetime(1900, 1, 1, 0, 0), stop=datetime.datetime(2024, 3, 15, 21, 9, 18, 994595))
List files available from Madrigal.
- Parameters:
- tag : str
Denotes type of file to load. Accepts strings corresponding to the appropriate Madrigal Instrument tags.
- inst_id : str
Specifies the instrument ID to load. Accepts strings corresponding to the appropriate Madrigal Instrument inst_ids.
- inst_code : str or NoneType
Madrigal instrument code(s), cast as a string. If multiple are used, separate them with commas. (default=None)
- kindats : dict
Madrigal experiment codes, in a dict of dicts with inst_ids as top level keys and tags as second level keys with Madrigal experiment code(s) as values. These should be strings, with multiple codes separated by commas. (default=None)
- data_path : str or NoneType
Path to directory to download data to. (default=None)
- user : str or NoneType
User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error if user is not supplied. (default=None)
- password : str or NoneType
Password for data download. (default=None)
- supported_tags : dict or NoneType
Keys are inst_id, each containing a dict keyed by tag where the values are file format template strings. (default=None)
- url : str
URL for Madrigal site (default='http://cedar.openmadrigal.org')
- two_digit_year_break : int or NoneType
If filenames only store two digits for the year, then '1900' will be added for years >= two_digit_year_break and '2000' will be added for years < two_digit_year_break. (default=None)
- start : dt.datetime
Starting time for file list. (default=01-01-1900)
- stop : dt.datetime
Ending time for the file list. (default=time of run)
- Returns:
- pds.Series
A series of filenames, see pysat.utils.files.process_parsed_filenames for more information.
- Raises:
- ValueError
For missing kwarg input
- KeyError
For dictionary input missing requested tag/inst_id
Examples
This method is intended to be set in an instrument support file at the top level using functools.partial
list_remote_files = functools.partial(mad_meth.list_remote_files, supported_tags=supported_tags, inst_code=madrigal_inst_code, kindats=madrigal_tag)
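To see what the `functools.partial` binding achieves, the sketch below replaces the real routine with a stand-in that simply echoes its inputs. Every value here (the instrument code, the experiment code, the 'utd' tag, and the file format template) is hypothetical; only the nesting of `kindats` and `supported_tags` follows the parameter descriptions above.

```python
import functools


def list_remote_files(tag, inst_id, inst_code=None, kindats=None,
                      supported_tags=None):
    """Stand-in that echoes its inputs; the real routine queries Madrigal."""
    return {"tag": tag, "inst_id": inst_id, "inst_code": inst_code,
            "kindat": kindats[inst_id][tag],
            "format": supported_tags[inst_id][tag]}


# Hypothetical module-level values from an instrument support file
madrigal_inst_code = "1234"
madrigal_tag = {"": {"utd": "5678"}}
supported_tags = {"": {"utd": "example_{year:04d}{month:02d}{day:02d}.hdf5"}}

# Bind the instrument-specific inputs once, at the top level of the module
bound = functools.partial(list_remote_files, supported_tags=supported_tags,
                          inst_code=madrigal_inst_code, kindats=madrigal_tag)

# The caller now only needs to supply the tag and inst_id
result = bound("utd", "")
```

Binding the instrument-specific arguments this way leaves a function that only needs the tag and inst_id, which is the calling pattern the documented example sets up for pysat.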
- pysatMadrigal.instruments.methods.general.load(fnames, tag='', inst_id='', xarray_coords=None)
Load data from Madrigal into pandas or xarray.
- Parameters:
- fnames : array-like
Iterable of filename strings, full path, to data files to be loaded. This input is nominally provided by pysat itself.
- tag : str
Tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself and is not used here. (default='')
- inst_id : str
Instrument ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself, and is not used here. (default='')
- xarray_coords : list or NoneType
List of keywords to use as coordinates if xarray output is desired instead of a pandas DataFrame. An xarray Dataset whose variables have different coordinate dimensions can be built by providing a dict inside the list instead of coordinate variable name strings. Each dict will have a tuple of coordinates as the key and a list of variable strings as the value. Empty list if None. For example, xarray_coords=[{('time',): ['year', 'doy'], ('time', 'gdalt'): ['data1', 'data2']}]. (default=None)
- Returns:
- data : pds.DataFrame or xr.Dataset
A pandas DataFrame or xarray Dataset holding the data from the file
- meta : pysat.Meta
Metadata from the file, as well as default values from pysat
- Raises:
- ValueError
If data columns expected to create the time index are missing or if coordinates are not supplied for all data columns.
- pysatMadrigal.instruments.methods.general.madrigal_file_format_str(inst_code, strict=False, verbose=True)
Supply the file format string for a known Madrigal instrument code.
- Parameters:
- inst_code : int
Madrigal instrument code as an integer
- strict : bool
If True, returns only file formats that will definitely not have a problem being parsed by pysat. If False, will return any file format. (default=False)
- verbose : bool
If True raises logging warnings, if False does not log any warnings. (default=True)
- Returns:
- fstr : str
File formatting string that may or may not be parsable by pysat
- Raises:
- ValueError
If file formats with problems would be returned and strict is True.
- pysatMadrigal.instruments.methods.general.sort_file_formats(fnames)
Separate filenames by file format type.
- Parameters:
- fnames : array-like
Iterable of filename strings, full path, to data files to be loaded. This input is nominally provided by pysat itself.
- Returns:
- load_file_types : dict
A dictionary with file types as keys and a list of filenames for each file type.
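The shape of the returned dictionary can be illustrated with a small stand-alone sketch. The `sort_by_extension` helper, the extension-to-type mapping, and the 'unknown' fallback are assumptions made for illustration; the real routine may identify file types differently.

```python
import os


def sort_by_extension(fnames):
    """Group filenames by a file type inferred from the extension."""
    # Assumed extension-to-type mapping, for illustration only
    ext_types = {".hdf5": "hdf5", ".h5": "hdf5", ".nc": "netCDF4",
                 ".txt": "simple"}
    load_file_types = {}
    for fname in fnames:
        ext = os.path.splitext(fname)[1].lower()
        ftype = ext_types.get(ext, "unknown")
        load_file_types.setdefault(ftype, []).append(fname)
    return load_file_types


# Hypothetical file paths
sorted_files = sort_by_extension(["/data/a.hdf5", "/data/b.nc",
                                  "/data/c.hdf5"])
```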