Contexts
Contexts are a strax class used everywhere in straxen.
Below, all of the context functions are shown, including the mini-analyses.
Contexts documentation
Auto-generated documentation of all the context functions, including the mini-analyses
- class strax.context.Context(storage=None, config=None, register=None, register_all=None, **kwargs)[source]
Bases:
object
Context for strax analysis.
A context holds info on HOW to process data, such as which plugins provide what data types, where to store which results, and configuration options for the plugins.
You start all strax processing through a context.
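To make the idea concrete, here is a minimal toy stand-in (not strax's actual implementation) for what a context keeps track of: which plugin provides which data type, where results go, and the plugin configuration. All names below (ToyContext, ToyPeaks) are hypothetical.

```python
# Toy stand-in for the idea of a strax Context; illustrative only.
class ToyContext:
    def __init__(self, storage=None, config=None):
        self.storage = storage or []      # where results are read/written
        self.config = dict(config or {})  # options passed to plugins
        self._plugins = {}                # data_type -> plugin providing it

    def register(self, plugin_class):
        # A plugin declares which data type it provides.
        self._plugins[plugin_class.provides] = plugin_class
        return plugin_class

    def provided_by(self, data_type):
        return self._plugins[data_type].__name__


class ToyPeaks:
    provides = "peaks"


ctx = ToyContext(storage=["./data"], config={"gain": 1.5})
ctx.register(ToyPeaks)
```

In real straxen you would instead instantiate one of the predefined contexts and call its `get_array`/`get_df` methods, as documented below.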
- accumulate(run_id: str, targets: Tuple[str] | List[str], fields=None, function=None, store_first_for_others=True, function_takes_fields=False, **kwargs)[source]
Return a dictionary with the sum of the result of get_array.
- Parameters:
function –
Apply this function to the array before summing the results. Will be called as function(array), where array is a chunk of the get_array result. Should return either:
A scalar or 1d array -> accumulated result saved under ‘result’
A record array or dict -> fields accumulated individually
None -> nothing accumulated
If not provided, the identity function is used.
NB: Additionally and independently, if there are any functions registered under context_config[‘apply_data_function’] these are applied first directly after loading the data.
fields – Fields of the function output to accumulate. If not provided, all output fields will be accumulated.
store_first_for_others – if True (default), for fields included in the data but not fields, store the first value seen in the data (if any value is seen).
function_takes_fields – If True, function will be called as function(data, fields) instead of function(data).
All other options are as for get_iter.
- Return dictionary:
Dictionary with the accumulated result; see function and store_first_for_others arguments. Four fields are always added:
start: start time of the first processed chunk
end: end time of the last processed chunk
n_chunks: number of chunks in run
n_rows: number of data entries in run
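The chunk-wise accumulation can be sketched in plain Python (an illustrative approximation of the semantics, not strax's code; `toy_accumulate` and the field names are hypothetical):

```python
# Sketch of accumulate's semantics: iterate over chunks, apply `function`
# if given, and sum the resulting fields, tracking chunk/row counts.
def toy_accumulate(chunks, function=None, fields=None):
    result = {"n_chunks": 0, "n_rows": 0}
    for chunk in chunks:                        # chunk: dict of field -> list
        out = function(chunk) if function else chunk
        keys = fields if fields is not None else out.keys()
        for k in keys:
            result[k] = result.get(k, 0) + sum(out[k])
        result["n_chunks"] += 1
        result["n_rows"] += len(next(iter(chunk.values())))
    return result


chunks = [{"area": [1.0, 2.0]}, {"area": [3.0]}]
acc = toy_accumulate(chunks)   # acc["area"] accumulates across chunks
```

The real method additionally records `start` and `end` times from the chunk metadata.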
- apply_cmt_version(cmt_global_version: str) None
Sets all the relevant correction variables.
- Parameters:
cmt_global_version – A specific CMT global version, or ‘latest’ to get the newest one
- available_for_run(run_id: str, include_targets: None | list | tuple | str = None, exclude_targets: None | list | tuple | str = None, pattern_type: str = 'fnmatch') DataFrame
For a given single run, check for all targets whether they are stored. Excludes targets that are never stored anyway.
- Parameters:
run_id – requested run
include_targets – targets to include, e.g. raw_records, raw_records* or *_nv. If multiple targets (e.g. a list) are provided, the target should match any of the arguments!
exclude_targets – targets to exclude, e.g. raw_records, raw_records* or *_nv. If multiple targets (e.g. a list) are provided, the target should match none of the arguments!
pattern_type – either ‘fnmatch’ (Unix filename pattern matching) or ‘re’ (Regular expression operations).
- Returns:
Table of available data per target
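The include/exclude pattern logic can be sketched with the stdlib fnmatch module (assumed behavior; the `select_targets` helper below is hypothetical, not part of straxen):

```python
import fnmatch

# Sketch: a target is kept if it matches ANY include pattern
# and NO exclude pattern (Unix filename pattern matching).
def select_targets(targets, include=None, exclude=None):
    kept = []
    for t in targets:
        if include and not any(fnmatch.fnmatch(t, p) for p in include):
            continue
        if exclude and any(fnmatch.fnmatch(t, p) for p in exclude):
            continue
        kept.append(t)
    return kept


targets = ["raw_records", "raw_records_nv", "peak_basics", "events_nv"]
sel = select_targets(targets, include=["*_nv"], exclude=["raw_records*"])
```

With `pattern_type='re'` the real method uses regular expressions instead of fnmatch patterns.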
- compare_metadata(data1, data2, return_results=False)[source]
Compare the metadata between two strax data.
- Parameters:
data1, data2 – each is either a (run_id, target) tuple/list, a path to a metadata file, or a dictionary of the metadata to compare
return_results – bool; if True, returns a dictionary with the metadata and lineages that are found for the inputs and does not do the comparison
- example usage:
context.compare_metadata(("053877", "peak_basics"), "./my_path_to/JSONfile.json")
first_metadata = context.get_metadata(run_id, "events")
context.compare_metadata(("053877", "peak_basics"), first_metadata)
context.compare_metadata(("053877", "records"), ("053899", "records"))
results_dict = context.compare_metadata(("053877", "peak_basics"), ("053877", "events_info"), return_results=True)
- copy_to_frontend(run_id: str, target: str, target_frontend_id: int | None = None, target_compressor: str | None = None, rechunk: bool = False, rechunk_to_mb: int = 200)[source]
Copy data from one frontend to another.
- Parameters:
run_id – run_id
target – target datakind
target_frontend_id – index of the frontend that the data should go to in context.storage. If no index is specified, try all.
target_compressor – if specified, recompress with this compressor.
rechunk – allow re-chunking for saving
rechunk_to_mb – rechunk to specified target size. Only works if rechunk is True.
- daq_plot(run_id: str, **kwargs)
Plot with peaks, records, and records sorted by "link" or "ADC ID" (other items are also possible as long as they are in the channel map). This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: . Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array:
- Parameters:
selection – Query string, sequence of strings, or simple function to apply. The function must take a single argument which represents the structure numpy array of the loaded data.
selection_str – Same as selection (deprecated)
keep_columns – Array field/dataframe column names to keep. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
drop_columns – Array field/dataframe column names to drop. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
time_range – (start, stop) range to load, in ns since the epoch
seconds_range – (start, stop) range of seconds since the start of the run to load.
time_within – row of strax data (e.g. event) to use as time range
time_selection – Kind of time selection to apply: - fully_contained: (default) select things fully contained in the range - touching: select things that (partially) overlap with the range - skip: Do not select a time range, even if other arguments say so
_chunk_number – For internal use: return data from one chunk.
progress_bar – Display a progress bar if metadata exists.
multi_run_progress_bar – Display a progress bar for loading multiple runs
- data_info(data_name: str) DataFrame [source]
Return pandas DataFrame describing fields in data_name.
- define_run(name: str, data: ndarray | DataFrame | dict | list | tuple, from_run: str | None = None)
Function for defining new superruns from a list of run_ids.
- Note:
The function also allows creating a superrun from data (numpy.arrays/pandas.DataFrames). However, this is currently not supported from the data loading side.
- Parameters:
name – Name/run_id of the superrun. Superrun names must start with an underscore.
data – Data from which the superrun should be created. Can be either one of the following: a tuple/list of run_ids or a numpy.array/pandas.DataFrame containing some data.
from_run – List of run_ids which were used to create the numpy.array/pandas.DataFrame passed in data.
- dependency_tree(target='event_info', dump_plot=True, to_dir='./', format='svg')
- deregister_plugins_with_missing_dependencies()[source]
Deregister plugins in case a data_type the plugin depends on is not provided by any other plugin.
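The pruning idea can be sketched as repeatedly dropping plugins whose dependencies are unsatisfied (illustrative only; `prune` and the plugin names are hypothetical):

```python
# Sketch of dependency pruning: repeatedly drop any plugin whose
# dependencies are not provided by a registered plugin, until stable.
def prune(plugins):
    # plugins: dict of data_type -> tuple of data_types it depends on
    changed = True
    while changed:
        changed = False
        provided = set(plugins)
        for name, deps in list(plugins.items()):
            if any(d not in provided for d in deps):
                del plugins[name]
                changed = True
    return plugins


plugins = {
    "records": (),
    "peaks": ("records",),
    "events": ("peaks", "veto"),   # 'veto' is provided by nothing
}
pruned = prune(plugins)
```

The loop repeats because removing one plugin can orphan another that depended on it.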
- estimate_run_start_and_end(run_id, targets=None)[source]
Return run start and end time in ns since epoch.
This fetches from run metadata, and if this fails, it estimates it using data metadata from the targets or the underlying data-types (if it is stored).
- event_display(run_id: str, **kwargs)
- Make a waveform display of a given event. Requires events, peaks and peaklets (optionally: records). NB: the time selection should return only one event!
- Parameters:
context – strax.Context provided by the minianalysis wrapper
run_id – run-id of the event
events – events, provided by the minianalysis wrapper
to_pe – gains, provided by the minianalysis wrapper
records_matrix – False (no record matrix), True, or “raw” (show raw-record matrix)
s2_fuzz – extra time around main S2 [ns]
s1_fuzz – extra time around main S1 [ns]
max_peaks – max peaks for plotting in the wf plot
xenon1t – True: is 1T, False: is nT
display_peak_info – tuple, items that will be extracted from event and displayed in the event info panel see above for format
display_event_info – tuple, items that will be extracted from event and displayed in the peak info panel see above for format
s1_hp_kwargs – dict, optional kwargs for S1 hitpatterns
s2_hp_kwargs – dict, optional kwargs for S2 hitpatterns
event_time_limit – overrides x-axis limits of the event plot
plot_all_positions – if True, plot best-fit positions from all posrec algorithms
- Returns:
axes used for plotting: ax_s1, ax_s2, ax_s1_hp_t, ax_s1_hp_b, ax_event_info, ax_peak_info, ax_s2_hp_t, ax_s2_hp_b, ax_ev, ax_rec Where those panels (axes) are:
ax_s1, main S1 peak
ax_s2, main S2 peak
ax_s1_hp_t, S1 top hit pattern
ax_s1_hp_b, S1 bottom hit pattern
ax_s2_hp_t, S2 top hit pattern
ax_s2_hp_b, S2 bottom hit pattern
ax_event_info, text info on the event
ax_peak_info, text info on the main S1 and S2
ax_ev, waveform of the entire event
ax_rec, (raw)record matrix (if any otherwise None)
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: event_info. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array:
- Parameters:
selection – Query string, sequence of strings, or simple function to apply. The function must take a single argument which represents the structure numpy array of the loaded data.
selection_str – Same as selection (deprecated)
keep_columns – Array field/dataframe column names to keep. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
drop_columns – Array field/dataframe column names to drop. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
time_range – (start, stop) range to load, in ns since the epoch
seconds_range – (start, stop) range of seconds since the start of the run to load.
time_within – row of strax data (e.g. event) to use as time range
time_selection – Kind of time selection to apply: - fully_contained: (default) select things fully contained in the range - touching: select things that (partially) overlap with the range - skip: Do not select a time range, even if other arguments say so
_chunk_number – For internal use: return data from one chunk.
progress_bar – Display a progress bar if metadata exists.
multi_run_progress_bar – Display a progress bar for loading multiple runs
- event_display_interactive(run_id: str, **kwargs)
Interactive event display for XENONnT. Plots detailed main/alt S1/S2, bottom and top PMT hit pattern as well as all other peaks in a given event.
- Parameters:
bottom_pmt_array – If true plots bottom PMT array hit-pattern.
only_main_peaks – If true plots only main peaks into detail plots as well as PMT arrays.
only_peak_detail_in_wf – Only plots main/alt S1/S2 into waveform. Only plot main peaks if only_main_peaks is true.
plot_all_pmts – Bool if True, colors switched off PMTs instead of showing them in gray, useful for graphs shown in talks.
plot_record_matrix – If true, the record matrix is plotted below the waveform.
plot_records_threshold – Threshold at which zoom level to display record matrix as polygons. Larger values may lead to longer render times since more polygons are shown.
xenon1t – Flag to use event display with 1T data.
colors – Colors to be used for peaks. Order is as peak types, 0 = Unknown, 1 = S1, 2 = S2. Can be any colors accepted by bokeh.
yscale – Defines scale for main/alt S1 == 0, main/alt S2 == 1, waveform plot == 2. Note that the log scale can lead to funny glyph renders for small values.
log – If true, a log color scale is used for the hitpattern plots.
example:
from IPython.core.display import display, HTML
display(HTML("<style>.container { width:80% !important; }</style>"))
import bokeh.plotting as bklt
fig = st.event_display_interactive(run_id, time_range=(event['time'], event['endtime']))
bklt.show(fig)
- Raises:
Raises an error if the user queries a time range which contains more than a single event.
- Returns:
bokeh.plotting.figure instance.
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: event_basics, peaks, peak_basics, peak_positions. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array:
- Parameters:
selection – Query string, sequence of strings, or simple function to apply. The function must take a single argument which represents the structure numpy array of the loaded data.
selection_str – Same as selection (deprecated)
keep_columns – Array field/dataframe column names to keep. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
drop_columns – Array field/dataframe column names to drop. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
time_range – (start, stop) range to load, in ns since the epoch
seconds_range – (start, stop) range of seconds since the start of the run to load.
time_within – row of strax data (e.g. event) to use as time range
time_selection – Kind of time selection to apply: - fully_contained: (default) select things fully contained in the range - touching: select things that (partially) overlap with the range - skip: Do not select a time range, even if other arguments say so
_chunk_number – For internal use: return data from one chunk.
progress_bar – Display a progress bar if metadata exists.
multi_run_progress_bar – Display a progress bar for loading multiple runs
- event_display_simple(run_id: str, **kwargs)
Straxen mini-analysis for which someone was too lazy to write a proper docstring. This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: event_info. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array:
- Parameters:
selection – Query string, sequence of strings, or simple function to apply. The function must take a single argument which represents the structure numpy array of the loaded data.
selection_str – Same as selection (deprecated)
keep_columns – Array field/dataframe column names to keep. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
drop_columns – Array field/dataframe column names to drop. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
time_range – (start, stop) range to load, in ns since the epoch
seconds_range – (start, stop) range of seconds since the start of the run to load.
time_within – row of strax data (e.g. event) to use as time range
time_selection – Kind of time selection to apply: - fully_contained: (default) select things fully contained in the range - touching: select things that (partially) overlap with the range - skip: Do not select a time range, even if other arguments say so
_chunk_number – For internal use: return data from one chunk.
progress_bar – Display a progress bar if metadata exists.
multi_run_progress_bar – Display a progress bar for loading multiple runs
- event_scatter(run_id: str, **kwargs)
Plot a (cS1, cS2) event scatter plot.
- Parameters:
show_single – Show events with only S1s or only S2s just besides the axes.
s – Scatter size
color_dim – Dimension to use for the color. Must be in event_info.
color_range – Minimum and maximum color value to show.
figsize – (w, h) figure size to use, or leave None to not make a new matplotlib figure.
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: event_info. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array:
- Parameters:
selection – Query string, sequence of strings, or simple function to apply. The function must take a single argument which represents the structure numpy array of the loaded data.
selection_str – Same as selection (deprecated)
keep_columns – Array field/dataframe column names to keep. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
drop_columns – Array field/dataframe column names to drop. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
time_range – (start, stop) range to load, in ns since the epoch
seconds_range – (start, stop) range of seconds since the start of the run to load.
time_within – row of strax data (e.g. event) to use as time range
time_selection – Kind of time selection to apply: - fully_contained: (default) select things fully contained in the range - touching: select things that (partially) overlap with the range - skip: Do not select a time range, even if other arguments say so
_chunk_number – For internal use: return data from one chunk.
progress_bar – Display a progress bar if metadata exists.
multi_run_progress_bar – Display a progress bar for loading multiple runs
- extract_latest_comment()
Extract the latest comment in the runs-database. This just adds info to st.runs.
- Example:
st.extract_latest_comment()
st.select_runs(available=('raw_records'))
- get_array(run_id: str | tuple | list, targets, save=(), max_workers=None, **kwargs) ndarray [source]
Compute target for run_id and return as numpy array.
- Parameters:
run_id – run id to get
targets – list/tuple of strings of data type names to get
save – extra data types you would like to save to cache, if they occur in intermediate computations. Many plugins save automatically anyway.
max_workers – Number of worker threads/processes to spawn. In practice more CPUs may be used due to strax’s multithreading.
allow_multiple – Allow multiple targets to be computed simultaneously without merging the results of the target. This can be used when mass producing plugins that are not of the same datakind. Don’t try to use this in get_array or get_df because the data is not returned.
add_run_id_field – Boolean whether to add a run_id field in case of multi-runs.
run_id_as_bytes – Boolean if true uses byte string instead of an unicode string added to a multi-run array. This can save a lot of memory when loading many runs.
selection – Query string, sequence of strings, or simple function to apply. The function must take a single argument which represents the structure numpy array of the loaded data.
selection_str – Same as selection (deprecated)
keep_columns – Array field/dataframe column names to keep. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
drop_columns – Array field/dataframe column names to drop. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
time_range – (start, stop) range to load, in ns since the epoch
seconds_range – (start, stop) range of seconds since the start of the run to load.
time_within – row of strax data (e.g. event) to use as time range
time_selection – Kind of time selection to apply: - fully_contained: (default) select things fully contained in the range - touching: select things that (partially) overlap with the range - skip: Do not select a time range, even if other arguments say so
_chunk_number – For internal use: return data from one chunk.
progress_bar – Display a progress bar if metadata exists.
multi_run_progress_bar – Display a progress bar for loading multiple runs
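The three time_selection modes can be sketched on plain (start, end) intervals (assumed semantics; `select_time` is a hypothetical helper, not part of strax):

```python
# Sketch of the time_selection modes: each row is a (start, end)
# interval in ns, selected against a (t0, t1) query range.
def select_time(rows, time_range, mode="fully_contained"):
    t0, t1 = time_range
    if mode == "skip":
        return list(rows)                                  # no time cut
    if mode == "fully_contained":
        return [r for r in rows if r[0] >= t0 and r[1] <= t1]
    if mode == "touching":
        return [r for r in rows if r[1] > t0 and r[0] < t1]
    raise ValueError(f"unknown mode {mode!r}")


rows = [(0, 10), (5, 15), (20, 30)]
contained = select_time(rows, (0, 12))                     # strict containment
touching = select_time(rows, (0, 12), mode="touching")     # any overlap
```

Note the asymmetry: "touching" keeps partially overlapping rows, which matters for peaks spanning a chunk boundary.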
- get_components(run_id: str, targets=(), save=(), time_range=None, chunk_number=None) ProcessorComponents [source]
Return components for setting up a processor.
- Parameters:
run_id – run id to get
targets – list/tuple of strings of data type names to get
save – extra data types you would like to save to cache, if they occur in intermediate computations. Many plugins save automatically anyway.
max_workers – Number of worker threads/processes to spawn. In practice more CPUs may be used due to strax’s multithreading.
allow_multiple – Allow multiple targets to be computed simultaneously without merging the results of the target. This can be used when mass producing plugins that are not of the same datakind. Don’t try to use this in get_array or get_df because the data is not returned.
add_run_id_field – Boolean whether to add a run_id field in case of multi-runs.
run_id_as_bytes – Boolean if true uses byte string instead of an unicode string added to a multi-run array. This can save a lot of memory when loading many runs.
selection – Query string, sequence of strings, or simple function to apply. The function must take a single argument which represents the structure numpy array of the loaded data.
selection_str – Same as selection (deprecated)
keep_columns – Array field/dataframe column names to keep. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
drop_columns – Array field/dataframe column names to drop. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
time_range – (start, stop) range to load, in ns since the epoch
seconds_range – (start, stop) range of seconds since the start of the run to load.
time_within – row of strax data (e.g. event) to use as time range
time_selection – Kind of time selection to apply: - fully_contained: (default) select things fully contained in the range - touching: select things that (partially) overlap with the range - skip: Do not select a time range, even if other arguments say so
_chunk_number – For internal use: return data from one chunk.
progress_bar – Display a progress bar if metadata exists.
multi_run_progress_bar – Display a progress bar for loading multiple runs
- get_df(run_id: str | tuple | list, targets, save=(), max_workers=None, **kwargs) DataFrame [source]
Compute target for run_id and return as pandas DataFrame.
- Parameters:
run_id – run id to get
targets – list/tuple of strings of data type names to get
save – extra data types you would like to save to cache, if they occur in intermediate computations. Many plugins save automatically anyway.
max_workers – Number of worker threads/processes to spawn. In practice more CPUs may be used due to strax’s multithreading.
allow_multiple – Allow multiple targets to be computed simultaneously without merging the results of the target. This can be used when mass producing plugins that are not of the same datakind. Don’t try to use this in get_array or get_df because the data is not returned.
add_run_id_field – Boolean whether to add a run_id field in case of multi-runs.
run_id_as_bytes – Boolean if true uses byte string instead of an unicode string added to a multi-run array. This can save a lot of memory when loading many runs.
selection – Query string, sequence of strings, or simple function to apply. The function must take a single argument which represents the structure numpy array of the loaded data.
selection_str – Same as selection (deprecated)
keep_columns – Array field/dataframe column names to keep. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
drop_columns – Array field/dataframe column names to drop. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
time_range – (start, stop) range to load, in ns since the epoch
seconds_range – (start, stop) range of seconds since the start of the run to load.
time_within – row of strax data (e.g. event) to use as time range
time_selection – Kind of time selection to apply: - fully_contained: (default) select things fully contained in the range - touching: select things that (partially) overlap with the range - skip: Do not select a time range, even if other arguments say so
_chunk_number – For internal use: return data from one chunk.
progress_bar – Display a progress bar if metadata exists.
multi_run_progress_bar – Display a progress bar for loading multiple runs
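Conceptually, get_df is get_array followed by a DataFrame conversion. A sketch with a fake structured array standing in for strax output (the field names here are illustrative):

```python
import numpy as np
import pandas as pd

# Fake structured array playing the role of what get_array returns.
events = np.zeros(2, dtype=[("time", np.int64), ("cs1", np.float64)])
events["time"] = [10, 20]
events["cs1"] = [1.5, 2.5]

# get_df essentially hands this back as a pandas DataFrame,
# with one column per field of the structured array.
df = pd.DataFrame(events)
```

This is why the keep_columns/drop_columns selection arguments apply equally to arrays and DataFrames: both are views of the same field structure.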
- get_iter(run_id: str, targets: Tuple[str] | List[str], save=(), max_workers=None, time_range=None, seconds_range=None, time_within=None, time_selection='fully_contained', selection=None, selection_str=None, keep_columns=None, drop_columns=None, allow_multiple=False, progress_bar=True, _chunk_number=None, **kwargs) Iterator[Chunk] [source]
Compute target for run_id and iterate over results.
Do NOT interrupt the iterator (i.e. break): it will keep running stuff in background threads…
- Parameters:
run_id – run id to get
targets – list/tuple of strings of data type names to get
save – extra data types you would like to save to cache, if they occur in intermediate computations. Many plugins save automatically anyway.
max_workers – Number of worker threads/processes to spawn. In practice more CPUs may be used due to strax’s multithreading.
allow_multiple – Allow multiple targets to be computed simultaneously without merging the results of the target. This can be used when mass producing plugins that are not of the same datakind. Don’t try to use this in get_array or get_df because the data is not returned.
add_run_id_field – Boolean whether to add a run_id field in case of multi-runs.
run_id_as_bytes – Boolean if true uses byte string instead of an unicode string added to a multi-run array. This can save a lot of memory when loading many runs.
selection – Query string, sequence of strings, or simple function to apply. The function must take a single argument which represents the structure numpy array of the loaded data.
selection_str – Same as selection (deprecated)
keep_columns – Array field/dataframe column names to keep. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
drop_columns – Array field/dataframe column names to drop. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
time_range – (start, stop) range to load, in ns since the epoch
seconds_range – (start, stop) range of seconds since the start of the run to load.
time_within – row of strax data (e.g. event) to use as time range
time_selection – Kind of time selection to apply: - fully_contained: (default) select things fully contained in the range - touching: select things that (partially) overlap with the range - skip: Do not select a time range, even if other arguments say so
_chunk_number – For internal use: return data from one chunk.
progress_bar – Display a progress bar if metadata exists.
multi_run_progress_bar – Display a progress bar for loading multiple runs
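Chunked iteration can be sketched with a plain generator (`toy_get_iter` is hypothetical); the point is to consume every chunk rather than break out early, since the real iterator keeps worker threads running in the background:

```python
# Sketch of chunked iteration: yield fixed-size slices so data larger
# than memory can be processed one chunk at a time.
def toy_get_iter(data, chunk_size):
    for i in range(0, len(data), chunk_size):
        yield data[i:i + chunk_size]


total = 0.0
for chunk in toy_get_iter([1.0, 2.0, 3.0, 4.0, 5.0], chunk_size=2):
    total += sum(chunk)   # consume every chunk; do not break early
```

Streaming a reduction like this is the typical reason to prefer get_iter over get_array.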
- get_meta(run_id, target) dict [source]
Return metadata for target for run_id, or raise DataNotAvailable if data is not yet available.
- Parameters:
run_id – run id to get
target – data type to get
- get_metadata(run_id, target) dict
Return metadata for target for run_id, or raise DataNotAvailable if data is not yet available.
- Parameters:
run_id – run id to get
target – data type to get
- get_save_when(target: str) SaveWhen | int [source]
For a given plugin, get the save when attribute either being a dict or a number.
- get_single_plugin(run_id, data_name)[source]
Return a single fully initialized plugin that produces data_name for run_id.
For use in custom processing.
- get_source(run_id: str, target: str, check_forbidden: bool = True) set | None [source]
For a given run_id and target, get the stored bases from which we can start processing; if no base is available, return None.
- Parameters:
run_id – run_id
target – target
check_forbidden – Check that we are not requesting to make a plugin that is forbidden by the context to be created.
- Returns:
set of plugin names that are needed to start processing from and are needed in order to build this target.
- get_source_sf(run_id, target, should_exist=False)[source]
Get the source storage frontends for a given run_id and target.
- Parameters:
target (run_id,) – run_id, target
should_exist – Raise a ValueError if we cannot find one (e.g. we already checked the data is stored)
- Returns:
list of strax.StorageFrontend (when should_exist is False)
- get_zarr(run_ids, targets, storage='./strax_temp_data', progress_bar=False, overwrite=True, **kwargs)[source]
Get persistent arrays using zarr. This is useful when loading large amounts of data that cannot fit in memory; zarr is very compatible with dask. Targets are loaded into separate arrays and runs are merged. The data is added to any existing data at the storage location.
- Parameters:
run_ids – (Iterable) Run ids you wish to load.
targets – (Iterable) targets to load.
storage – (str, optional) fsspec path to store array. Defaults to ‘./strax_temp_data’.
overwrite – (boolean, optional) whether to overwrite existing arrays for targets at given path.
- Return zarr.Group:
zarr group containing the persistent arrays available at the storage location after loading the requested data. The runs loaded into a given array can be seen in the array's .attrs['RUNS'] field.
- hvdisp_plot_peak_waveforms(run_id: str, **kwargs)
Plot the sum waveforms of peaks.
- Parameters:
width – Plot width in pixels
show_largest – Maximum number of peaks to show
time_dim – Holoviews time dimension; will create a new one if not provided.
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: peaks, peak_basics. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array:
- Parameters:
selection – Query string, sequence of strings, or simple function to apply. The function must take a single argument which represents the structure numpy array of the loaded data.
selection_str – Same as selection (deprecated)
keep_columns – Array field/dataframe column names to keep. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
drop_columns – Array field/dataframe column names to drop. Useful to reduce amount of data in memory. (You can only specify either keep or drop column.)
time_range – (start, stop) range to load, in ns since the epoch
seconds_range – (start, stop) range of seconds since the start of the run to load.
time_within – row of strax data (e.g. event) to use as time range
time_selection – Kind of time selection to apply: - fully_contained: (default) select things fully contained in the range - touching: select things that (partially) overlap with the range - skip: Do not select a time range, even if other arguments say so
_chunk_number – For internal use: return data from one chunk.
progress_bar – Display a progress bar if metadata exists.
multi_run_progress_bar – Display a progress bar for loading multiple runs
- hvdisp_plot_pmt_pattern(run_id: str, **kwargs)
Plot a PMT array, with colors showing the intensity of light observed in the time range.
- Parameters:
array – ‘top’ or ‘bottom’, array to show.
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: records. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array; see the full list of selection parameters above.
- hvdisp_plot_records_2d(run_id: str, **kwargs)
Plot records in a dynamic 2D histogram of (time, pmt)
- Parameters:
width – Plot width in pixels
time_stream – holoviews RangeX stream to use. If provided, we assume records have already been converted to points (which, hopefully, is what the stream was derived from).
tools – Tools to use in the interactive plot. Only works with bokeh as the plot library.
plot_library – Library to use for plotting (default: bokeh).
width – Width of the record matrix in pixels.
hooks – Hooks to adjust plot settings.
- Returns:
datashader object, records holoview points, RangeX time stream of records.
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: records. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array; see the full list of selection parameters above.
- is_stored(run_id, target, detailed=False, **kwargs)[source]
Return whether data type target has been saved for run_id through any of the registered storage frontends.
Note that even if False is returned, the data type may still be made with a trivial computation.
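The lookup pattern behind is_stored can be sketched as follows; ToyFrontend and its has method are hypothetical stand-ins for the registered storage frontends, not the real strax API:

```python
class ToyFrontend:
    """Hypothetical storage frontend that knows a fixed set of stored keys."""

    def __init__(self, stored_keys):
        self._stored = set(stored_keys)

    def has(self, key):
        return key in self._stored


def is_stored(frontends, run_id, target):
    """True if any registered frontend has data for (run_id, target)."""
    key = f"{run_id}-{target}"
    return any(f.has(key) for f in frontends)


frontends = [ToyFrontend({"run0-peaks"}), ToyFrontend({"run1-events"})]
print(is_stored(frontends, "run0", "peaks"))   # True: first frontend has it
print(is_stored(frontends, "run0", "events"))  # False, though it may still be computable
```

As the note above says, a False answer only means no frontend has the data on disk; the target may still be produced by a cheap computation.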
- key_for(run_id, target)[source]
Get the DataKey for a given run and a given target plugin. The DataKey is inferred from the plugin lineage, which can either come from the _fixed_plugin_cache or be computed on the fly.
- Parameters:
run_id – run id to get
target – data type to get
- Returns:
strax.DataKey of the target
- keys_for_runs(target: str, run_ids: ndarray | list | tuple | str) List[DataKey]
Get the data-keys for a multitude of runs. If use_per_run_defaults is False (which it preferably is, see #246), getting many keys should be fast, as we only compute the lineage once.
- Parameters:
run_ids – Runs to get datakeys for
target – datatype requested
- Returns:
list of datakeys of the target for the given runs.
- lineage(run_id, data_type)[source]
Return lineage dictionary for data_type and run_id, based on the options in this context.
- list_available(target, runs=None, **kwargs) list
Return sorted list of run_id’s for which target is available.
- Parameters:
target – Data type to check
runs – Runs to check. If None, check all runs.
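The filtering that list_available performs can be pictured with a toy availability lookup; the stored set here fakes the storage frontends the real context would consult:

```python
def list_available(runs, target, stored):
    """Sorted run ids for which (run, target) is in the toy stored set."""
    return sorted(r for r in runs if (r, target) in stored)

# Pretend storage: which (run_id, data type) pairs exist on disk.
stored = {("007", "peaks"), ("003", "peaks"), ("005", "events")}
print(list_available(["003", "005", "007"], "peaks", stored))  # ['003', '007']
```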
- load_corrected_positions(run_id: str, **kwargs)
Returns the corrected position for each position algorithm available, without the need to reprocess event_basics, as the needed information is already stored in event_basics.
- Parameters:
alt_s1 – False by default, if True it uses alternative S1 as main one
alt_s2 – False by default, if True it uses alternative S2 as main one
cmt_version – CMT version to use (it can be a list of same length as posrec_algos, if different versions are required for different posrec algorithms, default ‘local_ONLINE’)
posrec_algos – list of position reconstruction algorithms to use (default [‘mlp’, ‘gcn’, ‘cnn’])
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: event_basics. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array; see the full list of selection parameters above.
- make(run_id: str | tuple | list, targets, save=(), max_workers=None, _skip_if_built=True, **kwargs) None [source]
Compute target for run_id. Returns nothing (None).
- Parameters:
run_id – run id to get
targets – list/tuple of strings of data type names to get
save – extra data types you would like to save to cache, if they occur in intermediate computations. Many plugins save automatically anyway.
max_workers – Number of worker threads/processes to spawn. In practice more CPUs may be used due to strax’s multithreading.
allow_multiple – Allow multiple targets to be computed simultaneously without merging the results of the target. This can be used when mass producing plugins that are not of the same datakind. Don’t try to use this in get_array or get_df because the data is not returned.
add_run_id_field – Whether to add a run_id field in case of multi-runs.
run_id_as_bytes – If True, use a byte string instead of a unicode string for the run_id added to a multi-run array. This can save a lot of memory when loading many runs.
All remaining keyword arguments are the selection arguments of context.get_array; see the full list of selection parameters above.
- new_context(storage=(), config=None, register=None, register_all=None, replace=False, **kwargs)[source]
Return a new context with new settings added to those in this context.
- Parameters:
replace – If True, replaces settings rather than adding them. See Context.__init__ for documentation on other parameters.
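The add-vs-replace semantics can be sketched with plain dicts standing in for the context's configuration; the real method also handles storage and plugin registration:

```python
def new_config(old, updates, replace=False):
    """Toy sketch of new_context's setting semantics on a config dict."""
    if replace:
        # Discard the old settings entirely.
        return dict(updates)
    # New settings add to (and override) the existing ones.
    return {**old, **updates}

old = {"max_workers": 2, "use_per_run_defaults": False}
print(new_config(old, {"max_workers": 4}))                # merged with old settings
print(new_config(old, {"max_workers": 4}, replace=True))  # old settings dropped
```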
- plot_energy_spectrum(run_id: str, **kwargs)
Plot an energy spectrum histogram, with 1 sigma Poisson confidence intervals around it.
- Parameters:
exposure_kg_sec – Exposure in kg * sec
unit – Unit to plot the spectrum in. One of: events (events per bin), kg_day_kev (events per kg day keV), tonne_day_kev (events per tonne day keV), tonne_year_kev (events per tonne year keV). Defaults to kg_day_kev if exposure_kg_sec is provided, otherwise events.
min_energy – Minimum energy of the histogram
max_energy – Maximum energy of the histogram
geomspace – If True, will use a logarithmic energy binning. Otherwise will use a linear scale.
n_bins – Number of energy bins to use
color – Color to plot in
label – Label for the line
error_alpha – Alpha value for the statistical error band
errors – Type of errors to draw, passed to ‘errors’ argument of Hist1d.plot.
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: event_info. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array; see the full list of selection parameters above.
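The kg_day_kev normalization amounts to converting the exposure from kg·sec to kg·day and dividing each bin's count by exposure times bin width; a back-of-the-envelope sketch (all numbers made up for illustration):

```python
SECONDS_PER_DAY = 86400

def events_per_kg_day_kev(counts, exposure_kg_sec, bin_width_kev):
    """Normalize a bin count to events per kg day keV."""
    exposure_kg_day = exposure_kg_sec / SECONDS_PER_DAY
    return counts / (exposure_kg_day * bin_width_kev)

# 100 events in a 2 keV bin over 1000 kg * 86400 s (i.e. 1000 kg days):
print(events_per_kg_day_kev(100, 1000 * 86400, 2.0))  # -> 0.05
```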
- plot_hit_pattern(run_id: str, **kwargs)
Straxen mini-analysis for which someone was too lazy to write a proper docstring. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: peaks, peak_basics. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array; see the full list of selection parameters above.
- plot_nveto_event_display(run_id: str, **kwargs)
Straxen mini-analysis for which someone was too lazy to write a proper docstring. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: hitlets_nv, events_nv, event_positions_nv. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array; see the full list of selection parameters above.
- plot_peak_classification(run_id: str, **kwargs)
Make an (area, rise_time) scatter plot of peaks.
- Parameters:
s – Size of dot for each peak
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: peak_basics. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array; see the full list of selection parameters above.
- plot_peaks(run_id: str, **kwargs)
Straxen mini-analysis for which someone was too lazy to write a proper docstring. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: peaks, peak_basics. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array; see the full list of selection parameters above.
- plot_peaks_aft_histogram(run_id: str, **kwargs)
Plot side-by-side (area, width) histograms of the peak rate and mean area fraction top.
- Parameters:
pe_bins – Array of bin edges for the peak area dimension [PE]
rt_bins – array of bin edges for the rise time dimension [ns]
extra_labels – List of (area, risetime, text, color) extra labels to put on the plot
rate_range – Range of rates to show [peaks/(bin*s)]
aft_range – Range of mean S1 area fraction top / bin to show
figsize – Figure size to use
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: peak_basics. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array; see the full list of selection parameters above.
- plot_pulses_mv(run_id: str, **kwargs)
Mini-analysis to plot pulses for the specified list of records. You have to provide a run_id for which pulses should be plotted. You can use the same arguments as for get_array to select a specific time range or data (see the selection arguments listed above).
In addition you can provide the following arguments:
- Parameters:
plot_hits – If True, plot hit boundaries, including the left and right extensions, as orange shaded regions.
plot_median – If True, plot the pulses' sample median as a dotted line.
max_plots – Limits the number of figures. If you would like to plot more pulses, you should write the plots to a PDF.
store_pdf – If True, figures are written to a PDF instead of being plotted in your notebook. The file name is generated automatically and includes the time range and run_id.
path – Relative path where the PDF should be stored. By default it is the directory of the notebook.
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: raw_records_mv. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array; see the full list of selection parameters above.
- plot_pulses_nv(run_id: str, **kwargs)
Mini-analysis to plot pulses for the specified list of records. You have to provide a run_id for which pulses should be plotted. You can use the same arguments as for get_array to select a specific time range or data (see the selection arguments listed above).
In addition you can provide the following arguments:
- Parameters:
plot_hits – If True, plot hit boundaries, including the left and right extensions, as orange shaded regions.
plot_median – If True, plot the pulses' sample median as a dotted line.
max_plots – Limits the number of figures. If you would like to plot more pulses, you should write the plots to a PDF.
store_pdf – If True, figures are written to a PDF instead of being plotted in your notebook. The file name is generated automatically and includes the time range and run_id.
path – Relative path where the PDF should be stored. By default it is the directory of the notebook.
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: raw_records_nv. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array; see the full list of selection parameters above.
- plot_pulses_tpc(run_id: str, **kwargs)
Mini-analysis to plot pulses for the specified list of records. You have to provide a run_id for which pulses should be plotted. You can use the same arguments as for get_array to select a specific time range or data (see the selection arguments listed above).
In addition you can provide the following arguments:
- Parameters:
plot_hits – If True, plot hit boundaries, including the left and right extensions, as orange shaded regions.
plot_median – If True, plot the pulses' sample median as a dotted line.
max_plots – Limits the number of figures. If you would like to plot more pulses, you should write the plots to a PDF.
store_pdf – If True, figures are written to a PDF instead of being plotted in your notebook. The file name is generated automatically and includes the time range and run_id.
path – Relative path where the PDF should be stored. By default it is the directory of the notebook.
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: raw_records. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array; see the full list of selection parameters above.
- plot_records_matrix(run_id: str, **kwargs)
Straxen mini-analysis for which someone was too lazy to write a proper docstring. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: . Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array; see the full list of selection parameters above.
- plot_waveform(run_id: str, **kwargs)
Plot the sum waveform and optionally per-PMT waveforms.
- Parameters:
deep – If True, show per-PMT waveform matrix under sum waveform. If ‘raw’, use raw_records instead of records to do so.
show_largest – Show only the largest show_largest peaks.
figsize – Matplotlib figure size for the plot.
cbar_loc – location of the intensity color bar. Set to None to omit it altogether.
lower_panel_height – Height of the lower panel in terms of the height of the upper panel.
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: . Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array; see the full list of selection parameters above.
- provided_dtypes(runid='0')[source]
Summarize dtype information provided by this context.
- Returns:
dictionary of provided dtypes with their corresponding lineage hash, save_when, version
- raw_records_matrix(run_id: str, **kwargs)
Straxen mini-analysis for which someone was too lazy to write a proper docstring. This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: raw_records. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array:
- Parameters:
selection – Query string, sequence of strings, or simple function to apply. The function must take a single argument which represents the structured numpy array of the loaded data.
selection_str – Same as selection (deprecated)
keep_columns – Array field/dataframe column names to keep. Useful to reduce amount of data in memory. (You can specify either keep_columns or drop_columns, not both.)
drop_columns – Array field/dataframe column names to drop. Useful to reduce amount of data in memory. (You can specify either keep_columns or drop_columns, not both.)
time_range – (start, stop) range to load, in ns since the epoch
seconds_range – (start, stop) range of seconds since the start of the run to load.
time_within – row of strax data (e.g. event) to use as time range
time_selection – Kind of time selection to apply:
- fully_contained: (default) select things fully contained in the range
- touching: select things that (partially) overlap with the range
- skip: Do not select a time range, even if other arguments say so
_chunk_number – For internal use: return data from one chunk.
progress_bar – Display a progress bar if metadata exists.
multi_run_progress_bar – Display a progress bar for loading multiple runs
- records_matrix(run_id: str, **kwargs)
Return (wv_matrix, times, pmts)
wv_matrix: (n_samples, n_pmt) array with per-PMT waveform intensity in PE/ns
times: time labels in seconds (corr. to rows)
pmts: PMT numbers (corr. to columns)
Both times and pmts have one extra element.
- Parameters:
max_samples – Maximum number of time samples. If window and dt conspire to exceed this, waveforms will be downsampled.
ignore_max_sample_warning – If True, suppress warning when this happens.
- Example:
wvm, ts, ys = st.records_matrix(run_id, seconds_range=(1., 1.00001))
plt.pcolormesh(ts, ys, wvm.T, norm=matplotlib.colors.LogNorm())
plt.colorbar(label='Intensity [PE / ns]')
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: records. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array:
- Parameters:
selection – Query string, sequence of strings, or simple function to apply. The function must take a single argument which represents the structured numpy array of the loaded data.
selection_str – Same as selection (deprecated)
keep_columns – Array field/dataframe column names to keep. Useful to reduce amount of data in memory. (You can specify either keep_columns or drop_columns, not both.)
drop_columns – Array field/dataframe column names to drop. Useful to reduce amount of data in memory. (You can specify either keep_columns or drop_columns, not both.)
time_range – (start, stop) range to load, in ns since the epoch
seconds_range – (start, stop) range of seconds since the start of the run to load.
time_within – row of strax data (e.g. event) to use as time range
time_selection – Kind of time selection to apply:
- fully_contained: (default) select things fully contained in the range
- touching: select things that (partially) overlap with the range
- skip: Do not select a time range, even if other arguments say so
_chunk_number – For internal use: return data from one chunk.
progress_bar – Display a progress bar if metadata exists.
multi_run_progress_bar – Display a progress bar for loading multiple runs
- register(plugin_class)[source]
Register plugin_class as provider for data types in provides.
- Parameters:
plugin_class – class inheriting from strax.Plugin. You can also pass a sequence of plugins to register, but then you must omit the provides argument. If a plugin class omits the .provides attribute, we will construct one from its class name (CamelCase -> snake_case). Returns plugin_class (so this can be used as a decorator).
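The CamelCase -> snake_case conversion mentioned above can be sketched as follows. This is an illustration of the naming rule, not necessarily strax's exact implementation:

```python
import re

def camel_to_snake(name: str) -> str:
    """Convert a CamelCase plugin class name to a snake_case data type
    name, as strax does when a plugin omits the .provides attribute."""
    # Insert an underscore before each capital that follows a lowercase
    # letter or digit, then lowercase everything.
    return re.sub(r'(?<=[a-z0-9])([A-Z])', r'_\1', name).lower()

print(camel_to_snake('PeakBasics'))  # peak_basics
print(camel_to_snake('EventInfo'))   # event_info
```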
- register_all(module)[source]
Register all plugins defined in module.
Can pass a list/tuple of modules to register all in each.
- run_defaults(run_id)[source]
Get configuration defaults from the run metadata (if these exist)
This will only call the rundb once for each run while the context is in existence; further calls to this will return a cached value.
- run_metadata(run_id, projection=None) dict [source]
Return run-level metadata for run_id, or raise DataNotAvailable if this is not available.
- Parameters:
run_id – run id to get
projection – Selection of fields to get, following MongoDB syntax. May not be supported by frontend.
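To illustrate what a MongoDB-style inclusion projection does to a run document, here is a minimal sketch on a plain dict. The helper and the example run document are hypothetical, not strax internals:

```python
def apply_projection(doc: dict, projection=None) -> dict:
    """Toy MongoDB-style inclusion projection: keep only the
    requested top-level fields of the run document."""
    if projection is None:
        return doc
    return {k: v for k, v in doc.items() if projection.get(k)}

run_doc = {'name': '012345', 'start': 1609459200,
           'mode': 'background', 'tags': []}
print(apply_projection(run_doc, {'name': 1, 'start': 1}))
# {'name': '012345', 'start': 1609459200}
```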
- scan_runs(check_available=(), if_check_available='raise', store_fields=()) DataFrame
Update and return self.runs with runs currently available in all storage frontends.
- Parameters:
check_available – Check whether these data types are available. Availability of xxx is stored as a boolean in the xxx_available column.
if_check_available – ‘raise’ (default) or ‘skip’, whether to do the check
store_fields – Additional fields from run doc to include as rows in the dataframe. The context options scan_availability and store_run_fields list data types and run fields, respectively, that will always be scanned.
- search_field(pattern: str, include_code_usage: bool = True, return_matches: bool = False)[source]
Find and print which plugin(s) provides a field that matches pattern (fnmatch).
- Parameters:
pattern – pattern to match, e.g. ‘time’ or ‘tim*’
include_code_usage – Also include the code occurrences of the fields that match the pattern.
return_matches – If set, return a dictionary with the matching fields and the occurrences in code.
- Returns:
when return_matches is set, return a dictionary with the matching fields and the occurrences in code. Otherwise, nothing is returned and the results are only printed.
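The fnmatch semantics used by search_field can be previewed with Python's fnmatch module. The field names below are hypothetical stand-ins for fields provided by plugins:

```python
import fnmatch

# Hypothetical field names, standing in for fields provided by plugins
fields = ['time', 'endtime', 'area', 'area_per_channel', 'n_channels']

# 'tim*' matches fields starting with 'tim', as in st.search_field('tim*')
print(fnmatch.filter(fields, 'tim*'))   # ['time']
print(fnmatch.filter(fields, 'area*'))  # ['area', 'area_per_channel']
```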
- search_field_usage(search_string: str, plugin: Plugin | List[Plugin] | None = None) List[str] [source]
Find and return which plugin(s) use a given field.
- Parameters:
search_string – a field name that must match exactly
plugin – plugin where to look for a field
- Returns:
list of code occurrences in the form of PLUGIN.FUNCTION
- select_runs(run_mode=None, run_id=None, include_tags=None, exclude_tags=None, available=(), pattern_type='fnmatch', ignore_underscore=True, force_reload=False)
Return pandas.DataFrame with basic info from runs that match selection criteria.
- Parameters:
run_mode – Pattern to match run modes (reader.ini.name)
run_id – Pattern to match a run_id or run_ids
available – str or tuple of strs of data types for which data must be available according to the runs DB.
include_tags – String or list of strings of patterns for required tags
exclude_tags – String / list of strings of patterns for forbidden tags. Exclusion criteria have higher priority than inclusion criteria.
pattern_type – Type of pattern matching to use. Defaults to ‘fnmatch’, which means you can use unix shell-style wildcards (?, *). The alternative is ‘re’, which means you can use full python regular expressions.
ignore_underscore – Ignore the underscore at the start of tags (indicating some degree of officialness or automation).
force_reload – Force reloading of runs from storage. Otherwise, runs are cached after the first time they are loaded in self.runs.
- Examples:
- run_selection(include_tags='blinded')
select all datasets with a blinded or _blinded tag.
- run_selection(include_tags='*blinded')
… with blinded or _blinded, unblinded, blablinded, etc.
- run_selection(include_tags=['blinded', 'unblinded'])
… with blinded OR unblinded, but not blablinded.
- run_selection(include_tags='blinded', exclude_tags=['bad', 'messy'])
… select blinded datasets that aren't bad or messy
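The tag-matching semantics in the examples above can be sketched with fnmatch. The matches_fnmatch helper below is hypothetical, illustrating the default pattern_type='fnmatch' behavior together with ignore_underscore=True:

```python
import fnmatch

tags = ['blinded', '_blinded', 'unblinded', 'blablinded', 'bad']

def matches_fnmatch(tag: str, pattern: str, ignore_underscore: bool = True) -> bool:
    """Hypothetical helper: unix shell-style tag matching, optionally
    ignoring a leading underscore (as select_runs does by default)."""
    if ignore_underscore and tag.startswith('_'):
        tag = tag[1:]
    return fnmatch.fnmatch(tag, pattern)

print([t for t in tags if matches_fnmatch(t, 'blinded')])
# ['blinded', '_blinded']
print([t for t in tags if matches_fnmatch(t, '*blinded')])
# ['blinded', '_blinded', 'unblinded', 'blablinded']
```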
- set_config(config=None, mode='update')[source]
Set new configuration options.
- Parameters:
config – dict of new options
mode – can be either:
- update: Add to or override current options in context
- setdefault: Add to current options, but do not override
- replace: Erase config, then set only these options
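The three modes behave like the following dict operations. This is a sketch of the semantics on a plain dict, not strax's implementation, and the option names are made up:

```python
def set_options(current: dict, new: dict, mode: str = 'update') -> dict:
    """Sketch of set_config's three modes on a plain dict."""
    if mode == 'update':
        return {**current, **new}   # add and override
    if mode == 'setdefault':
        return {**new, **current}   # add, but keep existing values
    if mode == 'replace':
        return dict(new)            # discard everything else
    raise ValueError(f'Unknown mode {mode}')

cfg = {'gain': 1.0, 'threshold': 5}   # hypothetical options
print(set_options(cfg, {'threshold': 10, 'window': 3}, mode='update'))
# {'gain': 1.0, 'threshold': 10, 'window': 3}
print(set_options(cfg, {'threshold': 10, 'window': 3}, mode='setdefault'))
# {'threshold': 5, 'window': 3, 'gain': 1.0}
```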
- set_context_config(context_config=None, mode='update')[source]
Set new context configuration options.
- Parameters:
context_config – dict of new context configuration options
mode – can be either:
- update: Add to or override current options in context
- setdefault: Add to current options, but do not override
- replace: Erase config, then set only these options
- show_config(data_type=None, pattern='*', run_id='99999999999999999999')[source]
Return configuration options that affect data_type.
- Parameters:
data_type – Data type name
pattern – Show only options that match (fnmatch) pattern
run_id – Run id to use for run-dependent config options. If omitted, will show defaults active for new runs.
- storage_graph(run_id, target, graph=None, not_stored=None, dump_plot=True, to_dir='./', format='svg')
Plot the dependency graph indicating the storage of the plugins.
- Parameters:
target – str of the target plugin to check
graph – graphviz.graphs.Digraph instance
not_stored – set of plugins which are not stored
dump_plot – bool, if True, save the plot to the to_dir
to_dir – str, directory to save the plot
format – str, format of the plot
- Returns:
all plugins that will be calculated when running self.make(run_id, target)
The colors used in the graph represent the following storage states:
- grey: strax.SaveWhen.NEVER
- red: strax.SaveWhen.EXPLICIT
- orange: strax.SaveWhen.TARGET
- yellow: strax.SaveWhen.ALWAYS
- green: target is stored
- stored_dependencies(run_id: str, target: str | list | tuple, check_forbidden: bool = True, _targets_stored: dict | None = None) dict | None [source]
For a given run_id and target(s) get a dictionary of all the datatypes that are required to build the requested target.
- Parameters:
run_id – run_id
target – target or a list of targets
check_forbidden – Check that we are not requesting to make a plugin that is forbidden by the context to be created.
- Returns:
dictionary of data types (keys) required for building the requested target(s) and if they are stored (values)
- Raises:
strax.DataNotAvailable – if there is at least one data type that is not stored and has no dependency or if it cannot be created
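The shape of the returned dictionary can be illustrated with a toy dependency walk. The depends_on and is_stored dicts below are hypothetical stand-ins for what strax derives from its plugin registry and storage frontends; the real method's recursion rules may differ:

```python
def toy_stored_dependencies(target, depends_on, is_stored, result=None):
    """Toy illustration: collect the data types needed to build
    `target` and whether each is already stored, stopping at
    data types that are stored."""
    if result is None:
        result = {}
    result[target] = is_stored.get(target, False)
    if not result[target]:
        for dep in depends_on.get(target, ()):
            toy_stored_dependencies(dep, depends_on, is_stored, result)
    return result

depends_on = {'event_basics': ['peak_basics'],
              'peak_basics': ['peaks'],
              'peaks': ['records']}
is_stored = {'peaks': True, 'records': True}
print(toy_stored_dependencies('event_basics', depends_on, is_stored))
# {'event_basics': False, 'peak_basics': False, 'peaks': True}
```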
- takes_config = immutabledict({'storage_converter': <strax.config.Option object>, 'fuzzy_for': <strax.config.Option object>, 'fuzzy_for_options': <strax.config.Option object>, 'allow_incomplete': <strax.config.Option object>, 'allow_rechunk': <strax.config.Option object>, 'allow_multiprocess': <strax.config.Option object>, 'allow_shm': <strax.config.Option object>, 'allow_lazy': <strax.config.Option object>, 'forbid_creation_of': <strax.config.Option object>, 'store_run_fields': <strax.config.Option object>, 'check_available': <strax.config.Option object>, 'max_messages': <strax.config.Option object>, 'timeout': <strax.config.Option object>, 'saver_timeout': <strax.config.Option object>, 'use_per_run_defaults': <strax.config.Option object>, 'free_options': <strax.config.Option object>, 'apply_data_function': <strax.config.Option object>, 'write_superruns': <strax.config.Option object>})
- to_absolute_time_range(run_id, targets=None, time_range=None, seconds_range=None, time_within=None, full_range=None)[source]
Return (start, stop) time in ns since unix epoch corresponding to time range.
- Parameters:
run_id – run id to get
time_range – (start, stop) time in ns since unix epoch. Will be returned without modification
targets – data types. Used only if run metadata is unavailable, so run start time has to be estimated from data.
seconds_range – (start, stop) seconds since start of run
time_within – row of strax data (e.g. event)
full_range – If True, return the full time_range of the run.
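The seconds_range conversion amounts to simple ns arithmetic once the run start time is known (strax normally reads it from the run metadata). A minimal sketch, with a made-up run start time:

```python
def to_absolute(run_start_ns: int, seconds_range) -> tuple:
    """Sketch: convert (start, stop) seconds since run start into
    absolute (start, stop) ns since the unix epoch."""
    s0, s1 = seconds_range
    return (run_start_ns + int(s0 * 1e9),
            run_start_ns + int(s1 * 1e9))

run_start_ns = 1_609_459_200_000_000_000  # hypothetical run start
print(to_absolute(run_start_ns, (1., 2.5)))
# (1609459201000000000, 1609459202500000000)
```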
- waveform_display(run_id: str, **kwargs)
Plot a waveform overview display.
- Parameters:
width – Plot width in pixels.
This is a straxen mini-analysis. The method takes run_id as its only positional argument, and additional arguments through keywords only.
The function requires the data types: records, peaks, peak_basics. Unless you specify this through data_kind = array keyword arguments, this data will be loaded automatically.
The function takes the same selection arguments as context.get_array:
- Parameters:
selection – Query string, sequence of strings, or simple function to apply. The function must take a single argument which represents the structured numpy array of the loaded data.
selection_str – Same as selection (deprecated)
keep_columns – Array field/dataframe column names to keep. Useful to reduce amount of data in memory. (You can specify either keep_columns or drop_columns, not both.)
drop_columns – Array field/dataframe column names to drop. Useful to reduce amount of data in memory. (You can specify either keep_columns or drop_columns, not both.)
time_range – (start, stop) range to load, in ns since the epoch
seconds_range – (start, stop) range of seconds since the start of the run to load.
time_within – row of strax data (e.g. event) to use as time range
time_selection – Kind of time selection to apply:
- fully_contained: (default) select things fully contained in the range
- touching: select things that (partially) overlap with the range
- skip: Do not select a time range, even if other arguments say so
_chunk_number – For internal use: return data from one chunk.
progress_bar – Display a progress bar if metadata exists.
multi_run_progress_bar – Display a progress bar for loading multiple runs