straxen package
Subpackages
- straxen.analyses package
- Submodules
- straxen.analyses.bokeh_waveform_plot module
- straxen.analyses.daq_waveforms module
- straxen.analyses.event_display module
- straxen.analyses.holoviews_waveform_display module
- straxen.analyses.posrec_comparison module
- straxen.analyses.pulse_plots module
- straxen.analyses.quick_checks module
- straxen.analyses.records_matrix module
- straxen.analyses.waveform_plot module
- Module contents
- straxen.plugins package
- Submodules
- straxen.plugins.acqmon_processing module
- straxen.plugins.daqreader module
- straxen.plugins.double_scatter module
- straxen.plugins.event_area_per_channel module
- straxen.plugins.event_info module
- straxen.plugins.event_patternfit module
- straxen.plugins.event_processing module
- straxen.plugins.led_calibration module
- straxen.plugins.nveto_recorder module
- straxen.plugins.online_monitor module
- straxen.plugins.pax_interface module
- straxen.plugins.peak_processing module
- straxen.plugins.peaklet_processing module
- straxen.plugins.position_reconstruction module
- straxen.plugins.pulse_processing module
- straxen.plugins.veto_events module
- straxen.plugins.veto_hitlets module
- straxen.plugins.veto_pulse_processing module
- straxen.plugins.veto_veto_regions module
- straxen.plugins.x1t_cuts module
- Module contents
Submodules
straxen.bokeh_utils module
- straxen.bokeh_utils.bokeh_to_wiki(fig, outputfile=None)[source]
Function which converts bokeh HTML code to a wiki readable code.
- Parameters
fig – Figure to be converted
outputfile – String of absolute file path. If specified, output is written to the file. Otherwise the output is printed to the notebook and can simply be copied into the wiki.
straxen.common module
- straxen.common.check_loading_allowed(data, run_id, target, max_in_disallowed=1, disallowed=('event_positions', 'corrected_areas', 'energy_estimates'))[source]
Check that the loading of the specified targets is not disallowed
- Parameters
data – chunk of data
run_id – run_id of the run
target – list of targets requested by the user
max_in_disallowed – the max number of targets that are in the disallowed list
disallowed – list of targets that are not allowed to be loaded simultaneously by the user
- Returns
data
- Raises
RuntimeError if more than max_in_disallowed disallowed targets are requested
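The check described above can be sketched in a few lines; the function below is an illustrative stand-in, not the straxen source (which also receives the data chunk and run_id):

```python
def check_loading_allowed_sketch(
        targets, max_in_disallowed=1,
        disallowed=("event_positions", "corrected_areas", "energy_estimates")):
    """Raise if more than max_in_disallowed of the disallowed targets
    are requested together (logic sketch of the check described above)."""
    n_bad = sum(t in disallowed for t in targets)
    if n_bad > max_in_disallowed:
        raise RuntimeError(
            f"Cannot load {n_bad} disallowed targets simultaneously")
    return targets
```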
- straxen.common.download_test_data()[source]
Downloads strax test data to strax_test_data in the current directory
- straxen.common.get_dtypes(_data)[source]
Return keys/dtype names of pd.DataFrame or numpy array
- Parameters
_data – data to get the keys/dtype names
- Returns
keys/dtype names
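A duck-typed sketch of this helper (not the straxen source) shows the idea — DataFrame-like objects expose their columns via keys(), while structured numpy arrays carry field names on their dtype:

```python
def get_dtypes_sketch(_data):
    """Return the column/field names of a pandas.DataFrame or a
    structured numpy array (duck-typed sketch, not the straxen source)."""
    if hasattr(_data, "keys"):           # pandas.DataFrame or dict-like
        return tuple(_data.keys())
    return _data.dtype.names             # structured numpy array
```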
- straxen.common.get_livetime_sec(context, run_id, things=None)[source]
Get the livetime of a run in seconds. If it is not in the run metadata, estimate it from the data-level metadata of the data things.
- straxen.common.get_resource(x: str, fmt='text')[source]
- Get the resource from an online source to be opened here. We will sequentially try the following:
Load it from memory if we asked for it before;
load it from a file if the path exists;
(preferred option) load it from our database;
load the file from some URL (e.g. raw github content).
- Parameters
x – str, either it is : A.) a path to the file; B.) the identifier of the file as it’s stored under in the database; C.) A URL to the file (e.g. raw github content).
fmt – str, format of the resource x
- Returns
the opened resource file x opened according to the specified format
- straxen.common.get_secret(x)[source]
Return secret key x. In order of priority, we search: * Environment variable: uppercase version of x * xenon_secrets.py (if included with your straxen installation) * A standard xenon_secrets.py located on the midway analysis hub (if you are running on midway)
- straxen.common.open_resource(file_name: str, fmt='text')[source]
Open file :param file_name: str, file to open :param fmt: format of the file :return: opened file
- straxen.common.pmt_positions(xenon1t=False)[source]
Return a pandas DataFrame with PMT positions. Columns: array (top/bottom), i (PMT number), x, y
- straxen.common.pre_apply_function(data, run_id, target, function_name='pre_apply_function')[source]
Prior to returning the data (from one chunk) see if any function(s) need to be applied.
- Parameters
data – one chunk of data for the requested target(s)
run_id – Single run-id of the chunk of data
target – one or more targets
function_name – the name of the function to be applied. The function_name.py should be stored in the database.
- Returns
Data where the function is applied.
- straxen.common.remap_channels(data, verbose=True, safe_copy=False, _tqdm=False)[source]
- There were some errors in the channel mapping of old data, as described in
https://xe1t-wiki.lngs.infn.it/doku.php?id=xenon:xenonnt:dsg:daq:sector_swap. Using this function, we can convert old data to reflect the right channel map while loading the data. We convert both the field ‘channel’ as well as anything that is an array of the same length as the number of channels.
- Parameters
data – numpy array or pandas dataframe
verbose – print messages while converting data
safe_copy – if True make a copy of the data prior to performing manipulations. Will prevent overwrites of the internal references but does require more memory.
_tqdm – bool (try to) add a tqdm wrapper to show the progress
- Returns
Correctly mapped data
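The core of such a remap can be sketched in pure Python. The function below is a simplified stand-in: it only remaps the ‘channel’ field of dict-like records, whereas the real straxen.remap_channels also remaps per-channel array fields and works on numpy arrays and dataframes:

```python
def remap_channels_sketch(records, channel_map):
    """Apply a sector-swap channel remap in place (simplified sketch
    of straxen.remap_channels; channels absent from the map are kept)."""
    for r in records:
        r["channel"] = channel_map.get(r["channel"], r["channel"])
    return records
```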
- straxen.common.remap_old(data, targets, run_id, works_on_target='')[source]
- If the data is from before the time the sectors were re-cabled, apply a software remap;
otherwise just return the data as it is.
- Parameters
data – numpy array of data with at least the field time. It is assumed the data is sorted by time
targets – targets in the st.get_array to get
run_id – required positional argument of apply_function_to_data in strax
works_on_target – regex match string to match any of the targets. By default set to ‘’ such that any target in the targets would be remapped (which is what we want as channels are present in most data types). If one only wants records (no raw-records) and peaks* use e.g. works_on_target = ‘records|peaks’.
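The works_on_target matching can be illustrated with a small sketch. Note that re.match anchors at the start of the string, so ‘records|peaks’ matches records and peaks* but not raw_records, as described above:

```python
import re

def targets_to_remap(targets, works_on_target=""):
    """Return the targets the remap would apply to (illustrative sketch
    of the works_on_target matching in straxen.remap_old). The default
    empty pattern matches every target."""
    return [t for t in targets if re.match(works_on_target, t)]
```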
straxen.contexts module
- straxen.contexts.xenon1t_dali(output_folder='./strax_data', build_lowlevel=False, **kwargs)[source]
- straxen.contexts.xenonnt_online(output_folder='./strax_data', use_rucio=None, use_rucio_remote=False, we_are_the_daq=False, _minimum_run_number=7157, _maximum_run_number=None, _database_init=True, _forbid_creation_of=None, _rucio_path='/dali/lgrandi/rucio/', _include_rucio_remote=False, _raw_path='/dali/lgrandi/xenonnt/raw', _processed_path='/dali/lgrandi/xenonnt/processed', _add_online_monitor_frontend=False, _context_config_overwrite=None, **kwargs)[source]
XENONnT online processing and analysis
- Parameters
output_folder – str, Path of the strax.DataDirectory where new data can be stored
use_rucio – bool, whether or not to use the rucio frontend (by default, we add the frontend when running on an rcc machine)
use_rucio_remote – bool, whether to download data from rucio directly
we_are_the_daq – bool, if we have admin access to upload data
_minimum_run_number – int, lowest number to consider
_maximum_run_number – Highest number to consider. When None (the default) consider all runs that are higher than the minimum_run_number.
_database_init – bool, start the database (for testing)
_forbid_creation_of – str/tuple, of datatypes to prevent from being written (raw_records* is always forbidden).
_include_rucio_remote – allow remote downloads in the context
_rucio_path – str, path of rucio
_raw_path – str, common path of the raw-data
_processed_path – str. common path of output data
_context_config_overwrite – dict, overwrite config
_add_online_monitor_frontend – bool, should we add the online monitor storage frontend.
kwargs – dict, context options
- Returns
strax.Context
- straxen.contexts.xenonnt_simulation(output_folder='./strax_data', cmt_run_id_sim=None, cmt_run_id_proc=None, cmt_version='v3', fax_config='fax_config_nt_design.json', overwrite_from_fax_file_sim=False, overwrite_from_fax_file_proc=False, cmt_option_overwrite_sim=immutabledict({}), cmt_option_overwrite_proc=immutabledict({}), _forbid_creation_of=None, _config_overlap=immutabledict({'drift_time_gate': 'electron_drift_time_gate', 'drift_velocity_liquid': 'electron_drift_velocity', 'electron_lifetime_liquid': 'elife_conf'}), **kwargs)[source]
The most generic context that allows for setting full divergent settings for simulation purposes
It makes a fully divergent setup, allowing one to set the detector simulation part (i.e. for wfsim up to truth and raw_records) separately. Parameters with _sim in their name refer to detector simulation parameters.
Arguments having _proc in their name refer to detector parameters that are used for processing of simulations, as is done with the real detector data. This means starting from already existing raw_records and finishing with higher-level data, such as peaks, events etc.
If only one cmt_run_id is given, the second one will be set automatically, resulting in CMT match between simulation and processing. However, detector parameters can be still overwritten from fax file or manually using cmt config overwrite options.
CMT options can also be overwritten via the fax config file. :param output_folder: Output folder for strax data. :param cmt_run_id_sim: Run id for detector parameters from CMT to be used for creation of raw_records.
- Parameters
cmt_run_id_proc – Run id for detector parameters from CMT to be used for processing from raw_records to higher level data.
cmt_version – Global version for corrections to be loaded.
fax_config – Fax config file to use.
overwrite_from_fax_file_sim – If true, sets detector simulation parameters for truth/raw_records from the fax_config file instead of CMT
overwrite_from_fax_file_proc – If true, sets detector processing parameters after raw_records (peaklets/events/etc.) from the fax_config file instead of CMT
cmt_option_overwrite_sim – Dictionary to overwrite CMT settings for the detector simulation part.
cmt_option_overwrite_proc – Dictionary to overwrite CMT settings for the data processing part.
_forbid_creation_of – str/tuple, of datatypes to prevent from being written (e.g. ‘raw_records’ for a read-only simulation context).
_config_overlap – Dictionary of options to overwrite. Keys must be simulation config keys, values must be valid CMT option keys.
kwargs – Additional kwargs taken by strax.Context.
- Returns
strax.Context instance
straxen.corrections_services module
Return corrections from corrections DB
- class straxen.corrections_services.CorrectionsManagementServices(username=None, password=None, mongo_url=None, is_nt=True)[source]
Bases:
object
A class that returns corrections. Corrections are sets of parameters to be applied in the analysis stage to remove detector effects. Information on the strax implementation can be found at https://github.com/AxFoundation/strax/blob/master/strax/corrections.py
- get_config_from_cmt(run_id, model_type, version='ONLINE')[source]
Smart logic to return the NN weights file name to be downloaded by straxen.MongoDownloader() :param run_id: run id from runDB :param model_type: model type and neural network type; model_mlp, model_gcn or model_cnn :param version: version :return: NN weights file name
- get_corrections_config(run_id, config_model=None)[source]
Get context configuration for a given correction :param run_id: run id from runDB :param config_model: configuration model (tuple type) :return: correction value(s)
- get_local_versions(global_version)[source]
Returns a dict of local versions for a given global version. Use ‘latest’ to get the newest version.
- get_pmt_gains(run_id, model_type, version, cacheable_versions=('ONLINE', ), gain_dtype=<class 'numpy.float32'>)[source]
Smart logic to return pmt gains to PE values. :param run_id: run id from runDB :param model_type: to_pe_model (gain model) :param version: version :param cacheable_versions: versions that are allowed to be cached in ./resource_cache :param gain_dtype: dtype of the gains to be returned as array :return: array of pmt gains to PE values
- get_start_time(run_id)[source]
Smart logic to return start time from runsDB :param run_id: run id from runDB :return: run start time
- property global_versions
straxen.get_corrections module
- straxen.get_corrections.get_cmt_resource(run_id, conf, fmt='')[source]
Get resource with CMT correction file name
- straxen.get_corrections.get_correction_from_cmt(run_id, conf)[source]
Get correction from CMT. The general format is conf = (‘correction_name’, ‘version’, True), where True means looking at nT runs, e.g. get_correction_from_cmt(run_id, conf[:2]). Special case: version can be replaced by a constant int, float or array when the user specifies value(s) directly. :param run_id: run id from runDB :param conf: configuration :return: correction value(s)
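The special-case handling of the conf tuple can be sketched as follows; the database lookup is replaced by a placeholder string, and the function name is illustrative, not straxen's:

```python
def resolve_cmt_conf(conf):
    """Interpret a CMT config tuple ('correction_name', 'version', is_nt).
    If the version slot holds a constant (int/float/sequence), that value
    is returned directly instead of querying the database (sketch of the
    special case described above, not the straxen implementation)."""
    name, version = conf[0], conf[1]
    if isinstance(version, (int, float, list, tuple)):
        return version                     # user-supplied constant value(s)
    return f"lookup:{name}:{version}"      # stand-in for a real DB query
```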
straxen.hitfinder_thresholds module
straxen.holoviews_utils module
straxen.itp_map module
- class straxen.itp_map.InterpolateAndExtrapolate(points, values, neighbours_to_use=None, array_valued=False)[source]
Bases:
object
Linearly interpolate and extrapolate using inverse-distance weighted averaging between nearby points.
- class straxen.itp_map.InterpolatingMap(data, method='WeightedNearestNeighbors', **kwargs)[source]
Bases:
object
Correction map that computes values using inverse-weighted distance interpolation.
- The map must be specified as a json translating to a dictionary like this:
‘coordinate_system’ : [[x1, y1], [x2, y2], [x3, y3], [x4, y4], …], ‘map’ : [value1, value2, value3, value4, …] ‘another_map’ : idem ‘name’: ‘Nice file with maps’, ‘description’: ‘Say what the maps are, who you are, etc’, ‘timestamp’: unix epoch seconds timestamp
with the straightforward generalization to 1d and 3d.
- Alternatively, a grid coordinate system can be specified as follows:
‘coordinate_system’ : [[‘x’, [x_min, x_max, n_x]], [[‘y’, [y_min, y_max, n_y]]
Alternatively, an N-vector-valued map can be specified by an array with last dimension N in ‘map’.
The default map name is ‘map’; we recommend you use that.
- For a 0d placeholder map, use
‘points’: [], ‘map’: 42, etc
The default method returns the inverse-distance weighted average of the nearest 2 * dim points. RectBivariateSpline and RegularGridInterpolator from scipy are also supported; select these by passing a keyword argument like
method=’RectBivariateSpline’
- The interpolators are called with
‘positions’ : [[x1, y1], [x2, y2], [x3, y3], [x4, y4], …] ‘map_name’ : key to switch to map interpolator other than the default ‘map’
- metadata_field_names = ['timestamp', 'description', 'coordinate_system', 'name', 'irregular', 'compressed', 'quantized']
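The default inverse-distance weighting can be sketched in pure Python. This toy version takes the coordinate system and map values directly instead of the json described above, and straxen's implementation is vectorized with numpy/scipy:

```python
def idw(positions, coordinate_system, values, neighbours=2, eps=1e-9):
    """Inverse-distance weighted average over the nearest points
    (pure-Python sketch of InterpolatingMap's default method)."""
    out = []
    for p in positions:
        # Euclidean distance from p to every known point, paired with its value
        dists = [(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, v)
                 for c, v in zip(coordinate_system, values)]
        dists.sort(key=lambda dv: dv[0])
        nearest = dists[:neighbours]
        weights = [1.0 / (d + eps) for d, _ in nearest]
        out.append(sum(w * v for w, (_, v) in zip(weights, nearest))
                   / sum(weights))
    return out
```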
straxen.matplotlib_utils module
- straxen.matplotlib_utils.draw_box(x, y, **kwargs)[source]
Draw rectangle, given x-y boundary tuples
- straxen.matplotlib_utils.log_x(a=None, b=None, scalar_ticks=True, tick_at=None)[source]
Make the x axis use a log scale from a to b
- straxen.matplotlib_utils.log_y(a=None, b=None, scalar_ticks=True, tick_at=None)[source]
Make the y axis use a log scale from a to b
- straxen.matplotlib_utils.plot_on_single_pmt_array(c, array_name='top', xenon1t=False, r=68.39200000000001, pmt_label_size=8, pmt_label_color='white', show_tpc=True, log_scale=False, vmin=None, vmax=None, dead_pmts=None, dead_pmt_color='gray', **kwargs)[source]
Plot one of the PMT arrays and color it by c. :param c: Array of colors to use. Must have a length equal to the number of TPC PMTs :param label: Label for the color bar :param pmt_label_size: Fontsize for the PMT number labels. Set to 0 to disable. :param pmt_label_color: Text color of the PMT number labels. :param log_scale: If True, use a logarithmic color scale :param extend: same as plt.colorbar(extend=…) :param vmin: Minimum of the color scale :param vmax: Maximum of the color scale Other arguments are passed to plt.scatter.
- straxen.matplotlib_utils.plot_pmts(c, label='', figsize=None, xenon1t=False, show_tpc=True, extend='neither', vmin=None, vmax=None, **kwargs)[source]
Plot the PMT arrays side-by-side, coloring the PMTS with c. :param c: Array of colors to use. Must have len() n_tpc_pmts :param label: Label for the color bar :param figsize: Figure size to use. :param extend: same as plt.colorbar(extend=…) :param vmin: Minimum of color scale :param vmax: maximum of color scale :param show_axis_labels: if True it will show x and y labels Other arguments are passed to plot_on_single_pmt_array.
straxen.mini_analysis module
straxen.misc module
- class straxen.misc.TimeWidgets[source]
Bases:
object
- straxen.misc.convert_array_to_df(array: numpy.ndarray) pandas.core.frame.DataFrame [source]
Converts the specified array into a DataFrame and drops all higher-dimensional fields in the process.
- Parameters
array – numpy.array to be converted.
- Returns
DataFrame with higher dimensions dropped.
- straxen.misc.dataframe_to_wiki(df, float_digits=5, title='Awesome table', force_int=())[source]
Convert a pandas dataframe to a dokuwiki table (which you can copy-paste onto the XENON wiki) :param df: dataframe to convert :param float_digits: Round floating-point values to this number of digits. :param title: title of the table.
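The dokuwiki table syntax (header cells delimited by ^, data cells by |) can be sketched without pandas; the function below takes plain rows and headers and is only an approximation of what straxen produces:

```python
def dataframe_to_wiki_sketch(rows, headers, float_digits=5,
                             title="Awesome table"):
    """Build a dokuwiki table from plain rows (sketch of the idea behind
    straxen.dataframe_to_wiki, which operates on a pandas.DataFrame)."""
    lines = [f"=== {title} ==="]
    lines.append("^ " + " ^ ".join(headers) + " ^")   # header row
    for row in rows:
        cells = [f"{v:.{float_digits}f}" if isinstance(v, float) else str(v)
                 for v in row]
        lines.append("| " + " | ".join(cells) + " |")  # data row
    return "\n".join(lines)
```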
- straxen.misc.print_versions(modules=('strax', 'straxen', 'cutax'), return_string=False)[source]
Print versions of modules installed.
- Parameters
modules – Modules to print, should be str, tuple or list. E.g. print_versions(modules=(‘strax’, ‘straxen’, ‘wfsim’, ‘cutax’, ‘pema’))
return_string – optional. Instead of printing the message, return a string
- Returns
optional, the message that would have been printed
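A minimal version of this helper can be sketched with importlib; the stdlib module names used in the example are stand-ins for strax/straxen, and the real straxen.print_versions may report more details:

```python
import importlib

def print_versions_sketch(modules=("json",), return_string=False):
    """Collect installed-module versions into a message (simplified
    sketch of straxen.print_versions)."""
    lines = []
    for name in modules:
        try:
            mod = importlib.import_module(name)
            version = getattr(mod, "__version__", "unknown")
        except ImportError:
            version = "not installed"
        lines.append(f"{name}: {version}")
    message = "\n".join(lines)
    if return_string:
        return message
    print(message)
```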
straxen.mongo_storage module
- class straxen.mongo_storage.GridFsInterface(readonly=True, file_database='files', config_identifier='config_name', collection=None)[source]
Bases:
object
Base class to upload/download the files to a database using GridFS for PyMongo: https://pymongo.readthedocs.io/en/stable/api/gridfs/index.html#module-gridfs
This class does the basic shared initiation of the downloader and uploader classes.
- static compute_md5(abs_path)[source]
NB: RAM intensive operation! Get the md5 hash of a file stored under abs_path
- Parameters
abs_path – str, absolute path to a file
- Returns
str, the md5-hash of the requested file
- config_exists(config)[source]
Quick check if this config is already saved in the collection
- Parameters
config – str, name of the file of interest
- Returns
bool, is this config name stored in the database
- document_format(config)[source]
Format of the document to upload
- Parameters
config – str, name of the file of interest
- Returns
dict, that will be used to add the document
- get_query_config(config)[source]
Generate identifier to query against. This is just the configs name.
- Parameters
config – str, name of the file of interest
- Returns
dict, that can be used in queries
- list_files()[source]
Get a complete list of files that are stored in the database
- Returns
list, list of the names of the items stored in this database
- class straxen.mongo_storage.MongoDownloader(store_files_at=None, *args, **kwargs)[source]
Bases:
straxen.mongo_storage.GridFsInterface
Class to download files from GridFs
- download_single(config_name: str, human_readable_file_name=False)[source]
Download the config_name if it exists
- Parameters
config_name – str, the name under which the file is stored
human_readable_file_name – bool, store the file also under its human-readable name. It is better not to use this, as the user might not know whether the version of the file is the latest.
- Returns
str, the absolute path of the file requested
- class straxen.mongo_storage.MongoUploader(readonly=False, *args, **kwargs)[source]
Bases:
straxen.mongo_storage.GridFsInterface
Class to upload files to GridFs
straxen.online_monitor module
straxen.rucio module
- class straxen.rucio.RucioFrontend(include_remote=False, download_heavy=False, staging_dir='./strax_data', *args, **kwargs)[source]
Bases:
strax.storage.common.StorageFrontend
Uses the rucio client to find data.
- did_is_local(did)[source]
Determines whether or not a given did is on a local RSE. If there is no local RSE, returns False.
- Parameters
did – Rucio DID string
- Returns
boolean for whether DID is local or not.
- find_several(keys, **kwargs)[source]
Return list with backend keys or False for several data keys.
Options are as for find()
- local_did_cache = None
- local_rses = {'UC_DALI_USERDISK': '.rcc.'}
- local_rucio_path = None
- class straxen.rucio.RucioLocalBackend(rucio_dir, *args, **kwargs)[source]
Bases:
strax.storage.files.FileSytemBackend
Get data from local rucio RSE
- get_metadata(did: str, **kwargs)[source]
Get the metadata using the backend_key and the backend-specific _get_metadata method. When an unforeseen error occurs, raises a strax.DataCorrupted error. Any kwargs are passed on to _get_metadata
- Parameters
backend_key – The key the backend should look for (can be string or strax.DataKey)
- Returns
metadata for the data associated to the requested backend-key
- Raises
strax.DataCorrupted – This backend is not able to read the metadata but it should exist
strax.DataNotAvailable – When there is no data associated with this backend-key
- class straxen.rucio.RucioRemoteBackend(staging_dir, download_heavy=False, **kwargs)[source]
Bases:
strax.storage.files.FileSytemBackend
Get data from remote Rucio RSE
- get_metadata(dset_did, rse='UC_OSG_USERDISK', **kwargs)[source]
Get the metadata using the backend_key and the backend-specific _get_metadata method. When an unforeseen error occurs, raises a strax.DataCorrupted error. Any kwargs are passed on to _get_metadata
- Parameters
backend_key – The key the backend should look for (can be string or strax.DataKey)
- Returns
metadata for the data associated to the requested backend-key
- Raises
strax.DataCorrupted – This backend is not able to read the metadata but it should exist
strax.DataNotAvailable – When there is no data associated with this backend-key
- heavy_types = ['raw_records', 'raw_records_nv', 'raw_records_he']
straxen.rundb module
- class straxen.rundb.RunDB(minimum_run_number=7157, maximum_run_number=None, runid_field='name', local_only=False, new_data_path=None, reader_ini_name_is_mode=False, rucio_path=None, mongo_url=None, mongo_user=None, mongo_password=None, mongo_database=None, *args, **kwargs)[source]
Bases:
strax.storage.common.StorageFrontend
Frontend that searches RunDB MongoDB for data.
- find_several(keys: List[strax.storage.common.DataKey], **kwargs)[source]
Return list with backend keys or False for several data keys.
Options are as for find()
- hosts = {'dali': '^dali.*rcc.*'}
- provide_run_metadata = True
straxen.scada module
- class straxen.scada.SCADAInterface(context=None, use_progress_bar=True)[source]
Bases:
object
- find_pmt_names(pmts=None, hv=True, current=False)[source]
Function which returns a list of PMT parameter names to be called in SCADAInterface.get_scada_values. The names refer to the high voltage of the PMTs, not their current.
Thanks to Hagar and Giovanni who provided the file.
- Parameters
pmts – Optional parameter to specify which PMT parameters should be returned. Can be either a list or array of channels or just a single one.
hv – Bool if true names of high voltage channels are returned.
current – Bool if true names for the current channels are returned.
- Returns
dictionary containing short names as keys and scada parameter names as values.
- get_scada_values(parameters, start=None, end=None, run_id=None, query_type_lab=True, time_selection_kwargs=None, fill_gaps=None, filling_kwargs=None, down_sampling=False, every_nth_value=1)[source]
Function which returns XENONnT slow control values for a given set of parameters and time range.
The time range can be either defined by a start and end time or via the run_id, target and context.
- Parameters
parameters – dictionary containing the names of the requested scada-parameters. The keys are used as identifier of the parameters in the returned pandas.DataFrame.
start – int representing the start time of the interval in ns unix time.
end – same as start but as end.
run_id – Id of the run. Can also be specified as a list or tuple of run ids. In this case we will return the time range lasting between the start of the first and endtime of the second run.
query_type_lab – Mode on how to query data from the historians. Can be either False to get raw data or True (default) to get data which was interpolated by historian. Useful if large time ranges have to be queried.
time_selection_kwargs – Keyword arguments taken by st.to_absolute_time_range(). Default: {“full_range”: True}
fill_gaps – Decides how to fill gaps in which no data was recorded. Only needed for query_type_lab=False. Can be either None, “interpolation” or “forwardfill”. None keeps the gaps (default), “interpolation” uses pandas.interpolate and “forwardfill” uses pandas.ffill. See https://pandas.pydata.org/docs/ for more information. You can change the filling options of the methods with the filling_kwargs.
filling_kwargs – Kwargs applied to pandas .ffill() or .interpolate(). Only needed for query_type_lab=False.
down_sampling – Boolean which indicates whether to down-sample the result or to apply an average. The averaging is deactivated in case of interpolated data. Only needed for query_type_lab=False.
every_nth_value – Defines over how many values we compute the average or the nth sample in case we down sample the data. In case query_type_lab=True every nth second is returned.
- Returns
pandas.DataFrame containing the data of the specified parameters.
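The interplay of down_sampling and every_nth_value can be illustrated with a small pure-Python sketch (straxen operates on the queried DataFrame instead of a plain list):

```python
def reduce_every_nth(values, n, down_sampling=False):
    """Reduce a series by either taking every nth sample or averaging
    consecutive blocks of n samples, mirroring the down_sampling /
    every_nth_value options described above (illustrative sketch)."""
    if down_sampling:
        return values[::n]                 # keep every nth sample
    # average each block of n consecutive samples (last block may be short)
    return [sum(values[i:i + n]) / len(values[i:i + n])
            for i in range(0, len(values), n)]
```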
- straxen.scada.convert_time_zone(df, tz)[source]
Function which converts the current time zone of a given pd.DataFrame into another timezone.
- Parameters
df – pandas.DataFrame containing the Data. Index must be a datetime object with time zone information.
tz – str representing the timezone the index should be converted to. See the notes for more information.
- Returns
pandas.DataFrame with converted time index.
- Notes:
1.) The input pandas.DataFrame must be indexed via datetime objects which are timezone-aware.
2.) You can find a complete list of available timezones via:
`import pytz; pytz.all_timezones`
You can also specify ‘strax’ as timezone which will convert the time index into a ‘strax time’ equivalent. The default timezone of strax is UTC.
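The conversion itself can be sketched with the stdlib zoneinfo module instead of pandas/pytz; this operates on a plain list of timezone-aware datetimes rather than a DataFrame index:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def convert_tz_sketch(timestamps, tz):
    """Convert timezone-aware datetimes to another zone, as
    straxen.convert_time_zone does for a DataFrame index
    (stdlib sketch, not the straxen implementation)."""
    return [ts.astimezone(ZoneInfo(tz)) for ts in timestamps]
```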
straxen.units module
Define unit system for pax (i.e., seconds, etc.)
This sets up variables for the various unit abbreviations, ensuring we always have a ‘consistent’ unit system. There are almost no cases that you should change this without talking with a maintainer.
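A unit system of this kind boils down to choosing one base unit per dimension and expressing every other unit as a multiple, so any stored number is automatically in base units. The base choices below (ns for time, cm for length) are illustrative only, not necessarily straxen's:

```python
# Time units, with nanoseconds as the base unit
ns = 1.0
us = 1e3 * ns
ms = 1e6 * ns
s = 1e9 * ns

# Length units, with centimeters as the base unit
cm = 1.0
m = 100 * cm

# Any quantity written as value * unit is then stored in base units,
# e.g. a 2 ms drift time becomes 2e6 (ns):
drift_time = 2 * ms
```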