straxen package



straxen.bokeh_utils module

straxen.bokeh_utils.bokeh_to_wiki(fig, outputfile=None)[source]

Function which converts bokeh HTML code to wiki-readable code.

  • fig – Figure to be converted

  • outputfile – String of absolute file path. If specified, output is written to the file. Otherwise output is printed to the notebook and can simply be copied into the wiki.

straxen.common module

straxen.common.check_loading_allowed(data, run_id, target, max_in_disallowed=1, disallowed=('event_positions', 'corrected_areas', 'energy_estimates'))[source]

Check that the loading of the specified targets is not disallowed

  • data – chunk of data

  • run_id – run_id of the run

  • target – list of targets requested by the user

  • max_in_disallowed – the max number of targets that are in the disallowed list

  • disallowed – list of targets that are not allowed to be loaded simultaneously by the user




RuntimeError if more than max_in_disallowed targets are requested
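
The check described above can be sketched in a few lines. This is a minimal illustration of the counting logic only, not straxen's implementation: the helper name `check_targets_allowed` is hypothetical, and the `data` and `run_id` arguments are left out because the documented rule depends only on the requested targets.

```python
def check_targets_allowed(
    targets,
    max_in_disallowed=1,
    disallowed=("event_positions", "corrected_areas", "energy_estimates"),
):
    """Hypothetical helper: count requested targets that are in the
    disallowed list and raise if the count exceeds the limit."""
    if isinstance(targets, str):
        targets = (targets,)
    n_hits = sum(target in disallowed for target in targets)
    if n_hits > max_in_disallowed:
        raise RuntimeError(
            f"Requested {n_hits} of {disallowed}, "
            f"but at most {max_in_disallowed} may be loaded together"
        )
    return n_hits
```

With the defaults, requesting `('peaks', 'event_positions')` passes, while requesting `('event_positions', 'corrected_areas')` raises.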


Downloads strax test data to strax_test_data in the current directory


Return keys/dtype names of pd.DataFrame or numpy array


_data – data to get the keys/dtype names


keys/dtype names

straxen.common.get_livetime_sec(context, run_id, things=None)[source]

Get the livetime of a run in seconds. If it is not in the run metadata, estimate it from the data-level metadata of the data things.

straxen.common.get_resource(x: str, fmt='text')[source]
Get the resource from an online source to be opened here. We will
sequentially try the following:
  1. Load it from memory if we asked for it before;

  2. load it from a file if the path exists;

  3. (preferred option) Load it from our database

  4. Load the file from some URL (e.g. raw github content)

  • x – str, one of: A.) a path to the file; B.) the identifier under which the file is stored in the database; C.) a URL to the file (e.g. raw github content).

  • fmt – str, format of the resource x


the opened resource file x opened according to the specified format
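
The lookup order listed for get_resource can be pictured as a chain of fallbacks. The sketch below is an assumption-laden illustration, not the straxen code: `lookup` is a hypothetical name, the database is replaced by a plain dict, and `fetch_url` is a stand-in callable supplied by the caller.

```python
import os


def lookup(x, cache, database, fetch_url):
    """Hypothetical fallback chain: memory, local file, database, URL."""
    if x in cache:                      # 1. asked for it before
        return cache[x]
    if os.path.exists(x):               # 2. x is a path that exists
        with open(x) as f:
            result = f.read()
    elif x in database:                 # 3. x is an identifier in the (stand-in) database
        result = database[x]
    else:                               # 4. treat x as a URL
        result = fetch_url(x)
    cache[x] = result                   # remember it for the next call
    return result
```

Storing the result in `cache` at the end is what makes step 1 succeed on a repeated request.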


Return secret key x. In order of priority, we search:

  • Environment variable: uppercase version of x

  • (if included with your straxen installation)

  • A standard located on the midway analysis hub (if you are running on midway)
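
A priority search of this kind might look like the sketch below. Only the first step (the uppercase environment variable) is taken from the text; the later sources are represented by generic dict-like `fallbacks`, which are an assumption made for illustration, as is the function name `find_secret`.

```python
import os


def find_secret(x, fallbacks=()):
    """Hypothetical priority search for a secret named x."""
    env_name = x.upper()
    if env_name in os.environ:          # highest priority: environment variable
        return os.environ[env_name]
    for source in fallbacks:            # stand-ins for the lower-priority sources
        if x in source:
            return source[x]
    raise ValueError(f"Secret {x} not found")
```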

straxen.common.open_resource(file_name: str, fmt='text')[source]

Open file

  • file_name – str, file to open

  • fmt – format of the file

opened file


Return URL to file hosted in the pax repository master branch


Return pandas dataframe with PMT positions columns: array (top/bottom), i (PMT number), x, y

straxen.common.pre_apply_function(data, run_id, target, function_name='pre_apply_function')[source]

Prior to returning the data (from one chunk) see if any function(s) need to be applied.

  • data – one chunk of data for the requested target(s)

  • run_id – Single run-id of the chunk of data

  • target – one or more targets

  • function_name – the name of the function to be applied. The function should be stored in the database.


Data where the function is applied.

straxen.common.remap_channels(data, verbose=True, safe_copy=False, _tqdm=False)[source]
There were some errors in the channel mapping of old data. Using this function, we can convert old data to reflect the right channel map while loading the data. We convert both the field ‘channel’ as well as anything that is an array of the same length as the number of channels.

  • data – numpy array or pandas dataframe

  • verbose – print messages while converting data

  • safe_copy – if True make a copy of the data prior to performing manipulations. Will prevent overwrites of the internal references but does require more memory.

  • _tqdm – bool (try to) add a tqdm wrapper to show the progress


Correctly mapped data

straxen.common.remap_old(data, targets, run_id, works_on_target='')[source]
If the data is from before the time sectors were re-cabled, apply a software remap; otherwise just return the data as it is.

  • data – numpy array of data with at least the field time. It is assumed the data is sorted by time

  • targets – targets in the st.get_array to get

  • run_id – required positional argument of apply_function_to_data in strax

  • works_on_target – regex match string to match any of the targets. By default set to ‘’ such that any target in the targets would be remapped (which is what we want as channels are present in most data types). If one only wants records (no raw-records) and peaks* use e.g. works_on_target = ‘records|peaks’.
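
To illustrate how works_on_target selects targets, here is a small sketch using Python's re module. It assumes the pattern is applied with `re.match` (anchored at the start of each target name); whether straxen uses `re.match` or another matching call is not stated here, and the helper name `targets_to_remap` is hypothetical.

```python
import re


def targets_to_remap(targets, works_on_target=""):
    """Hypothetical helper: list the targets the regex applies to."""
    if isinstance(targets, str):
        targets = (targets,)
    # An empty pattern matches every string, so all targets are selected.
    return [t for t in targets if re.match(works_on_target, t)]
```

Under this assumption, `'records|peaks'` selects `records` and `peaks` but not `raw_records`, because `re.match` only matches at the start of the name.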

straxen.contexts module


Return strax context used in the straxen demo notebook


Context for processing fake DAQ data in the current directory

straxen.contexts.xenon1t_dali(output_folder='./strax_data', build_lowlevel=False, **kwargs)[source]
straxen.contexts.xenonnt(cmt_version='global_ONLINE', **kwargs)[source]

XENONnT context

straxen.contexts.xenonnt_online(output_folder='./strax_data', use_rucio=None, use_rucio_remote=False, we_are_the_daq=False, _minimum_run_number=7157, _maximum_run_number=None, _database_init=True, _forbid_creation_of=None, _rucio_path='/dali/lgrandi/rucio/', _include_rucio_remote=False, _raw_path='/dali/lgrandi/xenonnt/raw', _processed_path='/dali/lgrandi/xenonnt/processed', _add_online_monitor_frontend=False, _context_config_overwrite=None, **kwargs)[source]

XENONnT online processing and analysis

  • output_folder – str, Path of the strax.DataDirectory where new data can be stored

  • use_rucio – bool, whether or not to use the rucio frontend (by default, we add the frontend when running on an rcc machine)

  • use_rucio_remote – bool, whether to download data from rucio directly

  • we_are_the_daq – bool, if we have admin access to upload data

  • _minimum_run_number – int, lowest number to consider

  • _maximum_run_number – Highest number to consider. When None (the default) consider all runs that are higher than the minimum_run_number.

  • _database_init – bool, start the database (for testing)

  • _forbid_creation_of – str/tuple, of datatypes to prevent from being written (raw_records* is always forbidden).

  • _include_rucio_remote – allow remote downloads in the context

  • _rucio_path – str, path of rucio

  • _raw_path – str, common path of the raw-data

  • _processed_path – str, common path of output data

  • _context_config_overwrite – dict, overwrite config

  • _add_online_monitor_frontend – bool, should we add the online monitor storage frontend.

  • kwargs – dict, context options



straxen.contexts.xenonnt_simulation(output_folder='./strax_data', cmt_run_id_sim=None, cmt_run_id_proc=None, cmt_version='v3', fax_config='fax_config_nt_design.json', overwrite_from_fax_file_sim=False, overwrite_from_fax_file_proc=False, cmt_option_overwrite_sim=immutabledict({}), cmt_option_overwrite_proc=immutabledict({}), _forbid_creation_of=None, _config_overlap=immutabledict({'drift_time_gate': 'electron_drift_time_gate', 'drift_velocity_liquid': 'electron_drift_velocity', 'electron_lifetime_liquid': 'elife_conf'}), **kwargs)[source]

The most generic context, which allows fully divergent settings for simulation purposes

It makes a fully divergent setup, allowing one to set the detector simulation part (i.e. for wfsim up to truth and raw_records). Arguments having _sim in their name refer to detector simulation parameters.

Arguments having _proc in their name refer to detector parameters that are used for processing of simulations as done to the real detector data. This means starting from already existing raw_records and finishing with higher level data, such as peaks, events etc.

If only one cmt_run_id is given, the second one will be set automatically, resulting in a CMT match between simulation and processing. However, detector parameters can still be overwritten from the fax file or manually using the cmt config overwrite options.

CMT options can also be overwritten via the fax config file.

  • output_folder – Output folder for strax data.

  • cmt_run_id_sim – Run id for detector parameters from CMT to be used for creation of raw_records.

  • cmt_run_id_proc – Run id for detector parameters from CMT to be used for processing from raw_records to higher level data.

  • cmt_version – Global version for corrections to be loaded.

  • fax_config – Fax config file to use.

  • overwrite_from_fax_file_sim – If true, sets detector simulation parameters for truth/raw_records from the fax_config file instead of CMT

  • overwrite_from_fax_file_proc – If true, sets detector processing parameters after raw_records (peaklets/events/etc) from the fax_config file instead of CMT

  • cmt_option_overwrite_sim – Dictionary to overwrite CMT settings for the detector simulation part.

  • cmt_option_overwrite_proc – Dictionary to overwrite CMT settings for the data processing part.

  • _forbid_creation_of – str/tuple, of datatypes to prevent from being written (e.g. ‘raw_records’ for read only simulation context).

  • _config_overlap – Dictionary of options to overwrite. Keys must be simulation config keys, values must be valid CMT option keys.

  • kwargs – Additional kwargs taken by strax.Context.


strax.Context instance
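
The rule that a single given cmt_run_id is reused for the other side can be sketched as follows. This is an illustration of the documented behaviour only; the helper name `resolve_cmt_run_ids` is hypothetical, and what the real context does when neither id is given is not covered here (the sketch simply returns both as None).

```python
def resolve_cmt_run_ids(cmt_run_id_sim=None, cmt_run_id_proc=None):
    """Hypothetical helper: fill in a missing run id from the other one."""
    if cmt_run_id_sim is None:
        cmt_run_id_sim = cmt_run_id_proc
    if cmt_run_id_proc is None:
        cmt_run_id_proc = cmt_run_id_sim
    return cmt_run_id_sim, cmt_run_id_proc
```

Passing two different ids keeps them divergent, which is the case the _sim and _proc arguments exist for.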

straxen.corrections_services module

Return corrections from corrections DB

class straxen.corrections_services.CorrectionsManagementServices(username=None, password=None, mongo_url=None, is_nt=True)[source]

Bases: object

A class that returns corrections. Corrections are a set of parameters to be applied in the analysis stage to remove detector effects. Information on the strax implementation can be found at

get_config_from_cmt(run_id, model_type, version='ONLINE')[source]

Smart logic to return the NN weights file name to be downloaded by straxen.MongoDownloader()

  • run_id – run id from runDB

  • model_type – model type and neural network type; model_mlp, or model_gcn or model_cnn

  • version – version

NN weights file name

get_corrections_config(run_id, config_model=None)[source]

Get context configuration for a given correction

  • run_id – run id from runDB

  • config_model – configuration model (tuple type)

correction value(s)


Returns a dict of local versions for a given global version. Use ‘latest’ to get newest version

get_pmt_gains(run_id, model_type, version, cacheable_versions=('ONLINE', ), gain_dtype=<class 'numpy.float32'>)[source]

Smart logic to return pmt gains to PE values.

  • run_id – run id from runDB

  • model_type – to_pe_model (gain model)

  • version – version

  • cacheable_versions – versions that are allowed to be cached in ./resource_cache

  • gain_dtype – dtype of the gains to be returned as array

array of pmt gains to PE values


Smart logic to return start time from runsDB

run_id – run id from runDB

run start time

property global_versions

straxen.get_corrections module

straxen.get_corrections.get_cmt_resource(run_id, conf, fmt='')[source]

Get resource with CMT correction file name

straxen.get_corrections.get_correction_from_cmt(run_id, conf)[source]

Get correction from CMT. The general format is conf = (‘correction_name’, ‘version’, True), where True means looking at nT runs, e.g. get_correction_from_cmt(run_id, conf[:2]). Special case: version can be replaced by a constant int, float or array when the user specifies value(s).

  • run_id – run id from runDB

  • conf – configuration

correction value(s)

straxen.hitfinder_thresholds module


Return hitfinder height threshold to use in processing


model – Model name (str), or int to use a uniform threshold, or array/tuple of thresholds to use.
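
The three accepted forms of model can be illustrated with a small dispatch on the argument type. This is a sketch under assumptions: the helper name, the `n_channels` argument and the `named_models` lookup table are all hypothetical and stand in for whatever straxen uses to resolve a model name.

```python
def thresholds_from_model(model, n_channels, named_models=None):
    """Hypothetical helper: turn a model spec into per-channel thresholds."""
    named_models = named_models or {}
    if isinstance(model, str):
        # A name is looked up in a (hypothetical) table of known models.
        return list(named_models[model])
    if isinstance(model, int):
        # A single int means the same threshold for every channel.
        return [model] * n_channels
    thresholds = list(model)            # array/tuple of explicit thresholds
    if len(thresholds) != n_channels:
        raise ValueError("Need exactly one threshold per channel")
    return thresholds
```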

straxen.holoviews_utils module

straxen.itp_map module

class straxen.itp_map.InterpolateAndExtrapolate(points, values, neighbours_to_use=None, array_valued=False)[source]

Bases: object

Linearly interpolate- and extrapolate using inverse-distance weighted averaging between nearby points.
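
As a rough illustration of inverse-distance weighted averaging, the sketch below computes one interpolated value in pure Python. It is not the class's implementation: the function name is hypothetical, the weights are assumed to be 1/distance, and the default of 2 * dim neighbours is taken from the InterpolatingMap description in this module.

```python
import math


def idw_average(points, values, position, neighbours_to_use=None, eps=1e-12):
    """Hypothetical helper: inverse-distance weighted average at one position."""
    k = neighbours_to_use or 2 * len(position)
    # Keep the k nearest (distance, value) pairs.
    nearest = sorted(
        (math.dist(p, position), v) for p, v in zip(points, values)
    )[:k]
    if nearest[0][0] < eps:
        # Exactly on a known point: return its value, avoiding division by zero.
        return nearest[0][1]
    weights = [1.0 / d for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
```

Because the nearest points are used regardless of whether the position lies inside the sampled region, the same formula also yields a value outside it, which is the extrapolation case.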

class straxen.itp_map.InterpolatingMap(data, method='WeightedNearestNeighbors', **kwargs)[source]

Bases: object

Correction map that computes values using inverse-weighted distance interpolation.

The map must be specified as a json translating to a dictionary like this:

    ‘coordinate_system’ : [[x1, y1], [x2, y2], [x3, y3], [x4, y4], …],
    ‘map’ : [value1, value2, value3, value4, …],
    ‘another_map’ : idem,
    ‘name’: ‘Nice file with maps’,
    ‘description’: ‘Say what the maps are, who you are, etc’,
    ‘timestamp’: unix epoch seconds timestamp

with the straightforward generalization to 1d and 3d.

Alternatively, a grid coordinate system can be specified as follows:

‘coordinate_system’ : [[‘x’, [x_min, x_max, n_x]], [‘y’, [y_min, y_max, n_y]]]
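
One way to read the grid specification is as a compact description of the full list of points. The sketch below expands it under two assumptions that are not stated in the text: the n values per axis are evenly spaced with both end points included (as numpy.linspace would do), and the points are ordered with the first axis varying slowest. The function name `expand_grid` is hypothetical.

```python
import itertools


def expand_grid(coordinate_system):
    """Hypothetical helper: expand [[name, [min, max, n]], ...] to points."""
    axes = []
    for _name, (low, high, n) in coordinate_system:
        step = (high - low) / (n - 1) if n > 1 else 0.0
        axes.append([low + i * step for i in range(n)])
    # Cartesian product of the per-axis values gives every grid point.
    return [list(point) for point in itertools.product(*axes)]
```

For example, `[['x', [0, 1, 2]], ['y', [0, 2, 3]]]` expands to six points, from `[0, 0]` to `[1, 2]`.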

Alternatively, an N-vector-valued map can be specified by an array with last dimension N in ‘map’.

The default map name is ‘map’, I’d recommend you use that.

For a 0d placeholder map, use

‘points’: [], ‘map’: 42, etc

The default method returns the inverse-distance weighted average of the nearby 2 * dim points. Extra support includes RectBivariateSpline and RegularGridInterpolator in scipy, selected by passing a keyword argument like


The interpolators are called with

    ‘positions’ : [[x1, y1], [x2, y2], [x3, y3], [x4, y4], …]
    ‘map_name’ : key to switch to map interpolator other than the default ‘map’

metadata_field_names = ['timestamp', 'description', 'coordinate_system', 'name', 'irregular', 'compressed', 'quantized']
scale_coordinates(scaling_factor, map_name='map')[source]

Scales the coordinate system by the specified factor

scaling_factor – array (n_dim) of scaling factors if different or single scalar.
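
The two accepted forms of scaling_factor (one scalar, or one factor per dimension) can be illustrated on a plain list of points. This is a sketch of the arithmetic only; `scale_points` is a hypothetical name and it does not touch an actual InterpolatingMap.

```python
def scale_points(points, scaling_factor):
    """Hypothetical helper: scale each coordinate of each point."""
    n_dim = len(points[0])
    if isinstance(scaling_factor, (int, float)):
        factors = [scaling_factor] * n_dim   # same factor on every axis
    else:
        factors = list(scaling_factor)       # one factor per axis
        if len(factors) != n_dim:
            raise ValueError("Need one scaling factor per dimension")
    return [[c * f for c, f in zip(point, factors)] for point in points]
```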

straxen.matplotlib_utils module

straxen.matplotlib_utils.draw_box(x, y, **kwargs)[source]

Draw rectangle, given x-y boundary tuples

straxen.matplotlib_utils.log_x(a=None, b=None, scalar_ticks=True, tick_at=None)[source]

Make the x axis use a log scale from a to b

straxen.matplotlib_utils.log_y(a=None, b=None, scalar_ticks=True, tick_at=None)[source]

Make the y axis use a log scale from a to b

straxen.matplotlib_utils.plot_on_single_pmt_array(c, array_name='top', xenon1t=False, r=68.39200000000001, pmt_label_size=8, pmt_label_color='white', show_tpc=True, log_scale=False, vmin=None, vmax=None, dead_pmts=None, dead_pmt_color='gray', **kwargs)[source]

Plot one of the PMT arrays and color it by c.

  • c – Array of colors to use. Its len() must equal the number of TPC PMTs

  • label – Label for the color bar

  • pmt_label_size – Fontsize for the PMT number labels. Set to 0 to disable.

  • pmt_label_color – Text color of the PMT number labels.

  • log_scale – If True, use a logarithmic color scale

  • extend – same as plt.colorbar(extend=…)

  • vmin – Minimum of color scale

  • vmax – Maximum of color scale

Other arguments are passed to plt.scatter.

straxen.matplotlib_utils.plot_pmts(c, label='', figsize=None, xenon1t=False, show_tpc=True, extend='neither', vmin=None, vmax=None, **kwargs)[source]

Plot the PMT arrays side-by-side, coloring the PMTs with c.

  • c – Array of colors to use. Must have len() n_tpc_pmts

  • label – Label for the color bar

  • figsize – Figure size to use.

  • extend – same as plt.colorbar(extend=…)

  • vmin – Minimum of color scale

  • vmax – Maximum of color scale

  • show_axis_labels – if True it will show x and y labels

Other arguments are passed to plot_on_single_pmt_array.

straxen.matplotlib_utils.plot_single_pulse(records, run_id, pulse_i='')[source]

Function which plots a single pulse.

  • records – Records which belong to the pulse.

  • run_id – Id of the run.

  • pulse_i – Index of the pulse to be plotted.


fig, axes objects.


straxen.mini_analysis module

straxen.mini_analysis.mini_analysis(requires=(), hv_bokeh=False, warn_beyond_sec=None, default_time_selection='touching')[source]

straxen.misc module

class straxen.misc.TimeWidgets[source]

Bases: object


Creates time and time zone widget for simpler time querying.


Please be aware that the correct format for the time field is HH:MM.


Returns start and end time of the specified time interval in nanoseconds UTC unix time.

straxen.misc.convert_array_to_df(array: numpy.ndarray) pandas.core.frame.DataFrame[source]

Converts the specified array into a DataFrame and drops all higher-dimensional fields in the process.


array – numpy.array to be converted.


DataFrame with higher dimensions dropped.

straxen.misc.dataframe_to_wiki(df, float_digits=5, title='Awesome table', force_int=())[source]

Convert a pandas dataframe to a dokuwiki table (which you can copy-paste onto the XENON wiki)

  • df – dataframe to convert

  • float_digits – Round floating point values to this number of digits.

  • title – title of the table.
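
For readers unfamiliar with the dokuwiki table syntax, the sketch below builds such a table from plain Python rows: header cells are delimited by `^` and data cells by `|`. It is an illustration, not the straxen function: it takes a list of dicts instead of a dataframe so that it runs without pandas, it omits the title, and it assumes float_digits means decimal places.

```python
def rows_to_dokuwiki(columns, rows, float_digits=5):
    """Hypothetical helper: format rows (list of dicts) as a dokuwiki table."""

    def fmt(value):
        if isinstance(value, float):
            return str(round(value, float_digits))
        return str(value)

    lines = ["^ " + " ^ ".join(columns) + " ^"]          # header row
    for row in rows:
        lines.append("| " + " | ".join(fmt(row[c]) for c in columns) + " |")
    return "\n".join(lines)
```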

straxen.misc.print_versions(modules=('strax', 'straxen', 'cutax'), return_string=False)[source]

Print versions of modules installed.

  • modules – Modules to print, should be str, tuple or list. E.g. print_versions(modules=(‘strax’, ‘straxen’, ‘wfsim’, ‘cutax’, ‘pema’))

  • return_string – optional. Instead of printing the message, return a string


optional, the message that would have been printed

straxen.misc.utilix_is_configured(header='RunDB', section='xent_database') bool[source]

Check if we have the right connection to

bool, can we connect to the Mongo database?

straxen.mongo_storage module

class straxen.mongo_storage.GridFsInterface(readonly=True, file_database='files', config_identifier='config_name', collection=None)[source]

Bases: object

Base class to upload/download the files to a database using GridFS for PyMongo:

This class does the basic shared initiation of the downloader and uploader classes.

static compute_md5(abs_path)[source]

NB: RAM intensive operation! Get the md5 hash of a file stored under abs_path


abs_path – str, absolute path to a file


str, the md5-hash of the requested file
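
Computing the md5 hash of a file can be done with the standard hashlib module. The sketch below is not the compute_md5 implementation; it is an alternative that reads the file in chunks, which keeps memory use bounded for large files. The function name and the `chunk_size` parameter are hypothetical.

```python
import hashlib


def md5_of_file(abs_path, chunk_size=1 << 20):
    """Hypothetical helper: md5 hex digest of a file, read chunk by chunk."""
    digest = hashlib.md5()
    with open(abs_path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting hex string is identical to hashing the whole file content at once, so it can be compared against a stored md5 value.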


Quick check if this config is already saved in the collection


config – str, name of the file of interest


bool, is this config name stored in the database


Format of the document to upload


config – str, name of the file of interest


dict, that will be used to add the document


Generate identifier to query against. This is just the configs name.


config – str, name of the file of interest


dict, that can be used in queries


Get a complete list of files that are stored in the database


list, list of the names of the items stored in this database


NB: RAM intensive operation! Carefully compare if the MD5 identifier is the same as the file as stored under abs_path.


abs_path – str, absolute path to the file name


bool, returns if the exact same file is already stored in the database


Test the connection to the self.collection to see if we can perform a collection.find operation.

class straxen.mongo_storage.MongoDownloader(store_files_at=None, *args, **kwargs)[source]

Bases: straxen.mongo_storage.GridFsInterface

Class to download files from GridFs


Download all the files that are stored in the mongo collection

download_single(config_name: str, human_readable_file_name=False)[source]

Download the config_name if it exists

  • config_name – str, the name under which the file is stored

  • human_readable_file_name – bool, store the file also under its human readable name. It is better not to use this as the user might not know if the version of the file is the latest.


str, the absolute path of the file requested

class straxen.mongo_storage.MongoUploader(readonly=False, *args, **kwargs)[source]

Bases: straxen.mongo_storage.GridFsInterface

Class to upload files to GridFs


Upload all files in the dictionary to the database.


file_path_dict – dict, dictionary of paths to upload. The dict should be of the format: file_path_dict = {‘config_name’: ‘/the_config_path’, …}



upload_single(config, abs_path)[source]

Upload a single file to gridfs

  • config – str, the name under which this file should be stored

  • abs_path – str, the absolute path of the file

straxen.online_monitor module

class straxen.online_monitor.OnlineMonitor(uri=None, take_only=None, database=None, col_name='online_monitor', readonly=True, *args, **kwargs)[source]


Online monitor frontend for saving data temporarily to the database

backends: list
straxen.online_monitor.get_mongo_uri(user_key='pymongo_user', pwd_key='pymongo_password', url_key='pymongo_url', header='RunDB')[source]

straxen.rucio module

class straxen.rucio.RucioFrontend(include_remote=False, download_heavy=False, staging_dir='./strax_data', *args, **kwargs)[source]


Uses the rucio client for the data find.

backends: list

Determines whether or not a given did is on a local RSE. If there is no local RSE, returns False.


did – Rucio DID string


boolean for whether DID is local or not.

find_several(keys, **kwargs)[source]

Return list with backend keys or False for several data keys.

Options are as for find()

local_did_cache = None
local_rses = {'UC_DALI_USERDISK': '.rcc.'}
local_rucio_path = None
class straxen.rucio.RucioLocalBackend(rucio_dir, *args, **kwargs)[source]


Get data from local rucio RSE

get_metadata(did: str, **kwargs)[source]

Get the metadata using the backend_key and the Backend specific _get_metadata method. When an unforeseen error occurs, raises a strax.DataCorrupted error. Any kwargs are passed on to _get_metadata


backend_key – The key the backend should look for (can be string or strax.DataKey)


metadata for the data associated to the requested backend-key

  • strax.DataCorrupted – This backend is not able to read the metadata but it should exist

  • strax.DataNotAvailable – When there is no data associated with this backend-key

class straxen.rucio.RucioRemoteBackend(staging_dir, download_heavy=False, **kwargs)[source]


Get data from remote Rucio RSE

get_metadata(dset_did, rse='UC_OSG_USERDISK', **kwargs)[source]

Get the metadata using the backend_key and the Backend specific _get_metadata method. When an unforeseen error occurs, raises a strax.DataCorrupted error. Any kwargs are passed on to _get_metadata


backend_key – The key the backend should look for (can be string or strax.DataKey)


metadata for the data associated to the requested backend-key

  • strax.DataCorrupted – This backend is not able to read the metadata but it should exist

  • strax.DataNotAvailable – When there is no data associated with this backend-key

heavy_types = ['raw_records', 'raw_records_nv', 'raw_records_he']

straxen.rundb module

class straxen.rundb.RunDB(minimum_run_number=7157, maximum_run_number=None, runid_field='name', local_only=False, new_data_path=None, reader_ini_name_is_mode=False, rucio_path=None, mongo_url=None, mongo_user=None, mongo_password=None, mongo_database=None, *args, **kwargs)[source]


Frontend that searches RunDB MongoDB for data.

backends: list
find_several(keys: List[], **kwargs)[source]

Return list with backend keys or False for several data keys.

Options are as for find()

hosts = {'dali': '^dali.*rcc.*'}
provide_run_metadata = True
run_metadata(run_id, projection=None)[source]

Return run metadata dictionary, or raise RunMetadataNotAvailable

straxen.scada module

class straxen.scada.SCADAInterface(context=None, use_progress_bar=True)[source]

Bases: object

find_pmt_names(pmts=None, hv=True, current=False)[source]

Function which returns a list of PMT parameter names to be called in SCADAInterface.get_scada_values. The names refer to the high voltage of the PMTs, not their current.

Thanks to Hagar and Giovanni who provided the file.

  • pmts – Optional parameter to specify which PMT parameters should be returned. Can be either a list or array of channels or just a single one.

  • hv – Bool if true names of high voltage channels are returned.

  • current – Bool if true names for the current channels are returned.


dictionary containing short names as keys and scada parameter names as values.


Function to renew the token of the current session.

get_scada_values(parameters, start=None, end=None, run_id=None, query_type_lab=True, time_selection_kwargs=None, fill_gaps=None, filling_kwargs=None, down_sampling=False, every_nth_value=1)[source]

Function which returns XENONnT slow control values for a given set of parameters and time range.

The time range can be either defined by a start and end time or via the run_id, target and context.

  • parameters – dictionary containing the names of the requested scada-parameters. The keys are used as identifier of the parameters in the returned pandas.DataFrame.

  • start – int representing the start time of the interval in ns unix time.

  • end – same as start but as end.

  • run_id – Id of the run. Can also be specified as a list or tuple of run ids. In this case we will return the time range lasting between the start of the first and endtime of the second run.

  • query_type_lab – Mode on how to query data from the historians. Can be either False to get raw data or True (default) to get data which was interpolated by historian. Useful if large time ranges have to be queried.

  • time_selection_kwargs – Keyword arguments taken by st.to_absolute_time_range(). Default: {“full_range”: True}

  • fill_gaps – Decides how to fill gaps in which no data was recorded. Only needed for query_type_lab=False. Can be either None, “interpolation” or “forwardfill”. None keeps the gaps (default), “interpolation” uses pandas.interpolate and “forwardfill” pandas.ffill. See for more information. You can change the filling options of the methods with the filling_kwargs.

  • filling_kwargs – Kwargs applied to pandas .ffill() or .interpolate(). Only needed for query_type_lab=False.

  • down_sampling – Boolean which indicates whether to down-sample the result or to apply an average. The averaging is deactivated in case of interpolated data. Only needed for query_type_lab=False.

  • every_nth_value – Defines over how many values we compute the average or the nth sample in case we down sample the data. In case query_type_lab=True every nth second is returned.


pandas.DataFrame containing the data of the specified parameters.
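
The difference between forward filling, down-sampling and averaging over every_nth_value can be shown on plain lists. The helpers below are hypothetical and pure Python; the documented method works on pandas objects (pandas.ffill and pandas.interpolate), so this is only a sketch of the three operations, with None marking a gap.

```python
def forward_fill(values):
    """Replace each gap (None) with the last recorded value."""
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def down_sample(values, every_nth_value):
    """Keep only every nth sample."""
    return values[::every_nth_value]


def block_average(values, every_nth_value):
    """Average consecutive blocks of n samples."""
    n = every_nth_value
    blocks = (values[i:i + n] for i in range(0, len(values), n))
    return [sum(block) / len(block) for block in blocks]
```

Down-sampling keeps original samples and discards the rest, whereas averaging uses every sample but returns values that were never recorded as such.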


Function which displays how long until the current token expires.

straxen.scada.convert_time_zone(df, tz)[source]

Function which converts the current time zone of a given pd.DataFrame into another timezone.

  • df – pandas.DataFrame containing the Data. Index must be a datetime object with time zone information.

  • tz – str representing the timezone the index should be converted to. See the notes for more information.


pandas.DataFrame with converted time index.


1.) The input pandas.DataFrame must be indexed via datetime objects which are timezone aware.

2.) You can find a complete list of available timezones via: `import pytz; pytz.all_timezones`. You can also specify ‘strax’ as timezone, which will convert the time index into a ‘strax time’ equivalent. The default timezone of strax is UTC.
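
The two kinds of conversion mentioned in the notes can be sketched with the standard datetime module. This is an illustration on single timestamps rather than on a DataFrame index, and it assumes that ‘strax time’ means integer nanoseconds since the unix epoch in UTC; the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_other_zone(dt, tz):
    """Express a timezone-aware datetime in another timezone."""
    if dt.tzinfo is None:
        raise ValueError("Timestamp must be timezone aware")
    return dt.astimezone(tz)


def to_strax_time(dt):
    """Integer nanoseconds since the unix epoch (UTC), computed without floats."""
    if dt.tzinfo is None:
        raise ValueError("Timestamp must be timezone aware")
    return ((dt - EPOCH) // timedelta(microseconds=1)) * 1000
```

Both helpers reject naive timestamps, mirroring the requirement in note 1 that the index carries timezone information.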

straxen.units module

Define unit system for pax (i.e., seconds, etc.)

This sets up variables for the various unit abbreviations, ensuring we always have a ‘consistent’ unit system. There are almost no cases that you should change this without talking with a maintainer.

Module contents