straxen.storage package

Submodules

straxen.storage.mongo_storage module

class straxen.storage.mongo_storage.GridFsInterface(readonly=True, file_database='files', config_identifier='config_name', collection=None, _test_on_init=False)[source]

Bases: object

Base class to upload files to and download files from a database using GridFS for PyMongo: https://pymongo.readthedocs.io/en/stable/api/gridfs/index.html#module-gridfs

This class handles the initialization shared by the downloader and uploader classes.

static compute_md5(abs_path)[source]

NB: RAM-intensive operation! Get the MD5 hash of the file stored under abs_path.

Parameters:

abs_path – str, absolute path to a file

Returns:

str, the md5-hash of the requested file
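
The RAM warning above hints that the file may be read into memory in one go. As a minimal sketch of getting the same kind of digest with bounded memory, the helper below hashes a file in fixed-size chunks with the standard hashlib module. The name md5_of_file and the chunk_size parameter are hypothetical and not part of straxen; nothing here is taken from the straxen implementation.

```python
import hashlib


def md5_of_file(abs_path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at abs_path.

    Hypothetical helper, not part of straxen: the file is read in
    chunks so memory use stays bounded regardless of file size.
    """
    digest = hashlib.md5()
    with open(abs_path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result is the same hex string that hashing the full file contents at once would give, so it can be compared against a stored digest.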

config_exists(config)[source]

Quickly check whether this config is already saved in the collection.

Parameters:

config – str, name of the file of interest

Returns:

bool, whether this config name is stored in the database

document_format(config)[source]

Format of the document to upload.

Parameters:

config – str, name of the file of interest

Returns:

dict, that will be used to add the document

get_query_config(config)[source]

Generate the identifier to query against. This is just the config's name.

Parameters:

config – str, name of the file of interest

Returns:

dict, that can be used in queries
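
To illustrate what such a query dict could look like, here is a small stand-alone sketch. It assumes the dict simply maps the config_identifier field (default 'config_name' in the class signature) to the requested name; that mapping is an assumption, and the file name is made up.

```python
def get_query_config(config, config_identifier="config_name"):
    """Build a MongoDB-style filter that matches one config by name.

    Sketch only: assumes the query maps the identifier field to the
    config name, which is not confirmed by the text above.
    """
    return {config_identifier: config}


# "example_map.json" is a made-up file name used for illustration
query = get_query_config("example_map.json")
```

A dict of this shape can be passed as the filter argument of a PyMongo find or find_one call.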

list_files()[source]

Get a complete list of files that are stored in the database.

Returns:

list, list of the names of the items stored in this database

md5_stored(abs_path)[source]

NB: RAM-intensive operation! Check whether the MD5 hash stored in the database matches that of the file stored under abs_path.

Parameters:

abs_path – str, absolute path to the file name

Returns:

bool, whether the exact same file is already stored in the database

test_find()[source]

Test the connection to self.collection by checking whether a collection.find operation can be performed.

class straxen.storage.mongo_storage.MongoDownloader(store_files_at=None, *args, **kwargs)[source]

Bases: GridFsInterface

Class to download files from GridFs.

download_all()[source]

Download all the files that are stored in the mongo collection.

download_single(config_name: str, human_readable_file_name=False)[source]

Download the config_name if it exists.

Parameters:
  • config_name – str, the name under which the file is stored

  • human_readable_file_name – bool, also store the file under its human-readable name. It is better not to use this, as the user might not know whether the version of the file is the latest.

Returns:

str, the absolute path of the file requested

get_abs_path(config_name)[source]

class straxen.storage.mongo_storage.MongoUploader(readonly=False, *args, **kwargs)[source]

Bases: GridFsInterface

Class to upload files to GridFs.

upload_from_dict(file_path_dict)[source]

Upload all files in the dictionary to the database.

Parameters:

file_path_dict – dict, dictionary of paths to upload. The dict should be of the format: file_path_dict = {'config_name': '/the_config_path', …}

Returns:

None
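
As a rough sketch of preparing the input for upload_from_dict, the helper below builds a dict in the documented {'config_name': '/the_config_path'} format from every regular file in a directory. The helper name build_file_path_dict is hypothetical and not part of straxen, and using the bare file name as the config name is an assumption.

```python
import os


def build_file_path_dict(directory):
    """Map each regular file's name in `directory` to its absolute path.

    Hypothetical helper, not part of straxen. Subdirectories are skipped.
    """
    directory = os.path.abspath(directory)
    return {
        name: os.path.join(directory, name)
        for name in sorted(os.listdir(directory))
        if os.path.isfile(os.path.join(directory, name))
    }
```

The resulting dict could then be handed to upload_from_dict on a MongoUploader instance.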

upload_single(config, abs_path)[source]

Upload a single file to gridfs.

Parameters:
  • config – str, the name under which this file should be stored

  • abs_path – str, the absolute path of the file

straxen.storage.online_monitor_frontend module

class straxen.storage.online_monitor_frontend.OnlineMonitor(uri=None, take_only=None, database=None, col_name='online_monitor', readonly=True, *args, **kwargs)[source]

Bases: MongoFrontend

Online monitor frontend for saving data temporarily to the database.

backends: list

straxen.storage.online_monitor_frontend.get_mongo_uri(user_key='pymongo_user', pwd_key='pymongo_password', url_key='pymongo_url', header='RunDB')[source]

straxen.storage.rucio_local module

class straxen.storage.rucio_local.RucioLocalBackend(rucio_dir, *args, **kwargs)[source]

Bases: FileSytemBackend

Get data from local rucio RSE.

class straxen.storage.rucio_local.RucioLocalFrontend(path=None, *args, **kwargs)[source]

Bases: StorageFrontend

Storage that loads from rucio by assuming the rucio file naming convention without access to the rucio database.

Normally, you don’t need this StorageFrontend, as it should return the same data as the RunDB frontend.

backends: list

determine_rse()[source]

did_is_local(did)[source]

Determines whether a given DID is on a local RSE. If there is no local RSE, returns False.

Parameters:

did – Rucio DID string

Returns:

bool, whether the DID is local

local_prefixes = {'SDSC_USERDISK': '/expanse/lustre/projects/chi135/shockley/rucio', 'UC_DALI_USERDISK': '/dali/lgrandi/rucio/'}
local_rses = {'SDSC_USERDISK': '.sdsc.', 'UC_DALI_USERDISK': '.rcc.'}
storage_type = 1
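
One plausible way the local_rses mapping could be used to pick an RSE is to look for each pattern in the host name. The sketch below does that with re.search. Treating the values as regular expressions searched anywhere in the host name is an assumption, and the function name guess_local_rse and the example host names are made up.

```python
import re

# Copied from the local_rses class attribute listed above
LOCAL_RSES = {"SDSC_USERDISK": ".sdsc.", "UC_DALI_USERDISK": ".rcc."}


def guess_local_rse(hostname, local_rses=LOCAL_RSES):
    """Return the first RSE whose pattern is found in hostname, else None.

    Assumption: each pattern is a regular expression that may match
    anywhere in the host name. Not taken from the straxen source.
    """
    for rse, pattern in local_rses.items():
        if re.search(pattern, hostname):
            return rse
    return None


# Host names below are invented for illustration
rse_on_dali = guess_local_rse("dali-login1.rcc.uchicago.edu")
rse_elsewhere = guess_local_rse("localhost")
```

Returning None when nothing matches mirrors the documented behaviour that, without a local RSE, no DID is considered local.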

straxen.storage.rucio_remote module

class straxen.storage.rucio_remote.RucioFrontend(*args, **kwargs)[source]

Bases: RucioRemoteFrontend

backends: list

class straxen.storage.rucio_remote.RucioRemoteBackend(staging_dir, download_heavy=False, **kwargs)[source]

Bases: FileSytemBackend

Get data from remote Rucio RSE.

dset_cache: Dict[str, str] = {}
heavy_types = ['raw_records', 'raw_records_nv', 'raw_records_he']

class straxen.storage.rucio_remote.RucioRemoteFrontend(download_heavy=False, staging_dir='./strax_data', *args, **kwargs)[source]

Bases: StorageFrontend

Uses the Rucio client to find data.

backends: list

find(key: DataKey, write=False, check_broken=False, **kwargs)[source]

Return (str: backend class name, backend-specific) key to get at / write data, or raise exception.

Parameters:
  • key – DataKey of data to load {data_type: (plugin_name, version, {config_option: value, …}, …}

  • write – Set to True if writing new data. The data is immediately registered, so you must follow up on the write!

  • check_broken – If True, raise DataNotAvailable if data has not been completely written, or writing terminated with an exception.

find_several(keys, **kwargs)[source]

Return list with backend keys or False for several data keys.

Options are as for find()

local_did_cache = None
path = None
storage_type = 4

class straxen.storage.rucio_remote.RucioSaver(*args, **kwargs)[source]

Bases: Saver

TODO Saves data to rucio if you are the production user.

exception straxen.storage.rucio_remote.TooMuchDataError[source]

Bases: Exception

straxen.storage.rucio_remote.key_to_rucio_did(key: DataKey) → str[source]

Convert a strax.DataKey to a Rucio DID field in the rundoc.
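
For orientation, here is a minimal sketch of such a conversion. It assumes a DID layout of `xnt_<run_id>:<data_type>-<lineage_hash>`; the `xnt_` prefix and the separators are assumptions not stated in this page. FakeKey is a stand-in for strax.DataKey that carries only the three fields used, and key_to_did is a hypothetical name.

```python
from collections import namedtuple

# Stand-in for strax.DataKey with only the fields this sketch needs
FakeKey = namedtuple("FakeKey", ["run_id", "data_type", "lineage_hash"])


def key_to_did(key):
    """Format a key as '<scope>:<name>' using the assumed layout."""
    return f"xnt_{key.run_id}:{key.data_type}-{key.lineage_hash}"


# Run id and hash are invented values
did = key_to_did(FakeKey("012345", "raw_records", "abc123"))
```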

straxen.storage.rucio_remote.parse_rucio_did(did: str) → tuple[source]

Parses a Rucio DID and returns a tuple of (number:int, dtype:str, hash:str)
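
A sketch of how a DID might be split into the documented (number, dtype, hash) tuple is shown below. It assumes DIDs of the form `<prefix>_<number>:<dtype>-<hash>`, which is an assumption rather than something this page specifies; parse_did is a hypothetical name and the example DID is invented.

```python
def parse_did(did):
    """Split '<prefix>_<number>:<dtype>-<hash>' into (int, str, str).

    Sketch under an assumed DID layout; raises ValueError or
    IndexError if the string does not follow that layout.
    """
    scope, name = did.split(":")
    number = int(scope.split("_")[1])
    dtype, hash_ = name.split("-")
    return number, dtype, hash_


# Invented DID used only for illustration
parsed = parse_did("xnt_012345:raw_records-abc123")
```

Note that the run number comes back as an int, so any zero padding in the scope is lost.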

straxen.storage.rundb module

class straxen.storage.rundb.RunDB(minimum_run_number=7157, maximum_run_number=None, runid_field='name', local_only=False, new_data_path=None, reader_ini_name_is_mode=False, rucio_path=None, mongo_url=None, mongo_user=None, mongo_password=None, mongo_database=None, *args, **kwargs)[source]

Bases: StorageFrontend

Frontend that searches RunDB MongoDB for data.

backends: list

find_several(keys: List[DataKey], **kwargs)[source]

Return list with backend keys or False for several data keys.

Options are as for find()

hosts = {'dali': '^dali.*rcc.*|^midway2.*rcc.*|^midway.*rcc.*|fried.rice.edu'}

number_query()[source]

progress_bar = False
provide_run_metadata = True

run_metadata(run_id, projection=None)[source]

Return run metadata dictionary, or raise RunMetadataNotAvailable.

storage_type = 1
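
The hosts attribute above maps a host alias to a regular expression. As a sketch of how such a mapping could be evaluated, the function below returns the alias whose pattern matches a host name at its start, using re.match. Whether RunDB applies the pattern this way is an assumption; host_alias is a hypothetical name and the example host names are invented.

```python
import re

# Copied from the hosts class attribute listed above
HOSTS = {"dali": "^dali.*rcc.*|^midway2.*rcc.*|^midway.*rcc.*|fried.rice.edu"}


def host_alias(hostname, hosts=HOSTS):
    """Return the alias whose pattern matches hostname, or None.

    Assumption: patterns are applied with re.match, i.e. anchored at
    the start of the host name. Not taken from the straxen source.
    """
    for alias, pattern in hosts.items():
        if re.match(pattern, hostname):
            return alias
    return None


# Host names below are invented for illustration
alias_dali = host_alias("dali-login1.rcc.local")
alias_other = host_alias("example.org")
```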

Module contents