Storing configuration files in straxen
Most of the configuration files are stored in a mongo database that require a password to access. Below the different methods of opening files such as neural-nets in straxen are explained. Further down below a scheme is included what is done behind the scenes.
Downloading XENONnT files from the database
Most generically one downloads files using the utilix.mongo_storage.MongoDownloader
function. For example, one can download a file:
import utilix
downloader = utilix.mongo_storage.MongoDownloader()
# The downloader allows one to download files from the mongo database by
# looking for the requested name in the files database. The downloader
#returns the path of the downloaded file.
requested_file_name = 'fax_config.json'
config_path = downloader.download_single(requested_file_name)
# We can now open the file using get_resource
simulation_config = straxen.get_resource(config_path, fmt='json')
Alternatively, one can rely on the loading of straxen.get_resource() as below:
import straxen
simulation_config = straxen.get_resource(requested_file_name, fmt='json')
Downloading public placeholder files
It is also possible to load any of our placeholder files. This is for example used for testing purposes of the software using continuous integration. This can be done as per the example below. The advantage of loading files in this manner is that it does not require any password. However, this kind of access is restricted to very few files and there are also no checks in place to make sure if the requested file is the latest file. Therefore, this manner of loading data is intended only for testing purposes.
import straxen
requested_url = (
'https://github.com/XENONnT/strax_auxiliary_files/blob/'
'3548132b55f81a43654dba5141366041e1daaf01/strax_files/fax_config.json')
simulation_config = straxen.common.resource_from_url(requested_url, fmt='json')
How does the downloading work?
In utilix/mongo_storage.py there are two classes that take care of the
downloading and the uploading of files to the files database. In this
database we store configuration files under a config_identifier i.e. the
'file_name'. This is the label that is used to find the document one is
requesting.
Scheme
The scheme below illustrates how the different components work together to make
files available in the database to be loaded by a user. Starting on the left,
an admin user (with the credentials to upload files to the database) uploads a
file to the files- database (not shown) such that it can be downloaded later
by any user. The admin user can upload a file using the command
MongoUploader.upload_from_dict({'file_name', '/path/to/file'}).
This command will use the utilix.mongo_storage.MongoUploader class to put the file
'file_name' in the files database. The utilix.mongo_storage.MongoUploader will
communicate with the database via GridFs.
The GridFs interface communicates with two mongo-collections; 'fs.files' and
'fs.chunks', where the former is used for bookkeeping and the latter for
storing pieces of data (not to be confused with strax.Chunks).
Uploading
When the admin user issues the command to upload the 'file_name'-file. The
utilix.mongo_storage.MongoUploader will check that the file is not already stored in the
database. To this end, the utilix.mongo_storage.MongoUploader computes the md5-hash of
the file stored under the '/path/to/file'. If this is the first time a file
with this md5-hash is uploaded, utilix.mongo_storage.MongoUploader will upload it to
GridFs. If there is already an existing file with the md5-hash, there is no
need to upload. This however does mean that if there is already a file 'file_name'
stored and you modify the 'file_name'-file, it will be uploaded again! This is
a feature, not a bug. When a user requests the 'file_name'-file, the
utilix.mongo_storage.MongoDownloader will fetch the 'file_name'-file that was uploaded
last.
Downloading
Assuming that an admin user uploaded the 'file_name'-file, any user (no
required admin rights) can now download the 'file_name'-file (see above for the
example). When the user executes MongoUploader.download_single('file_name'),
the utilix.mongo_storage.MongoDownloader will check if the file is downloaded already. If
this is the case it will simply return the path of the file. Otherwise, it will
start downloading the file. It is important to notice that the files are saved
under their md5-hash-name. This means that wherever the files are stored,
it’s unreadable what the file (or extension is). The reason to do it in this
way is that it will make sure that the file is never downloaded when it is
already stored but it would be if the file has been changed as explained above.
Straxen Mongo config loader classes
Both the utilix.mongo_storage.MongoUploader and utilix.mongo_storage.MongoDownloader share a common
parent class, the straxen.GridFsInterface that provides the appropriate
shared functionality and connection to the database. The important difference
is the readonly argument that naturally has to be False for the
utilix.mongo_storage.MongoUploader but True for the utilix.mongo_storage.MongoDownloader.