Quickstart
This is the quickstart guide for creating a simulation workflow package based on libpyvinyl. Please first install libpyvinyl following the instructions in the Installation section.
Introduction
This section is intended to help a developer understand how a new simulation package can use libpyvinyl as a foundation. libpyvinyl provides base classes from which a developer derives more specialised classes; the final class then contains both the new functionality and the basic capabilities. To make a new package, a developer has to inherit from these base classes:
BaseCalculator
BaseData
BaseFormat
Calculator
The specialised calculator that inherits from BaseCalculator performs a calculation of some sort. The calculation can depend on input data and on the values of parameters specified when the calculator is built. The calculator can also return output data. The scope of a calculator is somewhat arbitrary, but the power of libpyvinyl comes from the ability to break a big calculation down into smaller parts, each handled by an individual calculator. A calculator with a small number of parameters is easier for the user to understand, as there is less risk of ambiguity. libpyvinyl provides a rich Parameter class to create the necessary parameters in each calculator. When creating a parameter, it is possible to set allowed intervals to avoid undefined behaviour.
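To make the idea of allowed intervals concrete, here is a minimal plain-Python sketch. The `BoundedParameter` class and its `add_interval` method are hypothetical names invented for this illustration; they are not libpyvinyl's Parameter API, only a picture of the behaviour one would expect from a parameter that rejects illegal values.

```python
class BoundedParameter:
    """Hypothetical stand-in for a parameter with allowed intervals."""

    def __init__(self, name, comment=""):
        self.name = name
        self.comment = comment
        self._intervals = []  # list of inclusive (low, high) tuples
        self._value = None

    def add_interval(self, low, high):
        """Declare [low, high] as a legal range for the value."""
        self._intervals.append((low, high))

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        # With no intervals declared, any value is accepted.
        if self._intervals and not any(
            low <= new_value <= high for low, high in self._intervals
        ):
            raise ValueError(
                f"{self.name}={new_value} is outside the allowed intervals"
            )
        self._value = new_value


energy = BoundedParameter("energy", comment="Particle energy")
energy.add_interval(0.0, 100.0)
energy.value = 25.0  # accepted
try:
    energy.value = -5.0  # rejected before any calculation runs
except ValueError as err:
    print(err)
```

Rejecting the value at assignment time means a mistake surfaces when the parameter is set, not in the middle of a long calculation.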
Data
To create a description of some data that can be either given to or returned from a calculator one starts with the BaseData class. This data could be for example a number of particle states.
DataFormat
Each Data class has a number of supported DataFormat classes, which are needed to save the data to disk. Our particle data from before could be saved as JSON, YAML or some compressed format; each would need a DataFormat class containing methods to read and write such data and make it available to the corresponding Data class.
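As a rough illustration of what such a format class is responsible for, the sketch below writes a dictionary of particle data to a JSON file and reads it back. `JSONParticleFormat` and the contents of `particles` are assumptions made up for this example; a real libpyvinyl format class would derive from the library's base class instead.

```python
import json
import os
import tempfile


class JSONParticleFormat:
    """Hypothetical format class: moves particle data between memory and a JSON file."""

    @staticmethod
    def write(data_dict, filename):
        with open(filename, "w") as fh:
            json.dump(data_dict, fh)

    @staticmethod
    def read(filename):
        with open(filename) as fh:
            return json.load(fh)


# Made-up particle states, reduced to two lists of floats.
particles = {"energy": [1.0, 2.5, 3.0], "weight": [0.5, 0.25, 0.25]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "particles.json")
    JSONParticleFormat.write(particles, path)
    restored = JSONParticleFormat.read(path)

# The data read from disk matches what was in memory.
print(restored == particles)
```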
First steps as a developer
To build a simulation package in this framework, think about what calculation needs to be performed and what parameters are needed to describe it. Then divide this big calculation into calculators with a limited number of parameters and clear input and output data. For example, a particle source would need parameters describing the properties of the emitted particles and would return a Data object with a large number of particle states. A calculator describing a piece of optics might then have parameters describing its geometry, and it could have particle states as both input and output. With these kinds of considerations it becomes clear which Calculator and Data classes should be written.
Benefit of libpyvinyl
When a package uses libpyvinyl as a foundation, libpyvinyl can be used to write a simulation from a series of these calculators using the Instrument class. Here is an example of a series of calculators that form a simple instrument.
Calculator | Description | Parameters | Input Data | Output Data |
---|---|---|---|---|
Source | Emits particles | size, divergence, energy | None | particle states |
Monochromator | Crystal | position, d_spacing, mosaicity | particle states | particle states |
Monochromator | Crystal | position, d_spacing, mosaicity | particle states | particle states |
Sample | Crystal sample | position, d_spacing, mosaicity | particle states | particle states |
Detector | Particle detector | position, size, sensitivity | particle states | counts in bins |
This setup uses two monochromators, each with its own parameters. The user can set up a master parameter that controls both, for example to ensure they have the same d_spacing. Running the instrument then corresponds to running each calculator in turn and passing the output of one to the next.
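The two ideas in this paragraph, running calculators in sequence and steering several of them with one master value, can be sketched in plain Python. Everything here is hypothetical: the functions `source`, `monochromator`, `detector`, `set_master` and `run`, the toy selection rule, and the numbers are illustration only and do not use the Instrument class.

```python
# Each hypothetical calculator is a function: (parameters, input data) -> output data.
def source(params, _data):
    # Emit particle "states", reduced here to a list of wavelengths.
    return [1.0, 2.0, 3.0, 4.0]


def monochromator(params, wavelengths):
    # Toy selection rule: keep wavelengths up to twice the d_spacing.
    return [w for w in wavelengths if w <= 2 * params["d_spacing"]]


def detector(params, wavelengths):
    # Reduce the particle states to a single count.
    return len(wavelengths)


calculators = [
    ("Source", source, {}),
    ("Mono1", monochromator, {"d_spacing": 5.0}),
    ("Mono2", monochromator, {"d_spacing": 5.0}),
    ("Detector", detector, {}),
]


def set_master(calculators, links, value):
    """Write one value into every (calculator name, parameter name) pair in links."""
    for name, _func, params in calculators:
        for calc_name, par_name in links:
            if name == calc_name:
                params[par_name] = value


def run(calculators):
    """Run each calculator in turn, feeding the output of one to the next."""
    data = None
    for _name, func, params in calculators:
        data = func(params, data)
    return data


# One master value keeps both monochromators at the same d_spacing.
set_master(calculators, [("Mono1", "d_spacing"), ("Mono2", "d_spacing")], 1.5)
counts = run(calculators)
print(counts)
```

With a d_spacing of 1.5 the toy rule keeps the wavelengths 1.0, 2.0 and 3.0, so the detector reports three counts.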
Design a minimal instrument
As a minimal start, we will create an instrument with a calculator that can get the sum of two numbers.
Three specialized classes need to be defined for the package:
CalculatorClass: a class based on BaseCalculator to perform the calculation.
DataClass: to represent the input and output data of the CalculatorClass.
FormatClass: the interface to exchange data between memory and a file on disk in a specific format.
Define a simple python object mapping DataClass
Let’s first define a NumberData class that maps Python objects in memory. This is done by creating a mapping dictionary that connects the data (e.g. an array or a single value) in the Python object to the reference variable.
[1]:
from libpyvinyl.BaseData import BaseData


class NumberData(BaseData):
    def __init__(self, key, data_dict=None, filename=None,
                 file_format_class=None, file_format_kwargs=None):
        expected_data = {}
        ### DataClass developer's job start
        expected_data["number"] = None
        ### DataClass developer's job end
        super().__init__(key, expected_data, data_dict,
                         filename, file_format_class, file_format_kwargs)

    @classmethod
    def supported_formats(cls):
        ### DataClass developer's job start
        format_dict = {}
        ### DataClass developer's job end
        return format_dict


# Test if the definition works
data = NumberData(key="test")
The above example shows a minimal definition of a DataClass. Simulation package developers only need to consider two sections:
expected_data: a dictionary whose keys are the expected keys of the dictionary returned by get_data(). Here we simply want to get a “number” from a NumberData.
format_dict: a dictionary of the supported formats for files on disk. For now we only need a Python object mapper, so we assign an empty dict to it.
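One way to read expected_data is as a contract on the keys a data dictionary should provide. The sketch below compares a dictionary against that contract in plain Python. The `missing_keys` helper is hypothetical and only illustrates the idea; it makes no claim about how libpyvinyl itself validates data.

```python
expected_data = {"number": None}


def missing_keys(expected, data_dict):
    """Return the expected keys that data_dict does not provide, sorted by name."""
    return sorted(set(expected) - set(data_dict))


print(missing_keys(expected_data, {"number": 1}))  # nothing missing
print(missing_keys(expected_data, {"value": 1}))   # "number" is missing
```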
Define a DataClass also supporting file mapping
For software that writes its data to a file instead of a Python object, an interface between the file and the DataClass is necessary. We create a FormatClass as that interface:
[2]:
import numpy as np
from libpyvinyl.BaseFormat import BaseFormat


class TXTFormat(BaseFormat):
    def __init__(self) -> None:
        super().__init__()

    @classmethod
    def format_register(cls):
        key = "TXT"
        description = "TXT format for NumberData"
        file_extension = ".txt"
        read_kwargs = [""]
        write_kwargs = [""]
        return cls._create_format_register(
            key, description, file_extension, read_kwargs, write_kwargs
        )

    @staticmethod
    def direct_convert_formats():
        return []

    @classmethod
    def convert(
        cls, obj: BaseData, output: str, output_format_class: str, key, **kwargs
    ):
        raise NotImplementedError

    @classmethod
    def read(cls, filename: str) -> dict:
        """Read the data from the file with the `filename` to
        a dictionary. The dictionary will be used by its corresponding data class."""
        number = float(np.loadtxt(filename))
        data_dict = {"number": number}
        return data_dict

    @classmethod
    def write(cls, object, filename: str, key: str = None):
        """Save the data with the `filename`."""
        data_dict = object.get_data()
        arr = np.array([data_dict["number"]])
        np.savetxt(filename, arr, fmt="%.3f")
        # Derive a key for the new file-mapped data object if none is given
        if key is None:
            original_key = object.key
            key = original_key + "_to_TXTFormat"
        return object.from_file(filename, cls, key)


# Test if the definition works
data = TXTFormat()
In the above example, we create a TXTFormat class based on the BaseFormat abstract class. We need to provide:
The information in the format_register method, so that the format can be registered in the NumberData.supported_formats() method. This will be explained later.
The read function to read the data from the file into the data_dict, which is accessed through NumberData.get_data(). The dictionary keys match those in the expected_data of NumberData.
The write function to write the NumberData object into a file in TXTFormat.
The other methods above can simply be copied; there is no need to modify them for now.
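One consequence of the `fmt="%.3f"` used in the write method above is that a number keeps only three decimals once it has passed through a file. The sketch below shows the same round trip with plain Python string formatting standing in for np.savetxt and np.loadtxt. The helpers `write_number` and `read_number` are hypothetical and are not part of TXTFormat.

```python
import os
import tempfile


def write_number(number, filename):
    # Same format string as the write method above: three decimals are kept.
    with open(filename, "w") as fh:
        fh.write("%.3f\n" % number)


def read_number(filename):
    # Return a dictionary with the key that NumberData expects.
    with open(filename) as fh:
        return {"number": float(fh.read())}


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "number.txt")
    write_number(3.14159, path)
    data_dict = read_number(path)

print(data_dict)  # the value has been rounded to three decimals
```

If full precision matters for a format, a wider format string would be the natural thing to change.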
Then we just need to add the TXTFormat to the NumberData created in the last section.
[3]:
class NumberData(BaseData):
    def __init__(self, key, data_dict=None, filename=None,
                 file_format_class=None, file_format_kwargs=None):
        expected_data = {}
        ### DataClass developer's job start
        expected_data["number"] = None
        ### DataClass developer's job end
        super().__init__(key, expected_data, data_dict,
                         filename, file_format_class, file_format_kwargs)

    @classmethod
    def supported_formats(cls):
        ### DataClass developer's job start
        format_dict = {}
        cls._add_ioformat(format_dict, TXTFormat)
        ### DataClass developer's job end
        return format_dict
You can list the formats it supports with:
[4]:
NumberData.list_formats()
Format class: <class '__main__.TXTFormat'>
Key: TXT
Description: TXT format for NumberData
File extension: .txt
Define a Calculator with native python object output
Assuming we have a simulation code whose output is a native python object (e.g. a list or dict), we can create a CalculatorClass for the simulation code:
[5]:
from typing import Union
from pathlib import Path
from libpyvinyl.BaseData import DataCollection
from libpyvinyl.BaseCalculator import BaseCalculator, CalculatorParameters


class PlusCalculator(BaseCalculator):
    def __init__(self, name: str, input: Union[DataCollection, list, NumberData],
                 output_keys: Union[list, str] = ["plus_result"],
                 output_data_types=[NumberData], output_filenames: Union[list, str] = [],
                 instrument_base_dir="./", calculator_base_dir="PlusCalculator",
                 parameters=None):
        """A python object calculator example"""
        super().__init__(name, input, output_keys, output_data_types=output_data_types,
                         output_filenames=output_filenames, instrument_base_dir=instrument_base_dir,
                         calculator_base_dir=calculator_base_dir, parameters=parameters)

    def init_parameters(self):
        parameters = CalculatorParameters()
        times = parameters.new_parameter(
            "plus_times", comment="How many times to do the plus"
        )
        times.value = 1
        self.parameters = parameters

    def backengine(self):
        Path(self.base_dir).mkdir(parents=True, exist_ok=True)
        input_num0 = self.input.to_list()[0].get_data()["number"]
        input_num1 = self.input.to_list()[1].get_data()["number"]
        output_num = float(input_num0) + float(input_num1)
        # Add the second number again for each additional "plus_times"
        if self.parameters["plus_times"].value > 1:
            for _ in range(self.parameters["plus_times"].value - 1):
                output_num += input_num1
        # Pass the result to the output data as a python object
        data_dict = {"number": output_num}
        key = self.output_keys[0]
        output_data = self.output[key]
        output_data.set_dict(data_dict)
        return self.output
In the above example, we define a PlusCalculator based on the BaseCalculator. The following needs to be provided:
Some default output-related values to initialize empty output Data containers:
output_keys: the key of each Data object in the output DataCollection.
output_data_types: the Data type of each Data object.
output_filenames: the filenames of the output files (if any).
init_parameters to define the default values of the parameters needed by the calculator. Range restrictions and units of values can also be set here. Details can be found in the parameter use guide.
backengine to define how to conduct the calculation. It should return a reference to the output DataCollection.
The PlusCalculator.backengine takes the two numbers enclosed in an input DataCollection and adds the second number to the first one PlusCalculator.parameters["plus_times"].value times. The reference dictionary of Python objects data_dict is passed to the corresponding NumberData in the auto-initialized self.output: DataCollection by
output_data.set_dict(data_dict)
Let’s create an instance from the class:
[6]:
input1 = NumberData.from_dict({"number": 1}, "input1")
input2 = NumberData.from_dict({"number": 1}, "input2")
calculator_plus = PlusCalculator(name="test",input=[input1,input2])
Check available parameters of it:
[7]:
print(calculator_plus.parameters)
- Parameters object -
plus_times 1 How many times to do the plus
Run the calculator with default parameters
[8]:
result = calculator_plus.backengine()
print(result.get_data())
{'number': 2.0}
Modify the parameter and see the difference:
[9]:
calculator_plus.parameters["plus_times"] = 5
print(calculator_plus.backengine().get_data())
{'number': 6.0}
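The two results above follow directly from the loop in backengine. The plain-Python function below mirrors that arithmetic without any libpyvinyl objects. The name `plus` is a hypothetical helper used only to check the numbers by hand.

```python
def plus(a, b, times=1):
    """Mirror of the arithmetic in PlusCalculator.backengine."""
    result = float(a) + float(b)
    # Each additional "time" adds the second number once more.
    for _ in range(times - 1):
        result += b
    return result


print(plus(1, 1))           # default plus_times of 1
print(plus(1, 1, times=5))  # plus_times set to 5
```

With both inputs equal to 1, the default gives 1 + 1 = 2.0, and plus_times = 5 gives 1 + 5 × 1 = 6.0.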
Define a Calculator with native file output
[10]:
from typing import Union
from pathlib import Path
import numpy as np
from libpyvinyl.BaseData import DataCollection
from libpyvinyl.BaseCalculator import BaseCalculator, CalculatorParameters


class MinusCalculator(BaseCalculator):
    def __init__(
        self,
        name: str,
        input: Union[DataCollection, list, NumberData],
        output_keys: Union[list, str] = ["minus_result"],
        output_data_types=[NumberData],
        output_filenames: Union[list, str] = ["minus_result.txt"],
        instrument_base_dir="./",
        calculator_base_dir="MinusCalculator",
        parameters=None,
    ):
        """A file output calculator example"""
        super().__init__(
            name,
            input,
            output_keys,
            output_data_types=output_data_types,
            output_filenames=output_filenames,
            instrument_base_dir=instrument_base_dir,
            calculator_base_dir=calculator_base_dir,
            parameters=parameters,
        )

    def init_parameters(self):
        parameters = CalculatorParameters()
        times = parameters.new_parameter(
            "minus_times", comment="How many times to do the minus"
        )
        times.value = 1
        self.parameters = parameters

    def backengine(self):
        Path(self.base_dir).mkdir(parents=True, exist_ok=True)
        input_num0 = self.input.to_list()[0].get_data()["number"]
        input_num1 = self.input.to_list()[1].get_data()["number"]
        output_num = float(input_num0) - float(input_num1)
        # Subtract the second number again for each additional "minus_times"
        if self.parameters["minus_times"].value > 1:
            for _ in range(self.parameters["minus_times"].value - 1):
                output_num -= input_num1
        # Write the result to a text file and map the output data to that file
        arr = np.array([output_num])
        file_path = self.output_file_paths[0]
        np.savetxt(file_path, arr, fmt="%.3f")
        key = self.output_keys[0]
        output_data = self.output[key]
        output_data.set_file(file_path, TXTFormat)
        return self.output
MinusCalculator is similar to PlusCalculator, except that its output data is a NumberData mapping to a TXTFormat file instead of a Python object.
The simulation results can be obtained in the same way as for PlusCalculator:
[11]:
input1 = NumberData.from_dict({"number": 5}, "input1")
input2 = NumberData.from_dict({"number": 1}, "input2")
calculator_minus = MinusCalculator(name="test",input=[input1,input2])
output = calculator_minus.backengine()
print(output.get_data())
{'number': 4.0}
We can see that output now maps to a file:
[12]:
print(output)
Data collection:
key - mapping
minus_result - <class '__main__.TXTFormat'>: MinusCalculator/minus_result.txt
If we read the file, we should get the same result.
[13]:
print(output["minus_result"].filename)
with open(output["minus_result"].filename, 'r') as fh:
    print(fh.read())
MinusCalculator/minus_result.txt
4.000
Define an instrument
We can assemble a single PlusMinus instrument from the two Calculators to sum input1 and input2 and then subtract input2 from the result:
[16]:
from libpyvinyl import Instrument
# Create an Instrument with the name PlusMinus
calculation_instrument = Instrument("PlusMinus")
# Create python object data as input
input1 = NumberData.from_dict({"number": 1}, "input1")
input2 = NumberData.from_dict({"number": 2}, "input2")
calculator_plus = PlusCalculator(name="Plus",input=[input1,input2])
# Use the output of calculator_plus as the input of calculator_minus
calculator_minus = MinusCalculator(name="Minus",input=[calculator_plus.output["plus_result"],input2])
# Assemble the instrument
calculation_instrument.add_calculator(calculator_plus)
calculation_instrument.add_calculator(calculator_minus)
# Set the base output path of the instrument
instrument_path = "PlusMinus"
calculation_instrument.set_instrument_base_dir(str(instrument_path))
Run the instrument
[17]:
# 1+2-2 = 1
calculation_instrument.run()
calculation_instrument.calculators['Minus'].output.get_data()
[17]:
{'number': 1.0}
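The instrument result can be traced by hand with a plain-Python sketch that chains the two calculations and stores each output under its calculator name. The helpers `plus` and `minus` are hypothetical closed forms of the two backengine loops, written under the assumption that the times parameter is at least 1. They do not use libpyvinyl.

```python
def plus(a, b, times=1):
    # Closed form of the PlusCalculator loop, valid for times >= 1.
    return float(a) + times * float(b)


def minus(a, b, times=1):
    # Closed form of the MinusCalculator loop, valid for times >= 1.
    return float(a) - times * float(b)


input1, input2 = 1, 2
outputs = {}
outputs["Plus"] = plus(input1, input2)             # 1 + 2
outputs["Minus"] = minus(outputs["Plus"], input2)  # (1 + 2) - 2
print(outputs)
```

The intermediate value 3.0 produced by the plus step is what the minus step receives as its first input, which is the role the output of calculator_plus plays in the instrument.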