Overview
Work flow
atomicrex processes an XML file that describes the job to be performed. Generally speaking, an atomicrex job can be divided into two parts, the training phase and the output phase.
During the training phase, selected degrees of freedom (parameters) of the model (usually an interatomic potential) are varied such that the predicted properties (energies, forces, elastic constants, etc.) most closely match the target data. The training phase is optional and can be skipped by leaving out the <fitting> element in the job file.
The training phase is followed by the output phase. Here, additional properties can be calculated that were not included during training. This allows a convenient separation of the available data into a training set and a validation set. The latter set of properties enables one to assess the predictive capability of a model.
Key concepts
atomicrex operates with two principal types of entities: potentials and structures.
A potential consists of a parameter set and a routine that allows one to calculate the total energy and the forces for a given atomic structure.
An atomic structure consists of a simulation cell with or without periodic boundary conditions and a list of atoms. Potentials and structures each expose a set of degrees of freedom and a set of properties.
Degrees of freedom
A degree of freedom (DOF) describes an aspect of a potential or a structure that can be continuously varied. Note that a DOF of a potential is varied during the training process to minimize the objective function, while a DOF of an atomic structure is varied to relax the structure, i.e. to minimize its potential energy. There are different classes of DOFs. In the simplest case, a DOF is a scalar variable that describes a single parameter, e.g. the \(\sigma\) parameter of the Lennard-Jones potential or the lattice parameter of a cubic lattice. In addition, multi-dimensional DOFs exist that control more complex parameters, e.g. the atomic positions of a structure or the coefficients of a spline potential function.
Each type of potential or structure exposes a certain set of DOFs. The user has to specify an initial value in the job file for each of the DOFs. By default all DOFs are static, i.e. their value does not change during a job. By using the <fit-dof> element, however, the user can include selected DOFs of a potential in the training process as described in this section. Correspondingly, relaxation of DOFs of structures can be enabled by using the <relax> element in the job file as described in this section. If relaxation is enabled for a structural DOF, the value of the DOF will be varied such that the potential energy of the structure is minimized.
Properties
Properties are quantities that can be calculated from the DOFs of a structure or a potential, i.e. they are functionals of the DOFs [1]. Each type of structure or potential provides a certain set of properties. By default only very few of them are calculated, such as the potential energy and the equilibrium volume of a structure. As detailed in this section, the user can enable the calculation of additional properties, which should be included in the fit and/or in the final output, by using the <property> element. When including a property in the fit, one has to specify a target value for the property. The target value is a scalar in the case of simple properties, e.g., the energy of a structure, or it can be multi-dimensional data, e.g., the force vectors acting on the atoms of a structure. Each fit property contributes to the objective function, which is minimized during the training process, according to its weight as discussed below.
Note that properties are calculated after the DOFs of a structure have been relaxed. This implies that, if relaxation is enabled for certain DOFs, the training process consists of two nested minimization passes: Each time the objective function for the potential training is evaluated in the outer optimization loop, the energy of all structures is first minimized with respect to the structural DOFs. A third level of variation may exist if the relaxation of the atomic positions has been enabled. In that case, the atomic positions must be relaxed each time the total energy is calculated for a given trial configuration of the DOFs for which relaxation is enabled. Please keep this in mind when the computational time required by a certain run is unexpectedly large.
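The nesting described above can be illustrated with a toy problem. The following Python sketch is not atomicrex code: it assumes a Lennard-Jones dimer whose only structural DOF is the bond length and whose only fitted potential DOF is \(\sigma\), and the function names as well as the golden-section minimizer are illustrative choices. Each evaluation of the outer objective triggers a complete inner relaxation, which is why the computational cost multiplies.

```python
import math


def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal scalar function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # The minimum lies in [lo, d]; reuse c as the new upper probe.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            # The minimum lies in [c, hi]; reuse d as the new lower probe.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)


def lj_energy(r, sigma, epsilon=1.0):
    """Lennard-Jones pair energy at distance r."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


def relaxed_distance(sigma):
    """Inner loop: relax the structural DOF (bond length) for a given sigma."""
    return golden_section_minimize(
        lambda r: lj_energy(r, sigma), 0.9 * sigma, 3.0 * sigma
    )


def objective(sigma, target_distance):
    """Outer objective: squared deviation of the relaxed bond length."""
    return (relaxed_distance(sigma) - target_distance) ** 2


# Hypothetical target: the equilibrium distance that corresponds to sigma = 1.3.
target = 2.0 ** (1.0 / 6.0) * 1.3

# Outer loop: vary the potential DOF; every trial value relaxes the structure.
fitted_sigma = golden_section_minimize(lambda s: objective(s, target), 0.5, 2.0)
```

In this sketch every outer step costs one full inner minimization. Adding a third level, such as relaxing atomic positions for every trial lattice parameter, would multiply the cost once more.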
For some properties there is a one-to-one relationship between the
property and the corresponding DOF. For instance, the lattice
parameter of a structure determines the interatomic spacing. On the
other hand, one may want to fit the equilibrium lattice spacing to
an experimental value. Thus, the lattice structure exposes both a
lattice parameter DOF and a lattice parameter property. If the
property is included in the fit, then relaxation will be enabled for
the corresponding DOF even if no <relax> element has been specified.
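Objective function
The objective (or cost) function is the main quantity being computed by atomicrex. It can be written in the general form
\[\chi^2 = \sum_G \bar{w}_G \sum_S \bar{w}_{GS} \sum_P \bar{w}_{GSP} \, r_{GSP}\]
with
\[\bar{w}_G = \frac{w_G}{\sum_{G'} w_{G'}}, \qquad \bar{w}_{GS} = \frac{w_{GS}}{\sum_{S'} w_{GS'}}, \qquad \bar{w}_{GSP} = \frac{w_{GSP}}{\sum_{P'} w_{GSP'}}\]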
where \(w_G\), \(w_{GS}\), and \(w_{GSP}\) denote the relative weight factors that can be assigned by the user via the relative-weight attributes of the respective elements for structure groups, structures, and properties.
Note
A structure group can contain both structures and structure subgroups, which enables complex hierarchies. In such cases, when computing the weight factors, the subgroups of a group are treated as equivalent to the structures in the same group, i.e. the subgroup weights are included in the computation of the structure weight factors \(w_{GS}\). This is implemented recursively and ensures that all the normalized weights sum up to one (\(\sum_{GSP} \bar{w}_{GSP} = 1\), \(\sum_{GS} \bar{w}_{GS} = 1\), \(\sum_G \bar{w}_G = 1\)).
The residual \(r_{GSP}\) is calculated for each property from the predicted \(A^{\text{predicted}}\) and target values \(A^{\text{target}}\) according to the specified residual-style, which is set via the homonymous attribute of the property element. For example, the keyword squared implies
\[r_{GSP} = \left( \frac{A^{\text{predicted}} - A^{\text{target}}}{\delta_{GSP}} \right)^2\]
Here, \(\delta_{GSP}\) is the tolerance that has been specified for this property. Note that not all residual styles invoke the tolerance parameter.
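The interplay of weights and residuals can be sketched in a few lines of Python. This is an illustration and not the atomicrex implementation: it assumes that each normalized weight is the relative weight divided by the sum over its siblings, that the squared residual has the form \(((A^{\text{predicted}} - A^{\text{target}})/\delta)^2\), and that the hierarchy is flat (no subgroups). The dictionary layout and all numerical values are made up.

```python
def normalize(weights):
    """Divide each relative weight by the sum over its siblings."""
    total = sum(weights)
    return [w / total for w in weights]


def squared_residual(predicted, target, tolerance=1.0):
    """Unitless squared deviation, scaled by the tolerance."""
    return ((predicted - target) / tolerance) ** 2


def objective(groups):
    """Weighted sum of residuals over groups, structures, and properties."""
    total = 0.0
    group_weights = normalize([g["weight"] for g in groups])
    for w_g, group in zip(group_weights, groups):
        structures = group["structures"]
        structure_weights = normalize([s["weight"] for s in structures])
        for w_gs, structure in zip(structure_weights, structures):
            properties = structure["properties"]
            property_weights = normalize([p["weight"] for p in properties])
            for w_gsp, prop in zip(property_weights, properties):
                residual = squared_residual(
                    prop["predicted"], prop["target"], prop["tolerance"]
                )
                total += w_g * w_gs * w_gsp * residual
    return total


# Hypothetical fit database: one group with two structures.
groups = [
    {
        "weight": 1.0,
        "structures": [
            {
                "weight": 3.0,  # diamond counts three times as much as fcc
                "properties": [
                    {"weight": 3.0, "predicted": -4.50, "target": -4.63, "tolerance": 0.1},
                    {"weight": 1.0, "predicted": 104.0, "target": 99.0, "tolerance": 5.0},
                ],
            },
            {
                "weight": 1.0,
                "properties": [
                    {"weight": 1.0, "predicted": -4.0, "target": -4.1, "tolerance": 0.1},
                ],
            },
        ],
    }
]

value = objective(groups)
```

Because the weights are normalized level by level, changing a property weight inside one structure redistributes weight only among the properties of that structure and leaves the balance between structures unchanged.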
The rationale behind this nested weight model is as follows. When training interatomic potentials, certain structures are usually more important than others. For example, in the case of silicon the diamond structure should naturally be given a higher weight than, say, the face-centered cubic structure, whereas the opposite applies in the case of, e.g., aluminum or gold. To achieve this balance one can set the weights of individual structures (\(w_{GS}\)) as well as structure groups (\(w_G\)). Note that the structure and structure group weights are individually normalized (see equation above), such that the ratio between structures remains the same while changing the weights of individual properties.
Note
All user-controllable (relative) weight factors, i.e. \(w_G\), \(w_{GS}\), and \(w_{GSP}\), as well as the tolerance parameters \(\delta_{GSP}\), default to 1. If one prefers to work with a single-level weight model (one and only one weight factor per property), one should only work with the \(w_{GSP}\) values (i.e. the relative-weight attributes of the <property> elements) and leave all other weights (\(w_G\), \(w_{GS}\)) as well as the tolerance attribute (\(\delta_{GSP}\)) untouched.
Similarly, it is also desirable to use “intuitive” weights to express, for example, the fact that the cohesive energy of a certain structure is “three times more important” than the bulk modulus. These properties, however, have very different units and thus the differences between predicted and target values \(A^{\text{predicted}} - A^{\text{target}}\) can be of very different magnitudes. It is usually inconvenient to adjust the property weights manually to correct for this imbalance. One can partially remedy the situation by using the squared-relative residual style, which, however, imposes a normalization with regard to the absolute value of the property.
A more refined approach is to use the tolerance parameter \(\delta\), which enables one to specify the acceptable range for each property. It naturally carries the same unit as the property, such that all residuals are unitless. This makes the parameter easy to choose sensibly. For example, it is often reasonable to aim for a cohesive energy to agree within, say, \(\delta = 0.1\,\text{eV}\) with the target value, whereas for an elastic constant \(\delta = 5\,\text{GPa}\) could be acceptable.
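As a worked illustration, assume that the squared residual style takes the form \(r = ((A^{\text{predicted}} - A^{\text{target}})/\delta)^2\) and use the two tolerances quoted above. The deviations of 0.05 eV and 10 GPa are made-up numbers:
\[r_{\text{energy}} = \left( \frac{0.05\,\text{eV}}{0.1\,\text{eV}} \right)^2 = 0.25, \qquad r_{\text{elastic}} = \left( \frac{10\,\text{GPa}}{5\,\text{GPa}} \right)^2 = 4\]
Both residuals are unitless and directly comparable. The elastic constant, which misses its acceptable range by a factor of two, contributes 16 times more than the cohesive energy, which lies within its range.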
The specification of the tolerances via the tolerance attribute should be done when setting up the fit database, after which the \(\delta\) parameters can be left untouched. Subsequently, tuning the balance of the fit proceeds by adjusting the weights of individual structures, structure groups, and properties.
For convenience atomicrex provides presets for the tolerance parameters, which can be selected via the <tolerance-preset> tag in the XML input file. In this case, the tolerance parameter for a given property is assigned a value according to its unit as detailed below. These values can be individually overridden for any property by using the tolerance attribute.
The following presets are available:
uniform: \(\delta_i=1\) for all parameters
balanced:

=======  =========
Unit     Tolerance
=======  =========
eV       0.2
eV/atom  0.01
GPa      2.0
A        0.005
meV      10
meV/A    10
eV/A     0.01
=======  =========

accurate-energies:

=======  =========
Unit     Tolerance
=======  =========
eV       0.1
eV/atom  0.005
GPa      2.0
A        0.005
meV      5
meV/A    10
eV/A     0.01
=======  =========

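The lookup logic described above (preset value by unit, optionally overridden per property) can be sketched as follows. This is a hypothetical helper for illustration only and not part of atomicrex: the name `resolve_tolerance` and its signature are assumptions, while the numerical values are copied from the presets listed above.

```python
# Tolerance presets keyed by preset name, then by unit (values from the lists above).
TOLERANCE_PRESETS = {
    "balanced": {
        "eV": 0.2, "eV/atom": 0.01, "GPa": 2.0, "A": 0.005,
        "meV": 10.0, "meV/A": 10.0, "eV/A": 0.01,
    },
    "accurate-energies": {
        "eV": 0.1, "eV/atom": 0.005, "GPa": 2.0, "A": 0.005,
        "meV": 5.0, "meV/A": 10.0, "eV/A": 0.01,
    },
}


def resolve_tolerance(unit, preset="uniform", override=None):
    """Return the tolerance for a property with the given unit.

    An explicit per-property override takes precedence over the preset.
    The "uniform" preset assigns 1 regardless of the unit.
    """
    if override is not None:
        return override
    if preset == "uniform":
        return 1.0
    return TOLERANCE_PRESETS[preset][unit]
```

For instance, `resolve_tolerance("eV", "balanced")` yields the preset value for energies, whereas passing `override=0.05` replaces it for that one property.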
Footnotes
[1] Note that in the case of structural properties, the calculation of properties can be rather complex and usually involves the evaluation of the potential functions. In contrast, properties of potentials can be directly derived from the DOFs of the potential. This complexity is handled efficiently by the internal structure of atomicrex and usually does not affect the user.