Overview

Work flow

atomicrex processes an XML file that describes the job to be performed. Generally speaking, an atomicrex job can be divided into two parts, the training phase and the output phase.

During the training phase selected degrees of freedom (parameters) of the model (usually an interatomic potential) are varied such that the predicted properties (energies, forces, elastic constants etc.) most closely match the target data. The training phase is optional and can be skipped by leaving out the <fitting> element in the job file.

The training phase is followed by the output phase. Here, additional properties can be calculated that were not included during training. This allows a convenient separation of the available data into a training set and a validation set. The latter set of properties enables one to assess the predictive capability of a model.

Key concepts

atomicrex operates with two principal types of entities: potentials and structures.

  • A potential consists of a parameter set and a routine that allows one to calculate the total energy and the forces for a given atomic structure.
  • An atomic structure consists of a simulation cell with or without periodic boundary conditions and a list of atoms.

Potentials and structures each expose a set of degrees of freedom and a set of properties.

Degrees of freedom

A degree of freedom (DOF) describes an aspect of a potential or a structure that can be continuously varied. Note that a DOF of a potential is varied during the training process to minimize the objective function, while a DOF of an atomic structure is varied to relax the structure, i.e. to minimize its potential energy. There are different classes of DOFs. In the simplest case, a DOF is a scalar variable that describes a single parameter, e.g. the \(\sigma\) parameter of the Lennard-Jones potential or the lattice parameter of a cubic lattice. In addition, there are multi-dimensional DOFs that control more complex parameters, e.g. the atomic positions of a structure or the coefficients of a spline potential function.

Each type of potential or structure exposes a certain set of DOFs. The user has to specify an initial value in the job file for each of the DOFs. By default all DOFs are static, i.e. their value does not change during a job. By using the <fit-dof> element, however, the user can include selected DOFs of a potential in the training process as described in this section. Correspondingly, relaxation of DOFs of structures can be enabled by using the <relax> element in the job file as described in this section. If relaxation is enabled for a structural DOF, the value of the DOF will be varied such that the potential energy of the structure is minimized.

Properties

Properties are quantities that can be calculated from the DOFs of a structure or a potential, i.e. they are functionals of the DOFs [1]. Each type of structure or potential provides a certain set of properties. By default only very few of them are calculated, such as the potential energy and the equilibrium volume of a structure. As detailed in this section, the user can enable the calculation of additional properties, which should be included in the fit and/or in the final output, by using the <property> element. When including a property in the fit, one has to specify a target value for the property. The target value is a scalar in the case of simple properties, e.g., the energy of a structure, or it can be multi-dimensional data, e.g., the force vectors acting on the atoms of a structure. Each fit property contributes to the objective function, which is minimized during the training process, according to its weight as discussed below.

Note that properties are calculated after the DOFs of a structure have been relaxed. This implies that, if relaxation is enabled for certain DOFs, the training process consists of two nested minimization passes: Each time the objective function for the potential training is evaluated in the outer optimization loop, the energy of all structures is first minimized with respect to the structural DOFs. A third level of variation may exist if the relaxation of the atomic positions has been enabled. In that case, the atomic positions must be relaxed each time the total energy is calculated for a given trial configuration of the DOFs for which relaxation is enabled. Please keep this in mind when the computational time required by a certain run is unexpectedly large.
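The two nested minimization passes described above can be illustrated with a toy model. The following is a minimal sketch, not the atomicrex implementation: it assumes a Lennard-Jones dimer whose bond length is the only structural DOF and whose \(\sigma\) parameter is the only potential DOF, and it uses a simple golden-section search as a stand-in for whatever optimizers are actually employed. All function names are hypothetical.

```python
import math


def golden_section_minimize(f, lo, hi, tol=1e-9):
    """Minimize a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # The minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def lj_energy(r, sigma, epsilon=1.0):
    """Lennard-Jones energy of a dimer with bond length r."""
    x = (sigma / r) ** 6
    return 4.0 * epsilon * (x * x - x)


def relaxed_bond_length(sigma):
    """Inner loop: relax the structural DOF r for a fixed potential DOF."""
    return golden_section_minimize(
        lambda r: lj_energy(r, sigma), 0.8 * sigma, 3.0 * sigma
    )


def objective(sigma, target_bond_length):
    """Outer-loop objective: the property is evaluated after relaxation."""
    return (relaxed_bond_length(sigma) - target_bond_length) ** 2


# Hypothetical target: the equilibrium bond length obtained for sigma = 1.
target = 2.0 ** (1.0 / 6.0)

# Outer loop: every evaluation of the objective triggers a full inner relaxation.
fitted_sigma = golden_section_minimize(
    lambda s: objective(s, target), 0.5, 2.0, tol=1e-6
)
```

Each outer iteration costs a complete inner relaxation, which is why enabling relaxation can increase the run time considerably.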

For some properties there is a one-to-one relationship between the property and the corresponding DOF. For instance, the lattice parameter of a structure determines the interatomic spacing. On the other hand, one may want to fit the equilibrium lattice spacing to an experimental value. Thus, the lattice structure exposes both a lattice parameter DOF and a lattice parameter property. If the property is included in the fit, then relaxation will be enabled for the corresponding DOF even if no <relax> element has been specified.

Objective function

The objective (or cost) function is the main quantity being computed by atomicrex. It can be written in the general form

\[\chi^2 = \underbrace{ \sum_G \bar{w}_G \underbrace{ \sum_S \bar{w}_{GS} \underbrace{ \sum_P \bar{w}_{GSP} r_{GSP} }_\text{properties} }_\text{structures} }_\text{structure groups}\]

with

\[\begin{split}\bar{w}_{GSP} = w_{GSP} \Big/ \sum_{P'}^\text{properties} w_{GSP'} \\ \bar{w}_{GS} = w_{GS} \Big/ \sum_{S'}^\text{structures} w_{GS'} \\ \bar{w}_{G} = w_G \Big/ \sum_{G'}^\text{structure groups} w_{G'} ,\end{split}\]

where \(w_G\), \(w_{GS}\), and \(w_{GSP}\) denote the relative weight factors that can be assigned by the user via the relative-weight attributes of the respective elements for structure groups, structures, and properties.
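To make the normalization concrete, the sketch below evaluates \(\chi^2\) for a flat hierarchy of structure groups, structures, and properties. It is only an illustration under stated assumptions: the nested dictionary layout, the function names, and the numerical values are hypothetical, and the residuals \(r_{GSP}\) are taken as already computed.

```python
def normalize(weights):
    """Divide each relative weight by the sum over its siblings."""
    total = sum(weights.values())
    return {key: w / total for key, w in weights.items()}


def objective(groups):
    """Evaluate chi^2 for a group -> structure -> property hierarchy.

    `groups` maps a group name to (w_G, structures), `structures` maps a
    structure name to (w_GS, properties), and `properties` maps a property
    name to (w_GSP, r_GSP).
    """
    w_bar_g = normalize({g: w for g, (w, _) in groups.items()})
    chi2 = 0.0
    for g, (_, structures) in groups.items():
        w_bar_gs = normalize({s: w for s, (w, _) in structures.items()})
        for s, (_, properties) in structures.items():
            w_bar_gsp = normalize({p: w for p, (w, _) in properties.items()})
            for p, (_, residual) in properties.items():
                chi2 += w_bar_g[g] * w_bar_gs[s] * w_bar_gsp[p] * residual
    return chi2


# Hypothetical input: one group in which the diamond structure carries
# three times the relative weight of the fcc structure.
groups = {
    "bulk": (1.0, {
        "diamond": (3.0, {"energy": (1.0, 4.0), "lattice-parameter": (1.0, 2.0)}),
        "fcc": (1.0, {"energy": (1.0, 8.0)}),
    }),
}
chi2 = objective(groups)
```

In this example the normalized structure weights are 0.75 and 0.25, and the two diamond properties each receive a normalized weight of 0.5.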

Note

A structure group can contain both structures and structure subgroups, which enables complex hierarchies. In such cases, when computing the weight factors, the subgroups of a group are treated as equivalent to the structures in the same group, i.e. the subgroup weights are included in the computation of the structure weight factors \(w_{GS}\). This is implemented recursively and ensures that all the normalized weights sum up to one (\(\sum_{GSP} \bar{w}_{GSP} = 1\), \(\sum_{GS} \bar{w}_{GS} = 1\), \(\sum_G \bar{w}_G = 1\)).

The residual \(r_{GSP}\) is calculated for each property from the predicted value \(A^{\text{predicted}}\) and the target value \(A^{\text{target}}\) according to the specified residual style, which is set via the residual-style attribute of the property element. For example, the keyword squared implies

\[r_{GSP} = \left[ \left( A^{\text{predicted}}_i - A^{\text{target}}_i\right) \big/ \delta_{GSP} \right]^2.\]

Here, \(\delta_{GSP}\) is the tolerance that has been specified for this property. Note that not all residual styles invoke the tolerance parameter.
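As a minimal sketch of the squared residual style for a scalar property, the expression above can be written as a small function. The function name and signature are hypothetical and only mirror the formula. Multi-dimensional properties are not covered here.

```python
def squared_residual(predicted, target, tolerance=1.0):
    """Return [(A_predicted - A_target) / delta]^2 for a scalar property.

    The default tolerance of 1 matches the documented default of delta.
    """
    if tolerance <= 0.0:
        raise ValueError("tolerance must be positive")
    return ((predicted - target) / tolerance) ** 2


# Hypothetical cohesive energies in eV, compared with the default tolerance
# and with a tolerance of 0.1 eV.
r_default = squared_residual(-4.53, -4.63)
r_scaled = squared_residual(-4.53, -4.63, tolerance=0.1)
```

With the tighter tolerance the same deviation yields a residual that is larger by a factor of \(1/\delta^2 = 100\).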

The rationale behind this nested weight model is as follows. When training interatomic potentials, certain structures are usually more important than others. For example, in the case of silicon the diamond structure should naturally be given a higher weight than, say, the face-centered cubic structure, whereas the opposite applies in the case of, e.g., aluminum or gold. To achieve this balance, one can set the weights of individual structures (\(w_{GS}\)) as well as structure groups (\(w_G\)). Note that the structure and structure group weights are individually normalized (see equation above), such that the ratio between structures remains the same when the weights of individual properties are changed.

Note

All user-controllable (relative) weight factors, i.e. \(w_G\), \(w_{GS}\), and \(w_{GSP}\), as well as the tolerance parameters \(\delta_{GSP}\), default to 1. If one prefers to work with a single-level weight model (one and only one weight factor per property), one should only set the \(w_{GSP}\) values (i.e. the relative-weight attributes of the <property> elements) and leave all other weights (\(w_G\), \(w_{GS}\)) as well as the tolerance attribute (\(\delta_{GSP}\)) untouched.

Similarly, it is also desirable to use “intuitive” weights to express, for example, that the cohesive energy of a certain structure is “three times more important” than the bulk modulus. These properties, however, have very different units, and thus the differences between predicted and target values \(A^{\text{predicted}} - A^{\text{target}}\) can be of very different magnitudes. It is usually inconvenient to adjust the property weights manually to correct for this imbalance. One can partially remedy the situation by using the squared-relative residual style, which, however, imposes a normalization with regard to the absolute value of the property.

A more refined approach is to use the tolerance parameter \(\delta\), which specifies the acceptable range for each property. It carries the same unit as the property, so all residuals are unitless and its value can be chosen on physical grounds. For example, it is often reasonable to aim for a cohesive energy that agrees with the target value to within, say, \(\delta = 0.1\,\text{eV}\), whereas for an elastic constant \(\delta = 5\,\text{GPa}\) could be acceptable.
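As a worked illustration with hypothetical numbers, assume the squared residual style, a predicted cohesive energy that deviates from its target by 0.2 eV, and a predicted elastic constant that deviates from its target by 5 GPa. With the tolerances quoted above, the two residuals become

\[r_E = \left[ \frac{0.2\,\text{eV}}{0.1\,\text{eV}} \right]^2 = 4, \qquad r_C = \left[ \frac{5\,\text{GPa}}{5\,\text{GPa}} \right]^2 = 1.\]

Both residuals are unitless and of comparable magnitude, so the relative weights can express importance alone rather than also compensating for the choice of units.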

The tolerances should be specified via the tolerance attribute when setting up the fit database, after which the \(\delta\) parameters can be left untouched. Subsequently, the balance of the fit is tuned by adjusting the weights of individual structures, structure groups, and properties.

For convenience atomicrex provides presets for the tolerance parameters, which can be selected via the <tolerance-preset> tag in the XML input file. In this case, the tolerance parameter for a given property is assigned a value according to its unit as detailed below. These values can be individually overridden for any property by using the tolerance attribute.

The following presets are available:

  • uniform: \(\delta_i=1\) for all parameters

  • balanced:

    Unit      Tolerance
    eV        0.2
    eV/atom   0.01
    GPa       2.0
    A         0.005
    meV       10
    meV/A     10
    eV/A      0.01

  • accurate-energies:

    Unit      Tolerance
    eV        0.1
    eV/atom   0.005
    GPa       2.0
    A         0.005
    meV       5
    meV/A     10
    eV/A      0.01
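The way a preset and a per-property override interact can be sketched as a simple lookup. This is an illustration only: the dictionary and the function `resolve_tolerance` are hypothetical names, and falling back to 1 for units that a preset does not list is an assumption based on the documented default of \(\delta\).

```python
# Preset values copied from the tables above; keys are property units.
TOLERANCE_PRESETS = {
    "uniform": {},
    "balanced": {
        "eV": 0.2, "eV/atom": 0.01, "GPa": 2.0, "A": 0.005,
        "meV": 10.0, "meV/A": 10.0, "eV/A": 0.01,
    },
    "accurate-energies": {
        "eV": 0.1, "eV/atom": 0.005, "GPa": 2.0, "A": 0.005,
        "meV": 5.0, "meV/A": 10.0, "eV/A": 0.01,
    },
}


def resolve_tolerance(unit, preset="uniform", override=None):
    """Return the tolerance for a property with the given unit.

    An explicit per-property override takes precedence over the preset.
    Units not listed in the preset fall back to 1 (assumption).
    """
    if override is not None:
        return override
    return TOLERANCE_PRESETS[preset].get(unit, 1.0)


# A cohesive energy per atom under the "balanced" preset, and an elastic
# constant whose tolerance is overridden individually.
delta_energy = resolve_tolerance("eV/atom", preset="balanced")
delta_elastic = resolve_tolerance("GPa", preset="balanced", override=5.0)
```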

Footnotes

[1]Note that in the case of structural properties, the calculation of properties can be rather complex and usually involves the evaluation of the potential functions. In contrast, properties of potentials can be directly derived from the DOFs of the potential. This complexity is handled efficiently by the internal structure of atomicrex and usually does not affect the user.