Basis Set Input

Introduction

CRYSTAL performs ab initio calculations on periodic systems within the linear combination of atomic orbitals (LCAO) approximation. That is, the crystalline orbitals (CO) are treated as linear combinations of Bloch functions (BF), which are defined in terms of local functions, hereafter referred to as atomic orbitals (AO). Each local function is in turn expressed as a linear combination (contraction) of a certain number of Gaussian type functions (GTF). The GTFs of a contraction share the same centre A and have fixed coefficients, d, and exponents, alpha, defined in the input.

r	coordinate of an electron
g	direct lattice vector; the sum over g extends over all (infinitely many) vectors of the direct lattice
k	wave vector defining a point in reciprocal space
A	coordinate of an atom in the reference cell
a	variational coefficients multiplying the BFs; the sum over m is limited to the number of basis functions
d	coefficients of the primitive gaussians in the contraction, fixed for a given basis set; the sum over j is limited to the number of functions in the contraction
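The three expansions described above can be written out explicitly. The following is a reconstruction in standard LCAO notation, not a copy of the original formulas: the symbols psi, phi and G, the orbital index i, the contraction length n_G, and the use of mu for the index called m in the list above are all assumptions made for illustration.

```latex
\begin{align}
\psi_i(\mathbf{r};\mathbf{k}) &= \sum_{\mu} a_{\mu,i}(\mathbf{k})\,\phi_\mu(\mathbf{r};\mathbf{k}) \\
\phi_\mu(\mathbf{r};\mathbf{k}) &= \sum_{\mathbf{g}} \varphi_\mu(\mathbf{r}-\mathbf{A}_\mu-\mathbf{g})\, e^{i\,\mathbf{k}\cdot\mathbf{g}} \\
\varphi_\mu(\mathbf{r}-\mathbf{A}_\mu-\mathbf{g}) &= \sum_{j=1}^{n_G} d_j\, G(\alpha_j;\,\mathbf{r}-\mathbf{A}_\mu-\mathbf{g})
\end{align}
```

The first line is the CO as a combination of BFs, the second the BF as a lattice sum of AOs, and the third the AO as a contraction of GTFs.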

The AOs belonging to a given atom are grouped into shells.
A shell can contain either all AOs with the same quantum numbers n and l (for instance 3s, 2p, 3d shells), or all the AOs with the same principal quantum number n and different l (sp shells; the exponents of the s and p gaussians are the same).

A single, normalized, s-type GTF, the adjoined gaussian, is associated with each shell. The exponent of the adjoined gaussian is the smallest exponent of the gaussians in the contraction.
The adjoined gaussian is used to estimate the AO overlap and select the level of approximation to be adopted for the evaluation of the integrals.
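To make the role of the adjoined gaussian concrete, the sketch below picks the smallest exponent of a contraction and evaluates the analytic overlap of two normalized s-type gaussians at a few distances. It only illustrates the quantity being estimated; how CRYSTAL converts such overlaps into integral thresholds is not covered here, and the function names and the sample distances are hypothetical.

```python
import math


def adjoined_exponent(exponents):
    """Exponent of the adjoined gaussian: the smallest one in the contraction."""
    return min(exponents)


def s_overlap(alpha, beta, distance):
    """Overlap of two normalized s-type gaussians with exponents alpha and beta
    (bohr^-2) whose centres are `distance` bohr apart."""
    p = alpha + beta
    prefactor = (2.0 * math.sqrt(alpha * beta) / p) ** 1.5
    return prefactor * math.exp(-alpha * beta / p * distance ** 2)


# Exponents of the outermost sp shell of the Si STO-3G set printed in this chapter.
si_valence = [1.479, 0.4126, 0.1615]
alpha = adjoined_exponent(si_valence)

for r in (2.0, 4.0, 8.0, 12.0):  # illustrative distances in bohr
    print(f"R = {r:5.1f} bohr  overlap = {s_overlap(alpha, alpha, r):.3e}")
```

The overlap decays as a gaussian of the distance, so a smaller adjoined exponent keeps more pairs of shells above any given overlap threshold.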

This chapter briefly discusses the basis set input section. The basis set definition is the first step in uniquely defining the level of calculation. The molecular/crystalline basis set must be balanced, meaning that each centre must have the same variational freedom in describing the electrons formally attributed to it.
Basis sets of different quality on different atoms (for instance, minimal basis sets on some atoms and split valence + polarization on others) may give spurious effects, which are exploited during the SCF iterations and can prevent the solution from converging.

A few simple examples show how the basis set has to be specified in the CRYSTAL input.

Basis set input

The definition of the basis set is in the second input block. Basis set and initial electronic configuration must be given for each atom with a different conventional atomic number in the crystal structure input. CRYSTAL can use either general basis sets, including s, p, d functions or standard Pople basis sets (internally stored).

When the basis set input has been specified, several optional keywords can be used, related to modification of the electronic configuration, use of ghost functions, and printing options.

Input instructions

The basis set input format is strictly related to the mathematical definition of basis set given above.

For each atom type (as many blocks as there are different types of atoms in the crystal structure), the following must be specified:

the conventional atomic number and the number of shells, n_s, of the atomic basis set;

for each shell (n_s blocks of records): type of basis set (0-1-2), type of shell (0-1-2-3-4), number of primitive GTFs, n_g, shell electronic charge, scale factor;

for each primitive (n_g records, optional, basis set type 0 only): exponent, contraction coefficient, [second contraction coefficient, sp shells only].

The definition of atomic basis sets ends with the record:

99 0

that is, the conventional atomic number 99 with 0 shells. Optional basis set keywords may follow.
The "conventional atomic number" links the basis set to the atoms entered in the geometry input.

Basis set input is closed by the keyword END.
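The record structure described above can be sketched as a small reader. This is a minimal illustration in Python and not part of CRYSTAL: the function name parse_basis_block and the layout of the returned dictionary are hypothetical, and the sketch assumes free-format records, no optional keywords, and no trailing comments on primitive records.

```python
def parse_basis_block(text):
    """Parse atomic basis set records up to the terminating '99 0' record.

    Returns {NAT: [shell, ...]}; each shell holds the five fields of its
    shell record plus the list of primitives (empty for Pople types 1 and 2).
    """
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    pos = 0
    basis = {}
    while pos < len(lines):
        nat, n_shells = int(lines[pos][0]), int(lines[pos][1])
        pos += 1
        if nat == 99 and n_shells == 0:  # end of the atomic basis sets
            break
        shells = []
        for _ in range(n_shells):
            bs_type, shell_type, n_prim = (int(tok) for tok in lines[pos][:3])
            charge, scale = (float(tok) for tok in lines[pos][3:5])
            pos += 1
            primitives = []
            if bs_type == 0:  # only general basis sets list their primitives
                for _ in range(n_prim):
                    primitives.append(tuple(float(tok) for tok in lines[pos]))
                    pos += 1
            shells.append({"basis_type": bs_type, "shell_type": shell_type,
                           "n_primitives": n_prim, "charge": charge,
                           "scale": scale, "primitives": primitives})
        basis[nat] = shells
    return basis


# Si, Pople 6-21G with a user-defined outermost sp shell (taken from this chapter).
example = """\
14 4
2 0 6 2. 1.
2 1 6 8. 1.
2 1 2 4. 1.
0 1 1 0. 1.
0.1233392 1. 1.
99 0
"""

parsed = parse_basis_block(example)
for shell in parsed[14]:
    print(shell["shell_type"], shell["n_primitives"], shell["charge"], shell["primitives"])
```

Only the shell with basis set type 0 carries explicit primitives; for types 1 and 2 the exponents and coefficients come from the internally stored Pople sets.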

In CRYSTAL three basis set types are available:

· 0	general basis set: exponent and contraction coefficients defined in input;
· 1	Pople STO-nG type basis set;
· 2	Pople 3(6)-21G type basis set;

The shell types available correspond to:

shell type code	shell type	number of AOs	AO order	max shell charge
0	s	1	s	2
1	sp	4	s, x, y, z	8
2	p	3	x, y, z	6
3	d	5	2z²-x²-y², xz, yz, x²-y², xy	10
4	f	7	(2z²-3x²-3y²)z, (4z²-x²-y²)x, (4z²-x²-y²)y, (x²-y²)z, xyz, (x²-3y²)x, (3x²-y²)y	0 (polarization only)

d shells include 5 d orbitals, f shells include 7 orbitals.
For sp shells two contraction coefficients must be specified, for s and p AO, respectively.
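As a quick check of the table, the number of AOs of an atomic basis set follows directly from its shell type codes. The short Python sketch below (the dictionary and function names are hypothetical) counts the AOs of three of the silicon basis sets used as examples in this chapter.

```python
# Number of AOs per shell type code, as listed in the shell type table.
AOS_PER_SHELL = {0: 1, 1: 4, 2: 3, 3: 5, 4: 7}


def count_aos(shell_type_codes):
    """Total number of AOs for a list of shell type codes."""
    return sum(AOS_PER_SHELL[code] for code in shell_type_codes)


si_sto_3g = [0, 1, 1]            # s, sp, sp
si_3_21g = [0, 1, 1, 1]          # s, sp, sp, sp
si_3_21g_d = [0, 1, 1, 1, 3]     # s, sp, sp, sp, d

print(count_aos(si_sto_3g))      # 9
print(count_aos(si_3_21g))       # 13
print(count_aos(si_3_21g_d))     # 18
```

These totals agree with the AO ranges shown in the printed basis set output (for instance 14-18 for the d shell of the polarized set).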

Standard polarization functions can be added to 3(6)-21G basis sets of atoms up to Z=18, by inserting a record describing the polarization shell.

The formal shell electronic charge is the number of electrons attributed to each shell in the initial electronic configuration. The electronic configuration of the atoms is used only in the calculation of the atomic wave function (when the SCF guess is a superposition of atomic densities). It may correspond to a neutral atom or to an ion (for MgO, either Mg and O, or Mg²⁺ and O²⁻). In either case the cell must be neutral: the net charge in the cell must be zero.
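The neutrality condition can be checked by hand from the shell charges. The sketch below does this for MgO with all-electron basis sets; the helper net_charge is hypothetical, the ionic shell charges are those of the MgO slab input in this chapter, and the neutral-atom shell charges are an illustrative assumption. For ECP basis sets the nuclear charge Z would have to be replaced by the valence charge, which is not handled here.

```python
def net_charge(atoms):
    """Net charge of the cell for all-electron basis sets.

    atoms: list of (Z, shell_charges, number_of_atoms_in_cell).
    """
    return sum(count * (z - sum(shell_charges))
               for z, shell_charges, count in atoms)


# Mg2+ and O2-: shell charges 2, 8, 0 on both atoms.
mgo_ionic = [(12, [2.0, 8.0, 0.0], 1), (8, [2.0, 8.0, 0.0], 1)]
# Neutral Mg and O (assumed distribution over the same three shells).
mgo_neutral = [(12, [2.0, 8.0, 2.0], 1), (8, [2.0, 6.0, 0.0], 1)]
# Inconsistent choice: Mg2+ combined with neutral O leaves the cell charged.
mgo_unbalanced = [(12, [2.0, 8.0, 0.0], 1), (8, [2.0, 6.0, 0.0], 1)]

for label, cell in (("ionic", mgo_ionic), ("neutral", mgo_neutral),
                    ("unbalanced", mgo_unbalanced)):
    print(label, net_charge(cell))
```

The first two configurations give a net charge of zero; the third gives +2 and would not be an acceptable initial configuration.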

Each of the following examples of basis set input is followed by the corresponding printed output.

Si, Pople STO-3G minimal basis set. The value of the scale factor "0." means standard Pople scale factor.

14 3
1 0 3 2. 0.
1 1 3 8. 0.
1 1 3 4. 0.

*******************************************************************************
 ATOM X(AU) Y(AU) Z(AU) NO. TYPE EXPONENT S COEF P COEF D/F/G 
 *******************************************************************************
 1 SI 1.280 1.280 1.280
 1 S
 4.078E+02 1.543E-01 0.000E+00 0.000E+00
 7.428E+01 5.353E-01 0.000E+00 0.000E+00
 2.010E+01 4.446E-01 0.000E+00 0.000E+00
 2- 5 SP
 2.319E+01-9.997E-02 1.559E-01 0.000E+00
 5.390E+00 3.995E-01 6.077E-01 0.000E+00
 1.753E+00 7.001E-01 3.920E-01 0.000E+00
 6- 9 SP
 1.479E+00-2.196E-01 1.059E-02 0.000E+00
 4.126E-01 2.256E-01 5.952E-01 0.000E+00
 1.615E-01 9.004E-01 4.620E-01 0.000E+00
 *******************************************************************************
The adjoined gaussian exponent is the smallest exponent of each shell, that is, the last one listed for the shell in the printed output.

The standard Pople scale factor is used for STO-nG basis set when the input scale factor is 0.0. If the input scale factor is 1.0, then the same exponents are attributed to all the atoms of the same row.

Si, Pople 6-21G split valence basis set

14 4
2 0 6 2. 1. 
2 1 6 8. 1.
2 1 2 4. 1.
2 1 1 0. 1.

*******************************************************************************
 ATOM X(AU) Y(AU) Z(AU) NO. TYPE EXPONENT S COEF P COEF D/F/G
 *******************************************************************************
 1 SI 1.280 1.280 1.280
 1 S
 1.612E+04 1.959E-03 0.000E+00 0.000E+00
 2.426E+03 1.493E-02 0.000E+00 0.000E+00
 5.539E+02 7.285E-02 0.000E+00 0.000E+00
 1.563E+02 2.461E-01 0.000E+00 0.000E+00
 5.007E+01 4.859E-01 0.000E+00 0.000E+00
 1.702E+01 3.250E-01 0.000E+00 0.000E+00
 2- 5 SP
 2.927E+02-2.781E-03 4.438E-03 0.000E+00
 6.987E+01-3.571E-02 3.267E-02 0.000E+00
 2.234E+01-1.150E-01 1.347E-01 0.000E+00
 8.150E+00 9.356E-02 3.287E-01 0.000E+00
 3.135E+00 6.030E-01 4.496E-01 0.000E+00
 1.225E+00 4.190E-01 2.614E-01 0.000E+00
 6- 9 SP
 1.079E+00-3.761E-01 6.710E-02 0.000E+00
 3.024E-01 1.252E+00 9.569E-01 0.000E+00
 10- 13 SP
 9.334E-02 1.000E+00 1.000E+00 0.000E+00
 *******************************************************************************
Here too, the adjoined gaussian exponent is the last exponent listed for each shell.

Si, Pople 6-21G split valence basis set modified for crystalline calculation

14 4
2 0 6 2. 1.
2 1 6 8. 1.
2 1 2 4. 1.
0 1 1 0. 1.
0.1233392 1. 1.

*******************************************************************************
 ATOM X(AU) Y(AU) Z(AU) NO. TYPE EXPONENT S COEF P COEF D/F/G
 *******************************************************************************
 1 SI 1.280 1.280 1.280
 1 S
 1.612E+04 1.959E-03 0.000E+00 0.000E+00
 2.426E+03 1.493E-02 0.000E+00 0.000E+00
 5.539E+02 7.285E-02 0.000E+00 0.000E+00
 1.563E+02 2.461E-01 0.000E+00 0.000E+00
 5.007E+01 4.859E-01 0.000E+00 0.000E+00
 1.702E+01 3.250E-01 0.000E+00 0.000E+00
 2- 5 SP
 2.927E+02-2.781E-03 4.438E-03 0.000E+00
 6.987E+01-3.571E-02 3.267E-02 0.000E+00
 2.234E+01-1.150E-01 1.347E-01 0.000E+00
 8.150E+00 9.356E-02 3.287E-01 0.000E+00
 3.135E+00 6.030E-01 4.496E-01 0.000E+00
 1.225E+00 4.190E-01 2.614E-01 0.000E+00
 6- 9 SP
 1.079E+00-3.761E-01 6.710E-02 0.000E+00
 3.024E-01 1.252E+00 9.569E-01 0.000E+00
 10- 13 SP
 1.233E-01 1.000E+00 1.000E+00 0.000E+00
 *******************************************************************************

Si, Pople 3-21G split valence basis set

14 4
2 0 3 2. 1.
2 1 3 8. 1.
2 1 2 4. 1.
2 1 1 0. 1.

*******************************************************************************
 ATOM X(AU) Y(AU) Z(AU) NO. TYPE EXPONENT S COEF P COEF D/F/G
 *******************************************************************************
 1 SI 1.280 1.280 1.280
 1 S
 9.107E+02 6.608E-02 0.000E+00 0.000E+00
 1.373E+02 3.862E-01 0.000E+00 0.000E+00
 2.976E+01 6.724E-01 0.000E+00 0.000E+00
 2- 5 SP
 3.667E+01-1.045E-01 1.134E-01 0.000E+00
 8.317E+00 1.074E-01 4.576E-01 0.000E+00
 2.216E+00 9.514E-01 6.074E-01 0.000E+00
 6- 9 SP
 1.079E+00-3.761E-01 6.710E-02 0.000E+00
 3.024E-01 1.252E+00 9.569E-01 0.000E+00
 10- 13 SP
 9.334E-02 1.000E+00 1.000E+00 0.000E+00
 *******************************************************************************

Si, Pople 3-21G split valence + polarization basis set

14 5
2 0 3 2. 1.
2 1 3 8. 1.
2 1 2 4. 1.
2 1 1 0. 1.
2 3 1 0. 1.

*******************************************************************************
 ATOM X(AU) Y(AU) Z(AU) NO. TYPE EXPONENT S COEF P COEF D/F/G
 *******************************************************************************
 1 SI 1.280 1.280 1.280
 1 S
 9.107E+02 6.608E-02 0.000E+00 0.000E+00
 1.373E+02 3.862E-01 0.000E+00 0.000E+00
 2.976E+01 6.724E-01 0.000E+00 0.000E+00
 2- 5 SP
 3.667E+01-1.045E-01 1.134E-01 0.000E+00
 8.317E+00 1.074E-01 4.576E-01 0.000E+00
 2.216E+00 9.514E-01 6.074E-01 0.000E+00
 6- 9 SP
 1.079E+00-3.761E-01 6.710E-02 0.000E+00
 3.024E-01 1.252E+00 9.569E-01 0.000E+00
 10- 13 SP
 9.334E-02 1.000E+00 1.000E+00 0.000E+00
 14- 18 D
 4.500E-01 0.000E+00 0.000E+00 1.000E+00
 *****************************************************************************

Si, user defined basis set derived from Pople 6-21G

14 4
0 0 6 2. 1.
16115.9 0.00195948
2425.58 0.0149288
553.867 0.0728478
156.340 0.24613
50.0683 0.485914
17.0178 0.325002
0 1 6 8. 1.
292.718 -0.0027809 0.0044383
69.8731 -0.0357146 0.0326679
22.3363 -0.114985 0.134721
8.15039 0.0935634 0.328678
3.13458 0.603017 0.449640
1.22543 0.418959 0.261372
0 1 2 4. 1.
1.07913 -0.376108 0.067103
0.302422 1.25165 0.956883
0 1 1 0. 1.
0.123 1. 1.

 *******************************************************************************
 ATOM X(AU) Y(AU) Z(AU) NO. TYPE EXPONENT S COEF P COEF D/F/G
 *******************************************************************************
 1 SI 1.280 1.280 1.280
 1 S
 1.612E+04 1.959E-03 0.000E+00 0.000E+00
 2.426E+03 1.493E-02 0.000E+00 0.000E+00
 5.539E+02 7.285E-02 0.000E+00 0.000E+00
 1.563E+02 2.461E-01 0.000E+00 0.000E+00
 5.007E+01 4.859E-01 0.000E+00 0.000E+00
 1.702E+01 3.250E-01 0.000E+00 0.000E+00
 2- 5 SP
 2.927E+02-2.781E-03 4.438E-03 0.000E+00
 6.987E+01-3.571E-02 3.267E-02 0.000E+00
 2.234E+01-1.150E-01 1.347E-01 0.000E+00
 8.150E+00 9.356E-02 3.287E-01 0.000E+00
 3.135E+00 6.030E-01 4.496E-01 0.000E+00
 1.225E+00 4.190E-01 2.614E-01 0.000E+00
 6- 9 SP
 1.079E+00-3.761E-01 6.710E-02 0.000E+00
 3.024E-01 1.252E+00 9.569E-01 0.000E+00
 10- 13 SP
 1.230E-01 1.000E+00 1.000E+00 0.000E+00
 ******************************************************************************

Si basis set	number of AOs	most diffuse exponent	total energy, bulk silicon	SCF cycles	bielectronic integrals (millions)	CPU time, integrals	CPU time, total
sto-3g	9	0.1315	-5.713207531E+02	8	0.11	2.46	3.22
3-21	13	0.09334	-5.748914299E+02	9	0.86	26.38	28.70
3-21+d	18	0.09334	-5.749481876E+02	9	2.04	53.19	57.70
6-21	13	0.09334	-5.778250380E+02	10	0.92	30.04	32.75
6-21mod	13	0.1233	-5.778265558E+02	9	0.58	12.80	14.75
free.out	13	0.123	-5.778265813E+02	9	0.58	12.81	14.74

The table shows, for bulk silicon, how the exponent of the most diffuse gaussian (the one with the lowest exponent) influences the computational resources and the total energy when all other computational parameters are fixed. The crystalline basis functions are Bloch functions, built from localized functions, the AOs. Diffuse functions, which are very important for describing the tails of the electron density in molecular systems, are useless in an infinite system: they increase the computational cost without improving the wave function.
Crystalline atomic basis sets are built starting from an atomic basis set optimized for molecules and reoptimizing the exponent of the most diffuse GTF.
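A rough way to see why the most diffuse exponent matters for cost is to compare how far the corresponding gaussians extend. The sketch below computes the radius at which exp(-alpha r²) falls below a tolerance for the two outermost exponents in the table (6-21 and 6-21mod). The tolerance value and the function name are arbitrary assumptions for illustration; they are not CRYSTAL's actual truncation criteria.

```python
import math


def cutoff_radius(alpha, tolerance=1.0e-6):
    """Radius (bohr) beyond which exp(-alpha * r**2) drops below `tolerance`."""
    return math.sqrt(-math.log(tolerance) / alpha)


exponents = {"6-21": 0.09334, "6-21mod": 0.1233}
radii = {label: cutoff_radius(alpha) for label, alpha in exponents.items()}

for label, radius in radii.items():
    print(f"{label:8s} alpha = {exponents[label]:.5f}  cutoff radius = {radius:.2f} bohr")

# The volume spanned by the gaussian scales as radius**3, i.e. as alpha**(-3/2).
volume_ratio = (radii["6-21"] / radii["6-21mod"]) ** 3
print(f"volume ratio 6-21 / 6-21mod = {volume_ratio:.2f}")
```

A larger volume means more neighbouring cells whose functions overlap appreciably, which is consistent with the larger number of bielectronic integrals reported for the unmodified 6-21 set.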

The conventional atomic number, NAT, links the basis set with the atoms defined in the crystal structure.

NAT<200: all-electron BS	Given Z, NAT=Z, NAT'=Z+100
NAT>200: valence-electron BS	Given Z, NAT=Z+200, NAT'=Z+300. A core pseudopotential (ECP) must be defined

A maximum of two different basis sets may be given for the same chemical species in positions that are not symmetry related, using the conventional atomic numbers NAT and NAT'.
Atoms with equal conventional atomic number are associated with the same basis set.

The atomic number Z is given by the remainder of the division of the conventional atomic number by 100 (Example: NAT=108, Z=8, Oxygen, all electron; NAT=208, Z=8, Oxygen, ECP).

A conventional atomic number 0 defines ghost atoms, that is points in space with an associated basis set, but lacking a nuclear charge.
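The conventions for the conventional atomic number can be summarized in a few lines of Python. This is only a sketch of the rules stated above; the function name decode_nat and the returned dictionary are hypothetical.

```python
def decode_nat(nat):
    """Interpret a conventional atomic number according to the rules above."""
    return {
        "Z": nat % 100,        # atomic number: remainder of the division by 100
        "ecp": nat > 200,      # NAT > 200: valence-electron basis set with ECP
        "ghost": nat == 0,     # NAT = 0: basis functions without a nuclear charge
    }


for nat in (8, 108, 208, 308, 0):
    print(nat, decode_nat(nat))
```

Both 8 and 108 decode to all-electron oxygen, and both 208 and 308 to oxygen with an ECP, which is what allows two different basis sets for the same chemical species.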

The following example shows the use of different conventional atomic numbers for the same chemical species in non-equivalent positions.

In this example (test 35 of the CRYSTAL test cases), a three-layer slab model of the MgO(001) surface is created (SLABCUT) and a CO molecule is added (ATOMINSE) above the surface to simulate an adsorption process. Two different atomic basis sets are used for the oxygen atom: in MgO the oxygen (NAT=8, Z=8) basis set is optimized for O²⁻, whereas in the CO molecule the oxygen (NAT=108, Z=8) basis set is a standard molecular one.

TEST35 - MGO SLAB (001), 3 LAYER + CO ADSORPTION
CRYSTAL
0 0 0
 225
4.21
2
 12 0. 0. 0.
 8  0.5 0.5 0.5
SLABCUT
0 0 1
1 3
BREAKSYM
ATOMINSE CO molecule added
2
6 1.488 -1.488 4.605
108 1.488 -1.488 5.729
END
12 3
 0 0 8 2. 1.
 68371.875 0.0002226
 9699.34009 0.0018982
 2041.176786 0.0110451
 529.862906 0.0500627
 159.186000 0.169123
 54.6848 0.367031
 21.2357 0.400410
 8.74604 0.14987
0 1 6 8. 1.
156.795 -0.00624 0.00772
31.0339 -0.07882 0.06427
9.6453 -0.07992 0.2104
3.7109 0.29063 0.34314
1.61164 0.57164 0.3735
0.64294 0.30664 0.23286
0 1 1 0. 1.
0.4 1. 1.
8 3 user defined basis set
0 0 8 2. 1.
4000. 0.00144
1355.58 0.00764
248.545 0.05370
69.5339 0.16818
23.8868 0.36039
9.27593 0.38612
3.82034 0.14712
1.23514 0.07105
0 1 5 8. 1.
52.1878 -0.00873 0.00922
10.3293 -0.08979 0.07068
3.21034 -0.04079 0.20433
1.23514 0.37666 0.34958
0.536420 0.42248 0.27774
0 1 1 0. 1.
0.210000 1. 1.
6 3 Pople 6-21G basis set
2 0 6 2. 0.
2 1 2 4. 0.
2 1 1 0. 0.
108 3
2 0 6 2. 0.
2 1 2 6. 0.
2 1 1 0. 0.
99 0
END
END
8 0 8
FMIXING
30
END

Exercise: define a STO-3G basis set for Oxygen.
Exercise: define a STO-3G basis set for Magnesium.
Exercise: define a STO-3G basis set for MgO.
Exercise: define an extended basis set for MgO referring to the CRYSTAL basis set library.

The choice of the basis set is a fundamental step in defining the level of calculation and its accuracy. This is of particular importance when treating periodic systems where a large variety of chemical bonding can be found.

Molecular basis sets can be used in periodic calculations but their adequacy must be carefully checked. In particular the role of diffuse functions in crystalline systems must be assessed.
Very diffuse functions can cause numerical instabilities and carry the risk of linear dependence catastrophes. Furthermore, because the infinite sums are truncated according to criteria based on the overlap, the number of integrals to be calculated increases very rapidly as the exponents of the primitive gaussians decrease.
Very extended basis sets are not needed in periodic calculations, because the complete basis set limit is reached more quickly than in molecular calculations, and they increase the risk of linear dependence problems.
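The link between diffuse functions and linear dependence can be illustrated with the simplest possible model: two identical normalized s gaussians on neighbouring centres. Their 2x2 overlap matrix has eigenvalues 1 + S and 1 - S, so the smallest eigenvalue approaches zero as the overlap S approaches one. The sketch below is only this toy model, with an assumed interatomic distance and hypothetical exponents; it is not the analysis CRYSTAL performs.

```python
import math


def smallest_overlap_eigenvalue(alpha, distance):
    """Smallest eigenvalue of the overlap matrix [[1, S], [S, 1]] of two
    identical normalized s gaussians (exponent alpha, bohr^-2) on centres
    `distance` bohr apart."""
    overlap = math.exp(-0.5 * alpha * distance ** 2)  # S for equal exponents
    return 1.0 - overlap


distance = 4.44  # bohr; roughly a Si-Si bond length, used only as an illustration

for alpha in (0.5, 0.12, 0.05, 0.01):  # hypothetical exponents, increasingly diffuse
    eig = smallest_overlap_eigenvalue(alpha, distance)
    print(f"alpha = {alpha:5.2f}  smallest eigenvalue = {eig:.4f}")
```

As the exponent decreases, the two functions become nearly indistinguishable and the overlap matrix nearly singular, which is the origin of the numerical instabilities mentioned above.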

The choice of the basis set is a compromise between accuracy and cost. Nevertheless, we think that accuracy must be the main goal of ab initio calculations, so good quality basis sets should always be used, despite their computational cost, to avoid producing meaningless numbers.