Nonlinear gyrokinetic equation

Youjun Hu1

Institute of Plasma Physics, Chinese Academy of Sciences
Email: yjhu@ipp.cas.cn

Abstract. The nonlinear δf gyrokinetic equation in Frieman-Chen’s paper[3] is re-derived in this note, giving more details. Numerical implementation of the gyrokinetic model using the PIC method is also discussed.

1 Introduction
2 Transform Vlasov equation from particle coordinates to guiding-center coordinates
3 δf form of Vlasov equation in guiding-center variables
4 Gyrokinetic equation suitable for numerical simulation
5 Parallel Ampere’s Law
6 Poisson’s equation and polarization density
7 Polarization density expressed as local Fourier expansion
8 Polarization density matrix obtained by numerically integrating in phase space using grid
A Adiabatic electron response
B Characteristic curves of Frieman-Chen nonlinear gyrokinetic equation
C From (δΦ,δA) to (δE,δB)
D Split-weight scheme for electrons in GEM code
E Coordinate system and grid in TEK code
F Implementation of gyrokinetics in particle-in-cell (PIC) codes
G Diamagnetic ﬂow **check**
H Transform gyrokinetic equation from (X,μ,𝜀,α) to (X,μ,v_∥,α) coordinates
I Drift-kinetic limit
J Derivation of Eq. (123), not ﬁnished
K Modern view of gyrokinetic equation
L About this document
References

1 Introduction

1.1 Gyrokinetic?

Electromagnetic perturbations of frequency lower than the ion cyclotron frequency are widely believed to be more important than high-frequency ones in transporting plasma in tokamaks. (This assumption can be veriﬁed numerically when we are able to do a full simulation including both low-frequency and high-frequency perturbations. This kind of veriﬁcation is not possible at present due to computation costs.)

If we assume that only low-frequency perturbations are present, the Vlasov equation can be simpliﬁed. Speciﬁcally, symmetry of the particle distribution function in the phase space can be established if we choose suitable phase-space coordinates and split the distribution function in a proper way. The symmetry is along the so-called gyro-angle α in the guiding-center coordinates (X,μ,𝜀,α). In obtaining the equation for the gyro-angle independent part of the distribution function, we need to average the equation over the gyro-angle α and thus this model is called “gyrokinetic”.

In deriving the gyrokinetic equation, the perturbed electromagnetic ﬁeld is assumed to be known and of low-frequency. To do a kinetic simulation, we need to solve the ﬁeld equation to obtain the perturbed electromagnetic ﬁeld. It is still possible that high frequency modes (e.g., compressional Alfven waves and Ω_H modes) appear in a gyrokinetic simulation. If the amplitude of high frequency modes is large, then the simulation is invalid since the gyrokinetic model is invalid in this case.

Our starting point is the Vlasov equation in terms of particle coordinates (x,v):

∂f-+ v ⋅ ∂f-+ q-(E+ v × B) ⋅ ∂f-= 0, ∂t ∂x m ∂v

(1)

where f = f(t,x,v) is the particle distribution function, x and v are the location and velocity of particles. The distribution function f depends on 6 phase-space variables (x,v), besides the time t.

1.2 Guiding-center coordinates: a simple example

Choose Cartesian coordinates (x,y,z) for the conﬁguration space x. Consider a simple case where the electromagnetic ﬁeld is a time-independent ﬁeld given by B = B₀(x,y,z) and E = 0. Let us examine the Vlasov equation in this case and see whether there is any coordinate system that can simplify the Vlasov equation.

Describe the velocity space using a right-handed cylindrical coordinates (v_⊥,α,v_∥), where v_∥ = v ⋅ e_∥, e_∥ = e_z is the unit vector along the magnetic ﬁeld, α is the azimuthal angle of the perpendicular velocity (v_⊥ = v − v_∥e_z) relative to e_x. Note that these coordinates are deﬁned relative to the local magnetic ﬁeld, which, in more general cases, may vary in space.

Next, let us express the spatial gradient of f in terms of partial derivatives:

∂f|| ∂f|| ∂f|| ∂f-|| ∂x|| = ∂x|| ex + ∂y|| ey + ∂z || ez. (2) v v v v

Note that this gradient is taken by holding v constant (i.e., (v_x,v_y,v_z) constant rather than (v_⊥,α,v_∥) constant). So, we need do the following coordinate transform:

( ′ ′ ′ ∘ -2---2- ) (x,y,z,vx,vy,vz) ⇒ x = x,y = y,z = z,v⊥ = vx + vy,α = atan 2(vy,vx),v∥ = vz

(3)

Using chain rule, expression (2) is written as

| | | | ∂f-|| = ∂f-∂x′ + ∂f--∂v⊥-|| + ∂f-∂α-|| + ∂f- ∂v∥|| . (4) ∂x |v ∂x′∂x ∂v⊥ ∂x |v ∂α ∂x |v ∂v∥ ∂x |v ∂f ≈ ∂x′. (5)

Note that the partial derivatives

|| || ∂v || ∂v⊥-|| , ∂α-||,--∥|| ∂x v ∂x v ∂x v

(6)

are generally nonzero since the deﬁntion of (v_⊥,α,v_∥) depends on the local magnetic ﬁeld direction. In our simple case, they are zero since the magnetic ﬁeld direction is uniform. Similarly, we obtain

∂f|| ∂f ∂y|| ≈ ∂y′ v

(7)

∂f|| ∂f --|| = --′ ∂z v ∂z

(8)

Then, in (x′,y′,z′,v_⊥,α,v_∥) coordinates, the spatial gradient of f is written as

∂f || ∂f ∂f ∂f ---|| ≈ --′ex + --′ey +--′ez. ∂x v ∂x ∂y ∂z

(9)

Next, consider the gradient in velocity space

∂f- ∂f- -∂f ∂f- ∂v = ∂vxex + ∂vyey + ∂vzez ∂f ∂f 1 ∂f = ---e1 +--- ez +-----eα, (10) ∂v⊥ ∂v ∥ v⊥ ∂α

where e₁ = v_⊥∕|v_⊥|, v_⊥ = v − (v ⋅e_z)e_z, and e_α = e_z ×e₁. Using Eq. (10), -q
m

(v ×B) ⋅

is written as

q ∂f q ( ∂f ∂f 1 ∂f ) -(v × B)⋅ ---= --(v⊥e1 × B )⋅----e1 +---ez + -----eα m ∂v m (∂v⊥ ∂)v∥ v⊥ ∂α = q-(v e × B )⋅ -1-∂f-e m ⊥ 1 v⊥ ∂α α Bq-∂f- = m ∂α(e1 × ez)⋅eα ( ∂f) = − Ω ∂α- , (11)

where Ω = Bq∕m is the gyro-frequency. Then the Vlasov equation (1) is written as

∂f ∂f ∂f -∂t + v ⋅∂x-− Ω ∂α-= 0,

(12)

i.e.,

∂f ∂f ∂f ∂f ∂f --+ v∥--′ + v⊥ cosα---′ + v⊥sin α-′ − Ω---= 0. ∂t ∂z ∂x ∂y ∂α

(13)

Deﬁne the following coordinates transform (guiding-center transform):

( || X = x′ + v⊥-siΩnα, |||| Y = y′ − v⊥cΩosα, ||{ Z = z′, α′ = α, |||| v′∥ = v∥, |||| v′⊥ = v⊥, ( t′ = t

(14)

Next, we express the partial derivatives of f appearing in Eq. (13) in terms of the guiding-center coordinates. Using the chain rule, we obtain

′ -∂f = ∂f-∂X- + ∂f-∂Y-+ ∂f-∂Z- + ∂f-∂α′ + ∂f-∂v∥ + ∂f-∂v-′⊥+ ∂f∂t′ ∂ α ∂X ∂α ∂Y ∂α ∂Z ∂ α ∂α′∂α ∂v′∥∂α ∂v′⊥ ∂α ∂t′∂ α ∂f ∂X ∂f ∂Y ∂f = ∂X-∂α- + ∂Y-∂α-+ ∂α-′ ∂f v cosα ∂f v sin α ∂f = --′-⊥-----+ ---′⊥------+ --′. (15) ∂x Ω ∂y Ω ∂α

Similarly, for other derivatives, we obtain

∂f-= ∂f, ∂t ∂t′

(16)

and

∂f ∂f ∂X ∂f ∂Y ∂f ∂Z ∂f ∂α′ ∂f ∂v′∥ ∂f ∂v′ ∂f ∂t′ ∂z′ = ∂X-∂z′ + ∂Y-∂z′ + ∂Z-∂z′ + ∂α′∂z′ + ∂v′∂z′ + ∂v′-∂z⊥′-+ ∂t′∂z-′ ∥ ⊥ ≈ ∂f-× 0 + ∂f-× 0+ -∂f, (17) ∂X ∂Y ∂Z

where we have assumed that the dependence of X and Y on z′ (via Ω) is weak enough to be neglected. And also

∂f- ∂f-∂X- ∂f-∂Y- ∂f-∂Z- ∂f-∂α′ ∂f-∂v′∥ ∂f--∂v′⊥- ∂f-∂t′ ∂x′ = ∂X ∂x′ + ∂Y ∂x′ + ∂Z ∂x′ + ∂α′∂x′ + ∂v′∂x′ + ∂v′ ∂x′ + ∂t′∂x′ ∥ ⊥ = ∂f-∂X′ + ∂f-∂Y′ ∂X ∂x ∂Y ∂x ≈ ∂f-× 1+ ∂f-× 0, (18) ∂X ∂Y

∂f ∂f ∂y-′ ≈ ∂Y

(19)

Using the above results, equation (13) in the guiding-center coordinates is written as

( ) ∂f-+ v′∥ ∂f-+ v′⊥cosα′ ∂f +v′⊥ sin α′ ∂f-− Ω -∂f v⊥cosα-+ ∂f-v⊥-sinα-+ -∂f = 0. ∂t′ ∂Z ∂X ∂Y ∂X Ω ∂Y Ω ∂α ′

(20)

Here we ﬁnd some terms cancel each other, giving

∂f ∂f ( ∂f) ∂t′-+ v′∥∂Z-− Ω ∂α-′ = 0.

(21)

Equation (21) is the equation for f in the guiding-center coordinates (X,Y,Z,v_⊥′,α′,v_∥′).

Gyro-averaging

Deﬁne gyro-phase averaging operator

−1∫ 2π ′ ⟨...⟩ ≡ (2π) (...)dα . 0

(22)

Then averaging Eq. (21) over α′, we get

∂⟨f⟩ ∂⟨f⟩ ∫ 2π ( ∂f ) --′-+ v′∥---- − (2π)−1 Ω --′ dα′ = 0. ∂t ∂Z 0 ∂α

(23)

As is assumed above, Ω is approximately independent of α′ and thus Ω in Eq. (23) can be moved outside of the gyro-angle integration, giving

∫ 2π( ) ∂⟨f⟩′ + v′∥∂⟨f⟩ − Ω(2π)−1 ∂f′ dα′ = 0. ∂t ∂Z 0 ∂α

(24)

Peforming the integration, we get

∂⟨f⟩ ′∂⟨f⟩ − 1 ′ ′ ∂t′ + v∥ ∂Z − Ω (2π) [f(α = 2π)− f (α = 0)] = 0.

(25)

Since α′ = 2π and α′ = 0 correspond to the same phase space location, the corresponding values of the distribution function must be equal. Then the above equation reduces to

∂⟨f-⟩ ′∂⟨f⟩ ∂t′ + v∥∂Z = 0,

(26)

which is an equation for the gyro-angle independent part of the distribution function.

Next let us investigate whether the guiding-center transform can be made use of to simplify the kinetic equation for more general cases where we have a (weakly) non-uniform static magnetic ﬁeld, plus electromagnetic perturbations of low frequency (and of small amplitude and k_∥ρ_i ≪ 1). And we will include the eﬀect that B₀ depends on α′, i.e., grad-B and curvature drift.

2 Transform Vlasov equation from particle coordinates to guiding-center coordinates

Next, we deﬁne the guiding-center transform and then transform the Vlasov equation from the particle coordinates (x,v) to the guiding-center coordinates, i.e., express the gradient operators ∂∕∂x and ∂∕∂v in terms of the guiding-center coordinates.

2.1 Guiding-center transformation

In a magnetic ﬁeld, given a particle location and velocity (x,v), we know how to calculate its guiding-center location X, i.e.,

e∥(x)- X = x + v × Ω(x),

(27)

where e_∥ = B₀∕B₀, Ω = qB₀∕m, B₀ = B₀(x) is the equilibrium (macroscopic) magnetic ﬁeld at the particle position. We will consider Eq. (27) as a transform and call it guiding-center transform[1]. Note that the transform (27) involves both position and velocity of particles.

Given (x,v), it is straightforward to obtain X by using Eq. (27). On the other hand, the inverse transform (i.e., given (X,v), to ﬁnd x) is in principle not easy because Ω and e_∥ depend on x, which usually requires solving a nonlinear equation for its root. Numerically, one can use

e (xn) xn+1 = X − v× -∥---, Ω(xn)

(28)

as an iteration scheme to compute x, with the initial guess chosen as x₀ = X. If we stop at the ﬁrst iteration, then

e∥(X ) x ≈ X − v× -Ω(X-).

(29)

In gyrokinetic PIC simulation, Eq. (29) is used as the inverse guiding-center transformation. This transformation needs to be performed numerically when we deposit markers to grids, or when we calculate the gyro-averaged ﬁeld to be used in pushing guiding-centers.

[The equilibrium magnetic ﬁeld we will consider has spatial scale length much larger than the thermal gyro radius ρ. In this case the diﬀerence between the values of e_∥(x)∕Ω(x) and e_∥(X)∕Ω(X) is negligible. The diﬀerence between equilibrium ﬁeld values evaluated at X and x is usually neglected in gyrokinetic theory (except in deriving the gradient/curvature drift). Therefore it does not matter whether the above e_∥∕Ω is evaluated at x or X. What matters is where the perturbed ﬁelds are evaluated: at x or at X. The values of perturbed ﬁelds at x or at X are diﬀerent and this is called the ﬁnite Larmor radius (FLR) eﬀect.]

For later use, deﬁne ρ ≡−v × e_∥∕Ω, which is the vector gyro-radius pointing from the the guiding-center to the particle position.

2.2 Choosing velocity coordinates

The guiding-center transformations (27) and (29) involve the particle velocity v. It is the cross product between v and e_∥(x) or e_∥(X) that is actually used. Therefore, only the perpendicular velocity (which is deﬁned by v_⊥ = v − v ⋅ e_∥) enters the transform. A natural choice of coordinates for the perpendicular velocity is (v_⊥,α), where v_⊥ = |v_⊥| and α is the azimuthal angle of the perpendicular velocity in the local perpendicular plane.

The parallel direction is fully determined by B₀(x), but there are degrees of freedom in choosing one of the two perpendicular basis vectors. In order to make the azimuthal angle α fully deﬁned, we need to choose a way to deﬁne one of the two perpendicular directions. In GEM simulations, one of the perpendicular direction is chosen as the direction perpendicular to the magnetic surface, which is fully determined at each spatial point. (We need to deﬁne the perpendicular direction at each spatial location to make ∂α∕∂x|_v deﬁned, which is needed in the Vlasov diﬀerential operators. However, it seems that terms related to ∂α∕∂x|_v are ﬁnally dropped due to that they are of higher order**check.)

In the following, α will be called the “gyro-angle” . [Note that, in the guiding-center coordinates (X,v_∥,v_⊥,α), α is a velocity coordinate rather than a spatial coordinate. When transformed back to particle coordinates, α aﬀects both the velocity coordinate and the spatial coordinate. Consider a series of points in terms of guiding-center coordinates (X,v_∥,v_⊥,α) with (X,v_∥,v_⊥) ﬁxed but with α changing. Using the inverse guiding-center transform (29), we know that the above points form a gyro-ring in space, i.e., α inﬂuences spatial location, in addtion to velocity.]

The gyro-angle is an important variable we will stick to because we need to directly perform averaging over this variable (with X ﬁxed) in deriving the gyrokinetic equation. We have multiple options for the remaining velocity coordinates , such as (v,v_∥), or (v,v_⊥), or (v_∥,v_⊥). In Frieman-Chen’s paper, the velocity coordinates other than α are chosen to be (𝜀,μ) deﬁned by

2 𝜀 = 𝜀(v,x) ≡ v-+ qΦ0-(x), 2 m

(30)

and

v2 μ = μ(v⊥,x) ≡ 2B-⊥(x), 0

(31)

where Φ₀(x) is the equilibrium (macroscopic) electrical potential. Choosing μ as one of the phase space coordinates is nontrivial because it turns out a constant of motion. And this choice seems to be important in sucessfully getting the ﬁnal gyrokinetic equation (I need to check this).

Note that (𝜀,μ,α) is not suﬃcient in uniquely determining a velocity vector. An additional parameter σ = sign(v_∥) is needed to determine the sign of v_∥ = v ⋅e_∥. In the following, the dependence of the distribution function on σ is often not explicitly shown in the variable list (i.e., σ is hidden/suppressed), which, however, does not mean that the distribution function is independent of σ.

Another frequently used velocity coordinates are (μ,v_∥,α). In the following, I will derive the gyrokinetic equation in (𝜀,μ,α) coordinates. After that, I transform it to (μ,v_∥,α) coordinates.

One important thing to note about the above velocity coordinates is that they are deﬁned relative to the local magnetic ﬁeld. If the ﬁeld itself is spatially varying (such as in tokamaks), the above velocity coordinates are also spatially varying for a ﬁxed velocity v. Speciﬁcally, the following derivatives are nonzero:

∂α- ∂v⊥- ∂v∥ ∂μ- ∂x|v, ∂x |v, ∂x |v,∂x |v.

(32)

2.3 Summary of the phase-space coordinate transform

The transform from particle variables (x,v) to guiding-center variables (X,𝜀,μ,α,σ) is given by

( e∥(x) |||| X = x + v× Ω(x)- ||{ 𝜀 = |v|22+ qΦ0m(x) μ = |v−-v⋅e∥(x)|2 |||| 2B0(x) ||( α = (anglebetweenv⊥ and locale⊥) σ = sign(v⋅e∥(x))

(33)

As mentioned above, the dependence of the distribution function on σ will not be explicitly indicated in the following.

2.4 Distribution function in terms of guiding-center variables

Denote the particle distribution function expressed in particle coordinates (x,v) by f_p, and the same distribution expressed in the guiding-center variables (X,𝜀,μ,α) by f_g. Then

f (X, 𝜀,μ,α) = f (x,v), g p

(34)

where (x,v) and (X,𝜀,μ,α) are related to each other by the guiding-center transform (29). Equation (34) along the guiding-center transform can be considered as the deﬁnition of f_g.

As is conventionally adopted in multi-variables calculus, both f_p and f_g are often denoted by the same symbol, say f. Which set of independent variables are actually assumed is inferred from the context. This is one subtle thing for gyrokinetic theory in particular and for multi-variables calculus in general. (Sometimes, it may be better to use subscript notation on f to identify which coordinates are assumed. One example where this distinguishing is important is encountered when we try to express the diamagnetic ﬂow in terms of f_g, which is discussed in Appendix G.)

In practice, f_g is often called the guiding-center distribution function whereas f_p is called the particle distribution function. However, they are actually the same distribution function expressed in diﬀerent variables. The name “guiding-center distribution function” is misleading because it may imply that we can count the number of guiding-centers to obtain this distribution function but this implication is wrong.

2.5 Spatial gradient operator in guiding-center coordinates

Using the chain-rule, the spatial gradient ∂f_p∕∂x is written

∂fp ∂X ∂fg ∂𝜀 ∂fg ∂ μ∂fg ∂α ∂fg ∂x-|v = ∂x-⋅∂X- + ∂x-∂𝜀-+ ∂x-∂-μ + ∂x-∂α-.

(35)

From the deﬁnition of X, Eq. (27), we obtain

( ) ∂X- = I+ v × ∂-- e∥ , ∂x ∂x Ω

(36)

where I is the unit dyad. From the deﬁnition of 𝜀, we obtain

∂𝜀 q ∂x-= − m-E0,

(37)

where E₀ = −∂Φ₀∕∂x. Using the above results, equation (35) is written as

∂f ∂f [ ∂ (e ) ] ∂f q ∂f ∂μ ∂f ∂α∂f --p |v = --g+ v × --- -∥ ⋅--g − --E0--g + -----g+ ----g . ∂x ∂X ∂x Ω ∂X m ∂𝜀 ∂x ∂μ ∂x ∂α

(38)

As mentioned above, the partial derivative ∂∕∂x is taken by holding v constant. Since B₀ is spatially varying, v_⊥ is spatially varying when holding v constant. Therefore ∂μ
∂x- and ∂α
∂x are generally nonzero. The explicit expressions of these two derivatives are needed later in the derivation of the gyrokinetic equation and is discussed in Appendix J. For notation ease, deﬁne

[ (e ) ] λB1 = v × -∂- -∥ ⋅-∂-, ∂x Ω ∂X

(39)

and

∂μ ∂ ∂α ∂ λB2 = -----+ -----, ∂x ∂μ ∂x∂ α

(40)

then expression (38) is written as

∂fp|v = ∂fg + [λB1 + λB2]fg −-qE0 ∂fg. ∂x ∂X m ∂𝜀

(41)

Note that ∂∕∂X is a shorthand for

|_𝜀,μ,α

i.e., it is taken by holding (𝜀,μ,α) constant (rather than holding v constant). For notation ease, ∂∕∂X is sometimes denoted by ∇_X or simply ∇.

2.6 Velocity gradient operator in guiding-center coordinates

Next, let us express the velocity gradient ∂f∕∂v in terms of the guiding-center variables. Using the chain rule, ∂f∕∂v is written

∂fp| = ∂X-⋅ ∂fg + ∂𝜀-∂fg+ ∂-μ∂fg + ∂α-∂fg. ∂v x ∂v ∂X ∂v ∂𝜀 ∂v ∂ μ ∂v ∂α

(42)

From the deﬁnition of X, we obtain

( ) ∂X- ∂-- v-×-e∥ ∂v = ∂v Ω ∂v e∥ = ∂v-× Ω- e∥ = I × Ω-. (43)

From the deﬁnition of 𝜀, we obtain

∂ 𝜀 ∂v-= v,

(44)

From the deﬁnition of μ, we obtain

∂-μ = v⊥, ∂v B0

(45)

From the deﬁnition of α, we obtain

( ) ∂α-= -1- e∥ × v⊥- = eα, ∂v v⊥ v⊥ v⊥

(46)

where e_α is deﬁned by

e = e × v⊥. α ∥ v⊥

(47)

Using the above results, expression (42) is written

∂fp I×-e∥- ∂fg ∂fg v⊥-∂fg eα-∂fg ∂v |x = Ω ⋅∂X + v∂ 𝜀 + B0 ∂μ + v⊥ ∂α .

(48)

2.7 Time derivatives in guiding-center coordinates

In terms of the guiding-center variables, the time partial derivative ∂f_p∕∂t appearing in Vlasov equation is written as

∂f ∂f ∂X ∂f ∂V ∂f --p|x,v = --g|X,V + ---⋅--g + ---⋅---g, ∂t ∂t ∂t ∂X ∂t ∂V

(49)

where V = (𝜀,μ,α). Here ∂X∕∂t and ∂V∕∂t are not necessarily zero because the equilibrium quantities involved in the deﬁnition of the guiding-center transformation are in general time dependent. This time dependence is assumed to be very slow in the gyrokinetic ordering discussed later. In the following, ∂X∕∂t and ∂V∕∂t will be dropped, i.e.,

∂f ∂f --p ≈ --g. ∂t ∂t

(50)

2.8 Final form of Vlasov equation in guiding-center coordinates

Using the above results, the Vlasov equation in guiding-center coordinates is written

[ ] ∂fg+ v ⋅ ∂fg+ [λB1 + λB2]fg − q-E0∂fg ∂t ∂X ( m ∂𝜀 ) q- I×-e∥- ∂fg ∂fg v⊥-∂fg eα-∂fg + m(E + v ×B )⋅ Ω ⋅ ∂X + v ∂𝜀 + B0 ∂μ + v⊥ ∂α = 0 (51)

Using tensor identity a ⋅ I × b = a × b, equation (51) is written as

[ ] ∂fg + v⋅ ∂fg+ [λB1 + λB2 ]fg −-qE0 ∂fg ∂t ∂X m ∂𝜀 ( ) -q (e∥) ∂fg q- eα-∂fg + m (E + v× B )× Ω ⋅∂X + m (v × B) ⋅ v⊥ ∂α q ( ∂fg v ⊥∂fg eα ∂fg) + m-E ⋅ v ∂𝜀-+ -B--∂μ + v--∂α- 0 ⊥ = 0, (52)

This is the Vlasov equation in guiding-center coordinates.

3 δf form of Vlasov equation in guiding-center variables

3.1 Electromagnetic ﬁeld perturbation

Since the deﬁnition of the guiding-center variables (X,𝜀,μ,α) involves the equilibrium ﬁelds B₀ and E₀, to further simplify Eq. (52), we need to separate electromagnetic ﬁeld into equilibrium and perturbation parts. Writing the electromagnetic ﬁeld as

E = E0 + δE

(53)

and

B = B0 + δB,

(54)

then substituting these expressions into equation (52) and moving all terms involving the ﬁeld perturbations to the right-hand side, we obtain

[ ] ∂fg + v⋅ ∂fg + v⋅ [λ + λ ]f − q-E ∂fg ∂t ∂X B1 B2 g m 0 ∂𝜀 q- ( e∥) ∂fg -q ( eα∂fg) + m (E0 + v × B0)× Ω ⋅ ∂X + m (v × B0)⋅ v⊥ ∂α q ( ∂f v ∂f e ∂f ) + --E0 ⋅ v---g+ -⊥---g + -α---g m ∂𝜀 B0 ∂ μ v⊥ ∂α = δRfg, (55)

where δR is deﬁned by

( ) ( ) δR = − -q(δE +v × δB) × e∥ ⋅-∂-− -q(v × δB )⋅ eα-∂- m ( Ω ∂X )m v⊥∂ α -q ∂-- v-⊥-∂- eα--∂- − m δE ⋅ v ∂𝜀 + B0 ∂μ + v⊥ ∂α . (56)

Next, let us simplify the left-hand side of Eq. (55). Note that

( ) q- eα-∂fg m (v × B0)⋅ v⊥ ∂α q e∥ × v⊥ ∂fg = m-(v × B0)⋅ --v2---∂α- ⊥ = − Ω ∂fg. (57) ∂α

Note that

( ) ( ) -qE0 × e∥ ⋅ ∂fg = c E0-×-e∥ ⋅ ∂fg = vE0 ⋅ ∂fg, m Ω ∂X B0 ∂X ∂X

(58)

where v_E0 is deﬁned by v_E0 = cE₀ × e_∥∕B₀, which is the E₀ × B₀ drift. Further note that

(e ) q-v-×-B0 × -∥ ⋅ ∂fg = [(v× e∥)× e∥]⋅ ∂fg m c Ω ∂X ∂X = [v∥e∥ − v]⋅ ∂fg , (59) ∂X

which can be combined with v ⋅ ∂f_g∕∂X term, yielding v_∥e_∥⋅ ∂f_g∕∂X.

Using Eqs. (58), (59), and (57), the left-hand side of equation (55) is written as

∂fg ∂fg ∂fg -∂t + (v∥e∥ + VE0)⋅-∂X + v ⋅[λB1 + λB2]fg − Ω ∂α q (v ∂f e ∂f ) + --E0 ⋅ --⊥--g + -α---g ≡ Lgfg, (60) m B0 ∂μ v⊥ ∂α

where L_g is often called the unperturbed Vlasov propagator in guiding-center coordinates (X,𝜀,μ,α).

[Equation (60), corresponds to Eq. (7) in Frieman-Chen’s paper[3]. In Frieman-Chen’s equation (7), there is a term

(E_mac − E₀) ⋅ v

where E_mac is a given macroscopic electric ﬁeld introduced when deﬁning the guiding-center transformation. In my derivation E_mac is chosen to be equal to the equilibrium electric ﬁeld, and thus the above term is zero.]

Using the above results, Eq. (55) is written as

Lgfg = δRfg,

(61)

i.e.

∂fg + (v∥e∥ + VE0)⋅ ∂fg− Ω ∂fg ∂t [ ∂X ∂α ] +v ⋅ v × -∂-(e∥) ⋅ ∂fg + ∂μ∂fg + ∂α-∂fg ∂x Ω ∂X ∂x ∂μ ∂x ∂α q ( v⊥ ∂fg eα∂fg) + m-E0 ⋅ B0-∂μ- + v⊥-∂α ( e ) ( ) = − q-(δE + v × δB)× -∥ ⋅ ∂fg− -q(v× δB )⋅ eα∂fg m ( Ω ∂X ) m v⊥ ∂α − q-δE ⋅ v∂fg + v⊥-∂fg + eα∂fg . (62) m ∂𝜀 B0 ∂μ v⊥ ∂α

It is instructive to consider some special cases of the above complicated equation. Consider the case that the equilibrium magnetic ﬁeld B₀ is uniform and time-independent, E₀ = 0, and the electrostatic limit δB = 0, then equation (62) is simpliﬁed as

∂f ∂f ∂f --g+ v∥e∥ ⋅-g − Ω --g ∂t ∂X(e ) ∂α = − q-(δE)× -∥ ⋅ ∂fg (63) m ( Ω ∂X ) − q-δE ⋅ v∂fg + v⊥-∂fg + eα∂fg (64) m ∂𝜀 B0 ∂ μ v⊥ ∂α

If neglecting the δE perturbation, the above equation reduces to

∂f ∂f ∂f --g + v∥e∥ ⋅--g− Ω --g = 0, ∂t ∂X ∂ α

(65)

which agrees with Eq. (21) discussed in Sec. 1.2.

3.2 Distribution function perturbation

Expand the distribution function f_g as

fg = Fg + δFg,

(66)

where F_g is assumed to be an equilibrium distribution function, i.e.,

LgFg = 0.

(67)

Using Eqs. (66) and (67) in Eq. (61), we obtain an equation for δF_g:

Lg δFg = δRFg + δRδFg.

(68)

3.3 Gyrokinetic ordering

To facilitate the simpliﬁcation of the Vlasov equation in the low-frequency regime, we assume the following orderings (some of which are roughly based on experiment measure of ﬂuctuations responsible for tokamak plasma transport, some of which can be invalid in some interesting cases.) These ordering are often called the standard gyrokinetic orderings.

Assumptions for equilibrium quantities

Deﬁne the spatial scale length L₀ of equilibrium quantities by L₀ ≈ F_g∕|∇_XF_g|. Assume that L₀ is much larger than the thermal gyro-radius ρ_i ≡ v_t∕Ω, i.e., λ ≡ ρ_i∕L₀ is a small parameter, where v_t = ∘ -----
2T ∕m is the thermal velocity. That is

1-- 1 Fgρi|∇XFg | ∼ O (λ ),

(69)

and

-1-ρ |∇ B | ∼ O(λ1). B0 i X 0

(70)

The equilibrium E₀ × B₀ ﬂow, i.e.,

vE0 = E0 × e∥∕B0 = − ∇ Φ0 × e∥∕B0,

(71)

is assumed to be weak with

|v | --E0-∼ O(λ1), vt

(72)

Assumptions for perturbations

We assume that the amplitudes of perturbations are small. Speciﬁcally, we assume

δFg-∼ qδΦ-∼ |δB-|∼ O (λ1), F0 T B0

(73)

where δΦ is the perturbed scalar potential deﬁned later in Eq. (79).

We consider low frequency perturbations with ω∕Ω ∼ O(λ¹), then

-1--1∂δFg- 1 δFgΩ ∂t ∼ O (λ ).

(74)

The perturbation is assumed to have a long wavelength (much longer than ρ_i) in the parallel direction

1 ---|ρie∥ ⋅∇X δFg| ∼ O(λ1), δFg

(75)

and have a short wavelength comparable to the thermal gyro-radius in the perpendicular direction

-1-|ρ ∇ δF | ∼ O(λ0). δFg i X⊥ g

(76)

[Combining Eq. (75) and (76), we obtain

k∥-≈ e∥ ⋅∇X-∼ O(λ), k⊥ ∇X⊥

(77)

i.e., the parallel wave number is one order smaller than the perpendicular wave-number.]

In terms of the scalar and vector potentials δΦ and δA, the perturbed electromagnetic ﬁeld is written as

δB = ∇x × δA,

(78)

and

∂δA δE = − ∇xδΦ − -∂t-.

(79)

Then

( ) δE∥ = − ∇∥δΦ − ∂-δA- ∂t ∥

(80)

(∂δA ) δE ⊥ = − ∇ ⊥δΦ − -∂t- , ⊥

(81)

Using the above orderings, it is ready see that δE_∥ is one order smaller than δE_⊥, i.e.,

δE --∥-= O (λ1). δE⊥

(82)

Most gyrokinetic simulations approximate the vector potential as δA ≈ δA_∥e_∥.

δB = ∇x × (δA ∥e∥) ≈ ∇xδA ∥ × e∥.

(83)

We will assume v_tδA_∥∼ δΦ.

3.4 Equation for macroscopic distribution function F_g

The evolution of the macroscopic quantity F_g is governed by Eq. (67), i.e.,

LgFg = 0,

(84)

where the left-hand side is written as

∂Fg ∂X ∂Fg ∂V ∂Fg LgFg = -∂t-+ -∂t ⋅∂X--+ ∂t-⋅-∂V- ∂Fg ∂Fg + (v∥e∥ + VE0)⋅ ∂X-+ v ⋅[(λB1 + λB2)Fg]− Ω ∂α-- q (v ∂F e ∂F ) + --E0 ⋅ -⊥----g+ -α---g- m B0 ∂μ v⊥ ∂α

Expand F_g as F_g = F_g0 + F_g1 + …., where F_gi ∼ F_g0O(λⁱ). Then, the balance on order O(λ⁰) gives

∂F --g0 = 0 ∂α

(85)

i.e., F_g0 is independent of the gyro-angle α. The balance on O(λ¹) gives

∂F q (v ∂F ) ∂F v∥e∥ ⋅--g0+ --E0 ⋅ --⊥---g0 = Ω ---g1 . ∂X m B0 ∂μ ∂α

(86)

Performing averaging over α, ∫ ₀^2π(…)dα, on the above equation and noting that F_g0 is independent of α, we obtain

(∫ 2π ) ∂Fg0 q-∂Fg0∫ 2π ( v⊥) ∫ 2π ∂Fg1 0 dαv∥e∥ ⋅∂X + m ∂μ 0 dαE0 ⋅ B0 = 0 dα Ω ∂α

(87)

Note that a quantity A = A(x) that is independent of v will depend on v when transformed to guiding-center coordinates, i.e., A(x) = A_g(X,v). Therefore A_g depends on gyro-angle α. However, since ρ_i∕L ≪ 1 for equilibrium quantities, the gyro-angle dependence of the equilibrium quantities can be neglected. Speciﬁcally, e_∥, B₀ and Ω can be considered to be independent of α. As to v_∥, we have v_∥ = ± ∘ ----------
2(𝜀− B0μ) . Since B₀ is considered independent of α, so does v_∥. Using these results, equation (87) is written

∂F q ∂F ∫ 2π ( v ) v∥e∥ ⋅--g0+ -- --g0 dαE0 ⋅ -⊥- = 0. ∂X m ∂ μ 0 B0

(88)

Using E₀ = −∇Φ₀, the above equation is written as

∫ 2π ( ) v∥e∥ ⋅ ∂Fg0 + q-∂Fg0 dα − v⊥-⋅∇-Φ0 = 0, ∂X m ∂μ 0 B0

(89)

Note that

∫ 2π -1- 0 dαB0 v⊥ ⋅∇X Φ0 ≈ 0,

(90)

where the error is of O(λ²)Φ₀, and thus, accurate to O(λ), the last term of equation (89) is zero. Then equation (89) is written as

∂Fg0 v∥e∥ ⋅ ∂X = 0,

(91)

which implies that F_g0 is constant along a magnetic ﬁeld line.

3.5 Equation for δF_g

Using F_g ≈ F_g0, equation (68) is written as

LgδFg = δRFg0 + δR δFg , Nonlin◟ear◝T◜erm◞∼O(λ2)

(92)

where δRδF_g is a nonlinear term which is of order O(λ²) or higher, L_gδF_g and δRF_g0 are linear terms which are of order O(λ¹) or higher. The linear term δRF_g0 is given by

( ) (e ) ( ) δRFg0 = − q- δE + v-×-δB × -∥ ⋅ ∂Fg0 − q-δE ⋅ v∂Fg0 + v⊥-∂Fg0 , ◟-m----------c◝◜------Ω----∂X-◞ m◟--------∂𝜀◝◜--B0--∂μ--◞ O(λ2) O(λ1)

(93)

In obtaining (93), use has been made of ∂F_g0∕∂α = 0. Another linear term L_gδF_g is written as

L δF = ∂-δFg-+ (v e + V )⋅ ∂δFg-+ v⋅[λ + λ ]δF − Ω∂-δFg- g g ∂t ∥ ∥ E0 ∂X B1 B2 g ◟-∂◝α◜-◞ O(λ1) q ( v ∂δF e ∂δF ) + --E0 ⋅ -⊥----g+ --α---g- , (94) m B0 ∂ μ v⊥ ∂α

where Ω∂δF_g∕∂α is of order O(λ¹) and all the other terms are of order O(λ²).

Next, to reduce the complexity of algebra, we consider the easier case in which ∂F_g0∕∂μ = 0.

Balance on order O(λ¹): adiabatic response

The balance between the leading terms (terms of O(λ)) in Eq. (92) requires that

( ) Ω∂-δFa-= q-δE ⋅ v∂Fg0 , ∂α m ∂𝜀

(95)

where δF_a is a unknown distribution function to be solved from the above equation. It is ready to verify that

q- ∂Fg0 δFa = m δΦ ∂𝜀 ,

(96)

is a solution to the above equation, accurate to O(λ). [Proof: Substitute expression (96) into the left-hand side of Eq. (95), we obtain

( ) Ω ∂δFa-= Ω ∂-- q-δΦ∂Fg0 ∂α ∂α m ∂𝜀 q-∂Fg0 -∂- = m ∂𝜀 Ω ∂α (δΦ ) q-∂Fg0 ∂x- = m ∂𝜀 Ω (∇xδΦ) ⋅∂α (97)

Using

[ ] ∂x- ∂-- e∥(X-)- ∂α = ∂α − v × Ω(X ) ∂ e (X ) = --[− v]×-∥---- ∂αv Ω(X ) = − -⊥- (98) Ω

Eq. (97) is written as

∂δFa q ∂Fg0 v⊥ Ω -∂α--= − m--∂𝜀-Ω(∇x δΦ)⋅-Ω- q ∂F ( ∂δA ) = -----g0 δE + ---- ⋅v⊥ (99) m ∂𝜀 ∂t ≈ q-∂Fg0(δE) ⋅v⊥, (100) m ∂𝜀

where terms of O(λ²) have been dropped. Similarly, dropping the parallel electric ﬁeld term (which is of O(λ²)) on the right-hand side of Eq. (95), we ﬁnd it is identical to the right-hand side of Eq. (100)]

Separate δF_g into adiabatic and non-adiabatic part

As is discussed above, the terms of O(λ) can be eliminated by splitting a so-called adiabatic term form δF_g. Speciﬁcally, write δF_g as

δF = δF + δG, g a

(101)

where δF_a is given by (96), i.e.,

q- ∂Fg0 δFa = m δΦ ∂𝜀 ,

(102)

which depends on the gyro-angle via δΦ and this term is often called adiabatic term. Plugging expression (101) into equation (92), we obtain

LgδG = δ◟RFg0-−◝◜-LgδFa◞+ δR◟ ◝δ◜Fg◞ . LinearTerms NonlinearTerms

(103)

Next, let us simplify the linear term on the right-hand side, i.e, δRF_g0 −L_gfδF_a, (which should be of O(λ²) or higher because Ω∂δF_a∕∂α cancels all the O(λ¹) terms in δRF_g0).

L_gδF_a is written

L δF = q-∂Fg0L δΦ + q-δΦL ∂Fg0 g a m ∂𝜀 g m g ∂ 𝜀 ≈ q-∂Fg0L δΦ, (104) m ∂𝜀 g

where the error is of order O(λ³). In obtaining the above expression, use has been made of e_∥⋅∂F_g0∕∂X = 0, ∂F_g0∕∂X = O(λ¹)F_g0, ∂F_g0∕∂α = 0, ∂F_g0∕∂μ = 0, and the deﬁnition of λ_B1 and λ_B2 given in expressions (39) and (40). The expression (104) involves δΦ operated by the Vlasov propagator L_g. Since δΦ takes the most simple form when expressed in particle coordinates (if in guiding-center coordinates, δΦ(x) = δΦ(X−v ×e_∥∕Ω), which depends on velocity coordinates and thus more complicated), it is convenient to use the Vlasov propagator L_g expressed in particle coordinates. Transforming L_g back to the particle coordinates, expression (104) is written

[ ] LgδFa = q-∂Fg0 ∂δΦ|x,v + v⋅∇x δΦ+ -q(E0 + v× B0 )⋅ ∂Φ-|x m ∂𝜀 [ ∂t ] m ∂v q-∂Fg0 ∂δΦ- = m ∂𝜀 ∂t |x,v + v⋅∇x δΦ (105) q ∂Fg0[ ∂δΦ ( ∂δA )] = m--∂𝜀- ∂t-|x,v + v⋅ − δE − -∂t-|x,v [ ] = q-∂Fg0 ∂δΦ|x,v − v⋅δE − ∂v-⋅δA|x,v . (106) m ∂𝜀 [ ∂t ∂t] q-∂Fg0 ∂δΦ- ∂v-⋅δA- = m ∂𝜀 ∂t − v⋅δE − ∂t . (107)

Using this and expression (93), δRF_g0 − L_gδF_a is written as

( ) q- (e∥) ∂Fg0 q- ∂Fg0 δRFg0 − LgδFa = − m(δE + v ×δB )× Ω ⋅ ∂X − m δE ⋅ v ∂ 𝜀 q ∂Fg0 [∂δΦ ∂v ⋅δA ] − ------ ----− v⋅δE − ------- m ∂𝜀 ∂t ( ) ∂t [ ] = − q(δE + v ×δB )× e∥ ⋅ ∂Fg0 − q-∂Fg0 ∂Φ-− ∂v-⋅δA- , (108) m Ω ∂X m ∂𝜀 ∂t ∂t

where the two terms of O(λ¹) (the terms in blue and red) cancel each other, with the remain terms being all of O(λ²), i.e, the contribution of the adiabatic term cancels the leading order terms of O(λ¹) on the RHS of Eq. (103).

The consequence of this is that, as we will see in Sec. 3.6, δG is independent of the gyro-angle, accurate to order O(λ¹). Therefore, separating δF into adiabatic and non-adiabatic parts also corresponds to separating δF into gyro-angle dependent and gyro-angle independent parts.

Linear term expressed in terms of δΦ and δA

Let us rewrite the linear term (108) in terms of δΦ and δA. The δE + v ×δB term in expression (108) is written as

∂δA δE +v × δB = − ∇x δΦ −----+ v × ∇x × δA. ∂t

(109)

Note that this term needs to be accurate to only O(λ). Then

δE + v× δB ≈ − ∇xδΦ + v× ∇x × δA,

(110)

where the error is of O(λ²). Using the vector identity v ×∇_x ×δA = (∇δA) ⋅v − (v ⋅∇)δA and noting v is constant for ∇_x operator, the above equation is written

δE + v × δB = − ∇x δΦ+ ∇x (δA ⋅v )− (v ⋅∇x )δA

(111)

Note that Eq. (41) indicates that ∇_xδΦ ≈∇_XδΦ, where the error is of O(λ²), then the above equation is written

δE+ v × δB = − ∇X δΦ + ∇X (δA ⋅v)− (v⋅∇X )δA

(112)

Further note that the parallel gradients in the above equation are of O(λ²) and thus can be dropped. Then expression (112) is written

δE + v× δB = − ∇X ⊥δΦ+ ∇X ⊥ (δA ⋅v) − (v⊥ ⋅∇X ⊥)δA. = − ∇X ⊥δL− v ⊥ ⋅∇X ⊥δA, (113)

where δL is deﬁned by

δL = δΦ − v ⋅δA.

(114)

Using expression (113), equation (108) is written

q [ e ] ∂F q∂δL ∂F δRF0 − LgδFa = − -- (− ∇X ⊥δL − v⊥ ⋅∇X ⊥δA) ×-∥ ⋅---g0− --------g0, m Ω ∂X m ∂t ∂𝜀

(115)

where all terms are of O(λ²).

3.6 Equation for the non-adiabatic part δG

Plugging expression (115) into Eq. (103), we obtain

[ ] LgδG = − q- (− ∇X ⊥ δL− v⊥ ⋅∇X ⊥δA )× e∥ ⋅ ∂Fg0− q-∂δL∂Fg0 + δRδFg, m Ω ∂X m ∂t ∂ 𝜀

(116)

where L_g is given by Eq. (94), i.e.,

Lg = -∂ + (v∥e∥ + VE0 )⋅-∂ + v⋅[λB1 + λB2 ]− Ω-∂ ∂t ( ∂X ) ∂ α + -qE0 ⋅ v-⊥-∂-+ eα--∂- , (117) m B0 ∂μ v⊥ ∂α

Expansion of δG

Expand δG as

δG = δG₀ + δG₁ + …,

where δG_i ∼ O(λⁱ⁺¹)F_g0, and note that the right-hand side of Eq. (116) is of O(λ²), then, the balance on order O(λ¹) requires

∂-δG0- ∂ α = 0,

(118)

i.e., δG₀ is gyro-phase independent.

The balance on order O(λ²) requires (for the special case of E₀ = 0):

∂δG0-+ v∥e∥ ⋅ ∂δG0-+ v ⋅[λB1 + λB2]δG0 ∂t [ ∂X ] = −-q (− ∇X ⊥δL − v⊥ ⋅∇X δA )× e∥ ⋅ ∂Fg0 − q-∂δL-∂Fg0+ δR δFg. (119) m Ω ∂X m ∂t ∂𝜀

Gyro-averaging

Deﬁne the gyro-average operator ⟨…⟩_α by

− 1∫ 2π ⟨h⟩α = (2π) hdα, 0

(120)

where h = h(X,α,𝜀,μ) is an arbitrary function of guiding-center variables. The gyro-averaging is an integration in the velocity space. [For a ﬁeld quantity, which is independent of the velocity in particle coordinates, i.e., h = h(x), it is ready to see that the above averaging is a spatial averaging over a gyro-ring.]

Gyro-averaging Eq. (119), we obtain

∂δG0 ⟨ ∂ δG0 ⟩ -∂t--+ v∥e∥ ⋅-∂X- + ⟨v⋅[λB1 + λB2 ]δG0⟩α [ e ] = −-q − ∇X ⊥⟨δL⟩α ×-∥ ⋅ ∂Fg0 − q-∂⟨δL⟩α∂Fg0 + ⟨δRδFg⟩α, (121) m Ω ∂X m ∂t ∂𝜀

where use has been made of ⟨(v_⊥⋅∇_X)δA⟩_α ≈ 0, where the error is of order higher than O(λ²). Note that v_∥ = ± ∘ ----------
2(𝜀− B0μ )

. Since B₀ is approximately independent of α, so does v_∥. Using this, the ﬁrst gyro-averaging on the left-hand side of the above equation is written

⟨ ⟩ ∂δG0- ∂δG0- ∂δG0- v∥e∥ ⋅ ∂X α = ⟨v∥e∥⟩⋅ ∂X = v∥e∥ ⋅ ∂X

(122)

The second gyro-averaging on the left-hand side of Eq. (121) can be written as

⟨v ⋅[λB1 + λB2]δG0⟩α = VD ⋅∇X δG0,

(123)

where V_D is the magnetic curvature and gradient drift (Eq. (123) is derived in Appendix xx, to do later). Then Eq. (121) is written

[ ∂ ] ∂t + (v∥e∥ + VD )⋅∇X δG0 [ ] = −-q − ∇X ⊥⟨δL⟩α × e∥ ⋅ ∂Fg0 − q-∂⟨δL⟩α ∂Fg0+ ⟨δRδFg⟩α. (124) m Ω ∂X m ∂t ∂𝜀

Simpliﬁcation of the nonlinear term

Next, we try to simplify the nonlinear term ⟨δRδF_g⟩_α appearing in Eq. (124), which is written as

⟨ ( ) ⟩ ⟨δRδFg⟩α = δR -qδΦ ∂Fg0+ δG0 ⟨ m( ∂𝜀 )⟩ α -q ∂Fg0 = m δR δΦ ∂𝜀 α + ⟨δRδG0⟩α (125)

First, let us focus on the ﬁrst term, which can be written as

( ) ( ) ( ) ( ) ( ) δR δΦ ∂Fg0 ≈ − q-∂Fg0 δE + v×-δB- × e∥ ⋅ ∂δΦ-− q-∂Fg0 v-×-δB ⋅ eα-∂δΦ- ∂𝜀 m ∂𝜀 ( c Ω ∂X ) m ∂𝜀 c v⊥ ∂α -q∂Fg0 ∂δΦ- v⊥-∂δΦ- eα-∂δΦ- -q ∂2Fg0 − m ∂ 𝜀 δE⋅ v ∂𝜀 + B0 ∂μ + v⊥ ∂ α + m δΦδE ⋅v ∂𝜀2 q ( v × δB) q ∂2F = − -- δE + ------ ⋅∇v δΦ+ -δΦ δE⋅v ---g20 m c m ∂𝜀 = -qδΦδE ⋅v ∂2Fg0- (126) m ∂ 𝜀2

Using the above results, the nonlinear term ⟨δRδF⟩_α is written as

q ⟨ ∂2F ⟩ ⟨δRδF ⟩α = -- δΦδE ⋅v---g20 + ⟨δRδG0⟩α m ∂𝜀 α

(127)

Accurate to O(λ²),the ﬁrst term on the right-hand side of the above is zero. [Proof:

⟨ ∂2Fg0⟩ ⟨ ∂2Fg0 ⟩ δΦ δE ⋅v -∂𝜀2- = -∂𝜀2-δΦ∇ δΦ ⋅v α 2 α = ∂-Fg0⟨v ⋅∇(δΦ)2⟩α ∂𝜀2 ≈ ∂2Fg0⟨v ⋅∇(δΦ)2⟩ ∂𝜀2 ⊥ α ≈ 0, (128)

where use has been made of ⟨v_⊥⋅∇_XδΦ⟩_α ≈ 0, where the error is of O(λ²). Using the above results, expression (127) is written as

⟨δRδFg⟩α = ⟨δR δG0⟩α.

(129)

Using the expression of δR given by Eq. (56), the above expression is written as

⟨( ) ( )⟩ ⟨δRδG ⟩ = −-q δE + v-×-δB × e∥ ⋅ ∂δG0 0α m c Ω α ∂X q ∂δG0 q∂ δG0 ⟨ v⊥⟩ −m- -∂𝜀-⟨δE ⋅v⟩α − m-∂-μ- δE ⋅B0- (130) α

where use has been made of ∂δG₀∕∂α = 0. Using Eq. (113), we obtain

q ⟨ ( e∥)⟩ q e∥ − m- (δE +v × δB) × Ω- α = m-∇X ⊥⟨δL⟩α × Ω-.

(131)

The other two terms in Eq. (130) can be proved to be zero. [Proof:

q∂δG0 q∂ δG0 −m---∂𝜀-⟨δE ⋅v ⟩α = m---∂𝜀-⟨v⋅∇x Φ⟩α q∂ δG0 ≈ m---∂𝜀-⟨v⊥ ⋅∇xΦ⟩α q∂ δG0 ≈ m---∂𝜀-⟨v⊥ ⋅∇X Φ⟩α ≈ 0 (132)

⟨ ⟩ ⟨ ⟩ − -q∂δG0- δE⋅ v⊥- = q-∂δG0- -1-v ⋅∇ Φ m ∂μ B0 α m ∂μ B0 ⊥ x α q ∂δG0 ⟨ 1 ⟩ ≈ m--∂μ-- B0-v⊥ ⋅∇X Φ α ≈ 0 (133)

] Using the above results, the nonlinear term is ﬁnally written as

-q[ e∥] ⟨δRδG0 ⟩α = m ∇X ⊥⟨δL⟩α × Ω ⋅∇X δG0.

(134)

Using this in Eq. (129), we obtain

[ ] ⟨δRδFg⟩α = q- ∇X ⊥⟨δL⟩α × e∥ ⋅∇X δG0, m Ω

(135)

which is of O(λ²).

Final equation for the non-adiabatic part of perturbed distribution function

Using the above results, the gyro-averaged kinetic equation for δG₀ is ﬁnally written as

( ) ∂δG0 | q e ∥| -∂t--+|(v ∥e∥ + VD − m-∇⟨δL⟩× -Ω |) ⋅∇X δG0 ◟----◝◜----◞ ( ) nonlinear = q-∇⟨δL⟩× e∥ ⋅∇XFg0 − q-∂⟨δL⟩∂Fg0 . (136) ◟-m-------◝◜Ω--------◞ m◟---∂t◝◜-∂𝜀-◞ spatial− drive velocit− space− damp

where V_D is the equilibrium guiding-center drift velocity, ⟨…⟩ is the gyro-phase averaging operator, δL = δΦ− v ⋅δA, and δG₀ = δG₀(X,𝜀,μ,t) is gyro-angle independent and is related to the perturbed distribution function δF_g by

δFg = -qδΦ∂Fg0 + δG0, m ∂ 𝜀

(137)

where the ﬁrst term is called “adiabatic term”, which depends on the gyro-phase α via δΦ. Equation (136) is the special case (∂F_g0∕∂μ|_𝜀 = 0) of the Frieman-Chen nonlinear gyrokinetic equation given in Ref. [3]. Note that the nonlinear terms only appear on the left-hand side of Eq. (136) and all the terms on the right-hand side are linear. The term

− q∇ ⟨δL⟩× e∥ m Ω

(138)

consists of the δE × B₀ drift and magnetic ﬂuttering term (refer to expression (330) in Sec. C.3). For notaiton ease, this term is denoted by δV_D in the following:

-q e∥ q- e∥ δVD ≡ −m ∇ ⟨δL⟩× Ω = − m ∇⟨δΦ− v ⋅δA⟩α × Ω

(139)

Note that ∇≡ ∂∕∂X, which is taken in the guiding-center coordinates (X,𝜀,μ,α) while holding (𝜀,μ,α) constant. How do we numerically calculate ∂Φ_g∕∂X in PIC simulations? The diﬃculty is that X grid is not available, which makes it diﬃcult to use the ﬁnite diﬀerence method. We note x grid is available in PIC simulations. It turns out we can make use of x grid to approximatly calculate the derivatives in X space. Using x = X + ρ and the deﬁnition δΦ_g(X,α,v_⊥) = δΦ_p(x) we obtain

∂δΦg(X,α,v⊥-) ∂-δΦp-(x-) ∂X = ∂X ∂ δΦp (x ) ∂x = ---∂x-- ⋅∂X- ∂ δΦp (x ) ( ∂ρ ) = ---∂x-- ⋅ 1 + ∂X- ≈ ∂-δΦp-(x-), (140) ∂x

which is accurate up to O(λ) in gyrokinetic ordering. The above relation indicates that the value of ∂δΦ_g∕∂X at a guidng-center location is approximately equal to the value of ∂δΦ_g∕∂x at the corresponding particle position. Therefore we can use δΦ deﬁned on x grid to compute ∂δΦ_p∕∂x, and interpolate it to the particle position and this is a good approximation of ∂δΦ_g∕∂X.

Similarly, we have ∂δA_∥g∕∂X ≈ ∂δA_∥p∕∂x.

Also note that ⟨δΦ_g⟩ is not known on X grid (it is known on random markers). So, to calculate ∂⟨δΦ_g⟩∕∂X, we need to exchange the operation order:

δ⟨δΦg⟩= ⟨∂δΦg-⟩ ≈ ⟨∂δΦp⟩. ∂X ∂X ∂x

(141)

For notation ease, the subscripts g and p are usually dropped. Which one is intended should be obvious from the context.

4 Gyrokinetic equation suitable for numerical simulation

The gyrokinetic equation (136) contains time derivatives on the right-hand side, which is problematic if treated by using explicit ﬁnite diﬀerence in PIC simulations. Next, we discuss some methods that can eliminate these terms, making the gyrokinetic equation more amenable to PIC simulations.

4.1 Eliminate ∂⟨δϕ⟩∕∂t term on the right-hand side of Eq. (136)

Note that the coeﬃcient before ∂F₀∕∂𝜀 in Eq. (136) involves the time derivative of ⟨δϕ⟩_α, which is problematic if treated by using explicit ﬁnite diﬀerence (I test the algorithm that treats this term by implicit scheme, the result roughly agrees with the standard method discussed in Sec. 6). It turns out that ∂⟨δΦ⟩_α∕∂t can be eliminated by deﬁning another gyro-phase independent function δf by

δf = -q⟨δΦ⟩∂F0-+ δG0. m ∂𝜀

(142)

Substituting this into Eq. (136), we obtain the equation for δf:

[ ∂ ] ∂t + (v∥e∥ + VD + δVD )⋅∇X δf [ ] − -q∂F0- -∂ + (v∥e∥ + VD + δVD )⋅∇X ⟨δΦ⟩ m ∂𝜀 [∂t ] − -q⟨δΦ⟩ -∂ + (v e + V + δV )⋅∇ ∂F0- m ∂t ∥ ∥ D D X ∂𝜀 q ∂⟨δL⟩∂F0 = − δVD ⋅∇XF0 − m---∂t--∂𝜀-. (143)

Noting that ∂F₀∕∂t = 0, e_∥⋅∇F₀ = 0, ∂F₀∕∂𝜀 ≈−F₀m∕T, we ﬁnd that the third line of the above equation is of order O(λ³) (after both sides being divided by F₀Ω). Therefore the third line can be dropped. Moving the second line to the right-hand side and noting that ⟨δL⟩_α = ⟨δϕ − v ⋅ δA⟩_α, the above equation is written as

[ ] ∂-+ (ve + V + δV )⋅∇ δf ∂t ∥∥ D D X = − δVD ⋅∇XF0 q [ ∂⟨v⋅δA ⟩ ] ∂F0 − m- − ---∂t---− (v∥e∥ + VD + δVD )⋅∇X ⟨δΦ⟩ ∂𝜀-, (144)

where two ∂⟨ϕ⟩_α∕∂t terms cancel each other. Note that the right-hand side of Eq. (144) contains a nonlinear term δV_D ⋅∇_X⟨δΦ⟩. This is diﬀerent from the original Frieman-Chen equation, where all nonlinear terms appear on the left-hand side. For the electrostatic limit, this term disappears because δV_D is perpendicular to ∇_X⟨δΦ⟩_α.

Using

δV_D = −

∇⟨δΦ − v ⋅ δA⟩×

in the right-hand side of Eq. (144) yields

[ ] ∂- ∂t +(v∥e∥ + VD + δVD ) ⋅∇X δf = − δVD ⋅∇XF0 q [∂⟨v⋅δA ⟩ ( q e ) ] ∂F + -- --------+ v∥e∥ +VD + --∇X ⟨v⋅δA ⟩× -∥ ⋅∇X ⟨δΦ⟩ --0. (145) m ∂t m Ω ∂𝜀

[Equation (145) corresponds to Eqs. (A8-A9) in Yang Chen’s paper[2], where the ﬁrst minus on the right-hand side of Eq. (A8) is wrong and should be replaced with q∕m; one q is missing before ∂⟨v ⋅ δA⟩∕∂t in Eq. (A9).]

4.2 Eliminate ∂⟨δv ⋅ δA⟩∕∂t term on the right-hand side of GK equation

Similar to the method of eliminating ∂⟨δϕ⟩∕∂t, we deﬁne another gyro-phase independent function by

δh = δf − q-∂F0⟨v ⋅δA⟩. m ∂𝜀

(146)

Most gyrokinetic simulations approximate the vector potential as δA ≈ δA_∥e_∥. Let us simplify Eq. (144) for this case. Then ⟨v ⋅ δA⟩_α is written as

⟨v⋅δA ⟩α ≈ ⟨v∥δA∥⟩.

(147)

Then expression (146) is written as

δh = δf − q-⟨vδA ⟩∂F0. m ∥ ∥ ∂𝜀

(148)

Then Eq. (144) is written in terms of δh as

[ ∂ ] ∂t + (v∥e∥ + VD + δVD )⋅∇X δh q∂F [ ∂ ] +-----0 -- + (v∥e∥ +VD + δVD )⋅∇X ⟨v∥δA∥⟩ m ∂𝜀 ∂t ( ) +-q⟨v∥δA∥⟩[(VD + δVD )⋅∇X ] ∂F0- m ∂ 𝜀 = − δVD ⋅∇XF0 q[ ∂⟨v∥δA∥⟩ ] ∂F0 −m- − ---∂t---− (v∥e∥ + VD +δVD ) ⋅∇X ⟨δΦ ⟩ ∂-𝜀 , (149)

where use has been made of that ∂F₀∕∂t = 0 and e_∥⋅∇F₀ = 0. Further noting that v_∥δA_∥∼ δΦ, qδΦ∕T ∼ O(λ¹), ∂F₀∕∂𝜀 ∼−F₀m∕T, V_D∕v_t ∼ O(λ¹), δV_D∕v_t ∼ O(λ¹), and ρ∇F₀ ∼ O(λ¹)F₀, we ﬁnd that the red term of the above equation (after divided by ΩF₀) is of O(λ³), hence can be dropped. Move the second line to the right-hand side, giving

[ ∂ ] -- + (v∥e∥ +VD + δVD )⋅∇X δh ∂t = − δVD ⋅∇XF0 + q-[(v∥e∥ + VD + δVD )⋅∇X (⟨δΦ − v∥δA ∥⟩)]∂F0, (150) m ∂ 𝜀

where ∂⟨v_∥δA_∥⟩∕∂t terms cancel each other. Further note that δV_D (given by Eq. (139)) is perpendicular to ∇_X⟨δΦ − v_∥δA_∥⟩. Therefore the blue term in Eq. (150) is zero, then Eq. (150) simpliﬁes to

[ ] ∂-+ (v e + V +δV ) ⋅∇ δh ∂t ∥ ∥ D D X = − δVD ⋅∇XF0 q ∂F0 +m-[(v∥e∥ + VD )⋅∇X ⟨δΦ − v∥δA∥⟩]∂𝜀-. (151)

Note that in terms of (X,𝜀,μ,α,σ) coordinates, v_∥ is written as

v∥ = σ∘2-𝜀−-2μB0,

(152)

where B₀(x) = B₀(X + ρ) with ρ = ρ(X,𝜀,μ,α). Since the scale length of B₀ is much larger than the thermal Larmor radius, B₀(x) ≈ B₀(X) and hence v_∥ can be approximated as a constant when gyro-angle α changes. Then v_∥ can be taken out of the gyro-averaging in expression (147), yielding

⟨v∥δA∥⟩ ≈ v∥⟨δA ∥⟩.

(153)

Using this, the term related to δA_∥ in (151) can be further written as

(v∥e∥ + VD )⋅∇X ⟨v∥δA∥⟩ = (v∥e∥ + VD )⋅∇X (v∥⟨δA ∥⟩) = ⟨δA ⟩(ve + V )⋅∇ (v) ∥ ∥∥ D X ∥ + v∥(v∥e∥ + VD )⋅∇X ⟨δA∥⟩. (154)

Using expression (152), (v_∥e_∥ + V_D) ⋅∇_X(v_∥) is written as

⟨δA ∥⟩(v∥e∥ + VD )⋅∇X (v∥) ≈ ⟨δA∥⟩(v∥e∥)⋅∇X (v(∥) ) = ⟨δA ⟩(v e )⋅∇ σ∘2-𝜀−-2μB-- ∥ ∥ ∥ X (∘ -------0) = ⟨δA∥⟩σ(v∥e∥)⋅∇X 2𝜀− 2μB0 = ⟨δA∥⟩σv∥− 2√μe∥-⋅∇XB0- 2 2𝜀− 2μB0 − 2μe∥-⋅∇XB0- = ⟨δA∥⟩v∥ 2v∥ = − ⟨δA ⟩μe ⋅∇ B . (155) ∥ ∥ X 0

(We can also obtain ∇_X(v_∥) = −μ(∇B₀)∕v_∥ by using Eq. (372)). Using the above results, equation (151) is written as

[ ] ∂-+ (v e + V + δV )⋅∇ δh ∂t ∥ ∥ D D X q- ∂F0- = − δVD ⋅∇XF0 + m [(v∥e∥ + VD )⋅∇X ⟨δΦ⟩]∂ 𝜀 q- ∂F0- − m [v∥(v∥e∥ + VD )⋅∇X ⟨δA∥⟩− ⟨δA∥⟩(μe∥ ⋅∇XB0 )]∂ 𝜀 , (156)

which agrees with the so-called p_∥ formulation given in the GEM manual (the ﬁrst line of Eq. 28). (It is said that p_∥ formulation uses p_∥ = v_∥ + q⟨δA_∥⟩∕m as an independent variable, but I can not relate this explanation to my derivation given above.)

4.3 Mixed-variable pullback method[4]

Deﬁne δf^(h) by

δf(h) = δf −-q∂F0-⟨v∥δA (h∥)⟩, m ∂𝜀

(157)

where δA_∥^(h) is a part of δA_∥:

δA∥ = δA(∥s)+ δA (h∥),

(158)

with δA_∥^(s) determined by an evolution equation:

∂ δA (s) ----∥- = − b ⋅∇δΦ. ∂t

(159)

Begining with Eq. (144) and following the same procedure in Sec. 4.2, we obtain the following evoluation equation for δf^(h):

[ ∂ ] ∂t + (v∥e∥ + VD + δVD )⋅∇X δf (h) = − δVD ⋅∇XF0 + q-∂F0[(VD + δVD )⋅∇X ⟨δΦ⟩] m ∂𝜀 − -q∂F0-[v∥(v∥e∥ + VD + δVD )⋅∇X ⟨δA(∥h)⟩− ⟨δA (h∥)⟩(μe∥ ⋅∇XB0 )]. (160) m ∂𝜀

In terms of δA_∥^(h), the parallel Ampere’s law is written as

( ) ( ) ∑ ω2 ∑ ( -p2j− ∇2⊥) δA(∥h)= μ0( δJ(|h|j)) + ∇2⊥δA (s∥), j c j

(161)

where δJ_||j^(h) is the parallel current carried by the distrbution funciton δf_j^(h), where the subscript j is species index. Here δA_∥^(s) has been moved to the right-hand side since its value is already known when we solve the Ampere’s law for δA_∥^(h).

Compared with the p_∥ formulation, the improvement of the above procedure is that a part of δA_∥ is solved from an evolution equation (which is chosen by some physical consideration) and the remainder is solved from the Ampere’s law. We hope that A_∥^(s) is the dominant part of δA_∥ so that the remainder solved from the Ampere’s law is small and hence the cancellation problem (of the Ampere’s law) become less signiﬁcant. While a clever choice of the evolution equation for δA_∥^(s) may help ensure that δA_∥^(s) remain dominant over the entire simulation duration, we have another simple way to make δA_∥^(s) remain dominant: collect both δA_∥^(s) and δA_∥^(h) into δA_∥^(s) at the end of each time step:

δA (s) = δA(s) + δA(h). ∥new ∥old ∥old

(162)

Then, to make δA_∥ untouched (so that electromagnetic ﬁeld remain unchanged), we set δA_∥^(h) to zero:

δA (h∥)new = 0.

(163)

Here “old” and “new” refers to before and after the re-spliting, respectively. Note that this re-spliting is made at the end of each time step and does not correspond to any time evolution. This re-splitting keeps the value of δA_∥ untouched and hence does not inﬂuence the electromagnetic ﬁeld. Meanwhile, the deﬁnition of Eq. (157) indicates that, for a given δf, the re-splitting will make the value of δf^(h) increase by Δ = q-
m ⟨v_∥δA_∥old^(h)⟩_α ∂F0
∂𝜀 since δA^(h) is reduced by δA_old^(h). To keep δf unchanged, which is what we need to ensure, we must add Δ to δf^(h):

(h) (h) -q (h) ∂F0- δfnew = δfold + m ⟨v∥δA∥old⟩ ∂𝜀 .

(164)

After the operations in (162) (163) and (164), both the electromagnetic ﬁeld and the distribution function δf remain unchanged. Therefore we do not change any physical state of the system. Hence the above operations can only have eﬀects on numerical round-oﬀ errors. As mentioned above, we hope the numerical eﬀects will reduce the cancellation errors.

Next, consider the special case that F₀ is a local Maxwellian given by

( m )3∕2 [ m 𝜀 ] F0(X,𝜀) = n0(X ) 2πT-(X-)- exp − T-(X)- , 0 0

(165)

Then ∂F₀∕∂𝜀 and ∇_XF₀ are written as

( ) ∂F0-= − m- F0, ∂𝜀 T

(166)

and

∇F0 = ∂F0-∇n0 + ∂F0∇T0 ∂n0 ∂T( ) = F ∇n0-+ F mv2- − 3 ∇T0- 0 n0 0 2T0 2 T0 [ ( mv2 3 ) ] = − F0 κn + 2T0-− 2 κT , (167)

where κ_n = − ∇n0
n0

, and κ_T = − ∇T0
T0

. Consider the case that F₀(X,𝜀) is independent of 𝜃 and α, then κ_n and κ_Tare written as κ_n = κ_n∇r and κ_T = κ_T∇r, where κ_n ≡− -1
n0

, κ_T ≡−

, and r is chosen to be the minor radius one the low ﬁeld side.

Using the above results, Eq. (160) is written as

(h) [ ( 2 ) ] dδf---= δVD ⋅∇ ψdr- κn + mv--− 3 κT F0 − q[(VD + δVD )⋅∇⟨δΦ⟩]F0 dt dψ 2T0 2 T + q-[v∥(v∥e∥ + VD + δVD )⋅∇ ⟨δA (h)⟩− ⟨δA(h)⟩(μe∥ ⋅∇B0 )]F0. (168) T ∥ ∥

Weight evolution equation

The weight of the j th marker is deﬁned by

(h) wj = δf (Xj,vj,t)Vpj0,

(169)

where V _pj0 is the phase space volume occupied by the jth marker. Multiplying both sides of Eq. (168) by V _pj0, and noting that d(V _pj0)∕dt = 0, we obtain

dwj dr[ (mv2 3) ] q -dt-= δVD ⋅∇ ψdψ- κn + -2T- − 2 κT Vpj0F0 − T-[(VD + δVD ) ⋅⟨∇ δΦ⟩]Vpj0F0. q 0 (h) (h) + T-[v∥(v∥e∥ + VD + δVD )⋅⟨∇ δA∥ ⟩− ⟨δA∥ ⟩(μe∥ ⋅∇B0 )]Vpj0F0, (170)

where use has been made of ∇⟨δΦ⟩ = ⟨∇δΦ⟩ and ∇⟨δA_∥^(h)⟩ = ⟨∇δA_∥^(h)⟩.

In ﬁeld-aligned coordinates

In ﬁeld-aligned coordinates (x,y,z), ∂δΦ
∂x can be written as

∂δΦ- ∂δΦ- ∂δΦ- ∂δΦ- ∂x = ∂x ∇x + ∂y ∇y + ∂z ∇z.

(171)

Note that the direction of ∂r∕∂x at 𝜃 = −π is diﬀerent from that of ∂r∕∂x at 𝜃 = +π. This implies that ∂δΦ∕∂x is a non-periodic function of 𝜃. Similarly, ∇y is also a non-perioidic functon of 𝜃. Then both ∂δΦ∕∂x and ∇y are discontinuous across the 𝜃 cut (𝜃 = ±π).

The gyro-average of expression (171) is written as

∂δΦ ∂δΦ ∂δΦ ∂δΦ ⟨-∂x-⟩ ≈ ⟨-∂x-⟩∇x + ⟨-∂y-⟩∇y + ⟨∂z-⟩∇z,

(172)

where we approximate ∇x, ∇y, and ∇z as constants when performing the gyro-averge since they are determined by the equilibrim magnetic ﬁeld, which is nearly constant on the Larmor radius scale. As is mentioned above, ∇y is not continuous at the 𝜃 cut. Do we need to worry about this? We do not. This is because we must stick to the same branch when we perform gryo-average on it, and hence ∇y is always continous. Then, do we need to worry about the disconunity of ∂δΦ∕∂x across the 𝜃 cut when performing the gyro-averge on it? We do not either. The reason is the same: we must stick to a single branch. The disconunity is just irrelevant here. The discontinuity only manifest itself when we need to infer value on 𝜃 = −π from that on 𝜃 = +π (vice versa), i.e., when across branch communication is explicitly needed. In TEK, the ﬁeld equations are not solved at 𝜃 = +π and hence the ﬁeld values are not directly obtained. Instead, the ﬁeld values at 𝜃 = +π are infered from the ﬁeld values at 𝜃 = −π using the coninuity of ∇δΦ. At the 𝜃 cut, the continuity of ∇δΦ requires

⟨∂-δΦ-⟩+∇x + ⟨∂δΦ⟩∇y+ = ⟨∂δΦ⟩− ∇x + ⟨∂δΦ-⟩∇y − , ∂x ∂y ∂x ∂y

(173)

where the superscript “+” and “−” refer to the location 𝜃 = +π and 𝜃 = −π, respectively. Dotting the above by ∇x, we obtain

∂ δΦ ∂ δΦ ∂δΦ ∇y− ⋅∇x − ∇y+ ⋅∇x ⟨-∂x-⟩+ = ⟨∂x-⟩− + ⟨-∂y-⟩------|∇x-|2-------,

(174)

which is used in TEK to infer the value of ⟨ ∂δΦ
∂x ⟩⁺ from that of ⟨⟩⁻.

Similarly, for ∂δA_∥^(h)∕∂x, we obtain:

∂δA(∥h) ∂δA(∥h) ∂δA(∥h) ∂δA(∥h) ------= -----∇x + -----∇y + -----∇z. ∂x ∂x ∂y ∂z

(175)

(h) (h) (h) (h) ∂δA∥-- ∂δA∥-- ∂δA∥-- ∂δA-∥- ⟨ ∂x ⟩ ≈ ⟨ ∂x ⟩∇x + ⟨ ∂y ⟩∇y + ⟨ ∂z ⟩∇z

(176)

Using these, Eq. (170) is written as

dwj dr [ (mv2 3) ] -dt-= δVD ⋅∇ ψdψ- κn + 2T0-− 2 κT Vpj0F0 [ ′ ′ ′ ] − q- ⟨∂δΦ⟩dX--⋅∇x + ⟨∂δΦ-⟩dX--⋅∇y + ⟨∂δΦ-⟩dX--⋅∇z Vpj0F0 T ∂(x dt ∂y dt ∂z dt ) q ∂δA (h∥) dX ∂ δA (∥h) dX ∂ δA (h∥) dX + --v∥(⟨------⟩--- ⋅∇x +⟨------⟩--- ⋅∇y +⟨------⟩--- ⋅∇z) Vpj0F0 T ∂x dt ∂y dt ∂z dt q- (h) − T ⟨δA ∥ ⟩(μb ⋅∇B0 )Vpj0F0, (177)

where

≡ v_∥b + V_D + δV_D and ′
dXdt

≡ V_D + δV_D. Deﬁne

-- q δΦ --(h) quvuδA (h) δΦ = -u---,δA ∥ = ------∥-, Tu Tu

(178)

Then, a typical term of the above is written as

-- q ∂δA(∥h)dX q Tu - ∂ δA(∥h) dX- -- T-v∥⟨-∂x--⟩-dt ⋅∇xtn = T-q-v-vnv∥⟨-∂x--⟩-dt ⋅∇x uu -(h) -- q∕qu-vn- ∂δA∥-- dX- -- = T∕Tu vuv∥⟨ ∂x ⟩dt ⋅∇x,

where

is the normlizing factor needed when coding. The mirror force term:

q⟨δA(h)⟩(μb ⋅∇B0)tn = q-Tu--Bnμn-⟨δA-(h∥)⟩(μb ⋅∇B0-)tn (179) T ∥ T quvu Ln q∕qu-1-Bnv2n -(h) -- ---- = T∕Tu vuLnBn ⟨δA∥ ⟩(μb ⋅∇ B0)tn q∕qu vn --(h) -- ---- = T∕T--v-⟨δA ∥ ⟩(μb⋅∇ B0) u u

where

is the normlizing factor needed when coding.

The pullback:

( ) q⟨v∥δA(h)⟩∂F0-Vp = q-⟨v∥δA(h)⟩ − m- F0Vp m ∥ ∂𝜀 m ∥ T ≈ − q-v∥⟨δA (h∥)⟩F0Vp T = − q∕qu-v∥⟨δA(∥h)⟩F0Vp T∕Tu vu

The δE × B₀ drift term is written as

∂δΦ ( ) − ⟨∂x-⟩×-B0-= − 1-- ⟨∂δΦ-⟩∇x + ⟨∂δΦ-⟩∇y +⟨∂-δΦ-⟩∇z × B0 B20 B20 ∂x ∂y ∂z ∂-δΦ -1- ∂δΦ- -1- ∂δΦ- 1-- = − ⟨∂x ⟩B20 ∇x × B0 − ⟨ ∂y ⟩B20∇y × B0 − ⟨ ∂z ⟩B20∇z × B0.

The ∇x, ∇y, and ∇z components of the above expression are written as

⟨∂∂δΦx-⟩×-B0- ∂δΦ- -1- ∂δΦ- 1-- − B20 ⋅∇x = − ⟨ ∂y ⟩B20∇y ×B0 ⋅∇x − ⟨ ∂z ⟩B20∇z × B0 ⋅∇x ∂δΦ 1 ∂δΦ 1 = − ⟨-∂y-⟩B2∇x × ∇y ⋅B0 − ⟨-∂z-⟩B2∇x × ∇z ⋅B0. (180) 0 0

∂δΦ − ⟨-∂x-⟩×2-B0-⋅∇y = − ⟨∂δΦ⟩-12∇x × B0 ⋅∇y − ⟨∂-δΦ-⟩-12-∇z ×B0 ⋅∇y B0 ∂x B 0 ∂z B0 = ⟨∂δΦ-⟩ 1-∇x × ∇y ⋅B − ⟨∂δΦ⟩-1-∇y × ∇z ⋅B (181) ∂x B20 0 ∂z B20 0

⟨∂δ∂xΦ⟩×-B0- ∂δΦ--1- ∂-δΦ- -1- − B20 ⋅∇z = − ⟨∂x ⟩B20∇x × B0 ⋅∇z − ⟨ ∂y ⟩B20 ∇y ×B0 ⋅∇z ∂δΦ 1 ∂δΦ 1 = ⟨-∂x-⟩B2∇x × ∇z ⋅B0 + ⟨-∂y ⟩B2-∇y × ∇z ⋅B0 (182) 0 0

Using

dx --= VG ⋅∇x, dt

(183)

i.e.,

dx-= tnVG ⋅∇x dt

(184)

To get the normalizing factor, consider only the contribution from E ×B drift. Then a typical term is written as

dx ∂δΦ 1 dt = − tn⟨∂y-⟩B2-∇x × ∇y ⋅B0 0 -- = − Tu-tn--⟨∂δΦ⟩-1-∇x × ∇y ⋅B0 qu BnL2n ∂y B20 T 1 ∂δΦ- 1 -- -- -- = −-u--------⟨---⟩--2∇x × ∇y ⋅B0, (185) qu BnvnLn ∂y B 0

where T_u∕(q_uv_nB_nL_n) is the normlizing factor we need when coding.

The magnetic ﬂuttering term is written as

⟨ ( )⟩ − v∥⟨b × ∇ (δA )⟩ = − v∥ b × ∂δA-∥∇x + ∂δA-∥∇y + ∂δA∥∇z B0 x ∥ B0 ∂x ∂y ∂z v∥- ( ∂δA-∥ ∂δA∥- ∂δA∥- ) ≈ − B0b × ⟨ ∂x ⟩∇x + ⟨ ∂y ⟩∇y + ⟨ ∂z ⟩∇z

The ∇x, ∇y, and ∇z components of the above expression are written as

v∥ v∥ ( ∂ δA ∥ ∂δA∥ ) − B-⟨b × ∇x(δA∥)⟩⋅∇x = − B--b× ⟨--∂y-⟩∇y + ⟨-∂z-⟩∇z ⋅∇x 0 0( ) = v∥2- ⟨∂δA∥⟩∇x × ∇y + ⟨∂δA∥-⟩∇x × ∇z ⋅B0 (186) B0 ∂y ∂z

( ) − v∥⟨b × ∇x(δA )⟩⋅∇y = − v∥-b× ⟨∂-δA-∥⟩∇x + ⟨∂δA∥⟩∇z ⋅∇y B0 ∥ B0 ∂x ∂z v∥( ∂δA∥- ∂δA∥- ) = B20 ⟨ ∂x ⟩∇y × ∇x + ⟨ ∂z ⟩∇y × ∇z ⋅B0 (187)

( ) − v∥⟨b × ∇ (δA )⟩⋅∇z = − v∥-b× ⟨∂-δA-∥⟩∇x + ⟨∂δA∥⟩∇y ⋅∇z B0 x ∥ B0 ∂x ∂y v∥( ∂δA∥- ∂δA∥- ) = B2 ⟨ ∂x ⟩∇z × ∇x + ⟨ ∂y ⟩∇z × ∇y ⋅B0 (188) 0

Also to get the normalizing factor needed in the code, consider a typical term:

dx v ∂δA ---= tn-∥2⟨---∥⟩∇x × ∇y ⋅B0 dt B 0 ∂y - -- = Tu--t--vn- v∥⟨∂δA-∥⟩∇x × ∇y ⋅B- quvu nL2nBn B2 ∂y 0 - 0 -- -- = Tu----1--v∥2⟨∂δA∥⟩∇x × ∇y ⋅B0 (189) quvu LnBn B 0 ∂y

where T_u∕(q_uv_uB_nL_n) is the normlizing factor I need when coding.

The basic terms involved in the above expressions can be written as

B2 (∇x × ∇y )⋅B0 = -0′ , Ψ

(190)

( ) (∇x ×∇z )⋅B0 = ∂x-∂z-− ∂x-∂z- Bϕ ≡ W 14 ∂Z ∂R ∂R ∂Z

(191)

-- ( ) -- ( ) -- ( ) (∇y × ∇z)⋅B0 = BR 1-∂α-∂𝜃- + B ϕ ∂α-∂𝜃-− ∂α-∂-𝜃 + BZ − 1-∂α-∂𝜃- R ∂ϕ ∂Z ∂Z ∂R ∂R ∂Z R ∂ϕ ∂R ≡ W 15 (192)

Deﬁne

-- q δΦ -(h) quvuδA(h) --(s) quvuδA (s) -(h) δJ(h) δΦ = -u--,δA∥ = ------∥--,δA ∥ = ------∥-,δJ||s = ---∥s--, Tu Tu Tu nuquvns

(193)

where q_u,n_u,T_u,v_u are chosen units, v_ns is a velocity unit for a species, whose value is diﬀerent among diﬀerent species [v_ns = L_n∕(2π∕(B_n|q_s|∕m_s))]. In TEK, v_u is chosen as v_ns of the ﬁrst ion species.

Then the Ampere’s law (161) is written as

( ) ∑ ω2ps 2 -(h) nuq2uv2u ∑ vns -(h) 2 --(s) c2-− ∇ ⊥ δA∥ = 𝜀-T-c2 -v-δJ||s + ∇⊥ δA ∥ , s 0 u s u

(194)

where use has been made of μ₀ = 1∕(c²𝜀₀).

δf pullback in Eq. (164) is now written as (for the case of F₀ being Maxwellian):

(h) (h) q- (h) ( m-) δfnew = δfold + m⟨v∥δA∥old⟩α − T F0 = f(h)− q⟨v∥δA(h)⟩αF0 old T ∥old = f(h)− qv∥vn⟨δA-(h)⟩αF0 (195) old T vu ∥old

In terms of units used in TEK, Eq. (159) is written as

= −

b ⋅∇δΦ,

where t = t∕t_u, t_u = L_n∕v_u. Then the above equation is written as

∂ δA-(s) ---- ----∥- = − b ⋅∇δΦ, ∂t

(196)

i.e.,

--(s) -- ∂-δA-∥- = − (b ⋅∇z )∂δΦ-. ∂t ∂z

(197)

4.4 Summary of distribution function split

In simulations, the seemingly trivial thing on how to split the distribution function is often a big deal. Separating the perturbation from the total distribution function gives rise to the the famous “δf particle method”, in contrast to the conventional particle method which is now called full-f particle method.

In the above, the perturbed part of the distribution function, δF, is split at least three times in order to (1) simplify the gyrokinetic equation by splitting out the adiabatic response; (2) eliminate the time derivatives, ∂δϕ∕∂t and ∂δA∕∂t, on the right-hand. To avoid confusion, I summarize the split of the distribution function here. The total distribution function F is split as

F = F0 +δF,

(198)

where F₀ is the equilibrium distribution function and δF is the perturbed part of the total distribution function. δF is further split as

δF = δh + -q(δΦ− ⟨δΦ⟩)∂F0-+ q-⟨v⋅δA ⟩∂F0, m ∂𝜀 m ∂𝜀

(199)

where δh satisﬁes the gyrokinetic equation (151) or (156). In PIC simulations, δh is evolved by using markers and its moment in the phase-space is evaluated via Monte-Carlo integration. The blue and red terms explicitly depends on the unknown perturbed ﬁeld. After being integrated in the velocity space, these two terms give the polarization density and the skin current, respectively. The polarization density is discussed in Sec. 6. The skin current is discussed in Sec. (4.5) and the so-called “cancellation problem” is discussed in Sec. 5.

4.5 Skin current

Let us calculate the velocity space moment of q-
m ⟨v ⋅δA⟩ ∂F0
∂𝜀 . Consider the approximation δA ≈ δA_∥e_∥, then the blue term in Eq. (199) is written as

q ∂F0 m-⟨v∥δA∥⟩-∂𝜀 .

(200)

Notice that v_∥ can be taken out of the gyro-averaging. Then the above equation is written

-qv∥⟨δA∥⟩∂F0. m ∂𝜀

(201)

If we neglect the FLR eﬀect, then the above expression is written

q ∂F0 m-v∥δA∥-∂𝜀-.

(202)

The zeroth order moment (number density) is then written as

∫ q- ∂F0- δn = m δA∥ v∥∂ 𝜀 dv,

(203)

which is zero if F₀ is Maxwellian. Next, consider the parallel current carried by distribution (202), which is written

q2 ∫ 2∂F0 δj∥ = m-δA ∥ v∥-∂𝜀-dv.

(204)

If F₀ is a Maxwellian distribution given by

( m )3∕2 ( mv2) F0 = n0 2πT- exp − 2T-- .

(205)

then

∂F0- m- ∂𝜀 = −T F0.

(206)

Then expression (204) is written

q2 ( m )3∕2 ∫ ( mv2) δj∥ = − T-n0 2πT- δA ∥ v2∥exp − 2T-- dv.

(207)

Working in the spherical coordinates, then v_∥ = v cos𝜃 and dv = v² sin𝜃dvd𝜃dϕ. Then expression (207) is written

( ) ∫ ( ) δj = − q2n -m-- 3∕2δA v2cos2𝜃exp − mv2- v2sin𝜃dvd𝜃dϕ ∥ T 0 2πT ∥ 2T q2 ( m )3∕2 4π∫ 4 ( mv2 ) = − T-n0 2πT- δA ∥3-- v exp −-2T- dv 2 ( )3 ∕2 ∫ = − q-n0 1- 2TδA 4π- x4exp(− x2)dx T π m ∥3 q2 ( 1)3 ∕2 2T 4π3√ π- = − T-n0 π- m-δA ∥3---8- 2 = − q-n0δA∥. (208) m

Using c = 1∕ √----
μ0𝜀0

and ω_p² = n₀q²∕(m𝜀₀), the above expression can be written as

ω2p- μ0δj∥ = − c2 δA ∥,

(209)

where the c∕ω_ps is often called the skin depth of species “s”. This current δj_∥ is often called “skin current”, which can also be written as

β μ0δj∥ = −-2δA ∥, ρ

(210)

where ρ = ∘ -----
2T∕m ∕Ω is the thermal Larmor radius.

5 Parallel Ampere’s Law

2 (n+1) (n+1) (n+1) − ∇⊥ δA∥ = μ0(δJ ||i + δJ||e ),

(211)

where the parallel currents are given by

∫ δJ(n+1)= δJ ′(δϕ(n),δA(n)) + (v(n+1))2 q2i-⟨δA (n+1)⟩ ∂Fi0dv, ||i ||i ∥ ∥ mi ∥ α ∂𝜀

(212)

∫ q2 ∂F δJ||e = δJ′||e(δϕ(n),δA (n∥))+ (v(n∥+1))2-e⟨δA (n∥+1)⟩α --e0dv, me ∂𝜀

(213)

where δJ_∥i′ and δJ_∥e′ is the parallel current carried by the distribution function δh in Eq. (199), which are updated from the value at the nth time step using an explicit scheme and therefore does not depends on the ﬁeld at the (n + 1)th step. The blue terms in Eqs. (212) and (213) are called “skin current”, which depend on the unknown ﬁeld at the (n + 1)th step and thus need to be moved to the left-hand side of Ampere’s law (211) if we want to solve this equation by direct methods. In this case, equation (211) is written as

∫ − ∇2 δA(n+1)− μ (v(n+1))2 q2i⟨δA(n+1)⟩ ∂Fi0-dv ⊥ ∥ 0 ∥ mi ∥ α ∂ 𝜀 ∫ (n+1) 2 q2e (n+1) ∂Fe0 − μ0 (v∥ ) me⟨δA ∥ ⟩α ∂𝜀 dv. = μ (δJ ′+ δJ′ ) (214) 0 ∥i ||e

Then we need to put the blue terms into matrix form. If we put the bule terms into martrix form by using numerical spatial grid integration (as we do for the polarization density), then there arises the cancellation propblem (i.e., the two parts of the distribution are evaluated by diﬀerent methods,one is grid-based and the other is MC marker based, there is a risk that the sum of the two terms will be inaccurate when the two terms are of opposite signs and large amplitudes, and the ﬁnal result amplitude is expected to be much smaller than the amplituded of the two terms). If we get the matrix form by evaluating it numerically using MC markers (which can avoid the cancellation problem), the corresponding matrix will depends on markers and thus needs to be re-constructed and inverted each time-step, which is computationally expensive.

Therefore we go back to Eq. (211) and try to solve it using iterative methods. However, it is found numerically that directly using Eq. (211) as an iterative scheme is usually divergent. To obtain a convergent iterative scheme, we need to have an approximate form for the blue terms, which is independent of markers and so that it is easy to construct its matrix, and then subtract this approximate form from both sides. After doing this, the iterative scheme has better chance to be convergent (partially due to that the right-hand side becomes smaller). An approximate form is that derived by neglecting the FLR eﬀect given in Sec. 4.5. Using this, the iterative scheme for solving Eq. (211) is written as

( ) 2 (n+1) ω2pi (n+1) ω2pe (n+1) − ∇ ⊥δA∥ − − c2 δA∥ − c2 δA∥ ′ ′ = μ0(∫δJ∥i + δJ||e) (n+1) 2 q2i (n+1) ∂Fi0 + μ0 (v∥ ) mi⟨δA∥ ⟩α ∂𝜀 dv ∫ (n+1) q2 (n+1) ∂Fe0 + μ0 (v∥ )2me⟨δA∥ ⟩α -∂𝜀-dv ( 2 e 2 ) − − ωpiδA(n+1)− ωpeδA(n+1) . (215) c2 ∥ c2 ∥

In the drift-kinetic limit (i.e., neglecting the FLR eﬀect), the blue and red terms on the right-hand side of the above equation cancel each other exactly. Even in this case, it is found numerically that these terms need to be retained and the blue terms are evaluated using markers. Otherwise, numerical inaccuracy can give numerical instabilities, which is the so-called cancellation problem. The explanation for this is as follows. The blue terms are part of the current. The remained part of the current carried by δh is computed by using Monte-Carlo integration over markers. If the blue terms are evaluated analytically, rather than using Monte-Carlo integration over markers, then the cancellation between this analytical part and Monte-Carlo part can have large error (assume that there are two large contribution that have opposite signs in the two parts) because the two parts are evaluated using diﬀerent methods and thus have diﬀerent accuracy, which makes the cancellation less accurate.

Because the ion skin current is less than its electron counterpart by a factor of m_e∕m_i, its accuracy is not important. The cancellation error is not a problem and hence can be neglected. In this case, equation (215) is simpliﬁed as

( 2 2 ) − ∇2⊥δA(n+1)− − ωpiδA(n+1)− ωpeδA(n+1) ∥ c2 ∥ c2 ∥ = μ0(δJ′ + δJ′ ) ∫ ∥i ||e2 + μ0 (v(n+1))2 qe-⟨δA (n+1)⟩α∂Fe0dv ( ∥ me) ∥ ∂𝜀 ω2pe (n+1) − − -c2 δA ∥ . (216)

Note that the blue term will be evaluated using Monte-Carlo markers.

5.1 Discretizing Laplacian operator

In TEK, the Laplacian operator is approximated as

2 2 2 ∇2⊥ δA∥ ≈ ∂-δ2A∥|∇x|2 + ∂-δA2-∥|∇y |2 + ∂-δA∥2∇x ⋅∇y. ∂ x ∂ y ∂x∂y

(217)

To approximate δA_∥ of zero boundary condition A_∥(x = 0) = A_∥(x = L_x) = 0, the sine expansion can be adopted. In the y direction, full Fourier expansion is needed. Then, at each value of z, A_∥(x,y,z) is approximated by the following two-dimensional expansion:

N ∕2 ( ) ( ) ∑y 2π- Nx∑−1 m-π δA ∥(x,y,z) ≈ exp inLyy Am,n(z)sin Lx (x− x0) , n=− Ny∕2 m=1

(218)

Using this expression, Eq. (217) is written as

2 N∑y∕2 ( 2π ) Nx∑−1 − ∇ ⊥δA∥ = exp in L-y δAm,n(z) n{=[−Ny∕2 y m=1 ] } (m π)2 ( 2π)2 ( mπ ) 2π ( mπ ) ( m π ) × L-- |∇x |2 + n L-- |∇y |2 sin L--(x− x0) − inL-- L-- 2∇x ⋅∇ycos L--(x− x0) x y x y x x

At x = x_j, with x_j − x₀ = jΔ and L_x∕Δ = N_x, the above expression is written as

N∑y∕2 ( ) Nx∑−1 − ∇2⊥δA∥ = exp in 2πy δAm,n(z) n=−Ny∕2 Ly m=1 {[ ( )2 ( )2 ] ( ) ( ) ( )} × m-π |∇x |2 + n 2π- |∇y |2 sin jmπ- − in2π m-π 2∇x ⋅∇y cos jm-π(219,) Lx Ly Nx Ly Lx Nx

where ∇x and ∇y are evaluated at x_j. Plugging the DST coeﬃcients

( ) -2-Nx∑− 1 j′m-π- δAm,n(z) = Nx ′ An(xj′,z)sin Nx , j=1

(220)

into expression (219) gives

Ny∕2 ( ) N −1 N −1 ( ) − ∇2 δA = ∑ exp in 2πy x∑ -2- ∑x A (x ′,z)sin j′m-π- ⊥ ∥ Ly m=1 Nx j′=1 n j Nx n{=[−N(y∕2) ( ) ] ( ) ( ) ( )} m-π 2 2 2π-2 2 jmπ- 2π- m-π jm-π × Lx |∇x | + n Ly |∇y | sin Nx − inLy Lx 2∇x ⋅∇y cos Nx

For a single toroidal harmonic, the above expression reduces to

Nx∑−1 N∑x−1 ( ′ ) -2- An (xj′,z)sin jm-π- m=1 Nx j′=1 Nx { [( )2 ( )2 ] ( ) ( ) ( ) } × mπ- |∇x|2 + n 2π- |∇y |2 sin jm-π − in 2π- m-π 2∇x ⋅∇y cos jm-π (221) Lx Ly Nx Ly Lx Nx

Deﬁne

N∑x−1 2 (j′m π) Mjj′,n ≡ N--sin -N--- m{=[1 x x ] } ( m π)2 2 ( 2π )2 2 (jm π) 2π ( mπ ) ( jmπ ) × Lx- |∇x | + nLy- |∇y| sin -Nx- − inLy- Lx- 2∇x ⋅∇y cos Nx--

then expression (221) is written as

Nx∑ −1 Mjj′,nAn (xj′,z). j′=1

(222)

6 Poisson’s equation and polarization density

Poisson’s equation is written as

− 𝜀0∇2δΦ = qiδni + qeδne,

(223)

where −𝜀₀∇²δΦ is called the space-charge term. Since we consider modes with k_∥≪ k_⊥, the space-charge term is approximated as ∇²δΦ ≡∇_⊥²δΦ + ∇_∥²δΦ ≈∇_⊥²δΦ. Then Eq. (223) is written as

− 𝜀0∇2⊥ δΦ = qiδni + qeδne.

(224)

This approximation eliminates the parallel plasma oscillation from the system. The perpendicular plasma oscillations seem to be only partially eliminated in the system consisting of gyrokinetic ions and drift-kinetic electrons. There are the so-called Ω_H modes (also called electrostatic shear Alfven wave) that appear in the gyrokinetic system which have some similarity with the plasma oscillations but with a much smaller frequency, Ω_H ∼ (k_∥∕k_⊥)(λ_D∕ρ_s)ω_pe.

Using expression (199), the perturbed number density δn is written as

∫ δn = δF dv [ ] [ ] ∫ ∫ -q ∂F0- ∫ q- ∂F0- = δhdv + m (δΦ − ⟨δΦ ⟩α) ∂𝜀 dv + m ⟨v⋅δA ⟩α∂ 𝜀 dv, (225)

where the blue term is approximately zero for isotropic F₀ and this term is usually dropped in simulations that assume isotropic F₀ and approximate δA as δA_∥e_∥. The red term in expression (225) is the so-called the polarization density n_p, i.e.,

∫ q ∂F δnp(x) = --(δΦ − ⟨δΦ⟩α)---0dv, m ∂𝜀

(226)

which has an explicit dependence on δΦ and is usually moved to the left hand of Poisson’s equation when constructing the numerical solver of the Poisson equation, i.e., equation (224) is written as

∫ q ∂F − 𝜀0∇2⊥δΦ − qi -i-(δΦ − ⟨δΦ⟩α)--i0dv = qiδn′i + qeδne, mi ∂𝜀

(227)

where δn_i′ = δn_i −δn_pi = ∫ δh_idv, which is evaluated by using Monte-Carlo markers. Since some parts depending on δΦ are moved from the right-hand side to the left-hand side of the ﬁeld equation, numerical solvers (for δΦ) based on the left-hand side of Eq. (227) probably behaves better than the one that is based on the left-hand side of Eq. (224), i.e., −𝜀₀∇_⊥²δΦ.

6.1 Discussion on cancellation scheme

The polarization density is part of the perturbed density that is extracted from the source term and moved to the left-hand side of the Poisson equation. The polarization density will be evaluated without using Monte-Carlo markers, whereas the remained density on the right-hand side will be evaluated using Mote-Carlo markers. The two diﬀerent methods of evaluating two parts of the total perturbed density can possibly introduce signiﬁcant errors if the two terms are expected to cancel each other and give a small quantity that is much smaller than either of the two terms. This is one pitfall for PIC simulations that extract some parts from the source term and move them to the left-hand side. To remedy this, rather than directly moving a part of the distribution function to the left-hand side, we subtract an (approximate) analytic expression from both sides of Eq. (224). The analytical expressions on both sides are evaluated based on grid values of perturbed electromagnetic ﬁelds and are independent of makers. All the original parts of the distribution functions are kept on the right-hand side and are still evaluated by using markers, which hopefully avoids the possible cancellation problem. This strategy is often called a cancellation scheme. Since unknown perturbed electromagnetic ﬁelds appear on the right-hand side, iteration is needed to solve the ﬁeld equation.

Note that two things appear here: What motivates us to move parts of the distribution function to the left? It is the goal of hopefully making the left-hand side matrix more well-behaved (such as good condition number, etc.) Why do we need the cancellation scheme? Because we want to avoid the numerical inaccuracy that appears when large terms cancels each other. Note that iteration is needed when the cancellation scheme is used because the right-hand side explicitly contains unknown electromagnetic ﬁelds.

It turns out that the cancellation scheme is not necessary for Eq. (227), but for the ﬁeld solver for Ampere’s equation (discussed later), this cancellation scheme is necessary in order to obtain stable results.

7 Polarization density expressed as local Fourier expansion

Since δΦ is independent of the velocity in the particle coaordinates, the ﬁrst term (adiabatic term) in expression (226) is trivial and the velocity integration can be readily performed (assume F₀ is Maxwellian), giving

∫ -q ∂F0- δnad = m (δΦ) ∂𝜀 dv. q ∫ ( m ) = m-(δΦ) − T-fM dv. qδΦ = − ---n0, (228) T

which is called adiabatic response. Next, let us perform the gyro-averaging and the velocity integration of the second term in expression (226), i.e.,

∫ q ∂F − --⟨δΦ⟩α--0-dv. m ∂𝜀

(229)

7.1 Gyro-averaging of δΦ in guiding-center coordinates

In order to perform the gyro-averaging of δΦ, we Fourier expand δΦ in space as

∫ dk δΦ(x) = δΦkexp(ik ⋅x)----3, (2π)

(230)

and then express x in terms of the guiding center variables (X,v) since the gyro-averaging is taken by holding X rather than x constant. The guiding-center transformation gives

x = X + ρ(x,v ) ≈ X − v× e∥(X-). Ω(X )

(231)

Using expressions (230) and (231), the gyro-average of δΦ is written as

⟨∫ ⟩ ⟨δΦ ⟩α = δΦkexp(ik ⋅x)-dk-3 ⟨∫ ( ((2π) α ) ) ⟩ e∥(X)- -dk-- = δΦkexp ik ⋅ X − v × Ω (X ) (2π)3 α ∫ ⟨ ( e (X ))⟩ dk = δΦk exp (ik ⋅X ) exp − ik ⋅v × -∥---- ----3. (232) Ω(X ) α(2π)

When doing the gyro-averaging, X is hold constant and thus e_∥(X) is also constant. Then it is straightforward to deﬁne the gyro-angle α. Let k_⊥ deﬁne one of the perpendicular direction

₁, i.e., k_⊥ = k_⊥

₁. Then another perpendicular basis vector is deﬁned by

₂ = e_∥×

₁. Then v_⊥ is written as v_⊥ = v_⊥(

₁ cosα +

₂ sinα), which deﬁnes the gyro-angle α. Then the blue expression in Eq. (232) is written as

e∥(X-)- e∥(X-)- − ik ⋅v× Ω(X ) = − ik ⋅v⊥(ˆe1cosα + ˆe2sinα )× Ω (X ) v⊥ = − ik ⋅Ω-(X)(− ˆe2 cosα + ˆe1sin α) k v = − i-⊥-⊥-sinα. (233) Ω

Then the gyro-averaging in expression (232) is written as

⟨ ( e∥(X ))⟩ ⟨ ( k⊥v⊥ )⟩ exp − ik ⋅v× Ω(X-)- = exp − i-Ω---sinα α ∫ 2π ( α ) = 1-- exp − ik⊥v⊥-sinα dα 2π 0 Ω (k⊥v⊥-) = J0 Ω . (234)

where use has been made of the deﬁnition of the zeroth Bessel function of the ﬁrst kind. Then ⟨δΦ⟩_α in expression (232) is written as

∫ ( ) k⊥v⊥- --dk- ⟨δΦ ⟩α = δΦk exp (ik ⋅X )J0 Ω (2π)3.

(235)

7.2 Gryo-angle integration in particle coordinates

Next, we need to perform the integration in velocity space, which is done by holding x (rather than X) constant. Therefore, it is convenient to transform back to particle coordinates. Using X = x + v × e∥(x)
Ω (x) , expression (235) is written as

∫ ( ) ( ) ⟨δΦ⟩α = δΦkexp(ik⋅x)J0 k⊥v⊥- exp ik ⋅v × e∥ -dk-. Ω Ω (2π)3

(236)

Then the velocity integration is written as

∫ ∂F0 ⟨δΦ⟩α-∂𝜀-dv ∫ [∫ ( k v ) ( e ) ∂F ] dk = δΦk exp(ik ⋅x) J0 -⊥-⊥- exp ik⋅v × -∥ --0dv ---3. (237) Ω Ω ∂𝜀 (2π )

Similar to Eq. (233), except for now at x rather than X, ik ⋅ v × e∥
Ω

is written as

e∥ k⊥v⊥ ik ⋅v× Ω-= i--Ω--sin α.

(238)

Since this is at x rather than X, k_⊥, v_⊥, and Ω are diﬀerent from those appearing in expression (233). However, since this diﬀerence is due to the variation of the equilibrium quantity e_∥∕Ω in a Larmor radius, and thus is small and is ignored in the following.

Plugging expression (238) into expression (237) and using dv = v_⊥dv_⊥dv_∥dα, we get

∫ ⟨δΦ⟩α∂F0-dv ∫ ∂𝜀 [∫ ( ) ( ) ] = δΦ exp(ik⋅x) J k⊥v-⊥ exp ik⊥v-⊥sinα ∂F0-v dv dv dα -dk-. (239) k 0 Ω Ω ∂𝜀 ⊥ ⊥ ∥ (2π)3

Note that ∂F₀∕∂𝜀 is independent of the gyro-angle α in terms of guiding-center variables. When transformed back to particle coordinates, X contained in ∂F₀∕∂𝜀 will introduce α dependence via X = x + v × e∥Ω-

. This dependence on α is weak since the equilibrium quantities can be considered constant over a Larmor radius distance evaluated at the thermal velocity. Therefore this dependence can be ignored when performing the integration over α, i.e., in terms of particle coordinates, ∂F₀∕∂𝜀 is approximately independent of the gyro-angle α. Then the integration over α in Eq. (239) can be performed, yielding

∫ ⟨δΦ⟩α∂F0-dv ∂𝜀 ∫ ∫ ∫ (k⊥v-⊥) [∫ 2π ( k⊥v-⊥ ) ] ∂F0- -dk-- = δΦkexp(ik ⋅x) J0 Ω 0 exp i Ω sinα dα ∂ 𝜀 v⊥dv⊥dv ∥(2π )3 ∫ [∫ ∫ (k v ) ( k v )∂F ] dk = δΦkexp(ik ⋅x) J0 -⊥--⊥ 2πJ0 -⊥-⊥- ---0v⊥dv⊥dv∥ ----3, (240) Ω Ω ∂𝜀 (2π)

where again use has been made of the deﬁnition of the Bessel function.

The remaining velocity integration can be performed analytically if F₀ is Maxwellian

In order to perform the remaining velocity integration in expression (240), we assume that F₀ is a Maxwellian distribution given by

( ) ----n0(X-)----- −-mv2- F0 = fM = (2πT (X)∕m )3∕2 exp 2T (X ) (241) n0 ( − v2) = ----3∕23 exp 2v2- , (242) (2π) vt t

where v_t = ∘ ----
T∕m

, then

∂F m ---0= − --fM . ∂𝜀 T

(243)

Again we will ignore the weak dependence of n₀(X) and T(X) on v introduced by X = x + v × e_∥∕Ω when transformed back to particle coordinates. (For suﬃciently large velocity, the corresponding Larmor radius will be large enough to make the equilibrium undergo substantial variation. Since the velocity integration limit is to inﬁnite, this will deﬁnitely occur. However, F₀ is exponentially decreasing with velocity, making those particles with velocity much larger than the thermal velocity negligibly few and thus can be neglected.)

Parallel integration Using Eq. (243), the expression in the square brackets of Eq. (240) is written as

∫ ∫ ( k v ) ∂F 2π J20 -⊥-⊥- ---0v⊥dv⊥dv∥ Ω∫ ∫ ∂𝜀( ) ( ) m---n0-- 2 k⊥v⊥- -1 v2∥ +-v2⊥ = − T(2π)1∕2 J 0 Ω v3t exp − 2v2t v⊥dv ⊥dv∥ (244) ∫ ∫ ( ) ( -2 -2) = − m--n0-- J2 k⊥v⊥- exp −v∥-+-v⊥ v dv dv , (245) T(2π)1∕2 0 Ω 2 ⊥ ⊥ ∥

where v_∥ = v_∥∕v_t, v_⊥ = v_⊥∕v_t. Using

∫ ∞ ( x2) √ --- exp − 2- dx = 2π, −∞

(246)

the integration over v_∥ in expression (245) can be performed, yielding

∫ ∞ ( ) ( -2 ) − m-n0 J20 k⊥v⊥- exp − v⊥- v⊥dv ⊥ T 0 Ω 2

(247)

Perpendicular integration Using (I veriﬁed this by using Sympy)

∫ ∞ ( 2) J20(ax)exp − x xdx = exp(− a2)I0(a2), 0 2

(248)

where I₀(a) is the zeroth modiﬁed Bessel function of the ﬁrst kind, expression (247) is written

m − --n0exp(− b)I0(b) T

(249)

where b = k_⊥²v_t²∕Ω² = k_⊥²ρ_t². Then the corresponding density (229) is written as

∫ ∫ − q- ⟨δΦ ⟩ ∂F0dv = qn0 δΦ exp(ik ⋅x)exp(− b)I (b)-dk-. m α ∂𝜀 T k 0 (2π)3

(250)

Final form of polarization density

In Fourier space, the adiabatic term in expression (228) is written as

∫ ∫ q(δΦ)∂F0-dv = − qn0δΦ = − qn0 δΦk exp(ik ⋅x)-dk3-. m ∂𝜀 T T (2π)

(251)

Plugging expression (250) and (251) into expression (226), the polarization density n_p is written as

∫ np = − qn0 δΦk exp(ik ⋅x)[1− exp(− b)I0(b)]-dk-. (252) T (2π )3

Deﬁne

Γ = exp(− b)I (b), 0 0

(253)

then Eq. (252) is written as

qn0∫ dk np = −-T- δΦkexp(ik ⋅x)[1 − Γ 0](2π)3,

(254)

Expression (254) agrees with the result given in Yang Chen’s notes. Note that the dependence on species mass enters the formula through the Larmor radius ρ_t in Γ₀.

7.3 Pade approximation

Γ₀ deﬁned in Eq. (253) can be approximated by the Pade approximation as

1 Γ 0 ≈----. 1+ b

(255)

The comparison between the exact value of Γ₀ and the above Pade approximation is shown in Fig. 1.

pict

Fig. 1: Comparison between the exact value of Γ₀ = exp(−(k_⊥ρ)²)I₀((k_⊥ρ)²) and the corresponding Pade approximation 1∕(1 + (k_⊥ρ)²).

Using the Pade approximation (255), the polarization density n_p in expression (254) can be written as

∫ 2 2 np ≈ − qn0 δΦkexp(ik⋅x)--k⊥ρ2---dk--. (256) T 1 + k⊥ρ2(2π)3

(Pade approximate is the “best” approximation of a function by a rational function of given order – under this technique, the approximant’s power series agrees with the power series of the function it is approximating.)

Long wavelength approximation of the polarization density

In the long wavelength limit, k_⊥ρ ≪ 1, expression (256) can be further approximated as

∫ n ≈ − qn0 δΦ exp(ik ⋅x)k2ρ2-dk--, p T k ⊥ (2π)3 = qn0 ρ2∇2 δΦ. (257) T ⊥

Then the corresponding term in the Poisson equation is written as

q-n = q2n0-ρ2∇2δΦ 𝜀0 p 𝜀0T ⊥ ρ2- 2 = λ2∇ ⊥δΦ, (258) D

where λ_D is the Debye length deﬁned by λ_D² = T𝜀₀∕(n₀q²). For typical tokamak plasmas, the thermal ion gyroradius ρ_i is much larger than λ_D. Therefore the term in expression (258) for ions is much larger than the space charge term ∇²δΦ ≡∇_⊥²δΦ + ∇_∥²δΦ ≈∇_⊥²δΦ in the Poisson equation. Therefore the space charge term can be neglected in the long wavelength limit.

Equation (258) also shows that electron polarization density is smaller than the ion polarization density by a factor of ρ_e∕ρ_i ≈ 1∕60. Note that this conclusion is drawn in the long wavelength limit. For short wavelength, the electron polarization and ion polarization density can be of similar magnitude (to be discussed later).

Polarization density expressed in terms of Laplacian operator

The polarization density expression (257) is for the long wavelength limit, which partially neglects FLR eﬀect. Let us go back to the more general expression (256). The Poisson equation is written

− 𝜀0∇2⊥ δΦ = qiδni + qeδne.

(259)

Write δn_i = n_pi + δn_i′, where δn_pi is the ion polarization density, then the above expression is written

− 𝜀0∇2⊥δΦ − qinpi = qiδn′i + qeδne.

(260)

Fourier transforming in space, the above equation is written

− 𝜀0k2⊥δˆΦ − qiˆnpi = qiδˆn′i + qeδˆne,

(261)

where _pi is the Fourier transformation (in space) of the polarization density n_pi and similar meanings for δ , δ_i′, and δ_e. Expression (256) implies that _pi is given by

qini0- --k2⊥ρ2i-- ˆnpi = − Ti δˆΦ1 + k2ρ2. ⊥ i

(262)

Using this, equation (261) is written

( ) 2 ˆ qini0--k2⊥-ρ2i-- ˆ ′ − 𝜀0k⊥ δΦ− qi − Ti 1+ k2⊥ρ2i δΦ = qiδˆni + qeδnˆe,

(263)

Multiplying both sides by (1 + k_⊥²ρ_i²)∕𝜀₀, the above equation is written

2 2 2 qi( qini0 2 2 ) 1 2 2 ′ − (1 + k⊥ρi)k⊥δˆΦ− 𝜀0- − -Ti-(k⊥ρi)δˆΦ = 𝜀0(1+ k⊥ρi)(qiδˆni + qeδˆne).

(264)

Next, transforming the above equation back to the real space, we obtain

qi( qini0 ) 1 − (1− ρ2i∇2⊥ )∇2⊥δΦ − 𝜀- -T--ρ2i∇2⊥δΦ = 𝜀-(1− ρ2i∇2⊥)(qiδn′i + qeδne). 0 i 0

(265)

Neglecting the Debye shielding term, the above equation is written

( ) ρ2i-- 2 -1 2 2 ′ − λ2Di∇ ⊥δΦ = 𝜀0(1− ρi∇ ⊥)(qiδni + qeδne),

(266)

which is the equation actually solved in many gyrokinetic codes, where λ_Di² = 𝜀₀T_i∕(q_i²n_i0).

8 Polarization density matrix obtained by numerically integrating in phase space using grid

In Sec. 7, to evaluate the polarization density, the potential δΦ is Fourier expanded in space using local Cartesian coordinates, and then the double gyro-angle integration of each harmonic is expressed as the Bessel function. (It seems that the original motivation of using the Fourier expansion here is to facilitate analytical treatment and is not designed for numerical use. GEM code does make use of the local Fourier expansion in its numerical implementation, where the local perpendicular wave number needs to be estimated numerically, which seems awkward.)

In this section, we avoid using the local Fourier expansion, and directly express the double gyro-angle integral as linear combination of values of δΦ at spatial grid-points. The polarization density is given by Eq. (226), i.e.,

q-∫ ( ∂F0-) np(x) = m dv (δΦ − ⟨δΦ⟩α) ∂𝜀 .

(267)

Our goal is to write n_p in the following real-space discrete from:

∑ np(xk ) = cijδΦi,j. i,j

(268)

8.1 Direct evaluation of the double gyrophase integration

Deﬁne

q∫ ( ∂F0) A(x) = −m- dv ⟨δΦ ⟩α ∂𝜀-- .

(269)

Using dv = v_⊥dv_⊥dv_∥dα, the above integration is written as

q ∫ ∞ ∫ ∞ ∂F ∫ 2π A (x) = −-- dv∥ v⊥dv⊥ --0- dα ⟨δΦ ⟩α, m −∞ 0 ∂𝜀 0

(270)

where use has been made of the assumption that ∂F₀∕∂𝜀 is uniform in α in (x,v_⊥,v_∥,α) coordinates, and thus is moved outside of the α integration. Using the deﬁnition of gyro-averaging (2π)⁻¹ ∫ ₀^2π(…)dα′, the above integration is written as

q ∫ ∞ ∫ ∞ ∂Fi0 ∫ 2π ( 1 ∫ 2π ) A (x) = −-- dv∥ dv⊥----v⊥ dα --- δΦdα ′ . m −∞ 0 ∂𝜀 0 2π 0

(271)

Note that the gyro-averaging is performed in the guiding-center space, i.e., performed by varying the gyroangle α′ while keeping guiding-center position X, v_⊥, and v_∥ constant. Using the deﬁnition of δΦ_g (i.e., its relation with δΦ):

δΦg(X,α′,v⊥) = δΦ(x),

(272)

where the particle location x is computed from the guiding-center location by

e∥(X ) x = X + ρ(x,v ) ≈ X − v× -----, Ω(X )

(273)

then A(x) is written as

A(x) q ∫ ∞ ∫ ∞ ∂Fi0 ∫ 2π [ 1 ∫ 2π ( ′ e∥(X)) ′] = −m- dv∥ dv⊥-∂𝜀-v⊥ dα 2π- δΦ X − v⊥(v⊥,α )× Ω-(X-)- dα ∫−∞ ∫0 ∫0 ⌊ 0 ( )⌋ -q ∞ ∞ ∂Fi0 2π ⌈-1-∑N2 ′ e∥(X-)-⌉ ≈ −m −∞ dv∥ 0 dv⊥ ∂𝜀 v⊥ 0 dα N2 δΦ X − v⊥ (v⊥,α j) × Ω (X ) j=1

where v_⊥ = v(v_⊥,α_j′) denotes a perpendicular velocity corresponding to a discrete gyro-angle α_j′.

Next, in order to perform the remaining velocity space integration, transform back to the particle coordinates (because the velocity integration is performed in the particle coordinates, i.e., it is performed by keeping the particle coordinate x constant):

∫ ∫ q- ∞ ∞ ∂Fi0 A(x) = − m − ∞dv∥ 0 dv⊥ ∂𝜀 v⊥ ∫ ⌊ N2 ( )⌋ × 2πdα ⌈-1-∑ δΦ x + v (v ,α)× e∥(X-)− v (v ,α ′)× e∥(X-)-⌉ 0 N2 j=1 ⊥ ⊥ Ω(X ) ⊥ ⊥ j Ω(X ) ∫ ∞ ∫ ∞ ≈ − q- dv∥ dv⊥∂Fi0v⊥ m −∞⌊ 0 ∂𝜀 ⌋ ∫ 2π ∑N2 ( ) × dα ⌈-1- δΦ x + v⊥(v⊥,α)× e∥(x) − v⊥(v⊥,α′j)× e∥(x) ⌉ 0 N2 j=1 Ω(x) Ω (x) q ∫ ∞ ∫ ∞ ∂Fi0 ≈ − -- dv∥ dv⊥----v⊥ m −∞ 0 ( ∂𝜀 ) -2π∑N1 1--N∑2 e-∥(x) ′ e∥(x) ×N1 N2 δΦ x+ v⊥ (v⊥,αi)× Ω(x) − v⊥(v⊥,αj)× Ω (x) . (274) i=1 j=1

For notation ease, deﬁne

e∥(x)- ′ e∥(x) Δρij = v⊥ (v⊥,αi)× Ω(x) − v⊥(v⊥,αj)× Ω (x),

(275)

where Δρ_ij is a function of (x,v_⊥,α_i,α_j′). Then Eq. (274) is written as

q∫ ∞ ∫ ∞ ∂Fi0 2π∑N1 1 N∑2 A(x) = −-- dv∥ dv⊥----v⊥--- --- δΦ(x+ Δ ρij). (276) m −∞ 0 ∂𝜀 N1 i=1 N2 j=1

The guiding-center transform and its inverse involved in the above are illustrated in Fig. 2. Then the double gyro-angle integral appearing in the polarization density is approximated as

∫ 2π ( ∫ 2π ′) (2π)2 N∑1 N∑2 dα δΦ(x)dα ≈ N1N2- δΦ(xij), 0 0 i=1 j=1

(277)

where N₁ = 4,N₂ = 4 for the case shown in Fig. 2. This gives how to approximate the double gyro-angle integration using the discrete values of δΦ_ij. The spatial points x_ij appearing in Eq. (277) are not necessarily grid points. Linear interpolations are used to express δΦ(x_ij) as linear combination of values of δΦ at grid-points.

Fig. 2: Given a v_⊥, then the double gyro-angle integration in Eq. (271) at a particle location x is evaluated by the following steps: Use the guiding center transform to get guiding-center locations (four locations are shown in this case, namely X_i with i = 1,2,3,4, corresponding gyro-angle α_i with i = 1,2,3,4; 2.) For each guiding-center location, use the inverse guiding-center transform to calculate points on the corresponding gyro-ring (four points are shown for each guiding-center X_i in this case, namely x_ij with j = 1,2,3,4, corresponding gyro angle α_j′ with j = 1,2,3,4.)

8.2 Performing the parallel velocity integration

Assume that F₀ is Maxwellian, then

∂F m m n ( − mv2∥) ( − mv2 ) --0-= − -fM = −--------0-3∕2 exp ------ exp ----⊥- . ∂ 𝜀 T T (2πT ∕m) 2T 2T

(278)

Note that Δρ_ij is independent of v_∥. Then the integration over v_∥ in Eq. (276) can be analytically performed, yielding

( - ) ( ) q-∫ ∞ - −-v2∥ ∫ ∞ - - --n0-- − v2⊥ A (x) = T −∞ dv∥exp 2 0 dv⊥ v⊥(2π)3∕2 exp 2 2π-N∑1 -1-N∑2 × N1 N2 δΦ(x + Δρij) ∫i=1 j=1 ( ) ∞ - - − v2⊥ -1-∑N1 1--N∑2 q- = n0 0 dv⊥v⊥ exp 2 N1 N2 T δΦ (x + Δ ρij). (279) i=1 j=1

where v_∥ = v_∥∕v_t, v_⊥ = v_⊥∕v_t, v_t = ∘ ----
T∕m

, and use has been made of

∫ ∞ ( 2) √ --- exp − x- dx = 2π. −∞ 2

(280)

8.3 Toroidal DFT of polarization density in ﬁeld-aligned coordinates

In ﬁeld-aligned coordinates (x,y,z), Fourier expansion of δΦ along y is written

Ny∕2 ( ) δΦ (x) ≈ ∑ exp ιn 2πy δΦn(x,z), n=−Ny∕2 Ly

(281)

where ι = √ ---
− 1 . Use this in Eq. (279), yielding

∫ ( - ) A(x) = n ∞ dv v exp −-v2⊥ 0 0 ⊥ ⊥ 2 N∑1 N∑2 ⌊ N∑y∕2 ( ) ⌋ × 1-- -1- ⌈-q exp ιn2π-(y + Δρijy) δΦn(x+ Δ ρijx,z + Δ ρijz)⌉ N1 i=1 N2 j=1 T n= −Ny∕2 Ly Nt∕2 ( ) ∫ ( - ) ∑ 2π- ∞ - - −-v2⊥ = exp ιnLy y n0 0 dv⊥ v⊥exp 2 n= −Ny∕2 1--N∑1 -1-N∑2 [ ( 2π- ) q- ] × N1 N2 exp ιn LyΔ ρijy TδΦn (x + Δρijx,z + Δρijz) . (282) i=1 j=1

From Eq. (282), the Fourier expansion coeﬃcient of A(x) can be identiﬁed:

∫ ( ) ∞ - - − v2⊥ An(x,z) = n0 0 dv⊥ v⊥ exp 2 ∑N1 ∑N2 [ ( ) ] × -1- 1-- exp ιn2π-Δρijy -qδΦn(x+ Δ ρijx,z + Δ ρijz) . (283) N1 i=1 N2 j=1 Ly T

Let us make the following approximation

δΦn(x+ Δ ρijx,z + Δ ρijz) ≈ δΦn(x + Δρijx,z),

(284)

which should be a good approximation since the variation of δΦ along a ﬁeld line over a distance of a Larmor radius is small. Then expression (283) is written as

∫ ∞ - - ( − v2 ) An(x,z) ≈ n0 dv⊥v⊥ exp --⊥- 0 [ ( 2) ] -1-∑N1 1--N∑2 2π- -q ×N1 N2 exp ιnLy Δρijy T δΦn(x +Δ ρijx,z) . (285) i=1 j=1

The approximation (284) makes the operator A_n on δΦ_n become local in z direction, i.e., it involves δΦ_n at locations with a ﬁxed value of z. This makes A_n(x,z) reduce to a 1D operator on δΦ_n in the x direction.

8.4 Using MC integration, to be continued, not necessary

The integration we try to express is given by

∫ Aijk ≈ -1-- dxnp(x) Vijk Vijk 1 ∫ ∫ ( q ∂F0) = V--- dx dv m-(δΦ− ⟨δΦ⟩α)∂-𝜀- , (286) ijk Vijk

where A_ijk is the value of n_p(x) at a grid-point, which is approximated by the spatial averaging of n_p(x) over the cell whose center is the grid-point, V _ijk is the volume of the cell. We will use Monte-Carlo guiding-center markers to compute the above cell average.

( ) A (1)= − q--1--∑ ∑ ⟨δΦ⟩ ∂F01 ijk m Vijk p α α ∂𝜀 g ∑ ∑ ( ) = − q--1-- ⟨δΦ ⟩α ∂F01 m Vijk p α ∂𝜀 g

Nearest neighbour interpolation

∑ ∑ ( ) A (2) = −-q--1- δΦ∂F0-1 ijk m Vijk p α ∂𝜀 g qδΦ ∑ ∑ ( ∂F 1) = −-----ijk --0-- m Vijk p α ∂𝜀 g

Later, I found that evaluating the polarization density using grid works well. Therefore there is no need to evaluate it using MC markers, which is more complicated.

A Adiabatic electron response

Assume that the total electron density satisﬁes the Boltzmann distribution on each magnetic surface, i.e.,

( qδΦ ) ( q δΦ) ne = Ne(ψ)exp − -e-- ≈ Ne 1− -e-- , Te Te

(287)

where N_e is a radial function. Note that this does not imply that the equilibrium density is N_e (it just implies that the total density is N_e at the location where δΦ = 0, which can still be diﬀerent from the equilibrium density).

Further assume that the magnetic surface average of δn_e = n_e − n_e0 is zero, i.e.,

⟨ne − ne0⟩f = 0,

(288)

where n_e0 is the equilibrium electron density, ⟨…⟩_f is the magntic surface averaging operator. Using Eq. (287) in the above condition, we get

( ) Ne = ne0----1-----≈ ne0 1 + qe⟨δΦ-⟩f- . 1 − qe⟨δTeΦ⟩f- Te

(289)

Then expression (287) is written as

( ) ( ) ( ) qe⟨δΦ⟩f- qeδΦ- qe⟨δΦ⟩f- qeδΦ- ne = ne0 1+ Te 1 − Te ≈ ne0 1 + Te − Te .

(290)

Then the perturbation δn_e = n_e − n_e0 is written as

qe(δΦ-−-⟨δΦ⟩f) δne = − n0e Te .

(291)

This model of electron response is often called adiabatic electron.

A.1 Poisson’s equation with adiabatic electron response

Pluging expression (291) into the Poisson equation (227), we get

∫ q ∂F q2(δΦ − ⟨δΦ ⟩) − 𝜀0∇2⊥δΦ− qi -i(δΦ − ⟨δΦ⟩α)-i0dv + n0ee---------f--= qiδn′i. mi ∂𝜀 Te

(292)

When solving the Poisson equation, the equation is Fourier expanded in toroidal harmonics and each harmonic is independent of each other, so that they can be solved independently. For n≠0 harmonics, the ⟨δΦ⟩_f terms is zero and thus it is trivial to treat the electron term. For the n = 0 harmonic, the ⟨δΦ⟩_f term is nonzero and needs special treatment. I use the following method to obtain ⟨δΦ⟩_f. First slove the n = 0 harmonic of the following equation:

2 ′ ∫ qi ′ ′ ∂Fi0 ′ − 𝜀0∇ ⊥δΦ − qi mi-(δΦ − ⟨δΦ ⟩α)-∂𝜀-dv = qiδni,

(293)

(i.e., Eq. (292) with electron term dropped), and then take the magnetic surface average of the solution δΦ′ to get ⟨δΦ′⟩_f. It can be proved that ⟨δΦ′⟩_f is equal to ⟨δΦ⟩_f. Then solving Eq. (292) becomes easier since ⟨δΦ⟩_f term is known and can be moved to the right-hand side as a source term.

B Characteristic curves of Frieman-Chen nonlinear gyrokinetic equation

Examining the left-hand side of Eq. (), it is ready to ﬁnd that the characteristic curves of this equation are given by the following equations:

dX -dt = v∥e∥ + VD + δVD,

(294)

d𝜀 --= 0, dt

(295)

dμ --= 0. dt

(296)

(It is instructive to notice that the kinetic energy 𝜀 is conserved along the characteristic curves while the real kinetic energy of a particle is usually not conserved in a perturbed electromagnetic ﬁeld. This may be an indication that Frieman-Chen equation neglects the velocity space nonlinearity.) For notation ease, we denote the total guiding-center velocity by V_G, i.e.,

dX dX dX VG = --- = ---0+ ---1 = v∥e∥ + VD + δVD, dt dt dt

(297)

where

dX0-= v e + V . dt ∥ ∥ D

(298)

In the above, v_∥ can be obtained by using v_∥ = σ ∘ ---------
2(𝜀− Bμ ) . However σ is not easy to determine since it changes with time for trapped particles. In practice, we get v_∥ from an evolution equation. This evolution equation for v_∥ can be obtained by combining the equations for 𝜀 and μ. Next, we derive this equation.

B.1 Time evolution equation for v_∥

Using the deﬁnition μ = v_⊥²∕(2B₀), equation (296), i.e., dμ∕dt = 0, is written as

( 2 ) -d -v⊥- = 0, dt 2B0

(299)

which is written as

-1-d(v2)+ v2 d-( 1-) = 0, B dt ⊥ ⊥ dt B0

(300)

which can be further written as

-d 2 d- dt(v⊥) = 2μ dt(B0).

(301)

Using the deﬁnition of the characteristics, the right-hand side of the above equation can be expanded, giving

d 2 (∂B0 dX d𝜀 ∂B0 dμ ∂B0) dt(v⊥) = 2μ -∂t-+ -dt ⋅∇XB0 + -dt-∂𝜀-+ dt-∂μ-- ,

(302)

where dX∕dt, d𝜀∕dt, and dμ∕dt are given by Eq. (294), (295), and (296), respectively. Using Eqs. (294)-(296) and ∂B₀∕∂t = 0, equation (302) is reduced to

( ) d- 2 dX- dt(v⊥ ) = 2μ dt ⋅∇XB0 .

(303)

On the other hand, equation (295), i.e., d𝜀∕dt = 0, is written as

d- 2 dt(v ) = 0,

(304)

which can be further written as

d 2 d 2 dt(v∥) = −dt(v⊥).

(305)

Using Eq. (303), the above equation is written as

( ) d- μ- dX- dt(v∥) = − v∥ dt ⋅∇XB0 ,

(306)

which is the equation for the time evolution of v_∥. This equation involves dX∕dt, i,e., the guiding-center drift, which is given by Eq. (294). Equation (306) for v_∥ can be simpliﬁed by noting that the Frieman-Chen equation is correct only to the second order, O(λ²), and thus the characteristics need to be correct only to the ﬁrst order O(λ) and higher order terms can be dropped. Note that, in the guiding-center drift dX∕dt given by Eq. (294), only the v_∥e_∥ term in is of order O(λ⁰), all the other terms are of O(λ¹). Using this, accurate to order O(λ¹), equation (306) is written as

d- dt(v∥) = − μe∥ ⋅∇XB0,

(307)

which is the time evolution equation ready to be used for numerically advancing v_∥. Note that only the mirror force −μe_∥⋅∇B appears in Eq. (307) and there is no parallel acceleration term qv_∥δE_∥∕m in Eq. (307). This is because δE_∥ = −b ⋅∇δΦ − ∂δA_∥∕∂t is of order O(λ²) and (**check** the terms involving E_∥ are of O(λ³) or higher and thus have been dropped in deriving Frieman-Chen equation.)

C From (δΦ,δA) to (δE,δB)

C.1 Expression of δB_⊥ in terms of δA

Note that

δB ⊥ = ∇ × δA − (e∥ ⋅∇ × δA )e∥ = ∇ × (δA ⊥ + δA∥e∥)− [e∥ ⋅∇ × (δA ⊥ +δA ∥e∥)]e∥ (308)

Correct to order O(λ), δB_⊥ in the above equation is written as (e_∥ vector can be considered as constant because its spatial gradient combined with δA will give terms of O(λ²), which are neglected)

δB⊥ ≈ ∇ × δA ⊥ + ∇ δA∥ × e∥ − [e∥ ⋅∇ × δA ⊥ + e∥ ⋅(∇ δA∥ ×e∥)]e∥ (309) = ∇ × δA ⊥ + ∇ δA∥ × e∥ − (e∥ ⋅∇ × δA ⊥)e∥ (310)

Using local cylindrical coordinates (r,ϕ,z) with z being along the local direction of B₀, and two components of A_⊥ being A_r and A_ϕ, then ∇× A_⊥ is written as

( ∂δA ) (∂δA ) 1 [ ∂ ∂δA ] ∇ × δA ⊥ = − ---ϕ- er + ----r eϕ +- --(rδAϕ)− ----r e∥. ∂z ∂z r ∂r ∂ ϕ

(311)

Note that the parallel gradient operator ∇_∥≡ e_∥⋅∇ = ∂∕∂z acting on the the perturbed quantities will result in quantities of order O(λ²). Retaining terms of order up to O(λ), equation (311) is written as

1 [ ∂ ∂ δA ] ∇ ×δA ⊥ ≈ - --(rδAϕ)− ----r e∥, r ∂r ∂ ϕ

(312)

i.e., only the parallel component survive, which exactly cancels the last term in Eq. (310), i.e., equation (310) is reduced to

δB⊥ = ∇ δA ∥ × e∥.

(313)

Using basis vectors in ﬁeld-aligned coordinates

In terms of δB_xL and δB_yL, δB_⊥ is written as

δB ⊥ = δBxL∇x + δByL ∇y = ∇δA ∥ × e∥

(314)

Dotting the above equation by ∇x and ∇y, respectively, we obtain

δBxL|∇x|2 + δByL∇x ⋅∇y = ∇x ⋅(∇ δA∥ × e∥),

(315)

δBxL∇x ⋅∇y + δByL|∇y |2 = ∇y ⋅(∇ δA∥ × e∥).

(316)

Equations (315) and (316) can be further written as

( ∂δA ∂δA ) δBxL|∇x|2 + δByL∇x ⋅∇y = − ∇x ⋅ ---∥∇y × e∥ + ---∥∇z × e∥ , ∂y ∂z

(317)

and

( ) 2 ∂δA∥- ∂δA∥- δBxL∇x ⋅∇y + δByL |∇y | = − ∇y ⋅ ∂x ∇x × e∥ + ∂z ∇z × e∥ ,

(318)

The solution of this 2 × 2 system is expressed by Cramer’s rule in the code.

Use B₀ = Ψ′∇x ×∇y

b = Ψ′∇x ×∇y∕B₀

C.2 Expression of δB_∥ in terms of δA

δB∥ = e∥ ⋅∇ × δA = e∥ ⋅∇ × (δA ⊥ + δA∥e∥) (319)

Accurate to O(λ¹), δB_∥ in the above equation is written as (e_∥ vector can be considered as constant because its spatial gradient combined with δA will give O(λ²) terms, which are neglected)

δB ∥ ≈ e∥ ⋅∇ × δA ⊥ + e∥ ⋅(∇ δA∥ ×e∥) = e∥ ⋅∇ × δA ⊥ (320)

[Using local cylindrical coordinates (r,ϕ,z) with z being along the local direction of B₀, and two components of δA_⊥ being δA_r and δA_ϕ, then ∇× δA_⊥ is written as

( ∂δA ϕ) ( ∂δAr) 1 [ ∂ ∂δAr] ∇ × δA⊥ = −--∂z- er + -∂z-- er + r ∂r(rδAϕ)− -∂ϕ--e∥

(321)

1 [ ∂ ∂ δA ] ∇ ×δA ⊥ ≈ - --(rδAϕ)− ----r e∥, r ∂r ∂ ϕ

(322)

Using this, equation (320) is written as

[ ] 1 ∂-- ∂δAr- δB∥ = r ∂r(rδA ϕ)− ∂ϕ .

(323)

However, this expression is not useful for GEM because GEM does not use the local coordinates (r,ϕ,z).]

C.3 Expressing the perturbed drift in terms of δE and δB

The perturbed drift δV_D is given by Eq. (139), i.e.,

δV = −-q∇ ⟨δL⟩ × e∥. D m X α Ω

(324)

Using δL = δΦ − v ⋅ δA, the above expression can be further written as

δV = −-q∇ ⟨δΦ − v⋅δA ⟩ × e∥ D m X α Ω = q-e∥ ×∇X ⟨δΦ⟩α − q-e∥ ×∇X ⟨v∥δA∥⟩α m Ω m Ω −-q e∥× ∇X ⟨v⊥ ⋅δA ⊥⟩α. (325) m Ω

Accurate to order O(λ), the term involving δΦ is

q e∥ e∥ m- Ω-× ∇X ⟨δΦ⟩α = B--×⟨∇X δΦ⟩α e0∥ ≈ B--×⟨∇x δΦ⟩α 0 ⟨ ⟩ ≈ e∥-× − δE − ∂δA B0 ∂t α ≈ e∥-×⟨− δE⟩α B0 ≡ δVE, (326)

which is the δE×B₀ drift. Accurate to O(λ), the ⟨v_∥δA_∥⟩_α term on the right-hand side of Eq. (325) is written

-q e∥ q-v∥ −m Ω × ∇X ⟨v∥δA ∥⟩α ≈ − m Ω ⟨e∥ × ∇X (δA∥)⟩α ≈ − q-v∥⟨e × ∇ (δA )⟩ (327) m Ω ∥ x ∥ α = v ⟨δB⊥-⟩α-, (328) ∥ B0

which is called “magnetic ﬂuttering” (this is actually not a real drift). In obtaining the last equality, use has been made of Eq. (313), i.e., δB_⊥ = ∇_xδA_∥× e_∥.

Accurate to O(λ), the last term on the right-hand side of expression (325) is written

− q-e∥× ∇X ⟨v⊥ ⋅δA⊥ ⟩α ≈ − 1-⟨e∥ × ∇X (v⊥ ⋅δA⊥ )⟩α m Ω B0 ≈ − 1-⟨e × ∇ (v ⋅δA )⟩ B0 ∥ x ⊥ ⊥ α 1-- = − B0⟨e∥ × (v⊥ × ∇x ×δA ⊥ + v⊥ ⋅∇xδA ⊥)⟩α 1-- = − B0⟨(e∥ ⋅∇x × δA⊥ )v ⊥ + e∥ × v ⊥ ⋅∇x δA⊥ ⟩α

Using equation (320), i.e., δB_∥ = e_∥⋅∇× δA_⊥, the above expression is written as

−-q e∥× ∇ ⟨v ⋅δA ⟩ = − 1-⟨δB v + e × v ⋅∇ δA ⟩ m Ω X ⊥ ⊥ α B0 ∥ ⊥ ∥ ⊥ x ⊥ α 1-- ≈ − B0⟨δB∥v⊥ + e∥ × v⊥ ⋅∇X δA⊥ ⟩α 1 1 ≈ − B0⟨δB∥v⊥ ⟩α − B0e∥ × ⟨v ⊥ ⋅∇X δA ⊥⟩α. 1 ≈ − B-⟨δB∥v⊥ ⟩α. (329) 0

where use has been made of ⟨v_⊥⋅∇_XδA_⊥⟩_α ≈ 0 (**seems wrong**), where the error is of O(λ)δA_⊥. The term ⟨δB_∥v_⊥⟩_α∕B₀ is of O(λ²) and thus can be neglected (I need to verify this).

Using Eqs. (326), (328), and (329), expression (325) is ﬁnally written as

q e ⟨δE⟩ × e ⟨δB ⟩ δVD ≡ − --∇X ⟨δL ⟩α × -∥ = ----α---∥-+ v∥---⊥-α. m Ω B0 B0

(330)

Using this, the ﬁrst equation of the characteristics, equation (294), is written as

dX- dt = v∥e∥ + VD + δVD (331) ⟨δE⟩α × e∥ ⟨δB ⊥⟩α = v∥e∥ + VD + ----B-----+ v∥--B---- 0 0 ≡ VG (332)

C.4 Expressing the coeﬃcient before ∂F₀∕∂𝜀 in terms of δE and δB

[Note that

∂δA⊥ -∂t--= − (δE ⊥ + ∇⊥δΦ ),

(333)

where ∂δA_⊥∕∂t is of O(λ²). This means that δE_⊥ + ∇_⊥δϕ is of O(λ²) although both δE_⊥ and δϕ are of O(λ).]

Note that

∂⟨v-⋅δA⟩α = v ∂⟨δA∥⟩α+ v ⊥ ⋅ ∂-⟨δA-⟩α ∂t ∥ ∂t ∂t ∂⟨δA∥⟩α = v∥ ∂t + ⟨v⊥ ⋅(− δE − ∇δΦ )⟩α ∂⟨δA∥⟩α ≈ v∥ ∂t − ⟨v⊥ ⋅δE⟩α (334)

where use has been made of ⟨v_⊥⋅∇δϕ⟩≈ 0, This indicates that ⟨v_⊥⋅ δE⟩_α is of O(λ¹)δE. Using Eq. (334), the coeﬃcient before ∂F₀∕∂𝜀 in Eq. (144) can be further written as

[ ( e ) ] − q- − ∂-⟨v-⋅δA⟩α − v∥e∥ + VD − q--∥× ∇X ⟨v ⋅δA ⟩α ⋅∇X ⟨δΦ⟩α m [ ∂t m Ω ⟨ ⟩ ] = − q- − v ∂⟨δA-∥⟩α + ⟨v ⋅δE ⟩ − (v e + V − q-e∥× ∇ ⟨v ⋅δA ⟩ )⋅ − δE − ∂δA m ∥ ∂t ⊥ α ∥ ∥ D m Ω X α ∂t α q [ ∂⟨δA ∥⟩α ( q e∥ ) ⟨ ∂A∥ ⟩ ] ≈ − m- − v∥--∂t-- + ⟨v⊥ ⋅δE ⟩α − v∥e∥ + VD − m-Ω-× ∇X ⟨v ⋅δA ⟩α ⋅⟨− δE⟩α + v∥ -∂t- q [ ( q e ) ] α = − -- ⟨v ⊥ ⋅δE ⟩α + v∥e∥ + VD −-- -∥× ∇X ⟨v⋅δA ⟩α ⋅⟨δE⟩α m [ ( m Ω ) ] ≈ − q- ⟨v⊥ ⋅δE ⟩α + v∥e∥ + VD + v∥⟨δB-⊥⟩- ⋅⟨δE⟩α . (335) m B0

Using Eq. (335) and (), gyrokinetic equation (144) is ﬁnally written as

[ ( ) ] ∂- + v e + V + ⟨δE-⟩α-×-e∥+ v ⟨δB⊥-⟩α- ⋅∇ δf ∂t ∥ ∥ D B0 ∥ B0 X (⟨δE⟩α × e∥ ⟨δB⊥ ⟩α ) = − ----B0----+ v∥--B0--- ⋅∇XF0 [ ( ) ] − q- ⟨v⊥ ⋅δE ⟩α + v∥e∥ + VD + v∥⟨δB-⊥⟩α- ⋅⟨δE⟩α ∂F0-. (336) m B0 ∂𝜀

D Split-weight scheme for electrons in GEM code

The perturbed distribution function is decomposed as given by Eq. (199), i.e.,

q ∂F0 q ∂F0 δF = δh + m-(δΦ− ⟨δΦ⟩α)-∂𝜀-+ m-⟨v⋅δA ⟩α ∂𝜀-,

(337)

where the term in blue is the so-called adiabatic response, which depends on the gyro-angle in guiding-center coordinates. Recall that the red term ⟨δΦ⟩_α, which is independent of the gyro-angle, is introduced in order to eliminate the time derivative ∂⟨δΦ⟩_α∕∂t term on the right-hand side of the original Frieman-Chen gyrokinetic equation.

The so-called generalized split-weight scheme corresponds to going back to the original Frieman-Chen gyrokinetic equation by introducing another ⟨δΦ⟩_α term with a free small parameter 𝜖_g. Speciﬁcally, δh in the above is split as

δh = δh + 𝜖 q-⟨δΦ⟩ ∂F0. s gm α∂ 𝜀

(338)

(If 𝜀_g = 1, then the two ⟨δΦ⟩_α terms in Eq. (337) and (338) cancel each other.) Substituting this expression into Eq. (), we obtain the following equation for δh_s:

[-∂ ] ∂t + (v∥e∥ + VD + δVD )⋅∇X δhs q ∂F [ ∂ ] + 𝜖g----0- --+ (v∥e∥ + VD + δVD )⋅∇X ⟨δΦ ⟩α m ∂𝜀 ∂[t ] + 𝜖g q-⟨δΦ⟩α ∂-+ (v e + VD + δVD )⋅∇X ∂F0- m ∂t ∥ ∥ ∂𝜀 = − δVD ⋅∇XF0 q- ∂F0- − m [(v∥e∥ + VD + δVD ) ⋅∇X (⟨v ⋅δA − δΦ ⟩α )] ∂𝜀 . (339)

Noting that ∂F₀∕∂t = 0, e_∥⋅∇F₀ = 0, ∇F₀ ∼ O(λ¹)F₀, we ﬁnd that the third line of the above equation is of order O(λ³) and thus can be dropped. Moving the second line to the right-hand side, the above equation is written as

[ ] ∂- ∂t + (v∥e∥ + VD + δVD )⋅∇X δhs = − δVD ⋅∇XF0 q{ [∂ ⟨δΦ⟩ ]} ∂F − -- (v∥e∥ + VD + δVD )⋅∇X [⟨v ⋅δA⟩α − ⟨δΦ ⟩α]+ 𝜖g -----α + VG ⋅∇X ⟨δΦ⟩α -(03.40) m ∂t ∂𝜀

special case of 𝜖_g = 1

For the special case of 𝜖_g = 1 (the default and most used case in GEM code, Yang Chen said 𝜖_g < 1 cases are sometimes not accurate, so he gave up using it since 2009), equation (340) can be simpliﬁed as:

[ ∂ ] -- + (v∥e∥ + VD + δVD )⋅∇X δhs ∂t = − δV[D ⋅∇XF0 ] − q- VG ⋅∇X ⟨v ⋅δA ⟩α + ∂-⟨δΦ-⟩α ∂F0-, (341) m ∂t ∂𝜀

where two V_G ⋅⟨δΦ⟩_α terms cancel each other. Because the v_∥E_∥ term is one of the factors that make kinetic electron simulations diﬃcult, eliminating V_G ⋅⟨δΦ⟩_α term may be beneﬁcial for obtaining stable algorithms.

For 𝜖_g = 1, δF is written as

δF = δhs + 𝜀g q-⟨δΦ ⟩α ∂Fg0+-q(δΦ − ⟨δΦ ⟩α)∂F0-+-q⟨v ⋅δA ⟩α∂F0- m ∂𝜀 m ∂𝜀 m ∂𝜀 = δhs + q-δΦ∂Fg0 + q-⟨v⋅δA ⟩α∂F0, (342) m ∂𝜀 m ∂𝜀

where the adiabatic term will be moved to the left-hand side of the Poisson’s equation. The descretization of this term is much easier than the polarization density. This term is already in GEM.

Equation (341) actually goes back to the original Frieman-Chen equation. The only diﬀerence is that q-
m ⟨v ⋅δA⟩_α ∂F0
∂𝜀 is further split from the perturbed distribution function. Considering this, equation (341) can also be obtained from the original Frieman-Chen equation (136) by writing δG₀ as

-q ∂F0- δG0 = δhs + m ⟨v ⋅δA ⟩α ∂𝜀 ,

(343)

In this case, δF is written as

q ∂Fg0 q ∂F0 δF = δhs + m-δΦ-∂𝜀-+ m-⟨v ⋅δA⟩α-∂𝜀-,

(344)

Substituting expression (343) into equation (136), we obtain the following equation for δh_s:

[ ] ∂-+ (v e + V + δV )⋅∇ δh ∂t ∥ ∥ D D X s -q∂F0-[-∂ ] + m ∂𝜀 ∂t + (v∥e∥ + VD + δVD )⋅∇X ⟨v ⋅δA ⟩α q [ ∂ ] ∂F + --⟨v⋅δA ⟩α -- + (v∥e∥ +VD + δVD )⋅∇X ---0 m ∂t ∂𝜀 = − δVD ⋅∇XF0 − q-∂⟨δΦ-−-v⋅δA-⟩α-∂F0, (345) m ∂t ∂𝜀

[ ∂ ] -- + (v∥e∥ + VD + δVD )⋅∇X δhs ∂t = − δV[D ⋅∇XF0 ] − q- ∂⟨δΦ⟩α + (v e + VD + δVD )⋅∇X ⟨v⋅δA ⟩α ∂F0, (346) m ∂t ∥ ∥ ∂𝜀

which agrees with Eq. (341).

In GEM, the split weight method is used only for electrons, and ∂⟨δΦ⟩_α∕∂t is approximated by ∂δΦ∕∂t, which is obtained from the vorticity equation (rather than from time-diﬀerence scheme).

When using the split weight scheme, a ∂δϕ∕∂t terms appear in the right-hand-side of the weight evolution equation. GEM makes use of the vorticity equation (time derivative of the Poissson equation) to evaluate ∂δϕ∕∂t.

E Coordinate system and grid in TEK code

In (ψ,𝜃,ϕ) coordinates:

ψ is the normalized poloidal magnetic ﬂux, ψ = (Ψ − Ψ_axis)∕(Ψ_lcfs − Ψ_axis).
𝜃 ∈ [−π,π) is increasing along the anti-clockwise direction when viewed along ∇ϕ direction with 𝜃 = −π at the high-ﬁeld-side midplane.
ϕ is the toroidal angle of the right-handed cylindrical coordinates (R,ϕ,Z).

In this convention, the Jacobian of the (ψ,𝜃,ϕ) coordinate system, 𝒥 = (∇ψ ⋅∇𝜃 ×∇ϕ) is negative, i.e., (ψ,𝜃,ϕ) is a left-handed system. The ﬁeld-line-following coordinate system (ψ,𝜃,α) is also a left-handed system. The coordinate system (x = ψ,y = α,z = 𝜃) is a right-handed system.

E.1 Poloidal grid

Fig. 3: Poloidal grid for equilibrium quantities used in TEK and GEM code. The array starts at 𝜃 = −π and ends at 𝜃 = +π (𝜃 = ±π is chosen to be at the high-ﬁeld side midplane in both the codes). TEK array index starts from 1 whereas GEM array index starts from 0. Hence mpol=nth+1. nth is denoted by ntheta in GEM. In TEK, mpol must be an odd number, so that the array has a midpoint, whose index is given by (mpol+1)/2, which corresponds to 𝜃 = 0.

I do not need to make connection with GEM’s equilibrium poloidal array because there is no coupling of equilibrium quantities between the code written by me and the original code in GEM. The coupling happens for the perturbed quanties, whose poloidal grids need to be consistent.

The gridpoint number along 𝜃 for perturbations is denoted by mpol2. Its value is not chosen by users. Instead, it is equal to numproc/ntube, i.e., mpol2 is the total number of cells along 𝜃 for MPI parallelation. (numproc and ntube must be chosen in a way that makes numproc/ntube an integer.)
Poloidal grid-points for perturbations are indexed as 0:mpol2-1, with 0 corresponding to 𝜃 = −π and mpol2 corresponding to 𝜃 = π. Field equations are solved at 0 : mpol2 − 1. (The ﬁeld at mpol2, i.e., 𝜃 = +π, is not identical to that at 𝜃 = −π due to toroidal shift of a magnetic ﬁeld line when it ﬁnish one poloidal loop, and is obtained by toroidally inerpolating the ﬁelds at 𝜃 = −π.) No array for this is actually used in the code. The indexes just correspond to gclr of MPI processors.
The equilibrium poloidal gridpoint number, mpol, must be chosen in a way that makes (mpol-1)/mpol2 an integer, which can be larger than 1, i.e., one simulation cell can span several equilibrium cells.

E.2 Toroidal grid

pict

Fig. 4: Toroidal grid used in TEK and GEM. In TEK, the toroidal array is tor_1d_array(1:mtor+1). Source terms, δn and δj_∥, and EM ﬁelds are deﬁned at 1:mtor. GEM array index starts from 0 and ends at jm. It follows that mtor=jm.

E.3 Radial grid

pict

Fig. 5: Radial grid used in TEK. Two radial arrays are used in the code, radcor_1d_array(1:nt) and radcor_1d_array2(1:n), equilibrium is deﬁned on the former grid, and perturbations are deﬁned on the latter grid. Here nt = n + 2m, m is the number of grid points in one of the two buﬀer regions (regions in blue color). In the code n is denoted by nflux2 and nt is denoted by nflux, m is denoted by points_in_buffer. In GEM the radial array does no include the buﬀer regions and the index starts at 0 and ends at imx. Hence n=imx+1.

GEM array index system is better than TEK’s because my system is not consistent: sometimes I use 0-based index and sometimes I use 1-based index, sometimes the index ends at n and sometimes ends at n+1. It is important to know accurately the transformation between the two systems.

F Implementation of gyrokinetics in particle-in-cell (PIC) codes

F.1 Monte-Carlo evaluation of distribution function moment at grid-points

Suppose that the 6D guiding-center phase-space (X,v) is described by (ψ,𝜃,ϕ,v_∥,α,v_⊥) coordinates, where α is the gyro-angle. Denote the Jacobian of the coordinate system by 𝒥_rv.

Suppose that we sample the 6D phase-space by using a probability function P(ψ,𝜃,ϕ,v_⊥,α,v_∥). (Then the eﬀecitive probability function used in rejection method is P𝒥_r𝒥_v.) Note that we will sample a distribution function δf_g(X,v_⊥,v_∥,α) that happens to be independent of the gyro-angle α. Since δf_g is uniform in α, it is natural to smaple α using a uniform probability function, i.e., P(X,v_⊥,α,v_∥) can be chosen to be independent of α. Then the weight is also independent of α. In terms of sampling δf_g, there is no need to specify α because both f_g and marker distribution g = N_pP are indepedent of α. However, sampling α is still necessary in simulations. From the perspective of programming, it is ready to understand why α needs to be speciﬁed: when computing density at grid-points, we need to compute particle locations from the guiding-center locations which requires us to specify the gyro-angle α for each marker. Speciﬁcally, markers in guiding-center space (X,v_⊥,α,v_∥) with ﬁxed (X,v_⊥,v_∥) but diﬀerent values of α correspond to markers in particle space with diﬀerent locations (and diﬀerent α′). In the PIC method of computing δn or δj_∥ at ﬁxed x, the contribution of each particle marker to the density is aﬀected by the distance of the particle marker to the ﬁxed x. Therefore diﬀerent values of α contributes diﬀerently to δn or δj_∥, and thus the resolution of α matters.

For a marker with coordinators (X,v_⊥,v_∥,α), the corresponding particle position can be calculated by using the inverse guiding-center transformation (29). Then we can deposit the marker weight to grid-points in the same way that we do in conventional PIC simulations. Looping over all markers, we build the particle density at grids.

Compared with conventional PIC methods, where particle positions are directly sampled, what is the beneﬁt of sampling guiding-center positions and then transform them back to the particle positions? More computations are involved since we need to numerically perform the inverse guiding-center transformation. The answer lies in the important fact that the distribution function f_g(X,v_⊥,v_∥,α) that needs to be numerically evaluated in gyrokinetic simulation is actually independent of the gyro-angle α. Furthermore, we use a probability density function P that is independent of α to sample the 6D phase space (X,v_⊥,v_∥,α). Then the marker weight w ≡ f_g∕(N_pP) is independent of α, where N_p is number of markers loaded. This fact enable resolution in particle coordinates can be increased in a trivial way, as described below.

Suppose we have a concrete sampling of f_g in the 6D phase-space (X,v_⊥,v_∥,α), i.e., (X_j,v_⊥j,v_∥j,α_j,w_j) with j = 1,…,N_p, then we can do the inverse guiding-center transform and then deposit particles to the grids to obtain a moment (e.g. density).

Since P𝒥_rv is independent of α, the sampling α_j with j = 0,1,…,N_p are uniform distributed random numbers. Therefore we can generate another sampling of α (denoted by α_j′ with j = 1,…,N_p) using random number generators and combine α_j′ with the old sampling (X_j,v_⊥j,v_∥j) to obtain a new sampling (X_j,v_⊥j,v_∥j,α_j′,w_j′). The values of w at the new sampling points are equal to the original values, i.e., w_j′ = w_j since the particle weight w = f_g∕(N_pP) is independent of α. Using the new sampling and following the same procedures given above, we can estimate values of the moment at grid-points again, which will diﬀer from the estimation obtained using the old sampling because the resulting sampling of x and α′ for f_p is diﬀerent. Taking the average of the two estimations will give a more accurate estimation because the resolution in x and α′ for f_p is increased.

[Note that even if P𝒥_rv is dependent on α due to the possible dependence of 𝒥_rv on α, we still can easily generate a new set of sampling which diﬀers from the old sampling only in α. Speciﬁcally, in the rejection method, we use the old sampling (X_j,v_⊥j,v_∥j) for each reject step and only adjust α to satisfy the acceptance criteria.]

We can also construct a new sampling by replacing α_j with α_j + Δ, where Δ is a constant. Then the new set of sampling (X_j,v_⊥j,v_∥j,α_j + Δ,w_j) with j = 1,…,N_p is still a consistent set of sampling.

In doing the deposition, each marker has a single gyro-angle. Due to the independence of the weight (w = f_g∕(N_pP)) of the gyro-angle, the particle phase space resolution can be increased in a way that there can be several gyro-angles for a single (X_j,v_⊥j,v_∥j). This gives the wrong impression that the gyro-angle of a guiding-center marker is arbitrary. The correct understanding is that given above, i.e., we do several (say 4) separate sampling of the phase space and then average the results.

In the code, the gyro-angle is deﬁned relative to the direction ∇ψ∕|∇ψ| at the guiding-center position. Speciﬁcally, the gyro-angle is deﬁned as the included angle between ∇ψ∕|∇ψ| and −v × e_∥(x)∕Ω(x), where e_∥(x)∕Ω(x) is approximated by the value at the guiding-center location.

Then (X_j,v_⊥j,v_∥j) can be evolved by using the guiding-center motion equation. (It is obvious how to evalute the gyro-averaging of the electromagnetic ﬁelds needed in pusing markers.)

F.2 Monte-Carlo sampling of 6D guiding-center phase-space

Suppose that the 6D guiding-center phase-space (X,v) is described by (ψ,𝜃,ϕ,v_∥,v_⊥,α) coordinates. The Jacobian of the coordinate system is given by 𝒥 = 𝒥_rv_⊥, where 𝒥_r = 𝒥 (ψ,𝜃) is the Jacobian of the coordinates (ψ,𝜃,ϕ). Suppose that we sample the 6D phase-space by using the following probability density function:

[ ] 1 ( m )3∕2 m(v2∥ + v2⊥ ) P(ψ,𝜃,ϕ,v∥,v⊥, α) = Vr 2πT- exp − ---2T----- ,

(347)

where V _r is the volume of the spatial simulation box, T is a constant temperature. P given above is independent of ψ,𝜃,ϕ and α. It is ready to verify that the above P satisﬁes the following normalization condition:

∫ ∫ ∫ ∫ +∞ ∫ ∞∫ 2π PdvdX = Pv⊥dαdv⊥dv ∥𝒥rd ψd𝜃dϕ = 1. Vr Vr −∞ 0 0

(348)

I use the rejection method to numerically generate N_p markers that satisfy the above probability density function. [The eﬀective probability density function actually used in the rejection method is P′, which is related to P by

P′ = |𝒥r𝒥v|P = |𝒥r (ψ,𝜃)|v⊥P

(349)

Note that P′ does not depend on the gyro-angle α.]

Then the weight of a marker sampling δf_g is written as

w = δfg(ψ,𝜃,ϕ,v∥,v⊥). NpP

(350)

Since both f_g and P are independent of the gyro-angle α, w is also independent of α.

The numerical representation of δf_g is written

˜ --1--N∑p δfg = 𝒥rv⊥ wjδ(ψ− ψj)δ(𝜃− 𝜃j)δ(ϕ − ϕj)δ(v∥ − v∥j)δ(v⊥ − v⊥j)δ(α − αj). j

(351)

Although the distribution function δf_g to be sampled is independent of the gyro-angle α, we still need to specify the gyro-angle because we need to use the inverse guiding-center transformation, which needs the gyro-angle. Each marker needs to have a speciﬁc gyro-angle value α_j so that we know how to transform its X_j to x_j and then do the charge deposition in x space.

To increase the resolution over the gyro-angle (the quantity of intest to us, e.g., density, is the integration of the particle distribution function at ﬁxed spatial location, and the distribution at the ﬁxed location is not uniform in the gyro-angle. So the resolution over α matters), we need to load more markers. However, thanks to the fact that both sampling probability density function P and δf_g are independent of α, the resolution over the gyro-angle can be increased in a simple way.

n1(x-)+-n2(x-)+-n3(x-)+-n4(x-) n(x) = 4 .

(352)

This corresponds to sampling the 6D phase-space 4 separate times (each time with identical sampling points in (X,v_∥,v_⊥) but diﬀerent sampling points in α) and then using the averaging of the 4 Monte-Carlo integrals to estimate the exact value. This estimation can also be (roughly) considered as a Monte-Carlo estimation using 4 times larger number of markers as that is originally used (the Monte-Carlo estimation using truly 4 time larger number of markers is more accurate than the result we obtained above because the former also increase the resolution of (X,v_∥,v_⊥) while the latter keeps the resolution of (X,v_∥,v_⊥) unchanged.)

In numerical implementation, we choose N sampling points that are evenly distributed on the gyro-ring (N is usually 4 as a compromise between eﬃciency and accuracy). And the weight is evenly split by the N sub-markers on the gyro-ring. Therefore each sub-marker have a Monte-Carlo weight w_j∕N, where w_j is the weight of the j th marker. Then calculating the integration (358) at a grid corresponds to depositing all the N sub-markers associated with each guiding-center marker to the grid, as is illustrated in Fig. 6. However, interpreting in this way is confusing to me because: why can you split a single sampling into N diﬀerent samplings? I prefer the above interpretation that the 6D phase space is sampled 4 separate times and thus we get 4 estimations and ﬁnally we take the averaging of these 4 estimations. It took a long time for me to ﬁnally ﬁnd this way of understanding.

pict

Fig. 6: The spatial grid in the plane perpendicular to the equilibrium magnetic ﬁeld, a guiding-center marker C, and its gyro-ring with 4 sampling points (sub-markers) on it. The 4 sub-markers are calculated by using the transformation (29) (inverse guiding-center transform). Assume that Monte-Carlo weight of the guiding-center marker C is w_j. Then The Monte-Carlo weight of each sub-marker is w_j∕4. The value of integration (358) at a grid point is approximated by I∕ΔV , where I is the Monte-Carlo integration of all sub-markers (associated with all guiding-center markers) in the cell, ΔV is the cell volume. The cell associated with a grid-point (e.g., A) is indicated by the dashed rectangle (this is for the 2D projection, the cell is 3D and it is a cube). If the Dirac delta function is used as the shape function of the sub-markers, then calculating the contribution of a sub-marker to a grid corresponds to the nearest-point interpolation (for example, the 4 sub-markers will contribute nothing to grid point B since no-sub marker is located within the cell). In practice, the ﬂat-top shape function with its support equal to the cell size is often used, then the depositing corresponds to linearly interpolating the weight of each sub-marker to the nearby grids.

In summary, the phase-space to be sampled in gyrokinetic simulations are still 6D rather than 5D. In this sense, the statement that gyrokinetic simulation works in a 5D phase space is misleading. We are still working in the 6D phase-space. The only subtle thing is that the sixth dimension, i.e., gyro-angle, can be sampled in an easy way that is independent of the other 5 variables.

In numerical implementation, the gyro-angle may not be explicitly used. We just try to ﬁnd 4 arbitrary points on the gyro-ring that are easy to calculate. Some codes (e.g. ORB5) introduces a random variable to rotate these 4 points for diﬀerent markers so that the gyro-angle can be sampled less biased.

From the view of particle simulations, the gyrokinetic model can be considered as a noise reduction method, where the averaging over the gyro-angle is equivalent to a time averaging over a gyro-period, which reduces the ﬂuctuation level (in both time and space) associated with evaluating the Monte-Carlo phase integration. Here the averaging in gyrokinetic particle simulation refers to taking several points on a gyro-ring when depositing markers to spatial grids to obtain the density and current on the grids. (Another gyro-averaging appears in evaluating the guiding-center drift.) In gyrokinetic particle simulation, even a step size smaller than a gyro-period is taken, the quantities used in the model is still the ones averaged over one gyro-period. In this sense, a gyrokinetic simulation is only meaningful when the time step size is larger than one gyro-period. [**Some authors may disagree with that the gyro-averaging is a time-averaging. They may consider the gyro-averaging as the phase-space integration over the gyro-angle coordinate. This view seems to be right in Euler simulations but seems to be wrong in particle simulations. The reason is as follows. For each marker, choose a random gyro-phase and then do the inverse transformation to obtain particle position, and sum over all markers (this corresponds to phase-space Monte-Carlo integration, which include the gyro-angle integration, so no further gyro-angle integration is needed); choose another random gyro-phase and repeat the above procedure (this can be interpreted as do the phase-space Monte-Carlo integration at another time), choose further random gyro-phase for each marker and repeat. Finally averaging all the above values to obtain the ﬁnal estimation of the phase-space integration. This amounts to a time-averaging over a gyro-motion. In summary, sampling several times with diﬀerent gyro-phases for each marker and taking the average amounts to the time averaging over gyro-motion**]

When doing the time-average over the ion cyclotron motion, the time variation of the low-frequency mode is negligible and only the spatial variation of the modes is important. For the gyro-motion, only the gyro-angle is changing and all the other variables, (X,v_⊥,v_∥), are approximately constant. As a result, this time averaging ﬁnally reduces to a gyro-averaging.

I prefer to reason in terms of particle position and velocity, and consider the guiding-center location as an image of the particle position. When working in the guiding-center coordinates, I prefer to reason by transforming back to the particle position. This reasoning is clear and help me avoid some confusions I used to have.

In coding, the initial sampling of the phase space is performed for particles ((v,x) of particles) and then is transformed to guiding-center coordinates.

F.3 Distribution function transform**check

In the above, we assume that X and x are related to each other by the guiding-center transformation (27) or (29) , i.e., x and X are not independent. For some cases, it may be convenient to treat x and X as independent variables and express the guiding-center transformation via an integral of the Dirac delta function. For example,

∫ fp(x,v) = fg(X,v)δ3(X − x+ ρ)dX,

(353)

where x and X are considered as independent variables, ρ is the gyroradius evaluated at x, δ³(x − X − ρ) is the three-dimensional Dirac delta function. [In terms of general coordinates (x₁,x₂,x₃), the three-dimensional Dirac delta function is deﬁned via the 1D Dirac delta function as follows:

3 1 δ (x) = |𝒥-|δ(x1)δ(x2)δ(x3),

(354)

where 𝒥 is the the Jacobian of the general coordinate system. The Jacobian is included in order to make δ³(x) satisfy the normalization condition ∫ δ³(x)dx = ∫ δ³(x)|𝒥|dx₁dx₂dx₃ = 1.]

Expression (353) can be considered as a transformation that transforms an arbitrary function from the guiding-center coordinates to the particle coordinates. Similarly,

∫ fg(X,v) = fp(x,v)δ3(x − X − ρ )dx,

(355)

is a transformation that transforms an arbitrary function from the the particle coordinates to the guiding-center coordinates.

F.4 Moments of distribution function expressed as integration over guiding-center variables

In terms of particle variables (x,v), it is straightforward to calculate the moment of the distribution function. For example, the number density n(x) is given by

∫ n(x) = fp(x,v)dv.

(356)

However, it is a little diﬃcult to calculate n(x) at real space location x by using the guiding-center variables (X,v). This is because holding x constant and changing v, which is required by the integration in Eq. (356), means the guiding-center variable X is changing according to Eq. (27). Using Eq. (34), expression (356) is written as

∫ n(x) = f (X (x,v),v )dv, g

(357)

As is mentioned above, the dv integration in Eq. (357) should be performed by holding x constant and changing v, which means the guiding-center variable X = X(x,v) is changing. This means that, in (X,v) space, the above integration is a (generalized) curve integral along the the curve X(v) = x −ρ(x,v) with x being constant. Treating X and x as independent variables and using the Dirac delta function δ, this curve integral can be written as the following double integration over the independent variables X and v:

∫ ∫ 3 n(x) = fg(X, v)δ (X − x + ρ)dvdX.

(358)

Another perspective of interpreting Eq. (358) is that we are ﬁrst using the transformation (353) to transform f_g to f_p and then integrating f_p in the velocity space.

G Diamagnetic ﬂow check

The perturbed distribution function δF given in Eq. () contains two terms. The ﬁrst term is gyro-phase dependent while the second term is gyro-phase independent. The perpendicular velocity moment of the second term will give rise to the so-called diamagnetic ﬂow. For this case, it is crucial to distinguish between the distribution function in terms of the guiding-center variables, f_g(X,v), and that in terms of the particle variables, f_p(x,v). In terms of these denotations, equation () is written as

δFg = -q(δΦ− ⟨δΦ⟩α)∂F0g + δfg. m ∂𝜀

(359)

Next, consider the perpendicular ﬂow U_⊥ carried by δf_g. This ﬂow is deﬁned by the corresponding distribution function in terms of the particle variables, δf_p, via,

∫ nU ⊥ = v⊥δfp(x,v)dv,

(360)

where n is the number density deﬁned by n = ∫ δf_pdv. Using the relation between the particle distribution function and guiding-center distribution function, equation (360) is written as

∫ nU ⊥ = v⊥δfg(x − ρ,v )dv.

(361)

Using the Taylor expansion near x, δf_g(x −ρ,v) can be approximated as

δfg(x − ρ,v) ≈ δfg(x,v)− ρ ⋅∇δfg(x,v).

(362)

Plugging this expression into Eq. (361), we obtain

∫ ∫ nU ⊥ ≈ v⊥δfg(x,v)dv− v⊥ ρ⋅∇ δfg(x,v)dv

(363)

As mentioned above, δf_g(x,v) is independent of the gyro-angle α. It is obvious that the ﬁrst integration is zero and thus Eq. (363) is reduced to

∫ nU ⊥ = − v⊥ρ ⋅∇δfg(x,v)dv

(364)

Using the deﬁnition ρ = −v × e_∥∕Ω, the above equation is written

∫ v × e∥ nU⊥ = v ⊥--Ω---⋅∇ δfg(x,v)dv ∫ ( e ) = v ⊥ -∥× ∇ δfg(x,v) ⋅v⊥dv. ∫ Ω = v ⊥H ⋅v⊥dv, (365)

where H =

×∇δf_g(x,v), which is independent of the gyro-angle α because both e_∥(x)∕Ω(x) and δf_g(x,v) are independent of α. Next, we try to perform the integration over α in Eq. (365). In terms of velocity space cylindrical coordinates (v_∥,v_⊥,α), v_⊥ is written as

v ⊥ = v⊥ (ˆx cosα+ ˆy sinα ),

(366)

where and are two arbitrary unit vectors perpendicular each other and both perpendicular to B₀(x). H can be written as

H = Hxˆx + Hyˆy,

(367)

where H_x and H_y are independent of α. Using these in Eq. (365), we obtain

∫ nU ⊥ = v⊥(ˆx cosα + ˆysinα)v⊥(Hx cosα+ Hy sin α)dv ∫ = v2⊥[ˆx(Hx cos2α + Hy sinα cosα)+ ˆy(Hx cosαsinα +Hy sin2α)]dv. (368)

Using dv = v_⊥dv_∥dv_⊥dα, the above equation is written as

∫ ∞ ∫ ∞ ∫ 2π nU ⊥ = dv∥ v⊥dv ⊥ v2⊥ [ˆx(Hx cos2 α+ Hy sin αcosα)+ ˆy(Hx cosαsinα + Hysin2α)]dα ∫− ∞ ∫0 ∫0 ∞ ∞ 2π 2 2 2 = − ∞ dv∥ 0 v⊥dv ⊥ 0 v⊥ (ˆxHx cos α + ˆyHy sin α)dα ∫ ∞ ∫ ∞ = dv∥ v⊥dv ⊥[v2⊥ (xˆHx π + ˆyHyπ)] ∫−∞∞ ∫0∞ = dv∥ v⊥dv ⊥[v2⊥H π ] ∫− ∞ ∫0 ∞ ∞ 2 e∥ = − ∞ dv∥ 0 v⊥dv ⊥[v⊥ Ω × ∇ δfg(x,v )π ] e∥ ∫ ∞ ∫ ∞ v2 = --× ∇ dv∥ v⊥dv⊥ δfg(x,v)-⊥-2π Ωe −∞ 0 2 = -∥--×∇ δp⊥, (369) mΩ

where

∫ ∞ ∫ ∞ δp ≡ dv v dv δf (x,v)mv2⊥ 2π ⊥ − ∞ ∥ 0 ⊥ ⊥ g 2 ∫ mv2⊥ = δfg(x,v)--2-dv, (370)

is the perpendicular pressure carried by δf_g(x,v). The ﬂow given by Eq. (369) is called the diamagnetic ﬂow.

H Transform gyrokinetic equation from (X,μ,𝜀,α) to (X,μ,v_∥,α) coordinates

The gyrokinetic equation given above is written in terms of variables (X,μ,𝜀,α). Next, we transform it into coordinates (X′,μ′,v_∥,α′) which are deﬁned by

( ′ ||{ X′(X,μ,𝜀,α) = X μ′(X, μ,𝜀,α) = μ ||( α (X, μ,𝜀,α) = α ∘------------- v∥(X, μ,𝜀,α) = ± 2(𝜀 − μB0 (X))

(371)

Use this deﬁnition and the chain rule, we obtain

| -∂-|| = ∂X-′⋅-∂--+ ∂μ-′∂--+ ∂v∥ ∂--+ ∂α-′∂-- ∂X |μ,𝜀,α ∂X ∂X ′ ∂X ∂μ′ ∂X ∂v∥ ∂X ∂α′ ∂ μ ∂B0 ∂ = ∂X-′ − v-∂X-∂v-, (372) ∥ ∥

and

∂ || ∂X′ ∂ ∂μ′ ∂ ∂v∥ ∂ ∂ α′ ∂ ∂𝜀|| = ∂𝜀-∂X-′ + ∂𝜀-∂μ′ +-∂𝜀 ∂v-+ -∂𝜀 ∂α′ X,μ,α ∥ = 1--∂-. (373) v∥∂v∥

Then, in terms of (X′,μ′,v_∥,α′), equation (136) is written

[ ] ∂-+ (ve + VD + δVD )⋅-∂-- δG0 − (ve + VD + δVD )⋅-μ∂B0-∂δG0- ∂t ∥∥ ∂X ′ ∥ ∥ v∥ ∂X ∂v∥ ( ∂F0 μ ∂B0 ∂F0) q ∂⟨δL ⟩α ∂F0 1 = − δVD ⋅ ∂X-′ − v-∂X-∂v-- − m- --∂t--∂v--v-, (374) ∥ ∥ ∥ ∥

Dropping terms of order higher than O(λ²), equation (374) is written as

[ ] ∂- + (v e + V + δV )⋅-∂-- δG − e ⋅μ∂B0-∂δG0- ∂t ∥ ∥ D D ∂X ′ 0 ∥ ∂X ∂v∥ (∂F0 ) ( ∂B0 q ∂⟨δL⟩α) ∂F0 1 = − δVD ⋅ ∂X-′ + δVD ⋅μ ∂X--− m---∂t-- ∂v--v-, (375) ∥ ∥

The above equation drops all terms higher than O(λ²) and as a result the coeﬃcient before ∂δf∕∂v_∥ contains only the mirror force, i.e., −e_∥⋅ μ∇B₀, which is independent of any perturbations.

Note that the gyro-averaging operator in (X′,μ′,v_∥,α′) coordinates is identical to that in the old coordinates since the perpendicular velocity variable μ is identical between the two coordinate systems. Also note that the perturbed guiding-center velocity δV_D is given by

δV = e∥-×∇X-⟨δϕ⟩α + v ⟨δB-⊥⟩α, D B0 ∥ B0

(376)

where ∂∕∂X (rather than ∂∕∂X′) is used. Since δϕ(x) = δϕ_g(X,μ′,α′), which is independent of v_∥, then Eq. (372) indicates that ∂δϕ∕∂X = ∂δϕ∕∂X′.

Following the same procedures, equation (144) in terms of (X′,μ′,v_∥,α′) is written as

[ ] ∂-+ (ve + V + δV )⋅-∂-- δf − e ⋅μ∇B ∂δf- ∂t ∥∥ D D ∂X ′ ∥ 0∂v∥ ( ∂F0) ∂F0 1 = − δVD ⋅ ∂X-′ + δVD ⋅μ ∇B0 ∂v-v- [ ( ∥ ∥ ) ] − q- − ∂⟨v⋅δA-⟩α− v∥e∥ + VD + v∥⟨δB⊥-⟩α ⋅∇X ⟨δϕ⟩α ∂F0-1. (377) m ∂t B0 ∂v∥v∥

next, try to recover the equation in Mishchenk’s paper:

[ ∂ ∂ ] ∂δf ∂t + (v∥e∥ + VD + δVD )⋅∂X-′ δf − e∥ ⋅μ∇B0 ∂v∥ ( ) = − δVD ⋅ -∂F0 + δVD ⋅μ∇B0 ∂F0-1- [ ∂X ′ ( ∂v ∥v∥) ] q- ∂⟨δA-∥⟩α- VD-- ⟨δB-⊥⟩α- ∂F0- − m − ∂t − e∥ + v∥ + B0 ⋅∇X ⟨δϕ⟩α ∂v ∥.

δL = δΦ − v ⋅ δA,

δV q e∥ ---D = −----∇X ⟨δΦ− v∥A ∥⟩α × --. v∥ mv ∥ Ω

(378)

−

[ ∂⟨δA ∥⟩α ( VD q e∥) ]
− --∂t---− e∥ +-v--+ m-∇X ⟨A ∥⟩α × Ω- ⋅∇X ⟨δϕ⟩α
∥

−

[ ∂⟨δA (h)⟩ ( V q e ) ]
− ----∥---α− --D-+ --∇X ⟨A ∥⟩α × -∥ ⋅∇X ⟨δϕ⟩α
∂t v∥ m Ω

H.1 Recover equation in W. Deng’s 2011 NF paper

The guiding-center velocity in the equilibrium ﬁeld is given by

ve + V = -B⋆0-v + --μ--B × ∇B + --1---E × B ∥ ∥ D B ⋆∥0 ∥ ΩB ⋆∥0 0 0 B0B ⋆∥0 0 0

(379)

where

v∥ B⋆0 = B0 + B0--∇ × b, Ω

(380)

( ) B ⋆≡ b ⋅B⋆ = B 1+ v∥b ⋅∇ × b , ∥ Ω

(381)

Using B_∥0^⋆ ≈ B₀, then expression (379) is written as

v2∥ μ 1 v∥e∥ + VD = v∥b + Ω-∇ × b + ΩB0-B0 × ∇B0 + B2-E0 × B0, cu◟rva◝t◜ured◞rift ◟-----◝◜----◞ ◟-0-◝◜---◞ ∇B drift E×B drift

(382)

where the curvature drift, ∇B drift, and E₀ × B₀ drift can be identiﬁed. Note that the perturbed guiding-center velocity δV_D is given by (refer to Sec. C.3)

δV = e∥-×∇X-⟨δϕ⟩α + v ⟨δB-⊥⟩α. D B0 ∥ B0

(383)

Using the above results, equation (377) is written as

[ ] ∂- -∂-- ∂δf- ∂t +(v∥e∥ + VD + δVD )⋅∂X ′ δf − e∥ ⋅μ ∇B0 ∂v∥ ( ) (e ×∇ ⟨δϕ⟩ ) ( ) = − δVD ⋅ ∂F0′ + -∥----X----α + v∥⟨δB-⊥⟩α- ⋅ μ-∇B0 ∂F0- [ ∂X ( B0 B0 v∥ ∂v∥ ) ] q ∂⟨v ⋅δA ⟩α v2∥ μ 1 ⟨δB ⊥⟩α ∂F0 1 − m- − ---∂t----− v∥e∥ + Ω-∇ × b+ ΩB0- B0 × ∇B0 + B2E0 × B0 + v∥--B0--- ⋅∇X ⟨δϕ ⟩α ∂v∥(3v8∥4), 0

Collecting coeﬃcients before ∂F₀∕∂v_∥, we ﬁnd that the two terms involving ∇B₀ (terms in blue and red) cancel each other, yielding

[ ] ∂- +(v∥e∥ + VD + δVD )⋅-∂-- δf − e∥ ⋅μ ∇B0 ∂δf ∂t ( ) ∂X ′ ∂v∥ ∂F0- = − δVD ⋅ ∂X [ ( v2 ) ] + q- m-v∥⟨δB⊥-⟩α-⋅(μ∇B0 )+ ∂⟨v-⋅δA-⟩α + v∥b+ ∥-∇ ×b + -12E0 × B0 + v∥⟨δB-⊥⟩α ⋅∇X ⟨δϕ ⟩α ∂F0(3185,) m q B0 ∂t Ω B 0 B0 ∂v∥ v∥

This equation agrees with Eq. (8) in I. Holod’s 2009 pop paper (gyro-averaging is wrongly omitted in that paper) and W. Deng’s 2011 NF paper.

I Drift-kinetic limit

In the drift-kinetic limit, ⟨v_⊥⋅δE⟩_α = 0, ⟨δB_∥v_⊥⟩_α = 0, and ⟨δh⟩_α = δh, where δh is an arbitrary ﬁeld quantity. Using these, gyrokinetic equation (336) is written as

[ ( ) ] ∂- δB-⊥- ∂t + v∥e∥ + vD + vE +v∥ B0 ⋅∇X δf ( δB⊥ ) q [( δB⊥ ) ] ∂F0 = − vE + v∥-B-- ⋅∇XF0 − m- v∥e∥ + vD + v∥-B- ⋅δE -∂𝜀-. (386) 0 0

I.1 Linear case

Neglecting the nonlinear terms, drift-kinetic equation (386) is written

[ ] ∂- ∂t + (v∥e∥ + vD) ⋅∇X δf ( δB ⊥) q ∂F0 = − vE +v∥-B-- ⋅∇XF0 − m-[(v∥e∥ + vD )⋅δE]∂-𝜀 . (387) 0

Next let us derive the parallel momentum equation from the linear drift kinetic equation (this is needed in my simulation). Multiplying the linear drift kinetic equation (387) by qv_∥ and then integrating over velocity space, we obtain

∂δj∥ ∫ ∂t = − q dvv∥(v∥e∥ + vD )⋅∇X δf ∫ ( δB ⊥ ) q ∫ ∂F0 − q dvv∥ vE + v∥ B--- ⋅∇XF0 − m-q dvv∥[(v∥e∥ + vD)⋅δE ]∂𝜀-. (388) 0

Equation (388) involve ∇_Xδf and this should be avoided in particle methods whose goal is to avoid directly evaluating the derivatives of δf over phase-space coordinates. On the other hand, the partial derivatives of velocity moment of δf are allowed. Therefore, we would like to make the velocity integration of δf appear. Note that ∇_Xδf here is taken by holding (𝜀,μ) constant and thus v_∥ is not a constant and thus can not be moved inside ∇_X. Next, to facilitate performing the integration over v_∥, we transform the linear drift kinetic equation (387) into variable (X,μ,v_∥).

I.2 Transform from (X,μ,𝜀) to (X,μ,v_∥) coordinates

The kinetic equation given above is written in terms of variable (X,μ,𝜀). Next, we transform it into coordinates (X′,μ′,v_∥) which is deﬁned by

X′(X,μ,𝜀) = X,

(389)

μ′(X,μ,𝜀) = μ,

(390)

and

∘ ------------- v∥(X, μ,𝜀) = 2(𝜀 − μB0(X )).

(391)

Use this, we have

∂δG0- ∂X′∂-δG0 ∂μ′∂δG0- ∂v∥∂-δG0- ∂X |μ,𝜀 = ∂X ∂X ′ + ∂X ∂μ′ + ∂X ∂v∥ ∂δG ∂ δG μ ∂B ∂δG = ---0′ |μ,v∥ + 0-′0− ----0----0, (392) ∂X ∂μ v∥dX ∂v∥

and

∂F ∂F ∂ μ′ ∂F ∂v --0-= --0′--- + --0---∥ ∂𝜀 ∂μ ∂𝜀 ∂v∥ ∂𝜀 = 0∂F0-+ ∂F0∂v∥ ∂μ′ ∂v∥ ∂𝜀 ∂F0 1 = ∂v∥v∥ (393)

Then, in terms of variable (X′,μ,v_∥), equation (387) is written

∂δf+ (v∥e∥ + vD )⋅∇δf − e∥ ⋅μ∇B ∂-δf ∂t ∂v∥ ( δB ⊥) ( vE δB⊥ ) ∂F0 q[( vD ) ] ∂F0 = − vE + v∥-B0 ⋅∇F0 + v∥-+ -B0- ⋅μ∇B ∂v∥-− m- e∥ +-v∥ ⋅δE ∂v∥,(394)

where ∇≡ ∂∕∂X′|_{μ,v_∥}.

I.3 Parallel momentum equation

Multiplying the linear drift kinetic equation (394) by qv_∥ and then integrating over velocity space, we obtain

∫ ∫ ∂δj∥+ q dvv∥(v∥e∥ + vD )⋅∇X δf − q dvv∥e∥ ⋅μ∇B ∂δf- ∂t ∫ ( ) ∫ ( ∂v∥ ) δB-⊥ vE- δB⊥- ∂F0- = − q dvv∥ vE + v∥ B0 ⋅∇XF0 + q dvv∥ v∥ + B0 ⋅μ∇B ∂v∥ q ∫ [( v ) ] ∂F −--q dvv∥ e∥ + -D- ⋅δE ---0. (395) m v∥ ∂v ∥

Consider the simple case that F₀ does not carry current, i.e., F₀(X,μ,v_∥) is an even function about v_∥. Then it is obvious that the integration of the terms in red in Eq. (395) are all zero. Among the rest terms, only the following term

q ∫ ∂F 1 − --q dvv∥[(v∥e∥)⋅δE ]--0-- m ∂v∥v∥

(396)

explicitly depends on δE. Using dv = 2πBdv_∥dμ, the integration in the above expression can be analytically performed, giving

-q ∫ ∂F0-1 −m q dvv∥[(v∥e∥)⋅δE]∂v∥v∥ 2 ∫ = − q 2πBdv∥dμv∥δE ∥∂F0- m ∫ ∫ ∂v∥ q2 ∂F0- = −m 2πBd μδE∥ v∥ ∂v∥dv∥ q2 ∫ ( ∫ ) = −-- 2πBd μδE∥ 0− F0dv∥ m = q2δE n . (397) m ∥ 0

Using these results, the parallel momentum equation (395) is written

∫ ∫ ∂δj∥ = q2δE n − q dvv (ve + v )⋅∇ δf + q dvv e ⋅μ ∇B ∂δf- ∂t m ∥ 0 ∥ ∥∥ D X ∥ ∥ ∂v∥ ∫ ( δB ⊥) ∫ (δB ⊥) ∂F0 − q dvv∥ v∥-B-- ⋅∇F0 + q dvv∥ -B-- ⋅μ∇B ∂v-, (398) 0 0 ∥

where the explicit dependence on δE is via the ﬁrst term q²n₀δE_∥∕m, with all the the other terms being explicitly independent of δE (δf and δB implicitly depend on δE).

Equation (398) involve derivatives of δf with respect to space and v_∥ and these should be avoided in the particle method whose goal is to avoid directly evaluating these derivatives. Using integration by parts, the terms involving ∂∕∂v_∥ can be simpliﬁed, yielding

∂δj∥ q2 ∫ ∫ ----= --δE ∥n0 − q dvv ∥(v∥e∥ + vD )⋅∇X δf − q(e∥ ⋅∇B0 ) μδf dv ∂t m ∫ ( ) ( ) ∫ − q dvv∥ v∥δB⊥- ⋅∇F0 − q δB-⊥- ⋅(∇B0 ) μF0dv, (399) B0 B0

Deﬁne p_⊥0 = ∫ mv_⊥²F₀∕2dv and δp_⊥ = ∫ mv_⊥²δf∕2dv, then the above equation is written

∂ δj∥ q2 ∫ δp⊥ -∂t- = m-δE ∥n0 − q dvv∥(v∥e∥ + vD) ⋅∇X δf − q(e∥ ⋅∇B0 )mB ∫ ( ) ( ) 0 − q dvv∥ v∥δB⊥- ⋅∇F0 − q δB-⊥ ⋅(∇B0 ) p⊥0-, (400) B0 B0 mB0

Next, we try to eliminate the spatial gradient of δf by changing the order of integration. The second term on the right-hand side of Eq. (400) is written

∫ − q dvv ∥(v∥e∥)⋅∇X δf, ∫ = − q 2πB dv dμv2e ⋅∇ δf 0 ∥ ∥ ∥ X ∫ 2 = − q2πB0e∥ ⋅∇X v∥δfdv∥dμ ( 1 ∫ ) = − qB0e∥ ⋅∇X ---- mv2∥δfdv ( mB0 ) = − qB0e∥ ⋅∇X δp∥- , (401) mB0

where δp_∥ = ∫ mv_∥²δfdv. Similarly, the term −q ∫ dvv_∥ ( )
v∥δBB⊥0-

⋅∇_XF₀ is written as

∫ ( ) − q dvv v δB-⊥- ⋅∇ F ∥ ∥ B0 X 0 ∫ ( 2δB⊥-) = − q 2πB0dv∥dμ v∥ B0 ⋅∇XF0 (δB ) ∫ = − q ---⊥ ⋅B0∇X (v2∥F02πdv∥dμ) ( B0 ) [ ∫ ] = − q δB-⊥ ⋅B ∇ 1- (mv2F dv) B0 0 X B ∥ 0 (-p∥0 ) = − qδB ⊥ ⋅∇X mB0 (402)

where p_∥0 = ∫ mv_∥²F₀dv. Similarly, the term −q ∫ dvv_∥v_D ⋅∇_Xδf can be written as the gradient of moments of δf. Let us work on this. The drift v_D is given by

B0vΩ∥∇-×-b- -μ-- vD = B ⋆ v∥ + ΩB ⋆B0 × ∇B0. ∥ ∥

(403)

where B_∥^⋆ = B₀ ( v∥- )
1 + Ωb ⋅∇ × b (refer to my another notes). Using b ⋅∇× b ≈ 0, we obtain B_∥^⋆ ≈ B. Then v_D is written

v_D =

∇× b +

b ×∇B₀.

Using this and dv = 2πB₀dv_∥dμ, the term −q ∫ dvv_∥v_D ⋅∇_Xδf is written as

( ) ∫ ∫ v2∥ μ − q dvv∥vD ⋅∇X δf = − q 2πB0dv ∥dμv∥ -Ω∇ × b + Ω-b ×∇B0 ⋅∇X δf ∫ ∫ = − q2πB -1(∇ × b)⋅∇ v3δf dvdμ − q2πB -1(b ×∇B )⋅∇ v μδfdv dμ 0Ω X ∥ ∥ 0Ω 0 X ∥ ∥ 1- ( -1-∫ 3 ) -1 (-1-∫ ) = − qB0Ω (∇ × b)⋅∇X B0 v∥δfdv − qB0Ω (b× ∇B0 )⋅∇X B0 v∥μδfdv , ( 1 ∫ ) ( 1 ∫ ) = − m(∇ × b)⋅∇X --- v3∥δfdv − m (b× ∇B0 )⋅∇X --- v∥μδfdv , (404) B0 B0

which are the third order moments of δf and may be neglect-able (a guess, not veriﬁed). Using the above results, the linear parallel momentum equation is ﬁnally written

( ) ∂δj∥-= e2ne0δE +eB b⋅∇ -δp∥ + e(b⋅∇B )-δp⊥ ∂t m ∥ 0 X mB0 0 mB0 ( p∥0 ) (δB ⊥) p⊥0 +eδB ⊥ ⋅∇X mB0- +e -B0- ⋅(∇B0 )mB0-. ( ∫ ) ( ∫ ) − m (∇ × b)⋅∇X -1- v3∥δfdv − m (b × ∇B0 )⋅∇X -1- v∥μδfdv (405) B0 B0

Deﬁne

( p ) ∇B p D0 = ∇ --∥0 + ---0 -⊥0-, mB0 B0 mB0

(406)

which, for the isotropic case (p_∥0 = p_⊥0 = p₀), is simpliﬁed to

∇p D0 = ---0. mB0

(407)

then Eq. (405) is written as

∂δj e2n ---∥= ---0δE∥ + eδB ⊥ ⋅D0 ∂t m ( ) + eB0b ⋅∇X -δp∥ + e(b⋅∇B0 )-δp⊥ mB0( ∫ ) mB0 ( ∫ ) -1- 3 -1- − m (∇ × b)⋅∇X B0 v∥δfdv − m (b × ∇B0 )⋅∇X B0 v∥μδfdv . (408)

I.4 Special case in uniform magnetic ﬁeld

In the case of uniform magnetic ﬁeld, the parallel momentum equation (405) is written as

∂δj∥ = q-qE n − qe ⋅∇ (δp )− qδB⊥-⋅∇ p . ∂t m ∥ e0 ∥ X ∥ B0 X ∥0

(409)

I.5 Electron perpendicular ﬂow

Using the gyrokinetic theory and taking the drift-kinetic limit, the perturbed perpendicular electron ﬂow, δV_e⊥, is written (see Sec. G or Appendix in Yang Chen’s paper[2])

ne0δVe ⊥ = ne0δE ×b − -1-b × ∇ δp⊥e B◟0-◝◜---◞ e◟B0--◝◜-----◞ E×B ﬂow diamagneticﬂow

(410)

where n_e0 is the equilibrium electron number density, δp_e⊥ is the perturbed perpendicular pressure of electrons.

Drift kinetic equation

Drift kinetic equation is written

( ) ( ) ∂f-+ vb˜+ vD + δE-×-b ⋅∇f + − e-δE − μb˜⋅∇B ∂f- = 0, ∂t ∥ B0 m ∥ ∂v∥

(411)

where f = f(x,μ,v_∥,t), μ = mv_⊥²∕B₀ is the magnetic moment, = b + δB_⊥∕B₀, b = B₀∕B₀ is the unit vector along the equilibrium magnetic ﬁeld, v_D = v_D(x,μ,v_∥) is the guiding-center drift in the equilibrium magnetic ﬁeld. δE and δB are the perturbed electric ﬁeld and magnetic ﬁeld, respectively.

Parallel momentum equation

Multiplying the drift kinetic equation () by v_∥ and then integrating over velocity space, we obtain

∫ ∂f v ∫ ( ) ∫ ( ) --e-∥dv + v∥ v∥˜b + vD + δE-×-b ⋅∇fedv + v∥ − e-δE∥ − μ ˜b⋅∇B ∂fedv = 0, ∂t B0 m ∂v∥

(412)

which can be written as

∫ ( ) ∫ ∂J∥e- ˜ δE-×-e∥ ( e- ˜ ) ∂fe ∂t + v∥ v∥b + vD + B0 ⋅∇fedv + v∥ − m δE ∥ − μb ⋅∇B ∂v∥dv = 0,

(413)

Using dv = B⁻¹2πmdv_∥dμ, the last term on the RHS of the above equation is written

∫ ∫ ( ) v∥ − e-δE ∥ − μ˜b ⋅∇B ∂fedv m ∂v∥ ∫ ∫ ( e- ˜ ) ∂fe B- = v∥ − m δE ∥ − μb ⋅∇B ∂v∥2πm dv∥dμ e B ∫ ∫ ∂f B ∫ ∫ ∂f = − --δE∥2π-- v∥--edv∥dμ − (˜b ⋅∇B )2π- μ v∥--edv∥dμ m m ∫ ( ∂v∥ ∫ ) m ∂v∫∥ ( ∫ ) = − e-δE 2πB- v f |+∞ − fdv dμ− (˜b ⋅∇B )2πB- μ v f |+∞ − fdv dμ m ∥ m ∥ e−∞ e ∥ m ∥ e− ∞ e ∥ -e B-∫ ∫ ˜ B-∫ ∫ = m δE ∥2πm fedv∥dμ + (b ⋅∇B )2πm μ fedv∥dμ e ∫ ∫ = m-δE ∥ne + μ(˜b ⋅∇B )fedv (414) e ≈ m-δE ∥ne

∫ ∫ v∥(v∥˜b)⋅∇fedv ∫ ∫ = ˜b ⋅∇(v2f )dv ∥ e ∫ ∫ ˜ 2 B- = b ⋅∇(v∥fe)2π m dv∥dμ ( ∫ ∫ 1 ) = ˜b⋅∇ v2∥fe2π--dv∥dμ B0 ( ) m = ˜b⋅∇ p∥- B0 ( B0 ) ˜ p∥0 +-δp∥ = b⋅∇ B0 B0 ( p∥0) (δp∥) = ˜b⋅∇ --- B0 + ˜b ⋅∇ --- B0 ( B0 ) B0( ) ( ) ≈ b⋅∇ p∥0 B0 + δB-⊥ ⋅∇ p∥0 B0 + b ⋅∇ δp∥- B0 B0 B0 B0 B0 δB⊥- ≈ b⋅∇ (p∥0)+ B0 ⋅∇ (p∥0)+ b ⋅∇(δp∥) (415) δB⊥- = B0 ⋅∇ (p∥0)+ b ⋅∇ (δp∥)

where use has been made of b ⋅∇p_∥0 = 0.

∂-δJe∥ = − e-δE n − δB-⊥ ⋅∇(p )− b⋅∇ (δp ) ∂t m ∥ e B0 ∥0 ∥

(416)

Using Eq. () in Eq. (), we obtain

[ ] e- δB-⊥ μ0em δE∥ne + b ⋅∇ × ∇ × δE = − μ0e B0 ⋅∇(pe∥0) +b ⋅∇ (δpe∥)

(417)

————–

( ) δB⊥- -q − b⋅∇ × ∇ × δE = − μ0e − qB0 ⋅∇Xp ∥0 + m qE∥ne0 − qe∥ ⋅∇X (δp∥) .

(418)

ddddd

∫ ∫ ( ) v v ˜b+ v + δE-×-b ⋅∇f dv ∥ ∥ D B0 e ∫ ∫ ( ˜ δE-×-b) = v∥b+ vD + B0 ⋅∇ (v∥fe)dv ∫ ∫ ∫ ∫ ∫ ∫ δE × b = v∥˜b ⋅∇(v∥fe)dv + vD ⋅∇ (v∥fe)dv+ ------ ⋅∇ (v∥fe)dv ∫ ∫ ∫ ∫ ∫ ∫ B0 = ˜b ⋅∇(v2∥fe)dv+ vD ⋅∇ (v∥fe)dv + δE-×-b ⋅∇(v∥fe)dv ∫ ∫ ∫ ∫ B0 ∫ ∫ = ˜b ⋅∇(v2fe)2πB-dv dμ+ vD ⋅∇ (v fe)2πB-dv dμ + δE-×-b ⋅∇(v fe)2π B-dvdμ ∫ ( ∥ m ∥) ∫ ∫ ∥ m ∥ ∫ ∫ B0 ∥ m ∥ ˜ 2 1- B- δE-×-b B- = b ⋅∇ v∥fe2πm dv∥dμ B + vD ⋅∇(v∥fe)2π m dv∥dμ + B0 ⋅∇ (v∥fe)2πm dv∥dμ ( p∥) ∫ ∫ B ∫ ∫ δE × b B = ˜b⋅∇ B- B + vD ⋅∇ (v∥fe)2πm-dv∥dμ + --B--- ⋅∇(v∥fe)2π m-dv∥dμ ( p ) ∫ ∫ 0 ∫ ∫ = ˜b⋅∇ -∥ B + -1-b × (μ∇B )⋅∇ (v∥fe)2π B-dv∥dμ+ -1-b × (mv2∥κ)⋅∇ (v∥fe)2πB-dv∥dμ ∫ ∫ B mΩ m mΩ m + δE-×-b ⋅∇(v∥fe)2π B-dv∥dμ B0 m

J Derivation of Eq. (123), not ﬁnished

From the deﬁnition of μ, we obtain

∂μ v2 ∂B 1 ∂v2 μ ∂B ∂v v ---= − -⊥2---0-+ ------⊥-= −------0+ ---⊥ ⋅-⊥- ∂x 2B0 ∂x 2B0 ∂x B0 ∂x ∂x B0

(419)

Using

∂v ∂[v − ve ] ∂e ∂v --⊥-= ------∥∥--= − v∥-∥ − --∥e∥, ∂x ∂x ∂x ∂x

(420)

expression (419) is written as

∂μ- μ-∂B0- ∂e-∥ v⊥- ∂x = − B0 ∂x − v∥ ∂x ⋅B0 ,

(421)

which agrees with Eq. (10) in Frieman-Chen’s paper[3].

∫ -1- 2π dαv⋅ ∂μ- 2π 0 ∂x 1 ∫ 2π ( μ ∂B0 ∂e∥ v⊥ ) = 2π- dαv⋅ − B--∂x-− v∥-∂x ⋅B-- 0 0 0 =?0

∫ [ ] 1-- 2π -∂-(e∥) ∂δG0- 2π 0 dαv ⋅ v × ∂x Ω ⋅ ∂X =

Using the fact that δG₀ is independent of α, the left-hand side of Eq. (123) is written as

⟨v⋅[∫λB2π1 + λB2{ ]δ[G0⟩α ( )] } = -1- dαv ⋅ v × ∂-- e∥ ⋅ ∂δG0-+ ∂μ-∂δG0-+ ∂α-∂δG0 2π 0 ∂x Ω ∂X ∂x ∂μ ∂x ∂α 1 ∫ 2π { [ ∂ ( e∥)] ∂δG0 ∂μ ∂δG0} = 2π- dαv ⋅ v × ∂x- Ω- ⋅-∂X--+ ∂x--∂μ-- 0

K Modern view of gyrokinetic equation

Frimain-Chen equation was obtained by ﬁrst transforming to the guiding-center coordinates and then using the classical perturbation expansion for the distribution function (to separate gyro-phase independent part form the gyro-phase dependent one). Note that the guiding-center transform does not involve the perturbed ﬁeld.

The modern form of the nonlinear gyrokinetic equation is in the total-f form. This way of deriving the gyrokinetic equation is to use coordinate transforms to eliminate gyro-phase dependence of the total distribution function (rather than splitting the distribution function itself) and thus obtain an equation for the resulting gyro-phase independent distribution function (called gyro-center distribution function). The coordinate transform involves the perturbed ﬁelds besides the equilibrium ﬁeld.

The resulting equation for the gyro-center distribution function is given by (see Baojian’s paper)

( ∂ ∂ ) ∂t +X˙ ⋅∇ + v˙∥∂v- f(X, v∥,μ,t) = 0, ∥

(422)

where

e ⟨δB ⟩ ˙R = VD + --∥× ⟨∇ δΦ⟩α + v∥--⊥-α- B0 B0

(423)

1-B-⋆ q-∂⟨δA∥⟩α ˙v∥ = − m B∥⋆⋅(q∇⟨δΦ⟩+ μ∇B0 )− m ∂t .

(424)

Here the independent variables are gyro-center position X, magnetic moment μ and parallel velocity v_∥.

The gyro-phase dependence of the particle distribution can be recovered by the inverse transformation of the transformation used before. The pull-back transformation (inverse gyro-center transform) gives rise to the polarization density term. (phase-space-Lagrangian Lie perturbation method (Littlejohn, 1982a, 1983), I need to read these two papers.).

The learning curve of modern Lie tansform gyrokinetic is steep: the name Lie transform itself is already very scaring.

To derive the Frieman-Chen equation, you just need some patience.

L About this document

These notes were initially written when I visited University of Colorado at Boulder (Sept.-Nov. 2016), where I worked with Dr. Yang Chen, who said that most gyrokinetic simulations essentially employ Frieman-Chen’s nonlinear gyrokinetic equation. Therefore a careful re-derivation of the equation to get familiar to the gyrokinetic orderings and physics included in the model is highly desirable, which motivates me to write this note.

This document was written by using TeXmacs[5], a structured wysiwyw (what you see is what you want) editor for scientists. The HTML version of this document is generated by ﬁrst converting the TeXmacs ﬁle to TeX format and then using htlatex to convert the TeX ﬁle to HTML format.

References

[1] P J Catto. Linearized gyro-kinetics. Plasma Physics, 20(7):719, 1978.

[2] Yang Chen and Scott E. Parker. Particle-in-cell simulation with vlasov ions and drift kinetic electrons. Phys. Plasmas, 16:52305, 2009.

[3] E. A. Frieman and Liu Chen. Nonlinear gyrokinetic equations for low-frequency electromagnetic waves in general plasma equilibria. Physics of Fluids, 25(3):502–508, 1982.

[4] A. Mishchenko, A. Bottino, A. Biancalani, R. Hatzky, T. Hayward-Schneider, N. Ohana, E. Lanti, S. Brunner, L. Villard, M. Borchardt, R. Kleiber, and A. Könies. Pullback scheme implementation in orb5. Computer Physics Communications, 238:194–202, 2019.

[5] Joris van der Hoeven. Gnu texmacs, a structured wysiwyw (what you see is what you want) editor for scientists. http://www.texmacs.org/, 2007. [Online].