Nakayama’s lemma is a powerful tool in algebraic geometry. It plays a role analogous to that of the inverse function theorem in differential geometry.

In this post, we will consider various versions of Nakayama’s lemma and some applications.

## Nakayama’s Lemma

We start with a standard technique due to Atiyah and Macdonald (Atiyah and Macdonald 1969).

**Lemma 1 (Generalized Cayley–Hamilton Theorem)** Let \(A\) be a commutative ring, \(M\) a finitely generated \(A\)-module, \(\mf{a}\) an ideal of \(A\), and \(\phi\) an element of \(\mathrm{End}_A(M)\) such that \(\phi(M)\subseteq\mf{a}M\). Then there exist an integer \(n\geq 1\) and elements \(r_k\in \mf{a}^k\) such that \[\phi^n+r_1\phi^{n-1}+\cdots+r_{n-1}\phi+r_n=0.\]
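To see what the statement of Lemma 1 says in a small case, here is a worked check. The choices \(A=\mathbb{Z}\), \(\mf{a}=(2)\), \(M=\mathbb{Z}^2\) and the matrix below are illustrative assumptions, not taken from the text.

```latex
% A = \mathbb{Z}, \mf{a} = (2), M = \mathbb{Z}^2; all entries of \phi are even, so \phi(M) \subseteq 2M.
\[
\phi=\begin{pmatrix}2&4\\6&8\end{pmatrix},\qquad
\phi^2=\begin{pmatrix}28&40\\60&88\end{pmatrix},
\]
\[
\phi^2-10\,\phi-8\cdot\mathrm{id}
=\begin{pmatrix}28-20-8&40-40\\60-60&88-80-8\end{pmatrix}=0 .
\]
% Here n=2, r_1=-10\in(2) and r_2=-8\in(2)^2=(4), as the lemma predicts.
```

In this example \(r_1\) is minus the trace and \(r_2\) is the determinant of the matrix, which is the pattern the determinant expansion in the proof produces in general.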

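*Proof.* Let \(x_1\), \(x_2\), \(\dots\), \(x_n\) be generators of \(M\). Since \(\phi(M)\subseteq\mf{a}M\), we can write
\[
\phi(x_i)=\sum_{j=1}^n\alpha_{ij}x_j
\]
with \(\alpha_{ij}\in \mf{a}\), so that the endomorphism \(\phi\) is represented (not necessarily uniquely) by the matrix \(\left(\alpha_{ij}\right)\).
Denote by \(\delta_{ij}\) the Kronecker delta. Regarding \(M\) as a module over the commutative ring \(A[\phi]\subseteq\mathrm{End}_A(M)\), the above equality is equivalent to
\[\sum_{j=1}^n(\delta_{ij}\phi-\alpha_{ij})\cdot x_j=0.\]

Multiplying on the left by the adjugate of the matrix \(\left(\delta_{ij}\phi-\alpha_{ij}\right)\), we find that \(\det(\delta_{ij}\phi-\alpha_{ij})\) annihilates every \(x_j\). Since the \(x_j\) generate \(M\), we get \[\det(\delta_{ij}\phi-\alpha_{ij})=0\] as an endomorphism of \(M\).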

The desired equality follows by expanding the determinant: the coefficient of \(\phi^{n-k}\) is a sum of products of \(k\) entries \(\alpha_{ij}\), hence lies in \(\mf{a}^k\).

**Lemma 2 (Nakayama’s Lemma)** Let \(\mf{a}\) be an ideal of a commutative ring \(A\) and \(M\) a finitely generated \(A\)-module. If \(M=\mf{a}M\), then there exists \(a\in \mf{a}\) such that \(m=am\) for all \(m\in M\).
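Lemma 2 is stated without proof. One standard way to obtain it from Lemma 1 is sketched here, under the assumption that Lemma 1 is applied to the identity endomorphism of \(M\).

```latex
% Apply Lemma 1 to \phi=\mathrm{id}_M, which satisfies \phi(M)=M=\mf{a}M.
\[
\mathrm{id}_M^{\,n}+r_1\,\mathrm{id}_M^{\,n-1}+\cdots+r_{n-1}\,\mathrm{id}_M+r_n=0
\quad\Longrightarrow\quad
(1+r_1+\cdots+r_n)\,m=0\ \text{ for all } m\in M .
\]
% Set a=-(r_1+\cdots+r_n). Each r_k\in\mf{a}^k\subseteq\mf{a}, so a\in\mf{a}, and
\[
m=am\quad\text{for all } m\in M .
\]
```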

**Corollary 1** Let \(A\) be a commutative local ring, \(\mf{m}\) the maximal ideal and \(M\) a finitely generated \(A\)-module. If \(\mf{m}M=M\), then \(M=0\).

*Proof.* By Lemma 2, there is an element \(a\in \mf{m}\) such that \((1-a)m=0\) for all \(m\in M\). Since \(A\) is local and \(a\in\mf{m}\), the element \(1-a\) is a unit in \(A\). Multiplying by the inverse \((1-a)^{-1}\), we find that \(m=0\).
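Two small examples may help to see why the hypotheses of Corollary 1 matter. Both are illustrative choices that do not appear in the text: the first drops locality, the second drops finite generation.

```latex
% Example 1 (A not local): A=\mathbb{Z}/6, \mf{a}=(3)=\{0,3\}, M=\mf{a}.
\[
\mf{a}M=\{0,\,3\cdot 3\}=\{0,3\}=M,\qquad a=3:\quad 3\cdot 3=9\equiv 3 \pmod 6 ,
\]
% so m=am holds on M as in Lemma 2, although M\neq 0; here 1-a=-2 is not a unit in \mathbb{Z}/6.

% Example 2 (M not finitely generated): A=\mathbb{Z}_{(p)}, \mf{m}=pA, M=\mathbb{Q}.
\[
\mf{m}M=p\,\mathbb{Q}=\mathbb{Q}=M,\qquad M\neq 0 .
\]
```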

## A Geometric Version of Nakayama’s Lemma

A good source for various versions and applications of Nakayama’s lemma is Mumford’s red book (Mumford 1999).

**Corollary 2** Let \(A\) be a commutative local ring, \(\mf{m}\) the maximal ideal and \(M\) a finitely generated \(A\)-module. Let \(f_1\), \(\dots\), \(f_n\) be elements in \(M\) such that \(\bar{f}_1\), \(\dots\), \(\bar{f}_n\) generate \(M/\mf{m}M\). Then \(f_1\), \(\dots\), \(f_n\) generate \(M\). In particular, if \(\mf{m}\) is finitely generated, then elements of \(\mf{m}\) whose images generate \(\mf{m}/\mf{m}^2\) also generate \(\mf{m}\).

*Proof.* Let \(N=(f_1,\dots, f_n)\) be the submodule of \(M\) generated by the \(f_i\). Then we have
\[(N+\mf{m}M)/\mf{m}M=M/\mf{m}M.\]
It then follows that \(N+\mf{m}M=M\) (by the third isomorphism theorem or the snake lemma). Therefore,
\[M/N=(N+\mf{m}M)/N=\mf{m}(M/N).\]
Since \(M/N\) is finitely generated, Corollary 1 gives \(M/N=0\), i.e. \(N=M\).
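As a concrete instance of Corollary 2, consider the local ring of the plane at the origin. The ring and the elements \(f\), \(g\) below are illustrative assumptions. The direct computation shows the same "invert a unit" step that drives the proof of Corollary 1.

```latex
% A=k[x,y]_{(x,y)}, \mf{m}=(x,y); f and g are illustrative choices.
\[
f=x+y^2,\qquad g=y-x^3,\qquad \bar f=\bar x,\quad \bar g=\bar y\ \text{ in } \mf{m}/\mf{m}^2 .
\]
% Corollary 2 predicts (f,g)=\mf{m}. Directly: y=g+x^3 and x=f-y^2=f-y(g+x^3), so
\[
x\,(1+x^2y)=f-yg,\qquad x=(1+x^2y)^{-1}(f-yg)\in(f,g),\qquad y=g+x^3\in(f,g).
\]
% The element 1+x^2y lies outside \mf{m}, hence is a unit in A.
```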

Let \(X\) be a scheme and \(\mc{F}\) an \(\O_X\)-module. Then the map \[ \Hom_{\O_X}(\O_X,\mc{F})\to \Gamma(X,\mc{F}), \quad w \mapsto w_X(1) \] is an isomorphism. Indeed, if \(s\in \Gamma(X,\mc{F})\), then there is a unique homomorphism \(w: \O_X \to \mc{F}\) such that \(w_U(1) = s\vert_U\) for every open set \(U\subseteq X\). This defines an inverse map.

Let \((s_i)\), \(i\in I\) be a family of sections \(s_i\in \Gamma(X,\mc{F})\). We say that \(\mc{F}\) is generated by the family, if the corresponding homomorphism \(\O_X^{(I)}\to \mc{F}\) is surjective.

Equivalently, \(\mc{F}\) is generated by the family \((s_i)\), \(i\in I\), if for every point \(y\in X\) the germs \((s_i)_y\) generate the stalk \(\mc{F}_y\) as an \(\O_{X,y}\)-module.

The above corollary implies the following geometric version of Nakayama’s lemma.

**Lemma 3 (Geometric Nakayama’s Lemma)** Let \(X\) be a Noetherian scheme, \(x\) a closed point in \(X\), \(U\) an open neighborhood of \(x\), and \(\mc{F}\) a coherent sheaf on \(X\). Let \(a_1\), \(\dots\), \(a_r\) be sections in \(\mc{F}(U)\) whose images \(\bar{a}_1\), \(\dots\), \(\bar{a}_r\) generate the fiber \(\mc{F}\vert_x=\mc{F}_x\otimes k(x)\). Then there exists an affine open neighborhood \(\spec(A)\subseteq U\) of \(x\) such that \(a_1\vert_{\spec(A)}\), \(\dots\), \(a_r\vert_{\spec(A)}\) generate \(\mc{F}\vert_{\spec(A)}\).

*Proof.* Let \(V\) be an affine open neighborhood of \(x\) in \(U\). Then \(\mc{F}\vert_V=\tilde{M}\), \(\mc{F}_x=M\otimes \O_{X,x}\), and \(\mc{F}\vert_x=M\otimes k(x)\), where \(M=\mc{F}(V)\) is a finitely generated \(\O(V)\)-module. Since \(\bar{a}_1\), \(\dots\), \(\bar{a}_r\) generate \(\mc{F}\vert_x\), Corollary 2 applied to the finitely generated \(\O_{X,x}\)-module \(\mc{F}_x\) shows that the germs of \(a_1\), \(\dots\), \(a_r\) generate \(\mc{F}_x\).

Those sections define a morphism \(\O\vert_V^r\to \mc{F}\vert_V\) such that the morphism at the stalk level \(\O_x^r\to \mc{F}_x\) is surjective.

By shrinking \(V\), we may assume that \(V\) is still affine and that \(\O\vert_V^r\to \mc{F}\vert_V\) is surjective (see Lemma 17.9.4 in the Stacks Project).

Therefore, \(a_1\), \(\dots\), \(a_r\) generate \(\mc{F}\vert_V\).

In the lemma, the hypotheses can be relaxed: \(X\) may be any scheme and \(\mc{F}\) any quasi-coherent sheaf of finite type (see for example Serre 1955, Section II, Proposition 1).
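Returning to the proof of Lemma 3, the shrinking step can be made explicit when \(V=\spec(A)\) is affine. The following sketch uses hypothetical names that do not appear in the text: \(m_1,\dots,m_s\) for generators of \(M=\mc{F}(V)\), \(\mf{p}\) for the prime ideal corresponding to \(x\), and \(t_i\), \(b_{ij}\) for elements of \(A\).

```latex
% The a_j generate M_{\mf{p}}, so each generator m_i of M becomes a combination of the a_j
% after multiplying by some t_i outside \mf{p}:
\[
t_i\,m_i=\sum_{j=1}^{r}b_{ij}\,a_j\quad\text{in } M,\qquad t_i\in A\setminus\mf{p},\quad b_{ij}\in A,\quad i=1,\dots,s .
\]
% Put t=t_1t_2\cdots t_s\notin\mf{p}. Every t_i is invertible in A_t, hence
\[
m_i=\sum_{j=1}^{r}\frac{b_{ij}}{t_i}\,a_j\quad\text{in } M_t ,
\]
% so a_1,\dots,a_r generate M_t, i.e. they generate \mc{F} on the neighborhood D(t)=\spec(A_t) of x.
```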

This geometric version has some useful corollaries.

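**Corollary 3** Let \(X\) be a Noetherian scheme and \(\mc{F}\) a coherent sheaf on \(X\). Then \(x\mapsto\dim_{k(x)}\mc{F}\vert_{x}\) is an upper semi-continuous function, i.e. the set \(\{x \mid \dim_{k(x)}\mc{F}\vert_{x}\leq r\}\subset X\) is open for any \(r\).

*Proof.* Suppose \(\dim_{k(x)}\mc{F}\vert_{x}=s\leq r\). Choose \(s\) sections of \(\mc{F}\) near \(x\) whose images form a basis of \(\mc{F}\vert_x\). By Lemma 3, these sections generate \(\mc{F}\) on a neighborhood \(V\) of \(x\), so \(\dim_{k(y)}\mc{F}\vert_{y}\leq s\leq r\) for all \(y\in V\).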
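A standard example where the fiber dimension of Corollary 3 actually jumps is the ideal sheaf of a point in the plane. The scheme and sheaf below are illustrative choices that do not appear in the text.

```latex
% X=\mathbb{A}^2_k=\spec k[x,y], \mc{F}=\widetilde{I} with I=(x,y).
\[
\dim_{k(p)}\mc{F}\vert_p=
\begin{cases}
2, & p=(x,y)\quad(\mc{F}\vert_p=I/I^2=k\bar x\oplus k\bar y),\\
1, & p\neq(x,y)\quad(I_p=\O_{X,p}\text{ because } I\not\subseteq p).
\end{cases}
\]
% Hence \{p : \dim\le 1\}=X\setminus\{(x,y)\} is open, while \{p:\dim\ge 2\}=\{(x,y)\} is closed.
```

The dimension can only go up on closed subsets, which is what upper semi-continuity means.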

**Corollary 4** Let \(X\) be a reduced Noetherian scheme, \(\mc{F}\) a coherent sheaf on \(X\), and \(x\) a point of \(X\). Then \(\mc{F}\) is a free \(\O_X\)-module in some neighborhood of \(x\) if and only if \(e(y)=\dim_{k(y)}\mc{F}\vert_{y}\) is constant for \(y\) near \(x\).
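Before the proof, a small example suggests why reducedness is assumed in Corollary 4. The ring of dual numbers used here is an illustrative choice that does not appear in the text.

```latex
% X=\spec A with A=k[\varepsilon]/(\varepsilon^2) (non-reduced, a single point), \mc{F}=\widetilde{A/(\varepsilon)}.
\[
e=\dim_k\bigl(A/(\varepsilon)\otimes_A k\bigr)=\dim_k k=1\quad\text{(constant, since } X \text{ is one point)},
\]
\[
\dim_k A/(\varepsilon)=1,\qquad \dim_k A^{n}=2n ,
\]
% so A/(\varepsilon) is not isomorphic to any free module A^n, and \mc{F} is not free.
```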

*Proof.* If \(\mc{F}\) is free in a neighborhood of \(x\), then \(e\) is constant on that neighborhood.

Conversely, let \(U\) be a neighborhood of \(x\) on which \(e(y)\) is constant, with value \(e\). By Lemma 3, after shrinking \(U\), we may assume that there is a surjection \[\O\vert_U^e\to \mc{F}\vert_U.\] Let \(\mc{K}\) be the kernel of this morphism. If \(\mc{K}\neq 0\), then there is a nonzero section \(s\) of \(\mc{K}\) over some open subset \(W\subseteq U\). Since \(\mc{K}\subset \O\vert_U^e\), we may view \(s\) as a section of \(\O\vert_W^e\). Since \(X\) is reduced, the section \(s\) is nonzero at some generic point \(y\in W\). Localizing at \(y\), we get a short exact sequence \[ 0\to \mc{K}_y\to \O_y^e\to \mc{F}_y\to 0, \] where \(\O_y\) is a field because \(X\) is reduced and \(y\) is a generic point.

Here \(\mc{F}_y=\mc{F}\vert_y\) has dimension \(e\) over the field \(\O_y\), so the surjection \(\O_y^e\to \mc{F}_y\) is an isomorphism and \(\mc{K}_y=0\). This contradicts \(0\neq s_y\in \mc{K}_y\).

Therefore, \(\O\vert_U^e\to \mc{F}\vert_U\) is an isomorphism.

**Corollary 5** Let \(X\) be a non-empty quasi-compact scheme and \(\mc{F}\) a coherent sheaf on \(X\). If \(\mc{F}\) is globally generated at all closed points, then \(\mc{F}\) is globally generated at all points.

*Proof.* By Lemma 3, it suffices to show that there is a closed point \(y\) in the closure of every point \(x\) of \(X\).
Indeed, let \(U\) be an open neighborhood of \(y\) such that \(\mc{F}\vert_U\) is generated by global sections. Then \(x\in U\): otherwise \(x\in U^c\), and since \(U^c\) is closed, \(y\in\overline{\{x\}}\subset U^c\), a contradiction.

The fact that every point in \(X\) has a closed point in its closure follows from the quasi-compactness of \(X\).

Since \(X\) is non-empty and quasi-compact, it admits an irredundant finite affine open cover \(X=U_1\cup \cdots \cup U_r\). Restricting to \(\overline{\{x\}}\subset X\), we get an irredundant finite open cover \(\overline{\{x\}}=V_1\cup \cdots \cup V_s\) by affine open subsets. Because a prime ideal is always contained in a maximal ideal, each non-empty affine open set has a closed point. Let \(x_1\) be a closed point in \(V_1\). If it is a closed point of \(\overline{\{x\}}\), then we are done. Otherwise, let \(x_2\in \overline{\{x_1\}}\) be such that \(x_2\neq x_1\); then \(x_2\not\in V_1\) because \(x_1\) is closed in \(V_1\). Here the closure is taken in the closed subset \(\overline{\{x\}}\), not in \(V_1\). Without loss of generality, we may assume that \(x_2\in V_2\). Repeating this procedure, as the open cover is finite, we find a closed point \(y\) of \(\overline{\{x\}}\), which is also closed in \(X\).

Note that without quasi-compactness, a closed subset of a scheme may not have a closed point (see Vakil 2017, Exercise 5.1.E).

## The Terminology of Invertible Sheaf

Recall that an invertible sheaf is a locally free sheaf of rank 1. The term *invertible* comes from the following fact.

**Proposition 1** Let \(X\) be a reduced Noetherian scheme. A coherent sheaf \(\mc{F}\) is locally free of rank 1 if and only if there is another coherent sheaf \(\mc{G}\) such that \(\mc{F}\otimes_{\O_X}\mc{G}\cong\O_X\).

*Proof.* If \(\mc{F}\) is locally free of rank 1, then \(\mc{G}=\sHom_{\O_X}(\mc{F},\O_X)\) satisfies
\[
\mc{F}\otimes \sHom_{\O_X}(\mc{F},\O_X)\cong\sHom_{\O_X}(\mc{F},\mc{F})\cong\O_X
\]
because \(\mc{F}\) is locally free of rank 1.
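The converse direction of Proposition 1 is where Nakayama’s lemma enters, through Corollary 4. A sketch follows, assuming the fiberwise identification \((\mc{F}\otimes\mc{G})\vert_x\cong\mc{F}\vert_x\otimes_{k(x)}\mc{G}\vert_x\), a standard compatibility of tensor products with base change that is not spelled out in the text.

```latex
% Restrict \mc{F}\otimes_{\O_X}\mc{G}\cong\O_X to the fiber at a point x.
\[
\mc{F}\vert_x\otimes_{k(x)}\mc{G}\vert_x\cong k(x)
\quad\Longrightarrow\quad
\dim_{k(x)}\mc{F}\vert_x\cdot\dim_{k(x)}\mc{G}\vert_x=1
\quad\Longrightarrow\quad
\dim_{k(x)}\mc{F}\vert_x=1 .
\]
% So e(x)=\dim_{k(x)}\mc{F}\vert_x is constant equal to 1 on X, and Corollary 4
% (X is reduced and Noetherian) gives that \mc{F} is free of rank 1 near every point.
```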

A categorical definition of an invertible sheaf can be found in Section 17.23 (Invertible modules) of the Stacks Project.

## The Theorem of the Square

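Let \(f: X\to Y\) be a morphism of schemes and \(\mc{F}\) a coherent sheaf on \(X\). We write \(\mc{F}_y=\mc{F}\vert_{f^{-1}(y)}\) for the restriction of \(\mc{F}\) to the fiber over \(y\in Y\); in this section, this notation does not denote a stalk.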

**Theorem 1 (Theorem of the Square)** Let \(X\) be a complete variety and \(Y\) a reduced Noetherian scheme. Let \(L\) and \(M\) be two line bundles on \(X\times Y\). If \(L_y \cong M_y\) for all closed points \(y\in Y\), then there exists a line bundle \(N\) on \(Y\) such that \(L\cong M\otimes p^*N\), where \(p: X \times Y \to Y\) is the projection onto \(Y\).

*Proof.* From the statement, we expect that \(N=p_*(L\otimes M^{-1})\). Indeed, this is true. First, because \(X\) is complete and \(L_y\otimes M_y^{-1}\) is trivial, we have \(H^0(X_y, L_y\otimes M_y^{-1})=k(y)\) for any \(y\in Y\). By Grauert’s Theorem, we know that \(p_*(L\otimes M^{-1})\) is locally free, and moreover of rank 1.

We now show that the natural morphism \[ \alpha: p^*p_*(L\otimes M^{-1})\to L\otimes M^{-1} \] is an isomorphism. Let \(q\) be the restriction of \(p\) to the fiber \(X_y\). Then by diagram chasing and the assumption, we have an isomorphism along the fiber \(X_y\): \[ (p^*p_*(L\otimes M^{-1}))\vert_y=q^*q_*((L\otimes M^{-1})\vert_y)=\O_{X_y} \to (L\otimes M^{-1})\vert_y=\O_{X_y}. \] Consequently, for each closed point \((x, y)\in X\times Y\), the induced morphism on fibers \[ (p^*p_*(L\otimes M^{-1}))\vert_{(x, y)}\to (L\otimes M^{-1})\vert_{(x,y)} \] is an isomorphism.

Therefore, by Nakayama’s lemma, \(\alpha\) is surjective. Because both \(p^*p_*(L\otimes M^{-1})\) and \(L\otimes M^{-1}\) are locally free of rank 1, the surjective morphism \(\alpha\) is locally multiplication by a unit and hence must be an isomorphism, i.e. \(p^*p_*(L\otimes M^{-1})\cong L\otimes M^{-1}\). Taking \(N=p_*(L\otimes M^{-1})\), we get \(L\cong M\otimes p^*N\).

**Corollary 6 (See-Saw Principle)** Suppose that, in addition to the hypotheses of the Theorem of the Square, \(L_x\cong M_x\) for some \(x\in X\), where the subscript \(x\) denotes restriction to \(\{x\}\times Y\). Then \(L\cong M\).

*Proof.* Let \(N\) be the line bundle on \(Y\) such that \(L\cong M\otimes p^*N\). Over \(\{x\} \times Y\), we have \(L_x\cong M_x \otimes (p^*N)_x\). Since \(L_x\cong M_x\), the line bundle \((p^*N)_x\cong N\) is trivial, which implies that \(p^*N\) is trivial and \(L\cong M\).

The see-saw principle is very useful in proving the Theorem of the Cube, which says that an invertible sheaf on the product \(X_1\times X_2\times X_3\) of three complete varieties is trivial if it is trivial along the fibers of the three projections \(p_i: X_1\times X_2\times X_3\to X_i\) over some points \(x_i\in X_i\). The interested reader may find a proof, for example, in (Milne 2008).

Atiyah, M. F., and I. G. Macdonald. 1969. *Introduction to Commutative Algebra*. Reading, Mass.: Addison-Wesley.

Mumford, David. 1999. *The Red Book of Varieties and Schemes: Includes the Michigan Lectures (1974) on Curves and Their Jacobians*. 2nd expanded ed. Lecture Notes in Mathematics 1358. Berlin; New York: Springer.

Serre, Jean-Pierre. 1955. “Faisceaux Algébriques Cohérents.” *Annals of Mathematics* 61 (2): 197–278. https://doi.org/bsnrv2.