Nakayama’s lemma is a powerful tool in algebraic geometry. Its role is loosely analogous to that of the inverse function theorem in differential geometry.
In this post, we will consider various versions of Nakayama’s lemma and some applications.
Nakayama’s Lemma
We start with a standard technique due to Atiyah and Macdonald (Atiyah and Macdonald 1969).
Proof. Let \(x_1\), \(x_2\), \(\dots\), \(x_n\) be a set of generators of \(M\). Then an endomorphism \(\phi\) is represented by a matrix \(\left(\alpha_{ij}\right)\), i.e. \[ \phi(x_i)=\sum_{j=1}^n\alpha_{ij}x_j. \] Since \(\phi(M)\subseteq\mf{a}M\), we may assume that \(\alpha_{ij}\in \mf{a}\). Denote by \(\delta_{ij}\) the Kronecker delta. Then the above equality is equivalent to \[\sum_{j=1}^n(\delta_{ij}\phi-\alpha_{ij})\cdot x_j=0.\]
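Multiplying both sides on the left by the adjugate matrix, we find that \(\det(\delta_{ij}\phi-\alpha_{ij})\) annihilates every generator \(x_j\), and hence \[\det(\delta_{ij}\phi-\alpha_{ij})=0\] as an endomorphism of \(M\).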
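As an illustration (a worked instance added here, not taken from the reference), take \(n=2\). The determinant is \[ \det\begin{pmatrix}\phi-\alpha_{11} & -\alpha_{12}\\ -\alpha_{21} & \phi-\alpha_{22}\end{pmatrix} =\phi^2-(\alpha_{11}+\alpha_{22})\phi+(\alpha_{11}\alpha_{22}-\alpha_{12}\alpha_{21}), \] so \(\phi\) satisfies a monic quadratic equation whose lower coefficients \(\alpha_{11}+\alpha_{22}\) and \(\alpha_{11}\alpha_{22}-\alpha_{12}\alpha_{21}\) both lie in \(\mf{a}\).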
The desired equality follows by applying the Laplace expansion to the determinant.
A Geometric Version of Nakayama’s Lemma
A good source for various versions and applications of Nakayama’s lemma is Mumford’s red book (Mumford 1999).
Proof. Let \(N=(f_1,\dots, f_n)\) be the submodule of \(M\) generated by \(f_1,\dots,f_n\). Then we have \[(N+\mf{m}M)/\mf{m}M=M/\mf{m}M.\] It then follows that \(N+\mf{m}M=M\) (by the third isomorphism theorem or the snake lemma). Therefore, \[M/N=(N+\mf{m}M)/N=\mf{m}(M/N).\]
Applying Nakayama’s Lemma to \(M/N\), we get \(M/N=0\) and hence \(M=N\).
Let \(X\) be a scheme and \(\mc{F}\) be an \(\O_X\)-module. Then the map \[ \Hom_{\O_X}(\O_X,\mc{F})\to \Gamma(X,\mc{F}), \quad w \mapsto w_X(1) \] is an isomorphism. Indeed, if \(s\in \Gamma(X,\mc{F})\), then there is a unique homomorphism \(w: \O_X \to \mc{F}\) such that \(w_U(1) = s\vert_U\) for every open subset \(U\subseteq X\). This defines an inverse map.
Let \((s_i)\), \(i\in I\), be a family of sections \(s_i\in \Gamma(X,\mc{F})\). We say that \(\mc{F}\) is generated by the family if the corresponding homomorphism \(\O_X^{(I)}\to \mc{F}\) is surjective.
Equivalently, since surjectivity of a morphism of sheaves can be checked on stalks, \(\mc{F}\) is generated by the family \((s_i)\), \(i\in I\), if for every point \(y\in X\), the germs of the \(s_i\) generate the stalk \(\mc{F}_y\) as an \(\O_{X,y}\)-module.
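As a small illustration (a standard example added here, with \(k\) an arbitrary field as an assumption), take \(X=\mathbb{P}^1_k\) with homogeneous coordinates \(x_0\), \(x_1\) and \(\mc{F}=\O_X(1)\). The two sections \(x_0, x_1\in\Gamma(X,\O_X(1))\) generate \(\mc{F}\): on the open set where \(x_i\neq 0\), the sheaf \(\O_X(1)\) is free with generator \(x_i\), so \[ \O_X^{2}\to \O_X(1),\qquad (f_0,f_1)\mapsto f_0x_0+f_1x_1 \] is surjective on every stalk. The single section \(x_0\) does not generate \(\mc{F}\), because at the point \(y=[0:1]\) its germ lies in \(\mf{m}_y\mc{F}_y\) and so cannot generate the stalk \(\mc{F}_y\).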
The above corollary implies the following geometric version of Nakayama’s lemma.
Lemma 3 (Geometric Nakayama’s Lemma) Let \(X\) be a Noetherian scheme, \(x\) a closed point in \(X\), \(U\) an open neighborhood of \(x\), and \(\mc{F}\) a coherent sheaf on \(X\). Let \(a_1\), \(\dots\), \(a_r\) be sections in \(\mc{F}(U)\) such that the images \(\bar{a}_1\), \(\dots\), \(\bar{a}_r\) of their germs at \(x\) generate the fiber \(\mc{F}\vert_x=\mc{F}_x\otimes k(x)\). Then there exists an affine open neighborhood \(\spec(A)\ni x\) in \(U\) such that \(a_1\vert_{\spec(A)}\), \(\dots\), \(a_r\vert_{\spec(A)}\) generate \(\mc{F}\vert_{\spec(A)}\).
In particular, if \(\mc{F}\vert_x=0\), then there exists an open neighborhood \(V\) of \(x\) in \(X\) such that \(\mc{F}\vert_V=0\).
Proof. Let \(V\) be an affine open neighborhood of \(x\) in \(U\). Then \(\mc{F}\vert_V=\tilde{M}\), \(\mc{F}_x=M\otimes \O_{X,x}\), and \(\mc{F}\vert_x=M\otimes k(x)\), where \(M=\mc{F}(V)\) is a finitely generated \(\O(V)\)-module. Since \(\bar{a}_1\), \(\dots\), \(\bar{a}_r\) generate \(\mc{F}\vert_x\), Corollary 2, applied to the finitely generated module \(\mc{F}_x\) over the local ring \(\O_{X,x}\), shows that the germs of \(a_1\), \(\dots\), \(a_r\) generate \(\mc{F}_x\).
Those sections define a morphism \(\O\vert_V^r\to \mc{F}\vert_V\) such that the morphism at the stalk level \(\O_x^r\to \mc{F}_x\) is surjective.
By shrinking \(V\), we may assume that \(\O\vert_V^r\to \mc{F}\vert_V\) is surjective (see Lemma 17.9.4 in the Stacks Project).
Therefore, \(a_1\), \(\dots\), \(a_r\) generate \(\mc{F}\vert_V\).
In the lemma, the conditions can be relaxed: \(X\) may be any scheme and \(\mc{F}\) a quasi-coherent sheaf of finite type (see for example Serre 1955 Section II Proposition 1).
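To see why Lemma 3 only gives generation on a neighborhood of \(x\), consider the following small example (added here for illustration, with \(k\) an arbitrary field as an assumption). Let \(X=\spec(k[t])\), \(\mc{F}=\O_X\), let \(x\) be the origin \(t=0\), and take \(r=1\) with \(a_1=t-1\in\mc{F}(X)\). The image of \(a_1\) in the fiber at \(x\) is \[ \bar{a}_1=-1\neq 0\in \mc{F}\vert_x=k[t]/(t)=k, \] so \(\bar{a}_1\) generates \(\mc{F}\vert_x\). On the affine open neighborhood \(\spec(k[t]_{t-1})\) of \(x\), the section \(a_1\) is a unit and therefore generates \(\mc{F}\). It does not generate \(\mc{F}\) on all of \(X\), since its image in the fiber at the point \(t=1\) is zero.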
This geometric version has some useful corollaries.
Proof. If \(\mc{F}\) is free in a neighborhood of \(x\), then \(e(x)\) is constant over that neighborhood.
Conversely, let \(U\) be a neighborhood of \(x\) such that \(e(y)=e\) is constant for \(y\in U\). Then by Lemma 3, after shrinking \(U\), we may assume that \[\O\vert_U^e\to \mc{F}\vert_U\] is surjective. Let \(\mc{K}\) be the kernel of this morphism. If \(\mc{K}\neq 0\), then there is a nonzero element \(s\) in \(\mc{K}\). Since \(\mc{K}\subset \O\vert_U^e\) and \(\O\vert_U^e\) is globally generated, we may view \(s\) as a global section of \(\O\vert_U^e\). Since \(X\) is reduced, the section \(s\) is non-zero at some generic point \(y\in U\). Localizing at \(y\), we get a short exact sequence \[ 0\to \mc{K}_y\to \O_y^e\to \mc{F}_y\to 0, \] where \(\O_y\) is a field because \(y\) is a generic point.
Since \(\O_y\) is a field, \(\mc{F}_y=\mc{F}\vert_y\) has dimension \(e(y)=e\). Because \(0\neq s_y\in \mc{K}_y\), comparing the dimensions of the \(\O_y\)-vector spaces in the above short exact sequence gives a contradiction.
Therefore, \(\O\vert_U^e\to \mc{F}\vert_U\) is an isomorphism.
Proof. By Lemma 3, it suffices to show that there is a closed point \(y\) in the closure of every point \(x\) of \(X\). Indeed, let \(U\) be an open neighborhood of \(y\) such that \(\mc{F}\vert_U\) is generated by its global sections. Since \(y\in U\), we see that \(x\in U\): otherwise \(x\in U^c\), and then \(y\in\overline{\{x\}}\subset U^c\), which is a contradiction.
The fact that every point in \(X\) has a closed point in its closure follows from the quasi-compactness of \(X\).
Since \(X\) is non-empty and quasi-compact, it admits an irredundant finite affine open cover \(X=U_1\cup \cdots \cup U_r\). Restricting to \(\overline{\{x\}}\subset X\), we get an irredundant finite affine open cover \(\overline{\{x\}}=V_1\cup \cdots \cup V_s\). Because a prime ideal is always contained in a maximal ideal, each affine open set contains a point that is closed in it. Let \(x_1\) be a closed point of \(V_1\). If it is a closed point of \(\overline{\{x\}}\), then we are done. Otherwise, let \(x_2\in \overline{\{x_1\}}\) be such that \(x_2\neq x_1\) and \(x_2\not\in V_1\), where the closure is taken in the closed subset \(\overline{\{x\}}\) rather than in \(V_1\). Without loss of generality, we may assume that \(x_2\in V_2\). Repeating this procedure, as the open cover is finite, we eventually obtain a closed point \(y\) of \(\overline{\{x\}}\), which is also closed in \(X\).
Note that without quasi-compactness, a closed subset of a scheme may not have a closed point (see Vakil 2017 Exercise 5.1.E).
The Terminology of Invertible Sheaf
Recall that an invertible sheaf is a locally free sheaf of rank 1. The term “invertible” comes from the following fact.
Proof. If \(\mc{F}\) is locally free of rank 1, then \(\mc{G}=\sHom_{\O_X}(\mc{F},\O_X)\) satisfies \[ \mc{F}\otimes \sHom_{\O_X}(\mc{F},\O_X)=\sHom_{\O_X}(\mc{F},\mc{F})=\O_X \] because \(\mc{F}\) is locally free.
Conversely, for each point \(x\in X\), we get \[ \mc{F}_x\otimes_{\O_x}\mc{G}_x=\O_x. \] Tensoring with \(k(x)=\O_x/\mf{m}_x\), we get \[ \mc{F}\vert_x\otimes_{k(x)}\mc{G}\vert_x=\mc{F}_x\otimes_{\O_x}k(x)\otimes_{k(x)}k(x)\otimes_{\O_x}\mc{G}_x=k(x). \] Note that \(\mc{F}\vert_x=\mc{F}_x/\mf{m}_x\mc{F}_x\) is a vector space over the residue field \(k(x)\). Since the dimension of a tensor product of vector spaces is the product of their dimensions, \(\dim_{k(x)}\mc{F}\vert_x\cdot\dim_{k(x)}\mc{G}\vert_x=1\). Therefore, \(\mc{F}\vert_x\) has dimension 1 at every point \(x\in X\). By Corollary 4, \(\mc{F}\) is locally free of rank 1.
A categorical definition of an invertible sheaf can be found in Section 17.23, Invertible modules, of the Stacks Project.
The Theorem of the Square
Let \(f: X\to Y\) be a morphism between schemes and \(\mc{F}\) be a coherent sheaf on \(X\). We denote \(\mc{F}_y=\mc{F}\vert_{f^{-1}(y)}\).
Proof. From the statement, we expect that \(N=p_*(L\otimes M^{-1})\). Indeed, this is true. First, because \(X\) is complete and \(L_y\otimes M_y^{-1}\) is trivial, we have \(p_*(L\otimes M^{-1})_y=k(y)\) for any \(y\in Y\). By Grauert’s Theorem, we know that \(p_*(L\otimes M^{-1})\) is locally free and moreover of rank 1.
We now show that the natural morphism \[ \alpha: p^*p_*(L\otimes M^{-1})\to L\otimes M^{-1} \] is an isomorphism. Let \(q\) be the restriction of \(p\) to the fiber \(X_y\). Then by diagram chasing and the assumption, we have an isomorphism along the fiber \(X_y\): \[ (p^*p_*(L\otimes M^{-1}))\vert_y=q^*q_*((L\otimes M^{-1})\vert_y)=\O_{X_y} \to (L\otimes M^{-1})\vert_y=\O_{X_y}. \] Consequently, for each closed point \((x, y)\in X\times Y\), the morphism \[ p^*p_*(L\otimes M^{-1})\vert_{(x, y)}\to (L\otimes M^{-1})\vert_{(x,y)}\cong\O_{X\times Y}\vert_{(x,y)} \] is an isomorphism.
Therefore, by Nakayama’s lemma, \(\alpha: p^*p_*(L\otimes M^{-1})\to L\otimes M^{-1}\) is surjective. Because \(p^*p_*(L\otimes M^{-1})\) and \(L\otimes M^{-1}\) are both locally free of rank 1, the surjective morphism \(\alpha\) must be an isomorphism, i.e. \(p^*p_*(L\otimes M^{-1})\cong L\otimes M^{-1}\), and hence \(L\cong M\otimes p^*N\) with \(N=p_*(L\otimes M^{-1})\).
The see-saw principle is very useful in proving the theorem of the cube, which says that an invertible sheaf on the product \(X_1\times X_2\times X_3\) of three complete varieties is trivial if it is trivial along the fibers of the three projections \(p_i: X_1\times X_2\times X_3\to X_i\). The interested reader may find a proof, for example, in (Milne 2008).