# Automatic Reusable Design for Analog Micropower Integrated Circuits

By

### Pablo Aguirre

Thesis presented to the Instituto de Ingeniería Eléctrica in partial fulfillment of the requirements for the degree of Magister en Ingeniería Eléctrica (Master in Electrical Engineering), in the area of Microelectronics

Advisor: Prof. Fernando Silveira

#### Tribunal:

Prof. José Silva-Martinez, Texas A&M, USA.

Prof. Carlos Galup-Montoro, UFSC, Brazil.

Prof. Gregory Randall, UDELAR, Uruguay.

Instituto de Ingeniería Eléctrica, Facultad de Ingeniería, Universidad de la República

Montevideo, Uruguay, April 2004

ISSN: 1510-7264

Typeset in LaTeX $2_{\varepsilon}$

## Contents

| Section | Title |
|---------|-------|
| | List of Figures |
| | List of Tables |
| | List of Algorithms |
| | Acknowledgments |
| | Resumen (Summary) |
| | Abstract |
| 1 | Introduction |
| 2 | Design Methodologies |
| 2.1 | Introduction |
| 2.2 | A Current-Based MOSFET Model for IC Design |
| 2.2.1 | Current - Voltage Relationships |
| 2.2.2 | The $(g_m/I_D)$ Ratio |
| 2.2.3 | Intrinsic Capacitances |
| 2.2.4 | Noise Model |
| 2.2.5 | Output Conductance |
| 2.2.6 | Non-quasi-static Model and Second Order Effects |
| 2.2.7 | Why ACM? |
| 2.3 | The $(g_m/I_D)$ Based Methodology for Analog Design |
| 2.4 | Automatic Synthesis for Miller Amplifiers |
| 2.4.1 | The Miller Amplifier |
| 2.4.2 | Gain-Bandwidth Driven Synthesis Algorithm |
| 2.4.3 | Design Optimization Through Design Space Exploration |
| 2.4.4 | Synthesis Example: Micropower 100kHz Miller Amplifier |
| 2.4.5 | Synthesis Example: 50MHz Miller Amplifier |
| 2.5 | Conclusions |
| 3 | Low-Power OpAmp Cells: Reuse, Architecture and Synthesis |
| 3.1 | Introduction |
| 3.2 | Analog Design Reuse |
| 3.2.1 | Circuit Performance Tuning Through Bias Current |
| 3.2.2 | Reusable Circuit Architectures |
| 3.2.3 | Technology Migration |
| 3.3 | Opamp Architecture |
| 3.3.1 | Constant gm Rail-to-Rail Input Stages |
| 3.3.2 | Low-Power Class AB Output Stage |
| 3.3.3 | Opamp Complete Architecture |
| 3.4 | Advanced Design Methodologies |
| 3.4.1 | Power Optimization for a Given Total Settling Time |
| 3.4.2 | Settling Behavior Model |
| 3.4.3 | Power Optimization of a Miller OTA |
| 3.5 | Conclusions |
| 4 | Hierarchical Automated Synthesis |
| 4.1 | Introduction |
| 4.2 | Miller Compensation Capacitance for Minimum Power Consumption |
| 4.3 | Synthesis Algorithm |
| 4.3.1 | High Level Synthesis |
| 4.3.2 | Input Stage Synthesis |
| 4.3.3 | Output Stage Synthesis |
| 4.4 | Synthesis Results |
| 4.4.1 | $1\mu s$ Settling Time Design |
| 4.4.2 | Opamp Performance Tuning |
| 4.4.3 | Synthesized vs. Tuned |
| 4.4.4 | Performance Evaluation Against a Simpler Architecture |
| 4.5 | Analysis of the Constant-gm Circuit |
| 4.5.1 | Open Loop Transfer |
| 4.5.2 | Bias Current Monitor |
| 4.5.3 | Redesign of the Constant-gm Circuit |
| 4.6 | Conclusions |
| 5 | Experimental Results |
| 5.1 | Rail-to-rail Operational Amplifier in $0.8 \mu m$ CMOS Technology |
| 5.2 | Comparison with other published results |
| 5.3 | Conclusions |
| 6 | Conclusions |
| A | Low-Voltage Cascode Bias Transistor Design |
| B | Size of Transistors in the Experimental Prototype |
| | Bibliography |

## List of Figures

| Figure | Caption |
|--------|---------|
| 2.1 | Normalized $V_{DSsat}$ for several values of $\xi$ and the SI approximation |
| 2.2 | $(g_m/I_D)$ ratio as a function of the inversion factor $i_f$ |
| 2.3 | Miller Amplifier, including parasitic capacitances |
| 2.4 | Offset voltage as a function of $(g_m/I_D)$ |
| 2.5 | Design space exploration: Total consumption (in $\mu A$) of the 100kHz Miller Amplifier |
| 2.6 | Design space exploration: Die area estimation (in $\mu m^2$) of the 100kHz Miller Amplifier |
| 2.7 | Design space exploration: DC Gain (in $dB$) of the 100kHz Miller Amplifier |
| 2.8 | Gain and doublet frequency dependence on the length of M3 |
| 2.9 | Output swing and total area dependence on $(g_m/I_D)_4$ ratio |
| 2.10 | Gain and total area dependence on the length of M4 |
| 2.11 | Frequency response of the 100kHz Miller Amplifier |
| 2.12 | Design space exploration: Total consumption (in $mA$) of the 50MHz Miller Amplifier |
| 2.13 | Design space exploration: Die area estimation (in $\mu m^2$) of the 50MHz Miller Amplifier |
| 2.14 | Design space exploration: Total gain (in $dB$) of the 50MHz Miller Amplifier |
| 2.15 | Frequency response of the 50MHz Miller Amplifier |
| 3.1 | Gain-Bandwidth product tuning of the Miller amplifier from section 2.4.4 |
| 3.2 | General characteristic of the class AB stage |
| 3.3 | Open loop frequency response comparison in technology migration |
| 3.4 | Basic rail-to-rail differential pair architecture |
| 3.5 | Transconductance as a function of the input common mode voltage, using architecture from Figure 3.4 |
| 3.6 | Schematic view of the constant gm operation principle |
| 3.7 | Implementation of the constant gm technique |
| 3.8 | Transconductance as a function of the input common mode voltage using constant gm technique |
| 3.9 | Class AB output stage |
| 3.10 | Amplifier circuit implementation, omitting constant-gm circuit |
| 3.11 | Settling time model and step response plot |
| 4.1 | High Level Schematic of the Amplifier |
| 4.2 | Complete Amplifier Synthesis Algorithm Scheme. $t_{sett}$ is total settling time and $IDD$ is total current consumption |
| 4.3 | Folded Cascode Circuit |
| 4.4 | Class AB output stage |
| 4.5 | Opamp cell, omitting constant-gm circuitry |
| 4.6 | Total Consumption (in $\mu A$) for a $1\mu sec$ total settling time rail-to-rail OTA |
| 4.7 | Transition frequency and Phase Margin along the input common mode range |
| 4.8 | Total Settling Time for different input common mode range |
| 4.9 | Transition frequency $(f_T)$ and Phase Margin tuning over more than 3 decades |
| 4.10 | Transition frequency and phase margin tuning as a function of the input common mode |
| 4.11 | Total Settling Time tuning for three different input common modes |
| 4.12 | Total Settling Time tuning as a function of the input common mode range |
| 4.13 | Total Settling Time comparison between the $10\mu s$ design and the tuned $1\mu s$ design |
| 4.14 | Constant-gm circuit loop |
| 4.15 | Settling time as a function of the input common mode with the redesign of the constant-gm circuit |
| 5.1 | Opamp Cell Microphotograph |
| 5.2 | Settling time automatic measurement system |
| 5.3 | Total settling time tuning as a function of the input common mode range |
| 5.4 | Comparison between the simulated and experimental total settling time tuning for three different input common modes |
| 5.5 | Settling time as a function of the total quiescent current consumption |
| 5.6 | Offset voltage as a function of the input common mode |
| A.1 | Cascode transistor bias |

## List of Tables

| Table | Caption |
|-------|---------|
| 2.1 | Specifications for a Micropower 100kHz Miller Amplifier |
| 2.2 | Final design for a 100kHz Miller amplifier |
| 2.3 | Specifications for a 50MHz Miller Amplifier |
| 2.4 | Final design for a 50MHz Miller amplifier |
| 3.1 | Tuning of the Miller amplifier introduced in section 2.4.4 |
| 4.1 | Automatic Synthesis Result with Algorithm 4.1 |
| 4.2 | Transistor Sizes Obtained Using Algorithm 4.1 |
| 4.3 | Calculated and simulated characteristics of the OTA with $1\mu sec$ total settling time |
| 4.4 | Comparison between the $10\mu s$ design and the tuned $1\mu s$ design |
| 4.5 | Comparison between our amplifier and a simple Miller amplifier designed using Algorithm 3.1 |
| 5.1 | Opamp Cell characteristics |
| 5.2 | Comparison of the performance of the Opamp Cell |
| B.1 | Transistor sizes in the experimental prototype |

## List of Algorithms

| Algorithm | Title |
|-----------|-------|
| 2.1 | Gain-Bandwidth Driven Automatic Synthesis |
| 2.2 | Power Optimization Automatic Synthesis |
| 3.1 | Power Optimization for a Given Total Settling Time |
| 4.1 | High Level Synthesis |
| 4.2 | Input Stage Synthesis |
| 4.3 | Output Stage Synthesis |

### Acknowledgments

Although this thesis was carried out entirely in Uruguay, where Spanish is the only official language, we decided to write it in English to give it a wider reach since, as Prof. Gabor Temes said in a talk I had the pleasure of attending, "The language of scientific research is accented English". It may be surprising, then, that these lines were originally written in Spanish, but the truth is that all the people I want to thank are fortunate enough to have Spanish as their mother tongue, so I see no reason to thank them in any other language.

This thesis is the result of the support and effort of several people to whom I am deeply grateful. My advisor, Fernando Silveira, has been a solid and fundamental guide in this work. Over these years, Fernando has drawn on his already extensive experience in scientific research to give me the best of himself and train me as a researcher in the area of microelectronics. In fact, he was also the one who taught me the basic principles of electronics in the undergraduate courses four years ago, courses in which I now have the pleasure of serving as one of his teaching assistants. Throughout these years and to this day, Fernando has always been willing to discuss and resolve my most varied concerns, with his usual optimism and despite his countless tasks and obligations inside and outside the school. For all this and much more, which this short paragraph probably fails to reflect, I am extremely grateful.

I am also extremely grateful to Alfredo Arnaud, who because of an administrative formality cannot be listed as the co-advisor of this thesis that he in fact was. Alfredo has been fundamental to this work and, more generally, to my training in microelectronics.
It was he who guided me in my first work in the area, and since then and throughout this master's degree he has never hesitated to support and assist me, while successfully completing his own doctoral thesis.

Conrado Rossi, with whom I share an office, has been my reference, ever since I joined the IIE, for everything regarding how the group and the institute itself work. Although Conrado did not take part directly in this work, he was always willing to take an interest in the subject and discuss my doubts, and in recent months he released me from several responsibilities in the project he leads, in which I work as an assistant, so that I could finish this thesis on time.

In them, together with the rest of the Microelectronics group: Leonardo Barboni, Rafaella Fiorelli, Pablo Mazzara and Linder Reyes, I found an excellent working group in which I feel very much at ease and to whose members I am very grateful. Here I also want to thank Raúl Acosta, who worked on technology migration and obtained the experimental results shown in section 3.2.3.

I am also very grateful to the Instituto de Ingeniería Eléctrica (IIE), to its director, Gregory Randall, and to the head of my department, Rafael Canetti. I must also thank the Comisión Académica de Posgrados (CAP) of the Facultad de Ingeniería for the scholarship they awarded me during these almost two years of work, which allowed me to devote myself exclusively to my master's degree and to my teaching work at the IIE.

I also want to thank Professors Carlos Galup-Montoro and José Silva Martinez, who agreed to take part in my thesis committee. It is a true honor for me to be evaluated by two renowned professors of the highest international level.

I also want to thank Dr. Adoración Rueda and the people of the CNM in Seville, Spain, for hosting me and providing the resources needed to carry out a three-month research stay during 2003. My work at the CNM, although not directly related to the work in this thesis, was a very important contribution to my training in this master's program.

Finally, I want to thank my family and friends. My parents, who from the beginning raised me to give my best in whatever I set out to do. My mother, who is the main reason this thesis could be written in an acceptable level of English, and my father, who for as long as I can remember has supported and encouraged my fascination with science. My brothers Diego and Fernando, who, together with my parents, have put up with my bad moods and with my disappearing from home for long periods of time, once again. Patricia, Choché, Juan Pablo and Agustín, who for five years have made me feel part of the family. My friends, Ale, Cris, Javo, Jorge, Juan, Leo, Martín, Nacho and Pucho, who always supported me and were more than willing to go out for some lehmeyuns and a beer (or several) after a long day of work. Lastly and very especially, Virigina, who for more than five years now has supported me unconditionally and does the impossible to understand what exactly it is that her boyfriend does.

#### Resumen (Summary)

This thesis deals with the study and development of an automatic synthesis algorithm for micropower operational amplifiers. The main objectives of this work are the study of existing analog design methodologies for minimum power consumption and their application to the automatic design of a reusable micropower operational amplifier with rail-to-rail<sup>1</sup> input and output stages. Accordingly, two lines of research are followed in this work.
First, the development of a new hierarchical approach to automatic synthesis algorithms, which decouples the synthesis of each amplifier stage from the main synthesis algorithm. Second, a review and application of analog circuit reuse techniques, particularly in amplifier architectures, technology migration and, especially, the technique of tuning the speed-power trade-off through the bias current.

In this thesis, using the $(g_m/I_D)$ methodology [1], we focus exclusively on obtaining designs with optimum current consumption, thus continuing the line of research of the Microelectronics Group of the IIE. The starting point for the review of advanced design methodologies is a simple automatic design algorithm for a Miller amplifier [2] based on the gain-bandwidth product. This review progresses to the automatic synthesis algorithm developed by Silveira [3], with which, starting from high level specifications (total settling time), the amplifier specifications (gain-bandwidth product, slew rate) and the transistor sizes can be obtained systematically. The resulting design meets the specifications with minimum power consumption. From this point, we develop a hierarchical algorithm for more complex architectures that include rail-to-rail input stages and a class AB output stage. This hierarchical approach separates the synthesis algorithm of each stage from the high level synthesis algorithm, which is based on the algorithm presented by Silveira [3].

The choice of architecture for each stage is not arbitrary; it is framed by the second line of research of this work: the reuse of analog designs. Two approaches are investigated along this line.
First, we study architectures for input and output stages that can be used under different operating conditions, which yields a cell that can be used in a wide range of low supply voltage, micropower applications. The second approach focuses on the possibility of tuning the circuit performance through the bias current. The idea is to tune the speed-power trade-off of the amplifier while maintaining its performance in all other aspects. This is not a new idea, and it has already been implemented successfully in a commercial application [4]. However, to the best of our knowledge, this has only been done in bipolar technology, and we therefore set out to carry out the first successful experiment of this theory using standard CMOS technology.

In short, the final objective of this thesis was to design, using an automatic synthesis algorithm, a reusable operational amplifier cell that meets the high level specifications with minimum consumption. The results obtained, both in simulations and in experimental measurements of the prototype, show that the synthesis algorithm developed here produces a design that successfully meets the settling time specifications. To compare the efficiency of the amplifier, the usual figures of merit for measuring performance in terms of the speed-power trade-off were used. The amplifier was compared against other results published in the literature [3,5–8], and its performance is shown to be superior to all of them, which supports the claim that the consumption of the amplifier was effectively optimized.

<sup>1</sup> Stages that allow signals at any voltage level between the supply rails.
The total consumption of the design with $1\mu s$ total settling time is $10.3\mu A$ with a supply voltage of only 2V ($20.6\mu W$). Tuning of the operating point was also successfully verified: it can be tuned over more than 3 decades of settling time, with consumption as low as 160nA for the amplifier with $100\mu s$ settling time, and it can be pushed to slower amplifiers with even lower consumption.

#### Abstract

This thesis deals with the study and development of an automatic synthesis algorithm for micropower operational amplifiers. The main objectives of this work are the study of the existing power-oriented methodologies for analog design and their application in the automatic design of a reusable rail-to-rail input/output micropower operational amplifier cell. Thus, two main lines of research are pursued in this work. First, the development of a new hierarchical approach in automatic synthesis algorithms to decouple the synthesis algorithm of each stage of the amplifier from the main synthesis algorithm. Second, the review and application of analog reuse techniques regarding opamp architectures, technology migration and, especially, speed-power trade-off tuning through bias current.

In this thesis, we follow the line of research of our group and focus exclusively on obtaining optimum power designs, using the $(g_m/I_D)$ methodology [1]. A simple, gain-bandwidth driven, automatic design algorithm for a Miller amplifier [2] is used as a starting point for the review of more advanced design methodologies. This review leads to an automatic synthesis algorithm developed by Silveira [3], which systematically moves from high level specifications (total settling time) to the amplifier specifications (gain-bandwidth, slew rate) and then to transistor sizing.
The design obtained complies with the high level specifications with minimum power consumption. From this point, we developed a hierarchical algorithm for more complex architectures that include rail-to-rail input stages and a power efficient class AB output stage. The hierarchical approach allows us to decouple the synthesis algorithm of each stage from the high level synthesis algorithm, which is based on the algorithm presented by Silveira [3].

The selection of the architecture of each stage is not arbitrary, but is based on the second line of research of this work: analog design reuse. Two main lines of study are followed here. The study of architectures for input and output stages that are suitable for use under different environmental conditions allows us to obtain an opamp cell that can be used in a wide range of low-voltage, micropower applications. The second line of study in analog design reuse focuses on the possibility of circuit performance tuning through the bias current, where preliminary results have already been obtained [9]. The idea in this technique is to tune the power-speed trade-off of the opamp cell using the bias current while keeping the performance in all other aspects. This idea is not new, and has already been used in an industrial application [4], but to the best of our knowledge, it has only been done in bipolar technology. Therefore, we intend to make the first experimental test of this theory in standard CMOS technology.

The final objective pursued in this thesis, then, is the successful design and implementation, using an automatic synthesis algorithm, of a reusable opamp cell that complies with the high level specifications with optimum power consumption. The results show, both in simulations and experimental measurements, that the design synthesized using the algorithm developed in this work successfully complies with the settling time specifications.
To compare the efficiency of the amplifier, we used the usual figures of merit to measure the trade-off between speed and power consumption. We achieved superior performance against several other published results [3, 5–8], which shows that the amplifier presents optimum consumption. Total current consumption of the $1\mu s$ total settling time design is $10.3\mu A$ with a supply voltage of only 2V ($20.6\mu W$). Performance tuning was also successfully verified. The cell can be tuned over more than 3 decades of settling time, with consumption as low as 160nA for a $100\mu s$ settling time, and beyond.

# Chapter 1 Introduction

This thesis deals with the study and development of an automatic synthesis algorithm for micropower operational amplifiers. The main objectives of this work are the study of the existing power-oriented methodologies for analog design and their application in the automatic design of a reusable rail-to-rail input/output micropower operational amplifier cell. Thus, two main lines of research are pursued in this work. First, the development of a new hierarchical approach in automatic synthesis algorithms to decouple the synthesis algorithm of each stage of the amplifier from the main synthesis algorithm. Second, the review and application of analog reuse techniques regarding opamp architectures, technology migration and, especially, speed-power trade-off tuning through bias current.

The $(g_m/I_D)$ methodology [1], developed at the Université Catholique de Louvain (UCL), provides a powerful tool for automatic design methodologies, as it allows the designer to systematically explore the design space and obtain a combination of the design variables that is optimum in a given sense. In this thesis, we follow the line of research of our group and focus exclusively on obtaining optimum power designs. Nevertheless, the methods and techniques applied here are general and can be applied to optimize any other aspect of the design.
We begin with a review of the MOSFET model used in this work. The ACM model [10,11] presents simple, single piece, continuous expressions and has many advantages for analog design. In particular, the fact that every equation is a function of the inversion level and a few physically based parameters makes this model ideal for use in automatic synthesis algorithms.

A simple, gain-bandwidth driven, automatic design algorithm for a Miller amplifier [2] is used as a starting point for the review of more advanced design methodologies. This review leads to an automatic synthesis algorithm developed by Silveira [3], which systematically moves from high level specifications (total settling time) to the amplifier specifications (gain-bandwidth, slew rate) and then to transistor sizing. The design obtained complies with the high level specifications with minimum power consumption. From this point, we developed a hierarchical algorithm for more complex architectures that include rail-to-rail input stages and a power efficient class AB output stage. The hierarchical approach allows us to decouple the synthesis algorithm of each stage from the high level synthesis algorithm, which is based on the algorithm presented by Silveira [3]. The selection of the architecture of each stage is not arbitrary, but is based on the second line of research of this work.

Analog design reuse has become an essential tool to bridge the gap between circuit complexity and the ever shrinking time-to-market. The urge to implement reuse capabilities is particularly intense in the analog field [12], since automatic synthesis of analog circuits is a much harder problem than that of their digital counterparts. Not only are there more aspects of the problem to take into account besides consumption, speed and area, but analog block design is also very layout and process dependent, and special skills are required to complete it.
Hence, analog automatic synthesis is much less developed than digital synthesis, further increasing the demand for experienced designer time in the analog field.

Two main lines of study are followed in analog design reuse. The study of architectures for input and output stages that are suitable for use under different environmental conditions allows us to obtain an opamp cell that can be used in different applications. The most important characteristics of rail-to-rail input stages with respect to reusability are presented, together with a new approach presented by Silveira [3] for a power efficient class AB output stage that takes advantage of a transconductance multiplication effect. The complete amplifier architecture obtained constitutes an opamp cell suitable for use in a wide range of low-voltage, micropower applications.

The second line of study in analog design reuse focuses on the possibility of circuit performance tuning through the bias current, where preliminary results have already been obtained [9]. The idea in this technique is to tune the power-speed trade-off of the opamp cell using the bias current while keeping the performance in all other aspects. This idea is not new, and has already been used in an industrial application [4], but to the best of our knowledge, it has only been done in bipolar technology. Therefore, we intend to make the first experimental test of this theory in standard CMOS technology.

The final objective pursued in this thesis, then, is the successful design and implementation, using an automatic synthesis algorithm, of a reusable opamp cell that complies with the high level specifications with optimum power consumption.

The contents of each chapter are outlined next:

- Chapter 1: Introduction. This chapter, which presents the background, motivations and objectives of this thesis.
- Chapter 2: Design Methodologies. The second chapter introduces the reader to the basic automatic design methodologies and synthesis algorithms. It begins with a review of the MOSFET model used in this work, which presents major advantages for analog design. Then the $(g_m/I_D)$ methodology, which is a keystone of all the algorithms presented and developed in this work, is introduced and explained. In the third part of the chapter, the design of a simple Miller compensated amplifier is presented. First, the characteristic equations for frequency response, offset, dynamic range and parasitic capacitances are presented. Then the basic gain-bandwidth driven algorithm and the design space exploration algorithm for power optimization are presented and explained through two design examples, for $f_T = 100kHz$ and $f_T = 50MHz$.
- Chapter 3: Low-Power OpAmp Cells: Reuse, Architecture and Synthesis. The third chapter of this thesis presents the theory and current state of knowledge in analog reuse and advanced automatic synthesis algorithms for reusable low-power operational amplifier cells. The chapter is divided into three sections. First, we present the theory and some examples of analog design reuse, including performance tuning through bias current, architectures suited for different environmental conditions and technology migration. Second, the architecture selected for the opamp cell is presented. Third, the power optimization algorithm for a given total settling time developed by Silveira [3] is presented as an example of a state of the art automatic synthesis algorithm.
- Chapter 4: Hierarchical Automated Synthesis. The fourth chapter presents the development of the hierarchical automated synthesis algorithm and its application to the design of a $1\mu s$ total settling time amplifier using the architecture seen in the previous chapter.
  First, a new expression to directly estimate the Miller compensation capacitance for optimum consumption is presented. This expression is used in the following section in the development of the high level synthesis algorithm, and saves large amounts of processing time. Then, we present the hierarchical approach for the automatic synthesis algorithm, along with the synthesis algorithms for the input and output stages. Finally, we present the simulations of the synthesized cell, including the tuning of the cell over several decades of total settling time.
- Chapter 5: Experimental Results. This chapter presents the results obtained from the measurements of the prototype fabricated in a $0.8\mu m$ standard CMOS process. The performance of the opamp cell is characterized and the reusability of the cell over several decades of total settling time is successfully verified. The usual figures of merit for power-speed efficiency in amplifiers are used to compare the performance of our cell with several other amplifiers from the literature. Excellent results are obtained, which supports the power optimization achieved by the algorithm.
- Chapter 6: Conclusions. Conclusions and ideas for future research are presented.

# Chapter 2 Design Methodologies

#### 2.1 Introduction

This chapter introduces the basic concepts and ideas that will be used to develop the automatic design algorithms presented in Chapter 3. The chapter begins by introducing the MOSFET model used in this work. By doing so, we introduce the reader to the notation and basic design equations that will be used throughout this work. The development of an automatic synthesis algorithm for two-stage Miller amplifiers allows us to explain, on an amplifier with a simple architecture, the design space exploration using the $(g_m/I_D)$ methodology [1], which is the main idea behind the synthesis automation for optimum design.
The core of the Miller amplifier synthesis algorithm is a gain-bandwidth product driven algorithm presented by Jespers [2]. Section 2.2 briefly reviews the MOSFET model presented by Cunha, Galup-Montoro and Schneider [10, 11]. Section 2.3 introduces the $(g_m/I_D)$ based methodology, and Section 2.4 presents the synthesis algorithms for Miller amplifiers. In that section, the Miller amplifier is analyzed, and the algorithm driven by the gain-bandwidth product and the algorithm for design space exploration are presented. Two design examples are also introduced to show the performance of the algorithms. Finally, conclusions are presented in Section 2.5.

### 2.2 A Current-Based MOSFET Model for IC Design

An accurate MOSFET model that provides simple expressions is critical in the development of analog design methodologies. In this work we will use the model presented by Galup-Montoro *et al.* [10,11]. This model meets several desirable requirements from the designer's point of view. Among them we would like to highlight that

- The model is single-piece, continuous and presents simple, accurate expressions. In particular, it correctly represents all the regions of operation, from weak inversion to strong inversion, including moderate inversion.
- The model conserves charge.
- The model has a minimum set of parameters, all physically based.

The main approximation of this model, referred to as the ACM model from here on, is to consider the depletion and inversion charge densities, $Q_B'$ and $Q_I'$, to be linear functions of the surface potential of the channel $\phi_S$ for a constant gate-to-bulk voltage. As a consequence, the MOSFET drain current and charges are expressed as very simple functions of two components of the drain current, namely, the forward and reverse saturation currents. A very simple relation between these two components of the drain current and the applied voltages is obtained.
One of the fundamental parameters in the MOSFET model is defined as the inverse of the slope of the curve $\phi_{Sa}$ versus $V_G$, where $\phi_{Sa}$ is the surface potential for which $Q'_I = 0$. This parameter is known as the *slope factor* and is written as

$$n = 1 + \frac{\gamma}{2\sqrt{\phi_{Sa}}} \tag{2.1}$$

where $\gamma$ is the body effect factor. The *slope factor* is slightly dependent on the gate voltage, but it can be assumed constant for hand calculations; usually $n = 1.2 \ldots 1.6$ for bulk technology.

#### 2.2.1 Current - Voltage Relationships

Let us now summarize the main expressions of the ACM model, as they will be used throughout this work. The pinch-off voltage, defined as the channel voltage for which the inversion charge density equals $-n C'_{ox}\phi_t$, where $C'_{ox}$ is the oxide capacitance per unit area and $\phi_t$ the thermal voltage, can be approximated as

$$V_P = \frac{V_G - V_{T0}}{n} \tag{2.2}$$

where every voltage is referred to the bulk voltage, $V_G$ is the gate voltage and $V_{T0}$ is the threshold voltage when the source voltage, $V_S$, is zero. The drain current is defined as

$$I_D = I_S(i_f - i_r) \tag{2.3}$$

where $i_{f(r)}$ is the forward (reverse) normalized current and

$$I_S = \frac{1}{2}\mu n C'_{ox} \phi_t^2 \frac{W}{L} \tag{2.4}$$

is the normalization current, which is four times smaller than the corresponding factor in the EKV model [13]. Here $\mu$ is the carrier mobility in the channel, and $W$ and $L$ are the channel width and length respectively. In forward saturation $i_f \gg i_r$, so the drain current can be approximated by

$$I_D = I_S i_f = \frac{1}{2} \mu n C'_{ox} \phi_t^2 \frac{W}{L} i_f \tag{2.5}$$

In the EKV model [13] the forward normalized current $i_f$ is also referred to as the inversion factor, since it indicates the inversion level of the MOSFET. As a rule of thumb, values of $i_f$ greater than 100 characterize strong inversion and values below 1 characterize weak inversion<sup>2</sup>.
Values between 1 and 100 indicate moderate inversion. The relationship between current and voltage in the MOSFET transistor is given by

$$V_P - V_{S(D)} = \phi_t \left[ \sqrt{1 + i_{f(r)}} - 2 + \ln \left( \sqrt{1 + i_{f(r)}} - 1 \right) \right] \tag{2.6}$$

where $V_{S(D)}$ is the source (drain) voltage. Combined with equation (2.2), this expression lets us estimate the gate voltage in a forward saturated transistor as a function of the inversion level and the source voltage.

$$V_G = V_{T0} + nV_S + n\phi_t \left[ \sqrt{1 + i_f} - 2 + \ln\left(\sqrt{1 + i_f} - 1\right) \right] \tag{2.7}$$

Another powerful design equation provided by the ACM model is derived from equation (2.6). The theoretical drain-to-source saturation voltage, $V_{DSsat}$, is defined in equation (2.8) as the value of $V_{DS}$ for which the ratio $Q'_{ID}/Q'_{IS} = \xi$, where $\xi$ is an arbitrary number much smaller than 1. In this definition, $1 - \xi$ represents the saturation level of the MOSFET.

$$\begin{aligned} V_{DSsat} &= \phi_t \left[ \ln \left( \frac{1}{\xi} \right) + \sqrt{1 + i_f} - 1 \right] \\ &\simeq \phi_t \left[ \sqrt{1 + i_f} + 3 \right] \quad \text{for } \xi = 1\% \end{aligned} \tag{2.8}$$

Note that $V_{DSsat}$ is independent of the inversion level in weak inversion, while in strong inversion it follows the usual square root approximation, as shown in Figure 2.1.

#### 2.2.2 The $(g_m/I_D)$ Ratio

The $(g_m/I_D)$ ratio is a key parameter in the design methodologies presented in this work, as we will see in section 2.3 and throughout this work. The ACM model provides a simple expression for the $(g_m/I_D)$ ratio in a forward saturated MOS transistor as a function of the inversion level.

$$\frac{g_m}{I_D} = \frac{1}{n\phi_t} \frac{2}{\sqrt{1 + i_f} + 1} \tag{2.9}$$

#### 2.2.3 Intrinsic Capacitances

Nine intrinsic capacitances characterize the MOS transistor [14].
Among these nine capacitances, $C_{GS}$, $C_{GD}$, $C_{GB}$, $C_{BS}$ and $C_{BD}$ are widely used in AC modelling as they accurately describe charge storage up to moderate frequencies. It can be proved [10] that $C_{GB} = C_{BG}$, so only three more capacitances need to be added to the model to complete the nine capacitances. In the case of the ACM model, $C_{SD}$, $C_{DS}$ and $C_{DG}$ are chosen. The complete expressions for these eight capacitances can be found in reference [10]. Here we will only give a simplified expression for the gate capacitance in the case of a forward saturated transistor with $V_S = 0$.

$$C_{GS} = \frac{2}{3}C_{ox}\left(1 - \frac{1}{\sqrt{1+i_f}}\right)\left(1 - \frac{1}{(\sqrt{1+i_f}+1)^2}\right) \tag{2.10}$$

$$C_{GB} = \frac{n-1}{n}(C_{ox} - C_{GS}) \tag{2.11}$$

$$C_G = C_{GS} + C_{GB} = C_{ox} \left[ \frac{n-1}{n} + \frac{1}{n} \frac{2}{3} \left( 1 - \frac{1}{\sqrt{1+i_f}} \right) \left( 1 - \frac{1}{(\sqrt{1+i_f}+1)^2} \right) \right] \tag{2.12}$$

These expressions are valid for every operating region and become very useful design tools.

Figure 2.1: Normalized $V_{DSsat}$ for several values of $\xi$ and the strong inversion approximation: $\sqrt{i_f}$

<sup>2</sup> In the EKV model, values of $i_f$ greater than 10 characterize strong inversion and values below 0.1 characterize weak inversion. Since the ratio between the normalization currents in EKV and ACM is four, these boundaries would correspond to 0.4 and 40 when using ACM. Nevertheless, for the sake of simplicity, 1 and 100 are taken.

#### 2.2.4 Noise Model

Noise is considered an internally generated, random, small signal and can be modelled by the addition of noise sources to the noiseless small-signal transistor model [14]. MOSFET noise is usually modelled as a current source between source and drain and can be considered to be composed of thermal (white) noise and flicker noise.
Both these noise sources are uncorrelated [14], so the power spectral density of the total noise will be given by

$$S_i(f) = S_{iw}(f) + S_{if}(f) \tag{2.13}$$

The classical model for the white noise power spectral density is [14]

$$S_{iw} = -\frac{4k_B T \mu Q_I}{L^2} \tag{2.14}$$

where $k_B$ is the Boltzmann constant, $T$ the absolute temperature and $Q_I$ the total inversion charge. Using the expression for $Q_I$ in the ACM model, a general expression can be obtained [10,15]

$$S_{iw} = \gamma n k_B T g_m \tag{2.15}$$

where $\gamma = 2$ in weak inversion operation and $\gamma = \frac{8}{3} \simeq 2$ in strong inversion.

The other component of noise in equation (2.13) is flicker noise, which is also called "1/f" noise because its power spectral density is nearly proportional to the inverse of the frequency. It is quite well accepted that this behavior is due to the random fluctuation of the number of carriers in the channel caused by trapping and detrapping of carriers in energy states near the $Si - SiO_2$ interface [14,15]. Arnaud and Galup-Montoro [15] provide an expression for the flicker noise power spectral density in the ACM model

$$S_{if}(f) = \frac{q^2 N_{ot} I_D \mu}{L^2 n C'_{ox}} \ln \left( \frac{n C'_{ox} \phi_t - Q'_{IS}}{n C'_{ox} \phi_t - Q'_{ID}} \right) \frac{1}{f} \tag{2.16}$$

where $N_{ot}$ is a technology parameter to be adjusted, representing the effective number of traps. This expression can be further simplified into expressions valid in weak inversion or strong inversion. However, in their work, Arnaud and Galup-Montoro provide a simple expression, valid for any inversion level, for the *corner frequency*. The corner frequency is defined as the frequency where both thermal and flicker noise have the same value.
$$f_c \simeq \frac{1}{2} \frac{g_m}{WLC'_{ox}} \frac{N_{ot}}{N^*} \tag{2.17}$$

Equation (2.17), in which $N^* = \frac{nC'_{ox}\phi_t}{q}$, can be used to obtain a simple expression to estimate the total noise power spectral density for a single transistor [15]

$$S_i = 2nk_B T g_m \left( 1 + \frac{f_c}{f} \right) \tag{2.18}$$

From the designer's perspective this is a very powerful tool, as it allows the most significant noise contributions in a circuit to be identified.

#### 2.2.5 Output Conductance

A complete model for the output conductance, including velocity saturation effects, channel length modulation and drain-induced barrier lowering, is included in the ACM model. Nevertheless, we will use the usual and much simpler approximate model, valid for the forward saturated long-channel transistor,

$$g_0 = \frac{dI_D}{dV_D} = \frac{I_D}{V_A} \tag{2.19}$$

where $V_A$ is referred to as the Early voltage and is assumed proportional to the transistor length.

#### 2.2.6 Non-quasi-static Model and Second Order Effects

So far, long and wide channel MOSFETs have been considered, and the model presented is valid for low and medium frequency analysis. The ACM model also includes a complete non-quasi-static model and a set of equations that take second order effects into consideration, such as mobility reduction, velocity saturation and channel length modulation.

#### 2.2.7 Why ACM?

The ACM model presented in this section shows major advantages for analog design with MOS transistors. All of them can be summed up in the fact that all the ACM model expressions are functions of the forward normalized current (also known as the inversion factor) and a very small set of parameters, all physically based. The fact that we can sweep all the regions of operation with one variable, using simple single-piece equations for each transistor characteristic, is a major advantage in design automation algorithms.
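To illustrate what sweeping all regions of operation with one variable looks like in practice, the following minimal sketch evaluates equations (2.7) and (2.8) over several decades of $i_f$. The numerical values ($\phi_t = 25.9mV$, $n = 1.3$, $V_{T0} = 0.75V$) and the function names are assumptions chosen for illustration only; they are not process data or code from this work.

```python
import math

PHI_T = 0.0259  # thermal voltage near 300 K, in volts (assumed value)


def acm_bracket(i):
    """Bracketed term of eq. (2.6): sqrt(1+i) - 2 + ln(sqrt(1+i) - 1)."""
    s = math.sqrt(1.0 + i)
    return s - 2.0 + math.log(s - 1.0)


def gate_voltage(i_f, v_t0, n, v_s=0.0, phi_t=PHI_T):
    """Eq. (2.7): gate voltage of a forward-saturated transistor."""
    return v_t0 + n * v_s + n * phi_t * acm_bracket(i_f)


def vds_sat(i_f, xi=0.01, phi_t=PHI_T):
    """Eq. (2.8), exact form, for a saturation level of 1 - xi."""
    return phi_t * (math.log(1.0 / xi) + math.sqrt(1.0 + i_f) - 1.0)


if __name__ == "__main__":
    # Illustrative parameters only (not taken from any real process).
    n, v_t0 = 1.3, 0.75
    for i_f in (0.01, 0.1, 1.0, 10.0, 100.0, 1000.0):
        print(f"i_f={i_f:8.2f}  V_G={gate_voltage(i_f, v_t0, n):.3f} V  "
              f"V_DSsat={vds_sat(i_f) * 1e3:.1f} mV")
```

The same two functions cover weak, moderate and strong inversion without any region-dependent branching, which is the property a synthesis loop relies on.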
Widely used models such as BSIM use a large number of parameters, most of which are empirical fitting parameters. These models are fine for computer-based simulators but are hardly suitable for hand calculations and design algorithms. The EKV model, on the other hand, has many of the advantages of the ACM model: it is inversion factor based and has simple expressions, few parameters, etc. However, it uses nonphysical interpolating curves to bridge the gap between weak and strong inversion. The EKV model therefore does not allow the calculation of the nonreciprocal capacitances and does not conserve charge [11]. Nevertheless, most of the algorithms introduced in this work can be easily used with the EKV model.

## 2.3 The $(g_m/I_D)$ Based Methodology for Analog Design

The $(g_m/I_D)$ based methodology allows a unified synthesis methodology in all regions of operation of the MOS transistor. It provides an alternative that takes full advantage of the moderate inversion region to obtain a reasonable power-speed compromise [1]. This methodology has been widely used since its publication, proving its advantages in analog circuit design [10, 11, 16–35].

Figure 2.2: $(g_m/I_D)$ ratio as a function of the inversion factor $i_f$ for typical bulk-technology parameters.

The proposed methodology considers the relationship between the transconductance over drain current ratio $(g_m/I_D)$ and the normalized drain current $(\frac{I_D}{W/L})$ as a fundamental design tool. This choice of the $(g_m/I_D)$ ratio is based on the following reasons:

1. It gives an indication of the device operation region.
2. It is strongly related to the performance of analog circuits.
3. It provides a tool for calculating the transistor dimensions.

The first reason can be explained using the ACM model. Equation (2.9) shows a one-to-one relationship between the inversion factor $i_f$ and the $(g_m/I_D)$ ratio. This relationship can be seen in Figure 2.2, where the three regions, strong, moderate and weak inversion, are shown.
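The one-to-one relationship of equation (2.9) can be made concrete with a short sketch that maps $i_f$ to $(g_m/I_D)$ and back, and labels the region with the rule of thumb of section 2.2.1. The function names and the default $\phi_t$ are assumptions for illustration; the inverse is obtained by solving equation (2.9) for $\sqrt{1+i_f}$.

```python
import math

PHI_T = 0.0259  # thermal voltage near 300 K, in volts (assumed value)


def gm_over_id(i_f, n, phi_t=PHI_T):
    """Eq. (2.9): gm/ID (in 1/V) as a function of the inversion factor."""
    return 2.0 / (n * phi_t * (math.sqrt(1.0 + i_f) + 1.0))


def inversion_factor(gm_id, n, phi_t=PHI_T):
    """Inverse of eq. (2.9): inversion factor for a given gm/ID (in 1/V)."""
    s = 2.0 / (n * phi_t * gm_id) - 1.0  # s = sqrt(1 + i_f)
    if s <= 1.0:
        # gm/ID cannot reach the weak-inversion asymptote 1/(n*phi_t)
        raise ValueError("gm/ID must be below 1/(n*phi_t)")
    return s * s - 1.0


def region(i_f):
    """Rule-of-thumb region boundaries (i_f = 1 and i_f = 100)."""
    if i_f < 1.0:
        return "weak"
    if i_f <= 100.0:
        return "moderate"
    return "strong"


if __name__ == "__main__":
    n = 1.3  # illustrative slope factor, not a process value
    for i_f in (0.1, 1.0, 10.0, 100.0, 1000.0):
        ratio = gm_over_id(i_f, n)
        print(f"i_f={i_f:7.1f}  gm/ID={ratio:5.2f} 1/V  {region(i_f)}")
```

Because the mapping is monotonic, choosing a $(g_m/I_D)$ value is equivalent to choosing an inversion level, which is why the ratio can serve as the design variable.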
The relationship between the $(g_m/I_D)$ ratio and power efficiency can be seen in an "intrinsic gain-stage" example, where both gain and transition frequency are linear functions of the transconductance

$$A_0 = -\frac{g_m}{I_D} V_A \tag{2.20}$$

$$f_T = \frac{1}{2\pi} \frac{g_m}{C_L} \tag{2.21}$$

where $V_A$ is the Early voltage of the transistor and $C_L$ is the load capacitance of the stage. Equations (2.20) and (2.21) show that a greater $(g_m/I_D)$ ratio translates into greater gain and bandwidth for the same power consumption.

Finally, the ability to precisely obtain transistor dimensions with this methodology lies in the fact that the $(g_m/I_D)$ vs $I_D/(W/L)$ characteristic is independent of transistor size, and is therefore a unique characteristic for all transistors of the same type in a given batch [1]. This "universal" quality of the $(g_m/I_D)$ curve means that once two of the three values $(g_m/I_D)$, $g_m$ and $I_D$ are chosen, the $(W/L)$ ratio is unambiguously determined [1].

#### 2.4 Automatic Synthesis for Miller Amplifiers

In this section an automatic synthesis algorithm for Miller amplifiers is presented. This will illustrate the use of the $(g_m/I_D)$ methodology in automatic circuit synthesis. First, the Miller amplifier is analyzed and the equations that characterize its behavior are presented. Then the concept of design space exploration for optimum design is presented. The design space exploration in the case of the Miller amplifier is implemented with a gain-bandwidth product driven algorithm that is also explained in this section. Finally, two amplifiers are synthesized, each for a different transition frequency: the first for $f_T = 100kHz$ and the second for $f_T = 50MHz$.

#### 2.4.1 The Miller Amplifier

The Miller compensated amplifier is a well known opamp architecture that can achieve good power consumption performance in low frequency applications.
Figure 2.3 shows the amplifier schematic, where $C_m$ is the Miller compensating capacitance, $C_1$, $C_{out}$ and $C_3$ are parasitic capacitances and $C_L$ is the load capacitance. In the notation used, $C_2 = C_{out} + C_L$ is the total output capacitance of the amplifier.

Figure 2.3: Miller Amplifier, including parasitic capacitances.

#### Gain-Bandwidth Product and Phase Margin

The transfer function of this amplifier is given in equation (2.22), where $gm_1$ ($gm_2$) is the transconductance of the differential pair M1a - M1b (output stage M2) and $g_1$ ($g_2$) is the output conductance of the first stage (second stage).

$$H(s) = -\frac{gm_1(C_m s - gm_2)\frac{1}{g_1}\frac{1}{g_2}}{1 + \left(\frac{C_1}{g_1} + \frac{C_2}{g_2} + C_m\left(\frac{gm_2}{g_1g_2} + \frac{1}{g_1} + \frac{1}{g_2}\right)\right)s + \left(\frac{C_1C_2 + C_m(C_1 + C_2)}{g_1g_2}\right)s^2} \tag{2.22}$$

The DC gain and the expressions for the pole and zero frequencies can be easily derived from equation (2.22)

$$G = \frac{gm_1gm_2}{g_1g_2} \tag{2.23}$$

$$\omega_{DP} \simeq \frac{1}{\frac{gm_2}{g_1g_2}C_m} \tag{2.24}$$

$$\omega_{NDP} \simeq \frac{gm_2C_m}{C_1C_2 + C_m(C_1 + C_2)} \tag{2.25}$$

$$\omega_z = \frac{gm_2}{C_m} \tag{2.26}$$

where $G$ is the DC gain, $\omega_{DP}$ and $\omega_{NDP}$ are the amplifier's dominant and non-dominant pole angular frequencies and $\omega_z$ is the amplifier's right-half plane zero angular frequency<sup>3</sup>.
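As a quick sanity check of the approximations (2.24) and (2.25), the sketch below compares them with the exact roots of the second-order denominator of equation (2.22). The component values in the usage example are assumed, order-of-magnitude numbers for a micropower design and do not come from this work; the function name is likewise hypothetical.

```python
import math


def miller_poles(gm1, gm2, g1, g2, c1, c2, cm):
    """Approximate (eqs. 2.23-2.26) and exact poles of eq. (2.22)."""
    # Denominator of eq. (2.22): 1 + a1*s + a2*s^2
    a1 = c1 / g1 + c2 / g2 + cm * (gm2 / (g1 * g2) + 1.0 / g1 + 1.0 / g2)
    a2 = (c1 * c2 + cm * (c1 + c2)) / (g1 * g2)
    disc = a1 * a1 - 4.0 * a2
    if disc < 0.0:
        raise ValueError("complex poles: pole-splitting assumption fails")
    # Larger root first, then the smaller one from the product of the
    # roots (1/a2), which avoids subtractive cancellation.
    w_ndp_exact = (a1 + math.sqrt(disc)) / (2.0 * a2)
    w_dp_exact = 1.0 / (a2 * w_ndp_exact)
    return {
        "dc_gain": gm1 * gm2 / (g1 * g2),                      # eq. (2.23)
        "w_dp": g1 * g2 / (gm2 * cm),                          # eq. (2.24)
        "w_ndp": gm2 * cm / (c1 * c2 + cm * (c1 + c2)),        # eq. (2.25)
        "w_z": gm2 / cm,                                       # eq. (2.26)
        "w_dp_exact": w_dp_exact,
        "w_ndp_exact": w_ndp_exact,
    }


if __name__ == "__main__":
    # Assumed illustrative values (siemens and farads).
    r = miller_poles(gm1=2e-6, gm2=20e-6, g1=20e-9, g2=100e-9,
                     c1=0.1e-12, c2=10e-12, cm=2e-12)
    for name, value in r.items():
        print(f"{name:12s} {value:.4g}")
```

Multiplying the returned `dc_gain` by `w_dp` gives $gm_1/C_m$, since the factor $gm_2/(g_1 g_2)$ cancels; this product depends only on the input stage transconductance and the compensation capacitance.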
Equations (2.23)-(2.26) can be used to obtain the following relationships

$$\omega_T = G\omega_{DP} = \frac{gm_1}{C_m} \tag{2.27}$$

$$NDP = \frac{\omega_{NDP}}{\omega_T} = \frac{gm_2}{gm_1} \frac{C_m^2}{C_1C_2 + C_m(C_1 + C_2)} \tag{2.28}$$

$$Z = \frac{\omega_z}{\omega_T} = \frac{gm_2}{gm_1} \tag{2.29}$$

where $\omega_T$ is the gain-bandwidth product of the first order system neglecting the effect of the non-dominant pole. $NDP$ and $Z$ are the non-dominant pole and right-half plane zero frequencies normalized to $\omega_T$. These two latter relationships determine the phase margin (PM) of the amplifier. Assuming $NDP, Z > 1$ (that is, $\omega_{NDP}, \omega_z > \omega_T$), PM can be approximated as

$$PM = 90 - \arctan\left(\frac{1}{NDP}\right) - \arctan\left(\frac{1}{Z}\right) \tag{2.30}$$

The exact PM expression must take into account that the actual transition frequency is different from the first order approximation. Finally, equations (2.28) and (2.29) can be combined to obtain an expression for the Miller compensating capacitance for a given $NDP$ over $Z$ ratio.

$$C_m = \frac{1}{2} \frac{NDP}{Z} \left[ C_1 + C_2 + \sqrt{(C_1 + C_2)^2 + 4\frac{Z}{NDP} C_1 C_2} \right] \tag{2.31}$$

Since the $NDP$ and $Z$ ratios determine the phase margin of the amplifier, as we saw in equation (2.30), equations (2.27) and (2.31) become powerful design tools in Miller amplifier synthesis.

<sup>3</sup> In what follows, angular frequencies ($\omega$) will be referred to in the text, for compactness, as frequencies, while the actual frequencies will be denoted as $f$.

#### Offset

Two effects will be considered in the input offset voltage of a Miller amplifier: systematic offset and random offset. The first one is due to the finite output impedance of the current mirror (M3a - b).
The second one is due to the mismatch between the mirror transistors and the mismatch between the differential pair transistors.

**Systematic Offset**, as stated above, is due to the finite output impedance of the current mirror. When the drain-source voltages of the two mirror transistors differ, a difference appears between their drain currents. The relative error in the copy can be estimated as

$$\frac{\Delta I_D}{I_D} = \frac{1}{I_D} \frac{\Delta V}{r_o} = \frac{\Delta V}{V_A} \tag{2.32}$$

where $\Delta V = V_{DS3a} - V_{DS3b}$, $r_o = V_A/I_D$ is the output resistance of the mirror transistors and $V_A$ is the Early voltage. The offset voltage due to this copy error can be calculated through the differential pair transconductance as

$$V_{off} = \frac{\Delta I_D}{g_m} = \frac{\Delta I_D/I_D}{(g_m/I_D)} = \frac{\Delta V/V_A}{(g_m/I_D)} \tag{2.33}$$

which is a useful expression, as it estimates the systematic offset voltage as a function of the $(g_m/I_D)$ ratio of the differential pair.

**Random Offset** is due to the mismatch between the transistors of the mirror and the mismatch between the transistors of the differential pair. These mismatches are modelled as follows. The current through a forward saturated transistor can be expressed as a function of the current factor ($\beta = \mu C'_{ox}W/L$), the threshold voltage ($V_{T0}$) and the gate voltage. As the gate voltage is the same for both transistors, either in the mirror or in the differential pair, the current error can be written as

$$I_D - \overline{I_D} = \Delta I_D = \frac{\partial I_D}{\partial V_{T0}} \Delta V_{T0} + \frac{\partial I_D}{\partial \beta} \Delta \beta \tag{2.34}$$

where $\overline{I_D}$ is the mean current value and $I_D$ is the actual current value of each sample.
The partial derivatives can be approximated as

$$\frac{\partial I_D}{\partial V_{T0}} = \frac{\partial I_D}{\partial V_G} \frac{\partial V_G}{\partial V_{T0}} = g_m \cdot 1 = g_m \tag{2.35}$$

$$\frac{\partial I_D}{\partial \beta} = \frac{I_D}{\beta} \tag{2.36}$$

allowing us to rewrite equation (2.34) as

$$\Delta I_D = g_m \Delta V_{T0} + \frac{I_D}{\beta} \Delta \beta \tag{2.37}$$

The standard deviation of the current will depend on the standard deviations of $V_{T0}$ and $\beta$. Since these two random effects are considered statistically independent, the relative standard deviation of the current is

$$\frac{\sigma_{I_D}}{I_D} = \sqrt{\left(\frac{g_m}{I_D}\right)^2 \sigma_{V_{T0}}^2 + \frac{\sigma_{\beta}^2}{\beta^2}} \tag{2.38}$$

where the variances of $V_{T0}$ and $\beta$ ($\sigma_{V_{T0}}^2$, $\sigma_{\beta}^2$) can be expressed using Pelgrom's model [36]:

$$\sigma_{V_{T0}}^2 = \frac{A_{V_{T0}}^2}{WL} + S_{V_{T0}}^2 D^2 \tag{2.39}$$

$$\frac{\sigma_{\beta}^2}{\beta^2} = \frac{A_{\beta}^2}{WL} + S_{\beta}^2 D^2 \tag{2.40}$$

Here $D$ represents the distance between transistors and depends strongly on the transistor layout and size. When considering transistors with an almost square structure, $D$ can be approximated as $D = \sqrt{WL}$ [3]. Finally, $A_{\beta}$, $A_{V_{T0}}$, $S_{V_{T0}}$ and $S_{\beta}$ are the coefficients that characterize the matching properties of a particular process and can be obtained from the foundry itself or from published results on matching. Taking the transistor mismatch in the mirror and in the differential pair as statistically independent, the total current error can be expressed as

$$\frac{\sigma_{I_D}}{I_D} = \sqrt{\left(\frac{\sigma_{I_D}}{I_D}\right)_{pair}^2 + \left(\frac{\sigma_{I_D}}{I_D}\right)_{mirr}^2} \tag{2.41}$$

which gives a total mismatch offset

$$V_{off} = \frac{\sigma_{I_D}/I_D}{(g_m/I_D)_{pair}} \tag{2.42}$$

Equations (2.38)-(2.42) form a useful set to easily estimate the input offset voltage due to transistor mismatch in the amplifier. Figure 2.4 shows an example where the offset voltage is evaluated as a function of the $(g_m/I_D)$ ratio of the differential pair for given transistor sizes and mirror $(g_m/I_D)$ ratio. Here we can see a steep decrease in the offset voltage as we move from strong to moderate inversion. As we enter deep weak inversion the offset voltage tends to a constant value. It can also be seen that, in general, systematic offset is much smaller than mismatch offset. Similar and further analysis of the effect of the $(g_m/I_D)$ ratio on mirror precision and on the OTA's offset voltage can be found in [3].

Figure 2.4: Offset voltage in a differential pair with active load as a function of $(g_m/I_D)_{pair}$.

#### Input Common Mode Range and Output Swing

Input Common Mode Range (ICMR) and Output Swing (OS) can be easily estimated using the equations provided by the ACM model for the saturation voltage ($V_{DSsat}$, equation (2.8)) and the gate voltage (equation (2.7)).

**ICMR** is determined by the saturation voltage of the differential pair's current source (M5) and the gate voltage of the mirror (M3a - b) (see Figure 2.3).

$$V_{SS} + V_{GS3} + V_{DSsat1} - V_{GS1} < V_{iCM} < V_{DD} - V_{DSsat5} - V_{GS1} \tag{2.43}$$

**Output Swing**, on the other hand, is determined by the saturation voltages of the output stage transistors (M2 and M4, see Figure 2.3).

$$V_{SS} + V_{DSsat2} < V_o < V_{DD} - V_{DSsat4} \tag{2.44}$$

#### 2.4.2 Gain-Bandwidth Driven Synthesis Algorithm

The basic synthesis algorithm for the Miller amplifier that will be applied here was presented by Jespers [2]. The main idea is to synthesize a Miller amplifier for a given gain-bandwidth product $\omega_T$ and a given phase margin (PM). The rest of the performance specifications (DC gain, SR, noise, etc.)
are adjusted by the selection of the design variables ($(g_m/I_D)$ ratios and lengths). The designer chooses the $(g_m/I_D)$ ratio of each stage (that is, the $(g_m/I_D)$ ratio of the differential pair transistors and the $(g_m/I_D)$ ratio of transistor M2) taking into consideration, for example, the target gain-bandwidth product and the available current budget. The length of the transistors is also selected according to noise, gain and matching considerations. The $NDP$ and $Z$ ratios can be chosen for a given target phase margin. For example, it is common to take $NDP = 2.2$ and $Z = 10$ to achieve $PM \simeq 60^{\circ}$<sup>4</sup>. Then, using equations (2.27) and (2.31) and applying the $(g_m/I_D)$ methodology, transistor sizes and other parameters (DC gain, SR, noise, etc.) can be obtained. The basic algorithm, as presented by Jespers, is shown in Algorithm 2.1.

Some design criteria are missing in this algorithm. Step 4 (Algorithm 2.1) designs the mirror to minimize offset. This is one possible choice, but noise, frequency response or gain could be used jointly with, or instead of, this criterion. In step 5 (Algorithm 2.1), the design criterion for the $(g_m/I_D)$ ratio is not specified at all. Section 2.4.4 explains the criterion used in this work.

#### 2.4.3 Design Optimization Through Design Space Exploration

In Algorithm 2.1, the $(g_m/I_D)$ ratios and lengths must be selected a priori by the designer. However, this may not be an easy task for an inexperienced user and can lead to designs that are far from optimal. To obtain an optimum design, we can define a design space by the $(g_m/I_D)$ ratios of the active transistors of both stages and explore the characteristics of the amplifier over it. This may be achieved by applying Algorithm 2.1 on a mesh of points covering the defined design space. In this way, constant-level curves for every aspect of the amplifier required by the designer can be plotted to graphically show the behavior of the amplifier in the design space.
This idea is based on the $(g_m/I_D)$ based methodology [1], explained in section 2.3, and has been used in previous works [2,3]. In this way, not only can optimum combinations of the $(g_m/I_D)$ ratios be obtained, but the evolution of the aspect under study can also be evaluated. This means that we may consider several aspects of the amplifier and select an optimum combination of the $(g_m/I_D)$ ratios in a multi-aspect sense. Lengths and non-critical $(g_m/I_D)$ ratios (e.g. second stage bias transistor) can also be selected using a similar methodology. For example, a sweep of the length of the mirror transistors can be used to select an optimum trade-off between frequency response and gain. Applying these ideas, we developed an algorithm that explores the design space of the Miller amplifier to obtain optimum consumption with good gain and area. Thus, the algorithm also analyzes the effects of the lengths and of the passive transistors' $(g_m/I_D)$ ratios to consider the trade-offs in the performance of the amplifier. This algorithm is presented in Algorithm 2.2, and in sections 2.4.4 and 2.4.5 we present two design examples.

<sup>4</sup> Actually, since $NDP$ and $Z$ are normalized to $\omega_T$ of the first order system approximation, we could expect the real PM to be bigger. However, other effects present in the real amplifier, such as the pole-zero doublet from the input stage mirror, will eventually have a negative impact on the PM, leading to an actual PM of about $60^{\circ}$.

#### **Algorithm 2.1** Gain-Bandwidth Driven Automatic Synthesis.

1. The Miller compensating capacitance $C_m$ is roughly estimated as a first guess.
2. The differential pair is synthesized using equations (2.27) and (2.5)

   $$gm_1 = \omega_T C_m \Rightarrow I_{D1} = \frac{gm_1}{(g_m/I_D)_1}$$

   $$(W/L)_1 = \frac{I_{D1}}{\frac{1}{2}\mu n C'_{ox}\phi_t^2 i_{f1}}$$

3. The output stage transistor M2 is synthesized using equations (2.29) and (2.5)

   $$gm_2 = Zgm_1 \Rightarrow I_{D2} = \frac{gm_2}{(g_m/I_D)_2}$$

   $$(W/L)_2 = \frac{I_{D2}}{\frac{1}{2}\mu nC'_{ox}\phi_t^2 i_{f2}}$$

4. A minimum systematic offset criterion can be used to design the current mirror M3a-M3b

   $$V_{G3a} = V_{D3b} = V_{G2} \Rightarrow (g_m/I_D)_3 = (g_m/I_D)_2$$

   $$I_{D3} = I_{D1} \Rightarrow (W/L)_3 = \frac{I_{D1}}{\frac{1}{2}\mu n C'_{ox}\phi_t^2 i_{f2}}$$

5. The design of the output stage current source transistor M4 can be based on several criteria. An example is shown in section 2.4.4.

   $$(W/L)_4 = \frac{I_{D2}}{\frac{1}{2}\mu n C'_{ox}\phi_t^2 i_{f4}}$$

6. The synthesis of the first stage current source transistor M5 follows

   $$V_{G5} = V_{G4} \Rightarrow (g_m/I_D)_5 = (g_m/I_D)_4$$

   $$I_{D5} = 2I_{D1} \Rightarrow (W/L)_5 = \frac{2I_{D1}}{\frac{1}{2}\mu n C'_{ox}\phi_t^2 i_{f4}}$$

7. All the transistor sizes are obtained using the selected lengths.
8. The parasitic capacitances $C_1$ and $C_2$ are calculated. Now we can use equation (2.31) to obtain a new Miller capacitance $C_m$

   $$C_m = \frac{1}{2} \frac{NDP}{Z} \left[ C_1 + C_2 + \sqrt{(C_1 + C_2)^2 + 4\frac{Z}{NDP}C_1C_2} \right]$$

9. With the newly calculated $C_m$ we iterate from step 2, until the value of $C_m$ converges.

#### **Algorithm 2.2** Power Optimization Automatic Synthesis

1. The length of all transistors is set to minimum.
2. The $(g_m/I_D)_4$ ratio is fixed somewhere in moderate inversion.
3. The design space is swept using Algorithm 2.1. We choose the optimum combination of input and output stage $(g_m/I_D)$ ratios.
4. The length of M3 is swept. We choose $L_3$ to obtain good gain and frequency response.
5. The $(g_m/I_D)_4$ ratio is swept. We choose it to obtain good output swing and total area.
6. The length of M4 is swept. We choose $L_4$ to obtain good gain and total area. We iterate with step 5 until we converge to a solution for both $(g_m/I_D)_4$ and $L_4$.
7. We run Algorithm 2.1 with the values obtained for the $(g_m/I_D)$ ratios and the lengths.
The algorithm is explained thoroughly when the first example is introduced in section 2.4.4.

#### 2.4.4 Synthesis Example: Micropower 100kHz Miller Amplifier

Table 2.1 shows the specifications for the design of this first example. As can be seen, this design is intended for low frequency, low supply voltage, micropower operation. The process parameters are taken from a $0.8\mu m$ technology.

| Parameter    | Specification |
|--------------|---------------|
| $f_T$        | 100kHz        |
| Consumption  | $< 1\mu A$    |
| Power Supply | 2V            |
| $C_L$        | 10pF          |
| PM           | $> 60^{\circ}$ |
| Tech.        | $0.8\mu m$    |

Table 2.1: Specifications for a Micropower 100kHz Miller Amplifier

First, the initial conditions for the algorithm (steps 1 and 2) are set. The lengths of the transistors can later be adjusted to improve gain, and $(g_m/I_D)_4$ is set to 10. We then sweep the design space to obtain constant consumption, area and dc gain curves. Figures 2.5, 2.6 and 2.7 show the space exploration for consumption, area and dc gain respectively. They clearly show that optimum consumption with reasonable gain and die area can be obtained when the active transistors of both stages are in weak inversion. In particular, we choose:

$$(g_m/I_D)_1 = 24$$

$$(g_m/I_D)_2 = 22$$

Figure 2.5: Design space exploration: Total consumption (in $\mu A$) of the 100kHz Miller Amplifier.

Figure 2.6: Design space exploration: Die area estimation (in $\mu m^2$) of the 100kHz Miller Amplifier.

Figure 2.7: Design space exploration: DC Gain (in dB) of the 100kHz Miller Amplifier.

The lengths of the differential pair transistors and of the output stage active transistor are chosen as $3\mu m$ to avoid large sizes while still obtaining a good gain.

The mirror transistors can be designed according to several criteria. As shown in Algorithm 2.1 (step 4), we choose to minimize the systematic offset of the amplifier. As seen in section 2.4.1, systematic offset is due to a difference in the drain-source voltage of both transistors.
Thus it can be minimized if both voltages are designed to be the same, which can be achieved using the same $(g_m/I_D)$ ratio as transistor M2. Now, we only need to choose the length of both transistors. To do so, we will consider two effects of the length on the amplifier's behavior: gain and frequency response (step 4).

The effect on gain lies in the fact that the output resistance of the transistors can be modelled as proportional to the length through the Early voltage ($r_o = V_A/I_D$, $V_A=V_EL$). Regarding the frequency response, a given $(g_m/I_D)$ ratio and drain current fix the W/L ratio. That means that a larger length implies a larger width and, thus, a larger parasitic capacitance (which depends roughly on W and on the WL product). Since the parasitic capacitance $C_3$ adds a pole-zero doublet to the response of the Miller amplifier [37], the length of the mirror transistors also has an effect on the frequency response. This doublet has a small impact on the frequency response but a large one on the transient response and should be kept beyond the working frequencies.

$$\omega_{pDOUB} = \frac{g_{m3}}{C_3} \tag{2.45}$$

$$\omega_{zDOUB} = \frac{2g_{m3}}{C_3} \tag{2.46}$$

Both effects can be seen in Figure 2.8, where the doublet frequency is calculated using equation (2.45). The improvement in gain starts to diminish as a longer transistor is selected, because for $L_3 \gg L_1$ the gain is determined only by $L_1$. Thus, in this design we choose:

$$L_3 = 9\mu m \tag{2.47}$$

which gives a good gain and a doublet frequency almost a decade above the transition frequency.

The design criteria for the output stage bias transistor (M4) were not specified in Algorithm 2.1. We choose to select its $(g_m/I_D)$ ratio and length with the following analysis (steps 5 and 6). The design of transistor M4 affects gain, total area (like any transistor) and output swing. Length will affect gain and area (eqs.
2.49 and 2.50) and the $(g_m/I_D)$ ratio will affect area and output swing (eqs. 2.48 and 2.49).

$$V_{DSsat} = f\left((g_m/I_D)\right) \Rightarrow OS = f\left((g_m/I_D)\right) \tag{2.48}$$

$$\left(\frac{W}{L}\right) = f\left((g_m/I_D)\right) \Rightarrow Area = f\left((g_m/I_D), L_4\right) \tag{2.49}$$

$$Gain = f(g_4 + g_2) \Rightarrow Gain = f(L_4) \tag{2.50}$$

where $g_4, g_2$ are the output conductances of transistors M4 and M2 respectively.

Figure 2.9 shows the dependence of the output swing and total area on $(g_m/I_D)_4$. In this figure we see that the output swing behaves, as expected, according to the relationship between $V_{DSsat}$ and the $(g_m/I_D)$ ratio seen in equation (2.8). This behavior shows that there is no reason to go into deep weak inversion, because no further increase in the output swing is achieved. What is more, the total area starts to rise exponentially as $(g_m/I_D)_4$ increases beyond 10. Thus, we choose

$$(g_m/I_D)_4 = 6$$

which gives an upper output swing limit of 0.7V (power supply: $\pm 1V$) and a total area of about $10^4 \mu m^2$.

Figure 2.8: Gain and doublet frequency dependence on the length of M3.

Figure 2.9: Output swing and total area dependence on $(g_m/I_D)_4$ ratio.

Figure 2.10: Gain and total area dependence on the length of M4.

Figure 2.10 shows the dependence of the gain and total area on $L_4$. As in the case of transistor M3, increasing the length improves the gain, but only to some extent. We choose

$$L_4 = 60 \mu m$$

which gives a good gain and a total area of about $10^4 \mu m^2$.

#### Final Design

The final design obtained (step 7) can be seen in Table (2.2). Here we see that, as expected, the frequency response specification is not fully met due to the presence of higher order effects like the mirror pole-zero doublet. Systematic offset is effectively eliminated, as simulation showed that the drain-source voltage difference in the mirror transistors is below 5mV. Total consumption is kept below $1\mu A$.
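As a rough cross-check of the choice of $L_3$, equation (2.45) can be evaluated with the final values listed in Table 2.2. This is only a sketch under assumptions that the text does not state explicitly: that each mirror transistor carries the first-stage current $I_{D1} = 65nA$, that the mirror keeps the nominal $(g_m/I_D)$ ratio of M2 (22), and that the tabulated $C_3 = 0.36pF$ is the capacitance at the mirror node.

$$g_{m3} \approx (g_m/I_D)_3 \, I_{D3} \approx 22\,V^{-1} \times 65\,nA \approx 1.43\,\mu A/V$$

$$f_{pDOUB} = \frac{g_{m3}}{2\pi C_3} \approx \frac{1.43\,\mu A/V}{2\pi \times 0.36\,pF} \approx 630\,kHz \qquad f_{zDOUB} = 2 f_{pDOUB} \approx 1.26\,MHz$$

Under these assumptions the doublet pole sits roughly six times above the 100kHz target, which is in line with the "almost a decade" margin mentioned when $L_3$ was selected.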
The ratio between both stages' bias currents is due to factor Z (equation (2.29)). It can be easily seen that for the same $(g_m/I_D)$ ratios the currents ratio equals Z. A more efficient architecture is obtained when using an R-C compensating network that eliminates the right-half plane zero. This design was simulated in SPICE using the BSIM3v3 transistor model.

| 100kHz Miller Amplifier | | |
|-------------|------------|---------------|
| Gain | Total | 113.04dB |
| | 1st stage | 55.23dB |
| | 2nd stage | 57.81dB |
| Freq. Resp. | $f_T$ | 91.9kHz |
| | PM | $57.9^\circ$ |
| Swing Lim. | OSwing | Up.: 706.0mV |
| | | Low: -890.0mV |
| | ICMR | Up.: 216.0mV |
| | | Low: -740.1mV |
| Offset | Mismatch | 4.56mV |
| | Systematic | $1.82\mu V$ |
| Pwr. Cons. | $1.69 \mu W (845 nA)$ | |
| | $I_{D1}$ | 65nA |
| | $I_{D2}$ | 716nA |

| Capacitances | | |
|--------|-----------------|---------|
| Miller | $C_m$ | 2.51 pF |
| Paras. | $C_1$ | 0.25 pF |
| | $C_2 - C_{out}$ | 0.12 pF |
| | $C_3$ | 0.36 pF |

| Sizes | $W(\mu m)$ | $L(\mu m)$ |
|-----------|-------------|----|
| M1a-b | 18.5 | 3 |
| M3a-b | 18 | 9 |
| M2 | 66.4 | 3 |
| M4 | 36.8 | 60 |
| M5 | 6.6 | 60 |
| Tot. Area | $0.001mm^2$ | |

Table 2.2: Final design obtained for a 100kHz Miller amplifier using Algorithm 2.1

| $f_T$ | 50MHz |
|--------------|---------------|
| Consumption | $1 \dots 3mA$ |
| Power Supply | 2V |
| $C_L$ | 5pF |
| PM | $> 60^\circ$ |

Table 2.3: Specifications for a 50MHz Miller Amplifier

Figure 2.11 compares the result of the simulation with the expected result obtained from MATLAB. It can be seen that both are in very good agreement. There is a slight difference in the low frequency gain, which can be explained by the simplified output conductance model used in the synthesis.
Another difference can be seen in the phase at high frequencies, which can be explained because the algorithm uses a simplified second order transfer function.

Figure 2.11: Frequency response of the 100kHz Miller Amplifier.

#### 2.4.5 Synthesis Example: 50MHz Miller Amplifier

Having seen the design of a micropower 100kHz Miller amplifier, a question may arise: does this algorithm also work at higher frequencies? To answer that question we propose the design of a 50MHz Miller amplifier using Algorithm 2.2. The specifications for this amplifier can be seen in Table (2.3).

The exploration of the design space can be seen in Figures 2.12, 2.13 and 2.14. Figure 2.12 shows that the optimum consumption region has moved towards strong inversion. This result was expected, as explained next. For a transition frequency several orders of magnitude higher than in the previous case, the input transconductance also has to be several orders of magnitude higher. If we intend to keep the same $(g_m/I_D)$ ratio, the drain current must increase along with the transconductance. But, as we showed in section 2.3, for a given $(g_m/I_D)$ ratio the ratio of $I_D$ to W/L is determined. Thus the W/L ratio will also increase by several orders of magnitude, yielding enormous transistor sizes with parasitic capacitances that prevent us from reaching our target transition frequency. The solution is to use a smaller $(g_m/I_D)$ ratio (stronger inversion), which gives an $I_D$ over W/L ratio several orders of magnitude bigger and thus W/L ratios of the same order as those of the low frequency case, though with less power efficiency.

The $(g_m/I_D)$ ratios chosen for this design were:

$$(g_m/I_D)_1 = 5 \qquad (g_m/I_D)_2 = 7$$

#### Final Design

Using Algorithm 2.2, the rest of the design was completed based on the same criteria used in section 2.4.4. The final result can be seen in Table (2.4).
| 50MHz Miller Amplifier | | |
|-------------|------------|--------------|
| Gain | Total | 63.8dB |
| | 1st stage | 28.0dB |
| | 2nd stage | 35.8dB |
| Freq. Resp. | $f_T$ | 46.1MHz |
| | PM | $58.9^\circ$ |
| Swing Lim. | OSwing | Up.: 591 mV |
| | | Low: -764 mV |
| | ICMR | Up.: -297 mV |
| | | Low: -599 mV |
| Offset | Mismatch | 6.36 mV |
| | Systematic | 0.10mV |
| Pwr. Cons. | 3.36mW (1.68mA) | |
| | $I_{D1}$ | $184\mu A$ |
| | $I_{D2}$ | 1.31mA |

| Capacitances | | |
|--------|-----------------|---------|
| Miller | $C_m$ | 2.93 pF |
| Paras. | $C_1$ | 1.53 pF |
| | $C_2 - C_{out}$ | 2.72 pF |
| | $C_3$ | 0.48 pF |

| Sizes | $W(\mu m)$ | $L(\mu m)$ |
|-----------|---------------|-----|
| M1a-b | 109 | 1 |
| M3a-b | 77.4 | 0.8 |
| M2 | 691 | 1 |
| M4 | 1431 | 3 |
| M5 | 401 | 3 |
| Tot. Area | $0.025mm^{2}$ | |

Table 2.4: Final design obtained for a 50MHz Miller amplifier using Algorithm 2.1

Figure 2.12: Design space exploration: Total consumption (in mA) of the 50MHz Miller Amplifier.

Figure 2.13: Design space exploration: Die area estimation (in $\mu m^2$) of the 50MHz Miller Amplifier.

Figure 2.14: Design space exploration: Total gain (in dB) of the 50MHz Miller Amplifier.

Figure 2.15: Frequency response of the 50MHz Miller Amplifier.

Here we see that the sizes, and consequently the parasitic capacitances, although bigger, are of the same order as in the previous case. Also, the smaller $(g_m/I_D)$ ratios and the shorter transistor lengths result in a smaller gain. Smaller swing ranges and a larger offset voltage are also found because of working in strong inversion. Regarding offset, mismatch offset is only a bit higher. Systematic offset is several orders bigger than in the previous design because of the much smaller length of the mirror transistors and the stronger inversion operation.
The criterion used still stands, as the drain-source voltage difference stays in the vicinity of only a few mV. Finally, GBW product and PM are, again, a bit lower than required due to higher order terms.

Figure 2.15 shows the same comparison made in section 2.4.4 between the MATLAB estimation of the frequency response and the SPICE simulation. Here, again, we found that the main difference appears at low frequencies, where we used a simple output conductance model to estimate the frequency response in MATLAB.

# 2.5 Conclusions

This chapter presented the reader with the ACM transistor model and its equations, which will be used throughout this work. The advantages of this model for automatic synthesis were also reviewed.

We then introduced the concept of design space exploration using the $(g_m/I_D)$ methodology [1]. This is a keystone in the further development of automatic synthesis algorithms. The design space exploration provides a graphical representation of the evolution of a desired characteristic (consumption, area, dc gain, noise, etc.) with the design variables, which in the case of a Miller amplifier, for example, are the $(g_m/I_D)$ ratios of the active transistors of each stage.

To show these ideas we presented two different designs of a two-stage Miller operational amplifier using a gain-bandwidth driven algorithm. With this algorithm we obtained design space exploration plots for consumption, area and dc gain in each design, which allowed us to design the amplifiers, in this case, for optimum power consumption. The $(g_m/I_D)$ methodology and the idea of design space exploration were also used to optimize non-critical parameters such as transistor lengths or current source $(g_m/I_D)$ ratios.

Excellent agreement between the calculated performance and the simulations validates the methodology and allows us to step further into more advanced automatic synthesis methodologies for more complex opamp architectures.
# Chapter 3 Low-Power OpAmp Cells: Reuse, Architecture and Synthesis

# 3.1 Introduction

In Chapter 2 we reviewed the basic ideas in automatic synthesis algorithms. Design space exploration using the $(g_m/I_D)$ methodology [1] proved to be an effective tool to automatically obtain minimum power designs.

In this Chapter, we will first explore the possibility of analog design reuse. Section 3.2 will review this idea from three points of view: circuit performance tuning, architectures and technology migration.

Then, in section 3.3, we will review the architecture selected for the amplifier to be designed in Chapter 4. This architecture complies with the reusability characteristics described in section 3.2.

Then, in section 3.4, we will present an example of an advanced design methodology. Particularly, we will present an algorithm that systematically moves from high level specifications of the amplifier (e.g. total settling time) towards a low level design that complies with the high level specifications with minimum power consumption [3]. This algorithm will be partially modified in Chapter 4 to be used in a hierarchical automatic synthesis algorithm for the design of the much more complex architecture presented in section 3.3.

# 3.2 Analog Design Reuse

The advent of deep submicron processes has enabled system-on-chip (SOC) design and enlarged the existing gap between design complexity and designer productivity. Time-to-market pressure makes this gap more challenging and, consequently, the reuse of circuit designs has become an essential tool. The urge for implementing reuse capabilities is particularly intense in the analog field [12], since automatic synthesis of analog circuits is a much harder problem than for their digital counterparts.

This problem is of particular interest in the Microelectronics Group of the Instituto de Ingeniería Eléctrica (IIE) from the Universidad de la República, Uruguay.
Being a small research group which has to handle several projects, the possibility of reusing analog cells in the designs we develop is a main concern, since we lack the designer time to implement every cell in each new project.

In this section we will study analog reuse applied to amplifiers from three points of view. First, we will review the possibility of reusing the same design in different applications by tuning its performance through the reference bias current. Second, we will see architectures for input and output stages that are suitable for use in different environmental conditions. Last, we will address the issue of automatic redesign in technology migration.

# 3.2.1 Circuit Performance Tuning Through Bias Current

One possible reuse scheme is to apply the reference bias current as an adjustment parameter to tune an existing circuit's performance to suit different applications and hence save design time. This approach is being addressed by our research group [9], but has also been applied in commercial applications [4]. However, as far as we know, all these previous applications of performance tuning in standard circuits have been made in bipolar technology.

Bias current can be used to tune the performance of an already fabricated circuit, with almost no loss of performance, based essentially on the exponential dependence of current on voltage in the weak and moderate inversion regions. This exponential relationship has two main consequences: the gate voltage has a very low dependence on current (equation (2.7)) and the $(g_m/I_D)$ versus $I_D$ curve is almost flat (Figure 2.2).

Using equations from section 2.4.1 we can see that a Miller amplifier operating between weak inversion and moderate inversion will have a speed versus consumption trade-off tuned by the amplifier's bias current while preserving acceptable operation in all other aspects (voltage swing ranges, offset, phase margin, ...).
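The flatness of the $(g_m/I_D)$ curve, on which this tuning argument rests, can be illustrated with a short numerical sketch. It assumes the usual ACM saturation expression $g_m/I_D = 2/\left(n\phi_t(1+\sqrt{1+i_f})\right)$ and the first-order relation $\omega_T = (g_m/I_D)_1 I_{D1}/C_m$. The slope factor `n = 1.3`, the thermal voltage and the operating point ($i_f = 1$ at 65nA) are placeholder values chosen for illustration, not data from this work.

```python
import math

PHI_T = 0.02585  # thermal voltage near 300 K, in volts (assumed)


def gm_over_id(i_f: float, n: float = 1.3) -> float:
    """(gm/ID) in 1/V for a normalized forward current i_f.

    Assumes the usual ACM saturation expression; n = 1.3 is a placeholder.
    """
    return 2.0 / (n * PHI_T * (1.0 + math.sqrt(1.0 + i_f)))


def f_t(i_d1: float, i_f: float, c_m: float, n: float = 1.3) -> float:
    """First-order transition frequency in Hz: (gm/ID)_1 * I_D1 / (2*pi*C_m)."""
    return gm_over_id(i_f, n) * i_d1 / (2.0 * math.pi * c_m)


if __name__ == "__main__":
    # Hypothetical operating point: i_f = 1 at I_D1 = 65 nA, with C_m = 2.51 pF.
    for scale in (0.01, 0.1, 1.0, 10.0):
        i_f = 1.0 * scale      # the inversion level scales with the bias current
        i_d1 = 65e-9 * scale
        print(f"I_D1 = {i_d1:.2e} A   gm/ID = {gm_over_id(i_f):5.1f} 1/V   "
              f"f_T = {f_t(i_d1, i_f, 2.51e-12) / 1e3:8.2f} kHz")
```

In this sketch the $(g_m/I_D)$ ratio changes by only a few percent per decade of current while $i_f \ll 1$, so $f_T$ scales almost linearly with the bias current. The ratio starts to drop as the transistor moves towards moderate inversion, and the tuning becomes sublinear there.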
#### Gain-Bandwidth Product

Recalling equation (2.27) we see that we can tune the amplifier's gain-bandwidth product by adjusting the differential pair's transconductance. In the weak and near-weak inversion regions, since the $(g_m/I_D)$ ratio is almost constant, the dependence of the transconductance on bias current is linear. Hence we have a linear dependence between gain-bandwidth product and bias current, as shown in equation (3.1).

$$\omega_T = \frac{(g_m/I_D)_1}{C_m} I_{D1} \xrightarrow{(g_m/I_D) \simeq \text{cst.}} \omega_T \propto I_{D1} \tag{3.1}$$

It is worth noticing, though, that equation (2.27) is a first order approximation. Therefore it is valid only as long as the non-dominant pole and the zero, described by the NDP and Z ratios, remain far enough above $\omega_T$. We will study this below, in relation to the stability margins of the amplifier.

Figure 3.1 shows an example of tuning on the Miller amplifier designed in section 2.4.4. Here we see how we can tune the gain-bandwidth product over several decades with a linear relationship with bias current and hence with total consumption.

Figure 3.1: Gain-Bandwidth product tuning of the Miller amplifier from section 2.4.4

We have found a way, then, to tune the performance (speed vs. consumption trade-off) of an existing design of our amplifier to suit different applications. We must now ensure that the amplifier preserves acceptable operation in other aspects.

#### Phase Margin

One main concern when we state that we can tune the gain-bandwidth product of our amplifier over several decades is that the amplifier not only remains stable, but keeps the stability margins of the original design. Recalling equations (2.28)-(2.30) we see that the phase margin depends on the relationship between both stages' transconductances and the amplifier's parasitic capacitances.
$$NDP = \frac{gm_2}{gm_1} \frac{C_m^2}{C_1C_2 + C_m(C_1 + C_2)} \tag{2.28}$$

$$Z = \frac{gm_2}{gm_1} \tag{2.29}$$

$$PM = 90^\circ - \arctan\left(\frac{1}{NDP}\right) - \arctan\left(\frac{1}{Z}\right) \tag{2.30}$$

It can be easily seen that the ratio between $gm_1$ and $gm_2$ depends on the ratio between both stages' $(g_m/I_D)$ ratios, both considered to be almost constant, and on the ratio between both stages' bias currents, which is determined by geometry and, hence, constant. Then the ratio between $gm_1$ and $gm_2$ is also constant. Regarding parasitic capacitances, they can usually be neglected when compared with the Miller and load capacitances. In any case, in weak inversion the parasitic capacitances are usually determined by the extrinsic, geometry dependent capacitances, which do not depend on bias current, since the intrinsic capacitances in weak inversion, except $C_G$, tend to zero. Regarding the gate capacitance, we will see below that it can be considered constant when working in weak and near-weak inversion.

In conclusion, both the NDP and Z ratios, and thus the phase margin, remain almost constant when tuning the gain-bandwidth product of the amplifier. In the Miller amplifier designed in section 2.4.4, the phase margin varies less than 1.8% as the GBW is tuned over 4 decades.

#### Voltage Swing Ranges

Both ICMR (Input Common Mode Range) and Output Swing are determined by gate-source voltages and drain-source saturation voltages, as shown in section 2.4.1. It has already been shown that both these voltages have a low dependence on bias current when operating in weak and moderate inversion. For instance, in equation (2.8) and Figure 2.1 we see how the drain-source saturation voltage tends to a constant minimum value when entering weak inversion operation.

#### Offset

In section 2.4.1 the Miller amplifier's input offset voltage was studied.
Recalling equation (2.41) and the preceding equations, we found that both systematic and random offset can be written as functions of the $(g_m/I_D)$ ratios of the input differential pair and mirror. Since the input stage mirror was designed to have the same normalized current ratio $(\frac{I_D}{W/L})$ as the output stage active transistor, it will also have the same $(g_m/I_D)$ ratio over the whole tuning range. Particularly, we can rewrite the total input offset voltage as

$$V_{off} = \sqrt{A + \frac{B}{(g_m/I_D)_p^2} + C\frac{(g_m/I_D)_m^2}{(g_m/I_D)_p^2}} \tag{3.2}$$

where A, B and C are all current independent parameters and subscript m (p) refers to the mirror (differential pair). It can be easily seen, then, that the offset voltage tends to a constant value as we enter weak inversion, where both $(g_m/I_D)$ ratios are constant.

#### **Intrinsic Capacitances**

It can be seen from equation (2.12) that the total gate capacitance in a forward saturated transistor tends to $C_G = \frac{n-1}{n}C_{ox}$ when $i_f \ll 1$, i.e. operating in weak inversion. Reference [10] shows that all other eight intrinsic capacitances tend to zero when operating in the weak inversion region, as we mentioned above. Thus, since the slope factor n is constant and $C_{ox}$ is geometry dependent, the intrinsic capacitances are either negligible or constant in weak inversion.

#### Noise

Recalling equations (2.17) and (2.18)

$$f_c \simeq \frac{1}{2} \frac{g_m}{WLC'_{ox}} \frac{N_{ot}}{N^*} \tag{2.17}$$

$$S_i = 2nk_B T g_m \left(1 + \frac{f_c}{f}\right) \tag{2.18}$$

we see that the total noise current through a MOS transistor depends on its transconductance and the gate area. It can be easily seen that the total voltage noise at the gate of the transistor for a defined bandwidth between two frequencies $f_1 < f < f_2$ has the following expression [15].
$$\overline{v}_n^2 = \frac{2nk_BT(f_2 - f_1)}{gm} + \frac{\frac{nk_BTN_{ot}}{C'_{ox}N^*}\ln\left(\frac{f_2}{f_1}\right)}{WL} \tag{3.3}$$

where the first term is related to white noise and the second one to flicker noise.

In the case of a Miller amplifier, the equivalent input noise will depend mostly on the input stage transistors. The expression will have the following form.

$$\overline{v}_n^2 = \frac{A}{gm_p} + B\frac{gm_m}{gm_p^2} + \frac{C}{(WL)_p} + \frac{D}{(WL)_m}\frac{gm_m^2}{gm_p^2} \tag{3.4}$$

where, as in equation (3.2), A, B, C and D are current independent parameters and subscript m (p) refers to the mirror (differential pair).

It can be seen from equation (3.4) that the total equivalent input noise will increase as we decrease the reference current. Nevertheless, it can be seen from the noise estimation in Table (3.1) that if we consider a constant signal level (since the ICMR is constant) the signal-to-noise ratio will decrease only 20dB over 4 full decades of tuning.

#### Slew Rate

As we will see in section 3.4.1, the slew rate originates from the charging of a capacitive node with a limited, constant current. This node can be either an internal node (the first stage output in a Miller amplifier) or the output node, as in the case of a class A output stage. It can be easily shown, then, that the slew rate of the amplifier will be tuned with the reference current, since it determines the maximum charging current of both stages' output nodes in a Miller amplifier, while both capacitances remain constant.

#### **Tuning Example**

Table (3.1) shows the tuning of the Miller amplifier from section 2.4.4. It is verified that the design preserves acceptable operation while being tuned over more than 3 decades. Noise, as we showed above, is the only parameter that might become a concern if we go into deep weak inversion operation.

| $I_{REF}(nA)$ | 0.65 | 6.5 | 65 | 650 |
|--------------------------------|-------|-------|-------|-------|
| $I_{DD}(nA)^1$ | 8.50 | 84.6 | 845 | 8270 |
| $f_T(kHz)^1$ | 1.22 | 11.2 | 95.7 | 615 |
| $PM(^\circ)^1$ | 58.2 | 58.4 | 58.4 | 58.8 |
| Max. ICM $(V)^1$ | 0.36 | 0.24 | 0.02 | -0.55 |
| Min. ICM $(V)^1$ | -0.94 | -0.93 | -0.92 | -0.89 |
| Max. OSW $(V)^1$ | 0.72 | 0.71 | 0.62 | 0.29 |
| Min. OSW $(V)^1$ | -0.86 | -0.86 | -0.86 | -0.84 |
| Offset $(mV)^2$ | 5.17 | 5.17 | 5.17 | 5.18 |
| Eq. Input Noise $V_n@100Hz(\mu V/\sqrt{Hz})^1$ | 1.81 | 0.68 | 0.30 | 0.18 |

$^1$: SPICE simulation. $^2$: MATLAB calculation.

Table 3.1: Tuning of the Miller amplifier introduced on section 2.4.4

Another example of tuning an amplifier can be found in reference [9], where a micropower Miller OTA, from an industrial application, was successfully tuned over 3 decades of gain-bandwidth product.

From what we have seen in this section we can conclude that if we develop our design to operate in weak inversion, we will have the possibility of reusing that design over several decades of gain-bandwidth product and consumption while keeping the design performance in almost every other aspect. Therefore we will achieve a key result of this work, that is, a tunable operational amplifier cell.

# 3.2.2 Reusable Circuit Architectures

Addressing the issue of reusability, the design of an opamp cell capable of operating in different environmental conditions is a key objective. This cell must accept a wide range of input signals and be able to drive different impedance loads, in order to meet the demands of each application.

Current low power and low supply voltage requirements on CMOS analog circuits have forced designers to reconsider input and output stages in order to maintain dynamic ranges and load driving capabilities. Rail-to-rail input stages and new class AB output stages are the most commonly proposed solutions to overcome these problems.

#### Rail-to-Rail Input Stages

Rail-to-rail input stages have been addressed heavily in the literature [5,38–46] in order to cope with the loss of input dynamic range caused by the ongoing reduction of supply voltage. Three desirable characteristics can be defined in rail-to-rail input stages:

**Constant-gm** Most of the rail-to-rail architectures are based on the first and simplest rail-to-rail input stage, proposed by Huijsing et al. [38], which consisted of standard n-channel and p-channel differential pairs driven in parallel. The main problem in these architectures is that close to the rails only one input pair is active, and so the effective transconductance is halved if no other measures are taken. This behavior degrades the transient response and leads to non-optimal frequency compensation, as will be shown in section 3.3.1. It is therefore desirable to achieve constant gm over the entire input common mode range.

**Universal** At first the proposed architectures considered only the "square law" characteristic of the MOS transistor (strong inversion) [39,40,42]. More recently, some of them even consider only weak inversion operation [5,44]. Although good solutions were achieved in some cases, they all have a strong dependence on the operating point, and the design space would be greatly reduced if we were to limit ourselves to only strong or weak inversion operation, missing completely the whole idea of using a complete, continuous model in the design space exploration algorithm. Thus, the "universal" characteristic in constant-gm rail-to-rail input architectures is defined as the ability to provide constant gm in all regions of operation. Several of these "universal" architectures have been proposed recently [41, 43, 45, 46].

**Robust** Another common drawback of rail-to-rail input stages is that the accuracy of most architectures relies on some condition for matching n- and p-channel input transistors, i.e.
on scaling the geometries to compensate for the different mobility of electrons and holes [41–45]. Robust architectures do not rely on this matching. Therefore, they make the circuit independent of process variations, which are hard for designers to anticipate, and also allow it to be easily migrated to another technology.

We will use, then, a rail-to-rail input stage which complies with these three characteristics. In this way we will be able to achieve optimum frequency compensation in applications independently of their input common mode requirements, considering the whole design space and without depending upon variations in process characteristics.

Other characteristics could and should be considered when designing a rail-to-rail input stage architecture, since it is also important to achieve, for example, constant large signal behavior (constant SR vs ICM), constant and good CMRR along the ICMR, etc.

Another issue that should be taken into consideration is that the constant-gm scheme should work at all the frequencies of operation of our circuit. This, as we will see in section 4.4, became a problem in the architecture selected for the opamp designed in this work.

Figure 3.2: General characteristic of the class AB stage.

#### **Output Stages**

Output stages must also allow the amplifier to change between several applications without loss of performance. Our output stage is intended to drive different load capacitances in low power, low supply voltage environments.

Output stages are based on two blocks: one that sources current from the positive supply to the load and another that sinks current from the load to ground or the negative supply (Figure 3.2(a)). Output stages of opamps are usually classified in 3 classes: A, B and AB. The principle of operation of each of them can be found in any basic electronics reference book and will not be addressed here.
However we will analyze some of the advantages of class AB stages regarding power optimization. A graphical way to describe the current characteristics of the class AB stages is illustrated by the plot on Figure 3.2(b) [3,47,48]. Here we see that for zero output current $(I_{out})$ we have a non-zero current $(I_Q)$ flowing through the output devices. As the magnitude of $I_{out}$ increases, one of the output devices delivers the output current while the other tends to cut-off. However, an additional improvement is often added to class AB stages. If one of the output devices is completely turned off when the other is supplying the output current, when that device that is off has to conduct again, there is a delay. This delay is associated with the action of charging capacitances at the output device or its driver and leads to increased distortion and ringing in the transient response. Therefore it is desirable to always assure a minimum current through the output devices as shown in Figure 3.2(b). Class AB stages contribute to minimize power in several ways: on one hand by decoupling the large signal (i.e. slew rate and current through the load resistor), and small signal (i.e. stability) requirements on the output stage; on the other hand by reducing the quiescent current consumption [3]. The quiescent current must assure stability for a given load capacitance. The maximum current that the stage can deliver must be enough for supplying the current through the load at maximum amplitude and assuring the desired slew rate. Traditional class AB stages used to apply the common drain configuration (or common collector in bipolar technology) for the output devices. This is not acceptable in low supply voltage environments since the output swing is reduced by one gate-source (base-emitter) voltage at each supply rail. Thus, low voltage stages apply common source output devices. 
This has the additional benefit of being able to provide voltage gain, which is a desirable characteristic since we will apply the class AB stage to replace a class A output stage.

In conclusion, rail-to-rail input stages and class AB output stages provide power efficient architectures to face a wide range of applications and environmental conditions. In sections 3.3.1 and 3.3.2 we will look at the selected architectures for each stage and how they comply with the characteristics explained in this section.

# 3.2.3 Technology Migration

One final aspect of circuit design reuse is that of technology migration, when a system block changes fabrication process, for example to exploit the benefits of scaled technologies for digital circuits. This problem has started to be addressed in the academic world, where the issue of developing resizing rules has been faced [9, 49, 50].

Reference [49] introduces a set of redesign rules considering the case of a MOSFET in common source configuration. These rules are built to preserve the stage gain-bandwidth product and the signal-to-noise ratio (SNR). Two scaling strategies are analyzed in [49]: channel length scaling and constant inversion level scaling. The most convenient proves to be the first one, particularly from the point of view of taking advantage of technology scaling to reduce power consumption. Rules for scaling each parameter in the design are presented in [49], as a function of the scaling factors defined in equations (3.5)-(3.9)

$$L_{min2} = \frac{L_{min1}}{K_L} \tag{3.5}$$

$$V_{DD2} = \frac{V_{DD1}}{K_V} \tag{3.6}$$

$$C_{ox2} = C_{ox1}K_{cox} \tag{3.7}$$

$$\mu_2 = \frac{\mu_1}{K_\mu} \tag{3.8}$$

$$V_{E2} = V_{E1} K_E \tag{3.9}$$

where subscripts 1 and 2 refer to the initial and target technology, $V_{DD}$ is the supply voltage, $L_{min}$ is the minimum transistor length of the technology and $V_E$ is the Early voltage that defines the small signal drain conductance.
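The scaling factors of equations (3.5)-(3.9) are simple parameter ratios between the two technologies, and a minimal sketch of how they could be extracted is given below. The `Technology` container, its field names and all numeric values are hypothetical placeholders introduced here for illustration. The redesign rules of [49] that consume these factors are not reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Technology:
    """Hypothetical container for the parameters used in (3.5)-(3.9)."""
    l_min: float  # minimum channel length, m
    v_dd: float   # supply voltage, V
    c_ox: float   # gate oxide capacitance per unit area, F/m^2
    mu: float     # carrier mobility, m^2/(V*s)
    v_e: float    # Early voltage per unit length, V/m


def scaling_factors(t1: Technology, t2: Technology) -> dict[str, float]:
    """Scaling factors from the initial technology t1 to the target t2."""
    return {
        "K_L": t1.l_min / t2.l_min,   # (3.5): L_min2 = L_min1 / K_L
        "K_V": t1.v_dd / t2.v_dd,     # (3.6): V_DD2 = V_DD1 / K_V
        "K_cox": t2.c_ox / t1.c_ox,   # (3.7): C_ox2 = C_ox1 * K_cox
        "K_mu": t1.mu / t2.mu,        # (3.8): mu_2 = mu_1 / K_mu
        "K_E": t2.v_e / t1.v_e,       # (3.9): V_E2 = V_E1 * K_E
    }


if __name__ == "__main__":
    # Placeholder numbers, not taken from any real process.
    old = Technology(l_min=0.8e-6, v_dd=5.0, c_ox=2.0e-3, mu=0.045, v_e=5.0e6)
    new = Technology(l_min=0.35e-6, v_dd=3.3, c_ox=4.5e-3, mu=0.040, v_e=6.0e6)
    for name, value in scaling_factors(old, new).items():
        print(f"{name} = {value:.3f}")
```

With these definitions, factors above 1 correspond to a shorter minimum length, a lower supply voltage, a larger oxide capacitance, a lower mobility and a larger Early voltage in the target technology.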
Reference [9] analyzes the impact of the redesign method on two additional performance aspects (slew rate and current mirror frequency response). The results show that the proposed method decreases the slew rate and hence this is an aspect to look after in the resulting design. Regarding the current mirror frequency response, the analysis showed that if the current mirror transistors were originally designed to work in stronger inversion than the differential pair transistors, the current mirror pole frequency increases, preserving a good overall frequency response.

Figure 3.3: Comparison between the open loop frequency response of original and scaled design of the Miller micropower OTA from [9]

In reference [9] the redesign rules are applied in the scaling of a micropower Miller OTA from an industrial application. Experimental results support the validity of this redesign technique. The very good performance of the scaling method can be appreciated in Figure 3.3.

# 3.3 Opamp Architecture

In this section we will present the architecture of the operational amplifier to be designed in Chapter 4. The scheme used is basically a two stage, Miller compensated, amplifier. The input stage is implemented with a constant-gm rail-to-rail architecture proposed by Duque-Carrillo *et al.* [46]. The output stage is implemented with a micropower class AB scheme that exploits a transconductance multiplication effect to provide a very power efficient stage [3,49].

#### 3.3.1 Constant gm Rail-to-Rail Input Stages

Rail-to-rail input stages were briefly introduced in section 3.2.2. There we defined three important characteristics towards reusability and power optimization. As we said in that section, the first and simplest architectures consisted of an n-channel and a p-channel differential pair driven in parallel, as shown in Figure 3.4.

Figure 3.4: Basic rail-to-rail differential pair architecture.

There are basically three main regions of operation in this scheme. When the input common mode voltage $(V_{CM})$ is close to the negative rail $(V_{SS})$ only the p-channel differential pair is on, since the current sink in the n-channel differential pair $(I_n)$ is out of the constant current saturation region. As we approach mid rail, the n-channel differential pair current sink enters the saturation region and we have both differential pairs working. Finally, when $V_{CM}$ approaches the positive rail $(V_{DD})$, the p-channel current source $(I_p)$ comes out of the saturation region and only the n-channel differential pair remains on. This behavior results in non-uniform small signal parameters. In particular, the total transconductance $(gm_T)$ might vary by more than 100% along the input common mode range (ICMR), as can be seen in Figure 3.5.

#### Non-optimum Frequency Compensation and Power Consumption

We will see how this prevents us from achieving an optimum frequency compensation and, therefore, an optimum power consumption. Let us take the Miller amplifier from section 2.4.4 as an example. The input stage transconductance of this amplifier is $gm_1 = 1.56\,\mu A/V$. If we suppose that this amplifier was implemented with the input stage from Figure 3.4 and that $gm_n = gm_p$, two possibilities arise: either we implement $gm_1$ with the total transconductance $(gm_1 = gm_T)$ or with the transconductance of each pair $(gm_1 = gm_n = gm_p)$. In either case we end up with non-optimum solutions.

If we design our input stage to have $gm_1 = gm_n = gm_p$, when $V_{CM}$ is in mid rail the frequency of the gain-bandwidth product $(\omega_T)$ doubles (equation (2.27)) and, since the non-dominant pole is independent of $gm_1$ (equation (2.25)), the phase margin is greatly reduced to only $36.4^\circ$ (both NDP and Z ratios are halved)<sup>5</sup>. Since this is unacceptable, we should design our input stage to have $gm_T = gm_1$. Then, when $V_{CM}$ approaches either rail the gain-bandwidth product is halved.
<sup>&</sup>lt;sup>5</sup>This is a first order approximation. The actual $\omega_T$ will be less than double and thus, the phase margin might be a bit higher. Figure 3.5: Transconductance as a function of the input common mode voltage, using architecture from Figure 3.4. The phase margin, in this case, remains more than adequate. The problem is that we are still powering the second stage to keep the non-dominant pole frequency $(\omega_{NDP})$ far from the mid-rail $\omega_T$ . Therefore when operating close to the rails, the amplifier is consuming up to two times as much as necessary. This is so, because as we saw in section 2.4.4, consumption in a Miller amplifier is dominated by the second stage bias current. Then it is not possible to have optimum frequency compensation and power consumption if we do not achieve constant transconductance over the entire input common mode range. It is interesting to estimate, then, how much current we can invest in having a constant transconductance. Basically, according to what we have just seen, we could halve second stage bias current, when the gain-bandwidth product is halved. But it is more easy to see the problem from the other side. That is, that we could invest as much as an additional second stage bias current, needed to preserve frequency compensation in mid-rail, in keeping the input stage transconductance constant. Since, the second stage bias current can be expressed in terms of the first stage bias current as $$I_{D2} = \frac{gm_2}{gm_1} \frac{(g_m/I_D)_1}{(g_m/I_D)_2} I_{D1}$$ (3.10) in the case of the Miller amplifier designed in section 2.4.4, we could spend up to 10.1 times the first stage bias current. Next, when we present the rail-to-rail input stage used in this work, we will see how many bias current copies does the constant transconductance auxiliary circuit need, in order to evaluate on the total design if Figure 3.6: Schematic view of the constant gm operation principle. it was worth it. 
It is also important to notice that the time domain response is closely related to the frequency response, and thus its performance will also suffer greatly from the non-uniform characteristic of the transconductance. It might even make sense, then, to spend a little more than the ratio given in equation (3.10), to avoid all the negative effects of having a non-constant input stage transconductance.

# Universal and Robust Rail-to-Rail Architecture with Constant-gm

For all the reasons given above, we selected a rail-to-rail architecture that complies with the three characteristics listed in section 3.2.2. Duque-Carrillo *et al.* [46] proposed a solution that meets the three of them in a very efficient way. The key to obtaining a constant gm input stage lies in obtaining the correct tail current for each differential pair. The technique of Duque-Carrillo *et al.* is based on using a negative feedback loop to impose that:

$$gm_{REF} = gm_P + gm_N \tag{3.11}$$

where $gm_{REF}$ is independent of the input common mode and $gm_P$ and $gm_N$ are the transconductances of the input differential pairs.

The negative feedback loop principle is illustrated in Figure 3.6. Here the differential pairs $T_{P,ref}$ and $T_P$ are identical to the p-channel input stage differential pair, while $T_N$ is identical to the n-channel one. All of them are unbalanced by a DC voltage V, small enough to ensure operation in the linear region. The $T_P$ differential pair is biased by a replica of the p-channel input stage differential pair tail current $(I_{BP})$, so it depends on the input common mode, but $T_{P,ref}$ is biased by a replica of the nominal p-channel tail current $(I_B)$ and therefore its gm is independent of the input common mode level. With the polarities shown in Figure 3.6 the differential output currents are added and the voltage of the summing node controls the $T_N$ tail current $(I_{BN})$, as indicated by the dashed line. Doing so, the sum $i_{REF} + i_P + i_N$ always equals zero and, using the small signal model for the differential pairs $(i = gm \times v_i)$, we can easily obtain equation (3.11). Using a replica of $I_{BN}$ to bias the actual n-channel differential pair of the input stage, we will always have a common mode independent net transconductance in the input stage equal to $gm_{REF}$. Since the $T_{P,ref}$ transconductance is independent of any matching conditions between n- and p-channel input transistors and their operating regions, the proposed architecture is both robust and universal.

Figure 3.7: Implementation of the constant gm technique proposed in [46].

Figure 3.7 shows the circuit implementation of the constant gm technique. The three differential pairs $T_{P,ref}$, $T_P$ and $T_N$ are shown. A folded cascode implements the summing circuit. This avoids deviations caused by the mismatch of the current mirrors that would otherwise be required to steer the currents to a summing node. Also, on the left of the circuit, there is a monitor circuit formed by a fourth differential pair whose transistors are identical to the p-channel input ones, with its drains and sources short-circuited. The gates are connected to the amplifier input signals, thus sensing the input common mode and therefore always supplying $T_P$ with the same common mode dependent tail current that biases the p-channel input differential pair.

Figure 3.8 shows the total simulated transconductance of this circuit. In the figure we can appreciate that differential pair $T_N$ does not activate until it is needed and its transconductance accurately compensates for the loss of transconductance in pair $T_P$. The transconductance of pair $T_{P,ref}$ ($gm_{REF}$) is also shown in the figure. We can notice a small difference between $gm_T$ and $gm_{REF}$, but it is always below 1.6%.
Finally, the variations on $gm_T$ are less than 0.9% in the whole input common mode range.

It can be seen then, that since only two of the three differential pairs are always fully active at any moment, this constant gm circuit draws $10I_{D1}$ from the supply. This is about the limit we saw for the case of the Miller amplifier from section 2.4.4. But we will see later, when we study the limit in the case of the complete amplifier, that in that case this is well below the limit for the consumption of the constant gm circuit.

Figure 3.8: Transconductance as a function of the input common mode voltage using constant gm technique.

# 3.3.2 Low-Power Class AB Output Stage

Class AB output stages were introduced in section 3.2.2. Here we will show the architecture applied in this work. Figure 3.9 shows the structure of this architecture as proposed by Silveira *et al.* [3,24]. This architecture has been used earlier [51,52]; nevertheless, none of these previous works exploits the principle of operation proposed in [3,24].

This approach provides a significant reduction in power consumption with respect to traditional class AB and class A structures by three means. First, the output stage transconductance is boosted through the current mirror gains, resulting in an important improvement of its transconductance to current ratio. Second, it can be shown that this increase in the output stage transconductance results in a reduction of the value of the Miller compensating capacitor that gives minimum consumption for a complete amplifier. This reduction in the compensation capacitor, besides saving area, makes it possible to reduce the first stage consumption and to operate the first stage transistor closer to weak inversion. This provides additional benefits in terms of increased input common mode range and reduced offset voltage.
Third, the architecture provides a low impedance path from the input of the driver stage to the output transistors, avoiding compensating capacitors internal to the output stage [3].

Figure 3.9: Class AB output stage proposed by [3].

The disadvantages with respect to more complex class AB architectures are two. First, no mechanism is provided to assure that both output branches will always remain in conduction. This, as mentioned in section 3.2.2, leads to increased ringing in the time response for high amplitude steps and increased distortion, due to the increased delay associated with the branch that cuts off. Second, the ratio of the maximum output current to the quiescent current is fixed by the current mirror gains. As we will see below, the maximum allowable value of these gains is limited due to its effect on stability. Thus, the ratio of maximum output current to quiescent current is also limited. In spite of this, a significant reduction in consumption with respect to a class A case is achieved [3].

Distortion is an additional possible concern. The transconductance multiplication effect is based on an asymmetric behavior of the p and n sections of the output stage, hence leading to a non-linear behavior of the output stage. However, as shown in [3], the high gain achieved by this stage makes it possible to reach very reasonable distortion figures.

We will now sum up the main characteristics of this architecture as presented in [3]. The quiescent current is fixed with respect to the $I_{AB}$ current source from Figure 3.9. In quiescent conditions the output current is zero and the output branch quiescent current $I_Q$ must be such that the sum of the scaled versions of $I_Q$ at Me and Mf is equal to $I_{AB}$. This condition yields

$$I_Q = \frac{km}{1 + \frac{km}{h}} I_{AB} \tag{3.12}$$

where h, k and m are the gain factors of the current mirrors shown in Figure 3.9.
The total quiescent transconductance of this stage under class AB operation, defined as the ratio between the total signal output current $i_o$ and the input signal voltage $v_i$, is given by:

$$gm_{AB} = gm_a \left(1 + \frac{km}{h}\right) D(s) \tag{3.13}$$

where $gm_a$ is the transconductance of the output transistor Ma and D(s) represents the contribution of the current mirrors to the frequency response. The factor $1 + \frac{km}{h}$ that multiplies the transconductance is denoted $gm_{mult}$, and although D(s) might introduce high frequency doublets, the circuit can be properly stabilized even for factors as high as 25.

It is also interesting to consider the effect on the transconductance to consumed current ratio $(g_m/I_D)$ of the stage:

$$(g_m/I_D) = (g_m/I_D)_a \frac{1 + \frac{km}{h}}{1 + \frac{1}{k} + \frac{1}{km} + \frac{1}{h}} D(s) \tag{3.14}$$

In this case, multiplication factors as high as 12 can be achieved. This is like having an equivalent transistor 12 times more efficient than the original one.

The factor D(s) is given by equation (3.15) [3], where $\omega_e$ ( $\omega_c$ ) is the angular frequency of the pole of the current mirror Mb - Mc (Md - Me).

$$D(s) = \frac{1 + \left(\frac{1}{\omega_e} + \frac{1}{\omega_c}\right) \frac{s}{gm_{mult}} + \frac{1}{\omega_e \omega_c} \frac{s^2}{gm_{mult}}}{\left(1 + \frac{s}{\omega_e}\right) \left(1 + \frac{s}{\omega_c}\right)} \tag{3.15}$$

Both angular frequencies are given by equations (3.16) and (3.17).

$$\omega_e = \frac{gm_e}{C_e} \tag{3.16}$$

$$\omega_c = \frac{gm_c}{C_c} \tag{3.17}$$

where $gm_e$ $(gm_c)$ is the transconductance of the Me (Mc) transistor and $C_e$ $(C_c)$ is the total capacitance at the Me (Mc) gate node. The doublet introduces an important phase shift near the $\omega_e$ and $\omega_c$ frequencies. This effect determines the maximum acceptable values for the $gm_{mult}$ factor.

Let us summarize the factors that determine the achievable total consumption reduction.
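As a numerical aside, equations (3.12) to (3.14) can be evaluated at low frequency, where $D(s) \to 1$. The sketch below is not taken from [3]: the function name and the mirror gains are hypothetical, and we assume that the numerator of (3.14) is the same multiplication factor $gm_{mult} = 1 + km/h$ that appears in (3.13).

```python
def class_ab_quiescent_figures(h: float, k: float, m: float, i_ab: float):
    """Low-frequency (D(s) = 1) figures of the class AB stage.

    h, k, m : current mirror gain factors (Figure 3.9)
    i_ab    : I_AB bias current source [A]
    Returns (I_Q, gm_mult, gm/ID improvement factor).
    """
    gm_mult = 1.0 + k * m / h                     # factor in equation (3.13)
    i_q = k * m / gm_mult * i_ab                  # equation (3.12)
    # Number of copies of the output branch current drawn from the supply:
    # the output branch itself plus the mirror branches (denominator of 3.14).
    current_copies = 1.0 + 1.0 / k + 1.0 / (k * m) + 1.0 / h
    gm_id_gain = gm_mult / current_copies         # equation (3.14) with D = 1
    return i_q, gm_mult, gm_id_gain


# Hypothetical mirror gains giving km/h = 24, i.e. gm_mult = 25.
i_q, gm_mult, gm_id_gain = class_ab_quiescent_figures(h=1.0, k=4.0, m=6.0,
                                                      i_ab=1e-6)
```

For these hypothetical gains the transconductance is multiplied by 25 while the efficiency improves by a factor close to 11, which is in line with the orders of magnitude quoted in the text.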
Consider that we apply this output stage as the second stage of a Miller amplifier, in which the Miller capacitance is connected between the input and output nodes of the circuit in Figure 3.9. Since the non-dominant pole of this amplifier is proportional to the second stage transconductance (equation (2.25)), the second stage current will decrease according to the increase in the $(g_m/I_D)$ ratio, when compared to the class A output stage used in the amplifier from section 2.4.1. The improvement in the $(g_m/I_D)$ ratio is not completely translated into a reduction of the current. The reason is that the non-dominant pole must be increased, with respect to the class A case, to have the same phase margin while allowing for the phase shift introduced by the doublets. Taking these factors into account, reductions of quiescent current with respect to the class A case by a factor of 3 to 4 are reported in [3] to be achievable. In this work we obtained a factor of 4, even though our non-dominant pole has to be increased even more due to the presence of non-dominant poles in the input stage folded cascode (Figure 3.10).

Figure 3.10: Amplifier circuit implementation, omitting constant-gm circuit.

The maximum output source current is given by $kmI_{AB}$, while the maximum output current the stage is capable of sinking is limited by the size of Ma and the maximum voltage at the input of the stage.

This output stage has already been successfully used in a very low power consumption (100nA@2.0V) pacemaker sensing channel application [23].

#### 3.3.3 Opamp Complete Architecture

These two stages are used in a Miller amplifier architecture. The rail-to-rail input stage is used instead of the single differential pair input stage used in the Miller amplifier from section 2.4.1.
In order to provide a single high impedance node where the output stage and the compensating network is connected, we sum the output currents of the complementary input differential pairs using the same folded cascode summing circuit used in the constant-gm technique (Figure 3.7). Besides, this stage will provide additional gain to the amplifier. The R-C network eliminates the right half plane zero of the Miller amplifier, making it possible to reduce the overall consumption by further decreasing the requirements on the second stage transconductance. The RC compensation introduces an additional non-dominant pole, but it can be shown to lie at much higher frequencies than the first non-dominant pole, associated to the load capacitance. The complete amplifier schematic, omitting the constant-gm circuit shown in Figure 3.7, is shown in Figure 3.10. # 3.4 Advanced Design Methodologies When developing design methodologies towards a given objective several factors and aspects of the amplifier must be taken into account. All this data can be hierarchically represented in three levels (high, medium and low), where data in each lower level combine to determine the next level performance. In the case of power consumption, "high level" data can be represented by the static and dynamic precision and the total noise of the amplifier. These characteristics are determined by the "medium level" data that is represented, in turn, by transition frequency, phase margin, slew rate, thermal and flicker noise, DC gain and so on. All of which are determined by the "low level" data: transistor's sizes, currents and capacitances [3]. In the conventional design practice, the step that goes from the high level performance data to the medium level performance data that guides amplifier design, has been rather fuzzy. 
Silveira [3,22] presents a new approach to move systematically from the high level total settling time specification to a low level design that complies with these specifications with optimum power consumption. This approach is based on the $(g_m/I_D)$ methodology [1], which allows a systematic exploration of the design space to implement the step that goes from the medium level op amp specifications to the low level design data, as in the case of Algorithm 2.2 presented in Chapter 2.

This new approach was implemented in a power optimization algorithm for a given total settling time. Though it can easily be applied to other amplifier architectures, it was developed for a Miller RC compensated amplifier. This algorithm will be presented in section 3.4.1 as it was presented in [3,22]. Then, in Chapter 4, we will further develop this algorithm into a more general, hierarchical design methodology that will allow us to automatically synthesize the amplifier presented in section 3.3 with optimum power consumption for a given total settling time.

#### 3.4.1 Power Optimization for a Given Total Settling Time

The total settling time is defined as the time the response to an input step takes to settle to within a given relative error (e.g. 1%) of its final value. The settling behavior is an essential specification in most op amp applications, as it is a direct measure of the ability of the amplifier to respond to large input signals. Two distinct periods determine the settling time: the slewing period and the linear settling period. During the first period, the variation rate of the output is limited to a maximum value (slew rate). This originates from the charging of a capacitive node with a limited, constant current. This node can be either an internal node or the output node. We will refer to the first case as internal slew rate and to the second one as external slew rate.
In the linear settling period, the amplifier behaves according to its small signal frequency response, and thus this period is related to the amplifier transition frequency and phase margin.

A given total settling time can be achieved with different distributions between the linear settling and the slewing part. Which distribution is best is still a subject of study, and Silveira [3] shows that the selected partition strongly influences power consumption.

To find the optimum distribution, Silveira [3,22] developed a settling behavior model to be applied in a power optimization algorithm for a given total settling time, also developed in his work. In this section, we will briefly introduce the model's main expressions and then we will review the design of a simple RC compensated Miller amplifier using this algorithm as presented in [3].

Figure 3.11: Settling time model and step response plot. (a) Model for total settling time analysis. (b) Output voltage vs. time plot.

#### 3.4.2 Settling Behavior Model

The objective here is to obtain an expression of the total settling time suitable to be applied in an analytical synthesis procedure and in qualitative hand analysis. A first order model of the amplifier was applied in order to determine the basic expression of total settling time. The second order frequency response will alter both the linear settling and the slewing periods. This effect, which in the case of linear settling is minor as long as the phase margin is above $60^{\circ}$ [37], will be addressed later. The model obtained considers the effect of both the internal and external slew rates and has a reasonable accuracy, which allows us to take design decisions, while it is independent of the amplifier architecture.

The model, shown in Figure 3.11(a), considers the amplifier in a closed loop with a real feedback factor $\beta$.
The amplifier is considered to have a first order transfer function with open loop DC gain $A_0$ and transition frequency $\omega_T$. When the amplifier operates linearly, supposing the open loop gain is much bigger than $1/\beta$, the time constant $\tau$ is given by:

$$\tau = \frac{1}{\beta \omega_T} \tag{3.18}$$

The plot in Figure 3.11(b) shows the step response of an amplifier. The total settling time $(t_s)$, defined, as we said, as the time the response to an input step takes to settle to within a given relative error $\varepsilon$ of its final value, is shown. The total time has two parts, a slewing period $(t_{slew})$ and a linear settling period $(t_{ls})$. The output voltage where the transition from the slew rate limited operation to the linear operation occurs is denoted $V_{trans}$.

The expression of the total settling time, taking into account the slewing and the linear part, is given by [3, 22]:

$$t_s = t_{slew} + t_{ls} = \tau \left[ \ln \left( \frac{1}{\varepsilon} \right) - 1 + \ln \left( x \right) + \frac{1}{x} \right] \tag{3.19}$$

where

$$x = \frac{\tau SR}{V_{step}} = \frac{V_{step} - V_{trans}}{V_{step}} \tag{3.20}$$

Therefore, x is a dimensionless magnitude which has values between 0 and 1. Its physical meaning is that it corresponds to the fraction of the total step where we have linear settling.

The fact that the amplifier actually has a second order response can be taken into account by introducing two changes to equation (3.19). First, calculating $\tau$ with the actual transition frequency and not the first order one, which for a given phase margin is achieved by including in the expression of $\tau$ in equation (3.18) a correction coefficient $k_{corr\omega_T}$ that multiplies $\omega_T$.
Second, multiplying the $\ln(1/\varepsilon)$ term by a correction coefficient $(k_{corrsetl})$ that takes into account the different evolution of the second order time response, and hence changes the number of time constants required to settle for a given phase margin.

This model, including the above mentioned correction factors, was applied in [3, 22] to evaluate the total settling time at 5% of an experimental class AB 9MHz OTA.

#### 3.4.3 Power Optimization of a Miller OTA

The power optimization methodology was developed for an RC compensated Miller OTA. As we said in section 3.3.3, the RC compensation network makes it possible to reduce the power consumption of a Miller amplifier.

Phase margin, in the RC compensated Miller, is a function of the first order transition frequency (2.27) and the non-dominant pole frequency (2.25).

$$PM = f\left(\omega_T, \omega_{NDP}\right) \tag{3.21}$$

Both expressions for these frequencies, seen in section 2.4.1, still stand when using the RC compensation.

$$\omega_T = \frac{gm_1}{C_m} \tag{2.27}$$

$$\omega_{NDP} = \frac{gm_2C_m}{C_1C_2 + C_m(C_1 + C_2)} \tag{2.25}$$

while a third pole appears

$$\omega_{RC} = \frac{1}{R_m C_m}$$

where $R_m$ is the resistance in the RC compensating network. It can be proved [37] that this last pole lies at much higher frequencies, and thus can be neglected in equation (3.21).

The slew rate is defined as the minimum between the internal slew rate and the external slew rate. Internal slew rate will be denoted $SR_1$ and, in the case of the Miller amplifier, originates from the charging of the compensating capacitance $C_m$. External slew rate will be denoted $SR_2$ and, in the case of the class A output stage, originates from the charging of the load capacitance.

$$SR = \min\{SR_1, SR_2\} \tag{3.22}$$

with

$$SR_1 = \frac{2I_{D1}}{C_m} \tag{3.23}$$

$$SR_2 = \frac{I_{D2}}{C_2} \tag{3.24}$$

where $I_{D1(2)}$, $C_m$ and $C_2$ were defined in section 2.4.1.
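The first order settling model of equations (3.18)-(3.20), together with the slew rate expressions (3.22)-(3.24), can be sketched in a few lines. This is an illustrative sketch only: the function names are ours, the second order correction coefficients are omitted, and clamping $x$ to 1 when the amplifier never slews is our own assumption about how to treat that case.

```python
import math


def slew_rate(i_d1: float, i_d2: float, c_m: float, c_2: float) -> float:
    """Equations (3.22)-(3.24): minimum of internal and external slew rate."""
    sr_internal = 2.0 * i_d1 / c_m   # (3.23), charging of C_m
    sr_external = i_d2 / c_2         # (3.24), charging of the load (class A)
    return min(sr_internal, sr_external)


def total_settling_time(omega_t: float, beta: float, sr: float,
                        v_step: float, eps: float) -> float:
    """First order total settling time, equations (3.18)-(3.20).

    omega_t : transition frequency [rad/s]
    beta    : feedback factor
    sr      : slew rate [V/s]
    v_step  : output step amplitude [V]
    eps     : relative settling error (e.g. 0.01 for 1%)
    """
    tau = 1.0 / (beta * omega_t)                       # (3.18)
    # (3.20); x is clamped to 1 (assumption): no slewing period at all.
    x = min(tau * sr / v_step, 1.0)
    return tau * (math.log(1.0 / eps) - 1.0 + math.log(x) + 1.0 / x)  # (3.19)


# Normalized example: tau = 1 s and half of the step settles linearly.
t_half = total_settling_time(omega_t=1.0, beta=1.0, sr=0.5, v_step=1.0,
                             eps=0.01)
# With a very large slew rate the result reduces to tau * ln(1/eps).
t_linear = total_settling_time(omega_t=1.0, beta=1.0, sr=1e9, v_step=1.0,
                               eps=0.01)
```

With $x = 1$ the bracket in (3.19) reduces to $\ln(1/\varepsilon)$, the familiar first order result, so the clamped expression is continuous at the boundary between slewing and purely linear settling.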
An additional equation will be needed for the determination of the compensation capacitance. Silveira [3,22] presents two approaches. The first is based on the fact that the compensation capacitance basically determines the thermal noise characteristic of the amplifier [53]. Therefore, we could determine $C_m$ from the noise specification. A second way to determine $C_m$ is from the effect it has on power consumption. If we consider a given gain-bandwidth product, equation (2.27) shows that an increase in $C_m$ requires an increase in the first stage transconductance and thus an increase in the first stage current. On the other hand, for a given phase margin, and thus for a given non-dominant pole frequency, equation (2.25) shows that an increase in $C_m$ results in a decrease in the second stage transconductance and hence in its current. Therefore, there exists a $C_m$ value that results in a minimum total current for a given gain-bandwidth product and phase margin.

So far, in the previous equations, we have five unknowns: the (W/L) ratio and current of the input and output transistors and the compensating capacitor. It can be seen that this is equivalent to having the $(g_m/I_D)$ ratio of the input and output transistors, the current of the output transistor $(I_{D2})$, the gain-bandwidth product $(\omega_T)$ and the compensating capacitance $(C_m)$ as unknowns. There are three equations: equation (3.21) to have a given phase margin, equation (3.19) to have a given total settling time and either condition that determines the compensation capacitance. Hence we have two degrees of freedom, which we will assign to the $(g_m/I_D)$ ratios of the input differential pair $((g_m/I_D)_1)$ and the output stage active transistor $((g_m/I_D)_2)$. We will then be able to perform a design space exploration, as in Algorithm 2.2, to determine the optimum combination of $(g_m/I_D)_1$ and $(g_m/I_D)_2$ that minimizes power consumption for a given settling time and phase margin.
Algorithm 3.1 presents the power optimization algorithm for a given total settling time developed in [3]. This algorithm only calculates the dimensions of transistors M1...M3; the dimensions of the current source transistors (M4 and M5) can be sized later using criteria similar to those used in Algorithm 2.2.

This algorithm has the interesting feature that it can very easily be applied to other amplifier architectures. This is based, first, on the fact that the total settling time model shown above and presented in [3,22] is fairly independent of the particular amplifier architecture. On top of that, the more general concept of exploring the design space through the $(g_m/I_D)$ method to search for the minimum consumption for a given total settling time can be applied to any amplifier. What we need is to adapt the design procedure described for the Miller RC amplifier in steps 3(a) to 3(e). This procedure is based on the two expressions that relate the first order transition frequency with the input stage $(g_m/I_D)$ ratio and the non-dominant pole with the output stage $(g_m/I_D)$ ratio (equations (2.27) and (2.25)). These same relations are present in other amplifiers and, thus, can be used to develop the particular procedure for each of them.

# 3.5 Conclusions

The main ideas in analog design reuse and advanced design methodologies have been presented. We have shown that circuit performance tuning is possible and we have reviewed the desirable characteristics of a reusable opamp architecture. Then we showed a particular implementation of these characteristics in a two stage RC Miller compensated amplifier with rail-to-rail input stage and class AB output stage. Also, technology migration was briefly explained as a valid alternative in analog design reuse, including experimental results.
Then, a power optimization algorithm for a given total settling time, presented by Silveira [3,22], was reviewed as an example of an advanced design methodology. This algorithm moves systematically from a high level total settling time specification to a low level design that complies with this specification with optimum power consumption. The algorithm can be very easily applied to other amplifier architectures. In the case of a Miller amplifier with more complex input and output stage architectures, we only need to rewrite some of the equations used here. This is what we intend to do in Chapter 4, where we will develop a hierarchical algorithm to synthesize the Miller amplifier presented in section 3.3.

#### **Algorithm 3.1** Power Optimization for a Given Total Settling Time

1. The lengths of the active transistors are set to their minimum value. This value can be increased later if the resulting DC gain is not enough or the 1/f noise is too large.
2. Mirror transistors are designed to minimize systematic offset as in Algorithm 2.1.
3. The design space is swept. For each point in the design space we sweep $C_m$. For each value the amplifier is designed to comply with the specified total settling time and phase margin. This is done with the following iterative procedure:
   - (a) Initial values for $\omega_T$ and $I_{D2}$ are determined using equations (2.25), (3.19), (3.20) and (3.22) in the simplified case where $C_1 \ll C_m \ll C_2$ and $SR_1 < SR_2$.
   - (b) $I_{D1}$ is determined as:

     $$I_{D1} = \frac{\omega_T C_m}{(g_m/I_D)_1}$$
   - (c) We can now determine the size of transistors M1...M3, since the $(g_m/I_D)$ ratio and the current are known for all of them.
   - (d) From the calculated transistor sizes, we calculate the parasitic capacitance $C_1$. Then, we can calculate $x$, $\omega_T$ and $I_{D2}$ using the following expressions derived from equations (2.25), (3.19), (3.20) and (3.22):

     $$x = \frac{\min\left\{\frac{2}{(g_m/I_D)_1}, \frac{NDP\left(C_1C_2 + C_m(C_1 + C_2)\right)}{(g_m/I_D)_2 C_m C_2}\right\}}{\beta V_{step}} \tag{3.25}$$

     $$\omega_T = \frac{\ln\left(\frac{1}{\varepsilon}\right) - 1 + \ln(x) + \frac{1}{x}}{\beta t_s} \tag{3.26}$$

     $$I_{D2} = \omega_T \frac{NDP\left(C_1C_2 + C_m(C_1 + C_2)\right)}{(g_m/I_D)_2 C_m} \tag{3.27}$$
   - (e) If the relative differences with respect to the previous values of $\omega_T$ and $I_{D2}$ are less than a given error, the procedure is finished; otherwise we iterate from step 3(b) with the newly calculated values.

# Chapter 4 Hierarchical Automated Synthesis

# 4.1 Introduction

This chapter applies the ideas seen in the last chapter to automatically synthesize a reusable operational amplifier cell. Since we are using the much more complex architecture seen in Chapter 3, we need to adapt the algorithms seen so far to take the added complexity into account.

First, section 4.2 presents an expression developed in this work to directly obtain the value of the Miller compensation capacitance that minimizes power consumption. This expression saves large amounts of processing time, since it is no longer necessary to sweep the value of $C_m$ and synthesize the whole amplifier for each value to find the optimum design. However, this expression must be used with care, as will be seen below.

Section 4.3 presents the hierarchical automated synthesis algorithm that was developed in this work. The algorithm synthesizes each stage independently and then combines the results in a high level algorithm based on the algorithm seen in section 3.4. The results of this algorithm are presented in section 4.4, including simulation results and examples of the performance tuning capabilities of the opamp cell. Conclusions are presented in section 4.6 and experimental results are presented in Chapter 5.
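Since the rest of this chapter keeps reusing the settling-time relations of Algorithm 3.1, a minimal Python sketch of equations (3.25) and (3.26) may help fix ideas. The function and argument names are our own illustration and do not belong to the original tool, and the numeric values in the usage lines are arbitrary.

```python
import math


def slew_ratio_x(gmid1, gmid2, ndp, c1, c2, cm, beta, v_step):
    """Equation (3.25): the smaller of the two slew-rate terms over beta*V_step."""
    internal = 2.0 / gmid1                                   # input stage term
    external = ndp * (c1 * c2 + cm * (c1 + c2)) / (gmid2 * cm * c2)  # output stage term
    return min(internal, external) / (beta * v_step)


def transition_frequency(x, eps, beta, t_s):
    """Equation (3.26): omega_T (rad/s) needed for a total settling time t_s."""
    return (math.log(1.0 / eps) - 1.0 + math.log(x) + 1.0 / x) / (beta * t_s)


# Illustrative numbers only (hypothetical design point, not taken from the thesis).
x = slew_ratio_x(gmid1=12.0, gmid2=15.0, ndp=2.2,
                 c1=0.2e-12, c2=50e-12, cm=2e-12, beta=1.0, v_step=0.3)
w_t = transition_frequency(x, eps=1e-3, beta=1.0, t_s=1e-6)
```

For $x = 1$ the expression for $\omega_T$ reduces to $\ln(1/\varepsilon)/(\beta t_s)$, which is the purely linear settling case.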
# 4.2 Miller Compensation Capacitance for Minimum Power Consumption

Algorithm 3.1 uses a very robust but very time consuming method to obtain the compensation capacitance $(C_m)$ that provides minimum power consumption at a given design space location. Here we derive an expression for this capacitance, in order to use it in the proposed algorithm.

The total quiescent current consumption of the amplifier can be expressed as

$$I_{DD} = \alpha_1 I_{D1} + \alpha_2 I_{D2} \tag{4.1}$$

where $\alpha_1$ and $\alpha_2$ are the number of times each stage's bias current is drawn from the supply in quiescent conditions. In the case of the simple Miller OTA presented in section 2.4.1, $\alpha_1 = 2$ and $\alpha_2 = 1$. In the case of the amplifier presented in section 3.3, $\alpha_1 = 16$ and $\alpha_2 = 1 + \frac{1}{h} + \frac{1}{k} + \frac{1}{km}$.

Bias currents can be written as functions of the compensation capacitance using equations (2.27) and (3.27)

$$I_{D1} = \frac{gm_1}{(g_m/I_D)_1} = \frac{\omega_T C_m}{(g_m/I_D)_1} \tag{4.2}$$

$$I_{D2} = \omega_T \frac{NDP \left( C_1 C_2 + C_m (C_1 + C_2) \right)}{(g_m/I_D)_2 C_m} \tag{4.3}$$

In order to obtain the compensation capacitance $(C_m)$ that minimizes the total consumption $(I_{DD})$, we must set the derivative of equation (4.1) with respect to $C_m$ to zero. Several factors in the above expressions are also functions of $C_m$ ($\omega_T$, $C_1$, $C_2$, etc.) and should also be taken into account when calculating the derivative. However, these dependencies are very hard to obtain and depend strongly on the stage architecture. Since we will be using this expression in an iterative algorithm that tends asymptotically to the solution, we will consider all these factors as constant in a small interval around the current value of $C_m$:
$$\frac{dI_{DD}}{dC_m} = \alpha_1 \frac{\omega_T}{(g_m/I_D)_1} - \alpha_2 \frac{\omega_T NDP}{(g_m/I_D)_2} \frac{C_1 C_2}{C_m^2} = 0 \tag{4.4}$$

$$\frac{d^2 I_{DD}}{dC_m^2} = 2\alpha_2 \frac{\omega_T NDP}{(g_m/I_D)_2} \frac{C_1 C_2}{C_m^3} > 0 \quad (C_m > 0) \tag{4.5}$$

Equation (4.5) proves that the zero of equation (4.4) corresponds to a minimum and confirms that there is a value of $C_m$ that minimizes consumption, as stated by Silveira in [3] and in section 3.4.1. Then, using equation (4.4), we obtain the expression for the minimum consumption $C_m$:

$$C_m = \sqrt{\frac{\alpha_2 NDP\,(g_m/I_D)_1\, C_1 C_2}{\alpha_1 (g_m/I_D)_2}} \tag{4.6}$$

This expression is a very useful tool for saving significant processing time in the synthesis of Miller amplifiers using Algorithm 3.1. Its validity can be confirmed by referring to the design obtained by Silveira [3] using Algorithm 3.1. He obtained optimum consumption with $C_m = 2pF$, while equation (4.6) yields $C_m = 1.87pF$. The difference is not only smaller than the step used by Silveira to sweep $C_m$, but also well within process variations. However, equation (4.6) must be used with care, since it makes an important approximation when it considers $C_1$ and $C_2$ as constants and, thus, it should be used only in iterative processes such as Algorithm 3.1.

# 4.3 Synthesis Algorithm

The amplifier presented in section 3.3 has a much more complex architecture than the simple Miller amplifier used in the synthesis algorithms presented in sections 2.4 and 3.4.1. Therefore, to develop a synthesis algorithm for our amplifier we will use a hierarchical approach. That is, we will consider our amplifier as a two-stage, RC Miller compensated amplifier and thus use the same algorithm presented in section 3.4.1, with modifications that take into account that each stage needs to be synthesized to obtain its high level characteristics.

Figure 4.1: High Level Schematic of the Amplifier
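Because the synthesis described in this section relies on equation (4.6), a quick numerical check of that expression can be useful. The sketch below is only an illustration under the same assumption as the derivation ($\omega_T$, $NDP$, $C_1$ and $C_2$ held constant); the function names and all numeric values are hypothetical.

```python
import math


def total_current(cm, w_t, ndp, c1, c2, gmid1, gmid2, alpha1, alpha2):
    """I_DD from equations (4.1) to (4.3), with w_t, NDP, C1 and C2 held constant."""
    i_d1 = w_t * cm / gmid1
    i_d2 = w_t * ndp * (c1 * c2 + cm * (c1 + c2)) / (gmid2 * cm)
    return alpha1 * i_d1 + alpha2 * i_d2


def optimum_cm(ndp, c1, c2, gmid1, gmid2, alpha1, alpha2):
    """Closed-form minimum of I_DD, equation (4.6)."""
    return math.sqrt(alpha2 * ndp * gmid1 * c1 * c2 / (alpha1 * gmid2))


# Hypothetical values for a simple Miller OTA (alpha1 = 2, alpha2 = 1).
params = dict(ndp=2.2, c1=0.3e-12, c2=50e-12,
              gmid1=12.0, gmid2=15.0, alpha1=2, alpha2=1)
w_t = 2.0 * math.pi * 1e6
cm_opt = optimum_cm(**params)
i_opt = total_current(cm_opt, w_t=w_t, **params)
```

Evaluating `total_current` slightly below and slightly above `cm_opt` gives a larger current on both sides, which is the behavior predicted by equations (4.4) and (4.5).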
We will show that this hierarchical approach is feasible in spite of the fact that variables in analog design are strongly coupled. This can be seen in Figure 4.1, where each stage is characterized by 4 parameters: transconductance, $gm$, output conductance, $go$, input capacitance, $C_i$, and output capacitance, $C_o$. Then, a high level synthesis can be used independently of the implementation of each stage, provided that the internal poles and zeros of each stage are taken into account. The two stage amplifier model is completed with

$$C_1 = C_{o1} + C_{i2} \tag{4.7}$$

$$C_2 = C_{o2} + C_L \tag{4.8}$$

$$R_m = \frac{1}{gm_2} \tag{4.9}$$

Thus, we have 5 unknowns at this point:

$$\begin{cases} gm_1 \\ gm_2 \\ I_{D1} \\ I_{D2} \\ C_m \end{cases} \Rightarrow \begin{cases} (g_m/I_D)_1 \\ (g_m/I_D)_2 \\ I_{D1} \\ I_{D2} \\ C_m \end{cases} \tag{4.10}$$

which are more conveniently written in the form of the second brace. Now we can sweep the design space, defined as always by the $(g_m/I_D)$ ratio of each stage, to obtain at each point minimum consumption designs that comply with the total settling time and phase margin specifications.

Figure 4.2 shows the synthesis scheme for the amplifier.

Figure 4.2: Complete Amplifier Synthesis Algorithm Scheme. $t_{sett}$ is total settling time and $I_{DD}$ is total current consumption.

For each point of the design space, we perform a high level synthesis of the amplifier to obtain minimum consumption for the given specifications. In each step of the synthesis, the input and output stages are re-synthesized and their high level characteristics are updated to be used in the next step of the high level synthesis. When the synthesis converges to a final value, we move on to the next point in the design space, until we have explored the consumption space of the amplifier, as in sections 2.4.4 and 2.4.5 (see Figure 2.5 and Figure 2.12). There, we can choose the point of minimum consumption from which we can obtain our final design.
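The outer loop of this scheme can be sketched in a few lines of Python. This is only a skeleton under stated assumptions: `synthesize_point` is a hypothetical callback standing in for the whole high level synthesis, and `toy_synthesis` is an arbitrary smooth surface used to exercise the sweep, unrelated to any result in this work.

```python
def explore_design_space(gmid1_values, gmid2_values, synthesize_point):
    """Sweep the (gm/ID)_1 x (gm/ID)_2 grid and return the minimum-consumption point.

    synthesize_point(gmid1, gmid2) is a caller-supplied stand-in for the high level
    synthesis. It must return the total current I_DD of the converged design, or
    None when the point cannot meet the settling-time and phase-margin specs.
    """
    best = None
    consumption = {}
    for gmid1 in gmid1_values:
        for gmid2 in gmid2_values:
            i_dd = synthesize_point(gmid1, gmid2)
            if i_dd is None:
                continue  # infeasible point, leave it out of the consumption map
            consumption[(gmid1, gmid2)] = i_dd
            if best is None or i_dd < best[1]:
                best = ((gmid1, gmid2), i_dd)
    return best, consumption


def toy_synthesis(gmid1, gmid2):
    # Hypothetical consumption surface, only to exercise the sweep.
    return 1e-6 * (100.0 + (gmid1 - 10.0) ** 2 + 4.0 * (gmid2 - 17.0) ** 2)


best_point, surface = explore_design_space(range(4, 25, 2), range(5, 26, 2), toy_synthesis)
```

The returned `surface` dictionary plays the role of the consumption space from which level curves can be drawn, and `best_point` is the candidate for the final design.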
The final step is a performance evaluation of this design, to make sure that all other aspects of the amplifier are at acceptable values. If something is found to be out of specifications, we should change some preselected parameter (e.g. transistor lengths) and rerun the synthesis.

The high level synthesis will be a modified version of Algorithm 3.1, including the direct estimation of $C_m$ by equation (4.6). Next we will see how we modify the algorithm and how we implement the synthesis algorithms for each stage.

#### 4.3.1 High Level Synthesis

When considering the $(g_m/I_D)$ ratio of each stage, we have two choices: either consider the ratio between the effective transconductance of the stage and the total current consumption of the stage, or consider the $(g_m/I_D)$ ratio of the active transistors of each stage.

The first choice is clearly more in the spirit of a hierarchical synthesis algorithm and would allow us to use the same equations used in Algorithm 3.1. However, these $(g_m/I_D)$ ratios would depend on the stage synthesis and thus could not be used to define the design space as we have used the $(g_m/I_D)$ ratios so far, which ultimately means that we could not use Algorithm 3.1 at all.

On the other hand, the second choice allows us to define the design space independently of the outcome of each step of the algorithm, since the $(g_m/I_D)$ ratio characteristic of a transistor is defined by its technology. Thus, although some of the equations have to be rewritten, this option allows us to use Algorithm 3.1 as the high level synthesis algorithm of our amplifier.

In conclusion, for the input stage we will use the $(g_m/I_D)$ ratio of the $T_{P,ref}$ differential pair transistors, from now on noted $(g_m/I_D)_1$. For the output stage we will use the $(g_m/I_D)$ ratio of transistor Ma, from now on noted $(g_m/I_D)_2$ (see Figures 3.7 and 3.9). Regarding $(g_m/I_D)_2$, note that $(g_m/I_D)_2 = gm_a/I_{Da}$.
Then, to have a uniform criterion, we will refer to $gm_a$ and $I_{Da}$ as $gm_2$ and $I_{D2}$ respectively.

Let us now see how the equations from Algorithm 3.1 have to be modified. The gain-bandwidth product remains unchanged, but the expression for the non-dominant pole frequency

$$\omega_{NDP} = \frac{gm_2C_m}{C_1C_2 + C_m(C_1 + C_2)} \tag{2.25}$$

has to take into account that, according to equation (3.13), the effective transconductance of the output stage is augmented by a $gm_{mult}$ factor, and thus the new expression for the non-dominant pole frequency is

$$\omega_{NDP0} = \frac{gm_2\,gm_{mult}\, C_m}{C_1C_2 + C_m(C_1 + C_2)} \tag{4.11}$$

where the subscript 0 indicates that we have considered the effect of the pole-zero doublets of the output stage (equation (3.15)) to be negligible (i.e. $D(s) = 1$). Of course, this has to be taken into account when synthesizing the output stage. Then, the NDP factor $(NDP = \frac{\omega_{NDP0}}{\omega_T})$ will not be given a priori, but will be obtained from the output stage synthesis, where we can evaluate where the non-dominant pole should lie to have a given phase margin in spite of the pole-zero doublets of the output stage.

It is worth stopping a moment here to take notice of the effect that the $gm_{mult}$ factor has on the size of $C_m$ and ultimately on the current consumption<sup>6</sup> and the available current budget for the constant gm circuit. It can be easily seen that equation (4.11) yields smaller $C_m$ values for the same $\omega_{NDP}$ when $gm_{mult} > 1$. This translates into smaller currents in the input stage transistors to obtain the same gain-bandwidth product. Thus, we end up with less total consumption, although, since the latter is dominated by the second stage current consumption, this effect does not lead to big savings.

<sup>6</sup> This fact was first mentioned when introducing the output stage in section 3.3.2 and here we can show the cause.
However, what is interesting here is that this negligible saving in current consumption comes with a number of benefits. As we mentioned in section 3.3.2, we can operate our input stage in weaker inversion, with all the benefits that this yields in terms of increased input common mode range, increased gain and reduced input offset voltage. What we did not mention earlier is that the ratio between output and input stage bias currents has become almost $gm_{mult}$ times larger. Thus, recalling what we saw in section 3.3.1, we have now enlarged the current budget available for the constant gm circuit by the same $gm_{mult}$ factor. Therefore it makes even more sense now to invest the 10 copies of the input stage bias current needed to keep the input stage transconductance constant, since their effect on the total current consumption of the amplifier has become truly negligible<sup>7</sup>.

Resuming the rewriting of the expressions from Algorithm 3.1, we will analyze how the internal and external slew rate expressions are modified.

The internal slew rate $(SR_1)$ is given by the maximum output current of the first stage, that is, the maximum output current of the folded cascode summing circuit (see Figure 3.10). Since we are biasing them with $2I_{D1}$, it is well known that the maximum output current, and thus the slew rate, is the same as that of a single differential pair. Therefore the expression for the internal slew rate (given in equation (3.23)) remains unchanged.

The external slew rate $(SR_2)$ is given by the maximum output current of the second stage. As we saw in section 3.3.2, the output current is non-symmetrical. The maximum current the stage can source is

$$I_{oMAX}(+) = gm_{mult}\, I_{D2} \tag{4.12}$$

while the maximum current the stage can sink is given by the size of Ma and the maximum voltage swing at the input of the stage.
Since this last value cannot be known before the transistor has been synthesized, we will use equation (4.12) and then make sure that the maximum current the stage can sink is above that value. Finally, then, we can rewrite equation (3.20) as

$$x = \frac{\min\left\{\frac{2}{(g_m/I_D)_1}, \frac{gm_{mult}\ I_{D2}}{\omega_T C_2}\right\}}{\beta V_{step}} \tag{4.13}$$

where the expression for $I_{D2}$ has to be rewritten from equation (3.27) as

$$I_{D2} = \omega_T \frac{NDP \left( C_1 C_2 + C_m (C_1 + C_2) \right)}{(g_m/I_D)_2 \ gm_{mult} \ C_m} \tag{4.14}$$

taking into consideration the augmented effective transconductance of the output stage.

<sup>7</sup> Usually the $gm_{mult}$ factor can reach values between 15 and 25. In our case, $gm_{mult}$ is equal to 22.

In Algorithm 3.1 the value of $C_m$ is swept. In this new algorithm we will use the expression derived in section 4.2. However, during this work we saw that if we directly use the value of $I_{D2}$ obtained from equation (4.14), with $C_m$ calculated from equation (4.6), the algorithm has severe convergence problems. This can be explained by reviewing the assumptions made to obtain equation (4.6). That expression is valid only in a small interval around the previous step value of $C_m$, since it treats as constant several parameters that are in fact not constant over the whole range of valid $C_m$ values.

Thus, we chose to smooth the change to the next value of $I_{D2}$ to comply with the assumptions made to obtain equation (4.6). To do this, we calculate the difference between the current value and the new estimation,

$$\Delta I_{D2} = \omega_T \frac{NDP \left( C_1 C_2 + C_m (C_1 + C_2) \right)}{(g_m / I_D)_2 \ gm_{mult} \ C_m} - I_{D2} \tag{4.15}$$

and then calculate the new value as

$$I_{D2new} = I_{D2} + \alpha \Delta I_{D2} \tag{4.16}$$

where we defined $\alpha$ as the "iteration step", which can be either constant or adaptive with $\Delta I_{D2}$<sup>8</sup>.

It is worth making one final remark on the value of $C_m$.
As we have mentioned, the effect of the $gm_{mult}$ factor might yield very low values for $C_m$. We even encountered values as low as some tens of fF. This of course is unacceptable, since it would yield high noise figures and would be hard to implement due to the uncertainty in the actual value. Therefore, we chose to establish a lower limit for $C_m$ and to check each time that the limit is respected. Instead of equation (4.6), we will use

$$C_{m} = \max \left\{ C_{mMIN}, \sqrt{\frac{\alpha_{2}NDP\,(g_{m}/I_{D})_{1}\, C_{1}C_{2}}{\alpha_{1}\,gm_{mult}\, (g_{m}/I_{D})_{2}}} \right\} \tag{4.17}$$

where we chose $C_{mMIN} = 0.25pF$.

Algorithm 4.1 presents the high level synthesis algorithm. It shows how the main ideas presented in this section were implemented. Steps 4(a) and 4(c) are calls to Algorithm 4.3 and Algorithm 4.2, which implement the output and input stage synthesis respectively. These algorithms are presented next.

<sup>8</sup> In the last version of the algorithm, we use the hyperbolic tangent function (tanh) to implement an adaptive step that goes from 0.1 to 1.

#### Algorithm 4.1 High Level Synthesis

1. $(g_m/I_D)_1$ and $(g_m/I_D)_2$ ratios are given by the design space sweep.
2. Initial values for $x$, $\omega_T$ and $I_{D2}$ are determined using equations (4.13), (3.26) and (4.14) in the simplified case where $C_1 \ll C_m \ll C_2$ and $NDP = 2.2$.
3. We estimate values for $k$, $h$, $m$, $C_1$ and $C_2 \simeq C_L$. Then we can estimate an initial value for $C_m$ using equation (4.17).
4. We perform the following iterative process:
   - (a) Using $I_{D2}$ we synthesize the output stage. We obtain $C_{i2}$, $C_{o2}$, $NDP$ and $k$, $h$, $m$.
   - (b) From $k$, $h$, $m$ we calculate $gm_{mult}$ and the factor $\alpha_2$ for equation (4.17).
   - (c) Using $C_m$ we synthesize the input stage. We obtain $C_{o1}$ and $I_{D1}$.
   - (d) From the calculated parasitic input and output capacitances, we calculate $C_1$ and $C_2$.
   - (e) Using equations (4.15) and (4.16) we calculate the new value of $I_{D2}$.

     $$\Delta I_{D2} = \omega_T \frac{NDP \left(C_1 C_2 + C_m (C_1 + C_2)\right)}{(g_m / I_D)_2 \ gm_{mult} \ C_m} - I_{D2}$$

     $$I_{D2new} = I_{D2} + \alpha \Delta I_{D2}$$
   - (f) Using equations (4.13), (3.26) and (4.17) we calculate $x$, $\omega_T$ and $C_m$ respectively.

     $$x = \frac{\min\left\{\frac{2}{(g_m/I_D)_1}, \frac{gm_{mult}\ I_{D2new}}{\omega_T C_2}\right\}}{\beta V_{step}}$$

     $$\omega_T = \frac{\ln\left(\frac{1}{\varepsilon}\right) - 1 + \ln(x) + \frac{1}{x}}{\beta t_s}$$

     $$C_m = \max\left\{C_{mMIN}, \sqrt{\frac{\alpha_2 NDP\,(g_m/I_D)_1\, C_1 C_2}{\alpha_1\, gm_{mult}\, (g_m/I_D)_2}}\right\}$$
   - (g) If the relative difference between $I_{D2new}$ and $I_{D2}$ is less than a given error, the procedure is finished; otherwise we iterate from step 4(a) with the newly calculated values.

#### 4.3.2 Input Stage Synthesis

The input stage synthesis, although it might look complicated because of all the constant transconductance circuitry, is quite simple. This is so because once we have synthesized one differential pair and one folded cascode circuit, we have all the building blocks for the stage.

The differential pair synthesis is quite simple. Given $C_m$ and the gain-bandwidth product $(\omega_T)$, we get the differential pair bias current as

$$I_{D1} = \omega_T \frac{C_m}{(g_m/I_D)_1} \tag{4.18}$$

from which we can easily obtain transistor sizes using the $(g_m/I_D)$ methodology [1]. Since we use a robust constant-gm circuit, we do not have to worry about scaling the sizes between p-type and n-type differential pair transistors and, thus, we use the same size for both. However, this is not the best solution, and some scaling should be added to improve the performance of the constant-gm circuit.

Figure 4.3: Folded Cascode Circuit

The most complex part of the synthesis of this stage is the folded cascode synthesis. To illustrate it, Figure 4.3 shows the folded cascode circuit.
This circuit adds two non-dominant poles, one because of the parasitic capacitance $C_{FCm}$ of the current mirror formed by transistors M7a, M7b, and the other because of the parasitic capacitances $C_{p1}$ and $C_{p2}$. This last pole has one value when the p-type input differential pair is active (related to $C_{p1}$) and another when the n-type input differential pair is active (related to $C_{p2}$).

The non-dominant pole due to the cascode transistors, in the case where the p-type differential pair is active, is given by

$$\omega_{FC} = \frac{gms_{FC3}}{C_{p1}} \tag{4.19}$$

where $gms_{FC3}$ is the source transconductance of transistors M3a, M3b and $C_{p1}$ is given by

$$C_{p1} = C_{oDP} + C_{jFC3} + C_{jFC1} + C_{GSFC3} \tag{4.20}$$

where $C_{oDP}$ is the output capacitance of the differential pair, $C_{jFC1(3)}$ is the junction drain-bulk capacitance of transistor M1b (M3b) and $C_{GSFC3}$ is the gate-source capacitance of transistor M3b. This means that the parasitic capacitance is partially determined by the differential pair synthesis and by transistor M1b, which is a current source designed a priori. However, it also depends on the $(g_m/I_D)$ ratio of transistor M3b and, recalling that $gms = n\,gm$, we can sweep the $(g_m/I_D)_{FC3}$ ratio of the cascode transistors to obtain the corresponding frequencies of this non-dominant pole. Since the current through the cascode transistor is $2I_{D1}$, we can express the non-dominant pole frequency as

$$\omega_{FC} = \frac{2I_{D1}\,n\,(g_m/I_D)_{FC3}}{C_{p1}} \tag{4.21}$$

It can be seen that there is a $(g_m/I_D)_{FC3}$ ratio for which the $NDP_{FC} = \frac{\omega_{FC}}{\omega_T}$ ratio is maximum. Thus, we design our transistors M3a, M3b to obtain maximum $NDP_{FC}$. To size transistors M5a, M5b we use an analogous procedure.
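The sweep over $(g_m/I_D)_{FC3}$ can be illustrated with a deliberately crude Python model. Everything numeric here is an assumption made for the sketch: the constants, the lumping of $C_{jFC3} + C_{GSFC3}$ into a single capacitance per unit width, and the use of the usual ACM relation between $(g_m/I_D)$ and the inversion level to obtain the transistor width. It does not reproduce the capacitance model actually used in this work.

```python
import math

N_SLOPE = 1.3      # slope factor n (assumed)
PHI_T = 0.0258     # thermal voltage in V
I_SQ = 50e-9       # sheet normalization current in A (assumed)
L_UM = 1.0         # channel length in um (assumed)
C_FIXED = 50e-15   # C_oDP + C_jFC1, fixed by earlier steps (assumed value)
C_PER_UM = 2e-15   # lumped C_jFC3 + C_GSFC3 per um of width (crude assumption)


def cascode_pole(gmid, i_d1):
    """omega_FC of equation (4.21) for a cascode transistor carrying 2*I_D1."""
    # Inversion level i_f from gm/ID = 2 / (n * phi_t * (1 + sqrt(1 + i_f))).
    root = 2.0 / (N_SLOPE * PHI_T * gmid) - 1.0
    i_f = root * root - 1.0
    width_um = L_UM * (2.0 * i_d1) / (I_SQ * i_f)   # wider towards weak inversion
    c_p1 = C_FIXED + C_PER_UM * width_um
    return 2.0 * i_d1 * N_SLOPE * gmid / c_p1


def best_cascode_gmid(i_d1, w_t, gmid_values):
    """Return ((gm/ID)_FC3, NDP_FC) with the largest NDP_FC over the swept values."""
    return max(((g, cascode_pole(g, i_d1) / w_t) for g in gmid_values),
               key=lambda pair: pair[1])


sweep = [g / 2.0 for g in range(8, 57)]   # 4.0 to 28.0 1/V in steps of 0.5
gmid_fc3, ndp_fc = best_cascode_gmid(i_d1=1e-6, w_t=2.0 * math.pi * 1e6,
                                     gmid_values=sweep)
```

With this toy model the best point falls strictly inside the swept range: towards strong inversion the transconductance drops, while towards weak inversion the width, and with it $C_{p1}$, grows quickly.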
The existence of this maximum is yet another example of the trade-off between the increase in transconductance and the increase in parasitic capacitances when operating towards weak inversion.

The expression for the current mirror pole is

$$\omega_{FCm} = \frac{gm_{FC7}}{C_{FCm}} \tag{4.22}$$

where $gm_{FC7}$ is the transconductance of transistors M7a, M7b. As in the case of $\omega_{FC}$, looking at the expression for $C_{FCm}$

$$C_{FCm} = 2C_{GFC7} + C_{jFC3} + C_{jFC5} \tag{4.23}$$

we see that it is partially determined by the cascode transistor synthesis, but it also depends on $(g_m/I_D)_{FC7}$ through $C_{GFC7}$<sup>9</sup>. Then, there is also a value of $(g_m/I_D)_{FC7}$ where the $NDP_{FCm}$ ratio $(NDP_{FCm} = \frac{\omega_{FCm}}{\omega_T})$ is maximum. Therefore we will also sweep the $(g_m/I_D)_{FC7}$ ratio to design transistors M7a, M7b where the $NDP_{FCm}$ ratio is maximum.

<sup>9</sup> Gate capacitance of transistors M7a, M7b.

These ratios $(NDP_{FC}, NDP_{FCm})$ are usually large enough to neglect the effect of these non-dominant poles in a first approximation. This fact is checked after the synthesis is finished. We will see next, in the output stage synthesis, that we use a higher value for the phase margin than the one specified. This is done to take into account that the final value will be lower because of these and other effects which are considered negligible in the synthesis.

Algorithm 4.2 shows the algorithm used to synthesize the input stage. As we said, this algorithm only obtains the sizes of the transistors of the two basic blocks of the whole input stage. With them, we can implement both complementary differential pairs and the constant-gm auxiliary circuit shown in Figure 3.7. Finally, the design of the cascode bias circuit shown in Figure 4.3 is explained in Appendix A. This design is done after the opamp synthesis, since it has no effect on it.

#### Algorithm 4.2 Input Stage Synthesis

1. Differential pair transistors are sized using equation (4.18).
2. Transistors M1a, M1b are sized like the other current sources of the opamp.
3. We sweep $(g_m/I_D)_{FC3}$, and for each value we estimate $C_{p1}$ and $NDP_{FC}$ (equation (4.21)).
4. We choose $(g_m/I_D)_{FC3}$ to obtain maximum $NDP_{FC}$.
5. We design transistors M5a, M5b repeating steps 2 and 3 in their case.
6. We sweep $(g_m/I_D)_{FC7}$, and for each value we estimate $C_{FCm}$ and $NDP_{FCm}$ (equation (4.22)).
7. We design transistors M7a, M7b to obtain maximum $NDP_{FCm}$.
8. We estimate $C_{o1}$.

Figure 4.4: Class AB output stage.

#### 4.3.3 Output Stage Synthesis

The design methodology for the output stage is based on the methodology proposed by Silveira *et al.* [3,24]. In Figure 4.4, we replicate Figure 3.9 but noting the parasitic capacitances $C_c$ and $C_e$ that fix the frequency response of the current mirrors. The implementation of the $I_{AB}$ current source is also shown, as it will have an effect on the total $C_e$ capacitance.

As in the previous examples seen in this work, the phase margin of the amplifier will be determined by the position of the non-dominant poles with respect to the transition frequency. However, when using this output stage, the response of the current mirrors of the output stage will affect the phase margin, as explained in section 3.3.2. If we recall equation (3.15), the phase shift introduced by the current mirrors depends only on the frequency of the poles and the gain of the current mirrors. Then, we can express the phase margin as

$$PM = f(NDP, \omega_c, \omega_e, k, h, m) \tag{4.24}$$

where $NDP = \omega_{NDP0}/\omega_T$ is given in equation (4.11) and $\omega_c$ ($\omega_e$) is the angular frequency of the Mc, Mb (Me, Md) current mirror pole shown in equation (3.17) (equation (3.16)).
It is quite clear, then, that the current mirror gains $(k, h, m)$ affect the amplifier phase margin as well as the total quiescent current of the second stage. Then, there is a trade-off between amplifier stability and power consumption that will be used to design the stage [3]. The synthesis algorithm will find a set of current mirror gains that provides minimum consumption while preserving stability.

To measure the reduction in consumption, we will use a figure of merit for the output stage $(FOM_{OS})$ that must be minimized. This figure of merit will be

$$FOM_{OS} = \frac{I_{TqAB}}{I_{TqA}} \tag{4.25}$$

where $I_{TqAB}$ is the total quiescent current consumption of the stage, and $I_{TqA}$ is the total quiescent current consumption of a simple class A stage with the same phase margin.

Stability will be measured through the phase margin. Then, we must develop an expression for the relation presented in equation (4.24). This expression neglects the effect of the input stage frequency response and the effect of the Miller resistor ($R_m$). For the Miller resistor this is usually a valid approximation. The input stage, although designed to have a negligible effect, was found in practice to have a minor effect on the total phase margin. To keep with the main idea of hierarchical design and be able to decouple the synthesis of both stages, we will design our output stage to have a phase margin somewhat higher than the specification, and thus account for the loss caused by the input stage frequency response.

Silveira [3] presents an expression for the phase margin, neglecting the effect of the input stage frequency response:

$$PM = \mathrm{phase}\left(\frac{1}{1 + \frac{j\omega}{\omega_{NDP0}D(j\omega)}}\right) \tag{4.26}$$

Since the complex quantity $D(j\omega)$ influences the phase to be determined, this expression can only be solved numerically.
Silveira [3] then provides an approximate, pessimistic estimation of the phase margin

$$\begin{aligned} PM &= \mathrm{phase}\left(\frac{1}{1 + \frac{j\omega_T}{\omega_{NDP0}}}\right) + \mathrm{phase}\left(D(j\omega_T)\right) \\ &= -\arctan\left(\frac{\omega_T}{\omega_{NDP0}}\right) + \mathrm{phase}\left(D(j\omega_T)\right) \end{aligned} \tag{4.27}$$

where the second term, recalling equation (3.15), can be expressed as

$$\mathrm{phase}\left(D(j\omega_T)\right) = \arctan\left(\frac{\frac{\omega_c + \omega_e}{\omega_c \omega_e} \omega_T}{gm_{mult} - \frac{\omega_T^2}{\omega_c \omega_e}}\right) - \arctan\left(\frac{\omega_T}{\omega_c}\right) - \arctan\left(\frac{\omega_T}{\omega_e}\right) \tag{4.28}$$

Then, to synthesize the output stage, we will use an optimization program that minimizes $FOM_{OS}$ while keeping the phase margin above a given value (e.g. $60^{\circ}$). To do so, the routine has to obtain an optimum combination of the three mirror gains $(k, h, m)$ and $NDP_0$ that complies with the constraints given by the phase margin. This is done with the optimization routine fmincon available in the MATLAB Optimization Toolbox [54].

To calculate the phase margin, we need to calculate the current mirror poles. Silveira [3] derived simplified approximate expressions for the current mirror poles. These expressions were obtained using several assumptions. The main assumption was that the parasitic capacitances that define the poles of the current mirrors are dominated by the gate capacitance. This is a reasonable assumption in Silicon-On-Insulator (SOI) technology [3] and it might also be reasonable in bulk technology, especially when the current mirror gain is larger than unity. However, when using those expressions, quite large errors in the final phase margin value were obtained. Besides, we proved that if we neglect the drain-substrate capacitance of transistor Mf in the expression of $C_e$, factor $h$ always tends to its minimum allowable value, and thus non-optimum solutions might arise.
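The approximate estimate of equations (4.27) and (4.28) is easy to evaluate numerically. The Python sketch below does so with hypothetical pole frequencies. Two points are our own assumptions: the 90 degrees contributed by the dominant pole are added to express the result as a conventional phase margin, and `atan2` is used for the first term of (4.28) so that the quadrant stays correct if its denominator becomes negative.

```python
import math


def mirror_phase(w_t, w_c, w_e, gm_mult):
    """Phase of D(j*w_t) in radians, following the three terms of equation (4.28)."""
    imag = w_t * (w_c + w_e) / (w_c * w_e)
    real = gm_mult - w_t * w_t / (w_c * w_e)
    return math.atan2(imag, real) - math.atan(w_t / w_c) - math.atan(w_t / w_e)


def phase_margin_deg(ndp, w_t, w_c, w_e, gm_mult):
    """Approximate phase margin in degrees.

    Equation (4.27) with NDP = w_NDP0 / w_T, plus the 90 degrees left by the
    dominant pole (our reading of how (4.27) maps to a conventional margin).
    """
    pm = math.pi / 2.0 - math.atan(1.0 / ndp) + mirror_phase(w_t, w_c, w_e, gm_mult)
    return math.degrees(pm)


w_t = 2.0 * math.pi * 1e6
# Mirror poles pushed very far away: only the non-dominant pole matters.
ideal = phase_margin_deg(2.2, w_t, 1e6 * w_t, 1e6 * w_t, gm_mult=22.0)
# Hypothetical mirror poles at 8 and 5 times the transition frequency.
with_mirrors = phase_margin_deg(2.2, w_t, 8.0 * w_t, 5.0 * w_t, gm_mult=22.0)
```

Comparing `ideal` and `with_mirrors` shows how much margin the mirror poles take away for a given $NDP$, which is why the optimization has to choose $NDP$ together with the mirror gains.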
The statement about $h$ can be easily verified, and a brief proof is given below. Therefore, we will use complete expressions for both parasitic capacitances when calculating the current mirror poles.

The optimization routine also has to calculate the transconductance of transistors Me and Mc. Then, at each step of the routine we need to obtain the $(g_m/I_D)$ ratios for both transistors. Since we are considering $L_e$ and $L_c$ equal to $L_a$<sup>10</sup>, the criterion used was to set $W_c = W_e = W_{min}$, since this maximizes the pole frequency of the current mirrors.

<sup>10</sup> It might be of interest to note here that $L_a$ and the maximum allowable value for $h$ are chosen such that we will never obtain $W_f < W_{min}$.

It is worth noticing that we obtain the size of transistors Mb, Mc, Md, Me at each step of the optimization. However, we do not design transistors $Mos_1$, $Mos_2$ in the optimization routine. Instead, when calculating $C_e$ we assign a certain amount of the total value to account for the current source. Then, we design the current source to comply with that amount of drain-substrate capacitance. Naturally, we must check at the end of the synthesis that the dynamic ranges are satisfied.

Let us briefly explain the statement made above about the need to include the drain-substrate capacitance of transistor Mf.
If we recall equations (2.28) and (4.11), we can write the bias current for the active transistor of our class AB output stage and of a simple class A output stage respectively as

$$I_{D2}|_{AB} = \omega_T NDP_{AB} \frac{C_1 C_2 + C_m (C_1 + C_2)}{(g_m / I_D)_2 \ gm_{mult} \ C_m} \tag{4.29}$$

$$I_{D2}|_{A} = \omega_{T} NDP_{A} \frac{C_{1}C_{2} + C_{m}(C_{1} + C_{2})}{(g_{m}/I_{D})_{2} C_{m}} \tag{4.30}$$

Then, since both stages are designed with the same $(g_m/I_D)_2$ ratio, the same capacitances $(C_1, C_m, C_2)$ and the same $\omega_T$, the ratio of the two bias currents is

$$\frac{I_{D2}|_{AB}}{I_{D2}|_{A}} = \frac{NDP_{AB}}{gm_{mult}} \frac{1}{NDP_{A}} \tag{4.31}$$

where $NDP_{AB}$ and $NDP_A$ are the ratios of the non-dominant pole to the transition frequency needed to achieve the same phase margin with our class AB output stage and with a simple class A output stage respectively.

Then, the figure of merit to be minimized, presented in equation (4.25), can be rewritten as

$$FOM_{OS} = \frac{I_{TqAB}}{I_{TqA}} = \frac{(I_{Tq}/I_Q)_{AB}}{gm_{mult}} \frac{NDP_{AB}}{NDP_A} \tag{4.32}$$

where $(I_{Tq}/I_Q)_{AB}$ is the ratio between the total quiescent current consumption and the quiescent current through transistor Ma as defined in equation (3.12). Using this same equation and the definition of $gm_{mult}$, we can write the first term in equation (4.32) as

$$\begin{aligned} \frac{(I_{Tq}/I_Q)_{AB}}{gm_{mult}} &= \frac{1 + \frac{1}{h} + \frac{1}{k} + \frac{1}{km}}{1 + \frac{km}{h}} = \frac{h\left(\frac{m(k+1)+1}{km}\right) + 1}{h + km} \\ &= \frac{h/a + 1}{h + b} = \frac{1}{a}\left(\frac{h+a}{h+b}\right) \end{aligned} \tag{4.33}$$

with $a = \frac{km}{m(k+1)+1}$ and $b = km$, so that $b > a$.

Since $b > a$, this term grows with $h$, so it is minimum when $h$ is minimum. Then, if we do not consider the drain-substrate capacitance of transistor Mf in the expression of $C_e$, the second term, $NDP_{AB}/NDP_A$, only depends on $k$ and $m$. Therefore, the routine that minimizes $FOM_{OS}$ will always come up with the solution $h = h_{min}$<sup>11</sup>.
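The algebra of equation (4.33) and the monotonic behavior with $h$ can be checked with a few lines of Python. The mirror gains used below are arbitrary illustrative values, and the function names are ours.

```python
def first_term(k, h, m):
    """(I_Tq/I_Q)_AB / gm_mult, written as the left-hand side of equation (4.33)."""
    return (1.0 + 1.0 / h + 1.0 / k + 1.0 / (k * m)) / (1.0 + k * m / h)


def first_term_ab(k, h, m):
    """Same quantity written as (1/a)(h + a)/(h + b), with a = km/(m(k+1)+1), b = km."""
    a = k * m / (m * (k + 1.0) + 1.0)
    b = k * m
    return (h + a) / (a * (h + b))


# Arbitrary mirror gains k = 4, m = 5, swept over a few values of h.
h_values = (1.0, 2.0, 4.0, 8.0)
values = [first_term(4.0, h, 5.0) for h in h_values]
```

Both forms agree, and `values` increases with `h`, so a routine that only sees this term will always drive `h` to its lower bound.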
Algorithm 4.3 completes the set of algorithms used in the hierarchical automated synthesis presented in this chapter. It might be noted that some details, such as the design of some current sources, are not explained. We chose not to go into that level of detail, since the whole methodology might become hard to follow and those are non-critical points regarding the power optimization. Next, we will see the results of applying this algorithm in a particular design.

<sup>11</sup>In our case the minimum allowable value for $h$ is 1.

#### Algorithm 4.3 Output Stage Synthesis

1. We size transistor Ma using $I_{D2}$ and the $(g_m/I_D)_2$ ratio.
2. We estimate the upper limit for the mirror gain $h$ ($h_{max}$) to ensure that transistor Mf will always have $W_f > W_{min}$.
3. We run the optimization routine to obtain the optimum combination of mirror gains $(k, h, m)$ and NDP factor that minimizes the ratio $\frac{I_{TqAB}}{I_{TqA}}$.
4. The optimization routine also sizes transistors Me, Md and Mc, Mb.
5. We estimate the input and output capacitances: $C_{i2}$ and $C_{o2}$.

Figure 4.5: Opamp cell, omitting constant-gm circuitry. $V_{IBN}$ controls the current through the n-type differential pair.

### 4.4 Synthesis Results

We will present here the automated design of an opamp cell synthesized to have $1\mu s$ total settling time. The results will be compared with simulations made using the transistor model BSIM3v3 in SPICE. Performance tuning of the amplifier will also be explored using simulations. Experimental results are presented in the next chapter. The performance of the amplifier will be compared with that of an amplifier with a simpler architecture designed using Algorithm 3.1. Then we will compare the performance of the designed cell, tuned to operate at a different settling time, with another cell designed with our algorithm specifically to meet those specifications.
Before starting with the section, Figure 4.5 shows the complete circuit of the opamp cell (omitting the constant-gm circuitry) in order to remind the reader of the notation of each transistor referred to below.

#### 4.4.1 $1\mu s$ Settling Time Design

The algorithm described in section 4.3 was applied to the design of a rail-to-rail Miller OTA in $0.8\mu m$ CMOS technology. The opamp is synthesized to have $1\mu s$ total settling time for a 0.3V step input. The load capacitance is 50pF and we want a phase margin above $60^{\circ}$. The algorithm explores the design space defined by $(g_m/I_D)_1$ and $(g_m/I_D)_2$. The resulting level curves of constant total consumption are shown in Figure 4.6, where the design point for optimum power consumption is

$$(g_m/I_D)_1 = 12 \qquad (g_m/I_D)_2 = 15$$

Figure 4.6: Total Consumption (in $\mu A$) for a $1\mu s$ total settling time rail-to-rail OTA

However, inspecting Figure 4.6 it is clear that $(g_m/I_D)_1$ is not a critical parameter with respect to power optimization. This was expected, since the bias current in the input stage is much smaller than the bias current in the output stage.
Then, to select the $(g_m/I_D)_1$ ratio, we will take into consideration that if we select $(g_m/I_D)_1 = 12$, the constant-gm circuitry won't be able to effectively compensate the complementary input differential pairs, since we will be dangerously close to having a zone in the input common mode range where the current sources of both pairs work in the linear region. Then, we will design our amplifier at the suboptimum point

$$(g_m/I_D)_1 = 18 \qquad (g_m/I_D)_2 = 15$$

where the consumption is less than 10% above the minimum and we ensure a good safety margin against having a zone where the constant-gm circuit can't compensate the loss of transconductance.

Applying Algorithm 4.1, we obtain the design presented in Table 4.1, with transistor sizes shown in Table 4.2.

| Parameter | Value | Parameter | Value |
|---|---|---|---|
| $(g_m/I_D)_1$ ($V^{-1}$) | 18 | $I_{TqAB}/I_{TqA}$ | 0.25 |
| $(g_m/I_D)_2$ ($V^{-1}$) | 15 | $SR_{int}/SR_{ext}$ | 0.29 |
| $I_{TOT}$ ($\mu A$) | 10.33 | GainDC (dB) | 167 |
| $I_{D1}$ (nA) | 82 | $f_T$ (MHz) | 0.85 |
| $I_{Da}$ ($\mu A$) | 4.5 | $NDP_0$ | 5.1 |
| $I_{oMAX}$ ($\mu A$)<sup>1</sup> | 103 | SR ($V/\mu s$) | 0.59 |
| $k$ | 8.5 | $C_m$ (pF) | 0.277 |
| $m$ | 3 | $R_m$ ($k\Omega$) | 0.65 |
| $h$ | 1.2 | $C_1$ (fF) | 52 |
| $gm_{mult}$ | 23 | $C_2 - C_L$ (fF) | 53 |

<sup>1</sup>: Maximum source current.

Table 4.1: Automatic Synthesis Result with Algorithm 4.1

The first thing we see is that the input stage bias current is about 50 times smaller than the second stage bias current. Then, the penalty for the current "invested" in the constant-gm circuitry is quite negligible, as expected. Recalling equation (4.1), the total input stage consumption is only $1.3\mu A$, less than 15% of the total opamp consumption.

The values obtained for $k$, $h$, $m$ yield a quite large $gm_{mult}$. This has several effects, as briefly explained in sections 3.3.2 and 4.3.1 and more thoroughly by Silveira [3].
One of them is that we have a very small compensating capacitance $C_m$, which is just above the minimum value established in the algorithm. This allows us to achieve the desired $\omega_T$ with such a small $I_{D1}$.

Another effect is the reduction in the output stage consumption when compared with a simple class A output stage. We can see this in the figure of merit $I_{TqAB}/I_{TqA}$, defined in equation (4.25), where we achieved a reduction in current consumption by a factor of 4. This reduction can't reach much higher values because of the large phase shift introduced by the output stage current mirrors. The phase shift results in a high $NDP_0$ value ($NDP_0 = 5.1$), which in turn keeps the output stage from a higher reduction in consumption.

| Block | Device | $W$<sup>3</sup> | $L$<sup>3</sup> | M | $(g_m/I_D)$ |
|---|---|---|---|---|---|
| Differential Pair | C.M.<sup>1</sup> | 7.9 | 20 | | 15 |
| Differential Pair | D.P.<sup>2</sup> | 9.2 | 10 | 1 | 18 |
| Folded Cascodes | $M_{FC1}$ | 7.9 | 20 | 4 | 15 |
| Folded Cascodes | $M_{FC3}$ | 2.6 | 2 | 1 | 22.8 |
| Folded Cascodes | $M_{FC5}$ | 4.6 | 2 | 1 | 21.9 |
| Folded Cascodes | $M_{FC7}$ | 2 | 3.1 | 1 | 16.1 |
| Output Stage | Ma | 13.2 | 1 | 1 | 15 |
| Output Stage | Mf | 11 | 1 | 1 | 15 |
| Output Stage | Mb | 2 | 1 | 8.5 | 12.4 |
| Output Stage | Mc | 2 | 1 | 1 | 12.4 |
| Output Stage | Md | 2 | 1 | 3 | 21.7 |
| Output Stage | Me | 2 | 1 | 1 | 21.7 |
| Output Stage | $Mos_1$ | 3.7 | 10 | 1 | 3.4 |
| Output Stage | $Mos_2$ | 3.7 | 10 | 2 | 3.4 |

<sup>1</sup>: Current Mirrors.

Table 4.2: Transistor Sizes Obtained Using Algorithm 4.1. M is the number of parallel transistors.

The ratio between internal and external slew rate $(SR_{int}/SR_{ext})$ shows that the total slew rate is determined by the input stage. Silveira [3] analyzed this ratio and concluded that the optimum consumption is obtained when the ratio equals 1.
This is quite reasonable, since if we increase one of the slew rates, the total slew rate is still ruled by the smaller of them, and the current spent in increasing the other one is wasted. Then, how do we have a ratio of 0.29? This can be explained by looking at our architecture. Silveira [3] used a simple RC compensated Miller OTA with a class A output stage when he made that analysis. In our amplifier, we have two main differences that explain our smaller ratio. First, the ratio between output and input stage bias currents is much larger; second, the ratio between the maximum output current and the total output stage quiescent current is increased by approximately $gm_{mult}$. Then, if we want to increase $SR_{ext}$, the relative increase in the total quiescent current is much smaller than if we were to do that in a simple class A output stage. Thus, only a small penalty arises from this course of action. On the other hand, if we increase $I_{D1}$ to increase $SR_{int}$, the increase in the second stage consumption needed to keep the same stability margin is much larger, and thus the penalty in this case is much bigger. Then, the optimizing algorithm will consider it acceptable to have a small increase in the output stage quiescent current (to achieve the stability margins, for example), in spite of the fact that this small increase in current yields a much larger external slew rate.

Table 4.3 compares the calculated performance with the performance obtained by simulating the OTA in SPICE using model BSIM3v3. The simulations were performed for three input common mode levels, $V_{CM} = -0.7V, 0.0V, +0.7V$ ($V_{DD} = \pm 1V$), and the results are in good agreement with the expected performance.
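The slew-rate ratio and the transition frequency discussed above can be cross-checked against the entries of Table 4.1. The sketch below assumes the usual first-order Miller relations, which are an assumption of this example and not quoted from the synthesis algorithm: $\omega_T = g_{m1}/C_m$, an internal slew rate set by the tail current $2I_{D1}$ charging $C_m$, and an external slew rate set by $I_{oMAX}$ charging the 50pF load. The variable names are illustrative.

```python
import math

# Values taken from Table 4.1 and the design specification.
GM_OVER_ID_1 = 18.0   # (g_m/I_D)_1 in 1/V
I_D1 = 82e-9          # input pair transistor bias current, A
I_DA = 4.5e-6         # output transistor Ma bias current, A
C_M = 0.277e-12       # Miller compensation capacitance, F
I_O_MAX = 103e-6      # maximum source current, A
C_LOAD = 50e-12       # load capacitance, F

# Assumed first-order relations (not taken from the text).
gm1 = GM_OVER_ID_1 * I_D1
f_t = gm1 / (2 * math.pi * C_M)   # transition frequency, Hz
sr_int = 2 * I_D1 / C_M           # internal slew rate, V/s
sr_ext = I_O_MAX / C_LOAD         # external slew rate, V/s
sr_ratio = sr_int / sr_ext
bias_ratio = I_DA / I_D1          # second stage vs. input stage bias

print(f"f_T        = {f_t / 1e6:.2f} MHz")
print(f"SR_int     = {sr_int / 1e6:.2f} V/us")
print(f"SR_ext     = {sr_ext / 1e6:.2f} V/us")
print(f"SR_int/ext = {sr_ratio:.2f}")
print(f"I_Da/I_D1  = {bias_ratio:.0f}")
```

Under these assumptions the tabulated $f_T$, SR and $SR_{int}/SR_{ext}$ are recovered to within rounding, which suggests that the table entries are mutually consistent and that the total slew rate is indeed the internal one.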
The difference between the rise and fall settling times is a known effect of the output stage [3]. It is due to the fact that, on a falling edge, the current mirror Me, Md turns off, and the delay to turn it back on results in increased ringing and a slower settling behavior.

<sup>2</sup>: Differential Pairs Transistors. <sup>3</sup>: W and L in $\mu m$.

| | $I_{TOT}$ ($\mu A$) | $t_{set}$ rise ($\mu s$) | $t_{set}$ fall ($\mu s$) | SR rise ($V/\mu s$) | SR fall ($V/\mu s$) | $f_T$ (MHz) | PM ($^{o}$) |
|---|---|---|---|---|---|---|---|
| Calculated | 10.33 | 1 | 1 | 0.59 | 0.59 | 0.85 | 63.2 |
| SPICE: $V_{CM} = -0.7V$ | 9.68 | 0.91 | 1.90 | 0.42 | 0.32 | 0.85 | 61.9 |
| SPICE: $V_{CM} = 0.0V$ | 10.34 | 1.02 | 2.19 | 0.45 | 0.39 | 0.84 | 62.2 |
| SPICE: $V_{CM} = +0.7V$ | 10.96 | 0.95 | 1.64 | 0.52 | 0.30 | 0.87 | 69.1 |

Table 4.3: Calculated and simulated characteristics of the OTA with $1\mu s$ total settling time

The differences in the slew rate along the input common mode range are due to the fact that our rail-to-rail input stage doesn't have constant large signal behavior. Since the slew rate is determined by the internal slew rate, different values are achieved when only the p-type differential pair, only the n-type differential pair, or both differential pairs are acting.

To analyze the frequency response of the amplifier, we plot in Figure 4.7 the transition frequency and the phase margin of the amplifier.

Figure 4.7: Transition frequency and Phase Margin along the input common mode range.

The transition frequency is, as predicted, 0.85MHz with less than 6% variation along the input common mode range, except for $V_{CM} = 0.9V$, where the saturation voltage of the output stage transistor Mb limits the performance.
This is because the criterion used was to optimize the frequency response of the output stage current mirrors, which yields low $(g_m/I_D)$ ratios and thus high saturation voltages. This criterion should be reviewed to establish a minimum allowable $(g_m/I_D)$ ratio if operation closer to the rails at the output is required.

Another source of variation in the transition frequency is the fact that the phase margin increases by almost $10^o$ in the upper half of the input common mode range. This increase in the phase margin is due to the fact that when the n-type differential pair is acting, the non-dominant pole of the folded cascode is given by the parasitic capacitance at the source of transistors $M_{FC5}$, $M_{FC6}$, which is much smaller than the one at the source of transistors $M_{FC3}$, $M_{FC4}$ due to the difference between the sizes of the current sources $(M_{FC1}, M_{FC2})$ and the current mirrors $(M_{FC7}, M_{FC8})$ in the folded cascode. This difference is much larger than we expected, and we should improve that part of the synthesis algorithm by reducing the size of the current sources and trying to match the frequency of both non-dominant poles.

Figure 4.8: Total Settling Time for different input common mode range

Figure 4.7 seems to show a good performance of the constant-gm circuitry. However, Figure 4.8 allows us to have a better look at the evolution of the total settling time along the input common mode range, and thus at the behavior of the constant-gm circuitry. Since the output stage distorts the falling settling behavior, we show only the case of a positive step. It is clear in Figure 4.8 that the constant-gm circuitry performs quite badly. The explanation lies in the frequency response of the constant-gm circuitry to common mode signals. When analyzing the frequency response of the amplifier, the common mode is constant and thus the constant-gm circuitry doesn't affect the performance.
However, when analyzing the response to a step input in the follower configuration, the common mode at the input also presents a step, and thus the speed of the constant-gm circuitry in responding to the input becomes critical. It can be proved, analyzing the frequency response to common mode signals, that the constant-gm circuitry used in this design is much slower than the opamp, and thus the settling time suffers a noticeable delay, as seen in Figure 4.8. In section 4.5 we will further analyze the constant-gm circuit and show how its speed can be improved.

On the other hand, Figure 4.8 shows that when these undesirable effects don't hinder the performance, the opamp complies precisely with the $1\mu s$ total settling time specification. This proves the achievement of a major goal of this work, since we were able to automatically design a complex opamp cell that achieves the desired characteristics.

One final remark concerns the extremely high value achieved for the DC gain. This is explained by several factors: a) there are three gain stages (two in the input stage and one in the output stage), b) the output stage gain is enhanced by the transconductance multiplication effect, c) these values correspond to operation with a purely capacitive load, in which case the output stage gain is maximum, d) we are taking full advantage of the high gain achievable in the weak and moderate inversion regions. Nevertheless, smaller values are expected in the experimental prototypes due to the presence of parasitic and unmodelled effects in the transistor output conductance.

We will now explore the reusability of this cell by tuning its performance through the input bias current.

#### 4.4.2 Opamp Performance Tuning

Figure 4.9 shows the first analysis of the reusability capabilities of the opamp cell. The gain bandwidth product is tuned over more than 3 decades.

Figure 4.9: Transition frequency $(f_T)$ and Phase Margin tuning over more than 3 decades
From the design point $(I_{REF} = 41nA)$ we can go down, working in deeper weak inversion. The figure shows the tuning down to $I_{REF} = 410pA$, but there is no reason that prevents us from going even further, until leakage currents limit us. Regarding a higher gain bandwidth product, we can't go higher than about a decade, since the cell will start to have problems with the low supply voltage as transistors start to enter strong inversion operation, and thus all the advantages of moderate and weak inversion for performance tuning, seen in section 3.2.1, no longer hold.

Figure 4.9 shows the tuning for three different common mode voltages. We see a good performance, as expected: the three gain bandwidth product curves are superposed along the whole tuning range, and the $10^{o}$ difference seen in Figure 4.7 also holds along the tuning range. The exception is the last value, where the higher saturation voltage of the differential pair current sources forces the constant-gm circuitry to act for $V_{CM} = 0.0V$, and the effect seen for high input common mode appears in that curve.

The good response of the gain-bandwidth product along the input common mode range over the whole tuning range can be better appreciated in Figure 4.10. It is clear that a constant small signal behavior is achieved along the input common mode range at every tuning point. The transition from the p-type differential pair to the n-type differential pair is also clearly visible in the phase margin, since its increase appears at lower input common mode levels as the bias current increases. We also see how the saturation voltage of the output stage transistor Mb affects the gain bandwidth product for the highest bias current, because of the stronger inversion operation at that tuning point, as we commented above.

Figure 4.10: Transition frequency and phase margin tuning as a function of the input common mode
This can also be seen in the phase margin, which noticeably drops at higher common mode levels when operating in stronger inversion. The explanation is that the current mirror Mc, Mb at the output stage can no longer be considered a mirror with gain $k$, since transistor Mb is working in the linear region. Thus, $gm_{mult}$ has a much lower value, yielding lower $NDP_0$ ratios and an overall bad performance of the output stage.

Figure 4.11: Total Settling Time tuning for three different input common modes where the constant-gm circuitry doesn't affect the performance greatly.

In Figure 4.11 we see how we can tune our settling behavior from about 300ns up to $100\mu s$. The higher phase margin in the upper part of the input common mode range results in faster settling behavior, except for $I_{REF}$ above 41nA, where the lack of dynamic range due to the higher saturation voltages of the transistors hinders the performance. Also, at $I_{REF} = 130nA$ the constant-gm circuitry activates below $V_{CM} = 0V$, and thus the settling behavior suffers the consequences of its slower response. In spite of the problems described above, the performance is in excellent agreement with the expected results. For example, we are able to tune the settling response of our amplifier to $100\mu s$ using $I_{REF} = 0.41nA$, thus reducing the current consumption by two orders of magnitude.

One last plot regarding the tuning of the opamp cell is presented in Figure 4.12. Here we see the effect of both problems described above: the slower response of the constant-gm circuitry, and the lack of dynamic range at the output stage due to the stronger inversion operation when tuning for faster settling times.

Nevertheless, this section has proved another major goal of this work. If we design our amplifier to work between weak and moderate inversion, then we can tune it over several decades, all the way down into deep weak inversion, without loss of performance.
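The tuning range quoted in this section is close to what an idealized scaling rule would predict. The sketch below is a first-order illustration only: it assumes that, with all transistors in weak inversion, every transconductance scales linearly with $I_{REF}$ while the capacitances stay fixed, so that the total settling time is inversely proportional to $I_{REF}$. This rule and the function name are assumptions of the example; the model ignores the constant-gm circuitry and the strong inversion effects described above.

```python
# Design point of section 4.4.1.
I_REF_DESIGN = 41e-9    # A
T_SET_DESIGN = 1e-6     # s


def scaled_settling_time(i_ref):
    """Idealized settling time, assuming t_set is proportional to 1/I_REF."""
    return T_SET_DESIGN * I_REF_DESIGN / i_ref


# Tuning points mentioned in the text, plus one intermediate value.
for i_ref in (0.41e-9, 4.1e-9, 41e-9, 130e-9):
    t_set = scaled_settling_time(i_ref)
    print(f"I_REF = {i_ref * 1e9:7.2f} nA  ->  t_set ~ {t_set * 1e6:8.3f} us")
```

With this rule, $I_{REF} = 0.41nA$ maps to $100\mu s$ and $I_{REF} = 130nA$ to a little over 300ns, which matches the limits of the tuning range reported in Figure 4.11 wherever the non-idealities discussed above do not dominate.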
Figure 4.12: Total Settling Time tuning as a function of the input common mode range

#### 4.4.3 Synthesized vs. Tuned

An interesting question to ask ourselves is: how optimum is a tuned design? If we tune our $1\mu s$ design to use it as a $10\mu s$ design, how far are we from the optimum power design for $10\mu s$ total settling time? To answer this question, we synthesized an amplifier using Algorithm 4.1 to comply with $10\mu s$ total settling time. The simulation shows that the design obtained has a total settling time of $8.5\mu s$ in the lower part of the input common mode range. Therefore, we tuned our $1\mu s$ design to have the same settling time. Figure 4.13 shows that we achieved the same settling performance in the desired input common mode range. The effect of the constant-gm circuitry appears for lower common mode levels in the $10\mu s$ design, because the input stage of the tuned $1\mu s$ design operates in weaker inversion.

Table 4.4 shows that both designs have similar performance. However, the design optimized for $10\mu s$ consumes almost half the total current of the tuned $1\mu s$ design. This shows that, although tuned cells have good performance, it is worth re-synthesizing an amplifier when making a new design where power consumption is critical, especially since, with the aid of Algorithm 4.1, a new completely optimized design might be ready in a very short time.

#### 4.4.4 Performance Evaluation Against a Simpler Architecture

Here we will compare our amplifier with the amplifier designed by Silveira [3] using Algorithm 3.1. This is a very rough comparison due to the noticeable differences between both designs. A broader and more thorough comparison will be made in Chapter 5 using the experimental results; nevertheless, this comparison gives the reader a first idea of the performance of the new algorithm.
The architecture used by Silveira is a simple Miller OTA like the one used in section 2.4.1, but using an RC compensating network. The main difference is that it was designed in the $3\mu m$ CMOS on Fully Depleted Silicon-on-Insulator technology (FD-SOI) of the UCL (Université Catholique de Louvain). However, the fact that our amplifier is designed with a shorter minimum length compensates the advantages of the FD-SOI technology in a first approximation [3]. The specifications used by Silveira [3] were $1\mu s$ total settling time for an input step amplitude $V_{step}$ equal to 0.25V and a 10pF load capacitor.

Figure 4.13: Total Settling Time comparison between the $10\mu s$ design and the tuned $1\mu s$ design.

Table 4.5 compares the main characteristics of both syntheses. Since we are using a much more efficient output stage, it is reasonable to compare one design with a capacitive load of 10pF and the other with 50pF. Both designs achieve very similar performance, with the total consumption of the design in [3] about 30% lower than ours ($7.05\mu A$ against $10.33\mu A$). This difference can be attributed to our higher load capacitance, in spite of the more efficient output stage, or to the more efficient technology of the design in [3], in spite of its longer minimum channel length. Nevertheless, it proves that our algorithm successfully complied with the same total settling time specification under more demanding requirements (higher load capacitance and rail-to-rail input) and using a more complex architecture, while keeping a comparable power consumption.

### 4.5 Analysis of the Constant-gm Circuit

All the simulations of this chapter and the experimental prototype presented in Chapter 5 used a constant-gm circuit that hinders the settling behavior of our amplifier.
With a more thorough analysis, a good trade-off between improved settling behavior and increased consumption is presented. Regrettably, the improvement in the settling behavior could not be achieved before the prototype was fabricated and thus, the ideas discussed in this section could not be verified experimentally.

| | $10\mu s$ Design | Tuned $1\mu s$ Design |
|---|---|---|
| $I_{TOT}$ ($\mu A$) | 0.58 | 1.18 |
| $f_T$ (kHz) @ $V_{CM} = -0.5V$ | 88.3 | 120.3 |
| $f_T$ (kHz) @ $V_{CM} = 0.0V$ | 87.1 | 118.4 |
| $f_T$ (kHz) @ $V_{CM} = +0.5V$ | 88.2 | 121.7 |
| PM ($^{o}$) @ $V_{CM} = -0.5V$ | 68.5 | 63.9 |
| PM ($^{o}$) @ $V_{CM} = 0.0V$ | 68.6 | 64 |
| PM ($^{o}$) @ $V_{CM} = +0.5V$ | 71.8 | 72.5 |
| SR ($mV/\mu s$) @ $V_{CM} = -0.7V$ | 51 | 51 |
| SR ($mV/\mu s$) @ $V_{CM} = 0.0V$ | 51 | 52 |
| SR ($mV/\mu s$) @ $V_{CM} = +0.7V$ | 59 | 52 |

Table 4.4: Comparison between the $10\mu s$ design and the tuned $1\mu s$ design.

| | [3] | This work |
|---|---|---|
| Settling Specification ($\mu s$) | 1 | 1 |
| Technology | $3\mu m$ FD-SOI | $0.8\mu m$ Bulk |
| Load Cap. (pF) | 10 | 50 |
| $I_{TOT}$ ($\mu A$) | 7.05 | 10.33 |
| GainDC (dB) | 82 | 167 |
| $f_T$ (MHz) | 0.75 | 0.85 |
| PM ($^{o}$) | 68 | 58 |
| SR ($V/\mu s$) | 0.59 | 0.59 |

Table 4.5: Comparison between our amplifier and the one designed in [3] using Algorithm 3.1 with a simple Miller OTA.

#### 4.5.1 Open Loop Transfer

In Figure 4.14 we recall (see section 3.3.1) the constant-gm circuit loop that fixes the total transconductance equal to the transconductance of the differential pair $T_{P,ref}$. If we want to analyze the dynamics of the loop, we must open it to obtain an expression for the open loop transfer. We choose to open it at the input of the current source of differential pair $T_N$. Thus, the input and output signals $(V_{IN}, V_{OUT})$ are defined for the open loop transfer.

Figure 4.14: Constant-gm circuit loop.
We will write this open loop transfer as:

$$\frac{V_{OUT}}{V_{IN}} = \frac{V_{OUT}}{\Delta i} \frac{\Delta i}{V_{IN}} \tag{4.34}$$

where $\Delta i$ is the sum of all the differential currents $i_N$, $i_P$ and $i_{REF}$. This sum is performed by the folded cascode circuit and thus it has a frequency response, analyzed in section 4.3.2, that we will denote $F(s)$. According to the input stage synthesis algorithm (see Algorithm 4.2), the poles introduced by $F(s)$ lie above the transition frequency of the opamp, and thus we will assume that they don't affect the frequency response of the loop. $\Delta i$ can be written as

$$\Delta i = (i_N + i_P + i_{REF})F(s) \tag{4.35}$$

where the sum of the differential currents equals zero in steady state. During the transient response to a signal in $V_{IN}$, the only differential current that changes is $i_N$; thus, we can further expand equation (4.34) as

$$\frac{V_{OUT}}{V_{IN}} = \frac{V_{OUT}}{\Delta i} \frac{i_n}{V_{IN}} F(s) \tag{4.36}$$

where $i_n$ is the signal current in $i_N$ due to $V_{IN}$. The first term is determined by the output impedance of the folded cascode circuit and the gate capacitance of the current source transistor of the differential pair $T_N$. Thus, the expression is

$$\frac{V_{OUT}}{\Delta i} = -\frac{1/g_{oFC}}{1 + s/\omega_{po}} \text{ where } \omega_{po} = \frac{g_{oFC}}{C_{g3}} \tag{4.37}$$

Here $g_{oFC}$ is the output conductance of the folded cascode and $C_{g3}$ is the total output capacitance, mostly dominated by the gate capacitance of the current source.

The second term is less trivial. We get an increase in the bias current of the differential pair $T_N$ given by $gm_3V_{IN}$, where $gm_3$ is the transconductance of the current source. This increase of bias current will increase $gm_1$, which in turn will increase $i_n$.
Continuing with a small signal model, we will write this term as

$$\frac{i_n}{V_{IN}} = k_V \, gm_3 \tag{4.38}$$

where $k_V$ reflects the increase of differential current due to $V$ for a small increase in the bias current. It can be seen that this term can be written as

$$k_V = \frac{1}{2} (g_m / I_D)_1 \, V \tag{4.39}$$

where $(g_m/I_D)_1$ is the $(g_m/I_D)$ ratio of the n-type differential pair. In our case, since $V = 10mV$, $k_V = 0.09$. Finally, the complete open loop transfer function is

$$\frac{V_{OUT}}{V_{IN}} = -\frac{k_V \frac{gm_3}{g_{oFC}}}{1 + \frac{s}{\omega_{po}}} F(s) \tag{4.40}$$

which can be rewritten as

$$\frac{V_{OUT}}{V_{IN}} = -\frac{A_0}{1 + s\frac{A_0}{\omega_T}}F(s) \tag{4.41}$$

where

$$A_0 = k_V \frac{gm_3}{g_{oFC}} \tag{4.42}$$

$$\omega_T = k_V \frac{gm_3}{C_{g3}} \tag{4.43}$$

which is a very reasonable result, since the gain-bandwidth product of our transfer is dominated by the total output capacitance of the folded cascode. Using these equations, the current source transistors of the n-type differential pairs can be sized. Here, we face a trade-off between speed and the length of the transistors: if we size them to have minimum length, $C_{g3}$ will be minimum, but unacceptable common mode rejection ratios are obtained. We must also take into account the position of the poles in $F(s)$, since the loop could become unstable.

#### 4.5.2 Bias Current Monitor

Another part of the constant-gm circuit that slows down the performance is the bias current monitor circuit. This circuit (see section 3.3.1 and Figure 3.7) monitors the bias current of the p-type differential pair in the input stage and generates a replica for the differential pair $T_P$ in the constant-gm circuit.

Figure 4.15: Settling time as a function of the input common mode with the redesign of the constant-gm circuit

In Figure 4.14 this circuit is represented by the voltage controlled current source $I_{BP}(V_{CM})$.
This circuit, formed by 3 current mirrors and a replica of the differential pair, has to be much faster than the rest of the constant-gm circuit. Then, the current mirrors will be sized to ensure that the frequency doublets they introduce lie at much higher frequencies. The problem with this circuit is that when $I_{BP}(V_{CM})$ tends to 0, all the current mirrors leave the saturation region and an unacceptable delay is introduced. Thus, a constant "bias" current has to be added in all these current mirrors in order to keep them always in the saturation region. Obviously, this solution will increase the total consumption. Therefore, the trade-off will be between the amount of current invested and the error in the copy of the current due to the maximum acceptable size for the mirror transistors.

#### 4.5.3 Redesign of the Constant-gm Circuit

Taking into consideration the limitations discussed in this section, we redesigned the current source transistor of the n-type differential pairs and the current mirrors of the bias current monitor. The size of the current source is W/L = 2/8 and the size of the current mirrors is W/L = 2/6. The added bias current is 100nA. A simulation using this redesign, which is far from being optimized, yields the results shown in Figure 4.15. It is clear that a very acceptable result was obtained, since the total variation due to the transition between differential pairs is less than 500ns, compared to almost $5\mu s$ in the previous design. In this redesign, the total current consumption increased by only 300nA, which is less than 5% of the previous total current consumption. Further analysis and simulations are due here, and this is certainly a topic for future research.

### 4.6 Conclusions

A new hierarchical automated synthesis algorithm was presented. This algorithm successfully synthesizes a rail-to-rail opamp cell to comply with a given total settling time with minimum power consumption.
The hierarchical approach allows us to synthesize each stage independently, decoupling the high level specifications from the actual implementation of each stage. This is a major advantage, since we can use different architectures in the input stage, for example, changing only the input stage synthesis.

The results obtained in section 4.4.1 prove that the developed algorithm successfully complies with the total settling time specification, and comparisons with other designs from the literature, presented in Chapter 5, will show that this is truly achieved with optimum consumption. A first example of the good power consumption achieved was presented in section 4.4.4, through the comparison with a design made using Algorithm 3.1, in spite of the notable differences between both designs.

Another major achievement was presented in section 4.4.2, where we proved the feasibility of tuning the performance of the opamp cell through the bias current. The results obtained showed that we can tune the settling time of the cell over more than 3 decades. Nevertheless, section 4.4.3 showed that, as expected, these "tuned" designs don't have optimum power consumption. Thus, regarding new designs, there is a trade-off between reuse with sub-optimum power consumption and quick re-synthesis of new optimum designs.

Finally, the two weaknesses seen in the performance of this cell have two different sources. One is a problem with the rail-to-rail input stage architecture, which we use in spite of its poor frequency response for common mode signals. The constant-gm circuit was analyzed and redesigned in section 4.5, with a significant improvement in the settling behavior. Nevertheless, a new rail-to-rail input stage was recently presented [6], with all the desirable characteristics of the one used here (constant-gm, robust and universal) but also with constant slew rate (large signal behavior) and a feed-forward scheme.
Then, taking advantage of the superior performance of this last architecture, we could redesign our amplifier with this new input stage by changing only the input stage synthesis to suit the different architecture. This is one fundamental achievement: the algorithm has a great degree of independence from the architecture used to implement each stage.

The other problem in the performance is that the current mirrors of the output stage were synthesized to operate almost in strong inversion, and thus we end up with an output stage that loses 150mV of the total output swing. This happened because the synthesis algorithm designed the current mirrors based only on their frequency response. As always, the algorithm can be improved, and a minimum allowable $(g_m/I_D)$ ratio for the current mirrors could easily be added.

# Chapter 5 Experimental Results

## 5.1 Rail-to-rail Operational Amplifier in $0.8\mu m$ CMOS Technology

We successfully tested the results obtained in section 4.4 with an experimental prototype of the opamp cell designed in section 4.4.1<sup>12</sup>. Figure 5.1 shows a microphotograph of the fabricated opamp prototype.

<sup>12</sup> There are some minor differences between the size of the transistors of the experimental prototype and those of the cell characterized with simulations in section 4.4. These differences are due to the use of the final, slightly improved algorithm for the latter. Refer to Appendix B for a table of the transistor sizes in the experimental prototype.

Figure 5.1: Opamp Cell Microphotograph

To test the opamp settling behavior we implemented an automatic measurement system using a PC with a GPIB<sup>13</sup> card to communicate with the instruments. The system is presented in Figure 5.2. We use the amplifier in a unity gain configuration, with a buffer at its output in order to control the capacitive and resistive load our amplifier has to drive. The amplifier used for the buffer was a JFET input TLE2071, which has a capacitive load of approximately 15pF at the input. The signal generator used was the HP3245A, with which we generated a square wave signal at different input common modes. Both input and output signals were recorded with a 500MHz, 5GS/s oscilloscope (Tektronix TDS3052). Finally, to precisely bias the opamp (D.U.T.: Device Under Test) over the whole tuning range, we used the Semiconductor Analyzer HP4155A. Then, using the GPIB bus, we could measure the settling behavior of the opamp along the input common mode range for every point of the tuning range.

<sup>13</sup> IEEE 488 Standard.

Figure 5.2: Settling time automatic measurement system

Using this system, we measured the total settling time at 5% for a 0.3V step amplitude. We added a capacitive load $C_L = 33pF$ which, along with the buffer input capacitance, gives a total load $C_{Ltot} \simeq 48pF$. Figure 5.3 shows the results of the measurements. Good performance was obtained, and the expected behavior of the constant-gm circuit is clearly present. We can also see that the settling behavior for the last tuning point ($I_{REF} = 120nA$) did not achieve the expected performance.

To better evaluate the settling behavior of the opamp, Figure 5.4 compares the settling time over the whole tuning range between the simulation of the synthesis result and the measurements made with the experimental prototype. Here we see that the prototype is a bit slower than the simulations. Nevertheless, good agreement is achieved between both results, except for the last tuning point, as anticipated by Figure 5.3. As we mentioned in section 4.4.2, at this last tuning point the amplifier is at the limit of its capabilities: critical transistors start to enter strong inversion, and the reusability hypothesis no longer holds.
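
The 5% total settling time used in these measurements can be extracted from a recorded step response along the following lines. This is only a minimal sketch, not the code of the measurement system: it assumes that the waveform is available as two lists of samples, that the 5% band is taken relative to the step amplitude, and that the settling time is counted from the instant of the input step. The function name `settling_time` and the synthetic first-order response used to exercise it are illustrative.

```python
import math


def settling_time(times, values, t_step, v_final, step_amplitude, band=0.05):
    """Time after t_step at which the response enters the band
    |v - v_final| <= band * |step_amplitude| and stays inside it
    until the end of the record. Returns None if it never settles."""
    tolerance = band * abs(step_amplitude)
    settled_at = None
    for t, v in zip(times, values):
        if t < t_step:
            continue
        if abs(v - v_final) > tolerance:
            settled_at = None      # left the band: discard earlier entry
        elif settled_at is None:
            settled_at = t         # (re)entered the band at this sample
    if settled_at is None:
        return None
    return settled_at - t_step


# Synthetic check (assumed values): first-order response to a 0.3 V step,
# time constant 0.3 us, sampled every 1 ns.
tau = 0.3e-6
dt = 1e-9
times = [i * dt for i in range(5000)]
values = [0.3 * (1.0 - math.exp(-t / tau)) for t in times]

t_set = settling_time(times, values, t_step=0.0, v_final=0.3, step_amplitude=0.3)
print(f"5% settling time: {t_set * 1e6:.3f} us")
```

For a purely first-order response the result should be close to $\tau \ln(20)$, which is a convenient sanity check. An oscillatory response, such as the ones discussed next, leaves and re-enters the band several times, which is why the sketch keeps only the last entry into the band.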
The experimental prototype seems to have a degraded frequency response, since the step responses are more oscillatory than expected, particularly for this last tuning point. This might be caused by the parasitic capacitances introduced by the metal connections in sensitive nodes. According to the circuit extracted from the layout, these parasitic capacitances are a few tens of fF, which is of the same order as the drain junction parasitic capacitances calculated in the synthesis algorithm. Therefore, the position of the non-dominant poles could vary significantly. This is visible in the simulation of the extracted amplifier including the parasitic capacitances of the metal connections: the circuit presented a phase margin of only $55^{\circ}$, whereas the same extracted circuit without those parasitic capacitances has a phase margin above $60^{\circ}$. To confirm that this is the reason for the minor oscillations observed in the settling behavior, we measured the settling time with a load capacitance $C_L = 22pF$, which gives a total load of $C_{Ltot} \simeq 37pF$. The oscillations were almost completely gone, which means that, indeed, our amplifier has a degraded phase margin with respect to the simulations.

Figure 5.3: Total settling time tuning as a function of the input common mode range

Figure 5.4: Comparison between the simulated and experimental total settling time tuning for three different input common modes

Figure 5.5: Settling time as a function of the total quiescent current consumption.

Figure 5.5 shows the amplifier's speed-power trade-off. We can see that the amplifier cannot achieve total settling times below about $1\mu s$, and also that for slow settling times we obtain very low power consumption. Two remarks are due here. One, as we showed in section 4.4.3, the power consumption at a particular tuning point could be greatly improved if we synthesized the amplifier specifically for that point.
Two, at the slowest tuning point shown ($100\mu s$), the consumption is noticeably above the expected value of approximately 100nA. This is because the bias circuit that generates the two constant input voltages for the differential pairs of the constant-gm circuit (see section 3.3.1) consumes 60nA independently of the bias current. This consumption is negligible in the $1\mu s$ settling time design, but almost equals the rest of the current consumption at this last tuning point.

Table 5.1 presents a comparison between the simulated and experimental results of the opamp cell at its design point ($I_{REF} = 40nA$). We chose three different common mode levels where we can appreciate the behavior of the opamp when only the p-type differential pair is acting ($V_{CM} = -0.7V$), and before and after the undesirable effect of the constant-gm circuit ($V_{CM} = 0.0V$ and $V_{CM} = +0.7V$).

| Parameter | Sim. ($V_{CM} = -0.7V$) | Meas. ($V_{CM} = -0.7V$) | Sim. ($V_{CM} = 0.0V$) | Meas. ($V_{CM} = 0.0V$) | Sim. ($V_{CM} = +0.7V$) | Meas. ($V_{CM} = +0.7V$) | Notes |
|------------------------------|-------|-------|-------|-------|-------|-------|-------------------|
| $I_{DD}$ ($\mu A$) | 9.68 | 9.83 | 10.34 | 10.17 | 10.96 | 10.43 | |
| $A_0$ (dB) | 167 | > 120 | 167 | > 120 | 167 | > 120 | 1, $R_L = \infty$ |
| $t_{set}$ 5%, rise ($\mu s$) | 0.91 | 0.91 | 1.02 | 0.81 | 0.95 | 1.13 | 2, $C_L = 48pF$ |
| $t_{set}$ 5%, fall ($\mu s$) | 1.9 | 4.9 | 2.19 | 5.8 | 1.64 | 1.7 | 2, $C_L = 48pF$ |
| $SR$, rise ($V/\mu s$) | 0.42 | 0.48 | 0.45 | 0.53 | 0.52 | 0.38 | $C_L = 48pF$ |
| $SR$, fall ($V/\mu s$) | 0.32 | 0.45 | 0.39 | 0.57 | 0.30 | 0.38 | $C_L = 48pF$ |
| Offset (mV) | | 8.6 | | 8.1 | | 16.7 | 3 |
| Area ($mm^2$) | 0.083 | | | | | | |

<sup>1</sup>: Measured in a previous version of the prototype.
<sup>2</sup>: Input step has 0.3V amplitude.
<sup>3</sup>: Measured in follower configuration.

Table 5.1: Opamp Cell characteristics.

The consumption is in excellent agreement with the expected results. As we saw in Figure 5.4, the rise settling time also shows acceptable agreement. However, the settling response is more oscillatory because of the degraded phase margin of the prototype and, although this effect is almost negligible in the rise response, it has a serious impact on the fall settling behavior. As we mentioned in section 4.4.1, the output stage has a known oscillatory effect in the fall settling due to the turn-on delay of the current mirrors [3]. In the work by Silveira [3], this effect, although undesired, did not have a severe consequence on the total fall settling time, because the fall slew rate was almost three times higher than the rise slew rate and the phase margin was $68^{\circ}$. In our work, both slew rates are approximately equal and, most importantly, the phase margin is severely degraded. In the lower half of the input common mode range, where we expected to have over $60^{\circ}$, we only have $55^{\circ}$. The oscillations caused by the output stage then take much longer to die out, yielding fall settling times between 5 and 6 times longer than the rise settling times. This is not the case in the upper half of the input common mode range where, although the phase margin is also degraded, we expected to have almost $70^{\circ}$, due to the non-symmetrical behavior of the folded cascode non-dominant poles, and we end up (always according to simulations) with approximately $60^{\circ}$. Regarding the settling behavior, there is no big difference whether the phase margin is $60^{\circ}$ or $70^{\circ}$, and thus for $V_{CM} = +0.7V$ the settling time agrees with the simulated results.

Measured offset values are acceptable, and a noticeable increase is observed in the upper half of the input common mode range. This is not unexpected, since different input stages are acting in each half of the input common mode range. Moreover, not only does each stage have a different type of transistor, but each differential pair also has a different load circuit. Figure 5.6 shows the offset voltage for different bias currents. Once again, we see that the transition between the input stages is clearly a function of the reference current, and hence a function of the saturation voltage of the current source of the p-type differential pair.

Figure 5.6: Offset voltage as a function of the input common mode.

## 5.2 Comparison with other published results

In this section we compare the performance of our amplifier with other published results. Several figures of merit have been proposed to evaluate the speed-power trade-off achieved by operational amplifiers. In reference [55] the gain-bandwidth divided by the consumed power is used. However, it makes sense to include the load capacitance in the trade-off. Therefore, references [7,8] use the following figure of merit

$$FOM_S\left(\frac{MHz.pF}{mW}\right) = \frac{GBW.C_L}{power} \tag{5.1}$$

However, this figure only compares the small signal behavior of the amplifier. Therefore, we will also use another figure of merit [8] that takes into account the large signal behavior of the speed-power trade-off, using the slew rate instead of the gain-bandwidth product,

$$FOM_L\left(\frac{pF.V/\mu s}{mW}\right) = \frac{SR.C_L}{power} \tag{5.2}$$

where the average SR of the amplifier is taken.
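
As a quick check of equations (5.1) and (5.2), the short sketch below evaluates both figures of merit with the values reported for this work in Table 5.2 (0.021mW, 50pF, 0.85MHz and 0.47V/$\mu s$). The function names are illustrative and simply transcribe the two definitions, with the units used in the text.

```python
def fom_small_signal(gbw_mhz, c_load_pf, power_mw):
    """Equation (5.1): GBW * C_L / power, in MHz.pF/mW."""
    return gbw_mhz * c_load_pf / power_mw


def fom_large_signal(sr_v_per_us, c_load_pf, power_mw):
    """Equation (5.2): SR * C_L / power, in pF.V/us per mW."""
    return sr_v_per_us * c_load_pf / power_mw


# "This work" row of Table 5.2
fom_s = fom_small_signal(gbw_mhz=0.85, c_load_pf=50.0, power_mw=0.021)
fom_l = fom_large_signal(sr_v_per_us=0.47, c_load_pf=50.0, power_mw=0.021)
print(round(fom_s), round(fom_l))  # prints: 2024 1119
```

Both results match the figures listed for this work in Table 5.2.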

| | Power (mW @ $V_{DD}$) | Load ($k\Omega$/pF) | GBW (MHz) | PM (°) | SR ($V/\mu s$) | $\mathrm{FOM_S}$ $\left(\frac{MHz.pF}{mW}\right)$ | $\mathrm{FOM_L}$ $\left(\frac{pF.V/\mu s}{mW}\right)$ |
|-----------|---------------|----------------|-------|------|-------------|----------------------------------|--------------------------------------|
| This work | 0.021@2 | $\infty/50$ | 0.85 | > 55 | 0.47 | 2024 | 1119 |
| [3](1) | 0.014@2 | $\infty/10$ | 0.75 | 68 | 0.48 | 536 | 343 |
| [3](2) | 0.048@2 | 10/22 | 1.4 | 50 | 1.15 | 642 | 527 |
| [5] | 0.46@1.5 | $\infty/15$ | 1.3 | 64 | 1 | 42 | 33 |
| [6] | 4.8@3 | 0.56/33 | 17.5 | 60 | 16.27 | 93 | 72 |
| [7] | 6.9@3 | $\infty/40$ | 47 | 76 | 69 | 272 | 400 |
| [8] | 2.45@3 | $\infty/300$ | 10.4 | 63.7 | 3.5 | 1273 | 429 |

Table 5.2: Comparison of the performance of the Opamp Cell.

Table 5.2 presents the comparison of the performance of the opamp cell against several other published examples.

Reference [3] presents the design of several amplifiers. Both examples taken from it are amplifiers designed in $2\mu m$ FD-SOI technology, which is considered [3] to be "comparable" with our $0.8\mu m$ bulk technology. The first example is a simple Miller OTA designed for optimum power consumption using Algorithm 3.1. This is the same amplifier used for the comparison made in section 4.4.4, and we can see that the more efficient output stage allowed our design to outperform the power efficiency of this design in spite of our rail-to-rail input stage.

The second example from reference [3] is a Miller OTA with the same class AB output stage used in this work. This second amplifier was also designed for optimum power consumption, using a gain-bandwidth driven algorithm similar to Algorithm 4.3. An interesting result is that, since the gain-bandwidth is 1.4MHz, the optimum mirror gains found for this amplifier were approximately the same as those found in our design (h = 1, k = 8, m = 3).
The difference is that the optimizing algorithm used there did not jointly take into account the internal and external slew rates. Thus the algorithm "over-dimensioned" the output stage current, while the slew rate was determined by the input stage. This explains why our design has a better figure of merit.

Reference [5] is an amplifier with a constant-gm rail-to-rail input stage designed for low-voltage operation, but with no special consideration for consumption. The design was made in $0.7\mu m$ standard CMOS technology and presents gain-bandwidth and slew rate values very similar to ours. However, this comparison shows how far from optimum consumption a design can be if consumption is not taken into consideration during design.

Reference [6] presents an amplifier with a constant-gm, constant slew rate input stage that complies with all the desirable characteristics seen in section 3.2.2. As we mentioned in section 4.6, this input stage has no restrictions on its operating frequency range, since it uses a feed-forward scheme. What is more interesting about this input stage is that it works with approximately the same number of copies of the input stage bias current as the input stage used in this work. Thus it could feasibly be used with our algorithm, since it would not jeopardize our power efficiency. The opamp designed in reference [6] includes a class AB output stage that gives the amplifier high-drive capabilities and makes the design appropriate for video applications. Nevertheless, both figures of merit show that the design is quite far from being optimum in a power consumption sense, in spite of some minor considerations mentioned by the authors and its drive capabilities.

Reference [7] presents a three-stage opamp implemented in $0.6\mu m$ n-well CMOS technology. The work is focused on the development of embedded frequency compensation networks. The amplifier achieves high bandwidth and fast slewing for a capacitive load of 40pF.
The authors accurately report a good power efficiency using the figure of merit presented in equation (5.1). Even so, Table 5.2 shows that our design achieves higher values for both figures of merit, which is another example of the truly optimum power design obtained with the algorithm developed in this work.

Finally, we took the amplifier designed in reference [8] to compare our work with an amplifier specially designed to drive heavy capacitive loads (300pF) and thereby show that our higher figures of merit are not explained only by the relatively high capacitive load driven by our output stage. We can see in Table 5.2 that our amplifier still has a better power efficiency.

## 5.3 Conclusions

This chapter has shown the experimental results of the reusable opamp cell synthesized in Chapter 4. Excellent agreement between the settling time specifications and the measurements was obtained, in spite of the expected effect due to the constant-gm circuit. The power-speed trade-off can be successfully tuned over more than 3 decades, extending beyond a $100\mu s$ settling time design that has a total current consumption below 180nA. A degraded phase margin, partially due to small parasitic capacitances in sensitive nodes of the circuit, prevented us from achieving settling times below $1\mu s$; nevertheless, the cell proved to have excellent tuning capabilities.

We took the usual figures of merit to measure and compare the power-speed trade-off against several examples from the literature. Our design achieves very high efficiency figures, which supports the optimization achieved by the synthesis algorithm. In the next chapter we will review the conclusions and goals of the whole work, along with some open lines of future research.

# Chapter 6 Conclusions

This thesis presented the development of an automatic synthesis algorithm intended for micropower operational amplifiers. This algorithm is based on previous work on this subject that originated with the $(g_m/I_D)$ methodology proposed and developed by Prof.
Paul Jespers and the staff of the Microelectronics Laboratory at the Université Catholique de Louvain (UCL). The basic automatic synthesis algorithm developed at UCL and presented in Chapter 2 (Algorithm 2.1) is gain-bandwidth driven. It has been used to introduce the application of the $(g_m/I_D)$ methodology and the idea of design space exploration (Algorithm 2.2), which has been further improved in this work. This idea allows us to obtain optimum combinations of the $(g_m/I_D)$ ratios not only regarding speed-power trade-offs but in any sense needed by the designer. We have presented here how to apply these algorithms using a single-piece, continuous MOSFET model that allows us to explore different trade-offs in the selection of several previously unexplored design variables.

Nevertheless, these algorithms were based on amplifier specifications, which prevents us from achieving automatic synthesis starting from specifications at an application level. The work done by Prof. Silveira [3] provided a means to go from high level specifications (settling time) to the amplifier specifications and then to transistor sizing using the $(g_m/I_D)$ methodology. The resulting power optimization algorithm (Algorithm 3.1) was applied to a simple RC compensated Miller amplifier, and several results were obtained, in particular the proof of the existence of an optimum consumption design point. Prof. Silveira [3] also presented a new approach for the design of a class AB output stage that exploits a transconductance multiplication effect. Extending the power optimization algorithm to include amplifier architectures such as this class AB output stage and rail-to-rail input stages was therefore of particular interest in the field of analog design automation for low-voltage, power-critical systems.
In this work we successfully filled this need by developing a hierarchical synthesis algorithm (Algorithm 4.1) based on the power optimization algorithm presented by Silveira [3], but decoupling to a great degree the synthesis of each stage from the high level synthesis of the amplifier. This allows us to use the same algorithm, with minor changes, for an amplifier with a different input or output stage architecture.

To further advance in the field of analog design automation, we explored and reviewed the options in analog design reuse, including a brief review of technology migration. We proved that an amplifier cell designed to operate in weak or near-weak inversion can have its speed-power trade-off tuned over several decades without loss of performance in other aspects. This was verified in an experimental prototype designed using the algorithm developed in this work.

The problems presented by the prototype were mostly due to the selected architectures, especially the selected rail-to-rail input stage, which failed to perform as expected. After further analysis and redesign of this stage, an important improvement in the settling behavior was achieved. However, this improvement could not be included in the fabricated prototype. A second problem appeared due to the sensitivity of some nodes to parasitic capacitances as small as a few tens of fF. This can be improved with a more careful layout design. Nevertheless, the loss of performance caused by this problem would have been negligible were it not for the third and last problem we encountered. This last issue concerns a known oscillatory effect of the class AB output stage. Silveira [3] suggested an auxiliary circuit to overcome this problem. However, it did not work in our design and it was finally dropped from the prototype.
An alternative was briefly and unsuccessfully searched for, and finally the circuit was fabricated with the known deficiency, since its solution was not a critical objective of this work.

In summary, the main results of this work are:

- It thoroughly reviewed the simple automatic synthesis algorithms for amplifiers, applying a continuous MOSFET model that uses simple single-piece equations.
- It presented and experimentally verified the possibility of performance tuning through the bias current in amplifiers.
- It verified the existence of an optimum compensation capacitance in Miller amplifiers and developed an expression, new to the best of our knowledge, to directly estimate its value. Previous algorithms swept predefined values of the capacitance and synthesized the whole amplifier for each of them to obtain the optimum, with the obvious penalty in processing time.
- It presented a new approach to develop a hierarchical automatic synthesis algorithm that allows the designer to easily decouple the high level synthesis from the synthesis of each stage. The approach was applied in the development of a hierarchical algorithm, based on the settling time driven algorithm presented by Silveira [3], for a rail-to-rail amplifier with a power efficient class AB output stage.
- Despite some problems encountered with the performance of the architecture, the synthesis algorithm was successfully verified with an experimental prototype that complies with the design specifications with truly optimal power consumption.
- Additional results were obtained in the extension of the design space exploration algorithm (Chapter 2) and in the design methodology for the cascode bias circuit presented in Appendix A.

## Future Work

In order to further optimize the resulting amplifier, the constant speed performance over the whole input common mode range could be improved by extending the analysis of the constant-gm circuit.
Alternatively, the use of recently proposed rail-to-rail input stages [6] that overcome the deficiencies of the selected input stage could be considered.

Of course, there is still much to do in the direction of automatic analog design. The hierarchical approach presented here should be extended to other amplifier architectures, to prove its major advantages in power consumption optimization. It would also be interesting to explore the use of these opamp design techniques and the reuse concept in a system-level example, to show the power optimization capabilities in a whole system.

# Appendix A Low-Voltage Cascode Bias Transistor Design

The summing circuits present in the input stage of the opamp are implemented with a folded cascode stage. In these circuits, two voltages must be generated in order to bias the cascode transistors. Figure A.1 shows the circuit used, where M2 is the cascode transistor, M1 is the transistor to be cascoded and M3 is a diode-connected transistor used to bias M2. An equivalent pMOS circuit is used to bias the pMOS cascode transistors. This circuit was first proposed in [56], and a first study of its design, using the EKV model, can be found in [57].

The basic idea proposed in [57] is to fix the drain-source voltage of M1 close to the drain-source saturation voltage by choosing an adequate operating point for M3 as a function of the operating points of M1 and M2. In [57], transistors M1 and M2 were assumed to be working in strong inversion (M1) and weak inversion (M2), and the EKV expressions in those limits were used to design M3. Here we propose a set of design equations, new as far as we know, based on a continuous model valid in all regions of operation (ACM), which hence does not depend on the operating point of the transistors.
If $I_b$ is the bias current through M2 and $I_b/k$ is the bias current through M3, we can define a first relation between the operating points of M2 and M3 using equation (2.3):

$$\frac{I_{D2}}{I_{D3}} = k = \frac{i_{f2}(W/L)_2}{i_{f3}(W/L)_3} \tag{A.1}$$

then

$$i_{f3} = \frac{i_{f2}(W/L)_2}{k(W/L)_3} \tag{A.2}$$

Figure A.1: Cascode transistor bias.

Recalling equation (2.6), we can write the pinch-off voltage for transistors M2 and M3 as:

$$V_{P2} = V_{D1} + \phi_T f(i_{f2}) \tag{A.3}$$

$$V_{P3} = \phi_T f(i_{f3}) \tag{A.4}$$

where

$$f(i_f) = \sqrt{1 + i_f} - 2 + \ln\left(\sqrt{1 + i_f} - 1\right) \tag{A.5}$$

In equation (A.3), $V_{D1}$ is the drain voltage of M1 and, as we said, the criterion will be to fix it close to the saturation voltage. Using equation (2.8), in the case $\varepsilon = 1\%$, we can define $V_{D1}$ as

$$V_{D1} = V_{DSsat1} + \Delta V_{margin} = \phi_T \left( \sqrt{1 + i_{f1}} + 3 \right) + \Delta V_{margin} \tag{A.6}$$

where $\Delta V_{margin}$ defines how close we want the drain voltage to be to the saturation voltage. Since $V_{G2} = V_{G3}$, then $V_{P2} = V_{P3}$, and combining equations (A.3) to (A.6) we obtain the following expression that relates the inversion factors of the three transistors:

$$\sqrt{1+i_{f3}} - \sqrt{1+i_{f2}} - \sqrt{1+i_{f1}} + \ln\left(\frac{\sqrt{1+i_{f3}}-1}{\sqrt{1+i_{f2}}-1}\right) = 3 + \frac{\Delta V_{margin}}{\phi_T} \tag{A.7}$$

Then, using equations (A.2) and (A.7), we can develop the following method to design transistor M3. First, we define $\Delta V_{margin}$ and, using equation (A.7), obtain the inversion level for transistor M3. Then, we define the factor k according to the current budget and, using equation (A.2), obtain the $(W/L)_3$ ratio. It is worth noting that both equations are completely independent of the technology used, and so become powerful design tools for this circuit.
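
The two-step method above can be sketched numerically as follows. This is only an illustration under stated assumptions, not the code used in this work: equation (A.7) has no closed-form solution for $i_{f3}$, so the sketch solves it by bisection after the substitution $u = \sqrt{1+i_{f3}} - 1$, which turns (A.7) into $u + \ln u = c$ with a monotonically increasing left-hand side. The function names (`acm_f`, `solve_if3`, `ratio_m3`) and the numerical values of $i_{f1}$, $i_{f2}$, $(W/L)_2$ and $k$ are hypothetical, and the margin is expressed in units of $\phi_T$.

```python
import math


def acm_f(i_f):
    """Equation (A.5): normalized pinch-off function f(i_f)."""
    root = math.sqrt(1.0 + i_f)
    return root - 2.0 + math.log(root - 1.0)


def solve_if3(if1, if2, margin_over_phit):
    """Solve equation (A.7) for i_f3 by bisection.

    With u = sqrt(1 + i_f3) - 1, (A.7) reads u + ln(u) = c, where
    c = 2 + margin + sqrt(1+i_f2) + sqrt(1+i_f1) + ln(sqrt(1+i_f2) - 1).
    """
    # sqrt(1 + if2) - 1, written to avoid cancellation for small if2
    u2 = if2 / (math.sqrt(1.0 + if2) + 1.0)
    c = (2.0 + margin_over_phit + math.sqrt(1.0 + if2)
         + math.sqrt(1.0 + if1) + math.log(u2))

    def h(u):
        return u + math.log(u) - c

    lo, hi = 1.0, 1.0
    while h(lo) > 0.0:      # move the lower bound down until h(lo) <= 0
        lo *= 0.5
    while h(hi) < 0.0:      # move the upper bound up until h(hi) >= 0
        hi *= 2.0
    for _ in range(200):    # plain bisection on the bracketed root
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    u = 0.5 * (lo + hi)
    return (u + 1.0) ** 2 - 1.0


def ratio_m3(if2, if3, wl2, k):
    """Equation (A.2) rearranged for (W/L)_3."""
    return if2 * wl2 / (k * if3)


# Hypothetical operating points, chosen only to exercise the equations
if1, if2 = 8.0, 3.0
if3 = solve_if3(if1, if2, margin_over_phit=4.0)
wl3 = ratio_m3(if2, if3, wl2=10.0, k=1.0)
print(f"i_f3 = {if3:.2f}, (W/L)_3 = {wl3:.3f}")
```

With these assumed inputs the solver places M3 well into strong inversion, which is what one would expect, since M3 must generate a gate voltage high enough to keep M1 saturated with margin. A quick consistency check is that `acm_f(if3) - acm_f(if2)` equals $\sqrt{1+i_{f1}} + 3 + \Delta V_{margin}/\phi_T$, which is equation (A.7) before rearranging.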
The criterion used in this work was to take $\Delta V_{margin} = 4\phi_T$.

# Appendix B

| Block | Transistor | W ($\mu m$) | L ($\mu m$) | M | $(g_m/I_D)$ |
|-------------------|-----------|------|-----|-----|------|
| Differential Pair | C.M. (current mirrors) | 7.7 | 20 | | 15 |
| Differential Pair | D.P. (differential pair transistors) | 8.9 | 10 | 1 | 18 |
| Folded Cascodes | $M_{FC1}$ | 7.7 | 20 | 4 | 15 |
| Folded Cascodes | $M_{FC3}$ | 2.5 | 2 | 1 | 22.8 |
| Folded Cascodes | $M_{FC5}$ | 4.5 | 2 | 1 | 21.9 |
| Folded Cascodes | $M_{FC7}$ | 2 | 3.2 | 1 | 16.1 |
| Output Stage | Ma | 13.6 | 1 | 1 | 15 |
| Output Stage | Mf | 11.2 | 1 | 1 | 15 |
| Output Stage | Mb | 2 | 1 | 8.5 | 12.3 |
| Output Stage | Mc | 2 | 1 | 1 | 12.3 |
| Output Stage | Md | 2 | 1 | 3 | 21.7 |
| Output Stage | Me | 2 | 1 | 1 | 21.7 |
| Output Stage | $Mos_1$ | 4.1 | 10 | 1 | 3.4 |
| Output Stage | $Mos_2$ | 4.1 | 10 | 2 | 3.4 |

Table B.1: Transistor sizes in the experimental prototype. M is the number of parallel transistors.

# Bibliography

- [1] F. Silveira, D. Flandre, and P. G. Jespers, "A $(g_m/I_D)$ based methodology for the design of CMOS analog circuits and its application to the synthesis of a silicon-on-insulator micropower OTA," *IEEE Journal of Solid-State Circuits*, vol. 31, no. 9, pp. 1314–1319, Sept. 1996.
- [2] P. G. Jespers, "Interfacing microsystems," IberChip, Montevideo, Uruguay, Course Session 2.2, Mar. 2001.
- [3] F. Silveira and D. Flandre, *Low Power Analog CMOS for Cardiac Pacemakers: Design and Optimization in Bulk and SOI Technologies*, ser. The Kluwer International Series in Engineering and Computer Science. Boston: Kluwer Academic Publishers, Jan. 2004, vol. 758, ISBN 1-4020-7719-X.
- [4] "LM4250 Programmable operational amplifier," National Semiconductor Corp., Data Sheet, Aug. 2000.
- [5] G. Ferri and W.
Sansen, "A rail-to-rail constant-gm low-voltage CMOS operational transconductance amplifier," *IEEE Journal of Solid-State Circuits*, vol. 32, no. 10, pp. 1563–1567, Oct. 1997.
- [6] J. M. Carrillo, J. F. Duque-Carrillo, G. Torelli, and J. L. Ausín, "Constant-gm constant-slew-rate high-bandwidth low-voltage rail-to-rail CMOS input stage for VLSI cell libraries," *IEEE Journal of Solid-State Circuits*, vol. 38, no. 8, pp. 1364–1372, Aug. 2003.
- [7] H. Ng, R. Ziazadeh, and D. Allstot, "A multistage amplifier technique with embedded frequency compensation," *IEEE Journal of Solid-State Circuits*, vol. 34, no. 4, pp. 339–347, Mar. 1999.
- [8] P. Chan and Y. Chen, "Gain-enhanced feedforward path compensation technique for pole-zero cancellation at heavy capacitive loads," *IEEE Transactions on Circuits and Systems, Part II: Analog and Digital Signal Processing*, vol. 50, no. 12, pp. 933–940, Dec. 2003.
- [9] R. Acosta, F. Silveira, and P. Aguirre, "Experiences on analog circuit technology migration and reuse," in *Proc. XV Symposium on Integrated Circuits and Systems Design, SBCCI2002*. Porto Alegre, Brazil: IEEE Computer Press ISBN 0-7695-1807-9/02, Sept. 2002, pp. 169–174.
- [10] C. Galup-Montoro, M. Schneider, and A. Cunha, "A current-based MOSFET model for integrated circuit design," in *Low-Voltage / Low-Power Integrated Circuits and Systems: Low-Voltage Mixed-Signal Circuits*, E. Sanchez-Sinencio and A. Andreou, Eds. IEEE Press, ISBN 0-7803-3446-9, 1999, ch. 2, pp. 7–55.
- [11] A. Cunha, M. Schneider, and C. Galup-Montoro, "An MOS transistor model for analog circuit design," *IEEE Journal of Solid-State Circuits*, vol. 33, no. 10, pp. 1510–1519, Oct. 1998.
- [12] P. L. Levin and R. Ludwig, "Crossroads for mixed-signal chips," *IEEE Spectrum*, pp. 38–43, Mar. 2002.
- [13] C. Enz, F. Krummenacher, and E.
Vittoz, "An analytical MOS transistor model valid in all regions of operation and dedicated to low-voltage and low-current applications," *Analog Integrated Circuits and Signal Processing*, vol. 8, pp. 83–114, 1995.
- [14] Y. Tsividis, *Operation and Modeling of the MOS Transistor*. New York: McGraw-Hill, 1987.
- [15] A. Arnaud and C. Galup-Montoro, "Simple noise formulas for MOS analog design," in *Proc. Int. Symposium on Circuits and Systems (ISCAS)*, vol. I, Phoenix, USA, May 2002, pp. 189–192.
- [16] M. Hasan, H.-H. Shen, D. Allee, and M. Pennell, "A behavioral model of a 1.8-V flash A/D converter based on device parameters," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 19, no. 1, pp. 69–82, Jan. 2000.
- [17] B. Ray, P. Chaudhuri, and P. Nandi, "Efficient synthesis of OTA network for linear analog functions," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 21, no. 5, pp. 517–533, Mar. 2002.
- [18] D. Stoppa, A. Simoni, L. Gonzo, M. Gottardi, and G.-F. D. Betta, "Novel CMOS image sensor with 132-dB dynamic range," *IEEE Journal of Solid-State Circuits*, vol. 37, no. 12, pp. 1846–1852, Dec. 2002.
- [19] D. Binkley, C. Hopper, S. Tucker, B. Moss, J. Rochelle, and D. Foty, "A CAD methodology for optimizing transistor current and sizing in analog CMOS design," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 22, no. 2, pp. 225–237, Feb. 2003.
- [20] P. Aguirre and F. Silveira, "Design of a reusable rail-to-rail operational amplifier," in *Proc. XVI Symposium on Integrated Circuits and Systems Design, SBCCI2003*. São Paulo, Brazil: IEEE Computer Press ISBN 0-7695-2009-X, Sept. 2003, pp. 20–25.
- [21] F. Silveira, D. Flandre, and P. Aguirre, "Conception optimale et réutilisable d'OTAs pour dispositifs médicaux implantables," in *Actes du Colloque TAISA 2003*, Louvain-la-Neuve, Belgique, Sept. 2003.
- [22] F. Silveira and D.
Flandre, "Operational amplifier power optimization for a given total (slewing plus linear) settling time," in *Proc. XV Symposium on Integrated Circuits and Systems Design, SBCCI2002*. Porto Alegre, Brazil: IEEE Computer Press ISBN 0-7695-1807-9/02, Sept. 2002.
- [23] F. Silveira and D. Flandre, "A 110nA pacemaker sensing channel in CMOS on Silicon-on-Insulator," in *Proc. Int. Symposium on Circuits and Systems (ISCAS)*, vol. V, Phoenix, USA, May 2002, pp. 181–184.
- [24] F. Silveira and D. Flandre, "Analysis and design of a family of low-power class AB operational amplifiers," in *Proc. XIII Symposium on Integrated Circuits and Systems Design, SBCCI2000*. Manaus, Brazil: IEEE Computer Press ISBN 0-7695-0843-X/00, Sept. 2000.
- [25] L. Reyes, D. Perciante, and F. Silveira, "Amplificador pasabajos diferencial a capacitores conmutados para aplicaciones biomédicas implantables," in *Proceedings VI Workshop de Iberchip*, São Paulo, Brazil, Mar. 2000.
- [26] A. Arnaud, M. Barú, G. Picún, and F. Silveira, "Design of a micropower signal conditioning circuit for a piezoresistive acceleration sensor," in *Proc. Int. Symposium on Circuits and Systems (ISCAS)*, vol. 1, Monterey, USA, May 1998, pp. 269–272.
- [27] A. Arnaud, O. de Oliveira, L. Reyes, D. Perciante, C. Rossi, and F. Silveira, "Diseño de circuitos a capacitores conmutados para adquisición de señales biomédicas," in *Proceedings V Workshop de Iberchip*, Lima, Peru, Mar. 1999.
- [28] M. Barú, H. Valdenegro, C. Rossi, and F. Silveira, "An ASK demodulator in CMOS technology," in *Proceedings IV Workshop de Iberchip*, Mar del Plata, Argentina, Mar. 1998.
- [29] A. Arnaud, M. Barú, O. de Oliveira, P. Mazzara, G. Picún, C. Rossi, F. Silveira, and H. Valdenegro, "Circuitos analógicos de microconsumo y baja tensión de alimentación," in *Proceedings IV Workshop de Iberchip*, Mar del Plata, Argentina, Mar. 1998.
- [30] A. Arnaud and F.
Silveira, "The design methodology of a sample and hold for a low-power sensor interface circuit," in *Proceedings X Brazilian Symposium on Integrated Circuit Design*, Gramado, Brazil, Aug. 1997.
- [31] H. Valdenegro, "Diseño de un amplificador operacional CMOS de alta ganancia y muy bajo consumo," in *Proceedings III Workshop de Iberchip*, Mexico, Feb. 1997.
- [32] D. Flandre, F. Silveira, J. Eggermont, B. Gentinne, V. Dessard, A. Viviani, D. Baldwin, L. Demeus, and P. G. Jespers, "Design automation of CMOS OTAs using symbolic analysis and gm/ID methodology," in *Proceedings 4th International Workshop on Symbolic Methods and Applications to Circuit Design*, Belgium, Oct. 1996.
- [33] A. Afzalian and D. Flandre, "Modeling of the bulk versus SOI CMOS performances for the optimal design of APS circuits in low-power low-voltage applications," *IEEE Transactions on Electron Devices*, vol. 50, no. 1, pp. 106–110, Jan. 2003.
- [34] J. Colinge, "Fully-depleted SOI CMOS for analog applications," *IEEE Transactions on Electron Devices*, vol. 45, no. 5, pp. 1010–1016, May 1998.
- [35] D. Flandre, A. Viviani, J.-P. Eggermont, B. Gentinne, and P. Jespers, "Improved synthesis of gain-boosted regulated-cascode CMOS stages using symbolic analysis and gm/ID methodology," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 5, pp. 1010–1016, May 1998.
- [36] M. Pelgrom, A. Duinmaijer, and A. Welbers, "Matching properties of MOS transistors," *IEEE Journal of Solid-State Circuits*, vol. 24, no. 5, pp. 1433–1440, Oct. 1989.
- [37] K. R. Laker and W. M. Sansen, *Design of Analog Integrated Circuits and Systems*. New York: McGraw-Hill, 1994, ch. 6, pp. 622–627.
- [38] J. H. Huijsing and D. Linebarger, "Low-voltage operational amplifier with rail-to-rail input and output ranges," *IEEE Journal of Solid-State Circuits*, vol. SC-20, pp. 1144–1150, Dec. 1985.
- [39] R. Hogervorst, R. Wiegerink, P. de Jong, J. Fonderie, R. Wassenaar, and J. H.
Huijsing, "CMOS low-voltage operational amplifiers with constant gm rail-to-rail input stage," in *Proc. Int. Symposium on Circuits and Systems (ISCAS)*, vol. 6, 1992, pp. 2876–2879.
- [40] S. Sakurai and M. Ismail, *Low-Voltage CMOS Operational Amplifiers*. Kluwer Academic Publishers, 1995.
- [41] R. Hogervorst, J. P. Tero, and J. H. Huijsing, "A programmable 3V CMOS rail-to-rail opamp with gain boosting for driving heavy resistive loads," in *Proc. Int. Symposium on Circuits and Systems (ISCAS)*, 1995, pp. 1544–1547.
- [42] —, "Compact CMOS constant-gm rail-to-rail input stage with gm-control by an electronic zener diode," *IEEE Journal of Solid-State Circuits*, vol. 31, no. 7, pp. 1035–1040, July 1996.
- [43] W. Redman-White, "A high bandwidth constant gm and slew-rate rail-to-rail CMOS input circuit and its application to analog cells for low voltage VLSI systems," *IEEE Journal of Solid-State Circuits*, vol. 32, no. 5, pp. 701–702, May 1997.
- [44] V. I. Prodanov and M. M. Green, "Bipolar/CMOS (weak inversion) rail-to-rail constant-gm input stage," *Electronics Letters*, vol. 33, no. 5, pp. 386–387, Feb. 1997.
- [45] —, "New CMOS universal constant-gm input stage," in *Proc. of the IEEE Int. Conference on Electronics, Circuits and Systems*, vol. 2, 1998, pp. 359–362.
- [46] J. F. Duque-Carrillo, J. M. Carrillo, J. L. Ausín, and E. Sánchez-Sinencio, "Robust and universal constant gm circuit technique," *Electronics Letters*, vol. 38, no. 9, pp. 396–397, Apr. 2002.
- [47] K. de Langen and J. H. Huijsing, *Compact Low-Voltage and High-Speed CMOS, BiCMOS and Bipolar Operational Amplifiers*. Dordrecht: Kluwer Academic Publishers, 1999.
- [48] H. Sjoland, *Highly Linear Integrated Wideband Amplifiers*. Dordrecht: Kluwer Academic Publishers, 1999.
- [49] C. Galup-Montoro and M. C. Schneider, "Resizing rules for the reuse of MOS analog design," in *Proc. XIII Symposium on Integrated Circuits and Systems Design, SBCCI2000*.
Manaos, Brazil: IEEE Computer Press ISBN 0-7695-0843-X/00, Sept. 2000, pp. 89–93.
- [50] S. Funaba, A. Kitagawa, T. Tsukada, and G. Yokomizo, "A fast and accurate method of redesigning analog subcircuits for technology scaling," *Analog Integrated Circuits and Signal Processing*, vol. 25, pp. 299–307, 2000.
- [51] M. Verbeck, C. Zimmermann, and H. Fiedler, "A MOS switched-capacitor ladder filter in SIMOX technology for high temperature applications up to 300°C," *IEEE Journal of Solid-State Circuits*, vol. 31, no. 7, pp. 908–914, July 1996.
- [52] R. Griffith, R. Vyne, R. Dotson, and T. Petty, "A 1-V BiCMOS rail-to-rail amplifier with n-channel depletion mode input stages," *IEEE Journal of Solid-State Circuits*, vol. 32, no. 12, pp. 2012–2023, Dec. 1997.
- [53] K. R. Laker and W. M. Sansen, *Design of Analog Integrated Circuits and Systems*. New York: McGraw-Hill, 1994, ch. 6, pp. 523–535.
- [54] "MATLAB," The MathWorks, web: http://www.mathworks.com.
- [55] R. G. Eschauzier and J. H. Huijsing, *Frequency Compensation Techniques for Low-Power Operational Amplifiers*. Dordrecht: Kluwer Academic Press, 1995.
- [56] T. Choi, R. Kaneshiro, R. Brodersen, P. Gray, W. Jett, and M. Wilcox, "High-frequency CMOS switched-capacitor filters for communications application," *IEEE Journal of Solid-State Circuits*, vol. 18, no. 6, pp. 652–664, Dec. 1983.
- [57] F. Silveira, "Analog design in SOI technology: Micropower and high temperature applications," MSc. Thesis, Université Catholique de Louvain, Louvain-la-Neuve, Belgium, Jan. 1995.