# Circuits and Techniques for Cell-based Analog Design Automation of Low Dropout Regulators

by

Yaswanth Kumar Cherivirala

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Electrical and Computer Engineering) in the University of Michigan 2023

**Doctoral Committee:** 

Professor David D. Wentzloff, Chair

Associate Professor Ronald Dreslinski

Professor Michael P. Flynn

Professor Dennis Sylvester

Yaswanth Kumar Cherivirala

yaswanth@umich.edu

ORCID iD: 0000-0001-9652-9520

© Yaswanth Kumar Cherivirala 2023

## Dedication

To my beloved family and friends.

## Acknowledgements

First and foremost, I want to thank my advisor, Prof. David D. Wentzloff, for the constant guidance and support he has given me over the course of this work. He has always been very patient with me and helped me tackle some difficult situations; this work would not have been fruitful without his support. I will also be forever grateful that he has given equal importance to my personal and professional growth in addition to my academic growth. I also want to thank my other committee members, Prof. Ronald Dreslinski, Prof. Michael Flynn and Prof. Dennis Sylvester, who have been a constant source of knowledge and guidance all through my PhD. In addition, I thank them for taking the time to review my work and for providing valuable feedback that helped polish it.

I cannot express how thankful I am for my family (Dad: Subash, Mom: Vijaya, Brother: Nagarjuna, Sister: Nagalakshmi and Brother-in-law: Vinay), who have been nothing but supportive in all my decisions so far. I would not be here if it were not for my family, and my work is dedicated to them. It is also very difficult to handle the pressure of graduate studies without the support of friends, so I want to convey heartfelt thanks to all my friends in the US, and especially to Himanshu Aggarwal, Yash Mehta, Kyumin Kwon, Milad Moosavifar, Omar Abdelatty, Trevor Odelberg, Rohit Rothe, Sumanth Kamineni, Aditya Varma Muppala, Sanjana Eyunni, Abhijit Parolia and Rini Kaushal, who have been like a family away from home for me.

I also want to thank all the past and current lab mates at the "Wireless Integrated Circuits and Systems" (WICS) group. The endless technical discussions we had immensely improved the breadth and depth of my knowledge about circuit design. In addition, I extend my thanks to all the student members of the Michigan Integrated Circuits Lab (MICL), with whom I have shared the workspace, attended classes and taught courses as a graduate student instructor, for making my student experience complete. It has been a great experience interacting with people working on different types of projects, learning from them and getting help during the tapeouts and measurements. In particular, I want to thank the senior students Abdullah Alghaihab, Jaeho Im, Tutu Ajayi, Li Xu, Christine Weston, Seungjong Lee, Seungheun Song and the post-doctoral researcher Se-Un Shin, for teaching me various topics and enabling me to conduct research independently.

Lastly, I want to thank my collaborators, Prof. Dreslinski's lab/DD lab at University of Michigan, Prof. Calhoun's lab at University of Virginia and Matteo Coltella from ARM, Ireland in the Fully Autonomous SoC generator (FASoC) project, and the organizations DARPA and INTEL for funding this research.

## Table of Contents

| Section |
|---------|
| Dedication |
| Acknowledgements |
| List of Tables |
| List of Figures |
| List of Abbreviations |
| Abstract |
| Chapter 1 Introduction |
| 1.1 Circuit Design Automation |
| 1.2 Analog synthesis & Layout automation problem |
| 1.2.1 Existing techniques for analog synthesis |
| 1.2.2 Complexities of analog layout automation |
| 1.3 LDO Background |
| 1.4 Thesis Organization |
| Chapter 2 Cell-Based Design Automation Framework using CADRE |
| 2.1 Baseline Design |
| 2.1.1 Architecture |
| 2.1.2 Auxiliary Cells |
| 2.2 LDO w/ Stochastic ADC as error detector |
| 2.2.1 Architecture |
| 2.3 Generator Framework |
| 2.3.1 Generator Flow |
| 2.3.2 LDO Modeling |
| 2.3.3 Spec Agnostic Place & Route Constraints |
| 2.4 Generated LDO Designs |
| 2.4.1 Chip Block diagram |
| 2.4.2 Measurements |
| 2.5 SoC Integration |
| Chapter 3 Synthesizable PID Controller |
| 3.1 Motivation |
| 3.2 PID Controller Architecture |
| 3.2.1 Multi-BAWP PI-Control |
| 3.2.2 Synthesizable Differential Control |
| 3.3 Fabricated Test Chips & Measurements |
| 3.3.1 TSMC 65nm Test Chip |
| 3.3.2 Global Foundries 12nm FinFET Test Chip |
| 3.3.3 Transient Measurement with On-Chip Load |
| 3.3.4 Automated DC Measurement Setup |
| Chapter 4 Transient Spec Synthesis & Layout Automation |
| 4.1 Additional Aux-Cells |
| 4.1.1 Characterization of Transient Detector Offsets |
| 4.2 Design Synthesis Flow |
| 4.2.1 Supported User Input Specifications |
| 4.3 Layout Construction |
| Chapter 5 Hybrid LDO Architecture |
| 5.1 Introduction |
| 5.2 Proposed LDO Architecture |
| 5.3 TSMC 65nm Hybrid LDO Test Chip |
| 5.3.1 Die Micrograph & Layout |
| 5.3.2 Transient Measurements |
| 5.3.3 DC Measurements |
| 5.3.4 Performance Comparison |
| Chapter 6 Conclusion and Future Work |
| 6.1 Summary |
| 6.2 Future Work |
| 6.2.1 Automating the Auxcell Generation |
| 6.2.2 Automating the Hybrid Analog Synthesis & Implementation |
| 6.2.3 Synthesis of Dynamic Range or Efficiency Specifications |
| 6.2.4 Automated PGN Analysis and Synthesis of Distributed LDOs |
| Bibliography |

## List of Tables

| Table |
|-------|
| Table 2-1 Supported Baseline LDO Generator User Input Specifications |
| Table 2-2 Generated 65nm LDOs – Input Specifications |
| Table 2-3 Generated 65nm LDOs – Measurement Results @ 200MHz Fclk |
| Table 3-1 State of the art comparison of the synthesizable PID controller in 65nm test chip |
| Table 4-1 Supported LDO Generator User Input Specifications Using the Synthesizable PID Controller Template in 65nm Technology |
| Table 5-1 State of the art comparison of the synthesizable PID controller in 65nm test chip |

## List of Figures

| Figure |
|--------|
| Figure 1-1: Number of transistors in a processor vs. time [1] |
| Figure 1-2: Layout Constraints trend with advancing technologies [25] |
| Figure 1-3: Voltage Regulation Mechanism |
| Figure 1-4: (a) Analog LDO [32] (b) Digital LDO [33] |
| Figure 2-1: Baseline Digital LDO Architecture [33] |
| Figure 2-2: (a) PT_UNIT_CELL aux-cell structure (b) LDO_COMPARATOR [41] structure |
| Figure 2-3: Digital LDO with stochastic flash ADC [41] based error detector |
| Figure 2-4: Fully Autonomous LDO Generator Flow Using Commercial [34] & Open-Source [43] CAD Tools |
| Figure 2-5: $I_{load,max}$ vs. Power transistor array Size in (a) 65nm (b) 130nm (c) 12nm technologies |
| Figure 2-6: LDO Modeling Test Circuit |
| Figure 2-7: $I_{load,max}$ vs PT Array Size performance in 65nm (a) Pre-PEX (b) Post-PEX with no APR constraints (c) Post-PEX with spec agnostic APR constraints |
| Figure 2-8: Effective Post-PEX schematic of the LDO Modeling Test Circuit |
| Figure 2-9: (a) Complete layout of LDO design 1 (b) constrained PT_UNIT_CELLs placement (c) $V_{REG}$ power stripes |
| Figure 2-10: Final Block Diagram of taped out LDO designs |
| Figure 2-11: $V_{REF}$ match for $V_{IN} = 1.3V$ maximum $I_{load}$ condition |
| Figure 2-12: Specs and layout of the SoC |
| Figure 2-13: Test setup for DFS operation |
| Figure 3-1: Conventional Single BAWP Architecture [47] vs. Proposed Multi-BAWP with Delay Line based Differential control Architecture [51] |
| Figure 3-2: Block Diagram & Steady state linearized model of a BAWP loop |
| Figure 3-3: Self-Clocked TD Time Sharing State Machine |
| Figure 3-4: Block diagram of the transient detector & the TD offset tuning unit |
| Figure 3-5: Block diagram of the delay line slope detector based differential control loop & Timing diagram for undershoot and overshoot load events (with Kdn = Kdp = 15) |
| Figure 3-6: Die Micrograph of the Synthesizable PID Controller test chip in TSMC 65nm |
| Figure 3-7: 65nm Test Chip – PID Controller Transient response – Undershoot/Overshoot at 50mV dropout |
| Figure 3-8: Post-PEX transient simulations for a 28mA load step @ 0.1ns transient time (a) FF corner (b) SS corner (c) TT corner (d) TT corner including estimated off-chip loading parasitics |
| Figure 3-9: 65nm Test Chip – PID Controller DC Measurements – Load Regulation before and after de-embedding the parasitic IR drop @50mV dropout |
| Figure 3-10: 65nm Test Chip – PID Controller DC Measurements – Line Regulation Plots @ 10mA load |
| Figure 3-11: 65nm Test Chip – PID Controller DC Measurements – Steady state ripple measured for the maximum load conditions at 50mV (best case) and 550mV (worst case) dropout voltages |
| Figure 3-12: Die Micrograph of the Synthesizable PID Controller test chip in Global Foundries 12nm FinFET |
| Figure 3-13: 12nm Test Chip – PID Controller Transient response – Undershoot/Overshoot at 50mV dropout |
| Figure 3-14: 12nm Test Chip – PID Controller DC Measurements – Load Regulation Plot @50mV dropout |
| Figure 3-15: 12nm Test Chip – PID Controller DC Measurements – Line Regulation Plot @ 10mA $I_{LOAD}$ |
| Figure 3-16: 65nm Re-Spin Chip with On-Chip load – PID Controller Transient response – Undershoot/Overshoot at $V_{IN} = 1.3V$, $V_{REG} = 1.25V$ (50mV dropout) & $T_{EDGE} = 4.5$ ns |
| Figure 3-17: LDO QFN Board for (a) TSMC 65nm Test Chip (b) GF 12nm Test Chip |
| Figure 3-18: (a) DC measurement board connections including Programmable off-chip load, Opal Kelly XEM6001 FPGA and LDO QFN boards, (b) SPI & Power connection interfaces of the LDO QFN board |
| Figure 4-1: (a) "TRANS_DET" auxcell layout, (b) "TD_TUNE_UNIT" auxcell schematic and (c) layout in TSMC 65nm |
| Figure 4-2: Schematic of the "TD_TUNE_UNIT" aux-cell and the simulated TD offset w.r.t. the ctrl. word |
| Figure 4-3: Conversion of User Specifications to Synthesizable PID Design Parameters |
| Figure 4-4: Generated top-level layout showing the sub-block placements |
| Figure 4-5: Post-PEX Simulation of the Generated Layout |
| Figure 5-1: Power Supply Noise Rejection in Digital LDO |
| Figure 5-2: Block diagram of the proposed fine loop for a hybrid LDO |
| Figure 5-3: Block Diagram of the proposed hybrid LDO architecture (synthesizable PID + hybrid fine loop) |
| Figure 5-4: Differential to single ended amplifier with Common source buffer |
| Figure 5-5: Die Micrograph of the Hybrid LDO test chip in TSMC 65nm along with on-chip load |
| Figure 5-6: 65nm Hybrid LDO test chip with on-chip load – Transient response – Undershoot/Overshoot at $V_{IN} = 1.3V$, $V_{REG} = 1.25V$ (50mV dropout) & $T_{EDGE} = 2.5$ ns |
| Figure 5-7: 65nm Hybrid LDO test chip – DC Measurements – Load Regulation Plot @50mV dropout |
| Figure 5-8: 65nm Hybrid LDO test chip – DC Measurements – Line Regulation Plot @20mA $I_{LOAD}$ |
| Figure 5-9: 65nm Hybrid LDO test chip – DC Measurements – PSRR measurement |

## List of Abbreviations

| Abbreviation | Definition |
|------|---------------------------------------------|
| ADC  | Analog to Digital Converter                 |
| ALDO | Analog Low Dropout Regulator                |
| AMS  | Analog Mixed Signal                         |
| APB  | Advanced Peripheral Bus                     |
| APR  | Auto Place and Route                        |
| BAWP | Bidirectional Asynchronous Wave Pipeline    |
| BGA  | Ball Grid Array                             |
| BW   | Bandwidth                                   |
| CAD  | Computer Aided Design                       |
| CMOS | Complementary Metal-Oxide Semiconductor     |
| CPU  | Central Processing Unit                     |
| DFF  | D-Flip Flop                                 |
| DLDO | Digital Low Dropout Regulator               |
| DVFS | Dynamic Voltage & Frequency Scaling         |
| En   | Enable                                      |
| FET  | Field Effect Transistor                     |
| FF   | Fast-Fast                                   |
| FOM  | Figure of Merit                             |
| GF   | Global Foundries                            |
| HPF  | High Pass Filter                            |
| IC   | Integrated Circuit                          |
| I/O  | Input/Output                                |
| IoT  | Internet-of-Things                          |
| JTAG | Joint Test Action Group                     |
| LDO  | Low Dropout Regulator                       |
| LCO  | Limit Cycle Oscillation                     |
| NMOS | N-channel Metal Oxide Semiconductor         |
| PCB  | Printed Circuit Board                       |
| PDK  | Process Design Kit                          |
| PEX  | Parasitic Extracted                         |
| PGN  | Power Grid Network                          |
| PI   | Proportional Integral                       |
| PID  | Proportional Integral Differential          |
| PLL  | Phase Locked Loop                           |
| PMOS | P-channel Metal Oxide Semiconductor         |
| PSRR | Power Supply Rejection Ratio                |
| QFN  | Quad-Flat-No lead                           |
| RAM  | Random Access Memory                        |
| RO   | Ring Oscillator                             |
| SoC  | System-on-Chip                              |
| SPI  | Serial Peripheral Interface                 |
| SR   | Shift Register                              |
| SRAM | Static Random Access Memory                 |
| SS   | Slow-Slow                                   |
| TD   | Transient Detector                          |
| TSMC | Taiwan Semiconductor Manufacturing Company  |
| TT   | Typical-Typical                             |
| UART | Universal Asynchronous Receiver-Transmitter |
| VCO  | Voltage Controlled Oscillator               |

## Abstract

As semiconductor technology advances, circuit design becomes more difficult due to increased short-channel effects and low breakdown voltages of FETs. In addition, FinFET technologies bring additional layout constraints, introduced by complex lithography techniques such as double/quad patterning and separate exposure masks for minimum features, which make the manual layout process more time consuming. While digital CAD tools can automatically synthesize a digital design from a Verilog description and then automatically generate the layout (gds file), analog circuit design and layout generation remain a significant bottleneck for automating the design of complete systems-on-chip (SoCs). An automated design and implementation flow for analog circuits, similar to the standard digital flow (digital synthesis and auto place/route), would greatly improve SoC design efficiency and reduce implementation cycle times from months to only a few hours.

This work focuses on implementing an LDO Generator tool that automatically outputs a low dropout regulator (LDO) design based on user specifications and automatically places and routes the design to produce the LDO layout (gds file) for a given process design kit (PDK). The key idea in implementing this tool is to identify the analog functionalities required to achieve voltage regulation and push these functionalities into the smallest possible circuit structures (called auxiliary cells). These auxiliary cells can be used as building blocks of the LDO design. Once the auxiliary cells are defined for the LDO, we use the standard digital flow to implement different LDO designs and characterize their post-PEX performance (model generation). Based on these models, we then automate the layout generation of an LDO that meets the user input specifications.

To implement a synthesizable LDO, we adopted the digital LDO architecture as the baseline design, which uses an array of small power transistors that operate as current switches. Using power transistors as current switches facilitates low-VDD power management and process scalability, which makes digital LDOs a potential candidate for power management, especially in smaller technology nodes. In addition to the power transistor array, we have used a clocked comparator that can be implemented using only standard digital cells. With the "Current Switch" and the "Comparator" as the auxiliary cells, and a bidirectional shift register as the LDO controller, an automatic LDO Generator tool was developed. However, only the DC specifications ($V_{IN}$, $V_{REF}$, maximum load current and dropout) can be synthesized using the baseline design.

In addition, a synthesizable PID controller has been proposed and demonstrated in this work to improve the transient performance of a digital LDO. This synthesizable PID controller architecture is used to automate an LDO design from transient specifications (maximum undershoot/overshoot, minimum transient time & output capacitance). Furthermore, a hybrid LDO architecture with integrated analog control loop is proposed to enhance the PSRR performance of the LDO and realize a universal LDO architecture.

## Chapter 1 Introduction

### 1.1 Circuit Design Automation

With the advent of the integrated circuit (IC) in 1958, the number of transistors and the amount of functionality integrated into a single chip has increased exponentially. A well-known illustration of this trend is Moore's law, first postulated in 1965, which predicted that the number of transistors on a single chip doubles roughly every two years. One of the key factors that contributed to the exponential increase of digital circuit integration, aside from new CMOS technology, is the automation of digital circuit design using computer-aided design (CAD) tools.



Figure 1-1: Number of transistors in a processor vs. time [1]

From Figure 1-1, we can see that a modern processor can have tens of billions of transistors on a single IC. This would not have been possible if each of these transistors had to be designed manually. Instead, industry and academia use a standardized design flow that automates the design and layout generation of digital circuits with CAD tools. CAD tools can automatically synthesize a high-level digital logic description into a hardware description and automatically generate the layout file required to fabricate the hardware. While the number of analog transistors integrated in a modern IC is several orders of magnitude lower than the number of digital transistors, the design time of analog circuits is often the major bottleneck in reducing the overall IC design time. One of the factors contributing to this is the lack of standard analog circuit automation tools.

### 1.2 Analog synthesis & Layout automation problem

#### 1.2.1 Existing techniques for analog synthesis

Analog design automation has been an important focus of research since the late 1980s; however, due to roadblocks such as usability, computational power requirements, problem complexity and limited flexibility, it is still not part of the standard design flow in most industries. Various approaches have been used to automate analog design: expert-based or knowledge-based approaches [2], [3], model-based approaches [4], [5], simulation-based approaches [6], [7] and hybrids [8], [9]. A more comprehensive collection of works, including equation-based approaches and commercial tools, can be found in [10]. Each of these approaches has advantages that are hard to replicate in the others. Expert-based approaches are the most trustworthy by design and are relatively fast; however, they focus mainly on linear design and on a specific set of circuits that are generally used in the industry, and they require constant updates to keep up with technology [10]. Model-based approaches require a large database of the circuit to generate the models before providing relatively accurate data, making them less robust and limiting their usage. They also cannot compete with simulation-based approaches in terms of accuracy. While simulation-based approaches are accurate, they are computationally intensive and require more processing power and time. Hybrid approaches, which combine the benefits of the above approaches, have been identified as the best suited for analog synthesis [9].

One way to achieve analog synthesis is to optimize transistor level netlists, and with efficient learning algorithms we can meet the required performance including the post-layout effects. Most of these optimizations [11]–[17] are based on machine learning and regression modeling making them very robust and technology independent. However, the computational overhead of these techniques increases exponentially with the number of transistors to be optimized.

#### 1.2.2 Complexities of analog layout automation

Analog IC layout automation has been an active research area for decades [18]–[21]. However, unlike digital layout automation, these techniques have seen limited adoption in current industrial flows and have not been standardized. The performance of an AMS design is sensitive to parasitics, process variations, and layout-dependent effects. Therefore, in manual designs, layouts are highly customized and sometimes likened to "art" [22]. Unlike their digital counterparts, analog layout designs have a higher degree of freedom and often need to be customized for performance considerations [2]. As a result, current analog layout synthesis flows differ in how they trade off procedure against optimization. Similarly, state-of-the-art automatic layout generation tools [23], [24] are based on topology extraction and device-level placement optimization. Their computational overhead also grows as the netlist size increases, so AMS IC layout automation remains an open problem.

As semiconductor technology advances, circuit design becomes more difficult due to increased short-channel effects and low breakdown voltages of FETs. In addition, FinFET technologies bring additional layout constraints (Figure 1-2), introduced by complex lithography techniques such as double/quad patterning and separate exposure masks for minimum features, which make the manual layout process more time consuming. While digital CAD tools can automatically synthesize a Verilog description of a digital design and automatically generate the layout (gds file), analog circuit design and layout generation remain the bottleneck to fully automated circuit design.



[Plot: number of design rules per technology node]

Figure 1-2: Layout Constraints trend with advancing technologies [25]

### 1.3 LDO Background

Systems-on-Chip (SoCs), as the name suggests, are full systems that may contain analog/mixed-signal and digital blocks, usually a processing unit and memory, implemented as a single integrated circuit on a single chip [26]–[30]. SoC automation can significantly improve the design efficiency and robustness, and reduce the implementation cycles, of electronic products. Low dropout (LDO) regulators are essential for power management of sensitive analog and digital circuits in SoC designs. With the high level of integration achieved in modern SoCs, as many as 50 to 100 LDOs can be present in a single SoC [31]. As shown in Figure 1-3, the main function of an LDO is voltage regulation over different load impedances ($R_{LOAD}$). In other words, an LDO cleans up a varying/noisy input voltage ($V_{IN}$) and generates a stable output voltage ($V_{REG}$) irrespective of the load resistance ($R_{LOAD}$). Hence an LDO can be represented as a variable resistance ($R_{VAR}$) that adjusts itself to maintain a constant $V_{REG}$ (Figure 1-3).



Figure 1-3: Voltage Regulation Mechanism
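The variable-resistance view can be made concrete with a resistive divider. The following is a minimal sketch, assuming an ideal series resistance $R_{VAR}$ between $V_{IN}$ and a purely resistive load; the function name and the example values are illustrative and do not come from this work.

```python
def required_r_var(v_in: float, v_reg: float, r_load: float) -> float:
    """Series resistance that holds the output at v_reg.

    Assumes an ideal resistive divider:
        v_reg = v_in * r_load / (r_var + r_load)
    which rearranges to
        r_var = r_load * (v_in - v_reg) / v_reg
    """
    if not 0.0 < v_reg <= v_in:
        raise ValueError("need 0 < v_reg <= v_in (an LDO can only step down)")
    if r_load <= 0.0:
        raise ValueError("r_load must be positive")
    return r_load * (v_in - v_reg) / v_reg


if __name__ == "__main__":
    # Illustrative values only: 1.3 V input, 1.25 V target (50 mV dropout).
    for r_load in (1000.0, 100.0, 50.0):
        r_var = required_r_var(1.3, 1.25, r_load)
        print(f"R_LOAD = {r_load:7.1f} ohm -> R_VAR = {r_var:6.2f} ohm")
```

Under this simplification, a heavier load (smaller $R_{LOAD}$) calls for a proportionally smaller $R_{VAR}$, which is the adjustment the regulation loop has to make.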

Traditional LDOs are completely analog in nature, implemented using an error amplifier driving a large power FET operating in the saturation region. These LDOs have a fast response time and are limited only by the bandwidth of the error amplifier. However, with technology scaling, the need for low-voltage operation restricts the usage of analog LDOs. Hence, digital LDO (DLDO) architectures are being adopted in advanced process nodes to enable high levels of integration and low input voltage operation of SoCs. As shown in Figure 1-4, while DLDOs can be used for low-voltage operation and are easily scalable across technologies, they have limited BW and DC gain, resulting in large output ripples.



Figure 1-4: (a) Analog LDO [32] (b) Digital LDO [33]

### 1.4 Thesis Organization

This thesis presents a detailed analysis and implementation details for improving digital LDO performance and automating LDO design using a fully autonomous, technology-agnostic generator. Chapter 2 presents the cell-based design Python framework that automates the design of the baseline digital LDO architecture. The chapter gives a detailed analysis of the digital LDO architecture and introduces the auxiliary cells and an ADC-based controller to improve settling time.

Chapter 3 introduces a synthesizable PID controller to improve the transient response of a digital LDO. In this chapter, we outline the motivation for a synthesizable differential controller and the detailed circuit implementation of the PID controller. Chapter 4 presents the methodology for automating the design and layout of the synthesizable PID controller proposed in Chapter 3. In Chapter 5, we propose a hybrid LDO architecture that improves the PSRR of the synthesizable PID controller by integrating it with an analog fine loop. Finally, Chapter 6 summarizes the contributions of this thesis and lists future directions that can be pursued.

## Chapter 2 Cell-Based Design Automation Framework using CADRE

The key idea in implementing a fully autonomous LDO generator is to identify the analog functions required to achieve voltage regulation and push these analog functions into the smallest possible circuit structures (referred to as auxiliary cells). These auxiliary cells can then be used as building blocks of the LDO design. Once the auxiliary cells are defined and implemented for a particular LDO design, we use the standard digital flow to implement different LDO designs and characterize their post-PEX performance (model generation). Based on these models, we then automate the layout generation of an LDO that meets the user input specifications (specs). This design methodology is referred to as "cell-based analog design" [34]–[38].
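The model-based selection step can be sketched as a simple lookup. This is a minimal illustration, assuming the model is available as (array size, post-PEX maximum load current) pairs; the function name `pick_array_size` and the toy linear model are hypothetical and are not the generator's actual code or characterization data.

```python
from typing import Iterable, Tuple


def pick_array_size(model: Iterable[Tuple[int, float]],
                    i_load_max_spec: float) -> int:
    """Return the smallest characterized array size whose maximum
    load current meets the requested spec (in amperes)."""
    for size, i_load_max in sorted(model):
        if i_load_max >= i_load_max_spec:
            return size
    raise ValueError("spec exceeds the largest characterized array")


if __name__ == "__main__":
    # Toy linear model, NOT characterization data from this work:
    # each unit cell is assumed to add 0.1 mA of deliverable current.
    toy_model = [(n, n * 0.1e-3) for n in (16, 32, 64, 128)]
    print(pick_array_size(toy_model, 5e-3))
```

Picking the smallest array that satisfies the spec keeps area low; a real model would come from post-PEX simulation of each implemented design rather than from a closed-form relation.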

#### 2.1 Baseline Design

#### 2.1.1 Architecture

To implement a synthesizable LDO, we adopted the digital LDO (DLDO) architecture [33], [39], as the baseline design. As shown in Figure 2-1, the main idea behind a digital LDO is the use of an array of small power transistors that operate as current switches. The use of power transistors as current switches facilitates low VDD power management and process scalability which makes digital LDOs a potential candidate for power management especially in lower technology nodes. In addition to the power transistor array, we have used a clocked comparator that could be implemented using only standard digital cells as an error detector. With the "Current Switch" and the "Clocked Comparator" as the auxiliary cells, and a single-bit bidirectional shift register as the LDO controller, an automatic LDO generator was developed.



Figure 2-1: Baseline Digital LDO Architecture [33]

# 2.1.2 Auxiliary Cells

Figure 2-2(a) shows the structure of the current switch auxiliary cell named "PT\_UNIT\_CELL" (Power Transistor Unit Cell). The PMOS in the structure is the current switch with gate voltage as the on/off control and the NMOS is a simple MOSFET capacitor added to the load for a more stable steady state response of the output. PT\_UNIT\_CELL is used as a custom cell in standard digital flow by generating the files "PT\_UNIT\_CELL.lib" and "PT\_UNIT\_CELL.lef" from the layout. The auxiliary cell named "LDO\_COMPARATOR" is the clocked comparator used for error detection. As shown in Figure 2-2(b), we can use a cross coupled NAND3 structure followed by an SR latch to implement the LDO\_COMPARATOR [40], [41]. When the clock is low (pre-charge phase) both the outputs of cross coupled NAND3 are pulled high while the SR latch retains its previous state. When the clock transitions to high (evaluation phase), the NAND3 with larger analog voltage input will be pulled down faster while the other NAND3 output remains high. In the evaluation phase, the final evaluated state is latched onto the SR latch immediately. In Figure 2-2(b), the output "OUT" is evaluated high when $V_{REG}$ is greater than $V_{REF}$. Similar to PT\_UNIT\_CELL, we can generate the ".lef" and ".lib" files of LDO\_COMPARATOR cell from the layout and use it as a custom cell in the standard digital flow.
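The two-phase operation described above can be summarized in a small behavioral model. This is only a sketch under simplifying assumptions: the class name `ClockedComparator` and its `step` method are hypothetical, offset, noise and metastability are ignored, and the output polarity follows Figure 2-2(b), where "OUT" is high when $V_{REG}$ exceeds $V_{REF}$.

```python
class ClockedComparator:
    """Behavioral sketch of a cross-coupled NAND3 comparator followed by an SR latch."""

    def __init__(self):
        self.out = 0  # state held by the SR latch

    def step(self, clk, v_reg, v_ref):
        """Apply one clock level and return the latched output."""
        if clk == 0:
            # Pre-charge phase: both NAND3 outputs are high, the latch holds its state.
            return self.out
        # Evaluation phase: the NAND3 with the larger input pulls down first and wins.
        if v_reg > v_ref:
            self.out = 1
        elif v_reg < v_ref:
            self.out = 0
        # Equal inputs: a real comparator resolves through noise or offset;
        # this idealized model simply keeps the previous state.
        return self.out
```

The model is useful only for reasoning about the control loop at the clock-cycle level; it says nothing about the analog evaluation speed of the actual cell.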







Since these auxiliary cells have simple analog functions associated with the structure of the circuit, they can be easily ported from one PDK to another, much like a logic gate in a standard cell library. The auxiliary cell layouts are manually drawn in this work to make them compatible with the standard cell grids of a given technology node. However, these auxiliary cells are small analog circuits, so we can use the current automated analog layout tools [23], with a few modifications to support additional layout constraints such as complying with a grid, and generate the layouts of auxiliary cells automatically. Reference [42] demonstrates the integration of layout automation tools to automate auxcell synthesis and layout generation.

## 2.2 LDO w/ Stochastic ADC as error detector

#### 2.2.1 Architecture





In the baseline digital LDO design, we use a single-bit bi-directional shift register as the controller. While a bi-directional shift register is simple to implement, the gain of this controller is constant and usually kept small to reduce steady state error. This small gain makes it difficult to regulate the output during high speed transients of load current and input voltage, resulting in high overshoot and undershoot of the output voltage. Also, with only a single comparator as the error detector, the controller takes longer to enter and exit dead-zones (a dead-zone is a state in which the controller is turned off because the output is stable and there are no load or input voltage changes).

To improve the output transient response, we have used a flash ADC as the error detector. While a clocked comparator gives the direction in which  $V_{REG}$  deviates from  $V_{REF}$ , flash ADC gives a measure of how much  $V_{REG}$  deviates from  $V_{REF}$  in addition to the direction, thereby facilitating a fast tracking time. As shown in Figure 2-3, using the ADC as error detector with a multi-bit bi-directional shift register can significantly increase the gain of the controller for fast transients. To allow the automation of LDO generation, we have used the synthesizable stochastic flash ADC proposed in [41]. The synthesizable stochastic flash ADC uses many cross coupled NAND3 comparators and gives a digital measure of analog input voltage difference based on the stochastic variations in the offsets of the comparators. Implementing many identical cells plays directly to the strengths of cell-based design and APR. Using the LDO\_COMPARATOR as the auxiliary cell, we have implemented a stochastic flash ADC automatically in the LDO generation flow.
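To make the idea concrete, the following sketch models a stochastic flash ADC as a bank of comparators with random offsets and feeds its count into a multi-bit shift step. Everything here is an illustrative assumption rather than the implemented design: the function names, the Gaussian offset distribution, and the mapping from ADC code to shift amount are all hypothetical.

```python
import random


def make_offsets(n_comp, sigma_v, seed=0):
    """Draw one random input-referred offset (in volts) per comparator."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma_v) for _ in range(n_comp)]


def stochastic_adc_code(v_reg, v_ref, offsets):
    """Count the comparators that evaluate V_REG (plus their offset) above V_REF."""
    return sum(1 for off in offsets if v_reg + off > v_ref)


def controller_step(n_on, code, n_comp, n_switches):
    """Multi-bit bidirectional shift: the step grows with the error magnitude.

    About half of the comparators fire when V_REG equals V_REF, so the signed
    error is taken relative to n_comp // 2 (positive when V_REG is too low).
    """
    error = n_comp // 2 - code
    return max(0, min(n_switches, n_on + error))
```

With a single comparator the controller can only move by one switch per clock; here a large deviation saturates the count and moves the controller by up to `n_comp // 2` switches in one cycle, which is the gain increase the text refers to.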

#### 2.3 Generator Framework

Table 2-1 Supported Baseline LDO Generator User Input Specifications

| Technology               | CAD Tools   | V <sub>IN</sub> range (V) | I <sub>load,max</sub><br>(mA) | Dropout<br>(mV) |
|--------------------------|-------------|---------------------------|-------------------------------|-----------------|
| 12nm                     | Commercial  | 0.6 - 0.9                 | 1 - 30                        | 50, 100         |
| 65nm                     | Commercial  | 0.6 - 1.3                 | 0.5 - 25                      | 50              |
| 130nm                    | Commercial  | 0.6 - 1.3                 | 0.5 - 25                      | 50              |
| 130nm Open<br>Source PDK | Open Source | 1.8 - 3.3                 | 0.5 - 50                      | 50              |

The LDO generator developed in this work is a technology agnostic Python-based circuit design platform that automates the generation of an LDO design for a given set of user input specifications. Currently, the user specs that the generator supports are input voltage ($V_{IN}$) and maximum load current ($I_{load,max}$) for a dropout of 50mV. To show the robustness of the generator, we have ported it from the 65nm PDK to support 130nm and 12nm PDKs as well. Table 2-1 shows the user input specs in different PDKs for which the generator can output an LDO design with post-PEX simulations matching the input specs.

#### 2.3.1 Generator Flow



Figure 2-4: Fully Autonomous LDO Generator Flow Using Commercial [34] & Open-Source [43] CAD Tools

The LDO generator automates design optimization steps by generating the .v, .cdl, and .lib descriptions of the LDO along with a layout file (.gds). Figure 2-4 shows the complete automated flow of the LDO generator, indicating that either commercial [34] or open-source tools (adapted from [43]) can be used for running synthesis, APR and post-PEX verification. All the LDO designs in this thesis are synthesized and laid out using the commercial CAD tools. Open-source PDK based LDO designs generated using the open-source CAD tools are demonstrated in [44]. Design parameters of the LDO behavioral Verilog are determined based on the "ldo.model" file, which contains the poly-fit coefficients that predict the post-PEX performance of any LDO design point. If the file doesn't exist, the generator automatically runs a stand-alone modeling of the auxcell library first. The model generation process is explained further in the next subsection.
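As an illustration of how poly-fit coefficients can drive the choice of a design parameter, the sketch below searches for the smallest power transistor array size whose predicted $I_{load,max}$ meets the spec. The actual format of "ldo.model" is not described here, so the coefficient ordering (highest degree first), the function names and the search bound are assumptions made only for this example.

```python
def eval_poly(coeffs, x):
    """Evaluate a polynomial with coefficients ordered from highest degree down."""
    y = 0.0
    for c in coeffs:
        y = y * x + c  # Horner's rule
    return y


def min_array_size(coeffs, i_load_max_ma, max_size=1024):
    """Return the smallest array size predicted to meet the spec, or None."""
    for n in range(1, max_size + 1):
        if eval_poly(coeffs, n) >= i_load_max_ma:
            return n
    return None
```

For example, with a hypothetical linear model `[0.5, 0.0]` (0.5mA per unit cell), `min_array_size([0.5, 0.0], 3.0)` selects six unit cells. A linear scan is used instead of root finding so that the sketch stays correct even if the fitted curve is not monotonic.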

The auxiliary cell generation block shown in Figure 2-4 is out of the scope of this work and is shown only to present a full understanding of the generator functionality. As mentioned in section 2.1.2, we can integrate the LDO generator with existing open-source layout automation tools like ALIGN [23], and an example of automating auxcell generation is presented in [42] for the case of Static Random Access Memory (SRAM) design. Also, for the supported input specs, either the baseline LDO or the ADC based LDO architecture can be used, so the first step shown in Figure 2-4 is currently skipped. The runtime of generating an LDO layout from input specs and verifying the performance with post-PEX simulations is approximately 15 minutes using a 16-core processor.

#### 2.3.2 LDO Modeling

In this step, we derive the poly-fit models of  $I_{load,max}$  performance corresponding to various combinations of the aux cell connections (connected in parallel and for different V<sub>IN</sub> values) in both "ON" and "OFF" states. Similar to the design generation flow shown in Figure 2-4, the modeling process includes generation of Verilog descriptions of the test circuits for different power transistor array sizes followed by Synthesis/APR (using Design Compiler and Innovus tools respectively) producing the layout. Then we extract the parasitics of the layout and perform post-PEX simulations which are used for the poly-fit model generation. In the current version of the LDO generator, we prepare around 30 test circuits for modeling and the poly fit coefficients are saved to "ldo.model" file. Figure 2-5 shows the post-PEX maximum load current performance with respect to the power transistor array size at different  $V_{IN}$  voltages for 65nm, 130nm and 12nm technologies.



Figure 2-5: I<sub>load,max</sub> vs. Power transistor array Size in (a) 65nm (b) 130nm (c) 12nm technologies



Figure 2-6: LDO Modeling Test Circuit

To reduce the modeling time, we used an open-loop test circuit which includes the power transistor array and a binary to thermometer decoder. This reduces simulation time since the number of power transistors turned on can be controlled by the binary input to the test circuit. To get the maximum load current, the binary input given in the test bench is such that all the current switches are turned on in the next clock cycle while maintaining the output voltage ($V_{REG}$) at a value equal to ($V_{IN} - V_{dropout}$). Figure 2-6 shows the block diagram of the test circuit, in which $R_{cnv}$ is added for proper convergence of the DC operating point. The total modeling time is around 5 hours for a given technology node.
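The binary to thermometer decoding used in the test circuit can be sketched in a few lines. The function name and the active-high bit convention are assumptions for illustration; in the actual array the PMOS switches are turned on by a low gate voltage, so the hardware polarity may be inverted.

```python
def binary_to_thermometer(value, width):
    """Return 2**width - 1 bits in which the lowest `value` bits are set."""
    n_out = (1 << width) - 1
    if not 0 <= value <= n_out:
        raise ValueError("value out of range for the given width")
    return [1 if i < value else 0 for i in range(n_out)]
```

Setting `value` to `2**width - 1` turns on every current switch, which corresponds to the maximum load current condition used for modeling.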

### 2.3.3 Spec Agnostic Place & Route Constraints

During the initial implementation of the generator, we observed that even though we are using non-default rules (wide metals for the output voltage signal) in the Innovus APR tool for routing analog signals, there is a ~6x to 9x degradation in max load current for large array sizes after layout generation. The expected monotonic increase of the maximum load current with array size is not seen, and the performance becomes unpredictable for large array sizes. Figure 2-7(a) and Figure 2-7(b) show the pre-PEX simulations and the unconstrained post-PEX simulations, respectively. The effect of the place and route scripts on different design points is not predictable.



Figure 2-7: I<sub>load,max</sub> vs PT Array Size performance in 65nm (a) Pre-PEX (b) Post-PEX with no APR constraints (c) Post-PEX with spec agnostic APR constraints

Since non-default rules are only directives and not actual constraints, the wide metal routing is dropped for several auxiliary cells, which adds a large series resistance at the output as shown in Figure 2-8. Also, since the placement of the auxiliary cells is not constrained, the number of auxiliary cells for which the wide metal routing is dropped changes for each design point, resulting in a series resistance that varies with the array size (shown in Figure 2-8).



Figure 2-8: Effective Post-PEX schematic of the LDO Modeling Test Circuit

To solve this problem and generate more reliable post-PEX models for large array sizes, we used technology agnostic fencing constraints to restrict auxiliary cell placements in combination with output voltage stripes to avoid wide metal routing being removed. Figure 2-7(c) shows that with constrained placement and striping, there is  $\sim$ 4x - 6x improvement in max load current performance and predictability compared to the unconstrained case in Figure 2-7(b). Also, comparing Figure 2-7(a) and Figure 2-7(c) there is a degradation of  $\sim$ 1.3x in max load current performance due to the parasitics.
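The impact of the routing resistance discussed in this subsection can be illustrated with a first-order model in which the array is treated as N identical on-resistances in parallel, in series with a lumped routing resistance. This is a deliberately crude assumption (the switches are treated as linear resistors at the dropout voltage), and the function name and all numeric values used with it are hypothetical.

```python
def i_load_max_ma(n_switches, r_on_ohm, r_series_ohm, v_dropout=0.05):
    """First-order I_load,max (mA): N parallel switches plus series routing resistance."""
    r_total = r_on_ohm / n_switches + r_series_ohm
    return 1e3 * v_dropout / r_total
```

With no series resistance the estimate grows linearly with the array size. If the series resistance changes from one design point to the next, as happens when wide routes are dropped for an unpredictable number of cells, a larger array can deliver less current than a smaller one, which matches the loss of monotonicity described above.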

Layout of a sample LDO design is shown in Figure 2-9(a) as an example output of the LDO generator. Figure 2-9(b) shows the constrained placement of PT\_UNIT\_CELLs and Figure 2-9(c) shows the output net  $V_{REG}$  implemented as a power signal with stripes overlapping with PT\_UNIT\_CELLs.



Figure 2-9: (a) Complete layout of LDO design 1 (b) constrained PT\_UNIT\_CELLs placement (c) V<sub>REG</sub> power stripes

# 2.4 Generated LDO Designs

# 2.4.1 Chip Block diagram

One advantage of using the standard digital flow for layout generation is that we can easily integrate this generator into SoC automation tools [45]. Integration with the SoC CPU cores is realized by adding APB bus slaves, in the SoC synthesizer, to the LDO cores implemented by the LDO generator. The LDO generator tool developed in this work is used to generate three LDO designs automatically for a complete SoC tapeout recently submitted in TSMC65LP PDK. This demonstrates the usability of the LDO generator tool as part of a fully autonomous SoC synthesizer [45]. To be able to talk to the ARM M0 core implemented in the SoC, we have added an APB slave interface to the designed LDO core. In addition, for the purpose of testing the LDO separately from the SoC, we have added an SPI slave interface to the designed LDO core. The bus interfaces also include additional config registers required for standalone testing of the error detector, controller and power transistor array. The interface can be switched between APB and SPI using the pin "SPI APB SEL" driven externally through the pads. The entire block diagram of the LDO design including the SPI/APB interfaces is given in Figure 2-10.



Figure 2-10: Final Block Diagram of taped out LDO designs

| Input Specifications       | Design1  | Design2  | Design3   |
|----------------------------|----------|----------|-----------|
| Architecture               | Baseline | Baseline | ADC Based |
| I <sub>load,max</sub> (mA) | 15       | 25       | 25        |
| V <sub>IN</sub> (V)        | 1.3      | 1.3      | 1.3       |
| Dropout (mV)               | 50       | 50       | 50        |

Table 2-2 Generated 65nm LDOs – Input Specifications

Table 2-2 presents the input specifications to the LDO generator for three different designs in 65nm technology. LDO design 1 and LDO design 2 use the baseline DLDO architecture with a bi-directional shift register controller and a clocked comparator for error detection. The input specification of LDO design 2 differs from LDO design 1 in terms of max load current. LDO design 3 supports the same load currents as LDO design 2 but it adopts the ADC based LDO architecture described in section 2.2.

## 2.4.2 Measurements



Figure 2-11:  $V_{REF}$  match for  $V_{IN} = 1.3V$  maximum  $I_{load}$  condition

Figure 2-11 shows regulation of the output voltage $V_{REG}$ for worst case loading in post parasitic extracted (post-PEX) simulations. The testbench is operated at $V_{IN} = 1.3V$ and $V_{REF} = 1.25V$ (at the dropout condition) with I<sub>load</sub> set to its maximum specification. Settling time is defined as the time it takes to reach the characteristic steady state ripple of a digital LDO. Steady state ripple of LDO design 1 is shown in Figure 2-11 as an example. From Figure 2-11, we can see that $V_{REG} = 0V$ while the reset is asserted. Once the reset signal is turned off at t = 500ns, $V_{REG}$ ramps up and settles around the reference voltage $V_{REF}$. Figure 2-11 shows the settling time for all three designs. For measuring load transient performance, $I_{load}$ is changed from 1mA to the $I_{load,max}$ specification and back to 1mA with 100ps transition times for $V_{IN} = 1.3V$ and $V_{REF}$ values ranging from 0.6V to 1.2V. Both the simulations use the post-PEX netlist of the design and include packaging parasitics of 1nH inductance and 0.2$\Omega$ resistance. The clock frequency used in the simulations is 200MHz.
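One way to extract the settling time defined above from a simulated waveform is sketched below. The definition in the text refers to reaching the steady state ripple, so this example assumes a symmetric ripple band around $V_{REF}$; the function name, the band parameter and the sampled-waveform representation are all assumptions for illustration.

```python
def settling_time(times, v_out, v_ref, ripple_band, t_start=0.0):
    """Time after t_start at which v_out enters v_ref +/- ripple_band and stays there.

    Returns None if the waveform never settles within the given samples.
    """
    settled_at = None
    for t, v in zip(times, v_out):
        if t < t_start:
            continue
        if abs(v - v_ref) <= ripple_band:
            if settled_at is None:
                settled_at = t  # candidate: first sample inside the band
        else:
            settled_at = None  # left the band again, discard the candidate
    return None if settled_at is None else settled_at - t_start
```

For the start-up case in Figure 2-11, `t_start` would be the instant at which reset is released (500ns), and for a load step it would be the start of the current transition.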

| Output Specifications      | Design1 (Sim) | Design1 (Meas) | Design2 (Sim) | Design2 (Meas) | Design3 (Sim) | Design3 (Meas) |
|----------------------------|-----------|-------|-----------|-------|------------|-------|
| Dropout Voltage (mV)       | 50        | 70    | 50        | 80    | 50         | 80    |
| I <sub>load,max</sub> (mA) | 15.00     | 15.38 | 25.00     | 24.84 | 25.00      | 23.72 |
| Settling Time - Ts (µs)    | 1.1       | 1.8   | 2.1       | 2.9   | 0.12       | 0.19  |
| Max Undershoot (V)         | 0.35      | 0.98  | 0.57      | 0.98  | 0.38       | 0.14  |
| Max Current Eff. (%)       | 94.2      | 96.4  | 95.7      | 94.5  | 81.9       | 74.0  |
| Load Regulation (mV/mA)    | -         | -1.00 | -         | -0.35 | -          | -3.6  |
| Line Regulation (V/V)      | -         | 0.180 | -         | 0.004 | -          | 0.950 |
| Area (µm <sup>2</sup> )    | 17,318.56 |       | 31,187.56 |       | 127,163.56 |       |

Table 2-3 Generated 65nm LDOs - Measurement Results @ 200MHz Fclk

Table 2-3 summarizes the post-PEX & measurement results of all three LDO designs. From Table 2-3, we can confirm that post-PEX simulations of all the designs meet the required specs of $I_{load,max}$. Also, we can see that due to the low gain of the single comparator based controller (as indicated by the slew rate of the transients), LDO design 1 has a high settling time (1.8µs), which further results in a max undershoot of 0.98V for a 14mA load transition. For design 2 & design 3, $I_{load}$ transitions of 24mA are used in the measurement. The higher load step in design 2 results in an even higher settling time (2.9µs), with the undershoot remaining at 0.98V, when compared to design 1. As described in section 2.2, ADC-based LDO design 3 improved the settling time by a factor of ~15x (to 190ns) at the cost of a higher area (~4x larger) compared to LDO design 2. The overshoot and undershoot values of LDO design 3 are still high because the 100ps load transition is very fast compared to the reference clock period. The differences between post-PEX simulations and measurements of settling time and undershoot are attributed to the additional parasitics introduced by top-level integration of the LDO macros and by wire-bonding to the off-chip $V_{REF}$ & load.
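The improvement and area factors quoted above follow directly from the measured settling times and the areas in Table 2-3, as the short check below shows. The variable names are chosen for this example only.

```python
# Measured settling times (us) and areas (um^2) of designs 2 and 3 from Table 2-3.
ts_design2_us, ts_design3_us = 2.9, 0.19
area_design2_um2, area_design3_um2 = 31187.56, 127163.56

settling_speedup = ts_design2_us / ts_design3_us   # roughly 15x faster settling
area_ratio = area_design3_um2 / area_design2_um2   # roughly 4x larger area

print(f"settling speedup: {settling_speedup:.1f}x, area ratio: {area_ratio:.1f}x")
```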

## 2.5 SoC Integration<sup>1</sup>

The LDO designs presented in Table 2-2 are implemented as part of a fully autonomous SoC generation process. Figure 2-12 shows the layout of the generated SoC in a 65nm process. The SoC is powered by LDO design 2, which drives the PLLs, M0 core, ARM RAMs, custom SRAM and the temperature sensor. In addition, using the configurable LDO and PLL, we have demonstrated the dynamic voltage and frequency scaling (DVFS) operation of custom SRAM access using the M0. Figure 2-13 shows the test setup of the SoC. The DVFS commands are sent from a desktop computer to the M0 using a JTAG debug port. In addition to the JTAG debug port, the M0 also has a serial UART bus interface to communicate with other devices. The PLL output is monitored using a portable DIGILENT oscilloscope, which includes a serial communication interface. Using this test setup, we have verified the SoC DVFS functionality by ensuring that custom memory access operations are performed successfully at different voltages and frequencies.

<sup>1</sup>This work was done in collaboration with Kyumin Kwon, Tutu Ajayi, Sumanth Kamineni and Mehdi Saligane.

The author's main contribution is the design and implementation of LDOs using the fully autonomous cell-based design framework, and the bring-up and testing of the SoC/DFS operation.

Kyumin Kwon designed the PLLs and equally contributed to the bring-up of the SoC and DFS operation.

Tutu Ajayi did the top level integration of the tapeout and equally contributed to the bring-up of the SoC.

Sumanth Kamineni designed the custom SRAM and verified the SRAM operation in testing.

Mehdi Saligane designed the custom temperature sensor.



Figure 2-12: Specs and layout of the SoC



Figure 2-13: Test setup for DFS operation

#### **Chapter 3 Synthesizable PID Controller**

#### 3.1 Motivation

Low dropout (LDO) regulators are essential for power management of sensitive analog and digital circuits in System-on-Chip (SoC) designs. With technology scaling, DLDO architectures are being adopted in SoC designs to enable high levels of integration and low input voltage operation of SoCs. In addition, Digital LDOs are dominantly used for powering switching circuits which are more tolerant to power supply noise, but require fast transient performance with low voltage droops/overshoots. The conventional synchronous shift register (SR) based integral control (I-control) for voltage regulation [33] has a trade-off between bandwidth (BW) and efficiency, and therefore typically employs a large off-chip capacitor to maintain desired regulation. Dual loop proportional-integral (PI) architectures with event-driven [46], asynchronous/self-clocked [47] coarse loops achieve faster settling times, low droop voltages, and eliminate the requirement of large output capacitors. However, due to the high feedback latencies the gain of the PI controller in these architectures is often limited to avoid limit cycle oscillations (LCOs) of the fast/asynchronous loops in steady state and maintain high current efficiency. A PI controller based on delay line comparator in conjunction with an edge racing comparator is presented in [48] to achieve a high figure of merit (FOM), but this architecture still requires a large output cap. A Hybrid LDO architecture with analog assisted NAND based differential loop proposed in [49] overcomes the PI-gain limitation, but employs a custom charge pump and high pass filter (HPF), increasing the complexity and design time. A synthesizable digital PID controller based on a continuous time ADC is presented in [50] and is shown to achieve fast settling times due to the differential loop gain, but at the expense of high quiescent current, limiting the overall FOM. 
To overcome these limitations, this work presents an output-capacitor-free DLDO with synthesizable PID controller architecture, using a novel multi-asynchronous-loop PI control and delay-line based differential control architecture, achieving a fast transient response on the order of <100ps at 50mV dropout.


#### 3.2 PID Controller Architecture

Figure 3-1: Conventional Single BAWP Architecture [47] vs. Proposed Multi-BAWP with Delay Line based Differential control Architecture [51]

Figure 3-1 shows a conventional bi-directional asynchronous wave pipeline (BAWP) based coarse loop [47] and the proposed architecture, Multi-BAWP with differential control. The proposed architecture employs a coarse loop that is triggered using a continuous time voltage transient (droop/overshoot) detector and has a tunable loop gain K<sub>P</sub>. If the transient detector (TD) offset is high, the response time of the loop slows down which degrades the transient performance. If the offset is very low, gain must be limited to avoid limit cycle oscillations and ensure stability in steady state to achieve low quiescent current consumption. We can implement multiple transient detectors with different offset/threshold values to set the gain of a single BAWP loop or trigger multiple BAWPs with fixed loop gains to break the performance vs. stability trade-off. However, continuous time TDs are power hungry and degrade the current efficiency. To improve the stability robustness across different load transients, while maintaining current efficiency, this work proposes a single TD with digitally tunable offset time shared between multiple fixed loop-gain BAWPs using a self-clocked state machine. The overall gain of multi-BAWP feedback control is proportional to the output voltage error, and because of the inherent current integration at the load capacitance, we effectively achieve a PI-control. This architecture can be used with higher  $K_P$ compared to the conventional single BAWP loop, thereby improving the recovery time, but undershoot/overshoot is still degraded due to the latency of the TD-sharing state machine. A differential control using transient slope detection [50] could compensate for the degradation in multi-threshold BAWP architecture. However, [50] uses a continuous time ADC for differential control which degrades the power efficiency and adds latency in the feedback loop. 
To overcome this limitation, we have implemented a delay-line based slope detector using two tunable TDs optimizing the power vs. performance trade-off.

#### 3.2.1 Multi-BAWP PI-Control

The block diagram of a single BAWP loop is shown in Figure 3-2. The pipeline shifts "0"s (ON state) to the right with undershoot detection and shifts "1"s (OFF state) left with overshoot detection. Gain of the loop corresponds to the size of the power switch. The steady state linearized model of the BAWP can be represented as a discrete system as shown in Figure 3-2, where $t_{BAWP}$ is the propagation delay between two consecutive BAWP stages, $K_P$ is the gain of the BAWP loop, $I_{ON}$ is the unit load current, $t_{FB}$ is the feedback delay & $V_{OFF}$ is the transient detect offset value corresponding to the BAWP loop. This model is similar to the bi-directional shift register based synchronous controller model presented in [52], but running at a clock frequency of $1/t_{BAWP}$. Based on this model, the LCOs correspond to the toggling of a single current switch in the BAWP loop, and the worst case amplitude of the LCOs can be estimated as shown in equation 3-1. The offset for any given gain loop "i" is then set as per the inequality given in equation 3-2, where $K_{P,i}$ is in increasing order, to ensure the stability of all BAWP loops.

$$V_{pp,LCO} = \frac{K_P \cdot I_{ON} \cdot t_{BAWP}}{C_{load}} \tag{3-1}$$

$$V_{off,i} \ge \sum_{j=1}^{i} V_{pp,LCO,j}/2 \tag{3-2}$$
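Equations 3-1 and 3-2 can be evaluated numerically to pick the minimum transient detector offsets for a set of fixed-gain loops. The sketch below is only an illustration: the function names are hypothetical, and any component values passed to it (unit current, stage delay, load capacitance) are assumptions rather than values from the fabricated design.

```python
def lco_amplitude(k_p, i_on, t_bawp, c_load):
    """Equation 3-1: worst-case peak-to-peak LCO amplitude of one BAWP loop (V)."""
    return k_p * i_on * t_bawp / c_load


def min_offsets(gains, i_on, t_bawp, c_load):
    """Equation 3-2: minimum TD offset for each loop, gains in increasing order.

    The offset of loop i is the running sum of half the LCO amplitudes of
    loops 1..i, so every lower-gain loop remains stable as well.
    """
    offsets = []
    running = 0.0
    for k_p in gains:
        running += lco_amplitude(k_p, i_on, t_bawp, c_load) / 2
        offsets.append(running)
    return offsets
```

For instance, `min_offsets([1, 2, 4], i_on, t_bawp, c_load)` returns three offsets in increasing order, one per loop, which is consistent with the requirement that higher-gain loops trigger only on larger voltage errors.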



Figure 3-2: Block Diagram & Steady state linearized model of a BAWP loop



Figure 3-3: Self-Clocked TD Time Sharing State Machine

The TD time sharing state machine shown in Figure 3-3 enables/disables different fixed-gain BAWP loops based on the transient detect signal (signal "TD" in Figure 3-3), thereby progressively increasing/decreasing the gain for as long as the load event persists. In steady state ("TD" signal is a logic zero), the state machine resets the TD offset to the value corresponding to the lowest gain ($K_P = 1$). Detection of a load transient event (i.e., "TD" signal becomes logic high) enables the 1x gain BAWP loop and the "RO En Logic" (Figure 3-3), which in turn enables the Ring Oscillator (RO), clocking the TD state machine. The TD state machine then monitors the "TD" signal for a programmable number of RO cycles and increases or decreases the gain depending on whether the droop has recovered or not. The latency of the state machine can be controlled by setting this programmable count threshold. The RO clock frequency is set to 500MHz and can be lowered if necessary to increase the loop latency and ensure system stability. The "RO En Logic" (Figure 3-3) is also used to eliminate any glitch states that occur due to race conditions associated with the output voltage not being synchronized with the RO. The proportional error correction is achieved only for transients slower than at least one RO clock period ($t_{RO}$), and thus a fast responding differential control loop is needed to compensate for this limitation.
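The gain-stepping behavior described above can be sketched as a small state machine evaluated once per RO cycle. This is a simplified behavioral model under several assumptions: the class and attribute names are hypothetical, the asynchronous RO enable is folded into the first call, glitch filtering is omitted, and the TD offset retuning that accompanies each gain change is represented only by the gain index.

```python
class TDStateMachine:
    """Behavioral sketch of a self-clocked TD time-sharing state machine."""

    def __init__(self, n_loops, count_threshold):
        self.n_loops = n_loops                  # number of fixed-gain BAWP loops
        self.count_threshold = count_threshold  # RO cycles to observe TD per decision
        self.active = False                     # True while the RO is enabled
        self.gain_idx = 0                       # highest enabled loop (0 = 1x loop)
        self.count = 0

    def tick(self, td):
        """Advance one RO cycle with the current TD value; return the gain index."""
        if not self.active:
            if td:
                # Transient detected: enable the 1x loop and start the RO.
                self.active = True
                self.gain_idx = 0
                self.count = 0
            return self.gain_idx
        self.count += 1
        if self.count < self.count_threshold:
            return self.gain_idx
        self.count = 0
        if td:
            # Transient persists: step up to the next higher-gain loop.
            self.gain_idx = min(self.gain_idx + 1, self.n_loops - 1)
        elif self.gain_idx > 0:
            # Output recovered: step the gain back down.
            self.gain_idx -= 1
        else:
            # Back at the lowest gain with no transient: return to steady state.
            self.active = False
        return self.gain_idx
```

Raising `count_threshold` in this model lengthens the time between gain decisions, which mirrors the programmable latency mentioned in the text.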



## **3.2.2** Synthesizable Differential Control



As shown in Figure 3-4, the transient detector architecture is based on the logic-threshold-triggered comparator (LTTC) presented in [47]. To digitally tune the offset of the transient detector, PMOS switches were added at the first inverter output to control the bias for the second inverter. The block diagram of the delay line slope detector and the timing diagrams for the undershoot and overshoot events are shown in Figure 3-5. When TD1 detects a transient (undershoot / overshoot), a logic "0" is propagated down the delay line. If the transient continues to persist, TD2, set with a higher voltage offset, also detects the transient. At the TD2 detection event, the state of the delay line is captured into a register. If the transient is fast, the time difference between the TD1 and TD2 detection is small and therefore fewer "0"s are propagated through the delay line. Similarly, if the transient is slow, the "0"s are propagated through more stages of the delay line. Hence the number of "1"s in the final delay line state corresponds to the time difference between the TD1 and TD2 events. Since we know the voltage offsets at which these two events trigger, the time difference captured in the delay line serves as a differential error "dE".

This differential error is then multiplied with the differential gain (Kdp for undershoot and Kdn for overshoot) to apply a fast correction by rapidly turning on PMOS (undershoot) or NMOS (overshoot) current switches. The hysteresis control signal "Hist\_Ctrl" (shown in Figure 3-5) makes sure that the slope detector is only triggered once until a reset happens. At the positive edge of every synchronous clock cycle "VCLK\_SS", the delay line state is reset and the differential control word is added to (undershoot) or subtracted from (overshoot) the synchronous control word. The synchronous clock "VCLK\_SS" is a 1MHz clock generated off-chip. It is important to note that the proposed differential control can be triggered only once during one synchronous clock cycle. However, this single trigger mechanism is sufficient for powering switching circuits, where the fast transients happen only once at the clock edge.
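The slope detection and differential correction can be illustrated with the short model below. It assumes an ideal delay line with a uniform stage delay and takes "dE" to be the number of "1"s remaining in the captured state, so that a faster transient yields a larger correction; the function names and this exact mapping are assumptions made for the sketch.

```python
def delay_line_state(dt_s, t_stage_s, n_stages):
    """State captured at the TD2 event, dt_s seconds after the TD1 event.

    A "0" launched at TD1 has propagated through dt_s / t_stage_s stages.
    """
    n_zeros = min(n_stages, int(dt_s / t_stage_s))
    return [0] * n_zeros + [1] * (n_stages - n_zeros)


def differential_correction(state, k_d):
    """Number of current switches to toggle: differential gain times dE."""
    d_e = sum(state)  # remaining "1"s: larger for faster transients
    return k_d * d_e
```

In this model a transient that crosses both thresholds within one stage delay leaves the delay line full of "1"s and triggers the maximum correction, while a slow transient that lets the "0" reach the end of the line produces no differential correction at all.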



Figure 3-5: Block diagram of the delay line slope detector based differential control loop & Timing diagram for undershoot and overshoot load events (with Kdn = Kdp = 15)

## **3.3 Fabricated Test Chips & Measurements**

#### 3.3.1 TSMC 65nm Test Chip



## 3.3.1.A Die Micrograph & Layout

Figure 3-6: Die Micrograph of the Synthesizable PID Controller test chip in TSMC 65nm

The test chip, shown in Figure 3-6, was fabricated in a 65nm CMOS bulk process and occupies an area of 0.0925mm<sup>2</sup>. We have manually implemented the unit power switch, synchronous comparator and the transient detectors to fit in the standard cell rows. The entire design is described in behavioral and structural Verilog, using the aforementioned cells as auxiliary standard cells. The layout (including the power transistor array) is automatically generated using commercial place and route tools.

## **3.3.1.B** Transient Measurements

Figure 3-7 shows the transient response of the implemented design for a 28mA (10mA to 38mA) load step at 0.1ns transition time. The measured droop is 275mV and includes the parasitic IR drop seen due to off-chip loading (Figure 3-8). We confirmed from the digital readout measurements that the implemented differential loop triggers and applies correction to the power transistor control on-chip; however, the effect of this correction is not captured in the off-chip voltage measurements. Since the load transient is off-chip, the wire-bond inductances shield the true on-chip voltage transients, which are on the order of nanoseconds. In addition, the $1M\Omega$ termination resistance of the oscilloscope results in the rippling seen in measurements.



Figure 3-7: 65nm Test Chip – PID Controller Transient response – Undershoot/Overshoot at 50mV dropout

This effect was also confirmed using post-PEX simulations. Figure 3-8(c) and Figure 3-8(d) show the transient simulation without off-chip loading parasitics and with estimated off-chip loading parasitics, respectively. The droop clearly increases from Figure 3-8(c) to Figure 3-8(d), and Figure 3-8(d) shows a large IR drop due to the parasitic resistance. However, the inductance estimated from the number of pads and the wirebond lengths is not accurate, resulting in a mismatch between Figure 3-8(d) and the measurements.



Figure 3-8: Post-PEX transient simulations for a 28mA load step @ 0.1ns transient time (a) FF corner (b) SS corner (c) TT corner (d) TT corner including estimated off-chip loading parasitics

Figure 3-8(a) and Figure 3-8(b) compare the post-PEX transient simulations at the fast-fast (FF) and slow-slow (SS) corners, respectively, with the typical-typical (TT) corner shown in Figure 3-8(c). Because this is a digital control architecture, the transistors are faster at the FF corner, resulting in lower loop latencies; as expected, the droop voltage improves by 9mV and the settling time ( $t_s$ ) by 0.3ns. Similarly, at the SS corner the droop degrades slightly by 7mV and the settling time by 0.4ns due to the increased loop latencies. Offsets due to temperature variations are calibrated manually and an off-chip generator is used for the  $V_{REF}$  input.

#### **3.3.1.C DC Measurements**

Load regulation plots of the proposed controller are shown in Figure 3-9. Due to off-chip output loading and a bug in the hierarchical verification, the on-chip parasitic resistance became significant, resulting in large IR drops. The parasitic resistance of off-chip load integration is calibrated by enabling only the synchronous loop. Figure 3-9 also shows the performance of the LDO after de-embedding the parasitic resistance, validating the functionality of the design. Figure 3-10 shows the line regulation measurements of the TSMC 65nm test chip. The line regulation measurements are done at a load current of 10mA for  $V_{REF}$  from 0.65V to 1.25V. Figure 3-11 shows the steady state ripple measurements, with a voltage ripple of 7.9mV at 550mV dropout with 30mA load current and a ripple of 1.6mV at 50mV dropout with 60mA load current.



Figure 3-9: 65nm Test Chip – PID Controller DC Measurements – Load Regulation before and after deembedding the parasitic IR drop @50mV dropout



Figure 3-10: 65nm Test Chip – PID Controller DC Measurements – Line Regulation Plots @ 10mA load



Figure 3-11: 65nm Test Chip – PID Controller DC Measurements – Steady state ripple measured for the maximum load conditions at 50mV (best case) and 550mV (worst case) dropout voltages


#### **3.3.1.D Performance Comparison**

|                               | This work [51]    | [46]       | [47]        | [48]               | [49] <sup>a</sup>        | [50]              | [53] <sup>b</sup> |
|-------------------------------|-------------------|------------|-------------|--------------------|--------------------------|-------------------|-------------------|
| Process [nm]                  | 65                | 65         | 65          | 65                 | 28                       | 65                | 65                |
| Total Area [mm <sup>2</sup> ] | 0.0925            | 0.03       | 0.69        | 0.0627             | 0.0055                   | 0.012             | 0.002546          |
| Control                       | PID               | PI         | PI          | PI                 | PID                      | PID               | PI                |
| Synthesizable                 | Yes               | Yes        | Yes         | Yes                | No                       | Yes               | No                |
| Capacitor-Less                | Yes               | No         | Yes         | No                 | No                       | No                | No                |
| V <sub>IN</sub> [V]           | 0.7 - 1.3         | 0.45 - 1   | 0.7 - 1.2   | 0.5 - 1            | 0.4 - 0.55               | 0.5 - 1           | 0.8 - 1.2         |
| V <sub>OUT</sub> [V]          | 0.65 - 1.25       | 0.4 - 0.95 | 0.66 - 1.16 | 0.45 - 0.95        | 0.35 - 0.5               | 0.35 - 0.95       | 0.6 - 1.15        |
| V <sub>dropout</sub> [mV]     | 50                | 50         | 40          | 50                 | 50                       | 50                | 50 - 200          |
| Max I <sub>LOAD</sub> [mA]    | 80                | 3.356      | 235         | 30                 | 20                       | 2.8               | 50                |
| Ι <sub>Q</sub> [μΑ]           | 25 - 1050         | 8.1 - 258  | 116 - 874   | 10                 | 0.81                     | 45.2              | 26.25             |
| Peak Current Eff.<br>η [%]    | 99.75             | 99.2       | 99.86       | 99.97 <sup>d</sup> | 99.99 <sup>d</sup>       | 98.4 <sup>d</sup> | 99.95             |
| C <sub>out</sub> [pF]         | 9.5 <sup>c</sup>  | 100        | 0.98        | 100                | 24                       | 100               | 100               |
| Droop [mV]                    | 275               | 34         | 96          | 101                | 117                      | 46                | 100               |
| ΔI [mA]                       | 28                | 1.44       | 89          | 12                 | 20                       | 1.76              | 47                |
| T <sub>R</sub> [ps]           | 93.3              | 2361.1     | 1.057       | 841.7              | 140.4                    | 2613.6            | 212.8             |
| T <sub>EDGE</sub> [ns]        | < 0.1             | N/A        | 77          | 0.1                | 3                        | 0.1               | 20                |
| FOM 1 [ps]                    | 0.233             | 18.89      | 0.0015      | 0.253              | 0.014                    | 41.82             | 0.12              |
| FOM 2 [ps]                    | 0.023             | N/A        | 0.114       | 0.025              | 0.042                    | 4.182             | 2.4               |
| FOM 3 [ps]                    | 0.358             | N/A        | 53.901      | 0.268              | 0.164                    | 42.618            | 5.106             |

Table 3-1 State of the art comparison of the synthesizable PID controller in 65nm test chip

$$T_R = \frac{C_{OUT} \times \Delta V}{\Delta I_{LOAD}}$$

$$FOM \ 1 = T_R \times (1 - \eta)$$

$$FOM \ 2 = FOM \ 1 \times T_{EDGE} \ / \ 1ns$$

$$FOM \ 3 = \left(T_R + \frac{T_{EDGE}}{2}\right) \times (1 - \eta) \quad [27]$$

<sup>a</sup> Reported/Calculated values for 4MHz

<sup>b</sup> Reported/Calculated values for a single LDO output

<sup>c</sup> Includes the bondpad capacitance

<sup>d</sup> Calculated as 1 -  $(I_{Q,MIN}\,/\,I_{LOAD,MAX})$ 

Table 3-1 compares the measured data of this work with state-of-the-art LDO architectures. FOM 1 is the conventional FOM, which combines the measured response time ( $T_R$ ) and the peak current efficiency; the lower the FOM 1, the better the LDO performs. While [47] and [53] have the lowest FOM 1 values among the synthesizable controllers, the slew rate ( $\Delta I/T_{EDGE}$ ) of their load current steps is very low compared to the others, where  $T_{EDGE}$  is the transition time of the load step. To account for this trade-off, the comparison table also lists FOM 2, which applies a 0<sup>th</sup> order correction for the finite slew rate, and FOM 3, proposed in [31], which applies a 1<sup>st</sup> order correction for the load step slew rate. The proposed multi-BAWP control with delay-line based differential control achieves a peak current efficiency of 99.75%, a 93.3ps response time and a FOM 3 of about 358fs. This is 1.3 times the FOM 3 of [48], the best in Table 3-1 for a synthesizable LDO, but is achieved with less than one-tenth of the C<sub>OUT</sub>.
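The figures of merit defined under Table 3-1 are simple enough to check numerically. The short Python sketch below evaluates them for the "This work" column; the function name and the unit convention (pF, mV, mA, ns in, ps out) are our own choices for illustration, and T<sub>EDGE</sub> is taken as 0.1ns although the table reports it as an upper bound.

```python
def ldo_foms(c_out_pf, droop_mv, delta_i_ma, peak_eff_pct, t_edge_ns):
    """Return (T_R, FOM 1, FOM 2, FOM 3), all in picoseconds.

    Unit check: pF * mV / mA = 1e-12 * 1e-3 / 1e-3 s = 1 ps.
    """
    t_r_ps = c_out_pf * droop_mv / delta_i_ma
    loss = 1.0 - peak_eff_pct / 100.0          # (1 - eta)
    fom1 = t_r_ps * loss
    fom2 = fom1 * t_edge_ns / 1.0              # normalized to a 1 ns edge
    fom3 = (t_r_ps + t_edge_ns * 1000.0 / 2.0) * loss
    return t_r_ps, fom1, fom2, fom3


# "This work" column of Table 3-1: 9.5 pF, 275 mV droop, 28 mA step, 99.75 %, 0.1 ns edge
t_r, fom1, fom2, fom3 = ldo_foms(9.5, 275, 28, 99.75, 0.1)
print(f"T_R={t_r:.1f} ps  FOM1={fom1:.3f} ps  FOM2={fom2:.3f} ps  FOM3={fom3:.3f} ps")
```

With these inputs the sketch reproduces the tabulated values to within rounding, which also makes the effect of the T<sub>EDGE</sub>/2 term in FOM 3 easy to see for the slower load steps of the other works.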

## 3.3.2 Global Foundries 12nm FinFET Test Chip

We have observed a bug in the hierarchical verification of the 65nm PID controller chip resulting in the high on-chip parasitic series resistance as shown in Figure 3-9. To fix this problem and to demonstrate the portability of the synthesizable PID controller, we re-taped out the synthesizable PID controller design in Global Foundries (GF) 12nm FinFET technology.



#### 3.3.2.A Die Micrograph & Layout

Figure 3-12: Die Micrograph of the Synthesizable PID Controller test chip in Global Foundries 12nm FinFET

The test chip, shown in Figure 3-12, was fabricated in a 12nm CMOS FinFET process and occupies an area of 0.08mm<sup>2</sup> including the input decoupling capacitors (De-Caps). Similar to the TSMC 65nm test chip in section 3.3.1, the unit power switch, synchronous comparator and the transient detectors are manually laid out to fit in the standard cell rows for this technology. The rest of the design is described in behavioral and structural Verilog, using the aforementioned cells as auxiliary cells. The layout (including the power transistor array) is automatically generated using commercial place and route tools.



#### **3.3.2.B** Transient Measurements

Figure 3-13: 12nm Test Chip – PID Controller Transient response – Undershoot/Overshoot at 50mV dropout

The transient response of this chip suffers from problems similar to those of the TSMC 65nm test chip because of the off-chip loading. As shown in Figure 3-13, a droop of 227mV is observed for a 50mA load step with a transition time of 0.1ns. Even with a larger load step than in the TSMC 65nm test chip measurement, the droop improves. However, the output voltage is still affected by the inductor  $\partial i/\partial t$  drop and the high termination resistance of the oscilloscope, resulting in large output ripples during the transient.







matches with the estimated total power grid network (PGN) resistance using post-PEX netlist. Using Ball-Grid-Array (BGA) packaging instead of Wirebond packaging could have reduced the PGN resistance to even lower values.

The line regulation measurement for this chip is performed at a load current of 10mA. Figure 3-15 shows the line regulation measurement of the 12nm test chip, with a total line regulation value of 0.0008V/V at  $V_{REF} = 0.65V$ , corresponding to a dropout of 250mV from input voltage to output voltage.



Figure 3-15: 12nm Test Chip – PID Controller DC Measurements – Line Regulation Plot @ 10mA ILOAD

#### 3.3.3 Transient Measurement with On-Chip Load

As mentioned in section 3.3.1.B and section 3.3.2.B, the high speed transient measurement is affected by the parasitics added by off-chip load integration. To address this, the TSMC 65nm test chip was re-spun with an integrated on-chip load to observe the effect of delay line based differential control in the synthesizable PID controller architecture. As shown in Figure 3-16, the delay line based differential controller achieves a ~2x improvement in the undershoot value across different edge transition times (see also Figure 5-6) when compared to the Multi-Threshold BAWP based PI control. Simulations run in the 65nm and 12nm processes indicate that the T<sub>EDGE</sub> for which the differential control loop can trigger improves as we move to advanced PDKs.



Figure 3-16: 65nm Re-Spin Chip with On-Chip load – PID Controller Transient response – Undershoot/Overshoot at V<sub>IN</sub> = 1.3V, V<sub>REG</sub> = 1.25V (50mV dropout) & T<sub>EDGE</sub> = 4.5ns

#### 3.3.4 Automated DC Measurement Setup

DC measurements (load/line regulation plots) of an LDO involve measuring the output voltage for multiple  $V_{IN}$ ,  $V_{REF}$  &  $I_{LOAD}$  input values. Iterating over even ten points for each input type requires a total of one thousand measurements. For this reason, we implemented an automatic DC test setup using the Xilinx FPGA based Opal Kelly XEM6001 evaluation board. As shown in Figure 3-17(a) and Figure 3-17(b), the bare dies of the TSMC 65nm and GF 12nm test chips are wire-bonded in a Quad-Flat-No-lead (QFN) package, which is then soldered onto a custom designed printed circuit board (PCB). These DC test boards are referred to as the "LDO QFN Board" in Figure 3-18(b).
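To make the size of the sweep concrete, the following Python sketch enumerates a ten-point grid for each of the three inputs. It is not the test software used in this work: `measure_vout` is a hypothetical callback standing in for the FPGA and instrument access, and `fake_measure_vout` is a placeholder that simply assumes ideal regulation so that the example runs without hardware.

```python
from itertools import product


def linspace(start, stop, num):
    """Return `num` evenly spaced values from start to stop (inclusive)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def dc_sweep(vin_values, vref_values, iload_values, measure_vout):
    """Visit every (V_IN, V_REF, I_LOAD) combination and record the measured V_OUT."""
    rows = []
    for vin, vref, iload in product(vin_values, vref_values, iload_values):
        rows.append({"vin": vin, "vref": vref, "iload": iload,
                     "vout": measure_vout(vin, vref, iload)})
    return rows


def fake_measure_vout(vin, vref, iload):
    # Placeholder for the real measurement: an ideal LDO that never exceeds V_IN.
    return min(vin, vref)


rows = dc_sweep(linspace(0.7, 1.3, 10),       # V_IN in volts
                linspace(0.65, 1.25, 10),     # V_REF in volts
                linspace(1e-3, 80e-3, 10),    # I_LOAD in amperes
                fake_measure_vout)
print(len(rows))
```

Ten points per input already produce one thousand rows, which is why the sweep is driven programmatically rather than by hand.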



Figure 3-17: LDO QFN Board for (a) TSMC 65nm Test Chip (b) GF 12nm Test Chip




Figure 3-18: (a) DC measurement board connections including Programmable off-chip load, Opal Kelly XEM6001 FPGA and LDO QFN boards, (b) SPI & Power connection interfaces of the LDO QFN board

The entire test system is shown in Figure 3-18(a). It consists of a programmable off-chip load, implemented on a printed circuit board (PCB), which is connected to the Opal Kelly XEM6001 and the LDO board with a soldered QFN package. Figure 3-18(b) shows the SPI and power connection interfaces of the LDO QFN board that are connected to the programmable off-chip load. We can connect either the TSMC 65nm LDO QFN board or the GF 12nm FinFET LDO QFN board to the programmable load board, enabling us to use the same programmable off-chip load and XEM6001 to measure the DC characteristics of both test chips.

## **Chapter 4 Transient Spec Synthesis & Layout Automation**

## 4.1 Additional Aux-Cells

In addition to the PT\_UNIT\_CELL and LDO\_COMPARATOR auxcells specified in section 2.1.2, two additional auxcells, "TRANS\_DET" and "TD\_TUNE\_UNIT", have been implemented to support the automation of PID controller synthesis and layout generation. As shown in Figure 3-4, the TRANS\_DET auxcell is composed of two separate blocks for the detection of undershoot and overshoot, respectively. Each detector is composed of 4 inverter stages. The first inverter, powered by either  $V_{REG}$  (undershoot detector) or  $V_{REF}$  (overshoot detector), is configured for self-biasing and sets the threshold for the second inverter. The second inverter stage, powered by either  $V_{REF}$  (undershoot detector) or  $V_{REG}$  (overshoot detector), outputs a logic high when a transient event is detected. The third inverter stage is then used to level convert from  $V_{REF}/V_{REG}$  to VDD, and the fourth stage is a buffer that drives the BAWPs and the delay line based differential controller. Figure 3-4 also shows the block diagram of the TD\_TUNE\_UNIT auxcell. The number of TD\_TUNE\_UNITs shown in Figure 3-4 is a design parameter that is determined from the user input specifications. Standard cell row compliant layouts of the TRANS\_DET & TD\_TUNE\_UNIT auxcells are shown in Figure 4-1 (a) and (c), respectively.






Figure 4-1: (a) "TRANS\_DET" auxcell layout, (b) "TD\_TUNE\_UNIT" auxcell schematic and (c) layout in TSMC 65nm

#### 4.1.1 Characterization of Transient Detector Offsets

Similar to the characterization of PT\_UNIT\_CELL, we need to characterize the specs of TD\_UNIT\_CELL and its effect on the TRANS\_DET input offset. Figure 4-2 shows the tuning range of the transient detector with respect to the number of TD\_UNIT\_CELLs in the design. To improve the resolution, both PMOS transistors in this auxcell are sized to 4 times the minimum length (L<sub>MIN</sub>) of the technology node.



Figure 4-2: Schematic of the "TD\_TUNE\_UNIT" aux-cell and the simulated TD offset w.r.t. the ctrl. word



## 4.2 Design Synthesis Flow

Figure 4-3: Conversion of User Specifications to Synthesizable PID Design Parameters

The overall design generation and layout automation for the transient specification synthesis is similar to the baseline design automation flow shown in Figure 2-4, with the "Determine Architecture" and the "Design Parameter Generation" steps combined into a single step. Once the user enters the LDO specification requirements in step 1, the controller architecture and corresponding design parameters are generated using the synthesizable PID controller as the template. Figure 4-3 shows how the user input specifications are converted into the controller loop architecture and the design parameters for each feedback loop.

As shown in Figure 4-3, the automation flow first determines the design parameters corresponding to the DC input specifications (I<sub>LSB</sub>, overall current switch strength with respect to the fine current switch strength) in steps #1 and #2, with only the synchronous control as the feedback loop (i.e., an I-only controller architecture). In step #3, using the PT\_UNIT\_CELL I-V characteristics, the worst case undershoot or overshoot ( $\Delta V$ ) is estimated as the steady state output voltage when there is no feedback (with the assumption that all possible load transition times t<sub>edge</sub> are much smaller than the synchronous clock period t<sub>CLK</sub> = 1/f<sub>CLK</sub> = 1µs). If the output capacitor (C<sub>OUT</sub>) is very large, the worst case  $\Delta V$  is determined using the C<sub>OUT</sub> charging/discharging model over one synchronous clock period t<sub>CLK</sub>. If the estimated  $\Delta V$  is higher than the max allowed  $\Delta V$  specified in the user inputs, a PI controller architecture is adopted by adding a single bi-directional asynchronous wave pipeline (BAWP) as the coarse feedback control loop.

$$K_P > \frac{t_{BAWP}}{I_{LSB}} \left| \frac{I_{load,max} - I_{load,min}}{t_{edge}} \right|, \tag{4-1}$$

In step #4, the minimum proportional gain ( $K_P$ ) required to compensate for the worst case voltage correction is estimated as shown in equation 4-1, where  $I_{load,max}$ ,  $I_{load,min}$  are the LDO user input specifications and  $t_{BAWP}$  is the propagation delay between two consecutive BAWP stages.
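As a worked example of equation 4-1, the Python sketch below evaluates the bound and picks the smallest integer gain that satisfies it. The treatment of  $K_P$  as an integer, the function name and the numeric values of  $t_{BAWP}$  and I<sub>LSB</sub> are assumptions for illustration only; the load range and edge time are taken from the PI column of Table 4-1.

```python
import math


def min_proportional_gain(t_bawp_ns, i_lsb_ma, i_load_max_ma, i_load_min_ma, t_edge_ns):
    """Evaluate the right-hand side of equation 4-1 and the smallest integer K_P above it.

    Units: ns for times and mA for currents, so the bound is dimensionless.
    """
    slew_ma_per_ns = abs((i_load_max_ma - i_load_min_ma) / t_edge_ns)
    bound = (t_bawp_ns / i_lsb_ma) * slew_ma_per_ns
    k_p = math.floor(bound) + 1   # strict inequality: K_P must exceed the bound
    return k_p, bound


# Hypothetical values: t_BAWP = 0.1 ns and I_LSB = 0.1 mA are NOT taken from the design.
k_p, bound = min_proportional_gain(t_bawp_ns=0.1, i_lsb_ma=0.1,
                                   i_load_max_ma=60, i_load_min_ma=1, t_edge_ns=2)
print(k_p, bound)
```

The bound grows with the load slew rate and with the BAWP stage delay, so a slower pipeline or a finer I<sub>LSB</sub> both push the required gain up.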

Also, using equation 3-1 & equation 3-2, we determine the minimum  $V_{offset}$  required as the threshold for triggering the single BAWP loop during transients. In step #5,  $\Delta V$  is estimated again based on the PI feedback control and compared with the user input requirements to determine whether a differential control with multi-BAWP control (PID control) is required. In step #6, the differential feedback control design parameters are determined based on the slack in  $\Delta V$  due to PI control. Finally, in step #7, the number of BAWP stages for each loop and the length of the synchronous current switches are determined.
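The architecture decision embedded in steps #3 and #5 can be summarized as a short decision function. The sketch below is a simplification under stated assumptions: the function name and arguments are hypothetical, and the two  $\Delta V$  estimates are taken as given inputs rather than derived from the PT\_UNIT\_CELL characteristics as the generator does.

```python
def select_architecture(dv_i_only_mv, dv_pi_mv, dv_max_mv):
    """Pick the simplest controller whose estimated worst-case delta-V meets the spec.

    dv_i_only_mv: estimated delta-V with only the synchronous loop (step #3)
    dv_pi_mv:     estimated delta-V after adding a single BAWP loop (step #5)
    dv_max_mv:    maximum delta-V allowed by the user input specification
    """
    if dv_i_only_mv <= dv_max_mv:
        return "I-only"
    if dv_pi_mv <= dv_max_mv:
        return "PI"
    return "PID"


# Hypothetical estimates, in mV
print(select_architecture(dv_i_only_mv=300, dv_pi_mv=180, dv_max_mv=200))
```

Escalating only when the simpler loop misses the specification keeps the generated controller, and therefore its quiescent current and area, as small as the specification allows.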

#### 4.2.1 Supported User Input Specifications

Table 4-1 summarizes the DC and transient user input specifications for which the generator can synthesize an LDO design. The supported DC user specs are the input voltage ( $V_{IN}$ ) and load current range ( $I_{LOAD}$ ) for a dropout of 50mV, the output ripple and the output cap. As shown in Table 4-1, the output cap of the baseline design needs to be very high to maintain regulation, and the supported load current is limited because the overhead of the controller is limited by the ripple. Synthesis of transient specs is only supported by the PI/PID controllers. From the table, we can see that the supported load range increases for the PID controller compared with the PI controller.

Table 4-1 Supported LDO Generator User Input Specifications Using the Synthesizable PID Controller Template in 65nm Technology

| Input Specifications     | Controller 1      | Controller 2     | Controller 3                     |
|--------------------------|-------------------|------------------|----------------------------------|
| Architecture             | Baseline (I-only) | Single BAWP (PI) | Slope Detect w/ Multi-BAWP (PID) |
| I <sub>LOAD</sub> (mA)   | 1 - 25            | 1 - 60           | 1 - 70                           |
| $V_{IN}$ (V)             | 0.6 - 1.3         | 1.3              | 1.3                              |
| C <sub>OUT</sub> (pF)    | $\geq 1000$       | $\geq 5$         | $\geq 5$                         |
| Dropout Voltage (mV)     | 50                | 50               | 50                               |
| Steady State Ripple (mV) | $\geq 2$          | $\geq 2$         | $\geq 2$                         |
| Max $\Delta V$ (mV)      | NA                | $\geq 200$       | $\geq 150$                       |
| t <sub>edge</sub> (ns)   | NA                | $\geq 2$         | $\geq 2$                         |

## 4.3 Layout Construction



Figure 4-4: Generated top-level layout showing the sub-block placements

Figure 4-4 shows an example of the generated output layout of the PID controller. The lower half of the layout is used for the power transistor array and the top half contains the controller. The placements of the single BAWP implementation and the delay line based differential controller are also shown in Figure 4-4. Power transistor cells (PT\_UNIT\_CELLs) are placed using fencing constraints similar to the baseline design (section 2.3.3) and the rest of the components are placed at precise co-ordinates. These placement co-ordinates are calculated automatically by parsing the auxcell .lef files, making the process completely technology agnostic. Figure 4-5 shows the post-PEX transient performance of the output layout shown in Figure 4-4. This simulation shows that LDO designs with a fast response time (~10ps) and a low FOM (59.2fs) can be implemented using the PID controller.
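To illustrate how placement co-ordinates can be derived from .lef files, the Python sketch below reads the `SIZE` statement of each `MACRO` and abuts cells left to right in a row. It is a minimal illustration, not the generator's placement code: the regular expression handles only the simple case in which every macro has a `SIZE` statement, the cell dimensions in the sample text are made up, and the simple abutment strategy is an assumption.

```python
import re

# Matches "MACRO <name> ... SIZE <w> BY <h> ;" (assumes every macro has a SIZE line)
MACRO_SIZE = re.compile(r"MACRO\s+(\S+).*?SIZE\s+([\d.]+)\s+BY\s+([\d.]+)\s*;", re.S)


def macro_sizes(lef_text):
    """Return {macro_name: (width, height)} in the LEF's own distance units."""
    return {name: (float(w), float(h)) for name, w, h in MACRO_SIZE.findall(lef_text)}


def abut_row(cell_names, sizes, x0=0.0, y0=0.0):
    """Place cells side by side on one row, returning (name, x, y) tuples."""
    placements = []
    x = x0
    for name in cell_names:
        placements.append((name, x, y0))
        x += sizes[name][0]   # advance by the cell width read from the LEF
    return placements


# Sample LEF text with made-up dimensions for two of the auxcells
SAMPLE_LEF = """
MACRO TRANS_DET
  CLASS CORE ;
  SIZE 2.0 BY 1.8 ;
END TRANS_DET
MACRO TD_TUNE_UNIT
  CLASS CORE ;
  SIZE 3.0 BY 1.8 ;
END TD_TUNE_UNIT
"""

sizes = macro_sizes(SAMPLE_LEF)
row = abut_row(["TRANS_DET", "TD_TUNE_UNIT", "TD_TUNE_UNIT"], sizes)
print(row)
```

Because the cell dimensions come from the LEF rather than from hard-coded constants, the same script works unchanged when the auxcells are re-drawn for another technology node.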



Figure 4-5: Post-PEX Simulation of the Generated Layout

#### **Chapter 5 Hybrid LDO Architecture**

#### 5.1 Introduction

The main function of a regulator is to provide a clean noise free supply to the rest of the circuits, and this includes suppression of any noise added at the input of the regulator. Figure 5-1 illustrates the concept of power supply noise rejection in digital LDOs. The noise introduced into the source voltage of the LDO is suppressed at the output regulated voltage.



Figure 5-1: Power Supply Noise Rejection in Digital LDO

Digital LDOs often use a discretized time-domain fine loop, corresponding to a low gain at the zero/DC frequency. This low DC gain problem of DLDOs ultimately results in poor power supply rejection ratio (PSRR) of the LDO, and combined with the problem of LCOs in steady state, the application of DLDOs is limited almost exclusively to powering switching circuits and systems. For this reason, noise sensitive analog circuits like voltage controlled oscillators (VCOs), sensor front-ends, transceivers, etc. are traditionally powered using analog LDOs, which provide high DC gain. Some recent works address the problem of poor PSRR performance in DLDOs using different methods. [54] improves the PSRR by using NMOS current switches operating in sub-threshold instead of PMOS current switches, but requires a high dropout (>100mV) and the design of a voltage doubling charge pump to power the CTAT gate voltage control for the NMOS switches. [55] and [31] use a hybrid LDO architecture with a separate analog LDO control loop in parallel with the digital LDO control for localized and distributed power management, respectively. Furthermore, [31] achieves a programmable PSRR by implementing the ALDO and DLDO loops as tiles. However, [55] and [31] both require the design of an analog control loop that is completely separate from the synthesizable digital LDO.

#### 5.2 Proposed LDO Architecture

We propose the hybrid fine loop architecture shown in Figure 5-2 to improve the PSRR of the PID controller. In addition to the synthesizable digital I-only control, an analog amplifier is used to control the switch strength of ON power transistors in a steady state fine loop. This is similar to Analog-assisted Tri-Loop architectures [49], with the difference being that [49] uses an analog loop to achieve better transient response, while the proposed architecture uses an analog loop to achieve better PSRR in steady state. A NAND based buffer is used to buffer the output digital control word and feed in an analog "ON" voltage to the switches.



Figure 5-2: Block diagram of the proposed fine loop for a hybrid LDO



Figure 5-3: Block Diagram of the proposed hybrid LDO architecture (synthesizable PID + hybrid fine loop)

The complete proposed hybrid LDO architecture, including the synthesizable PID controller and the hybrid fine loop, is shown in Figure 5-3. To facilitate the stability of the hybrid architecture, a steady state detector is implemented to put the fine digital control loop in the dead-zone and enable the analog amplifier to correct for the final error in digital control. Moreover, since the amplifier only affects the switch strength of the ON transistors, the dead-zone is entered when the synchronous comparator output goes from logic "0" to logic "1". This architecture requires the design of a low bandwidth, high gain analog amplifier in addition to the design of the PID controller. With the automation of amplifiers using existing netlist based analog circuit generators, this architecture can serve as a universal template for automating LDO design generation for both digital and sensitive analog loads. In this work, we used a single stage cross-coupled amplifier to control the logic "0" voltage of the NAND based buffer (shown in Figure 5-4), and the amplifier was designed manually.



Figure 5-4: Differential to single ended amplifier with Common source buffer

## 5.3 TSMC 65nm Hybrid LDO Test Chip

### 5.3.1 Die Micrograph & Layout

The hybrid LDO test chip is implemented in TSMC 65nm technology and occupies an active area of 0.09164mm<sup>2</sup> (Figure 5-5). In addition, because of the transient measurement problems seen in section 3.3.1.B and section 3.3.2.B, we have implemented an on-chip load as shown in Figure 5-5. While the digital controller is designed using commercial CAD tools such as Synopsys "Design Compiler" for synthesis and Cadence "INNOVUS" for APR, the design, layout implementation & integration of the amplifier are done entirely manually.


Figure 5-5: Die Micrograph of the Hybrid LDO test chip in TSMC 65nm along with on-chip load

### 5.3.2 Transient Measurements

Since the analog amplifier is disabled during transients, the performance of the hybrid LDO architecture is the same as the synthesizable PID controller transient performance shown in Figure 3-16 for  $T_{EDGE} = 4.5$ ns. Thanks to the on-chip load integration, we avoid the wirebond inductor  $\partial i/\partial t$  drop and can control the T<sub>EDGE</sub> more precisely. Figure 5-6 shows the transient response of the Hybrid LDO/Synthesizable PID controller for T<sub>EDGE</sub> = 2.5ns. From the measurement, we can see that the undershoot improves by approximately 2x when the delay line based differential controller is enabled, with an overall undershoot value of 44mV for a 57mA load step with a 2.5ns transition time.



Figure 5-6: 65nm Hybrid LDO test chip with on-chip load – Transient response – Undershoot/Overshoot at  $V_{IN} = 1.3V$ ,  $V_{REG} = 1.25V$  (50mV dropout) &  $T_{EDGE} = 2.5ns$ 

## 5.3.3 DC Measurements

Figure 5-7 shows the load regulation plot of the 65nm hybrid LDO test chip. We measured a load regulation value of 0.014 mV/mA, and the maximum supported load current at  $1.3 \text{V} \text{V}_{\text{IN}}$  and a 50mV dropout is 82mA. The line regulation measurement for this chip is performed at a load current of 20mA. Figure 5-8 shows the line regulation measurement of the test chip, with a total line regulation value of 0.00065 V/V at  $V_{\text{REF}} = 0.65 \text{V}$ , corresponding to a dropout from input voltage to output voltage of 50mV to 650mV.



Figure 5-7: 65nm Hybrid LDO test chip – DC Measurements – Load Regulation Plot @50mV dropout



Figure 5-8: 65nm Hybrid LDO test chip – DC Measurements – Line Regulation Plot @20mA ILOAD

The amplifier consumes a power of  $230\mu$ W and Figure 5-9 shows the PSRR performance of the LDO when the analog loop is enabled and disabled. The LDO with analog loop enabled achieves a total PSRR of -12.16dB at 100Hz frequency. This corresponds to an additional supply noise rejection of ~16.5dB at 100Hz frequency because of the analog loop integration.
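For readers who prefer linear quantities, the decibel figures above can be converted to voltage ratios. The Python sketch below assumes the usual 20·log<sub>10</sub> voltage-ratio convention for PSRR, with negative values denoting attenuation from supply to output; the function name and the 10mV supply ripple amplitude are hypothetical and used only to illustrate the conversion.

```python
def db_to_voltage_ratio(db):
    """Convert a value in dB to a linear voltage ratio (20*log10 convention)."""
    return 10 ** (db / 20)


psrr_enabled_db = -12.16        # measured PSRR at 100 Hz with the analog loop enabled
extra_rejection_db = 16.5       # additional rejection attributed to the analog loop

supply_ripple_mv = 10.0         # hypothetical supply ripple amplitude
output_ripple_mv = supply_ripple_mv * db_to_voltage_ratio(psrr_enabled_db)
improvement_factor = db_to_voltage_ratio(extra_rejection_db)

print(f"output ripple: {output_ripple_mv:.2f} mV, improvement: {improvement_factor:.2f}x")
```

Under this convention, a PSRR of -12.16dB passes roughly a quarter of the supply ripple amplitude to the output, and 16.5dB of additional rejection corresponds to an output ripple that is between six and seven times smaller than with the analog loop disabled.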



Figure 5-9: 65nm Hybrid LDO test chip – DC Measurements – PSRR measurement

#### 5.3.4 Performance Comparison

Table 5-1 compares the measured data of the hybrid LDO with the synthesizable PID controller from section 3.3.1 and other state-of-the-art LDO architectures. From the table, we can see that the hybrid LDO architecture with the analog fine loop achieves better PSRR performance than the fully-digital architecture. While [54] achieves the best PSRR of -25dB using NMOS switches operating in the sub-threshold region, its maximum supported load current and the transient response of its controller are severely degraded. [56] also achieves good PSRR performance, but loses synthesizability due to its analog LDO architecture. In addition, the output caps used by both [54] and [56] are much higher than those of the fully synthesizable PID controller architecture and the hybrid LDO architecture presented in this work. In addition to the improvement in PSRR performance, the hybrid LDO also achieves the fastest response time when compared to the other works.

|                                | This Work: Hybrid LDO     | This Work: Synthesizable PID Ctrl [51] | [54]                        | [56]              |
|--------------------------------|---------------------------|----------------------------------------|-----------------------------|-------------------|
| Process [nm]                   | 65                        | 65                                     | 65                          | 65                |
| Active Area [mm <sup>2</sup>]  | 0.09164                   | 0.0925                                 | 0.037                       | 0.0234            |
| Control                        | PID + Analog<br>Fine Loop | PID                                    | Digital with<br>NMOS switch | ALDO              |
| Synthesizable                  | Mostly                    | Yes                                    | Yes                         | No                |
| V <sub>IN</sub> [V]            | 0.7 - 1.3                 | 0.7 - 1.3                              | 0.5 - 1                     | 1.25              |
| V <sub>OUT</sub> [V]           | 0.65 - 1.25               | 0.65 - 1.25                            | 0.45 - 0.95                 | 1                 |
| V <sub>dropout</sub> [mV]      | 50 - 650                  | 50 - 650                               | 50 - 300                    | 150               |
| Max I <sub>LOAD</sub> [mA]     | 82                        | 80                                     | 0.0087                      | 10                |
| I <sub>Q</sub> [μA]            | 108 - 1300                | 25 - 1050                              | 0.022                       | 50                |
| Peak Current Eff. η [%]        | 99.87 <sup>a</sup>        | 99.75                                  | 99.75 <sup>a</sup>          | 99.5 <sup>a</sup> |
| C <sub>out</sub> [pF]          | 4.3 <sup>b</sup>          | 9.5 <sup>b</sup>                       | 90                          | 140               |
| Droop [mV]                     | 44                        | 275                                    | 50                          | 43                |
| ΔI [mA]                        | 57                        | 28                                     | 0.00825                     | 10                |
| T <sub>R</sub> [ps]            | 3.32                      | 93.3                                   | 545,454                     | 602               |
| T <sub>EDGE</sub> [ns]         | 2.5                       | < 0.1                                  | < 0.1                       | 0.2               |
| PSRR [dB] @1kHz                | -12.25                    | -3.623                                 | -25                         | -20               |
| FOM 1 [ps]                     | 0.0043                    | 0.233                                  | 1380                        | 3.01              |
| FOM 2 [ps]                     | 0.0108                    | 0.023                                  | 138                         | 0.602             |
| FOM 3 [ps]                     | 1.63                      | 0.358                                  | 1380                        | 3.51              |

Table 5-1 State-of-the-art comparison of the synthesizable PID controller in the 65nm test chip

$$T_R = \frac{C_{OUT} \times \Delta V}{\Delta I_{LOAD}}$$

<sup>a</sup> Calculated as $1 - (I_{Q,MIN} / I_{LOAD,MAX})$

$$FOM\ 1 = T_R \times (1 - \eta)$$

$$FOM\ 2 = FOM\ 1 \times \frac{T_{EDGE}}{1\,\text{ns}}$$

$$FOM\ 3 = \left(T_R + \frac{T_{EDGE}}{2}\right) \times (1 - \eta) \quad \text{[27]}$$
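As a worked check of the definitions above, the following minimal Python sketch recomputes T<sub>R</sub> and the three figures of merit from the table entries. The function names are illustrative and not part of the thesis tooling, and the unit handling is an assumption: C<sub>OUT</sub> in pF, droop in mV, and ΔI<sub>LOAD</sub> in mA, so that T<sub>R</sub> comes out in ps, with T<sub>EDGE</sub> given in ns and converted to ps for FOM 3.

```python
def response_time_ps(c_out_pf: float, droop_mv: float, delta_i_ma: float) -> float:
    """T_R = C_OUT * dV / dI_LOAD. With pF, mV and mA the result is in ps."""
    return c_out_pf * droop_mv / delta_i_ma


def figures_of_merit(c_out_pf, droop_mv, delta_i_ma, t_edge_ns, eta_percent):
    """Return (T_R, FOM 1, FOM 2, FOM 3), all in ps."""
    t_r = response_time_ps(c_out_pf, droop_mv, delta_i_ma)
    loss = 1.0 - eta_percent / 100.0                  # (1 - eta) as a fraction
    fom1 = t_r * loss
    fom2 = fom1 * t_edge_ns / 1.0                     # T_EDGE normalized to 1 ns
    fom3 = (t_r + 0.5 * t_edge_ns * 1000.0) * loss    # T_EDGE converted from ns to ps
    return t_r, fom1, fom2, fom3


# Entries of the first data column of Table 5-1:
# C_out = 4.3 pF, droop = 44 mV, dI = 57 mA, T_EDGE = 2.5 ns, eta = 99.87 %
t_r, fom1, fom2, fom3 = figures_of_merit(4.3, 44, 57, 2.5, 99.87)
print(f"T_R = {t_r:.2f} ps, FOM 1 = {fom1:.4f} ps, "
      f"FOM 2 = {fom2:.4f} ps, FOM 3 = {fom3:.2f} ps")
```

For this column the computed values agree, after rounding, with the tabulated 3.32 ps, 0.0043 ps, 0.0108 ps, and 1.63 ps. The same call with the last column's entries (140 pF, 43 mV, 10 mA, 0.2 ns, 99.5 %) gives 602 ps, 3.01 ps, 0.602 ps, and 3.51 ps.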

#### **Chapter 6 Conclusion and Future Work**

#### 6.1 Summary

In recent years, most electronic chips have been implemented as systems-on-chip (SoCs), and the size and complexity of these SoCs have increased tremendously. In addition, the advent of the Internet of Things (IoT), which aims to turn more and more real-world objects into smart devices, has created a high demand for faster SoC design cycles. One of the key bottlenecks in reducing SoC design time is the lack of a standard design flow for analog circuit design automation. LDO regulators are ubiquitous analog blocks in SoCs, with as many as 50 to 100 LDOs present in a modern SoC to power various types of analog and digital (switching) circuits. The main objective of this thesis is to develop an automated method to design and lay out LDO solutions that can be integrated into SoC designs for a wide range of applications.

In the first part of the thesis, I introduced the LDO design synthesis and layout automation framework (the LDO generator tool), which uses a bi-directional shift-register-based synchronous controller architecture as the baseline LDO design. The framework is based on the concept of cell-based analog design: the analog/mixed-signal functionality of the LDO is abstracted and discretized into simple circuits called auxiliary cells, similar to the standard cells used in digital design automation. Using the LDO generator tool, designs with a 1-bit comparator and with a stochastic ADC for error detection were implemented and verified in measurements to meet the user-provided DC specifications (input voltage, dropout, and maximum load current). The LDO generator was also ported to multiple commercial process design kits (PDKs) and an open-source PDK, demonstrating the robustness of the framework. Furthermore, the LDO generator tool was integrated within an SoC generator tool to realize full SoC design automation.
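The control principle of the baseline architecture can be illustrated with a small behavioral sketch. This is not the generator's implementation: it assumes a static model with no output capacitor, an ideal current-sink load, identical unit power switches with conductance `g_unit`, and hypothetical numeric values chosen only for illustration. On every clock cycle the 1-bit comparator decides whether the thermometer-coded shift register turns one more unit switch on or off.

```python
def shift_register_ldo(v_in, v_ref, i_load, g_unit, n_switches, cycles):
    """Return the number of enabled unit switches after each clock cycle."""
    enabled = 0                      # thermometer-coded register, all switches off
    history = []
    for _ in range(cycles):
        # Static output voltage for a current-sink load (assumption: no C_OUT).
        v_out = v_in - i_load / (enabled * g_unit) if enabled else 0.0
        if v_out < v_ref:
            # Comparator reports undervoltage: shift to turn one more switch on.
            enabled = min(enabled + 1, n_switches)
        else:
            # Comparator reports overvoltage: shift back to turn one switch off.
            enabled = max(enabled - 1, 0)
        history.append(enabled)
    return history


# Hypothetical operating point: 1.0 V input, 0.9 V target, 10.25 mA load,
# 1 mS per unit switch, 128 switches, 200 clock cycles.
history = shift_register_ldo(v_in=1.0, v_ref=0.9, i_load=10.25e-3,
                             g_unit=1e-3, n_switches=128, cycles=200)
print(history[-4:])
```

In this model the register ramps by one switch per cycle until the output crosses the reference and then toggles between the two adjacent switch counts that bracket the ideal value, which is the steady-state limit cycle expected of a 1-bit bang-bang loop.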

In the second part of the thesis, I introduced a novel synthesizable PID controller architecture for digital LDOs. The PID controller includes a bi-directional shift-register-based synchronous control for the fine control loop, multiple multi-threshold-triggered bi-directional asynchronous wave pipelines that implement an adaptive proportional gain K<sub>P</sub>, and a delay-line-based single-trigger differential control. Test chips of this controller were implemented and tested in 65nm CMOS and 12nm FinFET technologies. Furthermore, I presented a methodology that automates the synthesis of the PID controller design parameters so that the transient specifications (output capacitance, maximum tolerable undershoot/overshoot voltage, and minimum load transition time) can be synthesized. Finally, I presented a novel hybrid LDO architecture to improve the PSRR of the proposed PID controller.

#### 6.2 Future Work

Modern SoCs implement multiple power domains and tune them aggressively, so reducing the design time of LDOs is crucial to shortening the overall SoC design cycle. Consequently, the work in this thesis can be extended in several directions.

#### 6.2.1 Automating the Auxcell Generation

A natural next step to complete the work presented in this thesis is to integrate existing transistor-level/netlist-based analog synthesizers to automate the design and implementation of the auxcells. This would realize a completely hands-off LDO generator tool that can be ported easily to different technologies.

#### 6.2.2 Automating the Hybrid Analog Synthesis & Implementation

Existing transistor-level/netlist-based analog synthesizers can also be integrated to automate the design and implementation of the analog amplifier, thereby automating LDO design synthesis to meet a specific PSRR requirement.

#### 6.2.3 Synthesis of Dynamic Range or Efficiency Specifications

The design space of LDOs for modern SoCs is wide in terms of PSRR, dropout, efficiency, transient, and dynamic range requirements. While this thesis focuses on the synthesis of DC and transient specifications, a possible direction is to develop other cell-based architectures that can be used to synthesize LDO dynamic range and efficiency.

#### 6.2.4 Automated PGN Analysis and Synthesis of Distributed LDOs

Given the large die size and low operating voltage of modern SoCs, high-power LDOs that deliver hundreds of amperes of current cannot be localized and require a distributed implementation of the power transistors and the control loops. An automated framework for analyzing the IR drops in the power grid network (PGN) and suppressing cross-talk between local LDO controllers within the same voltage domain could be a significant improvement on the work done in this thesis.

## **Bibliography**

- [1] M. Roser, H. Ritchie, and E. Mathieu, "What is Moore's Law?," *Our World in Data*, 2023. https://ourworldindata.org/moores-law (accessed Aug. 02, 2023).
- [2] J. Scheible and J. Lienig, "Automation of analog IC layout - Challenges and solutions," in *Proceedings of the International Symposium on Physical Design*, 2015. doi: 10.1145/2717764.2717781.
- [3] D. Marolt, M. Greif, J. Scheible, and G. Jerke, "PCDS: A new approach for the development of circuit generators in analog IC design," in *22nd Austrian Workshop on Microelectronics, Austrochip 2014 - Proceedings*, 2014. doi: 10.1109/Austrochip.2014.6946310.
- [4] M. Barros, J. Guilherme, and N. Horta, "Analog circuits and systems optimization based on evolutionary computation techniques," in *SM2ACD 2008 - 10th International Workshop on Symbolic and Numerical Methods, Modeling and Applications to Circuit Design, Proceedings*, 2008.
- [5] J. Lee and J. Kim, "Investigations on the optimal support vector machine classifiers for predicting design feasibility in analog circuit optimization," *Journal of Semiconductor Technology and Science*, vol. 15, no. 5, 2015, doi: 10.5573/JSTS.2015.15.5.437.
- [6] I. Fadloullah, A. Mechaqrane, and A. Ahaitouf, "Butterworth Low Pass filter design using evolutionary algorithm," in *2017 International Conference on Wireless Technologies, Embedded and Intelligent Systems, WITS 2017*, 2017. doi: 10.1109/WITS.2017.7934661.

- [7] M. Bhanja and B. N. Ray, "Synthesis Procedure of Configurable Building Block-Based Linear and Nonlinear Analog Circuits," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 36, no. 12, 2017, doi: 10.1109/TCAD.2017.2681062.
- [8] A. C. Kammara, L. Palanichamy, and A. König, "Multi-objective optimization and visualization for analog design automation," *Complex & Intelligent Systems*, vol. 2, no. 4, 2016, doi: 10.1007/s40747-016-0027-3.
- [9] A. C. Kammara and A. König, "Absynth: A Comprehensive Approach for Full Front to Back Analog Design Automation," in *SMACD 2018 - 15th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design*, 2018. doi: 10.1109/SMACD.2018.8434862.
- [10] N. Lourenço, R. Martins, and N. Horta, *Automatic analog IC sizing and optimization constrained with PVT corners and layout effects*. 2016. doi: 10.1007/978-3-319-42037-0.
- [11] G. G. E. Gielen, "CAD tools for embedded analogue circuits in mixed-signal integrated systems on chip," in *IEE Proceedings: Computers and Digital Techniques*, 2005. doi: 10.1049/ip-cdt:20045116.
- [12] G. G. E. Gielen, "CAD tools for embedded analogue circuits in mixed-signal integrated Systems-on-Chip," in *System-on-Chip: Next Generation Electronics*, 2006. doi: 10.1049/PBCS018E\_ch15.
- [13] J. Lee and Y. Bin Kim, "ASLIC: A low power CMOS analog circuit design automation," in *Proceedings - International Symposium on Quality Electronic Design, ISQED*, 2005. doi: 10.1109/ISQED.2005.23.

- [14] J. Lee and Y. Bin Kim, "ASLIC: A low power CMOS analog circuit design automation," *Integration, the VLSI Journal*, vol. 39, no. 3, 2006, doi: 10.1016/j.vlsi.2005.04.001.
- [15] S. Zhang *et al.*, "An efficient multi-fidelity Bayesian optimization approach for analog circuit synthesis," in *Proceedings - Design Automation Conference*, 2019. doi: 10.1145/3316781.3317765.
- [16] N. Lourenço, R. Martins, and N. Horta, "Layout-aware sizing of analog ICs using floorplan & routing estimates for parasitic extraction," in *Proceedings - Design, Automation and Test in Europe, DATE*, 2015. doi: 10.7873/date.2015.0411.
- [17] B. Cardoso, R. Martins, N. Lourenço, and N. Horta, "AIDA-PEx: Accurate parasitic extraction for layout-aware analog integrated circuit sizing," in *2015 11th Conference on Ph.D. Research in Microelectronics and Electronics, PRIME 2015*, 2015. doi: 10.1109/PRIME.2015.7251351.
- [18] H. Chen, M. Liu, X. Tang, K. Zhu, N. Sun, and D. Z. Pan, "Challenges and opportunities toward fully automated analog layout design," *Journal of Semiconductors*, vol. 41, no. 11, 2020, doi: 10.1088/1674-4926/41/11/111407.
- [19] J. M. Cohn, D. J. Garrod, R. A. Rutenbar, and L. R. Carley, *Analog Device-Level Layout Automation*, vol. 263. Boston, MA: Springer US, 1994. doi: 10.1007/978-1-4615-2756-5.
- [20] M. P. H. Lin, Y. W. Chang, and C. M. Hung, "Recent research development and new challenges in analog layout synthesis," in *Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC*, 2016. doi: 10.1109/ASPDAC.2016.7428080.
- [21] R. A. Rutenbar, "Analog Circuit and Layout Synthesis Revisited," 2015. doi: 10.1145/2717764.2717780.
- [22] A. Hastings, *The Art of Analog Layout*. Prentice Hall, 2009.

- [23] K. Kunal *et al.*, "INVITED: ALIGN - Open-source analog layout automation from the ground up," in *Proceedings - Design Automation Conference*, 2019. doi: 10.1145/3316781.3323471.
- [24] R. Martins, N. Lourenço, and N. Horta, "LAYGEN II-automatic layout generation of analog integrated circuits," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 32, no. 11, 2013, doi: 10.1109/TCAD.2013.2269050.
- [25] R. Camposano, "The Insanity of DRC Rules and DFM at 10nm and below," DAC 2016 Panel Discussion on DFM/DRC, Austin, 2016. https://si2.org/2017/05/02/dac-2016-paneldiscussion-dfmdrc-insanity-drc-rules-dfm-10nm/ (accessed Aug. 02, 2023).
- [26] G. Gielen *et al.*, "Tomorrow's analog: Just dead or just different?," in *Proceedings - Design Automation Conference*, 2006. doi: 10.1145/1146909.1147089.
- [27] O. Abdelatty, A. Alghaihab, Y. K. Cherivirala, S. Kamineni, B. Calhoun, and D. D. Wentzloff, "A 300µW Bluetooth-Low-Energy Backchannel Receiver Employing a Discrete-Time Differentiator-Based Coherent GFSK Demodulation," in *Digest of Papers - IEEE Radio Frequency Integrated Circuits Symposium*, 2021. doi: 10.1109/RFIC51843.2021.9490429.
- [28] M. Moosavifar, Y. K. Cherivirala, and D. D. Wentzloff, "A 320µW Receiver with -58dB SIR Leveraging a Time-Varying N-Path Filter," in *Digest of Papers - IEEE Radio Frequency Integrated Circuits Symposium*, 2022. doi: 10.1109/RFIC54546.2022.9863165.
- [29] Y. K. Cherivirala, H. Lyu, H. A. Alhowri, and A. Babakhani, "Wirelessly powered microchips for mapping hydraulic fractures," *SPE Journal*, vol. 24, no. 4, 2019, doi: 10.2118/194491-PA.

- [30] Y. K. Cherivirala, "Design of CMOS Based High Temperature Sensor with Integrated Memory," Rice University, United States -- Texas, 2018. Accessed: Aug. 07, 2023. [Online]. Available: https://proxy.lib.umich.edu/login?url=https://www.proquest.com/dissertationstheses/design-cmos-based-high-temperature-sensor-with/docview/2572635686/se-2

- [31] X. Liu *et al.*, "A Universal Modular Hybrid LDO with Fast Load Transient Response and Programmable PSRR in 14-nm CMOS Featuring Dynamic Clamp Strength Tuning," *IEEE J Solid-State Circuits*, vol. 56, no. 8, 2021, doi: 10.1109/JSSC.2021.3055742.
- [32] B. Qiang, Y. Zushu, Z. Yuanfu, and Y. Suge, "Analysis and design of voltage controlled current source for LDO frequency compensation," in *2005 IEEE Conference on Electron Devices and Solid-State Circuits, EDSSC*, 2005. doi: 10.1109/EDSSC.2005.1635282.
- [33] Y. Okuma *et al.*, "0.5-V input digital LDO with 98.7% current efficiency and 2.7-μA quiescent current in 65nm CMOS," in *Proceedings of the Custom Integrated Circuits Conference*, 2010. doi: 10.1109/CICC.2010.5617586.
- [34] T. Ajayi *et al.*, "An Open-source Framework for Autonomous SoC Design with Analog Block Generation," in *IEEE/IFIP International Conference on VLSI and System-on-Chip*, *VLSI-SoC*, 2020. doi: 10.1109/VLSI-SOC46417.2020.9344104.
- [35] T. Ajayi *et al.*, "Fully-Autonomous SoC Synthesis Using Customizable Cell-Based Analog and Mixed-Signal Circuits Generation," in *IFIP Advances in Information and Communication Technology*, 2021. doi: 10.1007/978-3-030-81641-4\_4.
- [36] K. Kwon, O. Abdelatty, and D. Wentzloff, "Open-Source Fully-Synthesizable ADPLL for a Bluetooth Low-Energy Transmitter in 12nm FinFET Technology," in *Digest of Papers - IEEE Radio Frequency Integrated Circuits Symposium*, 2022. doi: 10.1109/RFIC54546.2022.9863190.

- [37] K. Kwon, O. A. B. Abdelatty, and D. D. Wentzloff, "PLL Fractional Spur's Impact on FSK Spectrum and a Synthesizable ADPLL for a Bluetooth Transmitter," *IEEE J Solid-State Circuits*, vol. 58, no. 5, 2023, doi: 10.1109/JSSC.2023.3236640.
- [38] S. Kamineni, S. Gupta, and B. H. Calhoun, "MemGen: An Open-Source Framework for Autonomous Generation of Memory Macros," in *Proceedings of the Custom Integrated Circuits Conference*, 2021. doi: 10.1109/CICC51472.2021.9431501.
- [39] Y. Okuma *et al.*, "0.5-V input digital low-dropout regulator (LDO) with 98.7% current efficiency in 65nm CMOS," *IEICE Transactions on Electronics*, vol. E94-C, no. 6, 2011, doi: 10.1587/transele.E94.C.938.
- [40] S. Weaver, B. Hershberg, and U. K. Moon, "Digitally synthesized stochastic flash ADC using only standard digital cells," in *IEEE Symposium on VLSI Circuits, Digest of Technical Papers*, 2011.
- [41] S. Weaver, B. Hershberg, and U. K. Moon, "Digitally synthesized stochastic flash ADC using only standard digital cells," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 61, no. 1, 2014, doi: 10.1109/TCSI.2013.2268571.
- [42] S. Kamineni, A. Sharma, R. Harjani, S. S. Sapatnekar, and B. H. Calhoun, "AuxcellGen: A Framework for Autonomous Generation of Analog and Memory Unit Cells," in *2023 Design, Automation & Test in Europe Conference & Exhibition (DATE)*, 2023, pp. 1–6. doi: 10.23919/DATE56975.2023.10137270.
- [43] Q. Zhang *et al.*, "An Open-Source and Autonomous Temperature Sensor Generator Verified With 64 Instances in SkyWater 130 nm for Comprehensive Design Space Exploration," *IEEE Solid State Circuits Lett*, vol. 5, 2022, doi: 10.1109/LSSC.2022.3188925.
- [44] Y. K. Cherivirala, M. Saligane, and D. D. Wentzloff, "An Open Source Compatible Framework to Fully Autonomous Digital LDO Generation," in *2023 IEEE International Symposium on Circuits and Systems (ISCAS)*, 2023, pp. 1–5. doi: 10.1109/ISCAS46773.2023.10181884.
- [45] D. D. Wentzloff, "FASoC: Fully-Autonomous SoC Synthesis using Customizable Cell-Based Synthesizable Analog Circuits," 2019. https://fasoc.engin.umich.edu/ (accessed Aug. 02, 2023).
- [46] D. Kim, J. Kim, H. Ham, and M. Seok, "A 0.5V-VIN 1.44mA-class event-driven digital LDO with a fully integrated 100pF output capacitor," in *Digest of Technical Papers - IEEE International Solid-State Circuits Conference*, 2017. doi: 10.1109/ISSCC.2017.7870403.
- [47] M. A. Akram, W. Hong, and I. C. Hwang, "Capacitorless self-clocked all-digital low-dropout regulator," *IEEE J Solid-State Circuits*, vol. 54, no. 1, 2019, doi: 10.1109/JSSC.2018.2871039.
- [48] J. Bang, S. Choi, S. Yoo, J. Lee, J. Kim, and J. Choi, "A 0.0084-mV-FOM, Fast-Transient and Low-Power External-Clock-Less Digital LDO Using a Gear-Shifting Comparator for the Wide-Range Adaptive Sampling Frequency," in *ESSCIRC 2021 - IEEE 47th European Solid State Circuits Conference, Proceedings*, 2021. doi: 10.1109/ESSCIRC53450.2021.9567821.
- [49] X. Ma, Y. Lu, R. P. Martins, and Q. Li, "A 0.4V 430nA quiescent current NMOS digital LDO with NAND-based analog-assisted loop in 28nm CMOS," in *Digest of Technical Papers - IEEE International Solid-State Circuits Conference*, 2018. doi: 10.1109/ISSCC.2018.8310306.

- [50] S. J. Kim, D. Kim, H. Ham, J. Kim, and M. Seok, "A 67.1-ps FOM, 0.5-V-hybrid digital LDO with asynchronous feedforward control via slope detection and synchronous PI with state-based hysteresis clock switching," *IEEE Solid State Circuits Lett*, vol. 1, no. 5, 2018, doi: 10.1109/LSSC.2018.2875828.
- [51] Y. K. Cherivirala and D. D. Wentzloff, "A Capacitor-Less Digital LDO Regulator With Synthesizable PID Controller Achieving 99.75% Efficiency and 93.3-ps Response Time in 65 nm," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 70, no. 5, pp. 1769–1773, 2023, doi: 10.1109/TCSII.2023.3257686.
- [52] S. Bin Nasir, Y. Lee, and A. Raychowdhury, "Modeling and analysis of system stability in a distributed power delivery network with embedded digital linear regulators," in *Proceedings - International Symposium on Quality Electronic Design, ISQED*, 2014. doi: 10.1109/ISQED.2014.6783308.
- [53] Y. Lu, F. Chen, and P. K. T. Mok, "A Single-Controller-Four-Output Analog-Assisted Digital LDO with Adaptive-Time-Multiplexing Control in 65-nm CMOS," in *ESSCIRC 2019 - IEEE 45th European Solid State Circuits Conference*, 2019. doi: 10.1109/ESSCIRC.2019.8902511.
- [54] S. J. Kim, S. B. Chang, and M. Seok, "A High PSRR, Low Ripple, Temperature-Compensated, 10-µA-Class Digital LDO Based on Current-Source Power-FETs for a Sub-mW SoC," *IEEE Solid State Circuits Lett*, vol. 4, pp. 88–91, 2021, doi: 10.1109/LSSC.2021.3070556.

- [55] X. Liu *et al.*, "A Dual-Rail Hybrid Analog/Digital Low-Dropout Regulator With Dynamic Current Steering for a Tunable High PSRR and High Efficiency," *IEEE Solid State Circuits Lett*, vol. 3, pp. 526–529, 2020, doi: 10.1109/LSSC.2020.3035675.
- [56] Y. Lu, W. H. Ki, and C. P. Yue, "A 0.65ns-response-time 3.01ps FOM fully-integrated low-dropout regulator with full-spectrum power-supply-rejection for wideband communication systems," in *Digest of Technical Papers - IEEE International Solid-State Circuits Conference*, 2014. doi: 10.1109/ISSCC.2014.6757446.