A 1.02 µW Battery-Less, Continuous Sensing and Post-Processing SiP for Wearable Applications

Christopher J. Lukas, Farah B. Yahya, Jacob Breiholz, Abhishek Roy, Member, IEEE, Xing Chen, Harsh N. Patel, Senior Member, IEEE, NingXi Liu, Avish Kosari, Shuo Li, Divya Akella Kamakshi, Oluseyi Ayorinde, David D. Wentzloff, Senior Member, IEEE, and Benton H. Calhoun, Senior Member, IEEE

Abstract—Improving system lifetime and robustness is key to advancing self-powered platforms for real world applications. A complete self-powered, battery-less, wearable platform requires a microwatt-power system-on-chip (SoC), operating reliably within this budget, capable of surviving long periods without charging, and recovering from power loss to its previous state. To meet these requirements, we designed a wireless sensing heterogeneous system-in-package (SiP) containing an ultra-low power (ULP) SoC, a non-volatile boot memory (NVM), and a 2.4 GHz frequency-shift keying (FSK) radio, all integrated with custom ULP interfaces. The SoC includes a fully integrated energy harvesting platform power manager (EH-PPM) to power the SiP and other commercial sensors. The EH-PPM is designed for small loads and powers the SoC and peripherals while drawing very low operating current. The SoC also includes a digital system data-flow for sensing applications, an analog front end for ECG signal acquisition, and a cold-boot management system (CBMS) for boot and recovery from the NVM. The CBMS enables integration of the SoC with the ULP NVM to create a wearable formfactor, self-powered system capable of recovery from power loss. The SoC also includes a radio interface tightly integrated with a compression accelerator to efficiently communicate with the FSK transmitter and reduce the FSK’s transmission time. This tight integration between accelerators on the SoC and peripherals is another feature that reduces the system’s power consumption by reducing the code size and number of memory accesses required to perform an operation. The SoC consumes 507 nW average power while running free-fall detection, 519 nW average power while measuring ambient temperature, and 1.02 µW during continuous ECG monitoring and post-processing.

Index Terms—Battery-less SoCs, data-flow architecture, energy-harvesting, NVM, ultra-low power.

I. INTRODUCTION

The value of adding sensing and computing to devices has become clear with numerous new commercial smart sensing and monitoring products entering the market. However, in applications where batteries are not readily replaceable or rechargeable, commercial processors and SoCs consume too much power to meet the battery lifetime requirements of these devices. Another concern for many of these emerging devices (especially ones targeting wearables), along with lifetime, is formfactor. Thus, there is a need for lower-power systems capable of monitoring and data aggregation with a long lifetime and a compact formfactor.

Several promising solutions have been presented in the ULP space. For example, [1] offers a low power processor using unique, low leakage logic; however, it does not offer a fully integrated solution for power management or support a programmable NVM, and its low peak performance would not support many sensing interfaces and their accompanying processing. Some works such as [2] include power management for battery operation and can manage multiple chips for different sensing applications. These systems offer a more complete IoT solution than standalone processors or mobile SoCs but are limited in application space by the lifetime of the battery, as recharge cycles quickly reduce battery capacity [3]. Additionally, their power consumption is not low enough to support a self-powered battery-less system with a wearable formfactor. A low power system in [4] replaces a battery with a super-capacitor, significantly improving hardware lifetime. However, its power consumption is still high enough that the system can lose power after only a day or two of intermittent harvesting.

A complete self-powered, battery-less, wearable platform requires an SoC that consumes very little power (sub-1µW), operates reliably within this small power budget, and still performs meaningful tasks to solve real world problems. In addition, the device needs to adapt to changing harvesting conditions, survive for long periods of time without recharging, and recover from power loss (due to poor harvesting conditions) with no outside intervention. Figure 1 shows a plot of the number of hours different SoCs can survive given a fully charged 1F super-capacitor and no harvesting. By designing a sub-1µW SoC, we have improved the lifetime of the system without harvesting to about 22 days.

In this paper, we present the heterogeneous integration of a ULP SoC, a cold boot programmable NVM chip, and an FSK transmitter into a compact 12 × 12 mm², 100-pin QFN packaged SiP. The proposed SiP harvests energy from solar or thermoelectric sources, uses a super-capacitor for energy storage, consumes sub-1µW for sensing and processing and sub-1mW for transmitting, and can recover from complete power loss through its programmable NVM. Thus, the proposed SiP enables continuous wireless, long term monitoring in environments with poor harvesting conditions. The SiP also targets a wide range of applications with its multitude of sensing interfaces and its easily reconfigurable design. Its small footprint and off-chip peripheral control simplify its integration into a larger platform while maintaining a small formfactor. The SiP exhibits tight integration and leverages heterogeneous process technologies to create a self-reliant system.
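The no-harvesting survival time discussed above follows from the energy stored in the super-capacitor. The sketch below shows the arithmetic only; the usable voltage window (`v_start`, `v_min`) and the load seen at the capacitor are not stated in the text, so the example values are placeholders, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 86400.0


def supercap_lifetime_days(capacitance_f, v_start, v_min, load_w):
    """Days a constant load can run while the capacitor discharges from v_start to v_min.

    load_w is the power drawn from the capacitor, so it should include
    regulation losses, not only the SoC power.
    """
    if v_start <= v_min:
        return 0.0
    # Usable energy between the two voltages: E = 1/2 * C * (V_start^2 - V_min^2)
    usable_energy_j = 0.5 * capacitance_f * (v_start**2 - v_min**2)
    return usable_energy_j / load_w / SECONDS_PER_DAY


# Placeholder numbers (assumptions): 1 F capacitor, 2.0 V down to 1.0 V, 1 uW load.
example_days = supercap_lifetime_days(1.0, 2.0, 1.0, 1e-6)
```

Because lifetime scales inversely with load power, a tenfold reduction in average power yields a tenfold longer survival time for the same capacitor.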

We demonstrate the system for three wearable biomedical and IoT applications: ECG monitoring, temperature monitoring, and free-fall detection. By strategically designing the SiP into a wearable device on the hip, the free-fall application is capable of detecting falls in elderly patients. It can also be used as an IoT sensor for tracking shipping integrity (keeping track of free-falls in shipped fragile boxes). The temperature sensing application can be used to continuously monitor body temperature when designed into a wearable band-aid style sensing platform that can be worn under the arm. It can also be used as an IoT sensor to keep track of air temperature. In addition, the SPI digital interface allows the SiP to interface with other commercial sensors for health monitoring such as ozone/carbon-monoxide gas sensing for asthma patients. The SPI interface can also connect a microphone to detect wheezing.

The rest of the paper is organized as follows: Section II starts by presenting an overview of our SiP architecture and its interfaces. Section III discusses the SoC features that allow it to operate within the harvester’s power budget. Next, Section IV introduces the EH-PPM. Section V discusses the backup and bootup processes enabled by the SiP interfaces that allow the system to recover from power loss. Section VI discusses the digital data-flow architecture enabling sub-µW operation. Section VII introduces the sub-threshold ECG AFE and ADC. Sections VIII and IX conclude with measured results from three applications and a summary, respectively.

II. SYSTEM IN PACKAGE OVERVIEW

Battery-less systems rely on harvested energy that has a varying profile [9]; thus, these systems can completely lose power, compromising their reliability. We approach this concern from two angles: 1) designing a microwatt-power system (Section VI) to reduce the chance of power loss, and 2) including a custom ULP NVM with a CBMS to enable system backup and recovery. To support recovery, the NVM must hold the program as well as any critical data generated by the system. However, integrating and managing NVM within a battery-less system budget is challenging because of the inherent high-powered nature of most NVM technologies. Thus, we propose dividing the system into three main components that integrate into a small formfactor SiP. This approach allows us to leverage the advantages of multiple technologies to design a ULP system capable of recovering from power loss.

Figure 2 shows the SiP with its three main components: an SoC with an EH-PPM [10], a custom ULP NVM ferroelectric cold boot management system with critical backup memory [11], and an FSK radio transmitter. The SiP integrates the three chips into a single modular package, combining the benefits of multiple technologies in a single solution. A low-leakage 130nm technology was chosen for the always-on ULP digital and sensing system, while 65nm was chosen to enable UHF radio band transmissions, and a proprietary 130nm Fe-RAM technology was chosen for its low power NVM capability. This heterogeneous approach also improves design time and yield by allowing each sub-system to be designed, verified, tested, and binned separately. At the same time, the SiP retains the formfactor of a mobile, wearable, single chip solution as shown in Figure 3.

Unlike previously proposed platforms [2], [4], the proposed SiP is also capable of interfacing and powering commercial sensors. The SoC includes a Platform Power Manager (PPM) that integrates three power switches to control the power distribution to the SiP components and off-chip sensors. The CBMS controls the power switches feeding the NVM, while the SiP programmer can completely power switch the off-chip sensors when the system does not need to collect data, and the FSK transmitter when the system does not need to transmit data. This SiP feature simplifies the board design required, thus facilitating its integration into a larger application space while at the same time managing the power consumption of high-power sensors. These power saving techniques enable the SiP to remain within the budget of a smaller harvester.

To enable recovery without compromising the SiP’s power, the custom designed NVM takes advantage of different techniques such as power shut-off and multi-voltage design to reduce its contribution to the system. Different architectural techniques were also introduced to limit power [11]. A cold-boot bus (CBB) utilizes a parallel interface to efficiently transfer data between the NVM and SoC. The CBB data lines are bonded within the SiP, and thus do not contribute to the SiP’s footprint. Bringing the parallel interface in-package removes the cost and area overhead of integrating two chips using a parallel bus on a PCB. This parallel architecture reduces the on-time of the NVM by increasing bandwidth without increasing the clock frequency. By doing this, we can switch off the NVM sooner, thereby reducing the NVM’s contribution to the system’s power budget. Once the SoC recovers from power loss, the CBB can recover data from the NVM 8x faster than a serial bus while consuming 4.12 µW [11].
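The on-time argument for the parallel bus can be illustrated with a short calculation. This is only a sketch: the image size and bus clock below are assumed values chosen for illustration, and `transfer_time_s` is a hypothetical helper, not part of the design.

```python
def transfer_time_s(num_bytes, bus_width_bits, clock_hz):
    """Time to move num_bytes over a bus that carries bus_width_bits per clock cycle."""
    total_bits = num_bytes * 8
    cycles = -(-total_bits // bus_width_bits)  # ceiling division
    return cycles / clock_hz


# Assumed values: a 2 KiB boot image and a 100 kHz bus clock.
IMAGE_BYTES = 2048
CLOCK_HZ = 100_000

serial_s = transfer_time_s(IMAGE_BYTES, 1, CLOCK_HZ)    # one data line
parallel_s = transfer_time_s(IMAGE_BYTES, 8, CLOCK_HZ)  # eight data lines
speedup = serial_s / parallel_s
```

At the same clock frequency, an 8-bit bus finishes in one eighth of the cycles, so the NVM can be powered off correspondingly sooner.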

In addition to the NVM, the SiP includes a custom-designed FSK transmitter. The transmitter is compatible with the BLE standard for non-connectable advertisement, which does not require two-way communication [12].

The controller of the SiP platform is the sub-µW SoC. To achieve this low power consumption, we co-designed and tightly integrated many blocks to optimize the SoC’s data-flow without compromising its ability to perform the required tasks. Different accelerators were tightly integrated with the blocks using them to create a custom data-flow that reduces the code size (required memory) and number of memory accesses required to perform an operation. This data-flow architecture allows us to significantly reduce the system’s power consumption and use a single main low power controller (LPC) for both software and memory management (DMA). The LPC acts as the master of a shared wishbone bus, with all accelerators acting as slaves.

The bus interface includes 3 sensor interfaces, 5 general use accelerators, 2 application specific accelerators, and 2 custom SiP buses for interfacing with the radio transmitter and NVM. The sensing interfaces include an ECG Analog Front End (AFE) with integrated ADC, a SPI master for interfacing with Commercial Off The Shelf (COTS) sensors, and 8 GPIO pins for interfacing with custom sensors, including two interrupts for waking up the system upon an event.

The general-purpose accelerators include a Finite Impulse Response (FIR) filter and a Multiply-Accumulate (MAC) unit for signal processing, two timers for periodic sensing and sleeping, and a compression block for reducing data memory (DMEM) utilization and radio transmission time. The application-specific accelerators include the RR block for measuring RR intervals (heart rate) and the AFIB block for detecting atrial fibrillation from digitized ECG data.

The EH-PPM powers the SiP chips while drawing very low operating current. It harvests from either a photovoltaic (PV) cell or a thermoelectric generator (TEG) using either a single-inductor boost converter with maximum power point tracking (MPPT) control [13] or a fully integrated (no external passives) voltage doubling switched-cap harvester. The harvested energy can be stored in either a super-capacitor or a rechargeable battery. The EH-PPM also includes three fully integrated regulators to power the different components of the SiP (SoC, NVM, and TX) as well as off-chip sensors. The regulators are specifically designed to reduce their quiescent currents and handle sub-µW loads.

To improve the system’s lifetime, the Power Monitor (PM) keeps track of the available energy in the system and adjusts the system’s operation to remain within the power budget of the harvester. The PM also tightly integrates with the CBMS to save critical data and recover from power loss. The CBMS also enables integration of the SoC with a ULP NVM to create a wearable, self-powered system capable of recovery from programmable memory. The CBMS utilizes the in-package CBB to minimize the energy and power cost of recovering from NVM. The system also contains a Radio Interface (RI) for integration with the FSK transmitter to manage wireless transmissions of sensor data to a base-station. The dedicated serial interface minimizes radio on-time by customizing the power up, configuration, and radio transmission for minimal data transfer on the RI bus. Lastly, the SoC contains a compression block [14] to reduce the data size, which results in fewer packets and allows the radio to turn off sooner or improve its duty cycle.

IV. ENERGY HARVESTING PLATFORM POWER MANAGER

For the system to be autonomous, it must have harvesting and power management circuits to support the SiP as well as any peripheral sensors. Designing a low quiescent current, high efficiency, EH-PPM improves system lifetime. Ideally, the EH-PPM should also minimize its contribution to the formfactor of the wearable system. This motivates a push towards integrated regulators over the use of inductors in power delivery.

The EH-PPM (Figure 4) used in this system is completely integrated and provides three voltages to the platform. A 1.8V rail provides power for common COTS sensors and the boot NVM. A 1.0V rail provides power to analog circuits, as well as the in-package low power boot NVM and radio transmitter. A 0.5V rail provides power to the digital side of the system, as well as the ULP analog front end.

To reduce the power conversion overhead in the SiP, the EH-PPM employs a hybrid architecture consisting of nW-quiescent power switched-capacitor DC-DC converters and low drop out (LDO) regulators. The platform uses a 1.3nW gate-leakage-based voltage reference generator, operational from 0.5V, along with Pulse Frequency Modulation (PFM) control to further lower the quiescent power of the switching regulators. The EH-PPM achieves a peak end-to-end efficiency of 71.1% while powering a 1µW load [10]. The EH-PPM power-up controller controls the power-up sequence of the whole platform to minimize the current drawn and ensure smooth startup. It includes a Power-On-Reset (POR) generator and on-chip load switches that can be controlled by the user. Once all the rails are at their target voltage, the POR signal is asserted to turn on the PM within the digital subsystem.

V. SYSTEM BOOT AND BACKUP PROCESSES

After the POR is asserted, the PM and CBMS recover data from the NVM and program the on-chip memories. First, the PM monitors available energy before beginning the boot process to ensure the SoC does not fall into a death loop where the system dies after or during recovery due to insufficient energy in the super-capacitor. The CBMS is designed to enable the system to recover quickly and with minimal power overhead. The PM and CBMS also back up user data into the NVM when the system is close to losing power. These features allow the SiP to operate reliably in commercial applications. Figure 5 shows the series of states during typical bootup and sensing.

A. The Power Monitor (PM)

The PM is responsible for monitoring the energy available to the system by measuring $V_{\text{CAP}}$. To do that, the PM relies on a voltage-controlled ring oscillator (VCO) that samples $V_{\text{CAP}}$ at a programmable frequency. Based on the state of the system and the available energy, the PM either manages the bootup sequence, backup sequence, or the digital power consumption. During the bootup sequence, the PM compares $V_{\text{CAP}}$ to a configurable safe bootup threshold before starting the CBMS. This threshold is critical to guarantee successful bootup of the platform. It is configured based on the super-capacitor and instruction memory size and can be determined by post-silicon characterization. During bootup, the PM holds the rest of the SoC in reset until the end of the bootup sequence. Once that concludes, the PM turns on the system and monitors and controls its power consumption.

During normal operation, the PM monitors the available energy to determine the correct mode of operation (identified as green, yellow and red modes). In each of the three modes, the user can configure the PM to employ the available power saving features (power shut-off and clock gating) to adjust the power consumption of the system in order to preserve its continuous operation.

If $V_{\text{CAP}}$ drops below a user-defined threshold, the PM starts the backup sequence. Once the PM enters that mode, a backup signal is sent as an interrupt to the main controller, allowing the system to back up critical data into the NVM. The backup threshold is a value where the system risks power loss, but still contains enough energy for a full backup sequence. Since the backup operation occurs when the system is low on energy resources, the amount of data to be backed up must be kept to a minimum. Thus, in our system, the backup register file is only 16 bytes long, which is enough to store critical information for battery-less systems like key counter values, the number of AFIB events, number of free-falls, recent temperature statistics, or number of wake-on-speech events for low power speech detection and recognition. The backup sequence copies this register file designated for backup into a backup FIFO within the NVM.
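The PM decisions described in this subsection can be summarized as a small threshold comparison on the measured capacitor voltage. The following is a behavioral sketch only: the threshold values, their ordering (green above yellow above backup), and all names are assumptions made for illustration, since the actual thresholds are user-configured.

```python
# Hypothetical threshold voltages in volts; the real values are user-configured.
EXAMPLE_THRESHOLDS = {"boot": 1.5, "green": 1.8, "yellow": 1.4, "backup": 1.1}


def pm_next_action(v_cap, booted, thresholds):
    """Return the PM action for a capacitor voltage sample.

    Before boot, the PM waits until v_cap reaches the safe bootup threshold.
    After boot, it selects an operating mode or triggers the backup sequence.
    """
    if not booted:
        return "BOOT" if v_cap >= thresholds["boot"] else "WAIT"
    if v_cap < thresholds["backup"]:
        return "BACKUP"  # interrupt the controller and save the 16-byte register file
    if v_cap >= thresholds["green"]:
        return "GREEN"
    if v_cap >= thresholds["yellow"]:
        return "YELLOW"
    return "RED"
```

Holding the system in `WAIT` until the bootup threshold is met is what prevents the death loop described earlier, where the system would lose power again during recovery.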

B. The Cold-Boot Management System (CBMS)

The CBMS is responsible for programming the SoC from the NVM and backing up any critical data before a power loss. Figure 6 shows a block diagram of the CBMS while Figure 7 shows the bootup and backup sequences. Once the PM detects that a backup is required, it signals the LPC to collect any critical data and send it to the CBMS. Once the LPC completes this data transfer, the CBMS powers on the NVM and sends a BACKUP command followed by the data to be saved. After a power loss event, the PM starts the bootup sequence. The CBMS powers on the NVM and retrieves the instruction memory first to program the SoC. An End-Of-File (EOF) sequence signals the end of the instruction memory. Once the program EOF is reached, critical data is retrieved from a backup FIFO in the NVM and saved within the CBMS for the LPC to retrieve when the system starts.
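The bootup read-out order described above (instruction memory until EOF, then backup data) can be modeled in a few lines. This is a simplified behavioral sketch: the `EOF_MARKER` value is hypothetical because the actual EOF sequence is not given here, and the backup FIFO is modeled as a single 16-byte entry.

```python
EOF_MARKER = 0xFFFF  # hypothetical sentinel; the real EOF sequence is not specified here
BACKUP_BYTES = 16    # backup register file size stated in the text


def cold_boot(nvm_words, backup_fifo):
    """Split an NVM read-out into instruction words and restored backup bytes."""
    imem = []
    for word in nvm_words:
        if word == EOF_MARKER:
            break  # end of the program image
        imem.append(word)
    else:
        raise ValueError("no EOF marker found in NVM image")
    # After the program EOF, fetch the critical data held for the LPC.
    restored = bytes(backup_fifo[:BACKUP_BYTES])
    return imem, restored
```

For example, `cold_boot([0x10, 0x20, EOF_MARKER], range(16))` yields the two instruction words and the 16 restored bytes, which the LPC would read from the CBMS once the system starts.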

VI. THE DIGITAL SUB-SYSTEM

The digital sub-system blocks were co-designed to reduce the power consumption of the system without negatively impacting its ability to perform complex tasks. To achieve a 1 µW power budget for this sub-system, we adopt three approaches: 1) the LPC and instruction SRAM are tightly integrated to reduce the contribution of the SRAM to the power budget, 2) a coarse-grained reconfigurable data-flow architecture is introduced to optimize target applications, eliminate the need for a direct-memory access unit, reduce the required operating frequency, and allow the LPC/SRAM to go into deep sleep for extended periods of time, and 3) on-the-edge processing and compression reduce the frequency of radio transmissions to reduce their contribution to the power budget of the system. The digital subsystem also employs traditional power saving techniques such as fine-grained power shutoff and clock gating to minimize the contribution of unused blocks to the power budget.

A. The Control Logic

At the heart of the digital sub-system is the LPC and its ULP SRAM instruction memory. The SRAM contains a myriad of power saving techniques to reduce its contribution to the power budget [15]. These features include standby and shutdown modes to reduce the leakage power contribution, and a read burst mode to reduce active power consumption during program execution. The LPC is designed to be compact (only 1381 gates) and to tightly integrate with the instruction SRAM to make use of its power saving features. It is a two-stage pipelined processor implementing a custom Instruction Set Architecture (ISA) shown in Table I. This ISA and the available accelerators target IoT applications, and thus allow users to develop compact programs without sacrificing functionality. Reducing program size reduces the number of SRAM banks that must remain powered. Thus, the SHTDN instruction allows the programmer to completely shut down banks of unused SRAM to reduce power. The STALL instruction allows the user to put the system in a low power state while waiting for an event. When a STALL instruction is issued, the LPC is automatically clock gated and the instruction SRAM is held in standby mode, significantly cutting down the system’s power consumption. Since the LPC uses a custom targeted instruction set, a Python-based assembler was developed to translate assembly-style instructions into programming data for the SoC.

B. The Data-Flow Architecture

The SoC includes several accelerators that target ULP, low throughput applications. These include a MAC unit, an FIR filter, two timer units, a sensor data compression block, and an RR-AFIB block. To improve the power consumption of the system without compromising its flexibility, we introduce a data-flow architecture into the SoC. The data-flow is created by directly integrating the accelerators needed by a sensing or communication interface with their corresponding interface. These data-flows act like an ASIC independent of the LPC or bus, sensing, digitizing, processing, and transmitting data. This frees the LPC to either stall or perform other tasks. Since, in our digital subsystem, the main contributor to the power consumption is the instruction SRAM and the LPC, the data-flow architecture allows us to reduce the power consumption by offloading most of the processing and data transfer into the data-flow accelerators, allowing us to reduce both the code size and the number of memory accesses. By stalling the LPC and its instruction SRAM until the data-flow accelerators complete their task, we reduce the overall system power by up to 65%.

This architectural choice also allows for a one-bus system without a DMA interface and without loading the LPC with data transfers between blocks. For example, the RR-AFIB block is integrated with the ECG AFE, and the compression block is integrated with the radio interface to create a health monitoring data-flow. Adding the data-flow paths reduces the number of cycles required to complete a given sensing task and reduces LPC and bus usage, allowing for either more tasks to be completed in a given time or a reduced system clock frequency. In the case of a wearable, self-powered device, an improved data-flow lowers the energy demand, which in turn allows a reduction in the required super-capacitor size.

To maintain the flexibility of the system, the data-flow paths can be enabled or disabled through configuration bits. This flexibility also allows us to highlight the advantages of the data-flow architecture by developing the same algorithm to either use or discard the data-flow paths. For example, we developed two versions of a sensor data compression algorithm, one that connects the compression block straight to the radio interface through the data-flow path, and the other uses the LPC/bus to transfer data between the compression block and the radio interface. In the latter, the compression block generates an interrupt when its internal buffer is full. The LPC then goes into an Interrupt Service Routine (ISR) where it reads the output of the compression block and moves it into the radio interface. In this example, the ISR adds at least 12 instructions into the code, and costs at least 16 cycles every time an interrupt is generated. Instead, the data-flow path reduces the code size by at least 12 instructions and avoids the need for the ISR completely, thus allowing the SoC to complete its task more efficiently, to go to sleep more often and with minimal overhead in area.
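The cost of the interrupt-driven variant can be estimated from the lower bounds quoted above. The sketch below is illustrative arithmetic only; the interrupt rate and clock frequency passed to the helper are assumed inputs, and the function names are hypothetical.

```python
ISR_CYCLES = 16        # lower bound per interrupt, from the text
ISR_INSTRUCTIONS = 12  # lower bound on added code size, from the text


def isr_overhead_cycles(num_interrupts):
    """Minimum LPC cycles spent moving data when the data-flow path is disabled."""
    return num_interrupts * ISR_CYCLES


def lpc_busy_fraction(interrupts_per_s, clock_hz):
    """Fraction of clock cycles the LPC must stay awake just to service the ISR."""
    return isr_overhead_cycles(interrupts_per_s) / clock_hz
```

With the data-flow path enabled, both quantities drop to zero for this transfer, which is why the LPC and its instruction SRAM can remain stalled while the compression block feeds the radio interface directly.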

C. On-the-Edge Processing and Compression

The wireless transmitter is an essential component in ULP sensing systems that is inherently high power compared to other system components. Therefore, the transmitter typically either limits the system functionality by imposing strict duty cycle limitations or limits the system lifetime by increasing the system power consumption. The SiP addresses transmitter power consumption by minimizing the amount of data that must be transmitted through both SoC processing and compression of sensor data. The RR-AFIB accelerator processes ECG sensor data and is closely integrated with the AFE/ADC to minimize its contribution to system power and reduce user overhead. It extracts the RR intervals and analyzes the entropy between intervals to detect AFIB events. The SoC can then transmit the RR intervals or atrial fibrillation events instead of the raw ECG data, thus minimizing the amount of data that must be transmitted.
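To make the idea of entropy-based irregularity detection concrete, the sketch below computes RR intervals from R-peak sample indices and a Shannon entropy over binned intervals. This is a generic illustration, not the accelerator's algorithm: the exact entropy measure, the bin width, and any detection threshold used on chip are not given in the text, and all names here are hypothetical.

```python
import math
from collections import Counter


def rr_intervals_s(r_peak_samples, fs_hz):
    """Convert consecutive R-peak sample indices into RR intervals in seconds."""
    return [(b - a) / fs_hz for a, b in zip(r_peak_samples, r_peak_samples[1:])]


def shannon_entropy_bits(intervals_s, bin_s=0.05):
    """Shannon entropy (bits) of RR intervals after binning to bin_s (assumed width)."""
    if not intervals_s:
        return 0.0
    counts = Counter(round(iv / bin_s) for iv in intervals_s)
    total = len(intervals_s)
    return -sum(c / total * math.log2(c / total) for c in counts.values())
```

A steady rhythm concentrates the intervals in one bin and gives an entropy near zero, while irregular intervals spread across bins and raise it; transmitting such a summary value costs far fewer bits than streaming raw ECG samples.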

The compression accelerator [14] can effectively compress ECG, acceleration, or any other type of sensor data that has a high degree of temporal correlation between consecutive samples. It is closely integrated with the radio interface to minimize its contribution to system power and can be bypassed or enabled with a single configuration bit. It also has a special operating mode that allows it to effectively compress multiple data streams concurrently, such as 3-axis acceleration data. It implements the low-overhead lossless entropy compression algorithm (GAS-LEC), which was chosen to minimize the processing overhead of compression while maximizing the compression ratio. Measured results show that it adds only 4.4nW processing overhead, reducing the required transmitter duty cycle by 3.7x, and the entire system power by 2.9x when the system is transmitting ECG at a 360Hz sampling rate [14].
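The benefit of exploiting temporal correlation can be seen in a simplified difference coder. The sketch below is not GAS-LEC: it encodes each sample-to-sample difference as a size field followed by that many value bits, in the style of LEC-family coders, but uses a fixed 4-bit size field (an assumption) where such coders normally use a Huffman-coded prefix. Samples are assumed to be non-negative 12-bit values and the input is assumed non-empty.

```python
SAMPLE_BITS = 12  # matches the 12-bit ADC output described in this paper
PREFIX_BITS = 4   # fixed-length size field; a simplifying assumption


def encode(samples):
    """Encode samples as a bit string: first sample raw, then size-prefixed differences."""
    bits = format(samples[0], f"0{SAMPLE_BITS}b")
    for prev, cur in zip(samples, samples[1:]):
        d = cur - prev
        n = abs(d).bit_length()  # number of bits needed for this difference
        bits += format(n, f"0{PREFIX_BITS}b")
        if n:
            # Positive d is stored as-is; negative d is stored as d + 2^n - 1.
            value = d if d > 0 else d + (1 << n) - 1
            bits += format(value, f"0{n}b")
    return bits


def decode(bits, count):
    """Invert encode() for a stream that holds count samples."""
    pos = SAMPLE_BITS
    out = [int(bits[:pos], 2)]
    for _ in range(count - 1):
        n = int(bits[pos:pos + PREFIX_BITS], 2)
        pos += PREFIX_BITS
        d = 0
        if n:
            value = int(bits[pos:pos + n], 2)
            pos += n
            d = value if value >= (1 << (n - 1)) else value - ((1 << n) - 1)
        out.append(out[-1] + d)
    return out
```

Slowly varying signals produce small differences and therefore short codes, so fewer bits reach the radio; rapidly varying signals gain little, which is why the approach suits data with high temporal correlation.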

VII. ANALOG FRONT END AND ADC

The SoC also includes an integrated AFE [16], followed by a SAR ADC, for ECG signal acquisition. ECG acquisition requires a low input-referred noise, on the order of a few µV rms, and a low bandwidth of <200Hz. The AFE is expected to be always-on for continuous monitoring yet needs to satisfy the ultra-low power consumption requirements of the SoC. Therefore, the AFE is designed to consume less than 100nW of active power while meeting noise and linearity requirements for practical ECG monitoring applications. By using a weak inversion biasing technique and a low supply voltage, the AFE achieves an input-referred noise of 2.8 µV rms with a power consumption of only 68nW. The low supply voltage of 0.5V relaxes the voltage boost requirements of the energy-harvesting system. Figure 8(a) shows the schematic of the AFE in this work. A fully differential topology is utilized to ensure a high common-mode rejection ratio (CMRR) and power supply rejection ratio (PSRR). To save power, a non-chopper differential topology and transistor level low-noise design methods are used to ensure a flicker noise corner frequency of below 100Hz. A more in-depth discussion about power-noise tradeoff in the AFE along with a comparison with the state-of-the-art biopotential AFEs is provided in [16].

The AFE directly couples to the single ended SAR ADC (Figure 8(b)). The combined measured power consumption of the integrated AFE and ADC is 301nW at 0.5V. The sub-µW ADC features a ground referenced comparator that removes the need for extra reference circuits and reduces power. The design uses coupled metal capacitors for improved area efficiency and operates at 0.5V using high threshold devices for improved power efficiency. Two 6-bit capacitor banks are used to reduce area, with a small custom capacitor coupling the two banks to allow the LSB capacitor bank to match the smallest capacitance within the MSB bank. The down plate of each capacitor in the two banks connects to three analog pass-gates that are controlled by three digitally generated signals: sample, invert, and switch, respectively. Sample and invert are generated by a digital controller outside of the ADC, and switch is generated by the SAR logic inside the ADC. The ADC is controlled by the system clock and reset. Once reset is released, the 12-bit parallel output is available after 15 clock cycles, and the ADC continues sampling the input voltage and providing a 12-bit output every 16 clock cycles.
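The successive-approximation principle behind the ADC can be sketched as a binary search over the output code. This is an ideal behavioral model only; it ignores the split capacitor banks and the ground-referenced comparator, and the full-scale reference `v_ref` is an assumed parameter that is not specified above.

```python
def sar_convert(v_in, v_ref, bits=12):
    """Ideal SAR conversion: test one bit per step, from MSB to LSB."""
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)
        # Comparator decision: keep the bit if the trial DAC level does not exceed the input.
        if v_in >= trial * v_ref / (1 << bits):
            code = trial
    return code


def adc_sample_rate_hz(clock_hz, cycles_per_sample=16):
    """Output rate when one 12-bit result is produced every 16 clock cycles."""
    return clock_hz / cycles_per_sample
```

Under this model, a 360 Hz sampling rate would need a 5.76 kHz ADC clock, since each conversion takes 16 cycles.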

VIII. MEASURED RESULTS

We fabricated the SoC, NVM, and radio, and tested the platform within a single package for temperature sensing aggregation. Before deployment, instructions are programmed through a four-pin programming interface into the NVM [10]. In the field, the CBMS detects a safe level of harvested energy, powers on the NVM to stream in instructions, switches off the NVM, and begins executing instructions.

Figure 9 shows an annotated die photo of the SoC fabricated in a commercial 130nm technology, highlighting its main components. The SoC was tested along with the NVM and the FSK transmitter to show functionality. Figure 10 shows the measured EH-PPM harvesting energy onto a 10mF super-capacitor ($V_{\text{CAP}}$) from a PV cell. Once enough energy is available on $V_{\text{CAP}}$, the regulators ramp up the three rails starting with the 1.8V rail followed by the 1.0V rail and finally the 0.5V rail. Once all rails are stable, the startup circuit within the EH-PPM de-asserts the POR signal indicating it is safe to turn on the rest of the system.

Figure 11 shows the measured PM/CBMS bootup sequence through the CBB. For clarity, only two of the 8 parallel data lines in the CBB are shown. After the CBMS programs the instruction memory, the LPC takes over the system and performs the required application. During the bootup sequence, measured results show that the SiP consumes only 8.3 µW.

Figure 12 highlights the measured power savings due to the tight integration between the LPC and its SRAM. With all the SRAM power saving features utilized, the system power consumption can be reduced by up to 65%.

To highlight the flexibility of the system, three example IoT applications are chosen: 1) Shipping-Integrity Tracking (SIT) through free-fall detection, 2) temperature sensing, and 3) health monitoring through ECG measurement. The first two applications highlight the advantages of the PPM and its ability to support off-chip commercial sensors, while the third application highlights the low-power on-chip ECG AFE sensor and the SoC’s ability to continuously monitor health within a microwatt power budget.

In the SIT application, an off-chip commercial low power accelerometer is used to detect free-fall events. The SPI interface along with a GPIO event pin are used to communicate with the accelerometer. The SoC first configures the accelerometer through SPI to generate an event upon a free-fall detection. Once an event occurs (dropping the SiP and sensor), the SoC reads the free-fall data from the accelerometer and streams it to the FSK transmitter. The compression block is enabled to reduce the amount and frequency of transmissions. Figure 13 shows the measured power breakdown of the digital sub-system during its active state after a free-fall event with the compression enabled and disabled, and during its low power state waiting for the wakeup event. Figure 13 also shows the measured transmitted data (output of the TX chip) following a free-fall event. The average digital power assuming a pessimistic 1 free-fall event per minute is 507nW. For this application, the measured SoC power consumption from the super-capacitor including regulation, the off-chip sensor, and radio with no harvesting is 20.6 $\mu$W ($V_{\text{CAP}} = 1.19V$) on average, most of which is consumed by the off-chip sensor.
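An average figure such as the one quoted for one free-fall event per minute is a time-weighted mean of the active-state and low-power-state powers. The sketch below shows that calculation only; the active power, sleep power, and active duration in the example call are placeholder values chosen for easy arithmetic, not measurements from this work.

```python
def average_power_nw(active_nw, sleep_nw, active_s, period_s):
    """Time-weighted average power over one event period.

    active_nw: power while handling an event (nW)
    sleep_nw:  power while waiting for the wakeup event (nW)
    active_s:  time spent active per event (s)
    period_s:  time between events (s), e.g. 60 for one event per minute
    """
    if not 0 <= active_s <= period_s:
        raise ValueError("active_s must lie between 0 and period_s")
    sleep_s = period_s - active_s
    return (active_nw * active_s + sleep_nw * sleep_s) / period_s


# Placeholder numbers (NOT measured values): 1000 nW for 6 s, 500 nW otherwise.
example_nw = average_power_nw(active_nw=1000.0, sleep_nw=500.0,
                              active_s=6.0, period_s=60.0)
```

Because the sleep term dominates when events are rare, lowering the low-power-state consumption has more effect on the average than shortening the active burst.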

In the temperature sensing application, an off-chip commercial temperature sensor is used to sense the ambient temperature every 5 minutes and report the reading over the integrated radio. Figure 14 shows the measured data transfers among the SoC, the

<table>
<caption>TABLE II. MEASURED POWER OF THE THREE APPLICATIONS. THE POWER FROM VCAP INCLUDES REGULATION, SOC, FSK TX, AND EXTERNAL SENSOR POWER</caption>
<thead>
<tr>
<th>Mode</th>
<th>Temp Sensing</th>
</tr>
</thead>
<tbody>
<tr>
<td>SOC</td>
<td>2.06 0.218 0.218</td>
</tr>
<tr>
<td>Fre</td>
<td>0.415 0</td>
</tr>
<tr>
<td>Source</td>
<td>TX 163.3 0</td>
</tr>
<tr>
<td>Vcap</td>
<td>187.7 7.599</td>
</tr>
</tbody>
</table>

When operating solely from harvested energy, the end-to-end system consumes 7.61 µW on average from the super-capacitor while sensing temperature (includes regulation and radio transmission power).

In the health monitoring application, the on-chip low power AFE/ADC is used to sample the ECG signal at a rate of 100 Hz. The SoC first configures the AFE so that its interface generates an event upon a new sample. Once an event occurs, the LPC transfers data to the compression block, which performs the compression and forwards the data to the radio interface through the custom data-flow. Once enough data is available for transmission, the radio interface generates an interrupt that notifies the LPC that a radio transfer is about to occur. The LPC then stalls until the data transfer is completed. Figure 15 shows the measured breakdown of the SoC's power along with the data transferred through the radio. The average power including the digital sub-system, the I/O, and the AFE is 1.02 µW. For this application, the measured system power consumption from the super-capacitor including regulation and the radio with no harvesting is 5.98 µW ($V_{\text{CAP}} = 0.93$ V) on average.
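The event-driven flow in this application (sample event, compression, buffering in the radio interface, interrupt, stall) can be modeled as follows. This is a sketch under assumptions: the compression scheme used on chip is not described in this section, so simple delta encoding is swapped in as a stand-in, and the class names, packet size, and sample values are all hypothetical.

```python
class DeltaCompressor:
    """Stand-in compressor (delta encoding); NOT the on-chip algorithm."""

    def __init__(self):
        self.prev = 0

    def feed(self, sample):
        # Emit the difference from the previous sample.
        delta = sample - self.prev
        self.prev = sample
        return delta


class RadioInterface:
    """Buffers compressed words and signals when a packet is ready to send."""

    def __init__(self, packet_words):
        self.packet_words = packet_words
        self.buffer = []
        self.packets = []

    def push(self, word):
        """Queue one word; return True if this filled and sent a packet."""
        self.buffer.append(word)
        if len(self.buffer) >= self.packet_words:
            self.packets.append(self.buffer)
            self.buffer = []
            return True  # models the interrupt raised toward the LPC
        return False


def run(samples, packet_words):
    """Process samples one event at a time; return (packets, stall_count)."""
    compressor = DeltaCompressor()
    radio = RadioInterface(packet_words)
    stalls = 0
    for sample in samples:            # each iteration models one new-sample event
        word = compressor.feed(sample)
        if radio.push(word):
            stalls += 1               # the LPC stalls until the transfer completes
    return radio.packets, stalls


packets, stalls = run([10, 12, 11, 15, 15, 14], packet_words=3)
```

In this model the processor only moves one word per event and is otherwise idle, which mirrors why the data-flow keeps processor activity, and therefore power, low.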

**IX. CONCLUSION**

In this paper, we presented an SoC integrated with an NVM and an FSK radio in an SiP for self-powered IoT monitoring. Custom SiP interfaces allow the system to boot from the ULP non-volatile boot memory and communicate with the FSK radio transmitter. The system consumes 1.02 µW while measuring ECG. This reduced level of power consumption was achieved through an architecture designed for improved data-flow for specific applications, tightly integrated components, data compression designed for health sensing, and efficient radio and NVM interfaces. Additionally, the system operates solely on harvested energy and powers off-chip sensors through its EH-PPM.

We demonstrated multiple applications in different spaces to highlight the flexibility of the system and to show the reduction in power achieved in each case by our power-saving features. Table II summarizes the power breakdown of the three applications in different modes. The first application highlights the SoC powering a commercial sensor from the EH-PPM. The complete platform consumes 20.6 µW total from $V_{CAP}$, with the SoC consuming an average of only 507 nW. The second application highlights the deep sleep capabilities of the platform enabled by the co-design of the LPC with the SRAM. In this application, the SoC consumes 519 nW, the radio consumes an average of 12 nW, and the complete system consumes 7.61 µW total from $V_{CAP}$. The third application highlights the on-chip sensing for health monitoring, where the SoC consumes an average of 1.02 µW during continuous sensing and the complete system consumes 5.98 µW total from $V_{CAP}$. Table III compares this work with state-of-the-art systems. To date, this SiP is the first configurable heterogeneous SiP with cold boot, EH, and an integrated PPM that can manage power to commercial sensors.
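The figures summarized above also show how small a share of the power drawn from $V_{CAP}$ goes to the SoC itself. The short calculation below uses only the averages quoted in this section; the dictionary layout and function name are illustrative.

```python
# (SoC average power, total power from V_CAP), both in microwatts,
# as quoted in the text for each application.
APPLICATIONS_UW = {
    "free-fall detection": (0.507, 20.6),
    "temperature sensing": (0.519, 7.61),
    "ECG monitoring": (1.02, 5.98),
}


def soc_share_percent(soc_uw, total_uw):
    """Percentage of the total V_CAP power that is consumed by the SoC."""
    return 100.0 * soc_uw / total_uw


shares = {
    name: round(soc_share_percent(soc_uw, total_uw), 1)
    for name, (soc_uw, total_uw) in APPLICATIONS_UW.items()
}
# shares is approximately:
#   free-fall detection: 2.5 %, temperature sensing: 6.8 %, ECG monitoring: 17.1 %
```

The remainder in each case is regulation, the radio, and, for the first two applications, the off-chip commercial sensor.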

**ACKNOWLEDGMENT**

The authors thank TI for FeRAM fabrication and support.

**REFERENCES**

Christopher J. Lukas received the B.S. degree in computer engineering from the University of Pittsburgh, Pittsburgh, PA, USA, in 2013, and the Ph.D. degree in electrical engineering from the University of Virginia, Charlottesville, VA, USA, in 2017. In 2015, he was a Research Scientist Intern with Nvidia, developing on-chip, high-speed, and energy efficient signaling circuits. He is currently a Senior Design Engineer with Psikick, Charlottesville, VA, USA, where he works on SoC architecture, physical design, and chip integration for self-powered sensing platforms.

While with the University of Virginia, he researched high-speed and low-power SoCs and systems in package, SoC design and architecture, and SoC test, all targeting the ultra-low-power application space.

Farah B. Yahya received the B.E. and M.E. degrees in electrical and computer engineering from the American University of Beirut, Beirut, Lebanon, in 2008 and 2011, respectively, and the Ph.D. degree from the University of Virginia, Charlottesville, VA, USA, in 2017.

Between 2007 and 2012, she was an Embedded Software Engineer with S. & A. S. Ltd., Lebanon. She interned at Intel and ARM in 2012 and 2014, respectively, where she researched volatile and non-volatile memories. Since 2017, she has been a Senior Design Engineer with Psikick, Charlottesville, VA, USA, where she works on battery-less SoC architecture and physical design. Her research interests include ultra-low power circuits design for battery-less SoCs, emerging memories, and ultra-low power SRAM.

Jacob Breiholz received the B.S. degree in electrical engineering from the University of Virginia, Charlottesville, VA, USA, in 2015, and is currently working toward the Ph.D. degree in electrical engineering. He joined the RLP-VLSI group immediately after graduating in 2015. His research interests include ultra-low power digital integrated circuit design and self-powered system on chip design for Internet of Things-based applications.

Abhishek Roy (S'13–M'18) received the B.E. degree in electronics and communication engineering from the University of Delhi, New Delhi, India, in 2007, the M.S. degree in electrical engineering from the University of Michigan, Ann Arbor, MI, USA, in 2011, and the Ph.D. degree in electrical engineering from the University of Virginia, Charlottesville, VA, USA, in 2017. From 2007 to 2010, he was a Physical Design Engineer with Microcontroller Solutions Group, Freescale Semiconductor, India. From 2012 to 2013, he worked on custom processor designs with Qualcomm, Raleigh, NC, USA. In 2016, he interned at Circuits Research Group, Nvidia, where he contributed to on-chip voltage regulation and power delivery research. He is currently a Senior Engineer with Marvell Semiconductor, Inc., Hamilton, Bermuda, where he is working on power delivery, power management, and adaptive clocking circuits in next-generation, high-performance network processors. His research interests include low-power circuit design, clock generation, and power delivery circuits in ultra-low power systems.

Xing Chen received the B.S. degree from the Beijing Institute of Technology, Beijing, China, in 2013, and the M.S. degree from the University of Michigan, Ann Arbor, MI, USA, in 2015, where he is currently working toward the Ph.D. degree. He was with Qualcomm, San Diego, CA, USA, as an RFIC Design Intern in 2018, developing frequency synthesizers for 5G applications. He also held internship positions at Psikick, Inc., in 2017 and Carnegie Mellon University in 2014. His research interests include analog/mixed signal IC design, all digital frequency synthesizers for wireless communications, and energy efficient RF transceivers.

Harsh N. Patel (SM’13) received the B.E. degree in electronics and communication from Dharmsinh Desai University, Nadiad, India, in 2006, the master’s degree in VLSI design from the Vellore Institute of Technology, Vellore, India, in 2009, and the Ph.D. degree from the University of Virginia, Charlottesville, VA, USA, in 2017. He is currently working with Globalfoundries, Santa Clara, CA, USA, in the memory solution team developing eNVM solutions. He joined STMicroelectronics, India, in 2010, where he worked for three and a half years in Technology R&D on memory BIST and test algorithm development. He was responsible for various memory BIST compiler developments. In 2016, he interned with Nvidia, where he researched soft-error characterization in 14 nm FinFET technology for sequential blocks and designed a soft-error detection sensor. His research interests include eNVM (STT-MRAM and ReRAM) and SRAM design.

Ningxi Liu received the B.S. degree in micro-electronics from Sun Yat-Sen University, Guangzhou, China, in 2011, the M.S. degree in micro-electronics from Fudan University, Shanghai, China, in 2014, and is currently working toward the Ph.D. degree with the University of Virginia, Charlottesville, VA, USA. His main research areas include low power SRAM design, in-memory computing, and mixed-signal design for IoT applications.

Avish Kosari received the B.Sc. and M.Sc. degrees in electrical engineering from Shahid Beheshti University, Tehran, Iran, in 2012, and the M.Sc. and Ph.D. degrees in electrical engineering from the University of Michigan, Ann Arbor, MI, USA, in 2014 and 2018, respectively. She is currently a Post-Doctoral Research Fellow with the Department of Electrical Engineering, University of Michigan. Her research interests include ultra-low power analog and RF integrated circuits and systems as well as high performance wireless radio systems and architectures. She was a recipient of the University of Michigan Rackham Fellowship in 2013, the Barbour Scholarship in 2016, and the University of Michigan ECE Innovator award in 2018.

Divya Akella Kamakshi received the B.Tech. degree in electronics and communication engineering from the Cochin University of Science and Technology, Kochi, India, in 2010, and the Ph.D. degree in electrical engineering from the University of Virginia, Charlottesville, VA, USA, in 2017. She pursued the Ph.D. degree as part of the Robust Low Power VLSI group under the guidance of Prof. B. Calhoun. In October 2017, she joined Mythic, Inc., as an NVM Design Engineer. Previously, she was a Design Engineer with NIQUES. Her research interests include sub-threshold circuit design for ultra-low-power IoT systems, low power and variation tolerant circuit design methodologies, and high voltage circuits for non-volatile memories.

Benton H. Calhoun (M’02–SM’12) received the B.S. degree from the University of Virginia, Charlottesville, VA, USA, in 2000, and the M.S. and Ph.D. degrees in electrical engineering from the Massachusetts Institute of Technology, Cambridge, MA, in 2002 and 2007, respectively. In 2012, he co-founded PsiKick, a fabless semiconductor company developing ultra-low power wireless SoCs. Since August 2007, he has been with the Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, where he is currently an Associate Professor.

His research focuses on RF integrated circuits, with an emphasis on ultra-low power design. He has served on the technical program committee for ICUWB 2008-2010, ISLPED 2011-2015, S3S 2013-2014, and RFIC 2013-2015, and is a Guest Editor for the IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, the IEEE Communications Magazine, and the Elsevier Journal of Signal Processing: Image Communication. He is a Senior Member of the IEEE Circuits and Systems Society, IEEE Microwave Theory and Techniques Society, IEEE Solid-State Circuits Society, and Tau Beta Pi. He is the recipient of the 2009 DARPA Young Faculty Award, 2009-2010 Eta Kappa Nu Professor of the Year Award, 2011 DAC/ISSCC Student Design Contest Award, 2012 IEEE Subthreshold Microelectronics Conference Best Paper Award, the 2012 NSF CAREER Award, the 2014 ISSCC Outstanding Forum Presenter Award, the 2014–2015 Eta Kappa Nu ECE Professor of the Year Award, the 2014–2015 EECS Outstanding Achievement Award, and the 2015 Joel and Ruth Spira Excellence in Teaching Award.

Oluseyi Ayorinde received the Ph.D. degree from the University of Virginia, Charlottesville, VA, USA, where he explored generating and configuring custom, sub-threshold FPGA hardware, as well as designing accelerators for ultra-low power SoCs. He is a Researcher with the Silicon Team, Army Research Laboratory, Playa Vista, CA, USA. He is currently focusing on development of low-power digital circuits for various applications. His research interests include machine learning acceleration, digital ASIC design, and FPGA hardware design.

Shuo Li received the B.S. degree from the University of Electronic Science and Technology of China, Chengdu, China, in 2013, and the M.S. degree from Fudan University, Shanghai, China, in 2016, both in electrical engineering. He is currently working toward the Ph.D. degree in electrical engineering with the University of Virginia, Charlottesville, VA, USA. His research interests include highly power-efficient energy harvester and power management design, ultra-low-power sensor interface design, and self-powered IoT SoC design.