Secure C Coding through Binary Exploitation — Introduction

Ragnar Security
3 min read · Jun 9, 2020


Image from https://www.sei.cmu.edu/education-outreach/credentials/credential.cfm?customel_datapageid_14047=15129

When people think about binary exploitation, they might think of Mr. Robot: hacking things quickly and gaining access to some secret E-Corp server. The truth is that exploitation and hacking are slow and meticulous, and they require a lot of patience. Exploitation is a puzzle that evolves continuously, and there is always something to learn from it. In this series of blog posts, we will learn secure coding through the eyes of an attacker. By understanding the fundamentals of how an adversary finds and uses vulnerabilities, we can understand how to secure our own software and systems.

Just as a disclaimer: please DO NOT exploit a device without express permission from the owner. There are CTF competitions (ctftime.org), many binaries that you can run on your own computer, and even programs that you can write yourself to practice on.

An example of an integer overflow & attempted protection. Image from https://android-developers.googleblog.com/2016/05/hardening-media-stack.html

To begin, let us start with the most fundamental reason why there are so many vulnerabilities in C programs: bits and bytes. Sequences of bytes are how a computer understands what to do. You would not be able to read this article without a computer translating bytes. A bit is just a 1 or a 0, and a byte is a collection of eight bits. Raw bytes are very tedious for humans to read, so we created assembly language and higher-level programming languages like C. The difference between the two is that assembly is a human-readable notation for the machine's own instructions, whereas higher-level languages allow us to write programs in a more logical way.

Neither is perfect, because of the limits on data sizes and because of human error. Data types have fixed sizes in the machine, so every type has a maximum value. For example, an unsigned character is one byte (8 bits, with a maximum value of 255), and an integer typically takes four bytes (for an unsigned integer, a maximum value of 2^32 - 1 = 4,294,967,295). If you try to store a value greater than the type's maximum, an integer overflow occurs. For example, if you have an unsigned character with the value 255 and you add 1 to it, you get 0 rather than 256. This is because 256 is 0001 0000 0000 in binary, but a character can hold only one byte, so the bits above the lowest eight are cut off and what remains is 0000 0000 = 0. Unsigned types in C wrap around like this; overflowing a signed type is undefined behavior.
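The wraparound described above can be sketched in C. This is a minimal illustration, not code from the original series: the function names `add_one_u8` and `checked_add_u32` are hypothetical, and the sketch assumes a platform that provides the fixed-width types `uint8_t` and `uint32_t` from `<stdint.h>`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: adds 1 to an 8-bit unsigned value.
   The sum is computed as an int and then converted back to 8 bits,
   which keeps only the low byte, so 255 + 1 becomes 0. */
uint8_t add_one_u8(uint8_t value)
{
    return (uint8_t)(value + 1);
}

/* Hypothetical helper: adds two 32-bit unsigned values, but refuses
   to do so if the true sum would not fit in 32 bits.
   Returns true and stores the sum in *out on success;
   returns false and leaves *out untouched on overflow. */
bool checked_add_u32(uint32_t a, uint32_t b, uint32_t *out)
{
    assert(out != NULL);
    if (a > UINT32_MAX - b) {
        return false; /* a + b would wrap around */
    }
    *out = a + b;
    return true;
}
```

Calling `add_one_u8(255)` returns 0, matching the example in the text. The second function shows one common defensive pattern: check against the type's maximum before doing the arithmetic, rather than trying to detect the wraparound afterwards.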

Example of buffer overflow. Image from https://www.imperva.com/learn/application-security/buffer-overflow/

Another fundamental C issue is related to memory. In many programming languages, we need to give a variable a specific size so that the program knows how much memory to reserve for it. This applies to C as well; however, C does not check array bounds, so the language allows us to write to a location we are not supposed to. For example, you can create an array that is 64 elements long and then write the letter A to a 65th element, one past the end of the array, which results in memory corruption. The most prominent exploit that has come from this is the buffer overflow, which lets an attacker use unbounded input functions like gets(), or fgets() called with a size larger than the buffer, to modify program behavior.
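As a rough sketch of the defensive side of this, the C code below refuses the out-of-bounds write from the 64-element example and truncates a copy instead of overflowing. The helper names `store_at` and `copy_bounded` and the constant `BUF_LEN` are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BUF_LEN 64 /* matches the 64-element array in the text */

/* Hypothetical helper: writes value at buf[index] only when index is
   inside the buffer. Valid indices are 0 .. buf_len - 1, so for a
   64-element array the "65th element" (index 64) is rejected. */
bool store_at(char *buf, size_t buf_len, size_t index, char value)
{
    assert(buf != NULL);
    if (index >= buf_len) {
        return false; /* would write past the end of the array */
    }
    buf[index] = value;
    return true;
}

/* Hypothetical helper: copies src into dst without ever writing more
   than dst_len bytes. The result is always NUL-terminated.
   Returns true if the whole string fit, false if it was truncated. */
bool copy_bounded(char *dst, size_t dst_len, const char *src)
{
    assert(dst != NULL && src != NULL && dst_len > 0);
    size_t n = strlen(src);
    bool fits = n < dst_len;
    if (!fits) {
        n = dst_len - 1; /* leave room for the terminator */
    }
    memcpy(dst, src, n);
    dst[n] = '\0';
    return fits;
}
```

With `char buf[BUF_LEN]`, the call `store_at(buf, sizeof buf, 64, 'A')` returns false instead of corrupting whatever sits after the array. The key design choice is that every write receives the destination size explicitly, which is exactly the information an unchecked `buf[64] = 'A'` or a call to `gets()` lacks.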

A basic understanding of where vulnerabilities come from helps you know where to look for potential issues. It lets you test code for security problems and write code with fewer flaws to begin with. In the next article, we will go over the basics of using gdb for disassembly.
