Trying to learn Assembly for Reverse Engineering


I became interested in cybersecurity because I was curious about how malware causes damage. After some time researching and learning the fundamentals of computers and cybersecurity, I realized that malware is like any other program, but with malicious intent. (Yes, it took me weeks to discover this.)

After getting my hands on Practical Malware Analysis by Andrew Honig and Michael Sikorski, I was ready to dissect the malware samples provided in the labs.

Just look at the book cover.

Having worked in the automotive and heavy machinery industries, I occasionally restore or “reverse engineer” components, so dissecting malware and understanding the steps it takes is right up my alley. Most automotive components have manuals, tools and guides to help with repairs and overhauls. One of my first attempts at “reverse engineering” was bypassing the unlocking software in an ECU. (My happiness was short-lived and left a huge hole in my pocket.)

Unfortunately, malware authors are not going to offer you guides and instructions for analyzing their work. I needed to get closer to the source. How does malware communicate with the system? What language does it communicate in? (Hint: It’s binary.) The key lies in the lowest-level human-readable language.

Enter Assembly.

“An assembly language is a type of low-level programming language that is intended to communicate directly with a computer’s hardware. Unlike machine language, which consists of binary and hexadecimal characters, assembly languages are designed to be readable by humans. Low-level programming languages such as assembly language are a necessary bridge between the underlying hardware of a computer and the higher-level programming languages—such as Python or JavaScript—in which modern software programs are written.” – Jason Fernando

I’m only scratching the surface of malware analysis and reverse engineering. In my opinion, a working knowledge of Assembly is essential for analyzing malware and even regular programs. Below are some of the tips and resources that I have assembled (no pun intended) to better understand this phenomenal language. If you have any other tips, feel free to share them.

Tip 1 : Document everything as if your life depends on it.

By some estimates, only 2-10% of people have a photographic memory. Unfortunately, I’m not one of them. Countless times I have had a déjà vu moment in infosec and scrambled to find my notes. Having a platform where you document your findings is essential for making what you learn stick. Plus, by making your research public, you can help others in need.

Tools used for documentation:

  • Public platforms : WordPress, GitHub, etc.
  • Personal notes : OneNote, Google Sheets, Evernote
  • Snipping tool : Take screenshots like it’s a crime scene.
  • Storage : Google Drive

Tip 2 : Learn about the x86 Architecture and memory in an OS.

I made the mistake of jumping straight into reading and reversing Assembly code instead of first understanding the fundamentals of an operating system. Learning the basics of the x86 architecture, such as its instructions, registers, memory and the stack, gives you an overview of how a system executes code.
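
To make the stack a little less abstract, here is a minimal sketch in C of how I picture it. It is a toy model, not real x86: the names `ToyStack`, `toy_push` and `toy_pop` are made up for illustration, and `sp` merely stands in for the ESP register. The one thing it tries to show is that a push moves the stack pointer down before storing, and a pop reads before moving it back up.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Toy model of a 32-bit x86 stack: it grows toward lower addresses. */
typedef struct {
    uint32_t mem[STACK_WORDS]; /* pretend stack memory, one slot per 4-byte word */
    int sp;                    /* index of the current top of stack (stands in for ESP) */
} ToyStack;

/* An empty stack: the pointer sits just past the highest slot. */
void toy_init(ToyStack *s) {
    s->sp = STACK_WORDS;
}

/* Like PUSH: move the stack pointer down first, then store the value. */
void toy_push(ToyStack *s, uint32_t value) {
    assert(s->sp > 0); /* guard against overflowing the toy stack */
    s->sp -= 1;
    s->mem[s->sp] = value;
}

/* Like POP: read the value at the top, then move the stack pointer back up. */
uint32_t toy_pop(ToyStack *s) {
    assert(s->sp < STACK_WORDS); /* guard against popping an empty stack */
    uint32_t value = s->mem[s->sp];
    s->sp += 1;
    return value;
}
```

Pushing 1 and then 2 and popping twice gives back 2 and then 1, which is the last-in, first-out behavior you will keep running into around function calls.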

Resources to learn from:

Tip 3 : Set up the tools of the trade

Before we try to “reverse engineer” or “analyze” life-altering malware, we need to set up our tools.

Tool list:

  1. Source-code editor : Visual Studio Code
  2. Disassemblers/Debuggers : IDA Pro, GDB, radare2, Binary Ninja

Tip 3.1 : I highly recommend Godbolt for people with limited computing power.

Most of the above-mentioned tools are free or have a free edition (IDA Pro itself is a commercial product, though a free edition is available), and that is ample for basic reverse engineering. There is also an interactive online compiler called Godbolt (Compiler Explorer) that shows the assembly output for code written in many compiled languages. It has a great UI and is beginner friendly. I often use it on the go when I want to check a concept in Assembly without launching my lab.

Tip 4 : Reverse engineer your own programs.

Several sources advise you to “read between the lines” of Assembly code and to look out for functions and registers instead of trying to understand every single line. Reverse engineering is a meticulous art form, and I couldn’t think of a better way to learn it than to start from the beginning.

I decided to create extremely simple programs and disassemble them. Starting with everyone’s favorite first program, Hello World!

The example below is done using the online compiler Godbolt.

C source code

Assembly code

Godbolt color-codes each line of source code with its corresponding assembly. This makes it easier for beginners to cross-reference the two and to ignore the noise generated by built-in library functions.
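
For anyone who wants to try this, here is a minimal sketch of a Hello World in C that can be pasted into Godbolt. The function name `say_hello` is my own choice for illustration; as far as I can tell Godbolt is happy to show the assembly for a lone function, so a `main` is not strictly needed there. What the assembly looks like depends on the compiler and flags you pick, but I would expect the string to sit in a data section and the function body to boil down to loading its address and a `call`.

```c
#include <stdio.h>

/* Prints the greeting and returns 1 if the write succeeded, 0 otherwise. */
int say_hello(void) {
    /* puts() returns a non-negative value on success and EOF on failure */
    return puts("Hello World!") >= 0;
}
```

Try switching between compilers or adding an optimization flag such as `-O2` and watch how the output changes.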

Tip 5 : Familiarize yourself with low-level languages like C/C++

Tip 5.1 : Look through Assembly instructions and registers

Some free resources I used to learn C: 

Resources for Assembly:

Tip 6 : Practice programming concepts

You won’t progress very far in IT if you only study programming syntax. It’s crucial to comprehend ideas like functions, loops, variables, and data structures in order to understand how a program operates. I encountered several functions and loops in Assembly language in my rudimentary experience deciphering malware and crackmes (more on this in the next tip).

A loop program disassembled into Assembly language:

A simple C loop program

Assembly code
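
As a rough sketch of the kind of loop program I mean (the function name `sum_to` is just an illustration), here is one that adds up the numbers from 1 to n. In unoptimized x86 output, for example with `-O0` on GCC or Clang, I would expect to see the counter being initialized, a `cmp` against n, a conditional jump out of the loop, and a jump back to the top. With optimizations turned on, the compiler may restructure the loop heavily, so start without them.

```c
/* Sums the integers 1..n with a plain for loop. */
int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i; /* loop body: typically shows up as an add */
    }
    return total;
}
```

Once you can spot the compare-and-jump pattern here, the same shape becomes much easier to recognize in a crackme.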

A simple program that uses a function, disassembled into Assembly:
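
A minimal sketch of such a program could look like the following. The names `add` and `compute` are placeholders of my own. In unoptimized x86 output I would expect `compute` to set up its arguments, issue a `call add`, and pick up the result from the return register, while `add` ends in a `ret`. An optimizing compiler may inline the call away entirely, which is a good lesson in itself.

```c
/* The callee: a small helper that adds two integers. */
int add(int a, int b) {
    return a + b;
}

/* The caller: passes two locals to add() and returns its result. */
int compute(void) {
    int x = 3;
    int y = 4;
    return add(x, y);
}
```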

Reading Assembly might seem daunting at first but after familiarizing yourself with the mnemonic instructions and programming concepts you will start to get an “idea” of what the program is executing.

Tip 7 : Learn the Assembly language. (Or try to)

The Assembly language might seem daunting at first (it still does to me today), but eventually you will start to get an “idea” of it. Building on Tip 4, we can start by writing basic programs in Assembly. In other words, reverse reverse engineering.

We can’t all be like Chris Sawyer, who single-handedly coded the original RollerCoaster Tycoon in x86 assembly, but we can start with the common introductory programs.

Resources used to learn basic Assembly:

Tip 8 : Practice till you are able to reverse time itself.

There are many resources you can use to practice and sharpen your reverse engineering skills, such as crackmes. Some of these challenges are designed to help newcomers grasp the basics of reverse engineering.

  1. Crackmes.one : You can choose the difficulty level, language and OS.
  2. picoCTF : Filter for the Reverse Engineering and Binary Exploitation challenges.
  3. TryHackMe : Try out the Basic Malware RE and Reversing ELF rooms.
  4. GitHub : Some users upload basic RE crackmes for beginners, so search there too.
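
To give a feel for what the easiest crackmes look like in source form, here is a hedged sketch of one. Everything in it is hypothetical: the secret `s3cr3t` and the function name `check_password` are invented for illustration and do not come from any real challenge. The point is that a secret compared with `strcmp` is stored in the binary as plain text, which is why beginner crackmes often fall to a simple string search.

```c
#include <string.h>

/* Hypothetical hard-coded secret; a real crackme would pick its own. */
static const char SECRET[] = "s3cr3t";

/* Returns 1 when the input matches the secret, 0 otherwise. */
int check_password(const char *input) {
    return strcmp(input, SECRET) == 0;
}
```

Compile something like this yourself, open it in a disassembler, and look for the call to `strcmp` and the address of the string passed to it.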

Tip 9 : Have a gameplan

You might come across many password breaking/patching challenges that use simple programming concepts. One thing that helped me in these challenges is having an RE methodology. I use the following methodology when tackling reverse engineering challenges.

My RE Methodology:

  1. Determine the type of a file and its data from the challenge.
  2. Keep a lookout for hard-coded strings.
  3. Do a test run on the program to observe execution.
  4. Disassemble the program and observe the function calls.
  5. Visualize or map out the control flow of the program.
  6. Refer to Assembly cheat sheets like this.
  7. Document the important steps for further reading.
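
The first two steps can be sketched in C as follows. This is only an illustration under my own assumptions: the function names `identify_format` and `count_strings` are made up, and only two well-known signatures are checked (the `0x7F 'E' 'L' 'F'` bytes at the start of ELF files and the `MZ` bytes at the start of Windows executables). In practice, existing tools such as `file` and `strings` do this job far more thoroughly.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Step 1: guess the file format from its first bytes (magic numbers). */
const char *identify_format(const unsigned char *buf, size_t len) {
    /* "\x7f" and "ELF" are split so the hex escape does not swallow the 'E' */
    if (len >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0) {
        return "ELF";
    }
    if (len >= 2 && buf[0] == 'M' && buf[1] == 'Z') {
        return "MZ/PE";
    }
    return "unknown";
}

/* Step 2: count runs of printable ASCII that are at least min_len bytes long,
   roughly the candidates a strings-style search would report. */
size_t count_strings(const unsigned char *buf, size_t len, size_t min_len) {
    size_t count = 0;
    size_t run = 0;
    for (size_t i = 0; i < len; i++) {
        if (isprint(buf[i])) {
            run++;
        } else {
            if (run >= min_len) {
                count++;
            }
            run = 0;
        }
    }
    if (run >= min_len) {
        count++; /* a run that reaches the end of the buffer */
    }
    return count;
}
```

Feeding these functions the first few kilobytes of a challenge binary gives a quick first impression before opening a disassembler.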

Tip 10 : Do not give up.

Even simple challenges can be frustrating. Come back to the challenge another time or reread the Assembly instructions. Maybe you’re missing a small detail that makes a difference. When you get stuck, use a decompiler like Ghidra, which displays the code in a more human-readable form. As a last resort, read or watch walkthroughs by other users. Instead of blindly following the steps, go through each one and understand the reasoning behind it. Lastly, remember to document what you learn from the walkthrough for future use.

Tip 11 : Learn from the pros.

With the advent of the Internet, self-paced learning has proven effective, convenient, and fast. However, it can be difficult and lonely at times. The best thing you can do is seek out knowledge. The infosec community is vast and offers ample resources for professionals and jobseekers to connect, learn, attend events, and upskill. Reach out to professionals you come across on LinkedIn, Twitter or Discord and ask them questions. Pick their brains and understand their reverse engineering methodology.

Of course, it goes without saying to be courteous and not entitled. 
