How I reverse-engineer an application for security assessment

Introduction

I've been reverse-engineering software for a few years now, and one thing I've noticed is that no matter what the software is, the process is nearly identical. Beginner reversers, as they're called, will look at me funny for saying that reversing an assembly-language program is similar to reversing a progressive web application. But once a reverser has learned assembly, JavaScript, C, decompilers, disassemblers, network capture tools, and so on, the patterns start to repeat. My goal for this article is to highlight those patterns rather than the details of any specific system, because the process is far more valuable in the long run.

Before I dive into the process that works for me, I want to first present why the hell anyone would want to reverse-engineer software to begin with.

Why reverse-engineer software?

The two most common reasons for reverse-engineering software seem to be figuring out how undocumented or legacy software works, and learning how a competitor's software works - what their architecture looks like, say, or how a specific algorithm manages to be so performant. In the latter case, reverse-engineering can almost be seen as a "dirty" way to gain a competitive advantage - although it is generally completely legal from an educational standpoint.

However, there are quite a few other good reasons for reverse-engineering software that I would like to bring to light - especially for those who work in the field of Software Security like I do.

1) Interoperability with undocumented or unmaintained software - A good example of this use-case is anti-malware firms. These firms are notorious for employing some of the largest numbers of software reverse-engineers, because they need to find out how malicious software works on a continuing basis in order to keep improving their anti-malware product. When you strip away the "good vs bad" aspect, what's really happening is that Company A's software needs to work well with the software of Group B (in this case, the malware authors). This relationship may be needed elsewhere too: if a large software firm X goes out of business, another company Y's software may need to keep working with X's software for some period of time.

2) Discovering your application's attack surface - This is a HUGE one, especially for me working in software security. It doesn't just mean defending your application and its users from malicious hackers, but also defending your intellectual property from competitors. Unless those attackers get someone inside your organization, they have to take a black-box approach, and reverse-engineering is by far the best way I've found to audit both application security and intellectual property protection - simply because, as a reverser, you stand in the shoes of the attacker and see exactly what they will see: the same low-hanging fruit, the same static strings leaking company information, passwords, keys, etc.

3) 3rd party modifications and tinkering - Another big reason for reversing applications: say you have a device such as Bluetooth headphones and, through research, you've realized you'd like feature X in the software or firmware, but the manufacturer hasn't supplied this feature yet, even though the hardware is capable of it. In this case, hobbyists reverse-engineer firmware and software in order to add features to their device, and perhaps to the devices of other customers. Some manufacturers may frown on this, but many actually appreciate the enthusiast/hobbyist communities, which do in fact drive extra product purchases. In many cases, devices are capable of doing more than the UI exposes, and some reversing of firmware or Bluetooth UUIDs, for example, can let customers get more from their IoT devices.

My Process

This is what my process typically looks like:

1) Preliminary light-technical reconnaissance

This includes downloading and using the app, googling the publisher and searching for them on Wikipedia, learning which software libraries are known to be used and the versions of those libraries/binaries, etc. Sometimes I'll go as far as looking up the developers on LinkedIn, reading about who made the app and what their background is.

2) First stage technical recon

This includes running basic static-analysis tools on the release/build files: PE-Studio for Windows binaries; ELF or Mach-O utilities, "strings", "XORstrings", and "signsrch" for Unix or macOS files; etc. At this stage, I also identify which language(s) the application is written in and check whether I have the proper tools for step 3 - if not, I research and/or download them. For example, if it's an Android app, I'm downloading apktool; if it's a .NET app, I'm downloading dnSpy and de4dot; if it's a native app, I'm getting IDA Pro and OllyDbg; and so on. For web apps: Burp Suite, OWASP ZAP, Fiddler, Postman, Wireshark, shodan.io, robtex.com, Netcraft site report, etc. The last two sites are great for recon of web servers, revealing the software used and its versions.

For hardware devices, this stage includes researching which chips are used in the device, Googling every single chip, reading the specs/datasheets, and then searching for the product/model no. followed by the keywords "security vulnerabilities." For hardware, it's also important to assess and test the network stack. So if the hardware exposes a Bluetooth or Wi-Fi interface, for example, get Bluetooth and Wi-Fi testing tools and learn about the protocols, their common pitfalls, etc. For software, I search exploit-db.com for every component of the application. So if I'm testing a WordPress site, I get the version number used and immediately type "WordPress" into exploit-db to look for known vulnerabilities.
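To make the string-triage part of this stage concrete, here is a minimal sketch - not any particular tool, just a toy re-implementation of the "strings"-plus-keyword-grep workflow in Python, run against a hypothetical byte blob standing in for a real binary:

```python
import re

# Keywords that frequently flag leaked secrets in static strings.
KEYWORDS = (b"password", b"secret", b"api_key", b"token", b"begin rsa")

def extract_strings(data: bytes, min_len: int = 4):
    """Yield printable-ASCII runs of at least min_len bytes (like `strings`)."""
    for match in re.finditer(rb"[\x20-\x7e]{%d,}" % min_len, data):
        yield match.group()

def suspicious_strings(data: bytes):
    """Return extracted strings containing any secret-ish keyword."""
    return [s for s in extract_strings(data)
            if any(k in s.lower() for k in KEYWORDS)]

# Hypothetical blob standing in for the bytes of a real release binary.
blob = b"\x00\x01GET /login\x00password: fhashjdj230\x00\x7f\x05build-id"
print(suspicious_strings(blob))  # -> [b'password: fhashjdj230']
```

In practice you'd point this (or the real "strings" utility) at the actual release files and skim the output for anything credential-shaped before moving on to step 3.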

3) Heavy technical reconnaissance

It should be noted that this step and everything beyond it are optional - they may not even be necessary! For example, if our intention was to see whether a "data breach" was possible - i.e., access to sensitive data stored in the back-end - and in step 2 we found an OAuth identity root key in the static strings of the application, we're already done! Of course, we'd want to keep exploring to make sure no more secrets are exposed, but you get the point.

Assuming we need to explore further, step 3 is where we open up our disassemblers, decompilers, network packet analyzers, and web proxies and begin exploring, using our detailed notes from steps 1 and 2 to look for key problem areas… You were taking notes, right? For example, say that in step 2 we searched for the word "password" and found a hit on "password: fhashjdj230" as a static string in a binary. Now we search the disassembly or decompilation for this string to find out how and where it's used, and by which functions. In the case of web apps, I run through them in several browsers and experiment with every user-input area: what happens if I pass extra characters into an input box? What about the wrong character type? What if I add some text to the end of a URL? What if I change a GET request parameter and resubmit the request? What if I add some JavaScript into the request? Doing this with the browser and dev tools is mostly good enough, but I also use a web proxy such as Burp Suite and/or tools like Fiddler and Postman here. With all of those tools, I'm empowered to find nearly any web app vulnerability.
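The parameter-tampering experiments above can be scripted as well. As a rough sketch using only the Python standard library - the URL and payloads here are hypothetical - this builds mutated copies of a request URL, which you would then replay through a proxy and diff the responses:

```python
from urllib.parse import urlparse, parse_qs, urlencode, urlunparse

def tamper_param(url: str, param: str, payloads):
    """Yield copies of `url` with one query parameter replaced by each payload."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    for payload in payloads:
        mutated = dict(query, **{param: [payload]})  # swap in one probe value
        yield urlunparse(parts._replace(query=urlencode(mutated, doseq=True)))

# Hypothetical target; payloads probe for injection and type-confusion bugs.
url = "https://example.com/search?q=shoes&page=1"
for probe in tamper_param(url, "page", ["-1", "9999999", "1'--"]):
    print(probe)  # replay each probe and watch how the server responds
```

The interesting signal is any *difference* in the response - an error page, a stack trace, a changed status code - which tells you the input reached code that wasn't expecting it.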

This step also covers network traffic for non-web apps: which IPs/domains is the application calling out to? What type of data is it sending? Is it encrypted? Using which algorithm? Which mode? How are keys exchanged? Which protocols are used? Are certificates used? How are those certificates validated? For all of this, HTTP proxies and tools like Wireshark can be used.
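The certificate-validation question in particular is easy to answer when you can read the client code. As a minimal sketch - assuming a Python client, purely for illustration - this shows the secure defaults versus the dangerous pattern to grep for during review:

```python
import ssl

# A properly configured client context verifies both the certificate chain
# and the peer's hostname; an app that disables either is worth flagging.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # chain validation on by default
print(ctx.check_hostname)                    # hostname verification on by default

# The red flag to look for: validation deliberately switched off.
insecure = ssl.create_default_context()
insecure.check_hostname = False       # must be disabled before verify_mode
insecure.verify_mode = ssl.CERT_NONE  # accepts ANY certificate
print(insecure.verify_mode == ssl.CERT_NONE)
```

Equivalent "trust everything" switches exist in most HTTP libraries and are a common finding in mobile and IoT clients.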

The heavy technical recon phase can be tedious if not done right, and for beginners: this step does not mean spending weeks going through every module of the application. It is CRITICAL that you have targets and a goal in mind, otherwise you can go down rabbit holes forever and waste a ton of time and get frustrated. Many new reversers assume that getting better and better at deep technical analysis is key, but what I've actually found helps my work far more is doing a better job in the preliminary and first-stage technical recon, so that I have clear targets in mind when I reach this phase. You don't want to be doing basic recon one line of assembly at a time, or blindly throwing data at APIs you know nothing about... That is, unless you're using a fuzzing tool built precisely for that.
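To show what "blindly throwing data" looks like when it *is* done deliberately, here's a toy sketch of the fuzzing idea - the `parse_header` target is hypothetical, and real fuzzers (AFL, libFuzzer, etc.) are vastly more sophisticated:

```python
import random
import string

def parse_header(raw: str) -> tuple:
    """Hypothetical target: a toy parser with a lurking crash bug."""
    name, value = raw.split(":", 1)   # raises ValueError on input without a colon
    return name.strip(), value.strip()

def fuzz(target, runs: int = 1000, seed: int = 0):
    """Throw random printable strings at `target`; collect inputs that crash it."""
    rng = random.Random(seed)  # seeded so crashing inputs are reproducible
    crashes = []
    for _ in range(runs):
        sample = "".join(rng.choice(string.printable)
                         for _ in range(rng.randrange(1, 20)))
        try:
            target(sample)
        except Exception:
            crashes.append(sample)
    return crashes

print(len(fuzz(parse_header)), "crashing inputs found")
```

Even this naive loop finds the missing-colon crash almost immediately, which is the point: fuzzing automates the "what if I send garbage?" question at scale.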

The idea is that simply by learning how the application works and experimenting with it, you will begin to identify weaknesses and vulnerabilities, especially as you gain more field experience and read relevant books such as The Web Application Hacker's Handbook and The Art of Software Security Assessment Volumes I and II, which show you exactly what to look for. As your skills improve, you will get better at each phase and be able to zone in on the commonly vulnerable areas of applications.

4) Identification of weaknesses

Jot down everything you find, noting which source files or areas of the binary/website contain the issues. Google similar weaknesses in other applications or hardware of the same type; see what damage was caused, how they were remediated, etc.

5) Risk assessment

Research how severe each of these findings is and how much it impacts your users. Are we talking about a minor inconvenience or annoyance, or is private user data at risk? I've found in my career so far that there are two major types of risk. The first is a one-off, such as a nasty vulnerability that allows a complete device/application takeover - but these are often localized and require that the attacker already has access to the device, either remotely through a malware install or phishing campaign, or physically. In my experience, this risk is isolated to high-profile users or systems, which often (but not always) must be specifically targeted by criminals - banking software, or software on a Prime Minister's or President's devices, as an extreme example. The second type of risk is generally cloud/server-side, where a weakness can expose the sensitive information of thousands, millions, or BILLIONS of people - the so-called "data breaches" which apply to far more tech companies, damage brand reputation, and cost hundreds of millions of dollars.

6) Remediation

In this phase, we identify the root cause of the vulnerability and what can be done to patch it, and we sometimes develop so-called PoCs, or proofs of concept - actual attacks that a malicious actor could perform to cause damage - in order to demonstrate the severity and further explore patch solutions. In many cases, remediation for specific issues is already widely published and documented. For example, if you've found a cross-site scripting vulnerability in a web app, you can reference OWASP's page on XSS, including their remediation steps: owasp.org/index.php/XSS_(Cross_Site_Scripti.. .
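To illustrate the core of the standard XSS remediation - contextual output encoding, as OWASP's guidance describes - here is a minimal Python sketch; the `render_greeting` function is hypothetical:

```python
import html

def render_greeting(user_input: str) -> str:
    """Escape untrusted input before embedding it in an HTML context."""
    return "<p>Hello, {}!</p>".format(html.escape(user_input, quote=True))

print(render_greeting("<script>alert(1)</script>"))
# The payload is rendered inert: &lt;script&gt;alert(1)&lt;/script&gt;
```

The same principle applies in every templating system: encode at the point of output, for the specific context (HTML body, attribute, JavaScript, URL) the data lands in.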

7) Ticket filing

In this phase, we write up detailed, thorough, well-researched tickets to explain to the development teams the finding, the risk, the priority, and the remediation steps. It is important to give a thorough background and plenty of information because developers are busy with a plethora of bugs and issues and may not be security nuts like application security engineers are. Simply writing "This is a CSRF, it's very bad" is not very helpful in assisting the development team in understanding the threats, implications, and solutions for the problem. This also means not crying wolf and marking everything as a critical bug when risk or exploitation is extremely rare... If you're submitting 100 critical bugs every week, that's not going to do the business and its customers much good in the way of security.

8) Follow-up

I like to be a resource to the developers for any security-related questions they have as a result of my findings - either in-person, via Slack, on the ticket, etc... This step also includes ensuring that the ticket has reached the proper team or team lead.
