feeds

by zer0x0ne — on

cover-image

some of my favourite websites: the hackers news trail of bits dark reading threatpost tripwire security weekly xkcd



xkcd

Retrieved title: xkcd.com, 3 item(s)
Quantified Self

It's made me way more excited about ferris wheels, subways, car washes, waterslides, and store entrances that have double doors with a divider in the middle.

Wing Lift

Once the air from the top passes below the plane of the wing and catches sight of the spooky skulls, it panics, which is the cause of turbulent vortices.

Two Key System

Our company can be your one-stop shop for decentralization.

Check Point Research

Retrieved title: Check Point Research, 3 item(s)
The New Era of Hacktivism – State-Mobilized Hacktivism Proliferates to the West and Beyond

Introduction

Until last year, hacktivism has primarily been associated with groups like Anonymous – decentralized and unstructured collectives made up of private individuals with a variety of agendas. Anonymous has launched multiple campaigns against a wide range of targets based on the preferences and wishes of its members. There was no real ideological affiliation or connection between the group’s members, and apparently no long-term agenda. Anyone, regardless of political affiliation, is welcome to join.

In the past year, things have changed. As one of the multiple fallouts of conflicts in Eastern Europe and the Middle East, some hacktivism groups stepped up their activities in form and focus to a new era; Hacktivism is no longer just about social groups with fluid agendas. The upgraded hacktivism is better organized, structured and more sophisticated. Though the change began in specific conflict-related geographical regions, it has now spread west and even further. Major corporations and governments in Europe and the US are being heavily targeted by this emerging type of hacktivism. In recent months, the US, Germany, Lithuania, Italy, Estonia, Norway, Finland, Poland and Japan suffered severe attacks from state-mobilized groups, which in some cases have had a significant impact. The recent attacks targeted not only the governments of these countries, but also major corporations like Lockheed Martin, a global defense contractor. Also, the latest large-scale attacks on the Albanian government were executed by a state-mobilized hacktivist group.

The major hacktivist groups that appeared in the last year share many characteristics of structured organizations: clear and consistent political ideology, a well-designed hierarchy for members and leadership, a formal recruitment process and even tools that the groups provide to their members. In addition, the groups and their members are aligned on the targets, and in a few cases, there is even organized cooperation between groups. In addition, the groups also have robust public relations operations to publicize and promote their successes, including on major media channels and websites.

All this allows the new hacktivism groups to be mobilized to governmental narratives and achieve strategic and broad-based goals with higher success levels – and much wider public impact – than ever before. Hacktivist groups no longer consist of a few random individuals who carry out small DDoS or defacement attacks on low-tier websites. These are coordinated organizations which launch organized large-scale DDOS and disruptive attacks against their targets, with far-reaching public relations. Therefore government agencies and organizations should consider themselves duly warned.


Old School Hacktivism

Examples of old-school hacktivist attacks include campaigns like Operation KKK against Ku Klux Klan supporters and members, the campaign against the United Nations in retaliation for not granting a seat to Taiwan, Operation AntiSec, whose goal was to steal and publish classified government documents, #Opwhales to support whale preservation, and more. In some cases, there were even contradictory campaigns executed by Anonymous within the same year, such as the #OpTrump and #OpHillaryClinton campaigns.

Figure 1 – Example of contradictory campaigns executed by Anonymous 

 

Hacktivism Model 2022 – Mobilization to Government Agendas

The shift in hacktivism began quietly 2 years ago in the Middle East, with several hacktivist groups like Hackers of Savior, Black Shadow and Moses Staff that focused exclusively on attacking Israel. Most did not hide their affiliation with the Iranian regime’s anti-Israel narrative. In parallel, several other groups in the Middle East, the most prominent being Predatory Sparrow, focused solely on attacking pro-Iranian targets. Their only shared agenda is opposition to the Iranian regime.

The geo-political agenda that mobilized hacktivism is not limited to the Middle East but is also an essential part of the Russian-Ukrainian war. In early 2022, the Belarusian Cyber-Partisans group, formed in 2020 to oppose the Belarussian government, began launching destructive cyber-attacks to stymie Russia’s troops.

The IT Army of Ukraine was publicly mobilized by the Ukrainian government to attack Russia. The new hacktivism also saw groups that supported the Russian geopolitical narrative, with groups like Killnet, Xaknet, From Russia with Love (FRwL), NoName057(16), and more.

Although the new hacktivism started in specific and limited geographical areas, the Russian-mobilized groups soon turned their focus from being solely on Ukraine, but on anyone opposing the Russian agenda, i.e. Europe, the United States and even Asia. This included significant attacks on governments and major corporations in the US, Lithuania, Italy, Estonia, Norway, Finland, Poland, Japan, and more.

These groups have also clearly stated agendas supporting Russian information warfare and interests, as we can see in the manifest of Noname057(16).

 

Figure 2 – Manifest of Noname057(16) group 

This group has a clear pro-Russian agenda, and has been regularly targeting Ukraine, and expanding their focus. During the last few months, Noname057(16) targeted many countries in the European Union which publicly supported Ukraine, such as Poland, Lithuania, Latvia, Slovakia and Finland. NoName057(16) also notably attacked the website of the Finnish Parliament in August, after Finland expressed interest in joining NATO.

Figure 3 – Noname057(16) targeting of Finnish Parliament

From Russia with Love (FRwL) is another group that sticks to the same state-mobilized modus operandi but gets less public attention. The group focuses on publishing private information on its Telegram channel and claims to have committed several attacks on “Russia’s enemies.” They claim to have gained sensitive information on Estonia and Lithuania by accessing Telegram channels related to the Ukrainian Security Service. FRwL joined the wave of attacks on Lockheed Martin and its subcontractors who produce HIMARS, that are part of the American assistance delivered to Ukraine. FRwL also claimed to breach Gorilla Circuits, a USA Printed Circuit Board Manufacturer, who are one of Lockheed Martin’s suppliers.

On the other side of the conflict, there are also multiple mobilized hacktivist groups who side with Ukraine. Some, like the IT Army of Ukraine, are officially run by the Ukrainian government. The IT Army of Ukraine was established days after the Russian invasion began and coordinated skilled volunteers from all over the world to operate under Ukrainian directive. According to CSS Zurich, the IT Army consists of both multiple global volunteers that work to coordinate DDoS attacks against Russian targets, and an additional team that works on deeper levels, perhaps comprised of Ukrainian defense and intelligence experts that can carry out more complicated cyber operations against specific Russian targets.

One of the most dominant groups joining the IT Army of Ukraine is TeamOneFist, the Pro-Ukraine collective that in August targeted Khanty-Mansiysk City in Russia, damaging the natural gas power plant and also causing a blackout at its airport.

Figure 4 – Team OneFist claims on the attack against Khanty-Mansiysk city in Russia

Although the pro-Ukrainian-mobilized groups currently focus exclusively on Russia, they still established precedents for state-affiliated and mobilized hacktivism.

During the last few months, we also saw a clear proliferation of Iranian state-mobilized hacktivism targeting Europe and NATO. On July 15, 2022, Albania suffered a serious cyberattack which temporarily shut down numerous Albanian government digital services and websites. The group that took responsibility for this attack is a hacktivist group called Homeland Justice, which is affiliated with Iran’s Ministry of Intelligence and Security. In this case, Homeland Justice clearly serves the Iranian government’s agenda against Mujahedin-e-Khalq (MEK), an Iranian dissident group sheltered by the Albanian government.

 

Killnet Case Study – Starting East and Going West

One of the major actors in the hacktivist ecosystem is Killnet, which was publicly launched around the end of February 2022, at the start of the Russian–Ukrainian war. The group began their aggressive activity in March, with targets mostly in Ukraine. However, in April the group completely shifted its focus to support Russian geopolitical interests all over the world. Between late February and September, the group claimed to have executed more than 550 attacks. Only 45 of them were against Ukraine, less than 10% of the total number of attacks.

 

Figure 5 – Distribution of Killnet attacks by country

Many of those attacks were against high-profile targets like major government websites, large financial companies, airports, and more. While in some cases it is difficult to understand the real impact, in many cases the attacks were clearly successful. They caused downtime for major websites, many of which provide essential public services.

Here are several examples of the attacks executed by Killnet:

  1. In March, the group executed a DDoS attack on Bradley International Airport in Connecticut (US). The US authorities confirmed an attempted large-scale DDOS attack on the airport’s website.

Figure 6 – Announcement about the attack against Bradley International Airport

  1. In April, websites belonging to the Romanian Government, such as the Ministry of Defense, Border Police, National Railway Transport Company and a commercial bank, were rendered unreachable for several hours. These attacks were in response to a statement made by the Romanian leader of the Social Democratic Party Marcel Ciolacu, who offered to provide weapons to Ukraine.

Figure 7 – Announcement about the attack against the Romanian government

  1. In May, major DDOS attacks were executed against two major EU countries:
    • Several German targets were affected, including German government and politicians’ websites, among them Chancellor Olaf Scholz’s party-affiliated site, Germany’s Defense Ministry, the German Parliament, Federal Police and several state police authorities. All this was a response to the Scholz administration’s efforts to supply military equipment to Ukraine. The government authorized the transfer of 50 Gepard anti-aircraft installations and announced the delivery of seven self-propelled, rapid-fire artillery systems.
    • Italy’s Parliament, military and National Health Institute were
  2. In June, two very significant waves of attacks were executed against Lithuania and Norway in response to severe geopolitical developments between those countries and Russia:
    • Following the decision of Lithuania’s government to halt the transit of Russian goods to Kaliningrad, a wave of major attacks hit Lithuanian public services and the private sector. During the attack, Jonas Skardinskas, the Head of cybersecurity at the Lithuanian National Cyber Security Center (NCSC), warned that the disruptions might continue for several days with the transport, energy and financial sectors bearing the brunt of the attacks. At some point, the majority of Lithuanian websites were not accessible from IP addresses outside of the country, most likely as a preemptive measure to mitigate the attack.
    • The same month, several large Norwegian organizations were taken offline. This attack was believed to be executed as a result of a dispute over transit through Norwegian territory to an Arctic coal-mining settlement controlled by Russia.
  3. In July, Killnet focused their efforts on Poland and caused several government websites to be unavailable. Most of the attacks were directed against government portals, tax authorities and police websites.
  4. August was quite a busy month for Killnet. It started with an attack on Latvia. After declaring Russia a “State sponsor of Terrorism”, the Latvian Parliament’s website suffered a major DDoS attack. Later in the month, Estonia faced its most extensive cyber-attacks since 2007, as a response to the removal of Soviet monuments. The effectiveness of those attacks was questionable, as it seems that Estonia was well prepared for such scenarios. During August, Killnet also started focusing on the United States. The giant American manufacturer Lockheed Martin was heavily targeted by Killnet as a response to providing military systems to the Ukrainian army. In parallel, Killnet also targeted the US Electronic Health Monitoring and Tracking System and the US Senate, which was debating sending additional aid to Ukraine.

Figure 8 – Announcement about the attack against Lockheed Martin

  1. In September, the group targeted Asia for the first time and focused its efforts on Japan, due to Japan’s support for Ukraine. With the escalation of the Russian–Japanese conflict over the Kuril Islands, Killnet successfully attacked multiple top Japanese websites, including the e-government, the public transportation websites for Tokyo and Osaka, the JCB payment system, and Mixi, Japan’s second largest social media site.


Leadership, Recruitment & Tools

Organizational Structure

The major hacktivist groups that arose during the past year are characterized by their well-structured operations which allow them to not only achieve targeted attacks in waves but also attract more skilled individuals. These individuals are usually motivated by a clear state-affiliated ideology, and their goals are incorporated into a manifesto and/ or set of rules to follow. For example, Killnet has more than 89,000 subscribers on their Telegram channel and is organized in a military-like structure with a clear top-down hierarchy. Killnet consists of multiple specialized squads to perform attacks which answer to the main commandment. There are currently around a dozen sub-groups, the main one being Legion. All of these groups are led by an anonymous hacker called KillMilk, who announced his intention to go solo in July, but is still involved in the group’s activity. Legion and the squads (known as: Jacky”, “Mirai”, “Impulse”, “Sakurajima”, “Rayd”, “Zarya”, “Vera”,  “Phoenix”, “Kajluk”, “Sparta” and “DDOSGUNG”) are referred to as Killnet’s special forces, with Legion referred to as its Cyber Intelligence Force.

Figure 9 – Recruitment announcement to the squads of the Legion

Multiple small squads are organized around the largest group and its former leader, KillMilk, which relays attack orders to each group’s “commander”, which allows independent infrastructures, inevitably improving the survival of the entire organization. This proves effective as the squads continue to recruit members and increase in number. Their Telegram page contains rules, discussions about targets, and instructions on joining/creating additional squads by members who seek autonomy or “promotion.”

Figure 10 – Rules of the Legion

Killnet’s evolution has put them in a position where other groups want to collaborate with them or officially join forces.

Recruitment

Another interesting and new phenomenon concerns the groups’ recruitment methods. Unlike Anonymous, who takes pride in welcoming everyone, with no requirement of proof of skills or specific agenda, the new era hacktivists only accept members who meet certain minimum requirements.

Figure 11 – Required professionals to Killnet

Many groups, such as Killnet and its squads, choose to invest in “proper” recruitment programs advertised on their Telegram channels. Some groups have a pre-selection process to bring only skilled hackers or experts in a particular field, to reduce the risk of making mistakes that could expose the whole operation.

Figure 12 – Recruitment form to the squads of the Legion

However, recently we observed KillNet relaying DDoS attacks instructions to the masses, perhaps due to a lack of manpower to carry out all its intended actions.

Figure 13 – DDoS attacks instructions by Killnet

On several occasions, we also saw KillNet offering rewards for individuals performing real-world, not virtual, vandalism in Ukraine.

Figure 14 – KillNet offering a reward for anyone willing to physically sabotage monuments on their behalf in Ukraine

The recruitment process is similar for many Russian state-mobilized groups. For example, XakNet (who also refer to themselves as the “Team of Russian Patriots”) is a Russian-speaking group that has been active as early as March 2022. XakNet threatened to retaliate against Ukrainian organizations for any cyber-attacks carried out against Russia and has targeted several entities within Ukraine and leaked the contents of a Ukrainian government official’s email. XakNet declared they will not recruit hackers, pentesters, or OSINT specialists without proven experience and skills.

Figure 15 – XakNet talent acquisition announcement

Other groups, like the pro-Russian NoName057(16), may offer training through different means such as e-learning platforms, tutorials, courses or mentoring.

Tools & Technological Prowess

Hacktivist groups strive to utilize more advanced tools in their attacks, as the more damaging the attacks, the more exposure and notoriety for the group. We have seen flashes of advanced tactics globally, but with the immediate and repetitive nature of hacktivism campaigns, most of the activity was focused around DDoS attacks through the use of huge botnets:

Figure 16 – Killnet’s claim about the size of their botnet

According to Avast, NoName057(16) uses a RAT known as Bobik, which has been around since 2020 together with Redline stealer. Recent reports state that devices infected with Bobik are part of a botnet carrying out DDoS attacks on behalf of NoName057(16).

In some cases, we see the group allegedly uses much more sophisticated destructive tools. For example, TeamOneFist is linked to destructive activities against Russian SCADA systems, and the Belarusian Cyber Partisans breached the computer systems of Belarusian Railways, right before the conflict started. In August, From Russia with Love (FRwL), claimed to have written their own Locker-like ransomware called “Somnia”.

Figure 17 – From Russia with Love claim about launch of ransomware

Conclusion 

Conflicts in Eastern Europe and the Middle East in the last few years affected the lives of many and escalated situations in broad domains across the world. One of the most significant escalation can be observed in the cyberspace ecosystems. During the previous decade, hacktivism was mostly a buzz word, which did not pose significant risks to global organizations. Having become more organized, structured, and sophisticated, hacktivism has ushered in a renaissance era. More concerning now is, many hacktivist groups have a very clear state-affiliated agenda and serve the special interests of specific governments. Even though this all began in specific conflict areas, we already see its proliferation to the west and beyond. We further expect hacktivist operators to enhance their arsenal and unleash state-level destructive attacks. Another growing worry is the fact that more and more governments are inspired by the success of new state-mobilized hacktivist groups, which may signify this phenomenon is here to evolve into a long-term activity.

The post The New Era of Hacktivism – State-Mobilized Hacktivism Proliferates to the West and Beyond appeared first on Check Point Research.

7 Years of Scarlet Mimic’s Mobile Surveillance Campaign Targeting Uyghurs

Introduction

In 2022, Check Point Research (CPR) observed a new wave of a long-standing campaign targeting the Uyghur community, a Turkic ethnic group originating from Central Asia, one of the largest minority ethnic groups in China. This malicious activity, which we attributed to the threat actor Scarlet Mimic, was first brought to light back in 2016.

Since then, CPR has observed the group using more than 20 different variations of their Android malware, disguised in multiple Uyghur-related baits such as books, pictures, and even an audio version of the Quran, the holy text of the Islamic faith. The malware is relatively unsophisticated from a technical standpoint. However, its capabilities allow the attackers to easily steal sensitive data from the infected device, as well as perform calls or send an SMS on the victim’s behalf and track their location in real-time. Also, it allows audio recording of incoming and outgoing calls, as well as surround recording. All this makes it a powerful and dangerous surveillance tool.

In this report, we present a technical analysis and describe the evolution of the campaign in the last seven years. Although a small part of this campaign was briefly discussed in Cyble’s publication as an isolated and unattributed incident, in this article we put the whole campaign in perspective and outline almost a decade’s worth of persistent efforts in phone surveillance of the Uyghur community.

 

Overview of the campaign

Since 2015, CPR has identified more than 20 samples of Android spyware called MobileOrder, with the latest variant dated mid-August 2022. As there are no indications that any of them were distributed from the Google Store, we can assume the malware is distributed by other means, most likely by social engineering campaigns. In most cases, the malicious applications masquerade as PDF documents, photos, or audio. When the victim opens the decoy content, the malware begins to perform extensive surveillance actions in the background. These include stealing sensitive data such as the device info, SMS and calls, the device location, and files stored on the device. The malware is also capable of actively executing commands to run a remote shell, take photos, perform calls, manipulate the SMS, call logs and local files, and record the surround sound.

Figure 1 – MobileOrder malware samples observed in the wild.

All the samples are based on the code of the MobileOrder malware from 2015, although during the ensuing years some changes were introduced by the developers. A few of these changes were clearly developed to reduce the chances of the malware being detected by security solutions: the malware authors experimented with ways to hide the malicious strings (which indicate the malware’s intentions), first by moving them to the resources section, and later encoding them in base64.

The actors also added a few adjustments and features to gather more information from their victims’ devices. One new aspect is to move from using AMAP SDK, an Android SDK used to identify geolocation, to using the standard Android LocationListener implementation. This allows the attackers to track their target’s location in real-time instead of an on-demand basis.

Figure 2 – Evolution of the Android malware.

The MobileOrder malware, despite being actively used and updated, still does not support modern Android OS features, such as runtime permissions or new intent for APK installation, and does not use techniques common to most modern malware such as accessibility usage, avoiding battery optimization, etc.

We are not able to identify which attacks have been successful, however, the fact that the threat actors continue to develop and deploy the malware for so many years suggests that they have been successful in at least some of their operations.

 

Technical analysis

When the victim opens the lure, whether it is a document, picture, or audio file, it actually launches the malicious application, which in turn opens a decoy document to distract the victim from background malicious actions. Some of the versions also ask for Device Admin and root access, which not only gives the malware full access to the device, but also prevents the victim from easily uninstalling the application:

Figure 3 – Device admin activation and superuser request.

The malware then hides its icon and launches two services: core and open. The open service is responsible for showing the victim the decoy content (a PDF file or an image or an audio record) which is stored in res/raw/, res/drawable/ or assets:

Figure 4 – Malware code that displays a decoy picture from February 2022 version.

The core service launches the Communication thread, which connects to the C&C (command & control) server and processes the commands received, and the KeepAlive thread, which periodically triggers a connection to the server and relaunches the parent service.

Figure 5 – The service that starts the Communication and KeepAlive threads.

However, the KeepAlive thread is not the only one responsible for keeping the malware active. The malware developer also created BroadcastReceiver that starts the core Service. The triggers for this receiver are numerous actions registered in the AndroidManifest, making sure the malware stays active all the time.

Figure 6 – AndroidManifest.xml specifying triggers for the BroadcastReceiver which is responsible for keeping the malware alive.

 

C&C Communication

Depending on the sample, the malware can use a hardcoded list of C&C servers, dead drop resolvers, or both.

First, the malware starts the process of resolving the C&C server, which includes decoding the built-in C&C addresses and, where it is defined, extracting the C&C server from dead drop resolvers which point to additional C&C infrastructure.

Figure 7 – The malware decodes the hardcoded C&C domains and the C&C server from the dead drop resolver.

The use of dead drop resolvers helps prevent the infrastructure from being easily discovered through static analysis, but also enables operational resiliency as this infrastructure may be dynamically changed. All the versions of the malware that make use of dead drop resolvers query different posts on the Chinese Sina blog platform.

Dead drop resolvers

First, the malware requests a specific blog page:

Figure 8 – Dead drop resolver on a Sina blog post.

Then it searches the received HTML for a specific base64-encoded regex pattern and decodes it to get the real C&C IP address and port.

Figure 9 – The code responsible for regex pattern matching of the dead drop resolvers.

In this specific example, the string MjA5Ljk3LjE3My4xMjQ6MjY3NQ== is decoded to 209.97.173.124:2675. The malware then creates a socket connection to the specified IP and port.

Encryption

To secure communication with the C&C server, the malware encrypts the data with AES. The key is generated in runtime from an encrypted passphrase inside dex by calculating the MD5 digest:

Figure 10- AES key generation.

 

Command execution

After successfully connecting to the C&C, the malware processes commands from the remote server. It first reads a command, then an argument size, and finally the actual encrypted arguments.

This is the full list of commands:

Command ID Description
64 Send a list of files from the specific path.
65 Send a list of processes running on the device.
67 Send device and connectivity information (IMEI, Phone Number, Network type, Accounts, Installed applications, Browser history and others).
68 Delete specific files on the device.
69 Upload files from a specified path on the device to the C&C server.
70 Download files from the C&C server (any file type).
71 Upload all SMS messages.
72 Upload all Contacts.
73 Upload all Call Logs.
74 Take a photo from the camera.
77 Start Audio Recording task (immediately or at a specified time).
78 Start “Network” location updates and send cell location info immediately.
79 Start “GPS” location updates.
82 Install APK (silently or via UI).
83 Uninstall the application (silently or via UI).
84 Execute “chmod -R 777” to a specific path via su.
85 Launch a specific application on the device.
86 Send Broadcast with a specific action to trigger other applications.
87 Run shell command.
88 Change the minimal time interval between a location updates.
89 Disable location tracking.
91 Check if a screen is on.
92 Send SMS to a specific number.
93 Delete specific SMS.
94 Perform call to a specific number.
96 Delete a specific call log.
97 Update the C&C list.
98 Take a screenshot.

As we can see from this list, the malware contains stealer functionality to upload all kinds of sensitive data from the device (device info, SMS, calls, location, etc.), but also provides RAT functionality by executing active commands on the device such as remote shell, file downloading, taking photos, performing calls, manipulating the SMS and call logs, etc. In the next sections, we analyze the most important functions.

SMS and Call Logs manipulation

The malware has commands to upload all the SMS and call logs to the attackers’ server. In addition, it provides the functionality to send text messages or perform a call to a specific number. This allows the actors to conduct further malicious activity against additional targets by impersonating the current victim, using his name, phone number and credibility. This drastically increases the chances of success.

To hide these actions from the victim, the attackers may use commands to remove the last messages or call logs so that no traces of their interactions with third parties are left on the device.

Figure 11- Malware code responsible for running calls / sending SMS from a victim’s device and functions to cover the evidence of these actions.

Location tracking

The malware can collect the victim’s device location and track its changes over time. When it is launched, the malware registers a location listener, which means Android will trigger this listener every time the location is changed.

The malware collects latitude, longitude, altitude, speed, bearing, accuracy, and the provider (GPS or network) that supplied these results. It also tries to convert the current location from latitude and longitude coordinates to a physical address using the Geocoder class. The number of details and the precision of this reverse geocoding process may vary. For example, one set of coordinates can be translated to the full street address of the closest building, while another might contain only a city name and a postal code.

The geolocation data is immediately sent by the spyware to the remote server. Additionally, the malicious application also writes this data with a timestamp to the file called map.dat, thereby continuously collecting and saving the victim’s geolocation. Even if the internet connection on a victim’s devices or to the C&C server is unavailable, the file with all the geolocation information is continuously updated and is uploaded to the attacker-controlled server when the connection is restored.

Figure 12 – Location updates processing and reverse geocoding.

The attackers can also configure the Location listener parameters remotely:

  • Change the minimal interval between the location updates – This allows the actors to decrease the number of updates but can still track the victim.
  • Change the provider for location tracking between GPS (based on satellite usage) or network (based on the availability of cell towers and WiFi access points).

Before the malware developers started to utilize the standard Android LocationListener, the malware used a third-party SDK called AMAP to track the victim’s location. The overall idea is similar: when the malicious app receives a command from the attackers’ server to start tracking the device’s location, it subscribes to location updates from the AMAP SDK. This way, at every location change, the malware writes the current location with a timestamp to the map.dat file and stores it as a variable.

Figure 13 – Device location tracking in the versions that use the AMAP SDK

As a result, the attackers can send commands from the remote server to read the current location or to request a full tracking file.

To summarize, in the most recent versions, the malware developers added the ability to track their target’s location in real-time. The malware sends location updates on its own, compared to previous versions where the server needed to send additional commands to get the location information.

Call recording and file upload

To record both incoming and outgoing calls from the infected device’s microphone, the malware uses a BroadcastReceiver called CallRecorder. It monitors the phone state and saves the call records locally to the db file, so that it can be uploaded later to the attacker-controlled remote server by issuing the “upload file” command.

Figure 14 – The malware code responsible for recording the incoming and outgoing calls.

Surround recording

Besides recording incoming and outgoing calls, the attackers can start surround recording remotely by issuing a relevant command from the C&C server.

When the command is received, the malware gets as an argument the desired duration and the specified delay before the recording starts. If there is no delay specified, it launches a thread that immediately starts to record. Otherwise, it creates a PendingIntent for the BroadcastReceiver that is registered in AlarmManager – and as a result, triggers a recording in the specified time.

Figure 15 – Starting audio recordings.

After the AudioRecording thread performs the recording with the specified duration, it saves it to the db file with the timestamp:

Figure 16 – Surround recording implementation.

As the recorded files may be quite large, we would expect to see some restrictions in the code on how the resulting files are exfiltrated (for example, upload the files only via Wi-Fi networks), but there are no such limitations in the code. However, there is no automatic upload for the recorded calls. The attackers decide when to exfiltrate the files, so they could send a command to get device information (which contains the current network connection type) and then exfiltrate the files from the device when convenient.

Because the attackers have updated information about the victim’s location, they can choose the opportune moment to record offline private conversations, which affects not only the victim’s privacy but also that of unsuspecting third parties.

Remote shell

The malware can receive commands to execute a remote shell, which is done by starting a thread that, in turn, starts a shell process and establishes a socket connection to the same C&C server, but over a different port. The shell’s output is redirected to the socket output stream from which the malware reads the commands, then decrypts and executes them:

Figure 17 – Remote shell execution.

Drop additional APK

When it receives a command to install an APK, the malware starts a thread that checks if it has enough privileges to install the application silently. If the check fails, the malware launches a regular UI installation via intent:

Figure 18 – Silent apk installation via PackageManager.

Uninstalling an application performs exactly the same logic.

 

Attribution

The first report that summarized the activity of Scarlet Mimic and various elements of this threat was published in 2016. It reviewed a series of persistent attacks that targeted Uyghur and Tibetan minority rights activists as well as those who support their cause.

The group’s arsenal at that point included multiple Trojans and tools for Windows and macOS. In 2015, the actors started to expand their espionage efforts from PCs to mobile devices using the spyware called MobileOrder, which focused on compromising Android devices. Based on the code similarity, shared infrastructure and victimology, we conclude that the new wave of attacks belongs to the same threat actor and that the group continues to deploy and develop MobileOrder malware until this day. In addition to clear code overlaps, we observed multiple overlaps in the infrastructure between the new samples and the old MobileOrder malware variant, as well as multiple variants of Windows Psylo Trojan previously attributed to Scarlet Mimic, that interact with the same malicious domains as the mobile malware.

In late 2017, Lookout research published their report on another cluster of malicious activity, which relied on JadeRAT Android malware to target the Uyghur community. This campaign “had some overlap [with ScarletMimic] around the apps they trojanized, the likely groups they targeted, their capabilities, and to some extent their implementation.”

Together with the evidence of the ongoing campaign using Android spyware provided in this report, this emphasizes the heavy shift of activity targeting these minority groups towards mobile surveillance in the last few years.

Code overlaps

The MobileOrder from the 2015 report also started by registering itself as a device admin with admin privileges to secure its persistence and to lay a proper foundation for the rest of the malware’s functionalities:

Figure 19 – MobileOrder sample from 2015 (md5: a886cbf8f8840b21eb2f662b64deb730) requesting device admin privileges vs the sample from April 2020 performing the same request (right)

The 2015 version of MobileOrder masqueraded as a PDF document, with an embedded PDF called rd.pdf in the application’s resources. This is similar to all the new samples in the ongoing campaign where the decoy content is PDF files. The bait PDF extracted from the malware resources is written to the device’s SD card and displayed to the victim while executing the malicious actions in the background:

Figure 20 – APK structure and the decoy PDF file location in 2015 sample of MobileOrder and August 2022 sample (right).

The main communication thread, which is responsible for communicating with a C&C server via socket and processing received commands, also did not change much over time, although many of the commands themselves changed the command id, and a few more functionalities were added.

Figure 21 – Command processing in MobileOrder from 2015 vs commands processing in newer samples (deobfuscated code).

 

Victimology and lures

Most of the malicious applications we observed have names in the Uyghur language, in its Arabic or Latin scripts. They contain different decoys (documents, pictures, or audio samples) with content related to the ethnic geopolitical conflict centered on Uyghurs in China’s far-northwest region of Xinjiang, or with the religious content referencing the Uyghurs’ Muslim identification. We can therefore conclude that this campaign is likely intended to target the Uyghur minority or organizations and individuals supporting them, which is consistent with the Scarlet Mimic group’s previously reported activity.

A few interesting examples of decoys used by the actor over the years include:

  • The sample with the original name “photo” (md5:a4f09ccb185d73df1dec4a0b16bf6e2c) contains the picture of Elqut Alim, the “New Chief Media Officer” of the Norwegian Youth Union who call themselves “a group of Uyghur youth who live in Norway with a common understanding and a common goal, which is to stand up against China’s invasion of East Turkestan.” The malware was uploaded to VT with the name in Uyghur Latin and a fake “.jpg” extension.

Figure 22 – Decoy image from the sample a4f09ccb185d73df1dec4a0b16bf6e2c.

  • The application named پارتىزانلىق ئۇرۇشى” which translates from Uyghur to “Guerrilla Warfare” (md5: b5fb0fb9488e1b8aa032d7788282005f) contains the PDF version of the short version of the military course by Yusuf al-Ayeri, the now deceased first leader of Al-Qaeda in Saudi Arabia, which outlines the tactical methods of guerrilla warfare.

Figure 23 – The lure PDF containing the materials by the military wing of Al-Qaeda.

  • Another sample called “rasimim” (“pictures” in Uyghur, sample md5:06c8c089157ff059e78bca5aeb430810) contains multiple pictures referring to the escalated tensions in Xinjiang Uygur Autonomous Region in May 2014, including the deployment of special police forces next to the Urumqi Railway Station and the medical evacuation after a terrorist attack in a street market.

Figure 24 – The lure pictures of escalations in Urumqi, the capital of Xinjiang.

  • The sample called “The China Freedom Trap” (md5: a38e8d70855412b7ece6de603b35ad63) masquerades as a partial PDF of the book with the same name written by Dolkun Isa, politician and activist from the region of Xinjiang and the current president of the World Uyghur Congress:

Figure 25 – The cover of the lure PDF.

  • The sample called “quran kerim” which translates as “Noble Quran” (md5: f10c5efe7eea3c5b7ebb7f3bf7624073) uses as a decoy an mp3 file of a recorded speech in what seems to be a Turkic language.

Some of the other lures include the pictures of unidentified individuals, and as reverse search engines fail to trace their origin, we can assume that these pictures are borrowed from the private profiles of these individuals in some social networks or were stolen from their mobile devices as a result of the spyware deployment.

It’s interesting that one of the samples, called “القائمة” (“The list” in Arabic) with the package name com.sy.go.immx (md5:7bf2ca0e7242cabcee8d3bb37ac52fc7) doesn’t follow the pattern of referencing Uyghurs. The name and the lure of this application is in Arabic, and the lure document contains a picture of a list of persons wanted by Shabwah Governorate in Yemen for threatening the security and stability of the province. This may indicate the additional targeting of individuals or organizations located in a different geographical zone and involved in another conflict.

 

Over the years, Scarlet Mimic strongly continues its espionage operations against the Uyghur community using Android malware. The persistence of the campaign, the evolution of the malware and the persistent focus on targeting specific populations indicate that the group’s operations over the years are successful to some extent. This threat group’s shift in their attack vector into the mobile sector provides another evidence of a growing tendency of extensive surveillance operations executed on mobile devices as the most sensitive and private assets.

 

Check Point’s Harmony Mobile helps securing mobile devices across all attack vectors: apps, network and OS and protects against Android malware such as the one used on this campaign.
Harmony Mobile leverages Check Point’s ThreatCloud and award-winning file protection capabilities to block the download of malicious files to mobile devices and prevent file-based cyber-attacks, such as the one’s described on this blog.

 

IOCs

SHA256 Package Name
fd99acc504649e8e42687481abbceb71c730f0ab032357d4dc1e95a6ef8bb7ca
com.emc.pdf
89f350332be1172fc2d64ac8ecd7fd15a09a2bd6e0ab6a7898a48fb3e5c9eac3
pw.nrt.photo.google
84ce04fd8d1c15046e7d50cd429876f0f5fbca526d7a0a081b6b9a49fe66131f
com.sy.go.immx
f876b2a60d4cf7f88925f435f29f89c0393f57a59ec46d490c7e87821f29fc0f
com.pdf.google.vv
c2cd40f1c21719d4611ff645c7f960d0070c19e8ad12cc55aded7b5a341c89a3
com.pdf.google.vm
2e94183fcbc3381071d023a030640aaef64739006b6c22603b94b970cebeeec2
com.pdf.google.vm
73729646a7768a5bd4c301842c19b3b16bb190e435af466a731ad36544982098
com.pdf.google.vm
13e457ce16c0fe24ad0f4fe41a6ad251ebffb2fdaaebe7df094d7852ba0cfdc6
com.photo.android.p
155d0707858cbb18ed5ecb4d98009288e4c5a1e68275d9db5b2390f204636431
com.update.google
0703185a3e206b8da96a86f4bbcb750b48bbec8b2fc2598eed8603e4027cf4ae
com.photo.android.p
be0ae4394b8592cd1325b86669fa78f9ccd320d23f839e81001138be914a760f
com.photo.android.p
990e50ce20706be80b4d62367ff6ed615d6dd04551b42cfd80b1a8950065b646
com.photo.android.p
633739c3b51715516fb226b3b9c693530d8ef715ac19093cdf6aaf108149b91f
com.view.openpdf
e959dc221a8667cde8b9ff080d078e60ed1e8bf5a3c6f1f352919c9b8f696830
com.view.openpdf
e3ee0ccfb01e2effd49feddb252781baa2a05f8360d5cf949d09e3add1e73e4d
com.photo.android.p
126e41c231c1b5a25584e27d47132d0d243da155e6a70517d08dbf611201fdca
com.photo.android
ed3aa8e58d65c81df2f18e970456225b7c2b78e4add4dea556298a915b8fef1a
com.photo.android
35adf82e2ace8fe0ddfd50b21dad274df40696f5dfcdf7372fe63eed8bbed869
com.photo.android
03004ccc23033a09532bea7dfa08c8dfa85814a15f5e3aedb924a028bcd6f908
com.view.openpdf
afcbf339d1c0a6174f93425cd1b8ba50979132856f0c333865a62d7c6e8a3084
com.photo.android
91c34071622b678b2f64a8b896c7898cceff658764eb0ae5e100b3d4d868a664
com.photo.android
549ea085fbb23729ee000721938d95ea38ff2e70a63af1d4aa8db6b7b3458f6f
com.photo.android
ba08ee68d9218e0aaa3384bcb2ab281fd8273fe40aee65c300adbf85120cbc7b
com.lppads.android
Indicator Type
adfgasfasfasf123[.]com
C&C
blackbeekey[.com
C&C
fly100.dellgod[.]net
C&C
islam.ansardawlatalislam[.]com
C&C
k7k7[.]co
C&C
mobile.muslimbro[.]org
C&C
ziba.lenovositegroup[.]com
C&C
209.97.173[.]124
C&C
45.32.112[.]182
C&C
https://blog.sina.com[.]cn/u/5241106671
Dead drop resolver
https://blog.sina.com[.]cn/u/5955775229
Dead drop resolver
https://blog.sina[.]cn/dpool/blog/s78u
Dead drop resolver
https://blog.sina.com[.]cn/u/5926910809
Dead drop resolver

The post 7 Years of Scarlet Mimic’s Mobile Surveillance Campaign Targeting Uyghurs appeared first on Check Point Research.

Native function and Assembly Code Invocation

Author: Jiri Vinopal

Introduction

For a reverse engineer, the ability to directly call a function from the analyzed binary can be a shortcut that bypasses a lot of grief. While in some cases it is just possible to understand the function logic and reimplement it in a higher-level language, this is not always feasible, and it becomes less feasible the more the logic of the original function is fragile and sophisticated. This is an especially sore issue when dealing with custom hashing and encryption — a single off-by-one error somewhere in the computation will cause complete divergence of the final output, and is a mighty chore to debug.

In this article, we walk through 3 different ways to make this “shortcut” happen, and invoke functions directly from assembly. We first cover the IDA Appcall feature which is natively supported by IDA Pro, and can be used directly using IDAPython. We then demonstrate how to achieve the same feat using Dumpulator; and finally, we will show how to get that result using emulation with Unicorn Engine. The practical example used in this article is based on the “tweaked” SHA1 hashing algorithm implemented by a sample of the MiniDuke malware.

Modified SHA1 Hashing algorithm implemented by MiniDuke

The modified SHA1 algorithm in the MiniDuke sample is used to create a per-system encryption key for the malware configuration. The buffer to be hashed contains the current computer name concatenated with DWORDs of all interface descriptions, e.g. 'DESKTOP-ROAC4IJ\x00MicrWAN WAN MicrWAN MicrWAN InteWAN InteWAN Inte'. This function (SHA1Hash) uses the same constants as the original SHA1 for both the initial digest and intermediate stages, but produces different outputs.

Figure 1: MiniDuke SHA1Hash function constants

Since the constants used are all the same in the original and modified SHA1, the difference must occur somewhere in one of the function’s 1,241 assembly instructions. We cannot say whether this tweak was introduced intentionally but the fact remains that malware authors are growing fonder of inserting “surprises” like this, and it falls to analysts to deal with them. To do so, we must first understand in what form the function expects its input and produces its output.

As it turns out, the Duke-SHA1 assembly uses a custom calling convention where the length of buffer to be hashed is passed in the ecx register and the address of the buffer itself in edi. A value is technically also passed in eax but this value is identically 0xffffffff whenever the executable invokes the function, and we can treat it as a constant for our purposes. Interestingly, the malware also sets the buffer length (ecx) to 0x40 every time it invokes this function, effectively hashing only the first 0x40 bytes of the buffer.

Figure 2: SHA1Hash function arguments

The resulting 160-bit SHA1 hash value is returned in 5 dwords in registers (from high dword to low: eax , edx , ebx , ecx , esi). For example, the buffer DESKTOP-ROAC4IJ\x00MicrWAN WAN MicrWAN MicrWAN InteWAN InteWAN Inte has a Duke-SHA1 value of 1851fff77f0957d1d690a32f31df2c32a1a84af7, returned as EAX:0x1851fff7 EDX:0x7f0957d1 EBX:0xd690a32f ECX:0x31df2c32 ESI:0xa1a84af7.

Figure 3: Example produced SHA1 Hash of buffer

 

As explained before, hunting down the exact place(s) where the logic of SHA1 and Duke-SHA1 diverge and then reimplementing Duke-SHA1 in Python is an excellent way to waste a day, and possibly a week. Instead, we will use several approaches to “plug into” the function’s calling convention and invoke it directly.

IDA – Appcall

Appcall is a feature of IDA Pro which allows IDA Python scripts to call functions inside the debugged program as if they were built-in functions. This is very convenient, but it also suffers from the typical curse of convenient solutions, which is a very sharp spike in difficulty of application when the use case gets somewhat unusual or complex. Alas, such is the case here: while passing a buffer length in ecx and a buffer in edi is par for the course, the 160-bit return value split across 5 registers is not your typical form of function output, and Appcall requires some creative coercion to cooperate with what we want it to do here.

We proceed by creating a custom structure struc_SHA1HASH which holds the values of 5 registers, and is used as a return type of the function prototype:

STRUCT_NAME = "struc_SHA1HASH"
# ------------------Struct Creation ------------------
sid = idc.get_struc_id(STRUCT_NAME)
if (sid != -1):
    idc.del_struc(sid)
sid = idc.add_struc(-1, STRUCT_NAME, 0)
idc.add_struc_member(sid, "_EAX_", -1, idc.FF_DWORD, -1, 4)
idc.add_struc_member(sid, "_EDX_", -1, idc.FF_DWORD, -1, 4)
idc.add_struc_member(sid, "_EBX_", -1, idc.FF_DWORD, -1, 4)
idc.add_struc_member(sid, "_ECX_", -1, idc.FF_DWORD, -1, 4)
idc.add_struc_member(sid, "_ESI_", -1, idc.FF_DWORD, -1, 4)
Figure 4: IDA Structure Window – “struc_SHA1HASH”

Now with the structure definition in place, we are poised to invoke the magic incantation that will allow Appcall to interface with this function prototype, as seen in the PROTO value below.

# ------------------Initialization ------------------
FUNC_NAME = "SHA1Hash"
STRUCT_NAME = "struc_SHA1HASH"
PROTO = "{:s} __usercall {:s}@<0:eax, 4:edx, 8:ebx, 12:ecx, 16:esi>(int buffLen@, const int@, BYTE *buffer@);".format(STRUCT_NAME, FUNC_NAME) # specify prototype of SHA1Hash function
SHA1BUFF_LEN = 0x40
CONSTVAL = 0xffffffff

As IDA Appcall relies on the debugger, to invoke this logic we first need to write a script that will start the debugger, make required adjustments to the stack and do other required housekeeping.

# ------------------ Setting + Starting Debugger ------------------
idc.load_debugger("win32",0)                 # select Local Windows Debugger
idc.set_debugger_options(idc.DOPT_ENTRY_BPT) # break on program entry point
idc.start_process("","","")                  # start process with default options
idc.wait_for_next_event(idc.WFNE_SUSP, 3)    # waits until process get suspended on entrypoint
eip = idc.get_reg_value("eip")               # get EIP
idc.run_to(eip + 0x1d)                       # let the stack adjust itself (execute few instructions)
idc.wait_for_next_event(idc.WFNE_SUSP, 3)    # waits until process get suspended after stack adjustment
Figure 5: IDA View – Stack adjusting

Using Appcall is the last step, and there are several ways to utilize it to call functions. We can call the function directly without specifying a prototype, but this highly relies on a properly typed function in IDA’s IDB. The second way is to create a callable object from the function name and a defined prototype. This way we can call a function with a specific prototype, no matter what type is set in the IDB, as shown below:

SHA1Hash = Appcall.proto(FUNC_NAME, PROTO)   # creating callable object
inBuff = Appcall.byref(b'DESKTOP-ROAC4IJ\x00MicrWAN WAN MicrWAN MicrWAN InteWAN InteWAN Inte')
buffLen = SHA1BUFF_LEN
const = CONSTVAL

retValue = SHA1Hash(buffLen, const, inBuff)
eax = malduck.DWORD(retValue._EAX_)
edx = malduck.DWORD(retValue._EDX_)
ebx = malduck.DWORD(retValue._EBX_)
ecx = malduck.DWORD(retValue._ECX_)
esi = malduck.DWORD(retValue._ESI_)

The full script to call Duke-SHA1 using Appcall is reproduced below.

# IDAPython script to demonstrate Appcall feature on modified SHA1 Hashing algorithm implemented by MiniDuke malware sample
# SHA1 HASH is stored in EAX, EDX, EBX, ECX, ESI (return values)
# SHA1 HASH Arguments -> ECX = 0x40 (buffLen), EAX = 0xFFFFFFFF (const), EDI =  BYTE *buffer (buffer)

import idc, malduck
from idaapi import Appcall

# ------------------Initialization ------------------
FUNC_NAME = "SHA1Hash"
STRUCT_NAME = "struc_SHA1HASH"
PROTO = "{:s} __usercall {:s}@<0:eax, 4:edx, 8:ebx, 12:ecx, 16:esi>(int buffLen@, const int@, BYTE *buffer@);".format(STRUCT_NAME, FUNC_NAME) # specify prototype of SHA1Hash function
SHA1BUFF_LEN = 0x40
CONSTVAL = 0xffffffff

# ------------------Struct Creation ------------------
sid = idc.get_struc_id(STRUCT_NAME)
if (sid != -1):
    idc.del_struc(sid)
sid = idc.add_struc(-1, STRUCT_NAME, 0)
idc.add_struc_member(sid, "_EAX_", -1, idc.FF_DWORD, -1, 4)
idc.add_struc_member(sid, "_EDX_", -1, idc.FF_DWORD, -1, 4)
idc.add_struc_member(sid, "_EBX_", -1, idc.FF_DWORD, -1, 4)
idc.add_struc_member(sid, "_ECX_", -1, idc.FF_DWORD, -1, 4)
idc.add_struc_member(sid, "_ESI_", -1, idc.FF_DWORD, -1, 4)

# ------------------ Setting + Starting Debugger ------------------
idc.load_debugger("win32",0)                 # select Local Windows Debugger
idc.set_debugger_options(idc.DOPT_ENTRY_BPT) # break on program entry point
idc.start_process("","","")                  # start process with default options
idc.wait_for_next_event(idc.WFNE_SUSP, 3)    # waits until process get suspended on entrypoint
eip = idc.get_reg_value("eip")               # get EIP
idc.run_to(eip + 0x1d)                       # let the stack adjust itself (execute few instructions)
idc.wait_for_next_event(idc.WFNE_SUSP, 3)    # waits until process get suspended after stack adjustment

# ------------------ Arguments + Execution ------------------
SHA1Hash = Appcall.proto(FUNC_NAME, PROTO)   # creating callable object
inBuff = Appcall.byref(b'DESKTOP-ROAC4IJ\x00MicrWAN WAN MicrWAN MicrWAN InteWAN InteWAN Inte')
buffLen = SHA1BUFF_LEN
const = CONSTVAL

retValue = SHA1Hash(buffLen, const, inBuff)
eax = malduck.DWORD(retValue._EAX_)
edx = malduck.DWORD(retValue._EDX_)
ebx = malduck.DWORD(retValue._EBX_)
ecx = malduck.DWORD(retValue._ECX_)
esi = malduck.DWORD(retValue._ESI_)

# ------------------ RESULTS ------------------
print("SHA1 HASH RET VALUES: EAX:0x%x EDX:0x%x EBX:0x%x ECX:0x%x ESI:0x%x" % (eax, edx, ebx, ecx, esi))

# ------------------ Exiting Debugger ------------------
idc.exit_process()

And some sample output:

Figure 6: Script execution – “IDA Appcall” producing the same SHA1 Hash values as the MiniDuke sample

The above is fine if we just want to use the invoked function as a black box, but sometimes we may want access to registry values in a specific state of execution, and specifying the prototype as above is something of a chore. Happily, both these downsides can be mitigated, as we will see below.

As IDA Appcall relies on the debugger and can be invoked right from IDAPython, we can invoke Appcall from the debugger and gain more granular control over its execution. For example, we can make Appcall hand control back to the debugger during execution by setting a special option for Appcall – APPCALL_MANUAL.

# ------------------ Arguments + Execution ------------------
SHA1Hash = Appcall.proto(FUNC_NAME, PROTO)   # creating callable object
SHA1Hash.options = Appcall.APPCALL_MANUAL    # APPCALL_MANUAL option will cause the debugger to break on function entry and gives the control to debugger
inBuff = Appcall.byref(b'DESKTOP-ROAC4IJ\x00MicrWAN WAN MicrWAN MicrWAN InteWAN InteWAN Inte')
buffLen = SHA1BUFF_LEN
const = CONSTVAL

SHA1Hash(buffLen, const, inBuff)             # invoking Appcall and breaking on function entry (SHA1Hash)

This way we can make use of Appcall to prepare arguments, allocate a buffer and later restore the previous execution context. We can also avoid specifying the structure type for the return value (type it as void) as this will be handled by the debugger. There are more ways to get the return values of the function, so as we are now controlling the debugger, we can use (for example) a conditional breakpoint to print desired values in a specific state of execution (such as on return).

# ------------------Set conditional BP on Return ------------------
def SetCondBPonRet():
    cond = """import idc
print("SHA1 HASH RET VALUES: EAX:0x%x EDX:0x%x EBX:0x%x ECX:0x%x ESI:0x%x" % (idc.get_reg_value("eax"), idc.get_reg_value("edx"), idc.get_reg_value("ebx"), idc.get_reg_value("ecx"), idc.get_reg_value("esi")))
return True
"""
    func = idaapi.get_func(idc.get_name_ea_simple(FUNC_NAME))
    bpt = idaapi.bpt_t()
    bpt.ea = idc.prev_head(func.end_ea)      # last instruction in function -> should be return
    bpt.enabled = True
    bpt.type = idc.BPT_SOFT
    bpt.elang = 'Python'
    bpt.condition = cond                     # with script code in condition we can get or log any values we want
    idc.add_bpt(bpt)
    return bpt                               # return breakpoint object -> will be deleted later on

We can restore the previous state (before Appcall invocation) at any desired moment of execution by calling cleanup_appcall(). So in our case, right after hitting the conditional breakpoint.

SHA1Hash(buffLen, const, inBuff)             # invoking Appcall and breaking on function entry (SHA1Hash)
idc.wait_for_next_event(idc.WFNE_SUSP, 3)
idaapi.continue_process()                    # debugger has control now so continue to hit the new conditional breakpoint
idc.wait_for_next_event(idc.WFNE_SUSP, 3)
idc.del_bpt(bpt.ea)                          # deleting the previously created conditional breakpoint
Appcall.cleanup_appcall()                    # clean Appcall after hitting the conditional breakpoint -> return

The full script is reproduced below.

# IDAPython script to demonstrate Appcall feature on modified SHA1 Hashing algorithm implemented by MiniDuke malware sample
# SHA1 HASH is stored in EAX, EDX, EBX, ECX, ESI (return values)
# SHA1 HASH Arguments -> ECX = 0x40 (buffLen), EAX = 0xFFFFFFFF (const), EDI =  BYTE *buffer (buffer)

import idc, idaapi
from idaapi import Appcall

# ------------------ Initialization ------------------
FUNC_NAME = "SHA1Hash"
PROTO = "void __usercall {:s}(int buffLen@, const int@, BYTE *buffer@);".format(FUNC_NAME) # specify prototype of SHA1Hash function
SHA1BUFF_LEN = 0x40
CONSTVAL = 0xffffffff

# ------------------Set conditional BP on Return ------------------
def SetCondBPonRet():
    cond = """import idc
print("SHA1 HASH RET VALUES: EAX:0x%x EDX:0x%x EBX:0x%x ECX:0x%x ESI:0x%x" % (idc.get_reg_value("eax"), idc.get_reg_value("edx"), idc.get_reg_value("ebx"), idc.get_reg_value("ecx"), idc.get_reg_value("esi")))
return True
"""
    func = idaapi.get_func(idc.get_name_ea_simple(FUNC_NAME))
    bpt = idaapi.bpt_t()
    bpt.ea = idc.prev_head(func.end_ea)      # last instruction in function -> should be return
    bpt.enabled = True
    bpt.type = idc.BPT_SOFT
    bpt.elang = 'Python'
    bpt.condition = cond                     # with script code in condition we can get or log any values we want
    idc.add_bpt(bpt)
    return bpt                               # return breakpoint object -> will be deleted later on

# ------------------ Setting + Starting Debugger ------------------
idc.load_debugger("win32",0)                 # select Local Windows Debugger
idc.set_debugger_options(idc.DOPT_ENTRY_BPT) # break on program entry point
bpt = SetCondBPonRet()                       # setting the conditional breakpoint on function return
idc.start_process("","","")                  # start process with default options
idc.wait_for_next_event(idc.WFNE_SUSP, 3)    # waits until process get suspended on entrypoint
eip = idc.get_reg_value("eip")               # get EIP
idc.run_to(eip + 0x1d)                       # let the stack adjust itself (execute few instructions)
idc.wait_for_next_event(idc.WFNE_SUSP, 3)    # waits until process get suspended after stack adjustment

# ------------------ Arguments + Execution ------------------
SHA1Hash = Appcall.proto(FUNC_NAME, PROTO)   # creating callable object
SHA1Hash.options = Appcall.APPCALL_MANUAL    # APPCALL_MANUAL option will cause the debugger to break on function entry and gives the control to debugger
inBuff = Appcall.byref(b'DESKTOP-ROAC4IJ\x00MicrWAN WAN MicrWAN MicrWAN InteWAN InteWAN Inte')
buffLen = SHA1BUFF_LEN
const = CONSTVAL

SHA1Hash(buffLen, const, inBuff)             # invoking Appcall and breaking on function entry (SHA1Hash)
idc.wait_for_next_event(idc.WFNE_SUSP, 3)
idaapi.continue_process()                    # debugger has control now so continue to hit the new conditional breakpoint
idc.wait_for_next_event(idc.WFNE_SUSP, 3)
idc.del_bpt(bpt.ea)                          # deleting the previously created conditional breakpoint
Appcall.cleanup_appcall()                    # clean Appcall after hitting the conditional breakpoint -> return

# ------------------ Exiting Debugger ------------------
idc.exit_process()

Dumpulator

Dumpulator is a python library that assists with code emulation in minidump files. The core emulation engine of dumpulator is based on Unicorn Engine, but a relatively unique feature among other similar tools is that the entire process memory is available. This brings a performance improvement (emulating large parts of analyzed binary without leaving Unicorn), as well as making life more convenient if we can time the memory dump to when the program context (stack, etc) required to call the function is already in place. Additionally, only syscalls have to be emulated to provide a realistic Windows environment (since everything actually is a legitimate process environment).

A minidump of the desired process could be captured with many tools (x64dbg – MiniDumpPlugin, Process Explorer, Process Hacker, Task Manager) or with the Windows API (MiniDumpWriteDump). We can use the x64dbg – MiniDumpPlugin to create a minidump in a state where almost all in the process is already set for SHA1 Hash creation, right before the SHA1Hash function call. Note that timing the dump this way is not necessary, as the environment can be set up manually in dumpulator after taking the dump; it is just convenient.

Figure 7: Creation of minidump using “x64dbg – MiniDumpPlugin”

Dumpulator not only has access to the entire dumped process memory but can also allocate additional memory, read memory, write to memory, read registry values, and write registry values. In other words, anything that an emulator can do. There is also a possibility to implement system calls so code using them can be emulated.

To invoke Duke-SHA1 via Dumpulator, we need to specify the address of the function which will be called in minidump and its arguments. In this case, the address of SHA1Hash is 0x407108 .

Figure 8: Opening produced minidump in IDA

As we do not want to use already set values in the current state of minidump, we define our own argument values for the function. We can even allocate a new buffer which will be used as a buffer to be hashed. The decidedly elegant code to do this is shown below.

# Python script to demonstrate dumpulator on modified SHA1 Hashing algorithm implemented by MiniDuke malware sample
# SHA1 HASH is stored in EAX, EDX, EBX, ECX, ESI (return values)
# SHA1 HASH Arguments -> ECX = 0x40 (buffLen), EAX = 0xFFFFFFFF (const), EDI =  BYTE *buffer (buffer)

from dumpulator import Dumpulator

# ------------------Initialization ------------------
FUNC_ADDR = 0x407108            # address of SHA1Hash function in MiniDuke
SHA1BUFF_LEN = 0x40
CONSTVAL = 0xffffffff

# ------------------ Setting + Starting Dumpulator ------------------
dp = Dumpulator("miniduke.dmp", quiet=True)
inBuff = b'DESKTOP-ROAC4IJ\x00MicrWAN WAN MicrWAN MicrWAN InteWAN InteWAN Inte'
bufferAddr = dp.allocate(64)
dp.write(bufferAddr, inBuff)
#dp.regs.ecx = SHA1BUFF_LEN     # possible to set the registers here
#dp.regs.eax = CONSTVAL
#dp.regs.edi = bufferAddr
#dp.call(FUNC_ADDR)
dp.call(FUNC_ADDR, regs= {"eax": CONSTVAL, "ecx": SHA1BUFF_LEN, "edi": bufferAddr})

# ------------------ RESULTS ------------------
print("SHA1 HASH RET VALUES: EAX:0x%x EDX:0x%x EBX:0x%x ECX:0x%x ESI:0x%x" % (dp.regs.eax, dp.regs.edx, dp.regs.ebx, dp.regs.ecx, dp.regs.esi))

Execution of this script will produce correct Duke-SHA1 values.

Figure 9: Script execution – “Dumpulator” producing the same SHA1 Hash values as the MiniDuke sample

Emulation – Unicorn Engine

For the emulation approach, we can use any kind of CPU emulator (ex. Qiling, Speakeasy, etc.) which is able to emulate x86 assembly and has bindings for Python language. As we do not need any higher abstraction level (Syscalls, API functions) we can use the one which most of the others are based on – Unicorn Engine.

Unicorn is a lightweight, multi-platform, multi-architecture CPU emulator framework, based on QEMU, which is implemented in pure C language with bindings for many other languages. We will be using Python bindings. Our goal is to create an independent function SHA1Hash which can be called like any other ordinary function in Python, producing the same SHA1 hashes as the original one in MiniDuke. The idea behind the implementation we use is pretty straightforward — we simply extract the opcode bytes of the function and use them via the CPU emulation.

Extracting all bytes of original function opcodes can be done simply via IDAPython or using IDA→Edit→Export Data.

# IDAPython - extracting opcode bytes of SHA1Hash function
import idaapi, idc

SHA1HashAddr = idc.get_name_ea_simple("SHA1Hash")
SHA1Hash = idaapi.get_func(SHA1HashAddr)
SHA1HASH_OPCODE = idaapi.get_bytes(SHA1Hash.start_ea, SHA1Hash.size())
SHA1HASH_OPCODE.hex()
# Output: '0f6ec589cb8dad74a3[...]'
Figure 10: Using IDA “Export data” dialog to export opcode bytes of SHA1Hash function

As in the previous approaches, we need to set up the context for execution. In this case this means preparing arguments for the function, and setting addresses for our extracted opcodes and input buffer.

# ------------------Initialization ------------------
# remove "retn" instruction from SHA1Hash function opcodes or -> UC_ERR_FETCH_UNMAPPED -> no ret address on stack
SHA1HASH_OPCODE = b"\x0f\x6e\xc5\x89\xcb\x8d\xad\x74\xa3..........................."
OPCODE_ADDRESS = 0x400000
SHA1BUFF_LEN = 0x40
CONSTVAL = 0xffffffff
BUFFERADDR = OPCODE_ADDRESS + 0x200000

Note that the last retn instruction should be deleted from the extracted opcode listing in order to not transfer back execution to the return address on the stack, and the stack frame should be manually set up by specifying values for ebp and esp. All these things are shown in the final Python script below.

# Python script to demonstrate Unicorn emulator on modified SHA1 Hashing algorithm implemented by MiniDuke malware sample
# SHA1 HASH is stored in EAX, EDX, EBX, ECX, ESI (return values)
# SHA1 HASH Arguments -> ECX = 0x40 (buffLen), EAX = 0xFFFFFFFF (const), EDI =  BYTE *buffer (buffer)

from unicorn import *
from unicorn.x86_const import *

def GetMinidukeSHA1(inBuff:bytes) -> Uc:
    # ------------------Initialization ------------------
    # remove "retn" instruction from SHA1Hash function opcodes or -> UC_ERR_FETCH_UNMAPPED -> no ret address on stack
    SHA1HASH_OPCODE = b"\x0f\x6e\xc5\x89\xcb\x8d\xad\x74\xa3..........................."
    OPCODE_ADDRESS = 0x400000
    SHA1BUFF_LEN = 0x40
    CONSTVAL = 0xffffffff
    BUFFERADDR = OPCODE_ADDRESS + 0x200000

    # ------------------ Setting + Starting Emulator ------------------
    try:
        mu = Uc(UC_ARCH_X86, UC_MODE_32)                                        # set EMU architecture and mode
        mu.mem_map(OPCODE_ADDRESS, 0x200000, UC_PROT_ALL)                       # map memory for SHA1Hash function opcodes, stack etc.
        mu.mem_write(OPCODE_ADDRESS, SHA1HASH_OPCODE)                           # write opcodes to memory
        mu.mem_map(BUFFERADDR, 0x1000, UC_PROT_ALL)                             # map memory for input to be hashed
        mu.mem_write(BUFFERADDR, inBuff)                                        # write input bytes to memory
       
        mu.reg_write(UC_X86_REG_ESP, OPCODE_ADDRESS + 0x100000)                 # initialize stack (ESP)
        mu.reg_write(UC_X86_REG_EBP, OPCODE_ADDRESS + 0x100000)                 # initialize frame pointer (EBP)
        mu.reg_write(UC_X86_REG_EAX, CONSTVAL)                                  # set EAX register (argument) -> CONSTVAL
        mu.reg_write(UC_X86_REG_ECX, SHA1BUFF_LEN)                              # set ECX register (argument) -> SHA1BUFF_LEN
        mu.reg_write(UC_X86_REG_EDI, BUFFERADDR)                                # set EDI register (argument) -> BUFFERADDR to be hashed
        mu.emu_start(OPCODE_ADDRESS, OPCODE_ADDRESS + len(SHA1HASH_OPCODE))     # start emulation of opcodes
        return mu

    except UcError as e:
        print("ERROR: %s" % e)

# ------------------ RESULTS ------------------
inBuff = b'DESKTOP-ROAC4IJ\x00MicrWAN WAN MicrWAN MicrWAN InteWAN InteWAN Inte'
mu = GetMinidukeSHA1(inBuff)
print("SHA1 HASH RET VALUES: EAX:0x%x EDX:0x%x EBX:0x%x ECX:0x%x ESI:0x%x" % (mu.reg_read(UC_X86_REG_EAX), mu.reg_read(UC_X86_REG_EDX), mu.reg_read(UC_X86_REG_EBX), mu.reg_read(UC_X86_REG_ECX), mu.reg_read(UC_X86_REG_ESI)))

The script output can be seen below.

Figure 11: Script execution – “Unicorn Engine” producing the same SHA1 Hash values as the MiniDuke sample

Conclusion

All the above-described methods for direct invocation of assembly have their advantages and disadvantages. We were particularly impressed by the easy-to-use Dumpulator which is free, fast to implement, and highly effective. It is well suited for writing universal string decryptors, config extractors, and other contexts where many different logic fragments have to be called in sequence while preserving a hard-to-set-up context.

The IDA Appcall feature is one of the best solutions in situations where we would like to enrich the IDA database directly with results produced by the invocation of a specific function. Syscalls could be a part of such a function as Appcall is usually used in real execution environments – using a debugger. One of the greatest things about Appcall is the fast and easy context restoration. As Appcall relies on a debugger and could be used together with IDAPython scripting, it could even in theory be used as a basis for a fuzzer, feeding random input to functions in order to discover unexpected behavior (i.e. bugs), though the performance overhead might make this approach not very practical.

Using pure emulation via Unicorn Engine is a universal solution for the independent implementation of specific functionality. With this approach, it is possible to take a part of the code as-is and use it with no connection to the original sample. This method does not rely on a runnable sample and there is no problem to re-implement functionality for just a part of the code. This approach may be harder to implement for functions that are not a contiguous, easily-dumpable block of code. For part of code where APIs or syscalls occur, or the execution context is much harder to set up, the previously mentioned methods are usually a preferable choice.

Pros and Cons Summary

IDA Appcall

PROS:

  • Natively supported by IDA
  • Possible to use with IDAPython right in the context of IDA.
  • Natively understands higher abstraction layer so Windows APIs and syscalls can be a part of the invoked function.
  • Can be used on corrupted code/file with Bochs emulator IDB emulate feature (non-runnable sample).
  • The combination of the Appcall feature and scriptable debugger is very powerful, giving us full control at any moment of Appcall execution.

CONS:

  • Prototypes of more sophisticated functions using custom calling conventions (__usercall) are harder to implement.
  • Invoked assembly needs to be a function, not just part of code.

Dumpulator

PROS:

  • Very easy-to-use. Code making use of it is Pythonic and fast to implement.
  • If a minidump is obtained in a state where all context is already set, with no need to map memory or set things like a stack, frame pointer, or even arguments, Dumpulator can leverage this to de-clutter the invocation code even further.
  • Understands the higher abstraction layer and allows use of syscalls (though some may need to be implemented manually).
  • Enables lower access level to modify the context in a similar way to usual emulation.
  • Can be used to emulate part of code (does not have to be a function)

CONS:

  • Requires a minidump of the desired process to be worked on, which in turn requires a runnable binary sample.

Emulation – Unicorn Engine

PROS:

  • The most independent solution, requires only the interesting assembly code.
  • Low access level to set and modify context.
  • Can be used to emulate part of code fully independently. Allows free modification and patching of instructions on the fly.

CONS:

  • Harder to map memory and set the context of the emulation engine correctly.
  • No out-of-the-box access to higher abstraction layer and system calls.

References

IDA – Appcall

https://hex-rays.com/blog/introducing-the-appcall-feature-in-ida-pro-5-6/

https://hex-rays.com/blog/practical-appcall-examples/

https://www.hex-rays.com/wp-content/uploads/2019/12/debugging_appcall.pdf

Dumpulator

https://github.com/mrexodia/dumpulator

https://github.com/mrexodia/MiniDumpPlugin

Unicorn Engine

https://github.com/unicorn-engine/unicorn

https://github.com/unicorn-engine/unicorn/tree/master/bindings/python

Samples + Scripts (password:infected)

  1. Original MiniDuke sample: VirusTotal, miniduke_original.7z
  2. Unpacked MiniDuke sample: miniduke_unpacked.7z
  3. MiniDuke minidump: miniduke_minidump.7z
  4. All scripts mentioned in the article: IDAPython_PythonScripts.7z

The post Native function and Assembly Code Invocation appeared first on Check Point Research.

Trail of Bits

Retrieved title: Trail of Bits Blog, 3 item(s)
It pays to be Circomspect

By Fredrik Dahlgren, Staff Security Engineer

In October 2019, a security researcher found a devastating vulnerability in Tornado.cash, a decentralized, non-custodial mixer on the Ethereum network. Tornado.cash uses zero-knowledge proofs (ZKPs) to allow its users to privately deposit and withdraw funds. The proofs are supposed to guarantee that each withdrawal can be matched against a corresponding deposit to the mixer. However, because of an issue in one of the ZKPs, anyone could forge a proof of deposit and withdraw funds from the system.

At the time, the Tornado.cash team saved its users’ funds by exploiting the vulnerability to drain the funds from the mixer before the issue was discovered by someone else. Then they patched the ZKPs and migrated all user funds to a new version of the contract. Considering the severity of the underlying vulnerability, it is almost ironic that the fix consisted of just two characters.

The fix: Simply replace = by <== and all is well (obviously!).

This bug would have been caught using Circomspect, a new static analyzer for ZKPs that we are open-sourcing today. Circomspect finds potential vulnerabilities and code smells in ZKPs developed using Circom, the language used for the ZKPs deployed by Tornado.cash. It can identify a wide range of issues that can occur in Circom programs. In particular, it would have found the vulnerability in Tornado.cash early in the development process, before the contract was deployed on-chain.

How Circom works

Tornado.cash was developed using Circom, a domain-specific language (DSL) and a compiler that can be used to generate and verify ZKPs. ZKPs are powerful cryptographic tools that allow you to make proofs about a statement without revealing any private information. For complex systems like a full computer program, the difficult part in using ZKPs becomes representing the statement in a format that the proof system can understand. Circom and other DSLs are used to describe a computation, together with a set of constraints on the program inputs and outputs (known as signals). The Circom compiler takes a program and generates a prover and a verifier. The prover can be used to run the computation described by the DSL on a set of public and private inputs to produce an output, together with a proof that the computation was run correctly. The verifier can then take the public inputs and the computed output and verify them against the proof generated by the prover. If the public inputs do not correspond to the provided output, this is detected by the verifier.

The following figure shows a small toy example of a Circom program allowing the user to prove that they know a private input x such that x5 - 2x4 + 5x - 4 = 0:

A toy Circom program where the private variable x is a solution to a polynomial equation

The line y <== x5 - 2 * x4 + 5 * x - 4 tells the compiler two things: that the prover should assign the value of the right-hand side to y during the proof generation phase (denoted y <-- x5 - 2 * x4 + 5 * x - 4 in Circom), and that the verifier should ensure that y is equal to the right-hand side during the proof verification phase (which is denoted y === x5 - 2 * x4 + 5 * x - 4 in Circom). This type of duality is often present in zero-knowledge DSLs like Circom. The prover performs a computation, and the verifier has to ensure that the computation is correct. Sometimes these two sides of the same coin can be described using the same code path, but sometimes (for example, due to restrictions on how constraints may be specified in R1CS-based systems like Circom) we need to use different code to describe computation and verification. If we forget to add instructions describing the verification steps corresponding to the computation performed by the prover, it may become possible to forge proofs.

The Tornado.cash vulnerability

In the case of Tornado.cash, it turned out that the MIMC hash function used to compute the Merkle tree root in the proof used only the assignment operator <-- when defining the output. (Actually, it uses =, as demonstrated in the GitHub diff above. However, in the previous version of the Circom compiler, this was interpreted in the same way as <--. Today, this code would generate a compilation error.) As we have seen, this only assigned a value to the output during proof generation, but did not constrain the output during proof verification, leaving the verifying contract vulnerable.

Our new Circom bug finder, Circomspect

Circomspect is a static-analyzer and linter for programs written in the Circom DSL. Its main use is as a tool for reviewing the security and correctness of Circom templates and functions. The implementation is based on the Circom compiler and uses the same parser as the compiler does. This ensures that any program that the compiler can parse can also be parsed using Circomspect. The abstract syntax tree generated by the parser is converted to static single-assignment form, which allows us to perform simple data flow analyses on the input program.

The current version implements a number of analysis passes, checking Circom programs for potential issues like unconstrained signals, unused variables, and shadowing variable declarations. It warns the user about each use of the signal assignment operator <--, and can often detect if a circuit uses <-- to assign a quadratic expression to a signal, indicating that the signal constraint assignment operator <== could be used instead. This analysis pass would have found the vulnerability in the Tornado.cash described above. All issues flagged by Circomspect do not represent vulnerabilities, but rather locations that should be reviewed to make sure that the code does what is expected.

As an example of the types of issues Circomspect can find, consider the following function from the circom-pairing repository:

An example function from the circom-pairing repository

This function may look a bit daunting at first sight. It implements inversion modulo p using the extended Euclidean algorithm. Running Circomspect on the containing file yields a number of warnings telling us that the assignments to the arrays y, v, and newv do not contribute to the return value of the function, which means that they cannot influence either witness or constraint generation.

Running Circomspect on the function find_Fp_inverse produces a number of warnings.

A closer look at the implementation reveals that the variable y is used only to compute newv, while newv is used only to update v and v is used only to update y. It follows that none of the variables y, v, and newv contribute to the return value of the function find_Fp_inverse, and all can safely be removed. (As an aside, this makes complete sense since running the extended Euclidean algorithm on two coprime integers num and p computes two integers x and y such that x * num + * p = 1. This means that if we’re interested in the inverse of num modulo p, it is given by x, and the value of y is not needed. Since x and y are computed independently, the code used to compute y can safely be removed.)

Improving the state of ZKP tooling

Zero-knowledge DSLs like Circom have democratized ZKPs. They allow developers without a background in mathematics or cryptography to build and deploy systems that use zero-knowledge technology to protect their users. However, since ZKPs are often used to protect user privacy or assure computational integrity, any vulnerability in a ZPK typically has serious ramifications for the security and privacy guarantees of the entire system. In addition, since these DSLs are new and emerging pieces of technology, there is very little tooling support available for developers.

At Trail of Bits, we are actively working to fill that void. Earlier this year we released Amarna, our static-analyzer for ZKPs written in the Cairo programming language, and today we are open sourcing Circomspect, our static analyzer and linter for Circom programs. Circomspect is under active development and can be installed from crates.io or downloaded from the Circomspect GitHub repository. Please try it out and let us know what you think! We welcome all comments, bug reports, and ideas for new analysis passes.

Magnifier: An Experiment with Interactive Decompilation

By Alan Chang

Today, we are releasing Magnifier, an experimental reverse engineering user interface I developed during my internship. Magnifier asks, “What if, as an alternative to taking handwritten notes, reverse engineering researchers could interactively reshape a decompiled program to reflect what they would normally record?” With Magnifier, the decompiled C code isn’t the end—it’s the beginning.

Decompilers are essential tools for researchers. They transform program binaries from assembly code into source code, typically represented as C-like code. A researcher’s job starts where decompilers leave off. They must make sense of a decompiled program’s logic, and the best way to drill down on specific program paths or values of interest is often pen and paper. This is obviously tedious and cumbersome, so we chose to prototype an alternative method.

The Magnifier UI in action

Decompilation at Trail of Bits

Trail of Bits is working on multiple open-source projects related to program decompilation: Remill, Anvill, Rellic, and now Magnifier. The Trail of Bits strategy for decompilation is to progressively lift compiled programs through a tower of intermediate representations (IRs); Remill, Anvill, and Rellic work together to achieve this. This multi-stage approach helps break down the problem into smaller components:

  1. Remill represents machine instructions in terms of LLVM IR.
  2. Anvill transforms machine code functions into LLVM functions.
  3. Rellic transforms the LLVM IR into C code via the Clang AST.

Theoretically, a program may be transformed at any pipeline stage, and Magnifier proves this theory. Using Magnifier, researchers can interactively transform Anvill’s LLVM IR and view the C code produced by Rellic instantaneously.

It started as a REPL

Magnifier started its life as a command-line read-eval-print-loop (REPL) that lets users perform a variety of LLVM IR transformations using concise commands. Here is an example of one of these REPL sessions. The key transformations exposed were:

  • Function optimization using LLVM
  • Function inlining
  • Value substitution with/without constant folding
  • Function pointer devirtualization

Magnifier’s first goal was to describe the targets being transformed; depending on the type of transformation, these targets could be instructions, functions, or other objects. To describe these targets consistently and hide some implementation details, Magnifier assigns a unique, opaque ID to all functions, function parameters, basic blocks, and IR instructions.

Magnifier’s next important goal was to track instruction provenance across transformations and understand how instructions are affected by operations. To accomplish this, it introduces an additional source ID. (For unmodified functions, source IDs are the same as current IDs.) Then during each transformation, a new function is created that propagates the source IDs but generates new, unique current IDs. This solution ensures that no function is mutated in place, facilitating before-and-after comparisons of transformations while tracking their provenance.

Lastly, for transformations such as value substitution, Magnifier enables the performance of additional transformations in the form of constant folding. These extra transformations are often desirable. To accommodate different use cases, Magnifier provides granular control over each transformation in the form of a universal substitution interface. This interface allows users to monitor all the transformations and selectively allow, reject, or modify substitution steps as they see fit.

Here’s an example of transformations in action using Magnifier REPL.

First, a few functions are defined as follows:

Here’s the same “mul2” function in annotated LLVM IR:

The opaque IDs and the provenance IDs are shown. “XX|YY” means “XX” is the current ID, and “YY” is the source ID. The IDs in this example are:

Function: 44
Parameter “a”: 45
Basic block (entry): 51
Instruction “ret i32”: 50

Now, substitution takes place that sets the parameter “a” to 10:

The “perform substitution” message at the top shows that a value substitution has happened. Looking at the newly transformed function, each instruction has a new current ID, but the source IDs still track the original function and instructions. Also, a call to “@llvm.assume” is inserted to document the value substitution.

Next, the “b” parameter is substituted for 20, and the two calls to “addOne” are inlined:

The end result is surprisingly simple. We now have a function that calls “@llvm.assume” on “a” and “b” then returns just 231. The constant folding here shows Magnifier’s ability to evaluate simple functions.

MagnifierUI: A More Intuitive Interface

While the combination of a shared library plus REPL is a simple and flexible solution, it’s not the most ideal setup for researchers who just want to use Magnifier as a tool to reverse-engineer binaries. This is where the MagnifierUI comes in.

The MagnifierUI consists of a Vue.js front end and a C++ back end, and it uses multi-session WebSockets to facilitate communication between the two. The MagnifierUI not only exposes most of the features Magnifier has to offer, but it also integrates Rellic, the LLVM IR-to-C code decompiler, to show side-by-side C code decompilation results.

We can try performing the same set of actions as before using the MagnifierUI:

Use the Upload button to open a file.

The Terminal view exposes the same set of Magnifier commands, which we can use to substitute the value for the first argument.

The C code view and the IR view are automatically updated with the new value. We can do the same for the second parameter.

Clicking an IR instruction selects it and highlights the related C code. We can then inline the selected instruction using the Inline button. The same can be done for the other call instruction.

After inlining both function calls, we can now optimize the function using the Optimize button. This uses all the available LLVM optimizations.

Simplified the function down to returning a constant value

Compared to using the REPL, the MagnifierUI is more visual and intuitive. In particular, the side-by-side view and instruction highlighting make reading the code a lot easier.

Capturing the flag with LLVM optimizations

As briefly demonstrated above, we can leverage the LLVM library in various ways, including its fancy IR optimizations to simplify code. However, a new example is needed to fully demonstrate the power of Magnifier’s approach.

Here we have a “mystery” function that calls “fibIter(100)” to obtain a secret value:

It would be convenient to find this secret value without running the program dynamically (which could be difficult if anti-debugging methods are in place) or manually reverse-engineering the “fibIter” function (which can be time-consuming). Using the MagnifierUI, we can solve this problem in just two clicks!

Select the “fibIter” function call instruction and click the “Inline” button

With the function inlined, we can now “Optimize”! 

Here’s our answer: 3314859971, the “100th Fibonacci number” that Rellic has tried to fit into an unsigned integer. 

This example shows Magnifier’s great potential for simplifying the reverse-engineering process and making researchers’ lives easier. By leveraging all the engineering wisdom behind LLVM optimizations, Magnifier can reduce even a relatively complex function like “fibIter,” which contains loops and conditionals, down to a constant.

Looking toward the Future of Magnifier

I hope this blog post sheds some light on how Trail of Bits approaches the program decompilation challenge at a high level and provides a glimpse of what an interactive compiler can achieve with the Magnifier project.

Magnifier certainly needs additional work, from adding support for transformation types (with the hope of eventually expressing full patch sets) to integrating the MagnifierUI with tools like Anvill to directly ingest binary files. Still, I’m very proud of what I’ve accomplished with the project thus far, and I look forward to what the future holds for Magnifier.

I would like to thank my mentor Peter Goodman for all his help and support throughout my project as an intern. I learned a great deal from him, and in particular, my C++ skills improved a lot with the help of his informative and detailed code reviews. He has truly made this experience unique and memorable!

Using mutants to improve Slither

By Alex Groce, Northern Arizona University

Improving static analysis tools can be hard; once you’ve implemented a good tool based on a useful representation of a program and added a large number of rules to detect problems, how do you further enhance the tool’s bug-finding power?

One (necessary) approach to coming up with new rules and engine upgrades for static analyzers is to use “intelligence guided by experience”—deep knowledge of smart contracts and their flaws, experience in auditing, and a lot of deep thought. However, this approach is difficult and requires a certain level of expertise. And even the most experienced auditors who use it can miss things.

In our paper published at the 2021 IEEE International Conference on Software Quality, Reliability, and Security, we offer an alternative approach: using mutants to introduce bugs into a program and observing whether the static analyzer can detect them. This post describes this approach and how we used it to write new rules for Slither, a static analysis tool developed by Trail of Bits.

Using program mutants

The most common approach to finding ways to improve a static analysis tool is to find bugs in code that the tool should have been able to find, then determine the improvements that the tool needs to find such bugs.

This is where program mutants come into play. A mutation testing tool, such as universalmutator, takes a program as input and outputs a (possibly huge) set of slight variants of the program. These variants are called mutants. Most of them, assuming the original program was (mostly) correct, will add a bug to the program.

Mutants were originally designed to help determine whether the tests for a program were effective (see my post on mutation testing on my personal blog). Every mutant that a test suite is unable to detect suggests a possible defect in the test suite. It’s not hard to extend this idea specifically to static analysis tools.

Using mutants to improve static analysis tools

There are important differences between using mutants to improve an entire test suite and using them to improve static analysis tools in particular. First, while it’s reasonable to expect a good test suite to detect almost all the mutants added to a program, it isn’t reasonable to expect a static analysis tool to do so; many bugs cannot be detected statically. Second, many mutants will change the meaning of a smart contract, but not in a way that fits into a general pattern of good or bad code. A tool like Slither has no idea what exactly a contract should be doing.

These differences suggest that one has to laboriously examine every mutant that Slither doesn’t detect, which would be painful and only occasionally fruitful. Fortunately, this isn’t necessary. One must only look at the mutants that 1) Slither doesn’t detect and 2) another tool does detect. These mutants have two valuable properties. First, because they are mutants, we can be fairly confident that they are bugs. Second, they must be, in principle, detectable statically: some other tool detected them even if Slither didn’t! If another tool was able to find the bugs, we obviously want Slither to do so, too. The combination of the nature of mutants and the nature of differential comparison (here, between two static analysis tools) gives us what we want.

Even with this helpful method of identifying only the bugs we care about, there might still be too much to look at. For example, in our efforts to improve Slither, we compared the bugs it detected with the bugs that SmartCheck and Securify detected (at the time, the two plausible alternative static analysis tools). This is what the results looked like:

A comparison between the bugs that Slither, SmartCheck, and Securify found and how they overlap

A handful of really obvious problems were detected by all three tools, but these 18 mutants amount to less than 0.5% of all detected mutants. Additionally, every pair of tools had a significant overlap of 100-400 mutants. However, each tool detected at least 1,000 mutants uniquely. We’re proud that Slither detected both the most mutants overall and the most mutants that only it detected. In fact, Slither was the only tool to detect a majority (nearly 60%) of all mutants any tool detected. As we hoped, Slither is good at finding possible bugs, especially relative to the overall number of warnings it produced.

Still, there were 1,256 bugs detected by SmartCheck and 1,076 bugs detected by Securify that Slither didn’t detect! Now, these tools ran over a set of nearly 50,000 mutants across 100 smart contracts, which is only about 25 bugs per contract. Still, that’s a lot to look through!

However, a quick glance at the mutants that Slither missed shows that many are very similar to each other. Unlike in testing, we don’t care about each individual bug—we care about patterns that Slither is not detecting and about the reasons Slither misses patterns that it already knows about. With this in mind, we can sort the mutants by looking at those that are as different as possible from each other first.

First, we construct a distance metric to determine the level of similarity between two given mutants, based on their locations in the code, the kind of mutation they introduce, and, most importantly, the actual text of the mutated code. If two mutants change similar Solidity source code in similar ways, we consider them to be very similar. We then rank all the mutants by similarity, with all the very similar mutants at the bottom of the ranking. That way, the first 100 or so mutants represent most of the actual variance in code patterns!

So if there are 500 mutants that change msg.sender to tx.origin, and are detected by both SmartCheck and Slither, which tend to be overly paranoid about tx.origin and often flag even legitimate uses, we can just dismiss those mutants right off the bat; we know that a good deal of thought went into Slither’s rules for warning about uses of tx.origin. And that’s just what we did.

The new rules (and the mutants that inspired them)

Now let’s look at the mutants that helped us devise some new rules to add to Slither. Each of these mutants was detected by SmartCheck and/or Securify, but not by Slither. All three of these mutants represent a class of real bug that Slither could have detected, but didn’t:

Mutant showing Boolean constant misuse:

if (!p.recipient.send(p.amount)) { // Make the payment

    ==> !p.recipient.send(p.amount) ==> true

if (true) { // Make the payment

The first mutant shows where a branch is based on a Boolean constant. There’s no way for paths through this code to execute. This code is confusing and pointless at best; at worst, it’s a relic of a change made for testing or debugging that somehow made it into a final contract. While this bug seems easy to spot through a manual review, it can be hard to notice if the constant isn’t directly present in the condition but is referenced through a Boolean variable.

Mutant showing type-based tautology:

require(nextDiscountTTMTokenId6 >= 361 && ...);

    ==> ...361...==>...0…

require(nextDiscountTTMTokenId6 >= 0 && ...);

This mutant is similar to the first, but subtler; a Boolean expression appears to encode a real decision, but in fact, the result could be computed at compile time due to the types of the variables used (DiscountTTMTokenId6 is an unsigned value). It’s a case of a hidden Boolean constant, one that can be hard for a human to spot without keeping a model of the types in mind, even if the values are present in the condition itself.

Mutant showing loss of precision:

byte char = byte(bytes32(uint(x) * 2 ** (8 * j)));

    ==> ...*...==>.../…

byte char = byte(bytes32(uint(x) * 2 ** (8 / j)));

This last mutant is truly subtle. Solidity integer division can truncate a result (recall that Solidity doesn’t have a floating point type). This means that two mathematically equivalent expressions can yield different results when evaluated. For example, in mathematics, (5 / 10) * 2 and (5 * 2) / 10 have the same result; in Solidity, however, the first expression results in zero and the other results in one. When possible, it’s almost always best to multiply before dividing in Solidity to avoid losing precision (although there are exceptions, such as when the size limits of a type require division to come first).

After identifying these candidates, we wrote new Slither detectors for them. We then ran the detectors on a corpus that we use to internally vet new detectors, and we confirmed that they are able to find real bugs (and don’t report too many false positives). All three detectors have been available in the public version of Slither for a while now (as the boolean-cst, tautology, and divide-before-multiply rules, respectively), and the divide-before-multiplying rule has already claimed two trophies, one in December of 2020 and the other in January of 2021.

What’s next?

Our work proves that mutants can be a useful tool for improving static analyzers. We’d love to continue adding rules to Slither using this method, but unfortunately, to our knowledge, there are no other static analysis tools that compare to Slither and are seriously maintained.

Over the years, Slither has become a fundamental tool for academic researchers. Contact us if you want help with leveraging its capacities in your own research. Finally, check out our open positions (Security Consultant, Security Apprenticeship) if you would like to join our core team of researchers.

Zero Day Initiative

Retrieved title: Zero Day Initiative - Blog, 3 item(s)
MindShaRE: Analyzing BSD Kernels for Uninitialized Memory Disclosures using Binary Ninja

Disclosure of uninitialized memory is one of the common problems faced when copying data across trust boundaries. This can happen between the hypervisor and guest OS, kernel and user space, or across the network. The most common bug pattern noticed among these cases is where a structure or union is allocated in memory, and some of the fields or padding bytes are not initialized before copying it across trust boundaries. The question is, is it possible to perform variant analysis of such bugs?

The idea here is to perform a control flow insensitive analysis to track all memory store operations statically. Any memory region never written to is identified as uninitialized when the data from it is copied across trust boundaries.

Generalizing the code pattern for analysis

Consider the case of CVE-2018-17155, a FreeBSD kernel memory disclosure in the getcontext() and swapcontext() system calls due to a lack of structure initialization. Shown below is the patch for sys_getcontext(). The listing on the left shows the patched code. sys_swapcontext() was patched in a similar fashion.

Figure 1 - Patch for sys_getcontext() information disclosure. Vulnerable code appears on the right.

The vulnerable code declared a ucontext_t structure on the stack, wrote to some but not all fields, and finally used copyout() to copy UC_COPY_SIZE bytes of data from the structure to userland. The problem here is that not all fields are initialized, so any data occupying the uninitialized parts of the structure memory region are disclosed. To solve the problem, the patched code zeroes out the entire structure using the bzero() function.

The generalization of the above code pattern looks like this:

• A memory region (structure, union, etc.) is declared on the stack or allocated on the heap, which could be the source of uninitialized memory.
• The memory region may get fully or partially written.
• There is an API that transfers data across trust boundaries. This could be the sink for uninitialized memory.
• The API generally takes at least 3 parameters: source buffer, destination buffer, and size. In this case, the source of the memory is a stack offset, and the size of the transfer is a constant value. A constant size of transfer means the value is either the entire size of the memory region (using sizeof operator) or a part of it until an offset.
• The memory region may be zeroed out before usage using functions like memset() or bzero().

The sink function is application-specific. To mention a few of the more likely sinks: copy_to_user() in case of Linux kernel, copyout() in case of BSD kernels, send() or sendto() for network transfers or any wrappers around them. The definitions of these functions are either documented, or else understood by reverse engineering if the target is closed source.

Searching the code pattern for analysis

Once the sink function and its definition are known, we can query for calls to the sink function with a constant size argument and source buffer pointing to a stack offset or heap memory. Querying for a pointer to stack memory is straightforward, whereas detecting heap pointers requires visiting the definition site of source variables. Consider the definition of copyout() function in BSD:

         copyout(const void *kaddr, void *uaddr, size_t len)

When looking for stack memory disclosures, search for cross-references to the copyout() function where kaddr is pointing to a stack offset and the len parameter is a constant.

Binary Ninja has a static data flow feature that propagates known values within a function, including stack frame offsets and type information. Using this feature, it is possible to narrow down calls to copyout() that satisfy our search criteria. To understand this better, let’s inspect the arguments passed to copyout() from sys_getcontext().

Figure 2 - sys_getcontext() invoking copyout(kaddr, uaddr, len)

The kaddr parameter, or params[0], holds a kernel stack pointer, is shown as the stack frame offset -0x398. The value for the len parameter, or params[1], is shown as the constant 0x330. Since Binary Ninja has no information regarding uaddr, this is shown as . With this register type information for kaddr and len, the following query fetches all instances of calls to copyout() with a kernel stack pointer and constant size:

Statically tracking memory stores

The core idea of the analysis is to track all the memory store operations using Binary Ninja’s static data flow capability and propagate pointers manually using Single Static Assignment (SSA) form whenever necessary. For tracking stack memory stores in local function scope, we rely on Low-Level IL (LLIL), because Medium Level IL (MLIL) abstracts stack access and might eliminate some of the memory stores. For tracking inter-procedure store operations where the address is passed to another function, we rely on the MLIL SSA form to propagate the pointers. The visitor class implemented to handle IL instructions is based on Josh Watson’s Emilator.

Tracking stack memory stores with LLIL

In LLIL, any instruction writing to memory is represented as an LLIL_STORE operation. It has a source and destination parameter. The idea is to linearly visit each LLIL instruction in a function and check if it is an LLIL_STORE operation having a stack frame offset as its destination. When a memory store writing to stack is identified, we will log the source offset of the write and its size. Consider a simple 8-byte memory move operation and its corresponding LLIL information provided by Binary Ninja:

Figure 3 - LLIL_STORE operation in freebsd32_sigtimedwait()

The StackFrameOffset value is the offset from the base of the stack and the size property gives the size of the store operation. Using this information, it is possible to know which memory address are being written. In this case, the addresses from stack base offset -116 to -109 (8 bytes) are being initialized.

Static function hooks and memory writing APIs

While memory store instructions are one way to initialize memory, functions like memset() and bzero() are frequently used to initialize a memory region with NULLs. Similarly, functions such as memcpy(), memmove(), bcopy(), strncpy(), and strlcpy() are also used to write to a memory region. All these functions have something in common: there is a destination memory pointer and a size to write. If the destination and size values are known, it is possible to know the memory region being written to. Consider the case of bzero(), which is used to clear stack memory in the patched sys_getcontext():

Figure 4 - Clearing stack memory using bzero()

By querying the destination pointer and size parameters, it is possible to know their respective values and hence the target memory region.

Now let us consider how the analyzer can handle CALL operations. Static hooks are handlers to functions which we intend to handle differently compared to other functions. For any CALL instruction with a known destination i.e., MLIL_CONST_PTR, the symbol is fetched to check for static hooks.

A JSON configuration with the function names as well their positional parameters (destination buffer and size) is provided to the analyzer for static hooking:

The copyin() function is specific to BSD kernels. It is used to initialize kernel buffers with data from user space. Any target-specific functions to hook can be added to the JSON config and handled in visit_function_hooks() as per necessity.

Handling x86 REP optimization

Many times compilers optimize memory writing functions into REP instructions or a series of store operations. While store operations introduced due to optimization can be handled like any other store operation, REP instructions requires special handling. Static function hooks are not useful in detecting memory writes due to REP. So how do we handle such optimizations and avoid missing those memory writes? First, let’s look at how Binary Ninja translates the REP instruction in LLIL or MLIL.

Figure 5 - memcpy() optimized to REP instruction

Figure 6 - REP instruction translation in MLIL

The REP instruction repeats the string operation until RCX is 0. The direction of copy operation depends on the Direction Flag (DF), hence the branching where one branch increments the source (RSI) and destination (RDI) pointers and the other decrements them. In general, it is reasonably safe to assume that DF will be 0, and that pointers are incremented.

When linearly walking through the ILs, the translated REP instruction will look no different from other instructions. The idea is to check for GOTO instruction, and for every GOTO instruction in IL, fetch the disassembly at the same address. If the disassembly is REP instruction, then fetch the destination pointer as well as size arguments and mark the memory region as initialized.

The LLIL has a get_possible_reg_values() API to read values of registers statically. The MLIL provides couple of APIs, get_var_for_reg() and get_ssa_var_version(), to map architecture registers to SSA variables. This is very useful when propagating values manually using SSA variables in the absence of RegisterValueType information (i.e. RegisterValueType.UndeterminedValue). Similar APIs are currently missing in LLIL and tracked as a feature request: API to get SSARegister for a register at a given LLIL instruction.

Tracking Inter-procedure memory stores with MLIL

At this point we can track memory store operations, CALL operations such as bzero(), memset(), and also deal with REP optimization. The next task is to track memory writes across function calls, as when a caller passes a memory address to a callee. The interesting challenge here is that once a stack pointer has been passed into another function, it can no longer be tracked using the register value type information (StackFrameOffset) as we did within the local function scope using LLIL (see above).

To solve this problem, we propagate the pointers within the callee function using MLIL SSA variables, just like propagating taint information. Whenever a MLIL_STORE_SSA instruction is encountered, we log the offset of the write operation and size values whenever the destination of the memory write operation is resolved manually based on values of SSA variables. The set_function_args() function shown below iterates through MLIL variables and assigns the value (pointer) passed by the caller:

Once the initial SSA variables are set, we visit all the instructions linearly to propagate the pointer and log memory writes. While doing this, the most common operation performed on the pointer is addition. Therefore, it is necessary to emulate MLIL_ADD instruction to handle pointer arithmetic operations. Additionally, it is also important to emulate instructions such as MLIL_SUB, MLIL_LSR and MLIL_AND to handle certain pointer-aligning operations in case of optimizations. Here is an example of how these MLIL SSA expressions are resolved to log a memory store operation:

Considering the SSA variable rax_43#65 as a manually propagated pointer value, it is possible to resolve the destination of the store operation as well as the size of the write. But when the value of the SSA variable rax_43#65 is not available, this memory is not associated with the pointer that was propagated by the caller and therefore not logged.

Handling pointer-aligning optimizations

When performing inter-procedure analysis, further optimizations were noticed in addition to the REP optimization as seen in the “Handling x86 REP optimization” section above. A variable allocated on the stack will usually be aligned to meet the needs of upcoming operations. Let’s say a stack pointer is passed to memset() and the compiler inlines the call as a REP instruction. In this case, it is very likely the memory will be allocated at an aligned address such that the fastest instructions can be used during REP operation.

However, when a pointer is received as an argument by a callee or as a return value of an allocator function, the compiler may have to generate pointer and size alignment opcodes which could rely on branching decisions before reaching REP instruction. Here is an example of such an optimization commonly found in the NetBSD kernel used for analysis:

Figure 7 - An example memset() optimization from NetBSD

When such branching decisions are involved, the pointer, as well as the size, can take multiple possible values (from the perspective of static analysis) at the point of REP instruction. This is different from what we observed in the “Handling x86 REP optimization" section where there is only one possible value for pointer and size. Our goal here is to find the actual value of the pointer and size in the absence of pointer-aligning computations. To achieve this, a couple of SSA expressions were identified that can be used to resolve the original value:

• Search for an expression involving (ADDRESS & BYTESIZE). This could be the first use of ADDRESS before taking any conditional branches.
• Search for an expression involving (SIZE >> 3). This is where the adjusted size is passed to a REP instruction.

I had a couple of ideas in mind to track back the above expressions from the point of REP instruction, one relying entirely on SSA and the other based on dominators:

• Use get_ssa_var_definition() and get_ssa_var_uses() APIs to get a variable’s definition site and its uses.
• Alternatively, get the dominators of the basic block containing the REP instruction and visit the instructions in the dominator blocks.

The function resolve_optimization() shown below uses dominators to get the basic blocks to perform the search operation. Since the pointer is manually passed by the caller, the value is fetched from the SSA variables.

In the case of a possible constant size value, we fetch the maximum from the list of available size values. Once both pointer and size values are available, we log the memory region as initialized.

Tracking memory stores in dynamic memory allocations

So far, all our analyses were concentrated on stack memory as the source buffer for information disclosure. This is largely due to the prevalence of stack memory disclosure bugs, as described in KLEAK: Practical Kernel Memory Disclosure Detection (PDF). What about other memory regions such as the heap? Can we model some of the heap memory disclosures too?

When looking for heap memory disclosures, the idea remains the same. We are still looking for calls to sink functions with known size value. But instead of the source pointer being RegisterValueType.StackFrameOffset, we check for RegisterValueType.UndeterminedValue. Consider the code for sys_statfs():

Figure 8 - Dynamic memory allocation in sys_statfs()

Here the kernel pointer rdi_1#2 in copyout() is undetermined because Binary Ninja does not know what the allocator function returns. However, by using the SSA form, we can manually track back whether rdi_1#2 is holding the return value of malloc(). For example, follow the highlighted instructions in Figure 8. - the variables are assigned as rax_1#1->r15#1->rdi_1#2. This information can be obtained programmatically using the MLIL get_ssa_var_definition() API. Once the definition site of an SSA variable is obtained, we can check whether the variable is initialized using a CALL operation as demonstrated below:

How does the analyzer know the definition of allocator functions? We can take the same approach used for providing information regarding static function hooks (see the “Static function hooks and memory writing APIs” section above). A JSON configuration with a list of allocator functions and an index of size parameters is provided to the analyzer. For any CALL instruction with a known destination (i.e., MLIL_CONST_PTR), the symbol is fetched to check for known allocator functions. Here is a sample JSON configuration used for analysis:

Once we have established the connection between the source pointer and allocator call, the next question is, what pointer value will be assigned as the return value of the allocator call? The stack pointers as tracked as negative offsets in Binary Ninja as seen below:

To have a generalized representation between the stack and heap pointers, I decided to set the return value of a heap allocator calls as a negative value of the size of the allocation. For the malloc() call in sys_statfs(), rax_1#1 is set to -0x1d8 as the starting address. Therefore, the memory region which needs to be initialized ranges from -0x1d8 to 0 [start + size of allocation]. Even when the allocation size is undetermined, starting address can be set to some arbitrary value such as -0x10000. All that matters here is to know whether the contiguous memory region accessed by copyout() is initialized or not.

Filtering memory stores using dominators and post dominators

A dominator in graph theory provides information on the order of execution of some basic blocks. While we have already used dominators for handling pointer-aligning optimizations in the “Handling pointer aligning optimizations” section, this section details the usage of dominators in detecting control flow-sensitive memory store operations.

To analyze uninitialized memory disclosures, we explore two ideas: dominators and post-dominators. A basic block X is said to dominate another basic block Y if all paths to Y should go through X. A basic block Y is said to post-dominate basic block X if all paths from X to any of the function’s return blocks should go through Y. Consider this example from Wikipedia:

Figure 9 - Graph demonstrating dominators and post dominators

In the provided graph, node B dominates nodes C, D, E, and F because all paths to these nodes must go through node B. By definition, every node dominates itself, so the set of all nodes dominated by node B will be B, C, D, E, and F. Also, node A dominates all the nodes in the graph. Therefore, the dominators of nodes C, D, E, F are A and B.

Similarly, when A is considered as the function entry node, with E and F being exit nodes, node B is the post-dominator of node A. This is because all paths from A to the exit nodes must go through B.

Now, how can dominators and post-dominators help us in this analysis?

We can perform dominator analysis on the callers of the sink function. The idea is to log only memory stores in basic blocks which dominate the basic block calling copyout(), that is, basic blocks which will be executed irrespective of branching decisions. Consider the code below:

Figure 10 - Dominators of basic block calling copyout()

Here the basic block calling copyout() is and there are five dominator blocks in the path from the function entry to copyout(). When performing dominator-based analysis, we will log only memory stores within these five dominator blocks. The memory store operations in other basic blocks might be skipped and not execute. The same is the case with the callee function. We will perform an inter-procedure analysis only when the function is called from a dominator block.

Post-dominator analysis is done on the callee function during an inter-procedure analysis. It is meant to find bugs where a callee can possibly return before initializing the memory region it is supposed to. Consider the callee function do_sys_waitid() from figure 10.

Figure 11 - Post dominators of function entry block in do_sys_waitid()

The function entry block is always executed. The other basic blocks that are executed irrespective of the branching decisions are and . Once again, memory stores and callee analysis are limited only to these three basic blocks.

Dominator- and post-dominator-based analysis tries to fill the gaps in control flow insensitive analysis performed by the analyzer. The general assumption here is that memory is initialized or cleared before performing further operations and therefore dominates other basic blocks. However, this assumption is not always true. For example, there are cases where individual code paths can perform the same operation as done in the dominators. Moreover, when a callee returns due to any error condition, the return value could be validated by the caller before calling copyout(). Consequently, dominator-based analysis as done in this implementation is prone to large numbers of false positives.

Checking for uninitialized memory disclosures

Once all the memory store operations are statically logged with information on offset and size of write, the memory region copied out to user space using copyout() can be evaluated for uninitialized memory disclosure. Consider the call to copyout() shown below:

The source pointer is -0x398 and the size copied is 0x330 bytes. Therefore, the analyzer has to validate if all the bytes in the memory range from -0x398 to (-0x398 + 0x330) are initialized, and if not, flag that as a bug.

False positives and limitations

The analyzer is written with the goal of finding memory regions that never get written to in any possible code paths. False positives occur in cases when it is unable to track a memory store operation. Below are some common false positive conditions and limitations of the implementation:

• The analyzer does not emulate branching instructions. Therefore, false positives are seen in code constructs involving control flow decisions. Consider a memory region such as an array that is initialized in a loop operation. In this case, the store operation would be detected only once because the loop body is visited only once by the analyzer, and not in a loop as it would be during execution.

• Indirect calls are not resolved statically. Consequently, any memory store done during indirect calls is not tracked.

• Optimizations may make it harder to track memory stores. Some common optimizations noticed were tackled in the “Handling x86 REP optimization” and “Handling pointer aligning optimizations” sections.

• Binary Ninja may wrongly detect type information of functions used for static hooking or sink functions like copyout(). Since our analysis relies on RegisterValueType information, any failure to accurately detect the function prototype may lead to false results. Verify the type information before analysis and update if necessary.

• The analyzer looks only for code patterns where the memory source and sink function are within the same function. There is no tracking back of memory source beyond the local function scope.

• Dominator analysis is experimental. You should use it only as a guideline to perform code reviews.

When there is access to source code, some of these false positives can be resolved by changing the optimization flags or by unrolling loops to reduce branching decisions.

Analysis and results

The target kernel executable is loaded in Binary Ninja to generate the BNDB analysis database. Then the analyzer is executed against the database for faster analysis. There are a couple of scripts: one for analyzing stack memory disclosures and another for analyzing sink functions with known size and unknown source pointer. Since the source pointer could be from a heap allocator, provide a JSON configuration with a list of allocator functions as an argument. The dominator analysis is experimental. You need to enable it using an optional argument when needed:

Conclusion

The scripts were tested on Binary Ninja version 2.4.2846 against FreeBSD 11.4, NetBSD 9.2, and OpenBSD 6.9 kernels. Amongst the results, code paths that are possibly reachable for an unprivileged user were evaluated. The OpenBSD bugs were found in sysctls related to multicast routing in IPv4 as well as IPv6, which are tracked as ZDI-22-073 and ZDI-22-012 respectively.

The four vulnerabilities (ZDI-22-075, ZDI-22-1036, ZDI-22-1037, ZDI-22-1067) found in NetBSD are related to syscalls supporting backward compatibility for older NetBSD releases ZDI-22-075 and ZDI-22-1036 are information disclosures in VFS syscalls for NetBSD 3.0 and NetBSD 5.0 respectively. Details regarding the fixes can be found here. Next, ZDI-22-1037 is an information disclosure in getkerneinfo syscall for NetBSD 4.3. This bug was fixed with many other potential issues as seen here. Finally, ZDI-22-1067 is another information disclosure related to VFS syscalls but in NetBSD 2.0 compatibility. Details regarding the fix can be found here.

The FreeBSD bug found in version 11.4 was also related to compatibility, which in this case for supporting 32-bit binaries. However, this bug was fixed without a disclosure during a large change done for the 64-bit inode. The uninitialized structure fields were cleared in the copy_stat function as part of the 64-bit inode project. Though this commit was in May 2017, it was tagged to release 12.0 and above. Therefore, the bug remained unfixed in release 11.4 until it reached EOL in September 2021, soon after our bug report.

Putting it together, most of the bugs were found in BSD’s compatibility layers. Additionally, all these bugs are stack memory disclosures. For anyone interested, the source code for the project can be found here.

 You can find me on Twitter @RenoRobertr, and follow the team on Twitter or Instagram for the latest in exploit techniques and security patches.

 

Acknowledgments and references

— Various blog posts from Trail of Bits on Binary Ninja
— Josh Watson for various projects using Binary Ninja. The visitor class implementation is based on emilator
— Jordan for all the code snippets and the Binary Ninja slack community for answering various questions
KLEAK: Practical Kernel Memory Disclosure Detection by Thomas Barabosch and Maxime Villard
Solving Uninitialized Stack Memory on Windows by Joe Bialek
Building Faster AMD64 Memset Routines by Joe Bialek

The September 2022 Security Update Review

Another Patch Tuesday is upon, and Adobe and Microsoft have released a bevy of new security updates. Take a break from your regularly scheduled activities and join us as we review the details of their latest security offerings.

Adobe Patches for September 2022

 For September, Adobe released seven patches addressing 63 in Adobe Experience Manager, Bridge, InDesign, Photoshop, InCopy, Animate, and Illustrator. A total of 42 of these bugs were reported by ZDI Sr Vulnerability Researcher Mat Powell. The update for InDesign is the largest patch this month, with eight Critical-rated and 10 Important-rated vulnerabilities receiving fixes. The most severe of these could lead to code execution if a specially crafted file is opened on an affected system. The patch for Photoshop fixes 10 CVEs, nine of which are rated Critical. Again, an attacker can get code execution if they can convince a user to open a malicious file. The fix for InCopy fixes five similar code execution bugs and two info disclosure bugs. Adobe Animate also receives patches for two Critical-rated code execution bugs.

The update for Adobe Bridge corrects 10 Critical-rated code execution bugs and two Important-rated info disclosure bugs. One of the three Illustrator vulnerabilities getting patched could also lead to code execution. As with the bugs previously mentioned, a user would need to open a malicious file with an affected software version. Finally, the patch for Adobe Experience Manager addresses 11 Important-rated bugs, primarily of the cross-site scripting (XSS) variety.

None of the bugs fixed by Adobe this month are listed as publicly known or under active attack at the time of release. Adobe categorizes these updates as a deployment priority rating of 3.

Apple Patches for September 2022

Yesterday, Apple released updates for iOS, iPadOS, macOS, and Safari. They also released updates for watchOS and tvOS but provided no details on any of the fixes included in these patches. Two of the bugs patched by Apple were identified as being under active exploit. The first is a kernel bug (CVE-2022-32917) resulting from improper bounds checking. It affects iOS 15 and iPadOS 15, macOS Big Sur, and macOS Monterey. Interestingly, this CVE is also listed in the advisory for iOS 16, but it is not called out as being under active exploit for that flavor of the OS. The Big Sur version of macOS also includes a fix for an Out-of-Bounds (OOB) Write bug in the kernel (CVE-2022-32894) that’s also listed as under active attack. One final note: Apple states in its iOS 16 advisory that “Additional CVE entries to be added soon.” It is possible other bugs could also impact this version of the OS. Either way, it’s time to update your Apple devices.

Microsoft Patches for September 2022

This month, Microsoft released 64 new patches addressing CVEs in Microsoft Windows and Windows Components; Azure and Azure Arc; .NET and Visual Studio and .NET Framework; Microsoft Edge (Chromium-based); Office and Office Components; Windows Defender; and Linux Kernel (really). This is in addition to the 15 CVEs patched in Microsoft Edge (Chromium-based) and one patch for side-channel speculation in Arm processors. That brings the total number of CVEs to 79. Five of these CVEs were submitted through the ZDI program.

The volume of fixes released this month is about half of what we saw in August, but it is in line with the volume of patches from previous September releases. For whatever reason, the last quarter of the calendar year tends to have fewer patches released. We’ll see if that trend continues in 2022.

Of the 64 new CVEs released today, five are rated Critical, 57 are rated Important, one is rated Moderate, and one is rated Low in severity. One of these new CVEs is listed as publicly known and under active attack at the time of release. Let’s take a closer look at some of the more interesting updates for this month, starting with the CLFS bug under active attack:

-       CVE-2022-37969 - Windows Common Log File System Driver Elevation of Privilege Vulnerability
This bug in the Common Log File System (CLFS) allows an authenticated attacker to execute code with elevated privileges. Bugs of this nature are often wrapped into some form of social engineering attack, such as convincing someone to open a file or click a link. Once they do, additional code executes with elevated privileges to take over a system. Usually, we get little information on how widespread an exploit may be used. However, Microsoft credits four different agencies reporting this bug, so it’s likely beyond just targeted attacks.

-       CVE-2022-34718 - Windows TCP/IP Remote Code Execution Vulnerability
This Critical-rated bug could allow a remote, unauthenticated attacker to execute code with elevated privileges on affected systems without user interaction. That officially puts it into the “wormable” category and earns it a CVSS rating of 9.8. However, only systems with IPv6 enabled and IPSec configured are vulnerable. While good news for some, if you’re using IPv6 (as many are), you’re probably running IPSec as well. Definitely test and deploy this update quickly.

-       CVE-2022-34724 - Windows DNS Server Denial of Service Vulnerability
This bug is only rated Important since there’s no chance of code execution, but you should probably treat it as Critical due to its potential impact. A remote, unauthenticated attacker could create a denial-of-service (DoS) condition on your DNS server. It’s not clear if the DoS just kills the DNS service or the whole system. Shutting down DNS is always bad, but with so many resources in the cloud, a loss of DNS pointing the way to those resources could be catastrophic for many enterprises.

-       CVE-2022-3075 - Chromium: CVE-2022-3075 Insufficient data validation in Mojo
This patch was released by the Google Chrome team back on September 2, so this is more of an “in case you missed it.” This vulnerability allows code execution on affected Chromium-based browsers (like Edge) and has been detected in the wild. This is the sixth Chrome exploit detected in the wild this year. The trend shows the near-ubiquitous browser platform has become a popular target for attackers. Make sure to update all of your systems based on Chromium.

Here’s the full list of CVEs released by Microsoft for September 2022:

CVE Title Severity CVSS Public Exploited Type
CVE-2022-37969 Windows Common Log File System Driver Elevation of Privilege Vulnerability Important 7.8 Yes Yes EoP
CVE-2022-23960 * Arm: CVE-2022-23960 Cache Speculation Restriction Vulnerability Important N/A Yes No Info
CVE-2022-34700 Microsoft Dynamics 365 (on-premises) Remote Code Execution Vulnerability Critical 8.8 No No RCE
CVE-2022-35805 Microsoft Dynamics 365 (on-premises) Remote Code Execution Vulnerability Critical 8.8 No No RCE
CVE-2022-34721 Windows Internet Key Exchange (IKE) Protocol Extensions Remote Code Execution Vulnerability Critical 9.8 No No RCE
CVE-2022-34722 Windows Internet Key Exchange (IKE) Protocol Extensions Remote Code Execution Vulnerability Critical 9.8 No No RCE
CVE-2022-34718 Windows TCP/IP Remote Code Execution Vulnerability Critical 9.8 No No RCE
CVE-2022-38013 .NET Core and Visual Studio Denial of Service Vulnerability Important 7.5 No No DoS
CVE-2022-26929 .NET Framework Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2022-38019 AV1 Video Extension Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2022-38007 Azure Guest Configuration and Azure Arc-enabled servers Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2022-37954 DirectX Graphics Kernel Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2022-35838 HTTP V3 Denial of Service Vulnerability Important 7.5 No No DoS
CVE-2022-35828 Microsoft Defender for Endpoint for Mac Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2022-34726 Microsoft ODBC Driver Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2022-34727 Microsoft ODBC Driver Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2022-34730 Microsoft ODBC Driver Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2022-34732 Microsoft ODBC Driver Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2022-34734 Microsoft ODBC Driver Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2022-37963 Microsoft Office Visio Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2022-38010 Microsoft Office Visio Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2022-34731 Microsoft OLE DB Provider for SQL Server Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2022-34733 Microsoft OLE DB Provider for SQL Server Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2022-35834 Microsoft OLE DB Provider for SQL Server Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2022-35835 Microsoft OLE DB Provider for SQL Server Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2022-35836 Microsoft OLE DB Provider for SQL Server Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2022-35840 Microsoft OLE DB Provider for SQL Server Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2022-37962 Microsoft PowerPoint Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2022-35823 Microsoft SharePoint Remote Code Execution Vulnerability Important 8.1 No No RCE
CVE-2022-37961 Microsoft SharePoint Server Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2022-38008 Microsoft SharePoint Server Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2022-38009 Microsoft SharePoint Server Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2022-37959 Network Device Enrollment Service (NDES) Security Feature Bypass Vulnerability Important 6.5 No No SFB
CVE-2022-38011 Raw Image Extension Remote Code Execution Vulnerability Important 7.3 No No RCE
CVE-2022-35830 Remote Procedure Call Runtime Remote Code Execution Vulnerability Important 8.1 No No RCE
CVE-2022-37958 SPNEGO Extended Negotiation (NEGOEX) Security Mechanism Information Disclosure Vulnerability Important 7.5 No No Info
CVE-2022-38020 Visual Studio Code Elevation of Privilege Vulnerability Important 7.3 No No EoP
CVE-2022-34725 Windows ALPC Elevation of Privilege Vulnerability Important 7 No No EoP
CVE-2022-35803 Windows Common Log File System Driver Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2022-30170 Windows Credential Roaming Service Elevation of Privilege Vulnerability Important 7.3 No No EoP
CVE-2022-34719 Windows Distributed File System (DFS) Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2022-34724 Windows DNS Server Denial of Service Vulnerability Important 7.5 No No DoS
CVE-2022-34723 Windows DPAPI (Data Protection Application Programming Interface) Information Disclosure Vulnerability Important 5.5 No No Info
CVE-2022-35841 Windows Enterprise App Management Service Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2022-35832 Windows Event Tracing Denial of Service Vulnerability Important 5.5 No No DoS
CVE-2022-38004 Windows Fax Service Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2022-34729 Windows GDI Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2022-38006 Windows Graphics Component Information Disclosure Vulnerability Important 6.5 No No Info
CVE-2022-34728 Windows Graphics Component Information Disclosure Vulnerability Important 5.5 No No Info
CVE-2022-35837 Windows Graphics Component Information Disclosure Vulnerability Important 5 No No Info
CVE-2022-37955 Windows Group Policy Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2022-34720 Windows Internet Key Exchange (IKE) Extension Denial of Service Vulnerability Important 7.5 No No DoS
CVE-2022-33647 Windows Kerberos Elevation of Privilege Vulnerability Important 8.1 No No EoP
CVE-2022-33679 Windows Kerberos Elevation of Privilege Vulnerability Important 8.1 No No EoP
CVE-2022-37956 Windows Kernel Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2022-37957 Windows Kernel Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2022-37964 Windows Kernel Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2022-30200 Windows Lightweight Directory Access Protocol (LDAP) Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2022-26928 Windows Photo Import API Elevation of Privilege Vulnerability Important 7 No No EoP
CVE-2022-38005 Windows Print Spooler Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2022-35831 Windows Remote Access Connection Manager Information Disclosure Vulnerability Important 5.5 No No Info
CVE-2022-30196 Windows Secure Channel Denial of Service Vulnerability Important 8.2 No No DoS
CVE-2022-35833 Windows Secure Channel Denial of Service Vulnerability Important 7.5 No No DoS
CVE-2022-38012 Microsoft Edge (Chromium-based) Remote Code Execution Vulnerability Low 7.7 No No RCE
CVE-2022-3038 * Chromium: CVE-2022-3038 Use after free in Network Service Critical N/A No No RCE
CVE-2022-3075 * Chromium: CVE-2022-3075 Insufficient data validation in Mojo High N/A No Yes RCE
CVE-2022-3039 * Chromium: CVE-2022-3039 Use after free in WebSQL High N/A No No RCE
CVE-2022-3040 * Chromium: CVE-2022-3040 Use after free in Layout High N/A No No RCE
CVE-2022-3041 * Chromium: CVE-2022-3041 Use after free in WebSQL High N/A No No RCE
CVE-2022-3044 * Chromium: CVE-2022-3044 Inappropriate implementation in Site Isolation High N/A No No N/A
CVE-2022-3045 * Chromium: CVE-2022-3045 Insufficient validation of untrusted input in V8 High N/A No No RCE
CVE-2022-3046 * Chromium: CVE-2022-3046 Use after free in Browser Tag High N/A No No RCE
CVE-2022-3047 * Chromium: CVE-2022-3047 Insufficient policy enforcement in Extensions API Medium N/A No No SFB
CVE-2022-3053 * Chromium: CVE-2022-3053 Inappropriate implementation in Pointer Lock Medium N/A No No N/A
CVE-2022-3054 * Chromium: CVE-2022-3054 Insufficient policy enforcement in DevTools Medium N/A No No SFB
CVE-2022-3055 * Chromium: CVE-2022-3055 Use after free in Passwords Medium N/A No No RCE
CVE-2022-3056 * Chromium: CVE-2022-3056 Insufficient policy enforcement in Content Security Policy Low N/A No No SFB
CVE-2022-3057 * Chromium: CVE-2022-3057 Inappropriate implementation in iframe Sandbox Low N/A No No EoP
CVE-2022-3058 * Chromium: CVE-2022-3058 Use after free in Sign-In Flow Low N/A No No RCE

* Indicates this CVE had previously been assigned by a 3rd-party and is now being incorporated into Microsoft products.

Checking the remaining Critical-rated updates, there are two for Windows Internet Key Exchange (IKE) Protocol Extensions that could also be classified as “wormable.” For both bugs, only systems running IPSec are affected. There are also two Critical-rated vulnerabilities in Dynamics 365 (On-Premises) that could allow an authenticated user to perform SQL injection attacks and execute commands as db_owner within their Dynamics 356 database.

Moving on to other code execution bugs, more than half of this month’s release involves some form of remote code execution. Of these, the patches for SharePoint stand out. Microsoft recently detailed how a SharePoint bug was used by Iranian threat actors against the Albanian government, resulting in Albania breaking off diplomatic relations with Iran. Those attacks involved a SharePoint bug we had previously blogged about. These new SharePoint cases do require authentication, but they sound very similar to other SharePoint bugs that came through the ZDI program. There are six RCE bugs in OLE DB Provider for SQL Server, but they require user interaction. A threat actor would need a user on an affected system to connect to a malicious SQL server via OLEDB, which could result in the target server receiving a malicious packet, resulting in code execution. There are five RCE bugs in the ODBC driver that also require user interaction. For these, opening a malicious MDB in Access would get code execution, similar to the other open-and-own bugs in Office components. The bug in LDAP also requires user interaction, but no other information about the exploit scenario is given.

The bug in the Enterprise App Management component requires authentication, but it’s still intriguing. An attacker could use the vulnerability to install arbitrary SYSTEM services that would then run with SYSTEM privileges. I could definitely see this bug being used after an initial breach for lateral movement and to maintain a presence on a target network. The RPC bug also looks interesting, but it’s likely not as practical since an attacker would need to spoof the localhost IP address of the target. There’s an RCE bug in .NET, but no information besides the requirement for user interaction is given. Finally, there are updates for the AV1 video extension and the Raw image extension. Both updates are delivered automatically through the Microsoft store. If you’re in a disconnected environment, you’ll need to apply these updates manually.

There are a total of 19 elevation of privilege (EoP) fixes in this month’s release, including the aforementioned patch for CLFS. Many of these require an authenticated user to run specially crafted code on an affected system. The bug in Windows Defender for Mac fits this description, as do the kernel-related patches. However, there are a couple of interesting bugs that don’t fit this profile. The first of these is a bug in the Credential Roaming Service that could allow attackers to gain remote interactive logon rights on a machine. There are two bugs in Kerberos that could lead to SYSTEM, but both have many caveats, so exploitation is unlikely. The EoP in Azure Guest Configuration and Arc-Enabled servers is fascinating for multiple reasons. A threat actor could use this vulnerability to replace Microsoft-shipped code with their own code, which would then be run as root in the context of a Guest Configuration daemon. On an Azure Arc-enabled server, it could run in the context of the GC Arc Service or Extension Service daemons. While this is interesting on its own, the mere fact that Microsoft is producing patches for the Linux kernel boggles the mind. And, of course, it wouldn’t be a monthly update if it didn’t include a patch for the print spooler.

The September release includes six patches for information disclosure vulnerabilities. For the most part, these only result in leaks consisting of unspecified memory contents. One exception is the bug impacting the Data Protection Application Programming Interface (DPAPI). If you aren’t familiar with it, DPAPI allows you to encrypt data using information from the current user account or computer. The bug patched this month could allow an attacker to view the DPAPI master key. The vulnerability in the Windows graphics component could leak metafile memory values, although it’s not clear what an attacker could do with this information.

Seven different DoS vulnerabilities are patched this month, including the DNS bug previously mentioned above. Two bugs in secure channel would allow an attacker to crash a TLS by sending specially crafted packets. There’s a DoS in IKE, but unlike the code execution bugs listed above, no IPSec requirements are listed here. If you’re running newer OSes with the latest features, don’t miss the fix for an HTTP DoS. The system needs HTTP/3 enabled and the server using buffered I/O to be affected. HTTP/3 is a new feature in Windows Server 2022, so in this rare instance, older is better.

The September release includes a fix for a lone security feature bypass in Network Device Enrollment (NDES) Service. An attacker could bypass the service’s cryptographic service provider.

The Low-rated bug is a sandbox escape in Microsoft Edge (Chromium-based) that requires user interaction. However, the CVSS for this bug is 7.7, which Mitre classifies as “High.” Microsoft claims the user interaction involved justifies the Low rating, but I would still treat this as an important update and not delay the rollout.

No new advisories were released this month. The latest servicing stack updates can be found in the revised ADV990001.

Looking Ahead

The next Patch Tuesday falls on October 11, and we’ll return with details and patch analysis then. Don’t forget - I’ll be premiering the Patch Report webcast tomorrow on our YouTube channel at 9:00 am Central time. I hope you’re able to tune in and check it out. Until then, stay safe, happy patching, and may all your reboots be smooth and clean!

Riding the InfoRail to Exploit Ivanti Avalanche – Part 2

In my first blog post covering bugs in Ivanti Avalanche, I covered how I reversed the Avalanche custom InfoRail protocol, which allowed me to communicate with multiple services deployed within this product. This allowed me to find multiple vulnerabilities in the popular mobile device management (MDM) tool. If you aren’t familiar with it, Ivanti Avalanche allows enterprises to manage mobile device and supply chain mobility solutions. That’s why the bugs discussed here could be used by threat actors to disrupt centrally managed Android, iOS and Windows devices. To refresh your memory, the following vulnerabilities were presented in the previous post:

 ·       Five XStream insecure deserialization issues, where deserialization was performed on the level of message handling.
·       A race condition leading to authentication bypass, wherein I abused a weakness in the protocol and the communication between services.

This post is a continuation of that research. By understanding the expanded attack surface exposed by the InfoRail protocol, I was able to discover an additional 20 critical and high severity vulnerabilities. This blog post takes a detailed look at three of my favorite vulnerabilities, two of which have a rating of CVSS 9.8:

·       CVE-2022-36971 – Insecure deserialization.
·       CVE-2021-42133 – Arbitrary file write/read through the SMB server.
·       CVE-2022-36981 – Path traversal, delivered with a fun authentication bypass.

Each of these three vulnerabilities leads to remote code execution as SYSTEM.

CVE-2022-36971: A Tricky Insecure Deserialization

I discovered the first vulnerability when I came across an interesting class named JwtTokenUtility, which defines a non-default constructor that could be a potential target:

At [1], the function base64-decodes one of the arguments.

At [2], it checks if the publicOnly argument is true.

If not, it deserializes the base64 decoded argument at [3].

This looks like a possible insecure deserialization sink. In addition, it is invoked from many locations within the codebase. The following screenshot illustrates several instances where it is invoked with the first argument set to false:

Figure 1 - Example invocations of JwtTokenUtility non-default constructor

It turned out that most of these potential vectors require control over the SQL database. The serialized object is retrieved from the database, and I found no direct way to modify this value. Luckily, there are two services with a more direct attack vector: the Printer Device Server and the Smart Device Server. The exploitation of both services is almost identical. We will focus on the Printer Device Server (PDS).

Let’s have a look at the PDS AmcConfigDirector.createAccessTokenGenerator method:

At [1], it uses acctApi.getGlobal to retrieve an object that implements IGlobal.

At [2], it retrieves the pkk string by calling global.getAccessKeyPair.

At [3], it decrypts the pkk string by calling PasswordUtils.decryptPassword. We are not going to analyze this decryption routine. This decryption function implements a fixed algorithm with a hardcoded key, thus the attacker can easily perform the encryption or decryption on their own.

At [4], it invokes the vulnerable JwtTokenUtility constructor, passing the pkk string as an argument.

At this point, we are aware that there is potential for abusing the non-default JwtTokenUtility constructor. However, we are missing two things:

       -- How can we control the pkk string?
       -- How can we reach createAccessTokenGenerator?

Let’s start with the control of the pkk string.

Controlling the value of pkk

To begin, we know that:

       -- The code retrieves an object to assign to the global variable. This object implements IGlobal.
       -- It calls the global.getAccessKeyPair getter to retrieve pkk.

There is a Global class that appears to control the PDS service global settings. It implements the IGlobal interface and both the getter and the setter for the accessKeyPair member, so this is probably the class we’re looking for.

Next, we must look for corresponding setAccessKeyPair setter calls. Such a call can be found in the AmcConfigDirector.processServerProfile method.

At [1], processServerProfile accepts the config argument, which is of type PrinterAgentConfig.

At [2], it retrieves a list of PropertyPayload objects by calling config.getPayload.

At [3], the code iterates over the list of PropertyPayload objects.

At [4], there is a switch statement based on the property.name field.

At [5], the code checks to see if property.name is equal to the string "webfs.ac.ppk".

If so, it calls setAccessKeyPair at [6].

So, the AmcConfigDirector.processServerProfile method can be used to control the pkk value. Finally, we note that this method can be invoked remotely through a ServerConfigHandler InfoRail message:

At [1], we see that this message type can be accessed through the subcategory 1000000 (see first blog post - Message Processing and Subcategories section).

At [2], the main processMessage method is defined. It will be called during message handling.

At [3], the code retrieves the message payload.

At [4], it calls the second processMessage method.

At [5], it deserializes the payload and casts it to the PrinterAgentConfig type.

At [6], it calls processServerProfile and provides the deserialized config object as an argument.

Success! We can now deliver our own configuration through the ServerConfigHandler method of the PDS server. This method can be invoked through the InfoRail protocol. Next, we need to get familiar with the PrinterAgentConfig class to prepare the appropriate serialized object.

It has a member called payload, which is of type List.

PropertyPayload has two members that are interesting for us: name and value. Recall that the processServerProfile method does the following:

       -- Iterates through the list of PropertyPayload objects with a for loop.
       -- Executes switch statement based on PropertyPayload.name.
       -- Sets values based on PropertyPayload.value.

With this in mind, we can understand how to deliver a serialized object and control the pkk variable. We have to prepare an appropriate gadget (we can use the Ysoserial C3P0 or CommonsBeanutils1 gadgets), encrypt it (decryption will be handled by the PasswordUtils.decryptPassword method) and deliver through the InfoRail protocol.

The properties of the InfoRail message should be as follows:

       -- Message subcategory: 1000000.
       -- InfoRail distribution list address: 255.3.5.15 (PDS server).

Here is an example payload:

The first step of the exploitation is completed. Next, we must find a way to call the createAccessTokenGenerator function.

Triggering the Deserialization

Because the full flow that leads to the invocation of createAccessTokenGenerator is extensive, I will omit some of the more tedious details. We will instead focus on the InfoRail message that allows us to trigger the deserialization via the needFullConfigSync function. Be aware that the PDS server frequently performs synchronization operations, but typically does not perform a synchronization of the full configuration. By calling needFullConfigSync, a full synchronization will be performed, leading to execution of doPostDeploymentCleanup:

At [1], the code invokes our target method, createAccessTokenGenerator.

The following snippet presents the NotificationHandler message, which calls the needFullConfigSync method:

At [1], the message subcategory is defined as 2200.

At [2], the main processMessage method is defined.

At [3], the payload is deserialized and casted to the NotifyUpdate type (variable nu).

At [4], the code iterates through the entries of the NotifyUpdateEntry object that was obtained from nu.getEntries.

At [5], [6], and [7], the code checks to see if entry.ObjectType is equal to 61, 64, or 59.

If one of the conditions is true, the code sets the universalDeployment variable to true value at [8], [9], or [10], so that needFullConfigSync will be called at [11].

The last step is to create an appropriate serialized message object. An example payload is presented below. Here, the objectType field is equal to 61.

The attacker must send this payload through a message with the following properties:

-- Message subcategory: 2200.
-- InfoRail distribution list address: 255.3.5.15 (PDS server).

To summarize, we must send two different InfoRail messages to exploit this deserialization issue. The first message is to invoke ServerConfigHandler, which delivers a serialized pkk string. The second message is to invoke NotificationHandler, to trigger the insecure deserialization of the pkk value. The final result is a nice pre-auth remote code execution as SYSTEM.

CVE-2021-42133: One Vuln to Rule Them All - Arbitrary File Read and Write

Ivanti Avalanche has a File Store functionality, which can be used to upload files of various types. This functionality has been already abused in the past, in CVE-2021-42125, where an administrative user could:

-- Use the web application to change the location of the File Storage and point it to the web root.
-- Upload a file of any extension, such as a JSP webshell, through the web-based functionality.
-- Use the webshell to get code execution.

The File Store configuration operations are performed through the Enterprise Server, and they can be invoked through InfoRail messages. I quickly discovered three interesting properties of the File Store:

  1. It supports Samba shares. Therefore, it is possible to connect it to any reachable SMB server. The attacker can set the File Store path pointing to his server by specifying a UNC path.
  2. Whenever the File Store path is changed, Avalanche copies all files from the previous File Store directory to the new one. If a file of the same name already exists in the new location, it will be overwritten.
  3. The File Store path can also be set to any location in the local file system.

These properties allow an attacker to freely exchange files between their SMB server and the Ivanti Avalanche local file system. In order to modify the File Store configuration, the attacker needs to send a SetFileStoreConfig message:

At [1], the subcategory is defined as 1501.

At [2], the standard Enterprise Server processMessage method is defined. The implementation of message processing is a little bit different in the Enterprise Server, although the underlying idea is the same as in previous examples.

At [3] and [4], the method saves the new configuration values.

The only thing that we must know about the saveConfig method is that it overwrites all the properties with the new ones provided in the serialized payload. Moreover, some of the properties, such as the username and password for the SMB share, are encrypted in the same manner as in the previously described deserialization vulnerability.

To sum up this part, we must send an InfoRail message with the following properties:

--Message subcategory: 1501.
--Message distribution list: 255.3.2.5 (Enterprise Server).

Below is a fragment of an example payload, which sets the File Store path to an attacker-controlled SMB server:

Arbitrary File Read Scenario

The whole Arbitrary File Read scenario can be summarized in the following picture:

Figure 2 - Example scenario for the Arbitrary File Read exploitation

  1. The attacker points the File Store to a non-existent SMB share. This step is optional, but makes the exploit cleaner by ensuring that files from the current File Store location will not be copied to the location where the attacker wants to retrieve the files.
  2. The attacker points the File Store to a desired local file system path from which he wants to disclose files.
  3. The attacker points the File Store to his SMB share.
  4. Files from the previous File Store path (local file system) are transferred to the attacker’s SMB share.

The following screenshot presents an example exploitation of this scenario:

Figure 3 - Exploitation of the Arbitrary File Read scenario

As shown, the exploit is targeting the main Ivanti Avalanche directory: C:\Program Files\Wavelink\Avalanche.

The following screenshot presents the exploitation results. Files from the Avalanche main directory were gradually copied to the attacker’s server:

Figure 4 - Exploitation of the Arbitrary File Read scenario - results

Arbitrary File Write Scenario

The following screenshot presents the Arbitrary File Write scenario:

Figure 5 - Example scenario for the Arbitrary File Write scenario

  1. The attacker creates an SMB share that contains the JSP webshell.
  2. The attacker points the File Store to a non-existent SMB share. This step is optional, but makes the exploit cleaner by ensuring that files from the current File Store location will not be copied to the Avalanche webroot.
  3. The attacker points the File Store to the SMB share containing the webshell file.
  4. The attacker points the File Store to the Ivanti Avalanche webroot directory.
  5. Avalanche copies the webshell to the webroot.
  6. The attacker executes code through the webshell.

The following screenshot presents an example exploitation attempt. It uploads a file named poc-upload.jsp to C:\Program Files\Wavelink\Avalanche\Web\webapps\ROOT:

Figure 6 - Exploitation of the Arbitrary File Write scenario

Finally, one can use the uploaded webshell to execute arbitrary commands.

Figure 7 - Executing arbitrary code via the webshell

CVE-2022-36981: Path Traversal in File Upload, Plus Authentication Bypass

We made it to the final vulnerability we will discuss today. This time, we will exploit a path traversal vulnerability in the Avalanche Smart Device Server, which listens on port TCP 8888 by default. However, InfoRail will play a role during the authentication bypass that allows us to reach the vulnerable code.

Path Traversal in File Upload

Our analysis begins with examining the uploadFile method.

At [1], the endpoint path is defined. The path contains two arguments: uuid and targetbasename.

At [2], the doUploadFile method is called.

Let’s start with the second part of the doUploadFile method, as I want to save the authentication analysis for later in this section.

At [1], the uploadPath string is obtained by calling getUploadFilePath. This method accepts two controllable input arguments: uuid and baseFileName.

At [2], the method instantiates a File object based on uploadPath.

At [3], the method invokes writeToFile, passing the attacker-controlled input stream together with the File object.

We will now analyze the crucial getUploadFilePath method, as this is the method that composes the destination path.

At [1], it constructs deviceRoot as an object of type File. The parameters passed to the constructor are the hardcoded path obtained from getCachePath() and the attacker-controllable uuid value. As shown above, uuid is not subjected to any validation, so we can perform path traversal here.

At [2], the code verifies that the deviceRoot directory exists. From here we see that uuid is intended to specify a directory. If the directory does not exist, the code creates it at [3].

At [4], it validates the attacker-controlled baseFileName against a regular expression. If the validation fails, baseFileName is reassigned at [5].

At [6], it creates a new filename fn, based on the current datetime, an integer value, and baseFileName.

At [7], it instantiates a new object of type File. The path for this File object is composed from uuid and fn.

After ensuring that the file does not already exist, the file path is returned at [8].

After analyzing this method, we can draw two conclusions:

       -- The uuid parameter is not validated to guard against path traversal sequences. An attacker can use this to escape to a different directory.
       -- The extension of baseFileName is not validated. An attacker can use this to upload a file with any extension, though the filename will be prepended with a datetime and an integer.

Ultimately, when doUploadFile calls writeToFile, it will create a new file with this name and write the attacker-controlled input stream to the file. This makes it seem that we can exploit this as a path traversal vulnerability and write an arbitrary file to the filesystem. However, there are two major obstacles that will be presented in the next section.

Authentication and Additional UUID Verification

Now that we’ve covered the second part, let’s go back and analyze the first part of the doUploadFile method.

At [1], the code retrieves the mysterious testFlags.

At [2], it validates the length of the uuid, to ensure it is at least 5 characters long.

At [3], it performs an authorization check (perhaps better thought of as an authentication check) by calling isAuthorized. This method accepts uuid, credentials (authorization), and testFlags.

At [4], the code retrieves the deviceId based on the provided uuid.

At [5], the code checks to see if any device was retrieved. If not, it checks for a specific value in testFlags at [6]. If this second check is also not successful, the code raises an exception.

At [7], it calls allowUpload to perform one additional check. However, this final check has nothing to do with validating uuid. It only verifies the amount of available disk space, and this should not pose any difficulties for us.

We can spot two potential roadblocks:

       -- There is an authentication check.
       -- There is a check on the value of uuid, in that it must map to a known deviceId. However, we can bypass this check if we could get control over testFlags. If testFlags & 0x100 is not equal to 0, the exception will not be thrown, and execution will proceed.

Let’s analyze the most important fragments of the isAuthorized method:

At [1], the method retrieves enrollmentId, found within the token submitted by the requester.

At [2], it tries to retrieve the enrollment object from the database, based on enrollmentId.

At [3], it checks to see if enrollment was retrieved.

Supposing that enrollment was not retrieved successfully, the code checks for a particular value in testFlags at [4]. If not, it will return false at [6]. But if the relevant value is found in testFlags, the authentication routine will return true at [5], even though the requester’s authorization token did not contain a valid enrollmentId.

Note that this method also checks an enrollment password, although that part is not important for our purposes.

Here as well, testFlags can also be used to bypass the relevant check. Hence, if we can control testFlags, neither the authentication nor the uuid validation will cause any further trouble for us.

Here is where InfoRail comes into play. It turns out that the Smart Device Server AgentTaskHandler message can be used to modify testFlags:

At [1], it retrieves flagsToSet from the sds.modflags.set property.

At [2], it obtains the Config Directory API interface.

At [3], it uses flagsToSet to calculate the new flags value.

At [4], it saves the new flag value.

To sum up, an attacker can control testFlags, and use this to bypass both the authentication check and the uuid check.

Exploitation

Exploitation includes two steps.

1) Set testFlags to bypass the authentication and the uuid check.

To modify the testFlags, the attacker must send an InfoRail message with the following parameters:

       -- Message subcategory: 2500.
       -- Distribution list: 255.3.2.17 (SDS server).
       -- Payload:

2) Exploit the path Traversal through a web request

The path traversal can be exploited with an HTTP Request, as in the following example:

The response will return the name of the uploaded webshell:

Finally, an attacker can use the uploaded JSP webshell for remote code execution as SYSTEM.

Figure 8 - Remote Code Execution with the uploaded webshell

Conclusion

I really hope that you were able to make it through this blog post, as I was not able to describe those issues with a smaller number of details (believe me, I have tried). As you can see, undiscovered attack surfaces can lead to both cool and dangerous vulnerabilities. It is something that you must look for, especially in products that are responsible for the administration of many other devices.

This blog post is the last in this series of articles on Ivanti Avalanche research. However, I am planning something new, and yes, it concerns Java deserialization. Until then, you can follow me @chudypb and follow the team on Twitter or Instagram for the latest in exploit techniques and security patches.