3.1. Checking Web Page Source Code

 

Let's go through how to analyze the HTML code of the website under test. First, we need a way to view that code. There are two methods.

The first method is the simplest. First, open the page, then right-click anywhere to open the context menu, and select the "View Page Source" option. Alternatively, you can press Ctrl+U. This will open a new tab with the HTML code.

The second method is slightly more involved: open the developer panel by pressing Ctrl+Shift+I, then switch to the "Elements" tab (Chrome) or the "Inspector" tab (Mozilla Firefox):

Developer console in Chrome

Developer console in Firefox

Please pay attention to the <HEAD> and <BODY> tags. The HTML document structure consists of two parts: the head and the body.

The head is delimited by the <HEAD> … </HEAD> tags. It contains metadata and linked styles and scripts, and it is not rendered as part of the page.

The body is defined by the <BODY> … </BODY> tags. This is where the visible content of the page resides.

 

Searching for Comments

Commenting code during development is good practice because it makes the code easier to understand. At this stage, however, developers may leave potentially sensitive information in comments: links to restricted parts of the site, usernames and passwords, software versions, and more. Of course, anything unnecessary should be removed before release, but we're all human and mistakes happen.

HTML comments are enclosed between <!-- and -->. As an example, open the start page of OWASP BWA by entering the machine's IP address in your browser, then press Ctrl+U to view the HTML code:

Commented link to PHPBB2 app in page source code

As you can see, comments are highlighted in green. For some reason, the developer has commented out a table cell containing a link to another vulnerable application, PHPBB2. Anything that is commented out is not displayed to the user in the browser, but the link itself still works. Let's open it:

Screenshot of PHPBB2 app

As we can see, the application is operational and appears to be some kind of forum.
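The manual search above can also be scripted. Here is a minimal sketch using only Python's standard library; the sample page and the /phpBB2/ path in it are made up for illustration and are not copied from OWASP BWA.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collect every HTML comment together with the line it starts on."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # getpos() reports the position where the current comment begins
        line, _column = self.getpos()
        self.comments.append((line, data.strip()))


def find_comments(html):
    """Return a list of (line_number, comment_text) pairs."""
    collector = CommentCollector()
    collector.feed(html)
    collector.close()
    return collector.comments


# Hypothetical page source, used only to demonstrate the function
SAMPLE_PAGE = """<html>
<body>
<table><tr>
<td><a href="/mutillidae/">Mutillidae</a></td>
<!-- <td><a href="/phpBB2/">phpBB2</a></td> -->
</tr></table>
</body>
</html>"""

if __name__ == "__main__":
    for line, text in find_comments(SAMPLE_PAGE):
        print(f"line {line}: {text}")
```

To try it on a real page, save the source shown by Ctrl+U to a file and pass its contents to find_comments().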

For the second example, let's open the BodgeIT application:

The link to open BodgeIt application in OWASP BWA

Commented link to admin page of BodgeIT

Here, the developer has also commented out a link to the administration page, and it seems to be accessible without any password:

The screenshot of BodgeIT admin page

In addition, the style of the comments and certain markers or keywords in the source can reveal which framework the application was built with:

Keyword                        Framework
<!-- START headerTags.cfm      Adobe ColdFusion
__VIEWSTATE                    Microsoft ASP.NET
<!-- ZK                        ZK
<!-- BC_OBNW -->               Business Catalyst
ndxz-studio                    Indexhibit
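The markers listed above are easy to check automatically. The sketch below does a plain substring search over the page source; the function name and the sample markup are assumptions made for illustration, and a match is only a hint, not proof.

```python
# Markers and framework names taken from the list above
FRAMEWORK_MARKERS = {
    "<!-- START headerTags.cfm": "Adobe ColdFusion",
    "__VIEWSTATE": "Microsoft ASP.NET",
    "<!-- ZK": "ZK",
    "<!-- BC_OBNW -->": "Business Catalyst",
    "ndxz-studio": "Indexhibit",
}


def guess_frameworks(html):
    """Return the sorted names of frameworks whose marker occurs in html."""
    found = {name for marker, name in FRAMEWORK_MARKERS.items() if marker in html}
    return sorted(found)


# Hypothetical fragment of page source
SAMPLE = '<form><input type="hidden" name="__VIEWSTATE" value="abc123"></form>'

if __name__ == "__main__":
    print(guess_frameworks(SAMPLE))  # prints ['Microsoft ASP.NET']
```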

I recommend going through the rest of the applications by yourself. I'm sure you'll find something interesting there too.

 

Searching for Comments with Nmap

What do you do if the application contains many pages? Manually viewing and analyzing comments can become a daunting task, and there's an increased chance of missing something.

To make the task easier, you can use Nmap, a network scanner. Besides port scanning, it ships with a collection of scripts, run through the Nmap Scripting Engine (NSE), that automate specific tasks. One of them, http-comments-displayer, extracts comments from the pages it crawls.

To use it, open the terminal and enter the command:

nmap 10.0.2.4 --script=http-comments-displayer

The output of the command is quite extensive as it checks all applications and all pages for comments. Furthermore, the script examines not only HTML code but also JavaScript. You can scroll through the output and look for something interesting, but this is not the best option. Therefore, save the scan result to a file using the following command:

nmap 10.0.2.4 --script=http-comments-displayer -oN /home/kali/comment_output

Here, I've specified the output file name as "comment_output" and saved it in the home directory of the user "kali." You can choose any other location.
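Once the result is in a file, you don't have to read it top to bottom. The following sketch filters any text file for lines that contain suspicious words. The keyword list is my assumption and should be adapted to the target, and the sample lines are invented for the demonstration, they are not real Nmap output.

```python
import re
from pathlib import Path

# Assumed keywords; extend or replace them as needed
KEYWORDS = ("password", "passwd", "admin", "todo", "fixme", "debug")


def interesting_lines(text, keywords=KEYWORDS):
    """Return (line_number, line) pairs for lines that mention a keyword."""
    pattern = re.compile("|".join(re.escape(word) for word in keywords), re.IGNORECASE)
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


def scan_report(path):
    """Apply interesting_lines() to a saved scan result."""
    return interesting_lines(Path(path).read_text(errors="replace"))


# Invented lines, only to show how the filter behaves
SAMPLE = (
    "<!-- TODO: remove test account -->\n"
    "<!-- layout table -->\n"
    "// admin panel moved to /admin.jsp\n"
)

if __name__ == "__main__":
    for number, line in interesting_lines(SAMPLE):
        print(f"{number}: {line}")
```

For a real scan, call scan_report() with the same path you passed to -oN.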

 

Website Metadata

Every website contains some metadata. Part of it is mandatory, the rest is optional.

So, what is metadata, and where can you find it?

Metadata is service information intended for search engines (SEO), browsers, and other programs rather than for the visitor. It is easy to spot: look for <meta attributes /> tags in the document's head, between the HEAD tags.

The following types of metadata are of particular interest:

Example of Metatag Author in HTML code

If a website belongs to an individual, you'll often find a tag with the attribute name="Author". The information in this tag doesn't typically pose a vulnerability, but it can be used as supplementary information. For example, knowing the author's name, you might find their other projects and potentially conduct a different type of attack or gather more interesting information.

 

Example of Metatag Generator in HTML code

In this type of metadata, you typically find the name of the program that generated the content. This could be a Content Management System (CMS) or any other framework. Sometimes, the version of the software is even mentioned. Having such information allows you to search for already published vulnerabilities and even exploits.

 

Example of Metatag refresh in HTML code

This type of metadata was traditionally used for automatic page refresh or redirection to another site.

There are other types of metadata that control cookies, caching, and indexing, but they are not of particular interest. Moreover, modern browsers and search engines like Google and Yandex largely ignore these tags and rely on their own algorithms. Nevertheless, it's still worth checking a site's metadata as an additional source of information.
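Collecting the meta tags of a page can be automated as well. Below is a minimal sketch based on Python's standard html.parser module. It records tags that carry a name or http-equiv attribute and skips the rest; the author name, generator string, and URL in the sample are invented.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect (name, content) pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # A meta tag is identified either by name= or by http-equiv=
        key = attributes.get("name") or attributes.get("http-equiv")
        if key:
            self.meta.append((key.lower(), attributes.get("content") or ""))


def collect_meta(html):
    collector = MetaCollector()
    collector.feed(html)
    collector.close()
    return collector.meta


# Hypothetical head section
SAMPLE_HEAD = """<head>
<meta charset="utf-8">
<meta name="Author" content="Jane Doe">
<meta name="generator" content="WordPress 5.8.1" />
<meta http-equiv="refresh" content="5; url=https://example.com/">
</head>"""

if __name__ == "__main__":
    for name, content in collect_meta(SAMPLE_HEAD):
        print(f"{name}: {content}")
```

The generator entry is usually the first thing to look at, since a product name with a version number can be checked against published vulnerabilities.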

 

Hidden Elements on a Web Page

Sometimes, websites may contain hidden elements that serve a specific purpose. For example, these elements may become visible when clicked or when a certain event occurs. They can be used for identification or information transfer. Occasionally, developers may simply forget to remove unnecessary code.

How do you find such elements? It's quite simple. Here are some tips:

Invisible elements in Firefox are highlighted in gray. To see them, open the developer panel using Ctrl+Shift+I:

Example of invisible content in HTML code

Form input elements with type="hidden" look like this:

Hidden input elements in HTML code

If you press Ctrl+F and enter "hidden," "input," or "form" in the search field, you should easily find these elements if they exist.

An element can also be hidden with CSS. This is done either with an inline style that sets the property display: none, or with classes that control visibility and typically have names like "visible," "invisible," "hide," or "hidden":

CSS properties to hide the content

Of course, developers can come up with entirely different names for CSS classes, so it's essential to review the code carefully.
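The tips above can be combined into one small script. This is a sketch under a few assumptions: it checks only inline styles and class names, not external style sheets; the class list is a guess that covers the common hiding names mentioned above; and the visibility: hidden check and the bare hidden attribute are my additions to the techniques described in this section. The sample form is invented.

```python
import re
from html.parser import HTMLParser

# display: none comes from the text; visibility: hidden is an added check
HIDING_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.IGNORECASE)
# Assumed class names; real sites may use anything
SUSPICIOUS_CLASSES = {"invisible", "hide", "hidden"}


class HiddenElementFinder(HTMLParser):
    """Record elements that look hidden, with the reason for each finding."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        reasons = []
        if tag == "input" and (attributes.get("type") or "").lower() == "hidden":
            reasons.append('type="hidden"')
        if "hidden" in attributes:
            reasons.append("hidden attribute")
        if HIDING_STYLE.search(attributes.get("style") or ""):
            reasons.append("inline style")
        classes = set((attributes.get("class") or "").lower().split())
        if classes & SUSPICIOUS_CLASSES:
            reasons.append("class name")
        if reasons:
            line, _column = self.getpos()
            self.findings.append((line, tag, reasons))


def find_hidden(html):
    finder = HiddenElementFinder()
    finder.feed(html)
    finder.close()
    return finder.findings


# Hypothetical markup
SAMPLE_FORM = """<form action="/basket" method="post">
<input type="hidden" name="price" value="9.99">
<input type="text" name="quantity">
<div style="display: none">debug panel</div>
<p class="note hide">internal note</p>
</form>"""

if __name__ == "__main__":
    for line, tag, reasons in find_hidden(SAMPLE_FORM):
        print(f"line {line}: <{tag}> ({', '.join(reasons)})")
```

Hidden inputs such as the price field in the sample deserve special attention, because the server may trust a value that the user can freely change.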