I found a quite good article here, which describes the technique used in this exploit. In short, the memory chunks are handled as doubly linked lists. The free function calls unlink, if there are two adjacent free chunk. This can be exploited to modify memory, if we overwrite the allocated chunk sizes with strcpy.
The best way to solve this exploit is setting the address of puts in GOT to the address of winner. Thus instead of the last printf, the winner function will be called. The address of puts in GOT is 0x0804b128 and the address of winner is 0x08048864. However unlink tries to write on both addresses and winner is in the .text section, which is not writable. The heap is however writable, so we should use an address from the heap and we need a short shellcode, which jumps to the winner function. This simple shellcode pushes the address of the winner onto the stack and then executes a RET.
The final solution:
/opt/protostar/bin/heap3 `python -c ‘print “\x90″*4 + “\x68\x64\x88\x04\x08\xC3” + “A”*22 + “\xfc\xff\xff\xff”*2 + “XXXX” + “\x1c\xb1\x04\x08” + “\x0c\xc0\x04\x08″‘` B C