VSzA techblog

In 2004 OASIS created WS-Security, which describes several techniques for securing web services. The simplest is UsernameToken (PDF warning), which can be thought of as the equivalent of HTTP authentication in the SOAP world – the client supplies a username and a password, and latter can be transmitted either in cleartext or in a digested form.

The digest algorithm is quite simple (Base64(SHA-1(nonce + created + password))) and by using a nonce, this protocol can prevent replay attacks, while a timestamp can reduce the memory requirements since nonces can expire after a specified amount of time. A sample envelope can be seen below, I removed the longish URLs for the sake of readability, these can be found in the PDF linked in the previous paragraph. If you're into security, you can try to guess the password based on the username, and then try to verify the digest based on that. ;)

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
 <soap:Header>
  <wsse:Security xmlns:wsse="..." xmlns:wsu="..." soap:mustUnderstand="1">
   <wsse:UsernameToken wsu:Id="UsernameToken-3">
    <wsse:Username>admin</wsse:Username>
    <wsse:Password Type="...#PasswordDigest">fTI7fNcwD69Z3dOT1bYfvSbQPb8=</wsse:Password>
    <wsse:Nonce EncodingType="...#Base64Binary">1DLfpq3fLJ5O8Dlrnr4blQ==</wsse:Nonce>
    <wsu:Created>2011-05-05T17:20:22.319Z</wsu:Created>
   </wsse:UsernameToken>
  </wsse:Security>
 </soap:Header>
 <soap:Body>
  ...
 </soap:Body>
</soap:Envelope>

Python had little support for UsernameToken, SUDS, the preferred web service client mentioned cleartext support in their documentation, so I set up a simple service using Apache CXF and tried to access it. As it turned out, the implementation violated the OASIS standard by not specifying the Type attribute of the Password element, which would've indicated whether the password were transmitted in cleartext or in a digested form.

It was a trivial fix, and while I was there, I also added a standards-compliant digest support, and tested it with Apache CXF. I sent a patch to the SUDS mailing list in May 2011, but got no response ever since, so I have no information if/when this improvement will get into mainline SUDS.

On the server side, things got trickier. The preferred Python web service implementation is soaplib/rpclib, and I did some research, whether it's possible to implement UsernameToken support in it. It turned out, that there's a project called sec-wall which takes this to a whole new level by creating a (reverse) proxy, and this way, security can be detached from the service to another layer, which also satisfies the UNIX part of my mind.

Overview of sec-wall

I started hacking on sec-wall, first with some code cleanup, then I managed to fix up the codebase so that all test passed on Python 2.7, too. After getting myself familiar with the project, I created an environment with Soaplib as a server, sec-wall as the proxy, SUDS as a client, and tried both UsernameToken configurations. It worked pretty well, with minor glitches, such as sec-wall expecting a nonce and creation time even when cleartext password was used. I helped the developer, Dariusz Suchojad fixing the problem, so in the end, I could create a pure Python solution utilizing UsernameToken to secure webservices.

That previous sentence could be a great one to end this post with, so this paragraph is kind of an extra for those who kept on reading. The current WSSE implementation in sec-wall “lets all nonces in”, so I created a class that overrode this implementation using memcached. There are two Python clients for it, so I developed and tested both. Below is the code for python-memcached, which is pure Python, whereas pylibmc uses native code, but mimics the interface of former, so only the second line needs to be changed to switch between the implementations.

from secwall.wsse import WSSE
from memcache import Client

class WSSEmc(WSSE):
  keyfmt = 'WSSEmc_nonce_{0}'

  def __init__(self):
    self.mc = Client(['127.0.0.1:11211'], debug=0)

  def check_nonce(self, wsse_nonce, now, nonce_freshness_time):
    if not nonce_freshness_time:
      return False
    key = self.keyfmt.wsse_nonce
    if self.mc.get(key):
      return True
    self.mc.set(key, '1', time=nonce_freshness_time)
    return False

I hope to publish at least a second part in this subject, focusing on digital signatures in the next two months (it's part of my Masters thesis, which is due December 9, 2011).

As I was hacking around with django-oursql vs f33dme, I started sniffing the network traffic between the Python process and the MySQL server to follow up on a bug in oursql. I found that the following queries (yes, plural!) ran every time a feed item was marked as read.

SELECT `f33dme_item`.`id`, `f33dme_item`.`title`, `f33dme_item`.`content`,
  `f33dme_item`.`url`, `f33dme_item`.`date`, `f33dme_item`.`added`,
  `f33dme_item`.`feed_id`, `f33dme_item`.`score`, `f33dme_item`.`archived`
FROM `f33dme_item` WHERE `f33dme_item`.`id` = ?

SELECT (1) AS `a` FROM `f33dme_item` WHERE `f33dme_item`.`id` = ?  LIMIT 1

UPDATE `f33dme_item` SET `title` = ?, `content` = ?, `url` = ?, `date` = ?,
  `added` = ?, `feed_id` = ?, `score` = ?, `archived` = ?
WHERE `f33dme_item`.`id` = ?

The above queries not only multiply the round-trip overhead by three, but the first and the last ones generate quite a bit of a traffic, by sending the content of all the fields (including the content which might contain a full-blown blog post like this) to and from the ORM, respectively. The innocent-looking lines of code that generated them were the following ones.

item = Item.objects.get(id=item_id)
if not item:
  return HttpResponse('No item found')
item.archived = True
item.save()

By looking at the queries above first, it's pretty clear, that the get method needs to query all the columns, since later code might access any of the fields. The same can be said about the update, which knows nothing about the contents of the database – it even has to check if a row with the ID specified exists to figure out whether to use an INSERT or and UPDATE DML query.

Of course, the developers of the Django ORM met this situation as well, and even documented it along with nice examples in the QuerySet API reference. All I needed was to adapt the code a little bit, and as the documentation states, the update method even “returns the number of affected rows”, so the check for the existence of the item can be preserved. The improved code is the following.

if Item.objects.filter(id=item_id).update(archived=True) != 1:
  return HttpResponse('No item found')

The first line is a bit longish (although still 62 characters only), but replaced four lines of the original code. When read carefully, one might even find it readable, and it produces the following SQL queries in the background.

UPDATE `f33dme_item` SET `archived` = ? WHERE `f33dme_item`.`id` = ?

VSzA techblog

Tags

Contact

Flattr

Member of

Blogroll

Secure web services with Python part 1 - UserNameToken

2011-10-20

Optimizing Django ORM in f33dme

2011-10-19