Securing data and communication with RSA across Android app and Python server


I took my Algorithms class at USC with Prof. Leonard Adleman. Unsurprisingly, one of the biggest topics that we focused on in that class was RSA encryption. “Generate a pair of keys – you can encrypt by taking this exponent and mod that, decrypt by taking that exponent and mod this.” It all sounded too simple.

A few semesters later, I find myself needing to use a public-key encryption scheme. I’m trying to implement Spotify support for the playlist app I’ve been working on, QCast, for which we need to ask the user to input their Spotify credentials. Since a Spotify account involves some really sensitive details, and we might want to store the credentials locally on the device (so that we can offer to log the user in automatically), we thought we’d be a little smarter than just having plaintext usernames and passwords around. Vinnie argues: “https is enough!”; I’d generally agree for communication (though a quick Google search seems to suggest otherwise), but definitely not for storage.

In fact, not just any encryption will do: it has to be asymmetric, so that even if the key used to encrypt is public and known to everyone, the encrypted data cannot be decrypted without the private key.

So I thought I’d implement RSA, thinking it can’t be that hard… right? What, import RSA, RSA.encrypt(key, message)? Hah, so I naively thought. Turns out, there are different ways of generating the keys, and different ways of encrypting and decrypting. For the cherry on top, Java and Python use different ways of naming and setting things up. As a result, even though the final solution that we have isn’t too complicated, we wasted a ton of time researching the right way to do things. With this short guide, I hope to save ourselves and other people the trouble of doing all that research again in the future.

Goal

  • Generate a pair of RSA keys
  • Encrypt using the public key in Android with the javax.crypto package (though it can be easily adapted to any Java project).
  • Decrypt using the private key in Python with PyCrypto*.

* M2Crypto is the hip alternative, but is not supported by Google App Engine.

Generating Keys

To generate the keys, we use openssl, a widely used security toolkit.

First, let’s generate a private key by running openssl genrsa -out private_key.pem 2048.

This command will do the trick, outputting the generated key to the file private_key.pem. 2048 is the length of the key in bits; the length is actually very important – turns out, you cannot encrypt a string that is longer than the size of the key.

From this private key, we can generate a public key by running openssl rsa -in private_key.pem -pubout -outform DER -out public_key.der.

This command takes in the private key from the file private_key.pem, generates a public key (with the -pubout flag) using the format der, and writes it to the file public_key.der. We encode the public key in der format and not pem format because Java wants the raw binary encoding, and a pem file is just the der bytes base64-encoded and wrapped in header and footer lines (that’s pretty much the only difference, anyway – a good StackOverflow answer here on the different key formats).

Python, on the other hand, is much more robust and can support the pem encoding just fine.

In your Android app, create a folder named raw inside [modulename]/src/main/res/, and put the public key there. It’s okay if the public key is “leaked” (hence the name public) – the encrypted data cannot be decrypted with the public key. In your Python app, I would create a folder called ‘resources’ in the project root and put the private key there.

Encrypting Messages in Java

(Special thanks to David for his help in this part)

Before we start encrypting, we first need to read from the public key file, and generate a usable PublicKey object.
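Here is a minimal sketch of that loading step, not necessarily the exact code we shipped. The class and method names (PublicKeyLoader, loadPublicKey) are made up for illustration, and it assumes the stream points at the der file generated earlier. On Android, you would typically get that stream from getResources().openRawResource(R.raw.public_key), assuming the file is saved as res/raw/public_key.der.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.PublicKey;
import java.security.spec.X509EncodedKeySpec;

class PublicKeyLoader {
    /** Reads every byte from the stream (for example, one opened on public_key.der). */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    /** Turns the raw der bytes into a PublicKey object that Cipher can use. */
    static PublicKey loadPublicKey(InputStream derStream)
            throws IOException, GeneralSecurityException {
        byte[] der = readAll(derStream);
        // "openssl rsa -pubout -outform DER" writes an X.509 SubjectPublicKeyInfo
        // structure, which is the encoding X509EncodedKeySpec expects.
        // (PKCS8EncodedKeySpec is the counterpart for private keys.)
        X509EncodedKeySpec spec = new X509EncodedKeySpec(der);
        return KeyFactory.getInstance("RSA").generatePublic(spec);
    }
}
```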

(The comment above X509EncodedKeySpec ... highlights my frustration – it is rather hard to get a clear answer about the different formats and encodings of keys, and I had to settle for a lot of trial and error.)

So now that we have the key ready to be used, we can set up the “encryption engine” and feed the key to it.

This part seems more straightforward (except for the weird syntax that Cipher uses). One line, in particular, deserves a little more attention, though.

Cipher.getInstance("RSA") – There are actually multiple ways of using RSA, including different kinds of “padding” – it works kind of like a salt in a hash. This StackExchange post really helped me understand this concept:

The operation at the core of RSA is a modular exponentiation: given input m, compute m^e modulo n. Although in general this is a one-way permutation of integers modulo n, it does not fulfill all the characteristics needed for generic asymmetric encryption:

  • If e is small and m is small, then m^e could be smaller than n, at which point the modular exponentiation is no longer modular, and can be reverted efficiently.
  • The operation is deterministic, which allows for exhaustive search on the message: the attacker encrypts possible messages until a match is found with the actual encrypted message.
  • The modular exponentiation is malleable: given the “encryption” of m1 and m2, a simple multiplication yields the encryption of m1m2. This is akin to homomorphic encryption, which can be a good property, or not, depending on the context.

For these reasons, the integer m which is subject to RSA must not be the data to encrypt alone, but should be the result of a transform which ensures that m is “not small”, contains some random bytes, and deters malleability.
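To make those three points concrete, here is a toy sketch using BigInteger. The tiny keys (n = 3233 with e = 17 and d = 2753, and n = 55 with e = 3) are textbook-sized numbers I picked purely for illustration; nothing here should be used for real encryption.

```java
import java.math.BigInteger;

class TextbookRsaDemo {
    // Toy key: p = 61, q = 53, so n = 3233; e = 17 and d = 2753 are inverses.
    static final BigInteger N = BigInteger.valueOf(3233);
    static final BigInteger E = BigInteger.valueOf(17);
    static final BigInteger D = BigInteger.valueOf(2753);

    /** Raw ("textbook") RSA: c = m^e mod n, with no padding at all. */
    static BigInteger rawEncrypt(BigInteger m) {
        return m.modPow(E, N);
    }

    /** Raw RSA decryption: m = c^d mod n. */
    static BigInteger rawDecrypt(BigInteger c) {
        return c.modPow(D, N);
    }

    public static void main(String[] args) {
        BigInteger m1 = BigInteger.valueOf(5);
        BigInteger m2 = BigInteger.valueOf(7);

        // Deterministic: the same message always gives the same ciphertext,
        // so an attacker can encrypt guesses and compare.
        System.out.println(rawEncrypt(m1).equals(rawEncrypt(m1)));

        // Malleable: multiplying two ciphertexts yields the encryption of m1 * m2.
        BigInteger product = rawEncrypt(m1).multiply(rawEncrypt(m2)).mod(N);
        System.out.println(rawDecrypt(product)); // 35

        // Small e and small m: with n = 55 and e = 3, the message 3 encrypts to
        // 3^3 = 27, which never wraps around n, so a plain cube root recovers it.
        BigInteger three = BigInteger.valueOf(3);
        BigInteger c = three.modPow(three, BigInteger.valueOf(55));
        System.out.println(c.equals(three.pow(3)));
    }
}
```

Padding schemes exist precisely so that none of these tricks work on real messages.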

So while getInstance("RSA") can compute the RSA encryption, the string parameter can actually carry more options, written in the form "algorithm/mode/padding":

  1. Encryption algorithm to use – in our case, RSA.
  2. Mode, meaning the block-cipher mode of operation (ECB, CBC and so on) – it doesn’t seem to have an effect on RSA encryption, so we’ll just set that to NONE. (I’d love to be corrected.) Note that the provider that ships with the desktop JDK only accepts ECB in this slot for RSA, even though no block chaining actually happens.
  3. Padding standard. Some brief research tells me that the one that should be used these days is OAEP padding, with some hash function (usually SHA or MD5, by default SHA1) and a mask generation function (by default MGF1).

Therefore, you might want to replace that string parameter with "RSA/NONE/OAEPWithSHA1AndMGF1Padding". Spelling it out also avoids relying on the provider’s default padding, which is not the same everywhere. (Why they decided to cram the parameters into a string like this really baffles me.)

The base64-encoded string is then stored in the variable encrypted, ready to be sent to the server for decryption.
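Putting this section together, a minimal sketch of the encryption step could look like the following. The class and method names are hypothetical. I use "ECB" as the mode name here so that the snippet also runs on a desktop JVM; on Android, the "NONE" spelling above should behave the same. java.util.Base64 needs Java 8 or Android API 26; on older Android versions, android.util.Base64 is the usual substitute.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.PublicKey;
import java.util.Base64;
import javax.crypto.Cipher;

class RsaEncryptor {
    // RSA with OAEP padding, SHA-1 as the hash and MGF1 as the mask function.
    static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-1AndMGF1Padding";

    /** Encrypts a short message with the public key and base64-encodes the result. */
    static String encryptToBase64(PublicKey publicKey, String message)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, publicKey);
        byte[] cipherBytes = cipher.doFinal(message.getBytes(StandardCharsets.UTF_8));
        // Base64 makes the binary ciphertext safe to put in JSON or a form field.
        return Base64.getEncoder().encodeToString(cipherBytes);
    }
}
```

Because OAEP mixes in random bytes, encrypting the same message twice gives two different strings; both decrypt to the same plaintext.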

Decrypting Messages in Python

Over on the server side, we need to do something similar. Namely, read from the key file, set it up with the correct padding configuration, and decrypt. Fortunately, the code to do this is a lot simpler in Python.

First, make sure your dependencies are set up correctly. In the app.yaml of your Google App Engine project, list pycrypto under libraries so that the copy bundled with Google App Engine is used (or run (sudo) pip install pycrypto for other projects).

Setting up in Python is actually extremely easy. The modules we need are RSA, from the Crypto.PublicKey package, and PKCS1_OAEP, from the Crypto.Cipher package. Read the private key file, pass its contents to RSA.importKey, and wrap the resulting key with PKCS1_OAEP.new(rsakey).

PKCS1_OAEP.new actually takes in multiple parameters; but since we decided to use the default OAEP settings (SHA1 and MGF1), nothing needs to be set explicitly. However, if you do change it, you might find this reference guide useful. For example, to match the Java configuration of "RSA/NONE/OAEPWithSHA-512AndMGF1Padding", you would want to change to PKCS1_OAEP.new(rsakey, SHA512), with SHA512 imported from Crypto.Hash.
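One caveat on the Java side when you move away from the defaults: the two ends have to agree on both the OAEP hash and the hash used inside MGF1. PyCrypto uses the hash you pass for MGF1 as well, while some Java providers, as far as I can tell, keep SHA-1 for MGF1 when only the name string is given. A hedged way around the guesswork is to pin both explicitly with OAEPParameterSpec. The class name below is made up for illustration.

```java
import java.security.GeneralSecurityException;
import java.security.PublicKey;
import java.security.spec.MGF1ParameterSpec;
import javax.crypto.Cipher;
import javax.crypto.spec.OAEPParameterSpec;
import javax.crypto.spec.PSource;

class ExplicitOaep {
    // SHA-512 for the OAEP hash AND for MGF1, with an empty label.
    static final OAEPParameterSpec SHA512_OAEP = new OAEPParameterSpec(
            "SHA-512", "MGF1", MGF1ParameterSpec.SHA512, PSource.PSpecified.DEFAULT);

    /** Encrypts with RSA-OAEP using the explicitly pinned parameters above. */
    static byte[] encrypt(PublicKey publicKey, byte[] plaintext)
            throws GeneralSecurityException {
        // "OAEPPadding" leaves the hash choices to the parameter spec.
        Cipher cipher = Cipher.getInstance("RSA/ECB/OAEPPadding");
        cipher.init(Cipher.ENCRYPT_MODE, publicKey, SHA512_OAEP);
        return cipher.doFinal(plaintext);
    }
}
```

With a 2048-bit key, SHA-512 leaves room for only 126 bytes of message, so this trade-off is worth keeping in mind.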

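If you decide not to use OAEP padding at all, then PKCS1_OAEP.new(rsakey) is completely unnecessary. Just use what RSA.importKey returns and that’ll do the trick. Keep in mind that this is raw, unpadded RSA, with the weaknesses described in the quote above.

Afterwards, we can decrypt very easily: base64-decode the received string and pass the resulting bytes to the cipher’s decrypt method.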

Ta da! The decrypted string is now in the decrypted variable.

Limitations

As mentioned above, the amount of data we can use with RSA depends on the length of the key that we chose. If we generated our private key with a length of 2048 bits, the message cannot be more than 256 bytes, minus the padding overhead. With OAEP and SHA1, that overhead is 2 × 20 + 2 = 42 bytes, leaving 214 bytes for the message.

The most common solution I’ve found for encrypting long strings is a hybrid one, combining a symmetric encryption scheme (like AES) with RSA:
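A quick way to see the limit in action is the sketch below. It assumes OAEP padding, where the largest message is the key size in bytes minus twice the hash size minus 2; the class and method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import javax.crypto.Cipher;

class OaepSizeLimit {
    /** Largest plaintext, in bytes, that RSA-OAEP accepts: k - 2 * hLen - 2. */
    static int maxPlaintextBytes(int keyBits, int hashBytes) {
        return keyBits / 8 - 2 * hashBytes - 2;
    }

    /** Returns true if a plaintext of the given length encrypts; false on any failure. */
    static boolean fits(KeyPair pair, int length) {
        try {
            Cipher cipher = Cipher.getInstance("RSA/ECB/OAEPWithSHA-1AndMGF1Padding");
            cipher.init(Cipher.ENCRYPT_MODE, pair.getPublic());
            cipher.doFinal(new byte[length]);
            return true;
        } catch (GeneralSecurityException | RuntimeException e) {
            // Providers differ in which exception they raise for oversized input.
            return false;
        }
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair pair = gen.generateKeyPair();
        int limit = maxPlaintextBytes(2048, 20); // SHA-1 digests are 20 bytes
        System.out.println(limit + " bytes fit: " + fits(pair, limit));
        System.out.println((limit + 1) + " bytes fit: " + fits(pair, limit + 1));
    }
}
```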

  1. Generate symmetric key, K
  2. Encrypt data with K
  3. Encrypt K, which we can guarantee length for, with RSA using the public key
  4. Send encrypted data, and the encrypted key to server
  5. Server decrypts the encrypted key with RSA using the private key, getting original K
  6. Decrypt with AES using K
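The six steps above can be sketched in Java as follows. Both halves are shown in Java only so the round trip can be checked in one place; in our setup the decrypting half would live on the Python server. AES-GCM with a 128-bit key and a 12-byte nonce is my own choice for the sketch, as are the class names, and the server would need an AES implementation that supports the same mode.

```java
import java.security.GeneralSecurityException;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

class HybridEncryption {
    static final String RSA = "RSA/ECB/OAEPWithSHA-1AndMGF1Padding";
    static final String AES = "AES/GCM/NoPadding";

    /** What travels to the server: the RSA-encrypted key, the AES nonce, the AES ciphertext. */
    static class Envelope {
        final byte[] wrappedKey;
        final byte[] iv;
        final byte[] ciphertext;

        Envelope(byte[] wrappedKey, byte[] iv, byte[] ciphertext) {
            this.wrappedKey = wrappedKey;
            this.iv = iv;
            this.ciphertext = ciphertext;
        }
    }

    static Envelope encrypt(PublicKey publicKey, byte[] data) throws GeneralSecurityException {
        // Step 1: generate a fresh symmetric key K.
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128);
        SecretKey k = keyGen.generateKey();

        // Step 2: encrypt the (arbitrarily long) data with K.
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher aes = Cipher.getInstance(AES);
        aes.init(Cipher.ENCRYPT_MODE, k, new GCMParameterSpec(128, iv));
        byte[] ciphertext = aes.doFinal(data);

        // Step 3: encrypt K itself (only 16 bytes) with the RSA public key.
        Cipher rsa = Cipher.getInstance(RSA);
        rsa.init(Cipher.ENCRYPT_MODE, publicKey);
        byte[] wrappedKey = rsa.doFinal(k.getEncoded());

        // Step 4: everything in the envelope is sent to the server.
        return new Envelope(wrappedKey, iv, ciphertext);
    }

    static byte[] decrypt(PrivateKey privateKey, Envelope env) throws GeneralSecurityException {
        // Step 5: recover K with the RSA private key.
        Cipher rsa = Cipher.getInstance(RSA);
        rsa.init(Cipher.DECRYPT_MODE, privateKey);
        SecretKey k = new SecretKeySpec(rsa.doFinal(env.wrappedKey), "AES");

        // Step 6: decrypt the data with AES using K.
        Cipher aes = Cipher.getInstance(AES);
        aes.init(Cipher.DECRYPT_MODE, k, new GCMParameterSpec(128, env.iv));
        return aes.doFinal(env.ciphertext);
    }
}
```

A new K and a new nonce are generated for every message, so the RSA key pair is the only long-lived secret.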

Conclusion

The motivation for writing this was that there didn’t seem to be a simple guide for cryptography beginners who want to implement some security in their app without having to dive into jargon that is way too complicated. I hope this guide served that purpose.

Please help me correct anything incorrectly explained in the post – I’m no expert and would definitely love to learn more.

(We are taking QCast to iOS too, so there might just be a post with encryption in iOS soon.)