Friday, November 30, 2012

Converting a Hex String to a Byte Array in Scala

Here's a Scala method that converts a hex string to an array of bytes. This is useful when you need to encrypt hex-encoded data, because most crypto libraries operate on byte arrays.

def hexStringToByteArray(s : String) : Array[Byte] = {
    val len = s.length
    val data = new Array[Byte](len / 2)
    var i = 0

    // Each pair of hex digits becomes one byte
    while (i < len) {
        val b = (Character.digit(s.charAt(i), 16) << 4) +
                (Character.digit(s.charAt(i+1), 16))

        data(i / 2) = b.asInstanceOf[Byte]
        i += 2
    }

    data
}
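For comparison, here's a minimal sketch of a more compact version built on the standard collection methods, along with the reverse conversion. The names HexCodec, hexToBytes, and bytesToHex are mine for illustration, and the sketch assumes the input has an even number of hex digits.

```scala
object HexCodec {
  // Hypothetical helper: split the string into two-digit groups and parse each as a byte.
  def hexToBytes(hex: String): Array[Byte] = {
    require(hex.length % 2 == 0, "hex string must have an even number of digits")
    hex.grouped(2).map(pair => Integer.parseInt(pair, 16).toByte).toArray
  }

  // Hypothetical helper: format each byte as two lowercase hex digits.
  def bytesToHex(bytes: Array[Byte]): String =
    bytes.map(b => f"${b & 0xff}%02x").mkString
}
```

Masking with 0xff before formatting keeps negative byte values from being printed as sign-extended integers.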



Friday, November 23, 2012

Many Time Pad Attack - Crib Drag

The one-time pad (OTP) is a type of stream cipher that offers perfect secrecy. It's very simple to implement, but it is only perfectly secure when the key is random and at least as long as the message. That's its major downfall. It also requires that the key never be used more than once. This tutorial shows what happens when you re-use a key to encrypt more than one message. I also show how to uncover the plain-text of two messages that have been encrypted with the same key, without even knowing the key, using a method called crib dragging.

Let's begin with a brief description of OTP and how it works. Let's take the following message and key:

message = "Hello World"
key = "supersecret"

If we convert both the message and key to hex strings, we get the following:

message = "48656c6c6f20576f726c64"
key = "7375706572736563726574"

If we do a simple XOR of the two hex strings we get the following cipher-text:

cipher-text = "3b101c091d53320c000910"

If we XOR the cipher-text with the key, we can recover the plain-text. That's how OTP works. Without the key, you have no way of uncovering the plain-text.
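To make the round trip concrete, here's a minimal sketch that encrypts the example message with the example key and then decrypts it again. The object and helper names (OtpDemo, xorBytes, toHex) are assumptions of mine, and the sketch assumes plain ASCII text.

```scala
object OtpDemo {
  // XOR two equal-length byte arrays position by position.
  def xorBytes(a: Array[Byte], b: Array[Byte]): Array[Byte] = {
    require(a.length == b.length, "inputs must be the same length")
    a.zip(b).map { case (x, y) => (x ^ y).toByte }
  }

  // Format each byte as two lowercase hex digits.
  def toHex(bytes: Array[Byte]): String =
    bytes.map(b => f"${b & 0xff}%02x").mkString

  def main(args: Array[String]): Unit = {
    val message = "Hello World".getBytes("US-ASCII")
    val key = "supersecret".getBytes("US-ASCII")

    val cipherText = xorBytes(message, key)
    println(toHex(cipherText)) // 3b101c091d53320c000910

    // XORing with the same key a second time undoes the encryption.
    val recovered = xorBytes(cipherText, key)
    println(new String(recovered, "US-ASCII")) // Hello World
  }
}
```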

Let's consider what happens when you have two messages encrypted with the same key. Take the following two messages and key:

message1 = "Hello World"
message2 = "the program"
key = "supersecret"

If we convert each message and the key to hex strings, and then encrypt each message using a simple XOR with the key, we'll get the following cipher-texts:

cipher-text1: "3b101c091d53320c000910"
cipher-text2: "071d154502010a04000419"

Let's say that all we have is the two cipher-texts and the knowledge that they were encrypted with a supposed OTP; however, they were both encrypted with the same key. To attack this encryption and uncover the plain-text, follow the steps below.

  1. Guess a word that might appear in one of the messages
  2. Encode the word from step 1 to a hex string
  3. XOR the two cipher-text messages
  4. XOR the hex string from step 2 at each position of the XOR of the two cipher-texts (from step 3)
  5. When the result from step 4 is readable text, we guess the English word and expand our crib search.
    1. If the result is not readable text, we try an XOR of the crib word at the next position.

Step 1 seems difficult (guessing a word that might appear in one of the messages), but when you think about it, the word "the" is the most commonly used English word. So, we'll start with assuming "the" is in one of the messages. After encoding "the" as a hex string, we'll get "746865". That takes care of steps 1 and 2. If we XOR the two cipher-texts, we'll get the following result:

cipher-text1 XOR cipher-text2 = "3c0d094c1f523808000d09"
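The reason this step helps is that XORing the two cipher-texts cancels the shared key: (message1 XOR key) XOR (message2 XOR key) equals message1 XOR message2. Here's a small sketch that checks this for the example values. The names KeyCancels, xorBytes, and toHex are hypothetical.

```scala
object KeyCancels {
  def xorBytes(a: Array[Byte], b: Array[Byte]): Array[Byte] =
    a.zip(b).map { case (x, y) => (x ^ y).toByte }

  def toHex(bytes: Array[Byte]): String =
    bytes.map(b => f"${b & 0xff}%02x").mkString

  val message1 = "Hello World".getBytes("US-ASCII")
  val message2 = "the program".getBytes("US-ASCII")
  val key = "supersecret".getBytes("US-ASCII")

  // XOR of the two cipher-texts (the key is used here only to build them).
  val cipherXor = xorBytes(xorBytes(message1, key), xorBytes(message2, key))

  // XOR of the two plain-texts, with no key involved at all.
  val plainXor = xorBytes(message1, message2)

  def main(args: Array[String]): Unit = {
    println(toHex(cipherXor)) // 3c0d094c1f523808000d09
    println(toHex(plainXor))  // 3c0d094c1f523808000d09
  }
}
```

So the attacker ends up holding the XOR of two readable messages, which is what makes guessing words worthwhile.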

The next step is to XOR our crib word "746865" at each position of the XOR of the cipher-texts. What we'll do is slide "746865" along each position of "3c0d094c1f523808000d09" and analyze the result. After the first XOR, we get the following result:

     3c0d094c1f523808000d09
XOR  746865
     ----------------------
     48656c

When we convert the hex string "48656c" to ASCII, we get the following text, "Hel". This takes us to step 5 from above. Because this looks like readable text, we can assume that the word "the" is in the first position of one message. If we didn't get readable text, we would slide "746865 (the)" one position to the right and try again (and keep repeating until the end of 3c0d094c1f523808000d09).

Note that we don't know which message contains the word "the". It could be in either message1 or message2. Next, we need to guess what the word "Hel" is when fully expanded. It could be "Help", "Hello", etc. If we guess "Hello" and convert it to a hex string, we get "48656c6c6f". We then XOR it with the XOR of the two cipher-texts (just like we did with "the"). Here's the result:

     3c0d094c1f523808000d09
XOR  48656c6c6f
     ----------------------
     7468652070

"7468652070", when converted to ASCII, is "the p". We then repeat the process, guessing what "the p" might be when expanded and then XORing that guess with the XOR of the cipher-texts. Granted, guessing what "the p" might expand to is not super easy, but you get the idea. If we were to guess "the program", convert it to a hex string, and XOR it with the XOR of the cipher-texts, we'd get "Hello World".

This is called crib dragging. My suggestion is to first try " the " (note the spaces before and after). Most cipher-texts that you'll try cracking will contain that word somewhere in the text. If the crib drag yields gibberish at every position, then you can be fairly sure " the " isn't in either of the plain-text messages. So, try another commonly used English word or phrase and keep trying until the result yields something that looks like readable text. Then you can just expand your guess and keep XORing until you uncover the plain-text messages.
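As a rough illustration of the sliding step, here's a minimal sketch that drags a crib across the XOR of the two cipher-texts and keeps the positions where the result is printable ASCII. The names CribDrag, drag, and isReadable are hypothetical, and the printable-character test is only a crude filter that I'm assuming for the example; you still have to judge by eye which candidates look like real words.

```scala
object CribDrag {
  def hexToBytes(hex: String): Array[Byte] =
    hex.grouped(2).map(pair => Integer.parseInt(pair, 16).toByte).toArray

  // True when every byte is a printable ASCII character (space through tilde).
  def isReadable(bytes: Array[Byte]): Boolean =
    bytes.forall(b => b >= 0x20 && b <= 0x7e)

  // Slide the crib across the XOR of the two cipher-texts and return
  // (position, text) pairs for every position that yields printable text.
  def drag(xoredHex: String, crib: String): Seq[(Int, String)] = {
    val xored = hexToBytes(xoredHex)
    val cribBytes = crib.getBytes("US-ASCII")
    for {
      pos <- 0 to (xored.length - cribBytes.length)
      window = xored.slice(pos, pos + cribBytes.length)
      candidate = window.zip(cribBytes).map { case (x, c) => (x ^ c).toByte }
      if isReadable(candidate)
    } yield (pos, new String(candidate, "US-ASCII"))
  }
}
```

Calling CribDrag.drag("3c0d094c1f523808000d09", "the") includes the pair (0, "Hel") among its candidates, along with other positions that are printable but not meaningful.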

In a future blog post, I'll demonstrate an implementation of a crib drag in Scala.

Monday, November 19, 2012

Dependency Management - Avoid Cyclic Dependencies

We're always striving to design software that's easy to maintain. With every code change, we try to avoid code bloat and develop a solution that's simple, while still meeting the project requirements. As a result, we measure our success by whether or not somebody else can look at our code and understand it clearly. Code clarity is a very important measure of maintainability, but we also need to look at our project from a 1,000 foot view and take a moment to look at the dependencies in our project.

To start with, I recommend modeling each component and the compile-time dependencies amongst those components. As an example, think of each component as a Java package. Take, for example, the following high-level view of a system with just three components. Here, component A depends on component C, component B depends on component A, and component C depends on component A.

A red flag should go up if you see this kind of compile-time cyclic dependency in your projects. There are a number of problems with this type of cyclic dependency. First, you can't reuse just one component without bringing along the others. Second, if you modify one component, you indirectly affect the others. This creates a nightmare to test. Also, when a new requirement arrives on your desk, it's difficult to tell at first glance how wide or deep the change will ripple through the system. In some cases, you're forced to follow the affected code through all of the components in order to determine the impact of a change. Architecture 101 teaches us that any change that goes wide or deep is a high-risk change. Cyclic dependencies often result in the smallest of changes affecting multiple components.

Avoid compile-time cyclic dependencies! We should always strive for acyclic dependencies. Take the following graph as an example. All I've done for this example is reverse the dependency from A => C to C => A. This is just to illustrate the example. The result, however, is zero circular dependencies in our system. The issues discussed in the cyclic example go away (or at least aren't as serious).
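If you keep your component model in a simple data structure, you can check it for cycles automatically. Here's a minimal sketch, assuming the dependencies are recorded as a map from each component to the components it depends on. The names DependencyCheck and hasCycle are hypothetical, and the two graphs encode the cyclic and acyclic examples as described in the text.

```scala
object DependencyCheck {
  // Component name -> the components it depends on at compile time.
  type Graph = Map[String, Set[String]]

  // Repeatedly remove components whose dependencies are all gone.
  // If components remain but none can be removed, they form a cycle.
  @annotation.tailrec
  def hasCycle(remaining: Graph): Boolean = {
    val removable = remaining.collect {
      case (name, deps) if !deps.exists(remaining.contains) => name
    }.toSet
    if (remaining.isEmpty) false
    else if (removable.isEmpty) true
    else hasCycle(remaining -- removable)
  }

  // A depends on C, B depends on A, C depends on A.
  val cyclic: Graph = Map("A" -> Set("C"), "B" -> Set("A"), "C" -> Set("A"))

  // Same system after the A => C dependency is removed in favor of C => A.
  val acyclic: Graph = Map("A" -> Set.empty[String], "B" -> Set("A"), "C" -> Set("A"))

  def main(args: Array[String]): Unit = {
    println(hasCycle(cyclic))  // true
    println(hasCycle(acyclic)) // false
  }
}
```

A check like this could run as part of a build, so a new cycle is caught when it's introduced rather than months later.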

This example is very trivial. In reality, we're usually working on very large applications with many, many components. So, it's clearly not as easy a task to avoid cyclic dependencies in a real world project. However, it's very important to manage your dependencies and avoid compile-time cyclic dependencies by keeping it at the forefront of your designs.

Saturday, November 17, 2012

Camel File Poller and JMS Consumer in Scala

I put the code on  GitHub:

In this tutorial, I'm going to demonstrate using Apache Camel to write a Scala file poller. When a file arrives in the "incoming" directory, I'm placing a JMS message onto a queue. Lastly, I'll have a JMS consumer written in Scala that will receive the message from the queue. I'll be using ActiveMQ as my JMS message broker. The main focal point in the tutorial is using Apache Camel for creating a file poller. I'm not doing anything special with ActiveMQ and the JMS consumer could just as easily have been written in Java. I just chose to use Scala, because I like it for its expressiveness.

I won't go through the steps of setting up the architecture. I'll cover that in a future tutorial. For now, I assume you have ActiveMQ running, the Camel JARs downloaded, and your Scala development environment ready to go. Now on to the code.

There are only two Scala classes needed. The first component is the Scala class that defines our Camel routes. We have a single route that routes incoming files to a JMS consumer. Here's our Camel route definition in Scala.

import org.apache.camel.scala.dsl.builder.RouteBuilder

class JmsRoutes extends RouteBuilder {
    // Poll the "incoming" directory and send each file to the queue
    "file://incoming" --> "activemq:queue:test.queue"
}

The second component in the application is the JMS consumer. This consumer simply blocks, waiting for a message to arrive on the queue, which happens once a file lands in the directory named "incoming". Here's the code for the JMS message consumer.

import javax.jms.ConnectionFactory
import org.apache.camel.CamelContext
import org.apache.camel.ConsumerTemplate
import org.apache.camel.impl.DefaultCamelContext
import org.apache.camel.component.jms.JmsComponent
import org.apache.activemq.ActiveMQConnectionFactory

object JmsConsumer {
    def main(args : Array[String]) {
        val connectionFactory = new ActiveMQConnectionFactory("vm://localhost?broker.persistent=false")
        val context = new DefaultCamelContext()

        context.addComponent("activemq", JmsComponent.jmsComponentAutoAcknowledge(connectionFactory))

        context.addRoutes(new JmsRoutes())

        // Assumption: the listing is cut off at this point. The context has to be
        // started before the route runs, and receive() blocks until a message arrives.
        context.start()

        val consumerTemplate = context.createConsumerTemplate()
        consumerTemplate.receive("activemq:queue:test.queue")

        println("received message")
    }
}

This shows how simple it is to use Camel for routing messages in a Scala application. I like the expressiveness of the routing syntax. Camel is a very powerful integration framework, and yet it's extremely easy to use.

XOR Hex-Encoded Strings in Scala

It's been said that all cryptographers do is XOR things together. Although that's not entirely true, we do find ourselves XORing things quite a bit... maybe there is some truth to that after all. Anyway, the other day, while I was writing a two-time pad attack on stream ciphers, I needed a few utility methods to make the string operations a bit easier. One of the methods was performing XOR operations on two hex-encoded strings. Here's the utility method in Scala. I posted it here in case anybody else might find it useful.

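def xorHexStrings(hexString1 : String, hexString2 : String) = {
    // Walk both strings two hex digits (one byte) at a time
    val iterator1 = hexString1.sliding(2, 2)
    val iterator2 = hexString2.sliding(2, 2)
    val result =  new StringBuilder

    if (hexString2.length > hexString1.length) {
        while (iterator1.hasNext) {
            val i = Integer.toString(Integer.parseInt(iterator1.next(), 16) ^
                    Integer.parseInt(iterator2.next(), 16), 16)

            // Pad single-digit results so every byte is two hex digits
            if (i.length == 1) result.append("0")
            result.append(i)
        }

        // Copy the remainder of the longer string unchanged
        while (iterator2.hasNext) result.append(iterator2.next())
    }
    else {
        while (iterator2.hasNext) {
            val i = Integer.toString(Integer.parseInt(iterator1.next(), 16) ^
                    Integer.parseInt(iterator2.next(), 16), 16)

            if (i.length == 1) result.append("0")
            result.append(i)
        }

        while (iterator1.hasNext) result.append(iterator1.next())
    }

    result.toString
}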

If we call this method within our main function, we can see the results.

def main(args : Array[String]) {
    val hexString1 = "1274560603"
    val hexString2 = "876429"

    //prints 95107f0603
    println(xorHexStrings(hexString1, hexString2))
}
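For what it's worth, the same idea can be written more compactly with the standard collection methods. This is only a sketch under my own naming (HexXor, xorHex): it pads the shorter input with "00" bytes, which leaves the tail of the longer string unchanged because XOR with zero is a no-op. One difference from the method above is that the tail is re-formatted, so any uppercase hex digits come out lowercase.

```scala
object HexXor {
  // Hypothetical compact variant: pair up the bytes, padding the shorter side with "00".
  def xorHex(a: String, b: String): String =
    a.grouped(2).toList.zipAll(b.grouped(2).toList, "00", "00").map {
      case (x, y) => f"${Integer.parseInt(x, 16) ^ Integer.parseInt(y, 16)}%02x"
    }.mkString

  def main(args: Array[String]): Unit = {
    println(xorHex("1274560603", "876429")) // 95107f0603
  }
}
```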

Wednesday, November 7, 2012

Exception Shielding Your Services

The most basic of all security mechanisms in SOA is shielding, or hiding, exceptions from the consumers of your services. Don't overlook this key feature when designing your services. Let's say you're constructing a service that inserts a record into an underlying database. Most likely, you'll have a try-catch somewhere in your service that handles SQLExceptions. In a library routine that will be used by your internal applications, it's not much of a concern to expose the details of that exception to the caller. However, in a service that will potentially be consumed by outside clients, exposing the message within the exception could give the client too much information, revealing how your service is designed and built. This could be a security risk, allowing somebody to exploit the service.

To mitigate that risk, it's best to sanitize the exception and return a generic error message to the caller of the service. This can be done in a couple of ways:

  1. You can have a utility service that resides within your local network (not accessible to the outside world) that takes an exception and the service method name as input, and returns a sanitized message that's safe to be returned to the client.
  2. You can write the sanitization logic within each individual service.

I prefer to use option 2, but that's just my personal preference. I find that every service has its own unique nuances that make it easier to deal with exceptions in the service that caught them. Oftentimes, the message you return to the client will contain some verbiage regarding the service method/capability itself. For example, if you have a service method named "CreateCustomer" and a SQLException is caught within the service, you might return a message to the client that says something like, "An error occurred while creating the customer record". Having a single utility service that manages all of the possible mappings from exception to sanitized message can get bloated very quickly.

After you've decided on your strategy for sanitizing exceptions, you need to make sure you're persisting these exceptions with the full exception message (inner message, stack trace, etc.) somewhere for debugging purposes. A good approach is to have a database table where you log the exception message, along with a unique identifier, such as a GUID. When you return the sanitized message to the client, return the GUID along with it. That way, when a client reports an error message, you can look up the raw exception message and debug the issue much more easily.
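Here's a minimal sketch of that pattern in Scala, following option 2 (the sanitizing happens inside the service itself). Everything in it is hypothetical: the ServiceError type, the shield helper, the in-memory map standing in for the database table, and the simulated SQLException message.

```scala
import java.sql.SQLException
import java.util.UUID

object ExceptionShield {
  // Hypothetical response type returned to the service consumer.
  case class ServiceError(message: String, referenceId: String)

  // Stand-in for a database table of raw exceptions, keyed by reference id.
  // (Not thread-safe; a real service would persist this elsewhere.)
  private val errorLog = scala.collection.mutable.Map[String, Throwable]()

  def lookup(referenceId: String): Option[Throwable] = errorLog.get(referenceId)

  // Runs the operation. On failure, records the raw exception and returns
  // only a generic message plus the reference id.
  def shield[T](genericMessage: String)(operation: => T): Either[ServiceError, T] =
    try Right(operation)
    catch {
      case e: Exception =>
        val id = UUID.randomUUID().toString
        errorLog(id) = e
        Left(ServiceError(genericMessage, id))
    }

  def createCustomer(name: String): Either[ServiceError, Unit] =
    shield("An error occurred while creating the customer record") {
      // Simulated failure whose message would otherwise leak schema details.
      throw new SQLException("unique constraint (APP.CUSTOMER_PK) violated")
    }
}
```

The client only ever sees the generic message and the reference id, while lookup(referenceId) gives you the original exception when the client reports the problem.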

Lastly, define a standard means for returning exceptions to the consumers of your services. If you're developing a SOAP service, you can make use of javax.xml.soap.SOAPException to wrap the exceptions you return to the client. However you decide to do it, keep it consistent throughout your services, so that your consumers don't have to write custom exception handling code for every service they consume.

Tuesday, November 6, 2012

Scala Loops

Just like in Java, Scala has while and do loops. For example, you can write a while loop like this.

while (x > 0) {
    y = y + 1
    x = x - 1
}

However, there is no for loop like the one you're familiar with in Java, with its for (initialize; test; update) structure. Once you get comfortable with Scala, though, you'll see that its for loop is much better. Here's a for loop that loops from 1 to 10.

for (x <- 1 to 10) {
    println(x)
}

To me, the use of the to syntax is very expressive. The call 1 to 10 returns a Range of the values from 1 to 10. It's inclusive, so 10 is part of that Range. What's even cooler is that the type of the variable x is automatically the element type of the collection.

Let's say you need the range to exclude the last element. This is often the case if you're iterating over the characters in a string or array. You can do that by changing the to to until. The following code will iterate over the string "Travis" one character at a time.

val name = "Travis"
for (x <- 0 until name.length) {
    println(name(x))
}

Of course, you can be even more concise in Scala and write this instead.

val name = "Travis"
for (c <- name) {
    println(c)
}

When you look at the syntax, it makes sense, because in the context of the code, we already know that name is a string, so we'd expect "c <- name" to iterate over that string and make "c" a character type.

With every new Scala feature that I learn, I get even more excited to find the next great feature in the language that builds on the previous feature. I don't know about you, but whether I like it or not, the need for nested loops is always there. Scala gives us a really clean syntax for writing nested loops that de-clutters the code and makes it much easier to read. Take the following Java code, for example.

//Java code
for (int i=1; i < 5; i++) {
   for (int j=1; j < 3; j++) {
      System.out.print(i * j + " ");
   }
}

In Scala, we can write this instead.

// prints 1 2 2 4 3 6 4 8
for (i <- 1 until 5; j <- 1 until 3) {
   print (i * j + " ")
}

These are called generators and you can have as many as you like. Each generator can have a guard, which is just a Boolean condition that follows an if.

// prints 2 2 3 6 4 8
for (i <- 1 until 5; j <- 1 until 3 if i != j) {
   print (i * j + " ")
}

If you need to, you can define variables inside your for loop definition and use them in the expressions that drive the loop. For example, the following loop defines a new variable named "y" that sets the starting value of the inner generator.

// prints 9 10 11 12 13 14 16 18 20 22 24 26 28
for (i <- 1 until 3; y = 10 - i; j <- y until 15) {
   print (i * j + " ")
}

If you need to build a collection of values from a loop, you can do this very concisely as well. Here's a Scala loop that constructs a collection of integer values. In Scala, we call this a for comprehension.

// Builds a Vector(2, 4, 6, 8, 10)
for (i <- 1 to 5) yield i * 2
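It can help to know that a for comprehension is shorthand for calls to map, flatMap, and withFilter on the underlying collections. Here's a small sketch showing two of the loops from this post next to hand-written equivalents. The object and value names are just for illustration.

```scala
object ForDesugar {
  // A single generator with yield corresponds to a call to map.
  val viaFor = for (i <- 1 to 5) yield i * 2
  val viaMap = (1 to 5).map(i => i * 2)

  // Two generators and a guard correspond to flatMap, withFilter, and map.
  val guardedFor = for (i <- 1 until 5; j <- 1 until 3 if i != j) yield i * j
  val guardedMap =
    (1 until 5).flatMap(i => (1 until 3).withFilter(j => i != j).map(j => i * j))

  def main(args: Array[String]): Unit = {
    println(viaFor == viaMap)         // true
    println(guardedFor == guardedMap) // true
  }
}
```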

If you haven't picked up Scala yet, I encourage you to take it for a test drive and I think you'll really like its expressiveness. It's free of clutter, easy to read, and quick to write (i.e. less code). Loops are a great example of this.