Throughout this series, I will be exploring the F# (pronounced F Sharp) language as a beginner. Perhaps you're just like me in that you've never worked with the F# language before but you are very curious about it. You may not understand the hype you've been hearing about so-called functional languages. But that's OK. If you want to learn along with me, that would be great.
Along the way, I welcome your comments and feedback, both to instruct me and other readers. You can get an overview of the complete series by visiting the series index. Enjoy.
Part 4 - Imperative Features
The F# programming language is not just a functional language. It has features of imperative and object-oriented languages as well. This is, in fact, what makes F# so compelling to work in. By combining the best features of functional, imperative and object-oriented languages, F# has something to offer to just about everyone. Some people say that by trying to be all things to all people, computer programming languages run the risk of becoming diluted. This is certainly one of the arguments we hear about the C# language whevener new features are added to it. And F# is no different except that it's a new programming language. And as a new language, the designers of F# had a great opportunity to get the balance of the features of the language just right. Personally, I think they did a great job.
I think the best way to learn how the imperative language features in F# work is by comparing them to a language you're likely to know already. So, I'll use C# as the teaching tool. If you are a Java developer, don't worry. C# is very close to Java and you should be able to follow along easily. I've attached the source code at the bottom of this post if you want to download it. Building on the example I started last time, let's suppose you want to write a class to handle conversions between Arabic numerals and Roman numerals. If you analyze the way Roman numerals work, you'll see a pattern emerge. I'm only going to handle numbers between and including zero to 3,999 in my class. For that set of Arabic numerals, these are the rules:
- For the Arabic numbers 1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4 and 1, there exist discrete Roman numeral sequences that can be evaluated left to right. We'll call these the 13 Arabic divisions.
- Only the sequences for the values of 1000, 100, 10 and 1 can be repeated in a Roman number.
- No repeatable Roman numeral can repeat more than three times.
- After encountering the Arabic divisions 900, 400, 90, 40, 9 and 4, no other value in the same order of magnitude (base 10) may be included in the value or the sequence.
Following these simple rules, one can parse an Arabic number by iterating over the 13 divisions, subtracting out the division amounts and accumulating the Roman numeral sequences until the value reaches zero. To parse a Roman number into an integer, you can also iterate through the 13 divisions, accumulating the associated number quantities as you go. As long as the rules are observed, it turns out to be a simple task. I am going to start off by designing a simple interface in C# that allows me to get and set the Arabic integers and Roman numeral strings implementing classes. For the remainder of this post, I am going to implement this interface in a C# based class then again in an F# based class to compare and contrast the two languages.
Now, to begin with implementing the rules shown above, I need some sort of data structure to manage the information about the rules. The three pieces of information I need to collect about each of the 13 Arabic division points are:
- The integer value of the division point.
- The associated Roman number sequence.
- A count of divisions to skip if the current division in selected at least once.
I could create a very specific class to manage this information. Or I could try to implement something a bit more generic. To illustrate one of the important things about F# that you'll need to know later, I'm going to go down the generic path by creating something called a Tuple. In short, a Tuple is a class that will be able to hold a combination of three bits of data. I can refer to those parts as P1, P2 and P3 later. The Tuple class might look like this in C#:
In my C# based class, I am going to create a Tuple<int, string, int> to hold the data for managing my Roman numerals:

Knowning a bit about Roman numerals and the rules I outlined above, you should be able to recognize this data and what I'm going to do with it. To implement the IArabicRomanConverter interface, the ArabicValue implementation in C# is pretty simple. I'll create an integer backing value and a simple property that does a bit of range checking.

The Roman numerals above value 1000 get a bit strange so I am going to leave them out of this example. Considering the rule that no single Roman numeral sequence can be repeated more than three times, my class only handles non-negative integers below 4000. The RomanValue property required to flesh out the IArabicRomanConverter interface is a bit more complicated. I'll show it in two parts. First the getter:

After handling the special case for the number zero, a StringBuilder is used to accumulate the Roman numeral sequences that are collected as each integer value in the array of Tuples is encountered. It's simple to loop through the Tuples, subtracting off the value of each Tuple that's found, going from the largest to the smallest value in the list. The setter for the RomanValue property is a bit more complex because the rules for Roman numeral repetition have to be observed. The setter looks like this in C#:

After the formality of dealing with null, trimming the input string and converting to uppercase, an accumulator is set up to receive the value of the Roman number as it is parsed. String comparisons on the second part of each Tuple are performed in a loop. The rules about allowed repetitions are observed by counting how many times each Tuple sequence is applied. If any Tuple is applied more than three times or if any sequence that is not allowed to repeat appears more than once, an exception is thrown. It turns out that the skip count portion of the tuple (part three) can be used to determine which sequences are allowed to repeat more than once. The cardinal values of 1000, 100, 10 and 1 have skip counts of zero, meaning that when they are used in the parsing operation, the list pointer does not skip over the next item in the list. Those values having zero skip counts are also the ones that can repeat according to the rules. Finally, if any part of the string is left over when the parsing is complete, it's invalid so an exception is thrown.
OK, so the C# class to convert Arabic and Roman numbers is pretty simple. Now, we need to look at what an F# class that does the same thing might look like. Be aware that the code you are about to look at is not ideal. I ported it over to be as close as possible to the C# code so you could start to draw some conclusions in your mind's eye. A more pure F# implementation of this class would look quite different. But for now, the inefficiencies in the F# code you'll see are OK because they are instructive. In a future post, I will show you a near ideal implementation of this class in F#. I'm going to show you the entire F# class below and describe what happening in a list below it. Here's the code:

The type keyword on line 5 is important. It's the equivalent of the class keyword in C#. The next thing I'll point out is the use of the mutable keyword on line 6 in declaring the _arabicValue field. The mutable keyword is at the heart of the imperative language features in C#. If you remember, imperative means "duty to change" in English. So, an imperative language is one that focuses on managing changes in the state of data as the means for tracking calculations. A functional language, on the other hand, treats functions as first-class citizens, using chains of function calls to manage computation. In F#, when you want to be able to change a bit of data, you must mark it as changeable using the mutable keyword (or by another means as we'll see shortly). Also notice that on line 5, I manually declared that the _arabicValue field is of type integer. As you may know, with F#'s incredible type inference system, declaring types is usually not required. F#'s parser goes to extraordinary lengths to figure out what types we're working with by observing how we use them. And it may be that, in this case, F# could figure out that _arabicValue should be an integer. However, it's not always important or prudent to let F# perform type inference. In this case, we know that the _arabicValue will always be an integer so it's OK to say so.
Next, on lines 15 through 20, you can see the _divisions list that in C# consisted of an array of Tuple instances is quite a bit more compact in F#. Lists, Sets and Sequences are first-class citizens in F#, with language constructs to help declaring and manipulating them. Declaring a literal List in F# is done by separating items by semi-colons and using square brackets to bound them. What's happening inside the _divisions List is equally interesting. In C#, I had to write a special class to group my Roman numeral management data into Tuples. But in F#, tuples are also first-class citizens. Separating literals by commas and enclosing them in parentheses creates an F# tuple on the fly as you can see. Collecting tuples into lists is a powerful way to organize strongly-typed data in memory, almost like a database.
Next, you'll notice on line 22 that I am implementing the IArabicRomanConverter interface in this class. This is quite a bit different than the interface implementation models we find in languages like C++, Java and C#. Those languages declare up front what their intentions are. In F#, however, you simply pick the spot where you want to start implementing an interface and have at it. Now, if you scan down the code, pay special attention to lines 23 and 32. These two lines represent the declaration of the two interface properties we need to define. The prefix of "x." on those two declarations is important here and you should try to understand what it means. In C# or Java, you know about the this parameter which is passed invisibly to instance methods and properties of a class. In F#, though, you have to declare the parameter that represents the invoking object yourself. However, rather than passing that parameter on the right side of the declaration, following the name as you might expect, you must do it as shown on lines 23 and 32 instead. The "x." prefix means "within this property, I would like to refer to the invoking object as x." In fact, in the getter for the ArabicValue property on line 24, you can see the "x." designation in use when fetching the class's mutable _arabicValue field using "x._arabicValue". If we had declared the ArabicValue property as "jabberwocky.ArabicValue", we would have fetched the _arabicValue field in the getter as "jabberwocky._arabicValue" instead. Make sure you understand that before proceeding.
The raise statement on line 27 is noteworthy. The raise keyword is equivalent to the throw keyword in C++, Java or C#. The rest of the ArabicValue property implementation is pretty straightforward. However, I want to point out something about line 30 for a moment. There's an operator in use there that you may not understand. When you see the characters "<-" it means assignment. In C#, for example, a single "=" character means assignment whereas a double "==" means equality test. In F#, a single "=" character means equality test whereas "<-" means assignment. I really like that notation. It's like an arrow that says to me, "copy the data in this direction". Of course, it's no surprise that only mutable fields may be on the left side of the "<-" operator.
Now, let's turn our attention to the RomanValue property implementation in the F# code. Let's focus on the getter first because it's simpler. On line 41, draw your attention to the ref keyword. This keyword is the other way to create mutable data in your F# code. This type of mutable data is called a reference cell. Reference cells are a great way to hide so-called side effects inside of a class boundary. In general, in F#, if you must use mutable data, limiting the scope of the mutation to a specific member or class boundary is a good way to reduce the potential for side-effects that would make the functional programming model less effective. The reference cell called value that's declared on line 41 is limited to the get accessor for the RomanValue property. Actually, if you look at the indentation, which helps define scope in F#, the value reference cell is limited to the block comprising lines 38 through 53 created by the else clause on line 37. So, although the reference cell named value is mutable, it's mutation is limited to a very small scope. This is a good idea, always. Looking at line 51, you'll see an odd syntax that's likely to trip you up. The ":=" operator is used to mutate reference cells, just as the "<-" operator is used to modify class-level mutable fields as discussed earlier. And the "!" operator before the reference cell usage on the right of the ":=" is somewhat like the "*" operator in C Language or the "&" operator in C++. Because the reference cell named value was declared with the ref keyword, it must essentially be "re-referenced" to get to its value.
OK, F# is a functional language so functions are first class citizens. I keep saying that. In lines 47 though 51, you can see the definition of an anonymous function that will be passed to the List.iter function. Passing functions to functions. We're OK with that now, right? I wrote line 46 as I did on purpose to demonstrate a special operator that's unlike anything you've probably encounted before. The forward pipelinging operator "|>" is very important in F# and, like most really important things, it's incredibly simple in defintition yet incredibly difficult to understand without some context. When you see "|>", just imagine that the object or result on the left is being fed into the function on the right. You can chain these forward pipeline operators in interesting ways to give F#'s type inference engine much more power. More on that later. For now, you can see on line 47 that I am iterating over a list using the List.iter function. However, the parameter to List.iter is missing on line 51 (just after the closing parenthetic mark). Where's the parameter coming from. Well, on line 46, the _divisions list is being forward pipelined into List.iter using the "|>" operator. I didn't have to write it this way but I wanted to pick out a simple example that helps you understand how that operator works. In future posts in this series, you'll need to draw on this understanding for much more complex examples.
The last thing I want to point out about the get accessor for the RomanValue property is on line 48. What you see there is called tuple parsing. Tuples are made up of a fixed number of parts of varying types. When you have a tuple in hand, such as the division parameter passed by List.iter as it iterates over the list of tuples in my class, it can be broken up into its constituent parts. In this case, for the getter on RomanValue, I only need parts one and two because the third part contains the skip count. I only need the skip count when parsing Roman number strings into integers. So, the syntax seen on line 48 does what I want. The lone underscore in the third place says "I don't care about this part of the tuple, so toss it". The first two parts go into values named dValue and dSymbol. And you can see that I use those parts in the while loop at the end of the anonymous function being invoked by List.iter.
You're probably expecting me to dive into the set accessor the the RomanValue property now. After all, I said we would start with the get accessor because it was simpler. But that's not really true. The set accessor is only longer. There really aren't any new concepts introduced there. But everything you've learned so far is there. Read through the set accessor source code on lines 55 through 99. If you were paying attention so far, you should be able to understand the code. And that means you're on the way to becoming an F# programmer. In the next post, I'm going to show you how to write this class the right way, the F# way. And that will give us some fresh ideas for how to make our C# better. For me, that's the whole point of learning this new language.
Click here to download the source code for this blog post (20Kb)
That's all for now. Feel free to take a look at the other parts of this series exploring the F# language by visiting the series index.