Working Review of "Practical ML Programming with SML#" (Ohori, Ueno), CHAPTER 2

The Essence of ML Programming

Chapter Summary

The second chapter of this book is titled “The Essence of ML Programming” and is a kind of whirlwind tour of the basic datatypes available in the language, along with a showcase of approaches to solving problems in a functional manner.

As with the preceding introductory chapter, this one is written in a fast-paced, crisp style, and is a very pleasurable read. Again, there are a couple of understated “flexes” demonstrating some of SML#’s undoubtedly cool improvements upon the Standard ML experience.

Expressions and their types

The chapter consists of four main parts, the first one of which lists all the basic and composite data types that SML# has to offer, including Booleans, Strings, Tuples, Arrays, and so on.

Here’s a piece of code that definitely stands out:


# val 都道府県コードUrl =
>   "https://dashboard.e-stat.go.jp\
>   \/api/1.0/Json/getRegionInfo\
>   \?Lang=JP\
>   \&ParentRegionCode=00000";
val 都道府県コードUrl =
  "https://dashboard.e-stat.go.jp/api/1.0/Json/...=00000";

This snippet shows off both multi-line string literals (available in Standard ML) and Unicode variable names (not available in Standard ML).
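The backslash “gap” syntax is easy to check in any Standard ML implementation. Here is a small sketch with a made-up URL (example.org and the names url and gapsVanish are placeholders of mine, not from the book):

```sml
(* A backslash, then whitespace/newlines, then another backslash forms a
   "gap" that is dropped from the string literal. *)
val url =
  "https://example.org\
  \/api/1.0\
  \?lang=en"

(* The gaps leave no trace in the resulting value. *)
val gapsVanish = (url = "https://example.org/api/1.0?lang=en")
```

If the gaps behave as described, gapsVanish evaluates to true.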

The rest of this part deals with the construction and use of values of various types.

Defining and using functions & Use of higher-order functions

The next two sections of this chapter give an intro to ML-style programming, utilizing tail recursion (with tail-call optimization) and let-bindings for nested helper functions.

Then, higher-order functions are introduced, based on the Sigma function.
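To make that concrete, here is a minimal sketch of the style in portable Standard ML. It is my own illustration rather than the book’s code: the names sigma and loop, and the choice of summing over an inclusive integer range, are assumptions.

```sml
(* sigma f (lo, hi) computes f lo + f (lo + 1) + ... + f hi.
   The helper loop lives inside a let-binding and is tail-recursive:
   the recursive call is the last thing it does, so the running total
   is carried in the accumulator acc. *)
fun sigma f (lo, hi) =
  let
    fun loop (k, acc) =
      if k > hi then acc else loop (k + 1, acc + f k)
  in
    loop (lo, 0)
  end

(* Passing a function as an argument: sum of squares from 1 to 10. *)
val sumOfSquares = sigma (fn k => k * k) (1, 10)   (* 385 *)
```

Because sigma is curried, sigma (fn k => k * k) on its own is already a reusable “sum of squares over a range” function.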

So far so good, but if you’ve been doing functional programming for a while, nothing here should surprise you – in fact, nothing in these two sections would feel out of place in a tutorial on Scheme.

Polymorphic functions

The last section in this chapter holds some very pleasant surprises. First, the authors explain polymorphism by using the fst function, which extracts the first element of a 2-tuple.

The interesting thing to note here is that the type signature of the function is not what you’d get in a traditional Standard ML environment. For comparison:

(* Poly/ML *)
> fun fst (x, _) = x;
val fst = fn: 'a * 'b -> 'a
(* SML# *)
# fun fst (x, _) = x;
val fst = fn : ['a, 'b. 'a * 'b -> 'a]

This type of quantified type signature will show up every time we define a polymorphic function, like the identity function, which comes up next:

# fun id x = x;
val id = fn : ['a. 'a -> 'a]

The real kicker comes when we look at functions which are polymorphic over record types. While these types of operations do appear in the heretofore-theoretical Successor ML, SML# has them, and they “just work”.

Polymorphic record field extraction

We can define a field accessor getWeight which will work with any record that has a weight field:

# fun getWeight x = #weight x;
val getWeight = fn : ['a#{weight: 'b}, 'b. 'a -> 'b]

Using it, we can compare apples and oranges (or at least the weights of their respective containers :).

First, we define a record representing a bag of apples:

# val emptyBagOfApples = {contents= [], color= "Gray", weight= 0.1};
val emptyBagOfApples =
  {color = "Gray", contents = [], weight = 0.1}
  : {color: string, contents: ['a. 'a list], weight: real}

…and a record representing a box of oranges:

# val boxOfOranges = {width=12, height=23, weight=5.0, material="Wood"};
val boxOfOranges =
  {height = 23, material = "Wood", weight = 5.0, width = 12}
  : {height: int, material: string, weight: real, width: int}

Now, because both records have a weight field, we can apply getWeight to both:

# getWeight boxOfOranges < getWeight emptyBagOfApples;
val it = false : bool

This is a form of record polymorphism, a close relative of the row polymorphism you might have come across if you’ve ever used the Elm programming language. I think this feature has a very beneficial effect on developer UX, and the authors note that it was necessary for them to integrate native SQL and JSON support into the language.
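For contrast, here is a rough sketch of what the same comparison costs in portable Standard ML, where a field selector like #weight has to be pinned to one concrete record type. The type names bag and box and the per-type accessors bagWeight and boxWeight are hypothetical, made up for this illustration.

```sml
type bag = {color: string, contents: string list, weight: real}
type box = {height: int, material: string, weight: real, width: int}

(* One accessor per record type: the annotation fixes the record type,
   which is what lets a Standard ML '97 compiler resolve #weight. *)
fun bagWeight (b : bag) = #weight b
fun boxWeight (b : box) = #weight b

val emptyBagOfApples : bag =
  {contents = [], color = "Gray", weight = 0.1}
val boxOfOranges : box =
  {width = 12, height = 23, weight = 5.0, material = "Wood"}

(* Same comparison as in the SML# session, but with two accessors. *)
val boxIsLighter = boxWeight boxOfOranges < bagWeight emptyBagOfApples   (* false *)
```

Every new container type would need yet another accessor, which is exactly the boilerplate that the single SML# getWeight avoids.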

Record update syntax

The other impressive feature in this section is polymorphic record update syntax. This kind of feature comes in super handy in Elixir, where it can be used to create a new map by updating only certain fields of an old one. Like so:

# Elixir #
iex(1)> bagOfApples = %{apples: ["a1", "a2"], weight: 2.0, color: "gray"}
%{color: "gray", apples: ["a1", "a2"], weight: 2.0}

iex(2)> newBag = %{bagOfApples | apples: tl(bagOfApples.apples), weight: bagOfApples.weight - 1}
%{color: "gray", apples: ["a2"], weight: 1.0}

This type of operation is not really possible in Standard ML, where one must fully extract all fields of a record in order to create a new, updated copy:

(* Poly/ML *)
> val bagOfApples = {weight=2.0, apples=["a1", "a2"], color="gray"};

> fun removeApple {apples, color, weight} = {apples=tl apples, color=color, weight=weight-1.0};

> removeApple bagOfApples;
val it = {apples = ["a2"], color = "gray", weight = 1.0}:
   {apples: string list, color: string, weight: real}

We can use an as pattern to shift the programmer’s focus a bit, but there’s no way to “re-use” the ... catch-all pattern in the body of the function, even if we help by adding type annotations:

(* Poly/ML *)
> type bagOfApples = {apples: string list, color: string, weight: real};
type bagOfApples = {apples: string list, color: string, weight: real}

> fun removeApple' (bag as {apples, weight, ...}) : bagOfApples = {apples=(tl apples), color=(#color bag), weight=(weight-1.0)} : bagOfApples;
poly: : error: Can't find a fixed record type.
Found near {apples = apples, weight = weight, ...}
poly: : error: Can't find a fixed record type. Found near #color
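As far as I can tell, the errors above come from where the annotation sits: it constrains the result, not the argument. Here is a sketch with the annotation moved onto the argument pattern (same bagOfApples type; the name removeAppleAnnotated is mine, chosen to keep it apart from the versions above). It should typecheck in a Standard ML '97 implementation:

```sml
type bagOfApples = {apples: string list, color: string, weight: real}

(* The annotation on the argument pattern fixes the record type,
   so both the ... wildcard and #color can be resolved. *)
fun removeAppleAnnotated (bag as {apples, weight, ...} : bagOfApples) : bagOfApples =
  {apples = tl apples, color = #color bag, weight = weight - 1.0}

val smallerBag =
  removeAppleAnnotated {weight = 2.0, apples = ["a1", "a2"], color = "gray"}
```

Even so, the function is tied to one record shape, and every unchanged field (color here) still has to be copied over by hand.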

But SML# lets us do this with ease! Like so:

(* SML# *)
# val bagOfApples = {contents= ["a1", "a2"], color= "Gray", weight= 2.1};
val bagOfApples =
  {color = "Gray", contents = ["a1", "a2"], weight = 2.1}
  : {color: string, contents: string list, weight: real}

# fun removeApple (bag as {contents, weight, ...}) = bag#{contents=tl(contents), weight=(weight-1.0)};
val removeApple = fn : ['a#{contents: 'b list, weight: real}, 'b. 'a -> 'a]

# removeApple bagOfApples;
val it =
  {color = "Gray", contents = ["a2"], weight = 1.1}
  : {color: string, contents: string list, weight: real}

As you can see, our update function only required the input record to have adequately-typed contents and weight fields, and created a new record with the two fields changed, but others preserved (color in our case).

The above two added degrees of polymorphism are really nice additions to the base Standard ML language. There is one more addition that solves a major “pain point” in the standard, but it will only be introduced as part of the exercises for the chapter.

Exercises

The exercises in this chapter really offer no hand-holding and require the reader to have ingested the lessons of the previous one in full. Additionally, one should know their way around a POSIX system. I appreciated the fact that not only did the authors of SML# design the language to go “with the grain” of POSIX, as demonstrated by native UTF-8 support and native Makefile generation, but they also teach bits of POSIX as part of the language tutorial.

The exercises also do justice to the book’s title, as they are very practical.

I’m not going to go over all the exercises, but I will note several interesting things:

The smlsharp interpreter is pipe-friendly

One of the exercises requires us to evaluate a bunch of SML# expressions and store their output in files. We don’t get any kind of popen-like interface to launching subprocesses, but are expected to use OS.Process.system to launch a shell process and use redirection to perform I/O.

This is cool, and it’s important to note that the smlsharp interpreter, when launched in a pipe, reads “interactive” commands from stdin and writes output to stdout, then shuts down gracefully. This is a very nice unixy touch for a compiler!

% echo 'structure P = OS.Process;' | smlsharp
SML# 4.0.1-30.g944e3e5d (2023-03-27 23:44:46 JST) for x86_64-unknown-linux-gnu with LLVM 12.0.1
structure P =
  struct
    type status = int32
    val success = 0 : status
    val failure = 1 : status
    val isSuccess = fn : status -> bool
    val system = fn : string -> status
    val atExit = fn : (unit -> unit) -> unit
    val exit = fn : ['a. status -> 'a]
    val terminate = fn : ['a. status -> 'a]
    val getEnv = fn : string -> string option
    val sleep = fn : Time.time -> unit
  end
%
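The approach the exercise asks for can be sketched in a few lines of Basis Library code. This is my own illustration, not the book’s solution: the helper name capture and the output file names are hypothetical, and only OS.Process.system, OS.Process.isSuccess and the TextIO functions come from the standard library.

```sml
(* Run a shell command with its stdout redirected to outFile, then read
   the file back. Returns NONE if the shell reports a failure.
   Assumes a POSIX-style shell that understands ">" redirection. *)
fun capture (cmd : string, outFile : string) : string option =
  let
    val status = OS.Process.system (cmd ^ " > " ^ outFile)
  in
    if OS.Process.isSuccess status
    then
      let
        val ins = TextIO.openIn outFile
        val text = TextIO.inputAll ins
      in
        TextIO.closeIn ins;
        SOME text
      end
    else NONE
  end
```

Assuming smlsharp is on the PATH, a call such as capture ("echo '1 + 2;' | smlsharp", "out.txt") would leave the interpreter’s output in out.txt and hand it back as a string.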

SML# can print arbitrary datatypes!

This is another practical and developer-friendly solution to Standard ML’s lack of a polymorphic toString function.

Personally, I find this lack one of the bigger drawbacks of day-to-day programming in portable Standard ML. Modern languages which aim for developer-friendliness go out of their way to provide an easy way to get a representation of your datatype on the screen. Elixir has IO.inspect, Haskell can generate show with deriving (Show), and Go has the wonderful %v format string.

SML#’s solution to the problem is the Dynamic structure (a.k.a. module), which allows the compiler to derive various representations for its internal datatypes. Looking at its exported functions, I expect it to be in use when writing code that converts data to and from external formats such as SQL and JSON. I hope to find out more in subsequent chapters.

In the meantime, we are introduced to this super-nifty debugging function:

# Dynamic.pp;
val it = fn : ['a#reify. 'a -> unit]

# Dynamic.pp bagOfApples;
{color = "Gray", contents = ["a1", "a2"], weight = 2.1}
val it = () : unit

And we get to use it to debug our existing code, taking advantage of Standard ML’s imperative sequencing syntax: (e1 ; e2 ; ... ; en).
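Here is a minimal sketch of that debugging idiom in portable Standard ML, with print and Int.toString standing in for Dynamic.pp. The function fact and its helper loop are made up for this illustration and are not taken from the book.

```sml
(* (e1; e2) evaluates e1 for its side effect, discards the result, and
   returns the value of e2. Here e1 is a trace line and e2 the real work. *)
fun fact n =
  let
    fun loop (0, acc) = acc
      | loop (k, acc) =
          (print ("k = " ^ Int.toString k ^ ", acc = " ^ Int.toString acc ^ "\n");
           loop (k - 1, k * acc))
  in
    loop (n, 1)
  end

val result = fact 5   (* 120, after printing one trace line per step *)
```

In SML#, the print call could presumably be swapped for Dynamic.pp on whatever value is being inspected, with no hand-written toString needed.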

Summing up

A fun chapter which starts out easy and a bit encyclopedic, but ends on a strong note with interesting exercises and a demonstration of three absolutely killer features.
