Working Review of "Practical ML Programming with SML#" (Ohori, Ueno), CHAPTER 2
The Essence of ML Programming
2023-07-23
Chapter Summary
The second chapter of this book is titled “The Essence of ML Programming” and is a kind of whirlwind tour of the basic datatypes available in the language, along with a showcase of approaches to solving problems in a functional manner.
As with the preceding introductory chapter, this one is written in a fast-paced, crisp style, and is a very pleasurable read. Again, there are a couple of understated “flexes”, demonstrating some of SML#’s undoubtedly cool improvements upon the Standard ML experience.
Expressions and their types
The chapter consists of four main parts, the first one of which lists all the basic and composite data types that SML# has to offer, including Booleans, Strings, Tuples, Arrays, and so on.
Here’s a piece of code that definitely stands out:
# val 都道府県コードUrl =
> "https://dashboard.e-stat.go.jp\
> \/api/1.0/Json/getRegionInfo\
> \?Lang=JP\
> \&ParentRegionCode=00000";
val 都道府県コードUrl =
"https://dashboard.e-stat.go.jp/api/1.0/Json/...=00000";
It shows both multi-line string literals (available in Standard ML) and Unicode variable names (not available in Standard ML).
The rest of this section deals with instantiation and usage of values of various types.
Defining and using functions & Use of higher-order functions
The next two sections of this chapter give an intro to ML-style programming, utilizing tail recursion (with tail-call optimization) and let-bindings for nested helper functions.
Then, higher-order functions are introduced, based on the Sigma function.
So far so good, but if you’ve been doing functional programming for a while, nothing here should surprise you – in fact, nothing in these two sections would feel out-of-place in a tutorial on Scheme.
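For readers who have not seen the style, the two techniques from these sections can be sketched in portable Standard ML as follows. This is a minimal sketch, not the book’s code: the names sumTo, loop and sigma, and the exact shape of the Sigma function, are assumptions made for illustration, and both functions assume a non-negative n.

```sml
(* Tail-recursive sum of 1..n. The let-bound helper `loop` carries the
   running total in an accumulator, so the recursive call is the last
   thing it does and can be compiled as a jump. Assumes n >= 0. *)
fun sumTo n =
  let
    fun loop (0, acc) = acc
      | loop (k, acc) = loop (k - 1, acc + k)
  in
    loop (n, 0)
  end

(* A higher-order Sigma (hypothetical signature): sums f k for k = 1..n.
   The function to apply is passed in as an ordinary argument. *)
fun sigma f n =
  let
    fun loop (0, acc) = acc
      | loop (k, acc) = loop (k - 1, acc + f k)
  in
    loop (n, 0)
  end

val total   = sumTo 10                  (* 55 *)
val squares = sigma (fn k => k * k) 3   (* 1 + 4 + 9 = 14 *)
```

Note that sumTo is just sigma applied to the identity function, which is the kind of generalisation a higher-order function buys.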
Polymorphic functions
The last section in this chapter holds some very pleasant surprises. First, the authors explain polymorphism by using the fst function, which extracts the first element of a 2-tuple.
The interesting thing to note here is that the type signature of the function is not what you’d get in a traditional Standard ML environment. For comparison:
(* Poly/ML *)
> fun fst (x, _) = x;
val fst = fn: 'a * 'b -> 'a
(* SML# *)
# fun fst (x, _) = x;
val fst = fn : ['a, 'b. 'a * 'b -> 'a]
This type of quantified type signature will show up every time we define a polymorphic function, like the identity function, which comes up next:
# fun id x = x;
val id = fn : ['a. 'a -> 'a]
The real kicker comes when we look at functions which are polymorphic over record types. While these types of operations do appear in the heretofore-theoretical Successor ML, SML# has them, and they “just work”.
Polymorphic record field extraction
We can define a field accessor getWeight which will work with any record that has a weight field:
# fun getWeight x = #weight x;
val getWeight = fn : ['a#{weight: 'b}, 'b. 'a -> 'b]
Using it, we can compare apples and oranges (or at least the weight of their respective containers :).
First, we define a record representing a bag of apples:
# val emptyBagOfApples = {contents= [], color= "Gray", weight= 0.1};
val emptyBagOfApples =
{color = "Gray", contents = [], weight = 0.1}
: {color: string, contents: ['a. 'a list], weight: real}
…and a record representing a box of oranges:
# val boxOfOranges = {width=12, height=23, weight=5.0, material="Wood"};
val boxOfOranges =
{height = 23, material = "Wood", weight = 5.0, width = 12}
: {height: int, material: string, weight: real, width: int}
Now, because both records have a weight field, we can apply getWeight to both:
# getWeight boxOfOranges < getWeight emptyBagOfApples;
val it = false : bool
SML# calls this record polymorphism. It is closely related to row polymorphism, which you might have come across if you’ve ever used the Elm programming language. I think this feature has a very beneficial effect on developer UX, and the authors note that it was necessary for them to integrate native SQL and JSON support into the language.
Record update syntax
The other impressive feature in this section is polymorphic record update syntax. The analogous map update syntax comes in super handy in Elixir, where it can be used to create a new map by updating only certain fields of an old one. Like so:
# Elixir #
iex(1)> bagOfApples = %{apples: ["a1", "a2"], weight: 2.0, color: "gray"}
%{color: "gray", apples: ["a1", "a2"], weight: 2.0}
iex(2)> newBag = %{bagOfApples | apples: tl(bagOfApples.apples), weight: bagOfApples.weight - 1}
%{color: "gray", apples: ["a2"], weight: 1.0}
This type of operation is not really possible in Standard ML, where one must fully extract all fields of a record in order to create a new, updated copy:
(* Poly/ML *)
> val bagOfApples = {weight=2.0, apples=["a1", "a2"], color="gray"};
> fun removeApple {apples, color, weight} = {apples=tl apples, color=color, weight=weight-1.0};
> removeApple bagOfApples;
val it = {apples = ["a2"], color = "gray", weight = 1.0}:
{apples: string list, color: string, weight: real}
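We can use an as pattern to shift the programmer’s focus a bit, but there’s no way to “re-use” the ... catch-all pattern in the body of the function. Annotating the result type, as below, does not even resolve the flexible record; annotating the argument pattern would, but every field would still have to be copied by hand: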
(* Poly/ML *)
> type bagOfApples = {apples: string list, color: string, weight: real};
type bagOfApples = {apples: string list, color: string, weight: real}
> fun removeApple' (bag as {apples, weight, ...}) : bagOfApples = {apples=(tl apples), color=(#color bag), weight=(weight-1.0)} : bagOfApples;
poly: : error: Can't find a fixed record type.
Found near {apples = apples, weight = weight, ...}
poly: : error: Can't find a fixed record type. Found near #color
But SML# lets us do this with ease! Like so:
(* SML# *)
# val bagOfApples = {contents= ["a1", "a2"], color= "Gray", weight= 2.1};
val bagOfApples =
{color = "Gray", contents = ["a1", "a2"], weight = 2.1}
: {color: string, contents: string list, weight: real}
# fun removeApple (bag as {contents, weight, ...}) = bag#{contents=tl(contents), weight=(weight-1.0)};
val removeApple = fn : ['a#{contents: 'b list, weight: real}, 'b. 'a -> 'a]
# removeApple bagOfApples;
val it =
{color = "Gray", contents = ["a2"], weight = 1.1}
: {color: string, contents: string list, weight: real}
As you can see, our update function only required the input record to have adequately-typed contents and weight fields, and created a new record with the two fields changed, but others preserved (color in our case).
These two extra degrees of polymorphism are really nice additions to the base Standard ML language. There is one more addition that solves a major “pain point” in the standard, but it will only be introduced as part of the exercises for the chapter.
Exercises
The exercises in this chapter really offer no hand-holding and require the reader to have ingested the lessons of the previous one in full. Additionally, one should know their way around a POSIX system. I appreciated the fact that not only did the authors of SML# design the language to go “with the grain” of POSIX, as demonstrated by native UTF-8 support and native Makefile generation, but they also teach bits of POSIX as part of the language tutorial.
The exercises also do justice to the book’s title, as they are very practical.
I’m not going to go over all the exercises, but I will note several interesting things:
The smlsharp interpreter is pipe-friendly
One of the exercises requires us to evaluate a bunch of SML# expressions and store their output in files. We don’t get any kind of popen-like interface for launching subprocesses, but are expected to use OS.Process.system to launch a shell process and use redirection to perform I/O.
This is cool, and it’s important to note that the smlsharp interpreter, when launched in a pipe, reads “interactive” commands from stdin and writes output to stdout, then shuts down gracefully. This is a very nice unixy touch for a compiler!
% echo 'structure P = OS.Process;' | smlsharp
SML# 4.0.1-30.g944e3e5d (2023-03-27 23:44:46 JST) for x86_64-unknown-linux-gnu with LLVM 12.0.1
structure P =
struct
type status = int32
val success = 0 : status
val failure = 1 : status
val isSuccess = fn : status -> bool
val system = fn : string -> status
val atExit = fn : (unit -> unit) -> unit
val exit = fn : ['a. status -> 'a]
val terminate = fn : ['a. status -> 'a]
val getEnv = fn : string -> string option
val sleep = fn : Time.time -> unit
end
%
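The redirection approach described above can be sketched with nothing but the Basis Library. This is a minimal sketch under stated assumptions, not the book’s solution: runToFile and readFile are hypothetical helper names, the file name hello.txt is arbitrary, and neither the command nor the file name is quoted or escaped before being handed to the shell.

```sml
(* Run a shell command with its stdout redirected to a file and report
   whether it succeeded. `runToFile` is a hypothetical helper;
   OS.Process.system and OS.Process.isSuccess are Basis Library functions. *)
fun runToFile (cmd, outFile) =
  OS.Process.isSuccess (OS.Process.system (cmd ^ " > " ^ outFile))

(* Read a whole file back as a string (also a hypothetical helper). *)
fun readFile path =
  let
    val ins  = TextIO.openIn path
    val text = TextIO.inputAll ins
  in
    TextIO.closeIn ins;
    text
  end

(* A harmless command stands in for the interpreter here. *)
val ok   = runToFile ("echo hello", "hello.txt")
val text = readFile "hello.txt"
```

On a machine with SML# installed, the command string could presumably be a pipeline such as echo '1 + 2;' | smlsharp, mirroring the transcript above.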
SML# can print arbitrary datatypes!
This is another practical and developer-friendly solution to Standard ML’s lack of a polymorphic toString function.
Personally, I find this lack one of the bigger drawbacks of day-to-day programming in portable Standard ML. Modern languages which aim for developer-friendliness go out of their way to provide an easy way to get a representation of your datatype on the screen. Elixir has IO.inspect, Haskell can generate show with deriving (Show), and Go has the wonderful %v format string.
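To make the pain point concrete, here is roughly what the portable Standard ML workaround looks like: a hand-written printer for each type. It is a minimal sketch; the type name bag and the helpers quoted and bagToString are made up for illustration, and the record shape is borrowed from the bag examples earlier in this review.

```sml
(* The record shape used for bags earlier in this review. *)
type bag = {color: string, contents: string list, weight: real}

(* Wrap a string in double quotes; String.toString escapes special
   characters such as embedded quotes and newlines. *)
fun quoted s = "\"" ^ String.toString s ^ "\""

(* Hand-written printer: every field is spelled out, and the function
   has to be revised whenever the type changes. *)
fun bagToString ({color, contents, weight} : bag) =
  "{color = " ^ quoted color
  ^ ", contents = [" ^ String.concatWith ", " (map quoted contents) ^ "]"
  ^ ", weight = " ^ Real.toString weight ^ "}"

val shown =
  bagToString {color = "Gray", contents = ["a1", "a2"], weight = 2.1}
val () = print (shown ^ "\n")
```

Multiply this by every record and datatype in a program and the appeal of a built-in generic printer becomes obvious.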
SML#’s solution to the problem is the Dynamic structure (a.k.a. module), which allows the compiler to derive various representations for its internal datatypes. Looking at its exported functions, I expect it to be in use when writing code that converts data to and from external formats such as SQL and JSON. I hope to find out more in subsequent chapters.
In the meantime, we are introduced to this super-nifty debugging function:
# Dynamic.pp;
val it = fn : ['a#reify. 'a -> unit]
# Dynamic.pp bagOfApples;
{color = "Gray", contents = ["a1", "a2"], weight = 2.1}
val it = () : unit
And we get to use it to debug our existing code, taking advantage of Standard ML’s imperative sequencing syntax: (e1 ; e2 ; ... ; en).
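A minimal sketch of that debugging style follows, written in portable Standard ML so that it runs outside SML# as well. The function tracedSum is a made-up example, and print with Int.toString stands in for Dynamic.pp; in SML#, Dynamic.pp could presumably be dropped into the same position.

```sml
(* Trace every step of a tail-recursive sum using (e1 ; e2):
   print runs only for its side effect, and the value of the whole
   parenthesised sequence is the value of its last expression. *)
fun tracedSum n =
  let
    fun loop (0, acc) = acc
      | loop (k, acc) =
          ( print ("k = " ^ Int.toString k
                   ^ ", acc = " ^ Int.toString acc ^ "\n")
          ; loop (k - 1, acc + k) )   (* still in tail position *)
  in
    loop (n, 0)
  end

val result = tracedSum 3   (* prints three trace lines; result is 6 *)
```

Because the last expression of a sequence is in tail position, adding the trace does not turn the loop into a stack-consuming recursion.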
Summing up
A fun chapter which starts out easy and a bit encyclopedic, but ends on a strong note with interesting exercises and a demonstration of three absolutely killer features.