OCaml’s Standard Library (Stdlib)
Every programming language comes with some “batteries” included - mostly in the form of its standard library. That’s typically all of the functionality that’s available out-of-the-box, without the need to install additional libraries (although the definition varies from language to language). Standard libraries are usually pretty similar, but I think that OCaml’s is a bit “weird” and slightly surprising in some regards, so I decided to write down a few thoughts on it and how to make the best of it.
OCaml’s standard library is called Stdlib and it’s the source of much
“controversy” in the OCaml community. Historically Stdlib was focused only on
the needs of the OCaml compiler (many people called it “the compiler library”
for that reason) and it was very basic in terms of the functionality that it
provided. This is part of the reason why libraries like Jane Street’s Base and
Core (alternatives to Stdlib), and OCaml Containers (complementary extensions
to Stdlib) became so popular in the OCaml community.
From what I gathered, the compiler authors felt it was the responsibility of the
users of the language to find (or create) the right libraries for their
use-cases and preferred to keep the standard library as lean as possible. I get
their reasoning, but I think this backfired to some extent, as it’s not
something that many newcomers to a language would expect. The standard library
was definitely a point of surprise and disappointment for me when playing with
OCaml for the first time. I still remember how surprised I was that the book
Real World OCaml began with the instructions
to replace the built-in standard library with the more full-featured Base and
Core libraries. I was used to a fairly minimal standard library from my time
with Clojure, but OCaml really outdid Clojure in this regard!
These days, however, I’ve noticed an increased focus on aligning the Stdlib
functionality with the expectations of most programmers. That’s obvious when you
check the recent OCaml releases, which feature many additions to it:
- OCaml 5.1: 57 new standard library functions.
- OCaml 5.2: Around 20 new functions added to the standard library.
- OCaml 5.3: Around 20 new functions in the standard library (in the Domain, Dynarray, Format, List, Queue, Sys, and Uchar modules).
I’ve written about some of those recent additions in the past (e.g. List.take
and List.drop) and I think they’ll be quite helpful for newcomers to the
language.
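To give a feel for what these two functions do, here is a minimal sketch. The `take` and `drop` helpers below are hypothetical local stand-ins that follow the documented behaviour of List.take and List.drop as I understand it, so the snippet also runs on compilers older than OCaml 5.3. On 5.3 or newer you would simply call the Stdlib versions.

```ocaml
(* Hypothetical local helpers mirroring List.take / List.drop (OCaml 5.3+).
   They are not the Stdlib implementation, just an illustration. *)
let take n l =
  if n < 0 then invalid_arg "take";
  let rec go n acc = function
    | x :: rest when n > 0 -> go (n - 1) (x :: acc) rest
    | _ -> List.rev acc
  in
  go n [] l

let rec drop n l =
  if n < 0 then invalid_arg "drop";
  match l with
  | _ :: rest when n > 0 -> drop (n - 1) rest
  | _ -> l

let first_two = take 2 [1; 2; 3; 4; 5]  (* [1; 2] *)
let rest = drop 2 [1; 2; 3; 4; 5]       (* [3; 4; 5] *)
let too_many = take 10 [1; 2; 3]        (* the whole list: [1; 2; 3] *)
```

Asking for more elements than the list holds is not an error here; only a negative count is rejected.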
I think the trend to extend Stdlib started somewhere around OCaml 4.07 and has
accelerated recently. That probably won’t surprise long-term users of OCaml, as
the Stdlib module didn’t even exist before that release. I’ll come back to this
topic later in the article.
Exploring Stdlib
The Stdlib module is automatically opened at the beginning of each compilation.
All components of this module can therefore be referred to by their short name,
without prefixing them with Stdlib.
In particular, it provides the basic operations over the built-in types
(numbers, booleans, byte sequences, strings, exceptions, references, lists,
arrays, input-output channels, …) and the standard library modules.
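A small sketch of what the automatic opening means in practice. The `max` binding in the second half is a made-up user definition, there only to show that the qualified name keeps working after the short one has been shadowed.

```ocaml
(* Short and fully qualified names refer to the same function. *)
let bigger = max 3 7
let also_bigger = Stdlib.max 3 7

(* Hypothetical user-defined constant that shadows Stdlib's max. *)
let max = 100

(* The original functions are still reachable through the Stdlib prefix. *)
let clamped = Stdlib.min max (Stdlib.max 0 250)
```

Here `clamped` ends up as 100, because the shadowed `max` is now just an integer while `Stdlib.max` and `Stdlib.min` are the library functions.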
In OCaml 5.3 Stdlib consists of the following modules:
- Arg: parsing of command line arguments
- Array: array operations
- ArrayLabels: array operations (with labels)
- Atomic: atomic references
- Bigarray: large, multi-dimensional, numerical arrays
- Bool: boolean values
- Buffer: extensible buffers
- Bytes: byte sequences
- BytesLabels: byte sequences (with labels)
- Callback: registering OCaml values with the C runtime
- Char: character operations
- Complex: complex numbers
- Condition: condition variables to synchronize between threads
- Domain: Domain spawn/join and domain local variables
- Digest: MD5 message digest
- Dynarray: Dynamic arrays
- Effect: deep and shallow effect handlers
- Either: either values
- Ephemeron: Ephemerons and weak hash table
- Filename: operations on file names
- Float: floating-point numbers
- Format: pretty printing
- Fun: function values
- Gc: memory management control and statistics; finalized values
- Hashtbl: hash tables and hash functions
- In_channel: input channels
- Int: integers
- Int32: 32-bit integers
- Int64: 64-bit integers
- Lazy: deferred computations
- Lexing: the run-time library for lexers generated by ocamllex
- List: list operations
- ListLabels: list operations (with labels)
- Map: association tables over ordered types
- Marshal: marshaling of data structures
- MoreLabels: include modules Hashtbl, Map and Set with labels
- Mutex: locks for mutual exclusion
- Nativeint: processor-native integers
- Oo: object-oriented extension
- Option: option values
- Out_channel: output channels
- Parsing: the run-time library for parsers generated by ocamlyacc
- Printexc: facilities for printing exceptions
- Printf: formatting printing functions
- Queue: first-in first-out queues
- Random: pseudo-random number generator (PRNG)
- Result: result values
- Scanf: formatted input functions
- Seq: functional iterators
- Set: sets over ordered types
- Semaphore: semaphores, another thread synchronization mechanism
- Stack: last-in first-out stacks
- StdLabels: include modules Array, List and String with labels
- String: string operations
- StringLabels: string operations (with labels)
- Sys: system interface
- Type: type introspection
- Uchar: Unicode characters
- Unit: unit values
- Weak: arrays of weak pointers
Lots of good stuff here! Sure, it’s not anything like the standard libraries of
languages like Ruby, Python or Java, but you have the basics covered, at
least to some extent.
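As a quick taste of those basics, here is a short sketch that touches a handful of the modules listed above (String, List, Option, Hashtbl and Result). The data and the `safe_div` helper are made up for illustration, and I’m assuming a reasonably recent compiler (4.08 or newer) for the Option module and List.filter_map.

```ocaml
(* String + List: split a line and keep only the fields that parse as ints. *)
let numbers =
  String.split_on_char ',' "1,2,x,4"
  |> List.filter_map int_of_string_opt

(* Option: fall back to a default when a value is missing. *)
let first_or_zero = Option.value (List.nth_opt numbers 0) ~default:0

(* Hashtbl: count how often each word occurs. *)
let counts : (string, int) Hashtbl.t = Hashtbl.create 16

let () =
  List.iter
    (fun w ->
      let n = Option.value (Hashtbl.find_opt counts w) ~default:0 in
      Hashtbl.replace counts w (n + 1))
    ["a"; "b"; "a"]

(* Result: report a failure as a value instead of raising an exception. *)
let safe_div a b =
  if b = 0 then Error "division by zero" else Ok (a / b)
```

None of this needs an extra dependency, which is the point: common chores like parsing, defaults, counting and error reporting are covered out-of-the-box.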
Note that unlike the core Stdlib module, sub-modules are not automatically
“opened” when compilation starts, or when the toplevel system (e.g. ocaml or
utop) is launched. Hence it is necessary to use qualified identifiers
(e.g. List.map) to refer to the functions provided by these modules, or to add
open directives.
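The options for reaching a sub-module’s functions can be sketched roughly like this. The function and value names are mine, picked only for the example. I’m showing local opens rather than a file-wide `open List`, since a global open would bring every name from the module into scope (e.g. `length`, `map`, `filter`).

```ocaml
(* 1. Fully qualified identifier. *)
let doubled = List.map (fun x -> x * 2) [1; 2; 3]

(* 2. Local open limited to a single expression. *)
let total = List.(fold_left ( + ) 0 (map (fun x -> x * 2) [1; 2; 3]))

(* 3. Local open inside a function body. *)
let longest_word words =
  let open List in
  fold_left
    (fun best w -> if String.length w > String.length best then w else best)
    "" words

(* 4. A short module alias, handy for long module names. *)
module L = List
let reversed = L.rev [1; 2; 3]
```

Which style to use is mostly a matter of taste. Qualified names are the most explicit, while local opens keep dense expressions readable without polluting the rest of the file.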
One thing I found somewhat peculiar at first was the presence of two versions of
some standard library modules - e.g. List and ListLabels. Both of them have
the same functions, but the ListLabels module makes heavy use of labeled
parameters. I’m not sure what the reasoning behind this is, but I’m guessing
this was influenced by the Base library, which uses labels pervasively. Here
are a few examples:
(* Using List module *)
let squares_list = List.map (fun x -> x * x) [1; 2; 3; 4; 5]
(* Result: [1; 4; 9; 16; 25] *)
(* Using ListLabels module *)
let squares_list_labels = ListLabels.map [1; 2; 3; 4; 5] ~f:(fun x -> x * x)
(* Result: [1; 4; 9; 16; 25] *)
(* Using List module *)
let sum_list = List.fold_left (+) 0 [1; 2; 3; 4; 5]
(* Result: 15 *)
(* Using ListLabels module *)
let sum_list_labels = ListLabels.fold_left [1; 2; 3; 4; 5] ~init:0 ~f:(+)
(* Result: 15 *)
The labeled arguments in ListLabels make it clear what each parameter means -
e.g. ~init for the initial value and ~f for the folding function. I’m not
sure how I feel about labeled arguments in general, as in most cases I don’t
think they are really needed, but you’ve got the option if you want it.
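Two practical consequences of labels are worth a quick sketch: labeled arguments can be passed in any order, and opening StdLabels makes the plain List, Array and String names refer to their labeled variants. The `Labeled` wrapper module is just my way of keeping that open contained in the example.

```ocaml
(* Labeled arguments may be given in any order. *)
let a = ListLabels.fold_left ~f:( + ) ~init:0 [1; 2; 3]
let b = ListLabels.fold_left [1; 2; 3] ~init:0 ~f:( + )

(* Partial application: fix the labeled arguments, supply the list later. *)
let sum = ListLabels.fold_left ~f:( + ) ~init:0

(* Inside this module, List and String are the labeled versions. *)
module Labeled = struct
  open StdLabels
  let lengths = List.map ~f:String.length ["a"; "bb"; "ccc"]
end
```

Both `a` and `b` evaluate to 6, and `Labeled.lengths` is `[1; 2; 3]`.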
One notable omission from Stdlib is a module for dealing with regular
expressions. OCaml bundles the (controversial) str module, but it’s not part of
Stdlib and you have to link it to your applications manually:
ocamlc <other options> -I +str str.cma <other files>
ocamlopt <other options> -I +str str.cmxa <other files>
Not to mention that you probably want to use something different instead (e.g. re).
Note: The documentation of Stdlib is excellent and I highly recommend that
everyone peruse it.
Base or Stdlib?
A lot of people might be wondering whether to use Jane Street’s standard library
Base or Stdlib. I’m guessing there was a time when Base offered bigger
advantages over Stdlib, but today it’s harder to recommend Base over Stdlib,
especially when you factor in the OCaml Containers library, which provides
numerous extensions to Stdlib.
My advice for most newcomers would be to start with Stdlib and mix in Containers
if needed. If you find that they are not enough for you, feel free to explore
Base at that point.
I think Base (and Core) are excellent and battle-tested libraries, but I still
think it’s a good idea for everyone to be familiar with OCaml’s “native”
standard library. And for all of us to be pushing to make it better, of course.
A note about the core library
Sometimes you might hear mentions of OCaml’s “core library” (not to be confused
with Core by Jane Street) and you might wonder what exactly that is.
Well, the “core library” is composed of declarations for built-in types and exceptions, plus the module Stdlib that provides basic operations on these built-in types.
You can learn more about the core library in the OCaml manual.
A note about Pervasives
Early on in my OCaml journey I’d find references here and there to a library
named Pervasives, that sounded more or less like a standard library.
Turns out that Pervasives got renamed to Stdlib in OCaml 4.07. Here are a few
highlights from the release notes of this quite important release:
- The standard library is now packed into a module called Stdlib, which is open by default. This makes it easier to add new modules to the standard library without clashing with user-defined modules.
- The Bigarray module is now part of the standard library.
- The modules Seq, Float were added to the standard library.
I know Pervasives was kept around for a while for backwards compatibility and it
seems it’s no longer present in OCaml 5.x.
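If you bump into old code or tutorials that still mention Pervasives, the fix is usually mechanical. A minimal sketch, with the old spelling shown only in a comment since it would not compile on OCaml 5.x:

```ocaml
(* Older code would have written:
     Pervasives.compare 1 2
     Pervasives.succ 41
   The same functions are now reached through Stdlib
   (or simply by their short names). *)
let ordering = Stdlib.compare 1 2  (* negative, since 1 < 2 *)
let answer = Stdlib.succ 41        (* 42 *)
```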
Epilogue
OCaml’s Stdlib is often cited as a reason why the language is not popular, and
I think that’s a valid argument. Still, it seems to me that lately Stdlib has
been moving in the right direction, and the out-of-the-box OCaml experience has
improved because of this. I can only hope that this trend will continue and that
as a result OCaml will become more beginner-friendly and more useful out-of-the-box.
What improvements would you like to see there going forward?
That’s all I have for you today. Keep hacking!