0. Arrows
Albert Y. C. Lai
What Are Arrows?
Many Haskellers begin by asking "what are arrows?" I won't answer this question. It distracts from the theory and practice.
First Arrow Program
Take a deep breath.
Now we plunge right into our first program using arrows. Don't panic! It is short, and I will explain what it does and why.
So, take a deep breath again, and here we go! You can also download it as lesson-0.hs.
import Text.XML.HXT.Core

play :: Integer -> IO ()
play arg = do { results <- runX (dumb arg) ; print results }

dumb :: Integer -> IOSArrow XmlTree Integer
dumb n =
  arr (const n) <+> arr (const (n+1))
  >>>
  returnA <+> arr (* 2)
  >>>
  isA (<= 60)
This program inputs an integer and outputs a list of more integers. If the input is n, the output is [n,2n,n+1,2n+2], but with a catch: numbers above 60 are thrown away. This dumb example performs no XML processing, but it helps bootstrap a mental model of arrows in HXT. If you see what's going on in this dumb exercise, the real XML processors will fail to intimidate you! So, please try to enjoy it...
How to enjoy it? At a GHCi prompt,
Prelude> :load lesson-0.hs
*Main> play 30
Try to run it, run it for various inputs and outputs, modify it for variations, stare at it, look up the arrow and HXT docs for the functions used... until you are thoroughly satisfied or utterly confused. Or just itchingly impatient. Then you're ready for my explanation.
Anatomy of The Program
Now let's examine the program in pieces.
dumb :: Integer -> IOSArrow XmlTree Integer
dumb 30 :: IOSArrow XmlTree Integer
I'll spend some time on this type signature first. It is very important to know how to read it because similar type signatures are everywhere among the real XML arrows. I show the type of dumb itself and its type after an integer parameter is provided. The latter type says: an HXT arrow that takes a document tree as input and produces integers as output.
In general, most HXT arrows have types of the form IOSArrow x y, and it says: an HXT arrow that takes input of type x and produces output of type y. As a first step, you can understand it as a function from x to y. What's more, in the case of HXT, it is a multiple-valued function: internally it produces a list of y's rather than a single y. This is useful in many ways. Beware: I am speaking of HXT arrows specifically here, not arrow types in general; not all arrow types fit this mental model. (IOSArrow x y is a shorthand for IOStateArrow () x y. More on this in a later lesson.)
In the case of dumb, y is Integer by our choice, but x has to be XmlTree because we use the HXT function runX to run dumb, and runX wants that. For other arrows, such as those internal to dumb, we are free to choose x, as long as everything fits together.
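To make the "multiple-valued function" reading concrete, here is a minimal sketch that needs only the base library, not HXT. It assumes we may stand in for IOSArrow x y with Kleisli [] x y, which keeps the list of outputs but drops the IO and state parts. The names ModelArrow, neighbours, nothingAtAll and feed are made up for this illustration.

```haskell
import Control.Arrow (Kleisli (..))

-- Hypothetical stand-in for IOSArrow x y: a function from x to a list of y's.
-- This is only a model; the real HXT type also carries IO and state.
type ModelArrow x y = Kleisli [] x y

-- One input, two outputs.
neighbours :: ModelArrow Integer Integer
neighbours = Kleisli (\n -> [n - 1, n + 1])

-- One input, no output at all.
nothingAtAll :: ModelArrow Integer Integer
nothingAtAll = Kleisli (const [])

-- Feed one input value to a model arrow and collect its outputs.
feed :: ModelArrow x y -> x -> [y]
feed = runKleisli
```

In GHCi, feed neighbours 30 gives [29,31] and feed nothingAtAll 30 gives [].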
An HXT arrow is not just a pure function, but also capable of various side effects. We'll have a chance to meet them in later lessons.
Now it's a good time to see what our dumb arrow does.
dumb n = arr (const n) ... -- this line is IOSArrow XmlTree Integer
The job of arr is to convert an ordinary function into an arrow in the most expected way: the arrow behaves like the function. Our function here has type XmlTree->Integer, and so the arrow has type IOSArrow XmlTree Integer. Our function is a constant function, mapping everything to the number 30 (let's say n is 30), and so the arrow outputs 30 under any input. (Yes, we are ignoring the input here.) But an HXT arrow is supposed to output a list. So the actual output is [30].
(Although we don't use the input, you may be itching to know what is in it. This is provided by runX, and it is an XmlTree document tree consisting of one root node with no child.)
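If you want to poke at arr without HXT installed, here is a small sketch under one assumption: I use Kleisli [] from the base library as a stand-in for an HXT arrow, since it also produces a list of outputs. The names constThirty and outputs are made up, and the input is a String rather than an XmlTree just to show that it is ignored.

```haskell
import Control.Arrow (Kleisli (..), arr)

-- arr lifts the ordinary function (const 30) into an arrow.
-- Kleisli [] is a list-producing model, not the real IOSArrow.
constThirty :: Kleisli [] String Integer
constThirty = arr (const 30)

-- The input is ignored; the single result comes back wrapped in a list.
outputs :: [Integer]
outputs = runKleisli constThirty "any input at all"
```

Evaluating outputs in GHCi gives [30].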
arr (const n) <+> arr (const (n+1)) ...
The job of <+>, in HXT, is to run two arrows with the same input and concatenate the two output lists. (Again, this mental model is specific to HXT and does not apply to all arrow types.) Thus, the arrow on the left produces [30], and the arrow on the right produces [31], and so the whole line produces [30,31].
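The same concatenating behaviour can be tried with base alone. This is a sketch assuming Kleisli [] as a model of an HXT arrow (for lists, <+> on Kleisli arrows also appends the two output lists); the names upstream and upstreamOutputs are made up, and () plays the role of the unused input.

```haskell
import Control.Arrow (Kleisli (..), arr, (<+>))

-- Two constant arrows run on the same input; their outputs are appended.
upstream :: Integer -> Kleisli [] () Integer
upstream n = arr (const n) <+> arr (const (n + 1))

-- () stands in for the input document, which both sides ignore.
upstreamOutputs :: [Integer]
upstreamOutputs = runKleisli (upstream 30) ()
```

Evaluating upstreamOutputs in GHCi gives [30,31].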
arr (const n) <+> arr (const (n+1)) >>> ...
The job of >>> is to chain up arrows. If you write f >>> g, the output of f (upstream) becomes the input of g (downstream). But wait, f outputs a list, and g takes only one item, what's going on? Answer: g will be run multiple times, once for each item in the output list of f; furthermore, the output lists from the multiple runs of g are concatenated to form one big output list. Whew! (If this reminds you of the list monad, yes it's the same deal.) If this is still unclear, it will become apparent when we look at one more line of code for concreteness:
arr (const n) <+> arr (const (n+1))
>>>
returnA <+> arr (* 2)  -- :: IOSArrow Integer Integer
...
So here I have to explain two things: what the downstream does in isolation, and what the chaining does in whole.
What the downstream does: returnA passes the input to the output without change (except the output has to be a list), so if you input 30 you get [30]. arr (* 2) multiplies the input by 2, so if you input 30 you get [60]. Combining these two with <+>, if you input 30 you get [30,60].
What the chaining does: The upstream emits [30,31]. Give 30 to the downstream, get [30,60]; give 31 to the downstream, get [31,62]. Combine, get [30,60,31,62]. Whew!
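Here is the chaining step as a sketch you can load without HXT. It assumes Kleisli [] from base as a model of an HXT arrow, which is exactly the list monad mentioned earlier. The names downstream, chained and byHand are made up; byHand spells out "run the downstream once per upstream item and concatenate" with concatMap.

```haskell
import Control.Arrow (Kleisli (..), arr, returnA, (<+>), (>>>))

-- The downstream in isolation: pass the input through, and also double it.
downstream :: Kleisli [] Integer Integer
downstream = returnA <+> arr (* 2)

-- Upstream chained into downstream. <+> binds tighter than >>>,
-- so this parses as (upstream) >>> downstream.
chained :: Integer -> Kleisli [] () Integer
chained n = arr (const n) <+> arr (const (n + 1)) >>> downstream

-- The same computation by hand: one downstream run per upstream item.
byHand :: Integer -> [Integer]
byHand n = concatMap (runKleisli downstream) [n, n + 1]
```

In GHCi, runKleisli (chained 30) () and byHand 30 both give [30,60,31,62].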
So by now you have a pretty good idea why I promised that the program outputs [n,2n,n+1,2n+2]. But I also promised to throw numbers above 60 away, so let's see how.
... something outputting four numbers
>>>
isA (<= 60)  -- :: IOSArrow Integer Integer
isA tests the input against the given predicate, in this case (<= 60): if the test passes, the input is passed to the output; if the test fails, the output list is empty. So for example, with input 30, the output is [30]; with input 62, the output is []. Combined with >>>, the effect is letting through certain inputs and blocking others. E.g., if the upstream gives [30,60,31,62], the downstream is run four times, once for each of the numbers, and the outputs are [30], [60], [31], and [], respectively; combining, the result is [30,60,31].
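To see the filter at work without HXT, here is a sketch of the whole pipeline using only base. It assumes Kleisli [] as a model of an HXT arrow, and isAModel is my own hypothetical stand-in for HXT's isA (it is not the library function): it returns a one-item list when the predicate holds and an empty list otherwise. dumbModel is likewise a made-up name.

```haskell
import Control.Arrow (Kleisli (..), arr, returnA, (<+>), (>>>))

-- Hypothetical model of isA: keep the input if it passes the test.
isAModel :: (a -> Bool) -> Kleisli [] a a
isAModel p = Kleisli (\x -> [x | p x])

-- The dumb arrow rebuilt on the list model; () replaces the unused XmlTree.
dumbModel :: Integer -> Kleisli [] () Integer
dumbModel n =
  arr (const n) <+> arr (const (n + 1))
    >>> returnA <+> arr (* 2)
    >>> isAModel (<= 60)
```

In GHCi, runKleisli (dumbModel 30) () gives [30,60,31], and runKleisli (dumbModel 61) () gives [] because every candidate exceeds 60.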
Now that we see what the dumb arrow can do, let's see how to bring it to life.
play :: Integer -> IO ()
play arg = do { results <- runX (dumb arg) ; print results }
runX is the most convenient way to execute HXT arrows. Its type signature says a lot:
runX :: IOSArrow XmlTree y -> IO [y]
It executes an HXT arrow and brings results back to the IO world. Since an HXT arrow outputs a list internally, the IO world receives a list too. The input type of the arrow has to be XmlTree, and the input value is provided by runX. Usually this input goes unused (e.g., the arrow will read an XML file elsewhere instead), so we will pay little attention to it. (But again, if you're curious, it's a document tree with a root and no child.)
Our dumb arrow outputs [30,60,31] (if n is 30). runX runs our arrow and ends up with that list. Then we print it and that's what we get.
I encourage you to experiment with various modifications to this program to increase and verify your understanding. Add more stuff or block more stuff in the arrow, for example.
New Friends from This Lesson
Name | from Module | Summary |
---|---|---|
arr | Control.Arrow (GHC) | converts an ordinary function to an arrow |
returnA | Control.Arrow (GHC) | no-op pass-through arrow |
>>> | Control.Arrow (GHC) | chains arrows |
<+> | Control.Arrow (GHC) | fission: runs two arrows on the same input and concatenates their outputs |
isA | Control.Arrow.ArrowList (HXT) | filtering by a predicate |
runX | Text.XML.HXT.Arrow.XmlStateArrow | executes an HXT arrow and brings results back to the IO world |