Someone recently noted in the #haskell IRC channel that
A 'newbie', in Haskell, is someone who hasn't yet implemented a compiler. They've only written a monad tutorial.
This is a nod to the plethora of extant tutorials about this concept - a concept which (i feel) extends Haskell from being mainly of academic interest to being more of a language for day-to-day programming.

Monads are a concept from category theory, where they're also known as triples and 'standard constructions'. One doesn't have to understand their theoretical underpinnings in order to understand their function in Haskell, however. Monads are a way of isolating parts of the program that violate referential transparency - that is, that have significant side-effects - from the rest of the purely functional program.

Usually, the first monad that Haskell beginners come across is the IO monad - the monad required to utilise STDIN and STDOUT. This tends to cause some consternation, along the lines of "I have to learn about monads just to print some text?!?" It also tends to give the impression that monads are some kind of kludgy band-aid to enable side-effects in an otherwise purely functional language. But they're not. Haskell monads are actually very powerful, as they can be used to provide more control over one's programming environment than one would otherwise have. i'll come back to this below.

People have tried to communicate what Haskell monads are about in various ways: via 'container' metaphors (e.g. this and this; i found the former to be more illuminating than the latter); via relationship metaphors (e.g. this, which i found more confusing than helpful), and even via a 'monsters' metaphor (which i found to be rather amusing)1. One tutorial that people on the Haskell-café list seem eager to recommend (and that's recommended on the Haskellwiki) is All about monads, but that just overwhelmed me when i first read it; and even now, when i've got a better understanding of monads, i still find it difficult to follow. In contrast, i found Tackling the awkward squad and Monads for functional programming to both be very enlightening.

As far as i can tell, however, a monad simply seems to be a computational environment in which one can specify that certain types and methods of computation be performed, and in which the three monad laws are expected to hold. Within that framework, one can establish a variety of computational approaches: not only can one write in an imperative style, for example (using what's referred to as do-notation), one can also create an environment for exception-based programming (e.g. Control.Exception), or which uses continuations (e.g. Control.Monad.Cont), or which uses Software Transactional Memory2 (e.g. Control.Monad.STM).
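As a small, hedged illustration of this (the function names here are mine): the built-in Maybe monad provides an environment for computations that can fail, in which do-notation automatically threads the possibility of failure between steps.

```haskell
-- A minimal sketch: the Maybe monad as a computational environment
-- for calculations that can fail. Each step in the do-block runs
-- only if the previous step produced a value; a Nothing anywhere
-- short-circuits the rest of the computation.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

calc :: Int -> Int -> Int -> Maybe Int
calc a b c = do
  r1 <- safeDiv a b
  r2 <- safeDiv r1 c
  return (r2 + 1)
```

Here calc 100 5 2 evaluates to Just 11, whilst calc 100 0 2 evaluates to Nothing - and no explicit error-checking code appears between the steps.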

Given this, suggesting that monads are merely a 'kludge' to make 'feature-poor' Haskell reasonably useful is like suggesting that operating systems are merely a 'kludge' to make PCs useful. Sure, one can 'hard-wire' certain behaviour into a particular piece of hardware (e.g. televisions) - but then that piece of hardware doesn't have as much flexibility, and one is stuck with the pre-provided functionality - which one can only build on top of, and which is difficult to change should the extended functionality require it. So in a sense, monads provide Haskell with the ability to change how the language works in a way reminiscent of dialects of Lisp.

And that's my monad 'tutorial'. :-)

1. i actually agree with a comment made on reddit that:
Maybe one of the problems with the proliferation of monad tutorials and other newbie Haskell tutorials is assuming that there can be a single metaphor that suits everyone and all learning styles.
2. i found Simon Peyton Jones' recent paper Beautiful concurrency to be a good introduction to STM in Haskell.

In a recent article published on Dr. Dobb's Journal, "It's (Not) All Been Done", Herb Sutter describes the concurrency revolution currently underway.

That suits me fine, because i've recently developed an interest in functional programming and languages that have been designed for that paradigm - a paradigm which facilitates concurrency.

i must say i've never been that enthused about object-oriented (OO) programming. It just never felt 'right' to me. Okay, sure, my experiences with Java's version of 'object-orientation' have scarred me somewhat. And sure, OO in the language with which i'm most familiar, Perl, is 'bolted-on' (albeit rather elegantly, imho; from what i understand, it's capable of enforcing encapsulation more strongly, via closures, than the OO system of C++). And perhaps i haven't spent enough time with Smalltalk to really appreciate the marvels of a truly-OO language. For all i know, my issue could be that it's about 'nouns' rather than 'verbs', as described in Steve Yegge's essay "Execution in the Kingdom of Nouns". Whatever the cause, OO programming simply doesn't hold much attraction for me.

Enter functional programming (FP). FP languages have been around for a while now, but interest in them appears to be on the upswing.

Functional programming takes its cue from mathematical functions. In mathematics, giving the same input to a function always produces the same output (math nerds, please correct me if i'm wrong). This is quite often not the case with the 'functions' of many programming languages; a C function, for example, can produce multiple outputs based on the same input, if that function accesses data outside its scope. Such behaviour can greatly reduce the predictability, testability and ultimate reliability of programs. So functional programming involves taking an approach in which, as much as possible, a given input to a function always results in the same output. Just as OOP doesn't necessarily require inbuilt support for OO (cf. the base libraries for the GNOME project, which are written in C), FP doesn't necessarily require inbuilt support either - one could do FP in Lisp or Perl, for example. But again, just like OO, there are benefits to doing FP in a language designed with that end in mind.
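To make that concrete, here's a minimal sketch (the function name is mine) of what happens to a C-style stateful counter when it's written purely: the state has to be passed in and handed back explicitly, so the dependency is visible in the function's type.

```haskell
-- A pure counter: instead of mutating a hidden static variable,
-- the function takes the current count as an argument and returns
-- the new count alongside its result. Same input, same output,
-- every single time.
tick :: Int -> (Int, Int)  -- current state -> (result, new state)
tick n = (n + 1, n + 1)
```

tick 0 always yields (1, 1); a C counter reading a static variable can return something different on every call, despite identical arguments.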

Which brings me back to concurrency. 'Concurrency' in this context is about multiple tasks running at the same time. There's increasing interest in concurrency because we're starting to come up against boring physical limits in terms of our general approach to processor design. As we try to increase processing power by packing processors more and more tightly with wires and transistors, processors are becoming more vulnerable to their calculations being affected by quantum tunnelling and excessive heat (which is why older chips, such as the 8086, didn't require a fan over them, but more recent chips, such as the Athlon XP, do). As a result, processor designers are increasingly focusing on power increases through nifty tricks, rather than through more wires and transistors.

One such trick is parallelism. This involves doing two or more unrelated calculations at the same time on different processors (or different parts of the same processor). It's like cooking food and washing clothes at the same time; for the most part, the two tasks don't rely on each other, and so instead of completing one task and only then moving on to the other, both tasks can be undertaken concurrently. Parallelism is used as a basis for some of the most powerful computers today, including 'Beowulf' clusters of machines running BSD or Linux.

And here's the problem. The most common programming languages in use today aren't designed to take maximum advantage of any parallelism offered by their underlying hardware. And in fact, their design often actually precludes such optimisation taking place, because it's not easy to know which parts of the program can't be affected by other parts of the program, and can thus be executed in parallel. Put another way, it's difficult for the computer to say: "Can i run part X of the program without waiting for part Y to finish or not?"

This is where functional programming comes in. Because it minimises the effects that one part of a program can have on other parts of the program, the computer can make better guesses about what can and can't be run in parallel. This has benefits not only in terms of speed, but also in terms of reliability and scalability: one doesn't have to rely on a single processor, but can instead farm out various calculations to various processors, and shift calculations from one processor to another if one processor fails for some reason. This is a strategy employed by IBM's zSeries/z9 mainframes:
The zSeries servers also effectively execute every instruction twice in order to assure processing integrity. If the instruction results differ, the zSeries server retries the instruction. If the instruction still fails, the zSeries/z9 server will shut down the failing processor and shift workload, "in flight," to any surviving processors, including one or more spares.
So in the context of ever-increasing demands for processor power and system reliability, functional programming has a bright future.
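As a hedged sketch of what declared parallelism can look like in Haskell: the `par` combinator (exported by GHC.Conc, in base) hints to the runtime that its first argument may be evaluated in parallel with the second. The fib function here is just a stand-in workload of my own choosing.

```haskell
import GHC.Conc (par, pseq)

-- A deliberately naive, pure workload to parallelise.
fib :: Int -> Integer
fib n | n < 2     = fromIntegral n
      | otherwise = fib (n - 1) + fib (n - 2)

-- `a `par` ...` sparks a for possible parallel evaluation;
-- `b `pseq` ...` evaluates b before combining the two results.
pairSum :: Int -> Int -> Integer
pairSum x y = a `par` (b `pseq` (a + b))
  where
    a = fib x
    b = fib y
```

Because fib is pure, sparking it off on another core can't change the answer: pairSum 10 11 evaluates to 144 whether or not the runtime actually parallelises anything.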

The main FP language i've been learning about is Haskell (although i've had a quick look at Erlang, which seems rather groovy). Haskell is actually being used for real-world systems - an implementation of Perl6, called 'Pugs', has been written in it, as has a revision control system called darcs. But i think i'll leave enthusing about Haskell for another time. :-)
Having access to the source code of a program proved its worth today.

i've recently spent a few days re-configuring sacred_harlot's home network, made possible by her purchase of a new hard drive for her PC. Previously, the PC's crippled hard drive meant that it could only be used as an LTSP terminal, with sacred_harlot's laptop functioning as the LTSP server.

Replacing the hard drive, however, allowed me to install Mandriva Linux on that PC and configure it as the network router, thus freeing up the laptop for more portable, laptop-like purposes. :-)

In order to provide a consistent environment regardless of which computer people are using, i used NFS to make people's home directories available elsewhere on the network. This worked well, except for one particular program, which didn't like being run anywhere except on the PC - it would complain that it couldn't lock a particular file, because that file was already locked.

Using GDB, i was able to track down the line in the program where the problem occurred. Some pondering and Googling helped me to determine that the problem was caused by the fact that the function trying to lock the file, flock(2), wasn't fully NFS-compatible. However, an alternative function, fcntl(2), was apparently happier to work with NFS.

Fortunately, the program in question is FOSS, so i was able to get the source code, change the offending line of code, re-compile and re-install it, et voilà! The program worked first try.

Whilst most PC users clearly don't have the knowledge, let alone the inclination, to go through the above process, using FOSS at least creates the possibility of rapid problem resolution in a way that restrictively-licensed, closed-source software (such as most of Microsoft's products) generally does not. Instead of spending much time tracking down the problem, and then having to figure out a workaround whilst waiting for someone else to (hopefully) fix it, i was able to rapidly track down the problem and fix it myself.

The freedom to modify the software i'm using has meant that i'm free to spend the afternoon doing things other than stressing over computer problems. :-)
Over the last few days, i've been busily coding a content generation system for the Pleasure Activism Australia Web site.

Until now, events listed in the 'Events' section of the Web site had to be added or removed by directly editing the relevant M4 source file, and then running a shell script to turn that source file into an HTML document, to be uploaded to the Web site. The WAP version of the 'Events' section - which hasn't been updated in months :-( - required manual editing also.

Now, however, events are stored in a single XML file, using the xCal format, an implementation of the iCalendar standard in XML format. The system i've coded removes past events from the file, gathers together events on a per-state (i.e. New South Wales, Victoria etc.) basis, sorts them into date order (soonest first), inserts these events into the relevant M4 source files, runs the convert-to-HTML script, and finally uploads the HTML docs to the Web site. It doesn't yet output WML docs for the WAP version, but that facility will be added shortly.
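The gathering-and-sorting core of that pipeline can be sketched as follows (in Haskell rather than Perl, purely for illustration; the Event type and function name are hypothetical stand-ins for the parsed xCal data):

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- A hypothetical stand-in for a parsed xCal entry; dates are
-- simplified to Ints for the sake of the sketch.
data Event = Event { state :: String, date :: Int, title :: String }
  deriving (Show, Eq)

upcomingByState :: Int -> [Event] -> [[Event]]
upcomingByState today =
    map (sortOn date)          -- sort each group soonest-first
  . groupBy ((==) `on` state)  -- one sub-list per state
  . sortOn state               -- bring each state's events together
  . filter ((>= today) . date) -- drop past events
```

Each stage corresponds to one step of the system described above: pruning past events, gathering per-state, and sorting into date order.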

The end result will be a system where one can simply modify a single XML file and then run a program to ensure that the 'Events' section is up-to-date. And apart from the convert-to-HTML shell script, the system is written in Perl.

i actually wanted to do the project in Common Lisp, as a way of teaching myself CL through the development of a production system. Sadly, however, i didn't even make it to the stage of writing any CL code, because of difficulties with one particular library needed for the project: an interface to the GNU Readline facility.

The implementation of CL that i've chosen to play around with is CLisp. (i tried SBCL, but found its interactive shell to be much more awkward to use than the CLisp interactive shell.) Upon visiting the ASDF-install page to look for a CL interface to Readline, i found the CL-Readline library - which requires the UFFI library to be installed, which works with all the major CL implementations apart from CLisp. :-/ Then i found the successor to the CL-Readline library, Linedit, which requires the Osicat library - which requires the UFFI library. Ack! Then i read about CFFI, a substitute for UFFI, which seemed promising, since it's apparently implemented purely in ANSI CL and should therefore be CLisp-compatible - except it turned out that the UFFI-compatibility layer was still under development. Argh!

Maybe there are other CL interfaces to Readline apart from CL-Readline and Linedit; but they weren't immediately apparent on the ASDF-Install page, on which libraries are listed higgledy-piggledy, rather than thematically, as is the case with the Comprehensive Perl Archive Network (CPAN).

So after all the above, and despite having found at least some of the other CL libraries i needed - CL-PPCRE and XMLS, but not an FTP or date-manipulation interface (do such interfaces exist?) - i basically went "Stuff it, i'll learn CL through another project, and just do this one in Perl, where i know everything i need is available."

And so Yet Another Perl Application was born.
i just read an essay on the problems faced by Lisp, and the section on "Type Systems" has finally provoked me to ask something i've been wondering for a while now: are there any programming languages where the accepted values of a variable must be declared as part of that variable's type? i mean, seriously, how much time is wasted by programmers being forced to write code to check that all modifications to a variable are acceptable? Why on earth isn't the computer doing this for us?

So - are there any languages that do do this? And i'm not talking about languages where such functionality is merely possible - i imagine that there are a number of languages that merely make it possible. i'm talking about languages that won't let a program run until all variables have, in effect, had some sort of XML-Schema-like thing declared for them.

Any suggestions?
Over the last week or so, i've finally taken the plunge and begun trying to actually write something in Common Lisp (CL), instead of just reading about it.

What i decided to program was something that i could write very quickly in Perl: a library of functions that perform interval arithmetic, a choice inspired by an article on the uses of interval arithmetic.
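For reference, the multiplication rule at issue can be sketched as follows (in Haskell for brevity, though the project itself was to be in CL; the function name is mine):

```haskell
-- Multiplying [a,b] by [c,d]: since either factor can contain
-- negative numbers, the product interval runs from the minimum
-- to the maximum of the four endpoint products.
mulInterval :: (Double, Double) -> (Double, Double) -> (Double, Double)
mulInterval (a, b) (c, d) =
  let ps = [a * c, a * d, b * c, b * d]
  in (minimum ps, maximum ps)
```

For instance, mulInterval (-1, 2) (3, 4) yields (-4.0, 8.0).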

As the Wikipedia entry on interval arithmetic notes, multiplying two intervals requires finding the maximum of four numbers. "Okay," thought i, "i'm sure there's a CL function called MAXIMUM or something." Well, of course there is: it's actually called MAX. Except that it doesn't take a list or a vector or something of numbers as its argument; no, instead, it takes as arguments each number of the list whose maximum one wishes to find.

This means that you can't do this:

(max '(1 2 3 4 5))   (1)

Instead, if you want to pass MAX a list of arguments whose length will only be known at run-time, you have to do this:

(apply #'max '(1 2 3 4 5))   (2)

At least, that's the only way i can see of doing it. If that's true, it seems an oddly complicated way of performing what i would think is a common task: i don't recall ever having written a program where i've had to find the maximum of a list of numbers whose length is known prior to run-time. And implementing a version of MAX that works in example (1) above is not particularly difficult:

(defun sensible-max (num-list)
  ;; Seed the maximum with the first element, so that lists
  ;; containing only negative numbers are handled correctly.
  (let ((local-max (first num-list)))
    (dolist (item (rest num-list) local-max)
      (when (> item local-max)
        (setq local-max item)))))

i'm sure there are a number of implementation issues with this, particularly as num-list grows large; but i'm not sure there aren't implementation issues with the extant version of MAX - for example, what's the largest number of arguments that can be passed to a given function? (In Common Lisp, that limit is given by the constant CALL-ARGUMENTS-LIMIT, which the standard only requires to be at least 50.)

i can only assume that i'm naïvely missing something . . . .

