Thursday, August 21, 2014

The essence of a monad


Many people when they encounter the term 'monad' for the the first time, they decide to look it up, at least the curious bunch. But when they do consult the blogs, articles, Wikipedia, haskell documentations- explaining the term monad - they become deeply confused and angry at themselves for not being able to make any sense of it - at least people like me, whose brain is not bigger than the size of a peanut. They curse themselves and grok more blogs, articles, try to learn haskell in the hope that, may be this time they would be able to grasp it - but to no avail. Their brain hurts and they become even more confused. As they keep persisting in their efforts - they come across monads being described as elephants, space suits/burritos/fluffy clouds and what not! Some say - you need to know category theory to understand monad, some say no you do not - while people trying understand monad remain in a state of utter confusion for months or maybe years on end. This happened to me also - couple of years back(As I said before my brain is not bigger than a peanut!).

People say, "when you finally figure out what a monad is - you write a blog about it - that is a tradition". Well, I had decided to break with that tradition and not to write about it then - when that enlightenment happened! But I have now decided to go with tradition mainly because functional programming is being so ubiquitous and terms like monad/functors etc keep appearing all over the web and many people out their may be languishing to come to terms with monads. Hope this post would help them see what a monad is.

Then, why is monad so difficult to understand? I suppose there are couple of reasons for this. First, expectation on part of the person trying to understand monad. He expects monad to be self describing one whole thing. Once you read couple of blogs and articles, you expect to get a self explanatory picture of a concrete thing. That is a wrong expectation on the part of the learner. Monad is an abstraction of a design concept and there are different concrete implementations of this concept specific to particular problems. We need to keep this in mind.

Second, except for couple of blogs/presentations, people who try to explain monad to the uninitiated, are also, to some extent, to be blamed for the confusion they cause to the people trying learn monads. I just googled for "what is a monad" - opened three top links that came up.

This one straight away goes into examples like 'List comprehension', 'Input/Output', 'A parser' and ending with an example of 'Asynchronous programming'.The questioner was asking for a brief, succinct, practical explanation as to what a monad essentially is! Straight away jumping to some arbitrary examples does not help. Not to speak of the haskell syntax that may be unfamiliar to the user. So the user trying to learn monad become bewildered, deeply confused and perplexed. Here questioner's mistake was that - he asked for a a brief, succinct, practical explanation. That is a wrong expectation.

Second one from Wikipedia says, "In functional programming, a monad is a structure that represents computations defined as sequences of steps. A type with a monad structure defines what it means to chain operations, or nest functions of that type together. This allows the programmer to build pipelines that process data in steps, in which each action is decorated with additional processing rules provided by the monad". This one does not help. It sounds too abstract.

Another post mentions- "Douglas Crockford once said that monads are cursed – that once you understand monads for yourself you lose the ability to explain them to others". Then goes on to talk about 'Maybe' & JQuery being monad. This does not help either - why should we be not able to explain what a monad is? Maybe we do not understand monad or may be we do but partially!

All these and many more explanations have one thing in common. And that is that - they lack the motivation for monad and where it came from. That is where Brian Beckman: Don't fear the Monads and sigfpe shines. They talk about the motivation for monad.

So what is a monad?

In functional programming, functions are first class as is function composition. We build small functions and compose them to build bigger functions. So the way to tackle complexity is composition - it's of paramount importance. If functions do not compose, there arises an impedance mismatch. In pure functional languages like haskell function are considered to be mathematical- no side effect, no logging,
no exceptions. Same function called with same input should give the same output irrespective of who called it, when called it or however called it. Period. No deviation from this contract. But real world is not like this - Exception occur, latency happens, we need to handle unexpected result. So how do we account for these deviations? Well, taking recourse to monads of course! Monad is the vehicle we hop on to stay in course in face of these unavoidable deviations. Let's see some examples:

Example 1:

def getConnection(c: Credentials): Connection 

If we look at the signature of the method above carefully, we are lying about something! We are hiding something. What if the network is down or the database is down? We might not get a connection at all. We are not being explicit about this aspect of getConnection method failing to give us connection. That is where motivation for using or designing Maybe/Option monad comes from.

So the correct signature should be:

def getConnection(c: Credentials): Option[Connection]

If indeed getConnection returns a connection, we return Some(connection) or None if it fails. That we say beforehand so that caller may take alternate processing pipe line in the event of failing to get a connection.

While previous fails to compose, second composes and the signature speak out loudly what the contract is.

Example 2:

In scala, it is very idiomatic to do things like following:

 (1 to 10).toList.filter(_ % 2 == 0).flatMap(x => (0 to x).toList

Basically we like to keep piping or chaining output of one function call as the input to the other function till we compute our desired result(Like after each semi-colon we keep executing statement afer statement in imperative programming). But many very times output of one function call does not conform to the input of another function which is to say they do not compose. We need to coerce the output of a function to a format that can be fed as input to other function, we need to get rid of the impedance mismatch. That is the role flatMap/bind in monads. And each monad will have its own implementation of this method that is specific to the problem it is trying to solve. To drive home the point, we conjure up the following fictitious problem:

val urls = List("facebook.com", "twitter.com")

class Page

def getPage(url: String): Page = new Page

def outGoingLinks(page: Page): List[String] =List("bbc.com", "cnn.com")

What we want do to is this: we start with list of urls, crawl the pages corresponding to those urls &
get the total count of outgoing urls for all those pages. To achieve this we have defined our functions above. So following is the call to get the job done:

 urls.map(url => getPage(url)).flatMap(page => outGoingLinks(page)).size

res5: Int = 4

And we are done. 'outGoingLinks' function takes a page and returns a List[String]. This is the crucial point to understand. So, when all the pages(here two) considered, the result would be List(List[String], List[String]). But this not what we want - we want List[String]. This where output is deviating. But 'flatMap' apart from calling the 'outGoingLinks' function for each page also does one more thing. It concatenates the List[List[String], List[String]) to one List[String]. 'flatMap' implementation in the case of List monad just happens to be that it takes a function (something) -> List[something] and concatenates the resulting lists into a single List[something] - so that we can stay the course. 'flatMap' guides you to the happy path as Erik Meijer(of Rx/Link fame) says while teaching 'Principles of Reactive Programming' . It keeps you from falling off the cliff.

So, monad is an abstraction of a design concept in functional programming and the motivation comes from recognizing the fact that composition is important. There are various implementation of this concept like Try, Either, Future, Option etc.

Of course, there are monadic laws that monad should hold. But they are not a prerequisite to an understanding of what the essence of monads is. When monads hold all three monad laws, it is a guarantee that they will not misbehave. In fact, the Try monad above holds only two of the monad laws and but for all practical purposes, it behaves like a monad. 

As a last note, monads are not specific to functional programming alone. The .Net implementation Rx(Reactive extention) has been ported to java by Netflix and it is all monadic underneath. Its changing the the whole game of event/stream processing and how to write reactive applications that scales up to millions of user requests.

It has been a long post. But hope it helps people who are trying to get on with monad and removes some of the mists surrounding monads.

Friday, March 1, 2013

Fun with scala's for comprehension: Solving sudoku in 11 lines!

Scala's for loop(comprehension) may look deceptively similar to java's for loop, but it is much more powerful than java's for loop. While java's for loop is just to repeat some computation for specified number of times, scala's for comprehension has a mathematical underpining to it. They are analogous to set comprehensions, which are normally used for building more specific sets out of general sets. Shown below is a basic set comprehension of 10 even natural numbers.

 S = {2*x; x <- N; x <= 10}

The first part 2*x is called the output, N is the input set and x <= 10 is the predicate filter. The same translated to for comprehension along with output is shown below:

scala> val s = for {
     |  x <- 1 to Integer.MAX_VALUE
     |  if(x <= 10)
     | } yield 2*x

s: (2, 4, 6, 8, 10, 12, 14, 16, 18, 20)

Problem: Given a positive integer n, find all the pairs of integers (i,j) such that 1 <= j < i < n, and i+j is a prime number.

Such combinatorial problems lend themselves to be solved beautifully with for comprehensions. Let's first define a function that tells us if a number is prime or not!

We know a number is a prime if it's only divisors are 1 and the number itself. While in imperative style we would start out with how to do it, in functional style we state what the problem is! So here goes our definition of isPrime.

scala> def isPrime(n: Int) = (2 until n).forall(d => n % d != 0)
isPrime: (n: Int)Boolean


Now, to find all the pairs of numbers such that each two elements of the pairs sum up to a prime number, we just make use of for comprehensions and the isPrime function defined above:

scala> def findPrimePairs(n: Int) =
     |  for {
     |   i <- 1 until n
     |   j <- 1 until i
     |   if isPrime(i+j)
     |  } yield(i,j)
findPrimePairs: (n: Int) IndexedSeq[(Int, Int)]

scala> val primePairs = findPrimePairs(10)
primePairs: IndexedSeq[(Int, Int)] = Vector((2,1), (3,2), (4,1), (4,3), (5,2), (6,1), (6,5), (7,4), (7,6), (8,3), (8,5), (9,2), (9,4), (9,8))

Neat, elegant and concise.

Let's try one more problem before we delve into solving sudoku in less than 11 line!

Let's try find the numbers x,y,z below n such that x^2 + y^2 == z^2. Let's call the function rightAngledNumbers!

scala> def rightAngledNumbers(n: Int) =
     |  for {
     |   x <- 1 until n
     |   y <- x+1 until n
     |   z <- y+1 until n
     |   if(x*x + y*y == z*z)
     |  } yield (x,y,z)

rightAngledNumbers: (n: Int)IndexedSeq[(Int, Int, Int)]

val result = rightAngledNumbers(100)

result: = Vector((3,4,5), (5,12,13), (6,8,10), (7,24,25), (8,15,17), (9,12,15), (9,40,41), (10,24,26), (11,60,61), (12,16,20), (12,35,37), (13,84,85), (14,48,50), (15,20,25), (15,36,39), (16,30,34), (16,63,65), (18,24,30), (18,80,82), (20,21,29), (20,48,52), (21,28,35), (21,72,75), (24,32,40), (24,45,51), (24,70,74), (25,60,65), (27,36,45), (28,45,53), (30,40,50), (30,72,78), (32,60,68), (33,44,55), (33,56,65), (35,84,91), (36,48,60), (36,77,85), (39,52,65), (39,80,89), (40,42,58), (40,75,85), (42,56,70), (45,60,75), (48,55,73), (48,64,80), (51,68,85), (54,72,90), (57,76,95), (60,63,87), (65,72,97))

I am not sure how many lines of code will be needed, if we try to do the same thing in an imperative way!

Now, let's get back to solving sudoku in 11 lines. Let me clarify here - though our main logic of solving sudoku is confined to only 11 lines, yet the whole program is hardly but 11 lines. That is because, we have to do things like read the input from a file, index the cells with row, column and cell value etc. Also, we have to check on which box an empty cell falls along with a helper function to tell us if a particular value is allowed in an empty cell. Barring these, actual logic is, indeed confined to 11 lines. Let's first define few type aliases and show the function with 11 lines which solves the problem.

type Cell = (Int,Int,Int)

Cell represents a sudoku cell with ._1 being the value,._2 being the row and ._3
being column of the cell.

type Solutions = List[List[Cell]]

We can have more than one possible solution to a sudoku problem depending on how many cells are already filled in. That is why our Solutions type alias is List of List of Cells. Shown below is the function that solves the sudoku.

def fillCells(emptyCells: List[Cell]): Solutions = {
   if(emptyCells == Nil) List(List())
   else {
   for {
     filledCells <- fillCells(emptyCells.init)
     cellValue <- 1 to 9
     cell = (cellValue,emptyCells.last._2, emptyCells.last._3)
     if(isOk(cell, filledCells ::: nonEmpties))  
    } yield cell :: filledCells
   }}


Explanation: We first take all the empty cells. Out of these empty cells, take all but last one and call fillCells recursively, which in turn repeats the process until we hit the base case when we call fillCells(Nil) - this returns an empty list of list. Now call stack unfolds inside out, calling fillCells with one empty cell and trying fill that with one of the possible values in the range from 1 to 9 and checking if that value is ok(isOk) for that cell. If so, that cell is concatenated to the base case, giving us one partial solution. And process continues till we fill all the empty cells.

Shown below is couple of inputs and along with their solutions. The whole Sudoku code is to be found at:

https://github.com/ratulb/sudoku/

Problem 1:

1 9 0 2 0 0 8 0 0
4 0 7 0 9 0 0 0 0
0 3 0 0 0 1 4 0 0
0 0 3 0 0 0 0 0 2
9 8 0 0 0 0 0 7 3
2 0 0 0 0 0 6 0 0
0 0 5 3 0 0 0 1 0
0 0 0 0 4 0 9 0 6
0 0 9 0 0 6 0 5 4 


scala> val sol1: Sudoku.Solution = List(List((3,8,6), (8,8,4), (1,8,3), (2,8,1), (7,8,0), (2,7,7), (7,7,5), (5,7,3), (8,7,2), (1,7,1), (3,7,0), (8,6,8), (7,6,6), (9,6,5), (2,6,4), (4,6,1), (6,6,0), (9,5,8), (8,5,7), (5,5,5), (3,5,4), (4,5,3), (1,5,2), (7,5,1), (5,4,6), (2,4,5), (1,4,4), (6,4,3), (4,4,2), (4,3,7), (1,3,6), (8,3,5), (7,3,4), (9,3,3), (6,3,1), (5,3,0), (5,2,8), (9,2,7), (6,2,4), (7,2,3), (2,2,2), (8,2,0), (1,1,8), (6,1,7), (2,1,6), (3,1,5), (8,1,3), (5,1,1), (7,0,8), (3,0,7), (4,0,5), (5,0,4), (6,0,2)))

Problem 2:

0 0 6 9 0 0 3 0 1
0 0 4 0 3 1 0 0 7
2 0 0 0 0 0 0 6 0
1 0 0 0 0 0 0 0 0
0 2 0 7 0 3 0 8 0
0 0 0 0 0 0 0 0 3
0 4 0 0 0 0 0 0 2
3 0 0 1 7 0 9 0 0
8 0 9 0 0 2 1 0 0

scala> val sol2 = Sudoku.solution
sol2: Sudoku.Solution = List(List((6,8,8), (3,8,7), (4,8,4), (5,8,3), (7,8,1), (8,7,8), (4,7,7), (6,7,5), (2,7,2), (5,7,1), (7,6,7), (5,6,6), (9,6,5), (8,6,4), (3,6,3), (1,6,2), (6,6,0), (1,5,7), (7,5,6), (5,5,5), (9,5,4), (2,5,3), (8,5,2), (6,5,1), (4,5,0), (4,4,8), (6,4,6), (1,4,4), (5,4,2), (9,4,0), (5,3,8), (9,3,7), (2,3,6), (8,3,5), (6,3,4), (4,3,3), (7,3,2), (3,3,1), (9,2,8), (4,2,6), (7,2,5), (5,2,4), (8,2,3), (3,2,2), (1,2,1), (2,1,7), (8,1,6), (6,1,3), (9,1,1), (5,1,0), (5,0,7), (4,0,5), (2,0,4), (8,0,1), (7,0,0)))

Following listing shows the complete sudoku code.

import io.Source
object Sudoku {
  type Input = List[List[Int]]
  type Cell = (Int,Int,Int)
  type Solutions = List[List[Cell]]
 def readFile = {
  val lines = io.Source.fromFile("sudoku1.txt").getLines.toList
  lines.map { line => line.split(" ").toList.map(e => e.toInt)
 }
}
 def solution = {
  val in = readFile
  val cells = index(in)
  val empties = emptyCells(cells)
  val nonEmpties = nonEmptyCells(cells)
  solve(empties,nonEmpties)
 }
 def solve(empties: List[Cell], nonEmpties: List[Cell]): Solutions = {
  //Actual logic
  def fillCells(emptyCells: List[Cell]): Solutions = {
   if(emptyCells == Nil) List(List())
   else {
   for {
     filledCells <- fillCells(emptyCells.init)
     cellValue <- 1 to 9
     cell = (cellValue,emptyCells.last._2, emptyCells.last._3)
     if(isOk(cell, filledCells ::: nonEmpties))
    } yield cell :: filledCells
   }
  }
  fillCells(empties)
 }
 def isOk(c:Cell, cells: List[Cell]) = {
  (colCells(c,cells) ::: rowCells(c,cells) ::: boxCells(c,cells)).forall(e => e._1 != c._1)
 }
 def colCells(c: Cell, cells: List[Cell]) = cells.filter(e => e._3 == c._3)
 def rowCells(c: Cell, cells: List[Cell]) = cells.filter(e => e._2 == c._2)
 def emptyCells(cells: List[Cell]) = cells.filter(e => e._1 == 0) 
 def nonEmptyCells(cells: List[Cell]) = cells.filterNot(e => e._1 == 0) 
 def boxCells(c: Cell, cells: List[Cell]) = {
  val inBox = cellAt(c)
  val filtered = inBox match {
    case 1 => cells.filter{e =>(e._2 < 3 && e._3 < 3)}
    case 2 => cells.filter{e =>(e._2 < 3 && (e._3 > 2 && e._3 < 6))}
    case 3 => cells.filter{e => (e._2 < 3 && e._3 > 5)}
    case 4 => cells.filter{e => ((e._2 > 2 && e._2 < 6) && e._3 < 3)}
    case 5 => cells.filter{e => ((e._2 > 2 && e._2 < 6) && (e._3 > 2 && e._3 < 6))}
    case 6 => cells.filter{e => ((e._2 > 2 && e._2 < 6) && (e._3 > 5))}
    case 7 => cells.filter{e => (e._2 > 5 && e._3 < 3)}
    case 8 => cells.filter{e => (e._2 > 5 && (e._3 > 2 && e._3 < 6))}
    case 9 => cells.filter{e => (e._2 > 5 && e._3 > 5)}
  }
 filtered
 }
 def index(in: Input) = in.zipWithIndex.map{r => r._1.zipWithIndex.map { c => (c._1, r._2, c._2) }}.flatMap(l => l) 

 def cellAt(c: Cell) = c match {
   case (_,x,y) if(x < 3 && y < 3) => 1
   case (_,x,y) if (x < 3 && ( y > 2 && y < 6)) => 2
   case (_,x,y) if (x < 3 && (y > 5 && y < 9)) => 3
   case (_,x,y) if ((x > 2 && x < 6) && y < 3) => 4
   case (_,x,y) if ((x > 2 && x < 6) && (y > 2 && y < 6)) => 5
   case (_,x,y) if ((x > 2 && x < 6) && (y > 5 && y < 9)) => 6
   case (_,x,y) if (x > 5 && y < 3) => 7
   case (_,x,y) if (x > 5 && (y > 2 && y < 6)) => 8
   case _ => 9
 }
}



Conclusion: Scala's for comprehensions might look very ordinary - but they are not. They turn out to be extremely handy while trying to solve combinatorial problems like the ones shown above. Like many other constructs of functional programming, for comprehensions allow programmers to tackle complicated problems from a higher level of abstraction.










Monday, January 21, 2013

Recovering from scala delimited continuation encounter

As I have said before, delimited continuations are wild beasts. Its one of those things that bends and hurts the mind. You might have to spend months in a state of delirium trying to figure them out! In the scala-lang.org's delimited continuation page, one reader commented and I quote, "From my perspective, these are convoluted ways of adding numbers and I have no idea what is being gained or accomplished". Reading his comment - I had laughed my heart out -really! I could understand his frustration - even though my plight was only marginally better! So, if this is the first time you are hearing about them - maybe you should do some googling for them before going through rest of the stuff in this post. Once you are sure that you are confused enough - you should take some rest and then start with this post. Hopefully, once you are through with this post - some of the mists would disappear and you will be better equipped to explore them further. So, without further ado, let's get to them.

What is a continuation?

This is a rather simple idea. Its any computation that has a starting point, a body that the computation runs through and a exit point. It could be a servlet request-response cycle, where start point is when the request is received by the container, the whole rest of the compuation being the proccessing and rendering of the response by the container. Or in even more simplistic terms, it's just a method invocation. Starting with the call, till the method returns, is the full continuation. So you get the idea - it is nothing complicated - a routine idea. So, in the interest of keeping things simple, we would stick to this last defintion i.e. the method body is a continuation.

CPS:

Now, let's say our method takes a parameter of type `A` and produces a result of type `C`. Now, imagine, when the execution reaches some arbitray point `P`, we slice the method body into two halves with our mental scissor. Let us also presume that - at the moment we cut the method at point P, the first half was producing a value of type `B` and that value would have been passed on to the second half, as its input, had we not made the cut. If we think of our two halves as two methods, one taking an A and producing a B and another taking a B and producing a C and we combine them sequentially(telling the first method to feed its output to the second) - what we are effectively doing is - passing a continuation to the first method telling it not to return instead feed its produce to the supplied function. So, as it turns out, CPS(continuation passing style) is again a rather simple idea!

Delimited continuation:

In the above example, we passed a method to the first method which respresented the whole rest of the computation. Once the supplied continuation(method representing the second half) has run its run, we have a final result of type C. What if instead of passing the whole rest, we could pass a function(or some snippet of code) that would serve as partial rest? So, delimted or partial continuation denotes portion of the whole rest instead of the whole rest itself. This is important to remember.

Delimited continuation in scala:

Deviation:[Delimited continuation in scala is achieved via code   transformation. This transormation is done by a compiler plugin(enabled by passing `-P:continuations:enable` flag). So, wherever the plugin sees reset/shift combinations, those code sections are CPS transformed.]


The extent to which, the continuation reaches is denoted by `reset`.`shift` can occur only within a reset. Using shift, we basically say, transform the code surrounded by the inner most reset into a function body. Input and output types of the transformed function, of course, should line up with those specified in the shift signature. The reified function we may call, call as many times as we like, store it somewhere and call later or throw it out of the window. Its all upto us. This is all there is to in delimited continuation in scala. Lets now look at some examples which would help straighten these ideas.


scala -P:continuations:enable

import util.continuations._

def fun = { println("I am not part of reset. Hence not part of the reified function.")
   reset {
      shift {k: (Int => Int) => //Reify reset body into Int => Int function
        println("Calling the reified function with 2")
        k(2)
     } * 300 //reset body _ * 300
  }

  println("I am outside reset - hence not part of the delimited continuation")

 "Delimited continuations are hallucinating" // return value of fun
}

fun: String

scala> val v = fun
I am not part of reset. Hence not part of the reified function.
Calling the reified function with 2
I am outside reset - hence not part of the delimited continuation
v: String = Delimited continuations are hallucinating

Shown below is the same function as above without comments and print statements. Also, we store the reified function in the cont variable. We are also not calling the continuation i.e. the reified function.


var cont: Int => Int = _

def fun = {
   reset {
     shift {k: (Int => Int) =>
      cont = k //Store it
     } * 300
  }

 "Delimited continuations are hallucinating" // return value of fun
}


fun: String

scala> val v = fun
v: String = Delimited continuations are hallucinating

scala> val v = cont(2)
v: Int = 600


Programming inside out:

Its illuminating to think about what would happen if we were to nest resets within shifts. Turns out that inner resets are evaluated before the outer ones! Why would it behave like that? Well, as we now know, resets are reified into functions or closures. So when the outer reset is being reified, if there is any nested reset inside shift, that needs to be reified as well - and that has to happen before so that we are able to reify outer ones based on the inner ones. Some people call it programming inside out or inverted. Shown below is an example.

def fun = {
  reset {
    shift { k1: (Int => Int) =>
     {
      println("A")
      reset {
        shift { k2: (Int => Int) =>
         println("C")
        } * 200
      }
     }
     println("B")
    } * 300
  }
}

scala> fun
A
C
B

Possible use cases for delimited continuation:

Though scala delimited continuations have been out there for quite some time now(they came with scala 2.8) - their adoption has not quite taken off. One reason being, at least, at first glance, they are quite hard to understand. But once, more more developers become more comfortable with them, their usage would definitely grow. The biggest apeal of delimited continuations' is that we can pause a running program and rerun it later. This  opens the door for event based programming and workflows. Since rest of the running program gets converted into a closure, we can transmit the closure accross the wire and rerun it at some other location. `Swarm` is an example of this. In the near future, we are definitely going to see some novel applications of delimited continuations.



Monday, January 14, 2013

Scala for the uninitiated java developer: 22 scala things you may not have known



Scala's popularity is rising among developers. Yet, many java developers have hardly paid any attention to it - thinking it is some sort of esoteric programming langauge isolated from the java main land. They are the target audience of this post. This post is not a tutorial - it presents the scala features with minimum technical detail and tries to show how java and scala reside under the same roof(JVM).

First up - scala is a statically typed language that runs on the jvm and fully compatible with java. That means scala can use java classes and vice versa. It combines object oriented programming with functional programming - which means functions are treated on par with things like strings, objects, Integers etc. You can toss them around like you toss around an int as method parameter or a return value from a method. Scala is the brain child of Prof. Martin Odersky - the same guy who was hired by Sun to write the java compiler in those back old days. Big players like amazon, twitter, linkedin, NASA etc have already adopted scala. Martin Odersky taught a 7 week long online(cousera.org) scala course which ended in Nov 2012 - the course was attended by some 50,000 students! That goes on to show the rising popularity of scala. With that background, let's see what are the featues that scala has to offfer.   


1) Seamless integration with Java

Scala has been designed for seamless interoperabilty with java. Scala codes get compiled to JVM byte codes. Scala can make use of existing java libraries. Scala code can call java methods, access java fields, extends java classes and implement java interfaces without any wrapper classes or glue code. Scala not only re-uses java's types -it also dresses them up. Following is perfectly vaild scala code.

import java.util.ArrayList

val users = new ArrayList[String]
users.add("me")
users.add("you")
val size = users.size
size: Int = 2

val res = 100 > 90 // Dressed up java int with `>` method
res: Boolean = true


2) Scala is concise

In general scala programs are half the size of typical java programs. Fewer lines of code not only means less typing but also less bugs, less effort to develop and understand. Scala becomes less noisy and less boiler-platey by avoiding unnecessary keywords(like classes are public by default - hence no need for `public` keyword), making the body of the class definition the primary constructor, making semi colons optional and in most of the cases - by inferencing types, by allowing call to parameterless methods without parenthesis. Shown below are two classes with same functionality:

// this is Java
public class MyClass {
  private int index;
  private String name;

  public MyClass(int index, String name) {
    this.index = index;
    this.name = name;
    System.out.println("Constructor initialized")
  }
}

//This is scala
class MyClass(index: Int, name: String) {
  println("Constructor initialized")
}

// this is Java
boolean nameHasUpperCase = false;
  for (int i = 0; i < name.length(); ++i) {// name is String
   if (Character.isUpperCase(name.charAt(i))) {
     nameHasUpperCase = true;
     break;
   }
}

// this is scala
val nameHasUpperCase = name.exists(_.isUpper)

So much less noisy and less ceremony!


3) Traits - Eat your cake and have it too

We have been talking about component based, write once run anywhere software more than a decade now. But in reality, software is still a craft rather than a industry. We don't build software from reusable components. We keep re-implementing same interface again and again - violating the DRY principle. What if we could implement `run` method for running junit testcases once for all? Traits let's do that. Not only that, it also allows a scala class to inherit behaviour from multiple traits without leading to any diamond problem! Is not that the best of the both worlds? You bet - In deed it is!

class TestCase {
  def setup = println("setting up")
  def testMethod1 = println("Test1")
  def testMethod2 = println("Test2")
  def tearDown = println("shutting down")
}

trait TestRunner {
 self: TestCase =>
 // Find the test methods via reflection and execute them
 // Hard coded here!
 def run = {
   setup
   testMethod1
   testMethod2
   tearDown
 }
}

val tests = new TestCase with TestRunner
tests.run

setting up
Test1
Test2
shutting down

In the following example we add `product` and `add` method to ArrayList.

import java.util.ArrayList

trait Product {
 self: ArrayList[Int] =>// I like to mingle with ArrayList
 def product = {
  var p = 1
  for(i <- 0 until size)
   p = p * get(i)
  p
 }
}

trait Add {
 self: ArrayList[Int] =>
 def add = {
  var s = 0
  for(i <- 0 until size)
   s = s + get(i)
  s
 }
}

val xs = new ArrayList[Int] with Product with Add

xs.add(100)
xs.add(200)
xs.add(300)

xs.add // 600
xs.product // 6000000

Needless to say, these are very simplified usage of traits but they show how code reuse can be achieved.


4) Everything is objects

Scala is purely object oriented, in fact, more object oriented than java. Java distinguishes primitive types (such as boolean and int) from reference types - scala does not.So + is a method of the Int object and you can write 2.+(3) instead of 2+3. Methods are allowed to be operators, so that you can write 2+3 instead of 2.+(3). Scala functions are also objects which is why we can toss them around like any other value.


5) Singleton objects

In the spirit of being purely object oriented, scala avoids arcane concepts like static classes and members because they are not so object oriented! What scala has instead is singleton objects. They are automatically instantiated when accessed for the first time. They provide a nice way of modularising programs. Shown is an example of singleton object defintion.

object MySingleton {
  println("I am being materialized")
  def doIt = println("Yes sirrrr!")
}

MySingleton.doIt
I am being instantiated
Yes sirrrr!


6) Structural equality

Comparing to values for equality is every day business in programming. While java promotes referential equality, scala prefers structural equality with the commonly used `==` operator. We can check referential equality with the `eq` method though. Shown below is an example:

val list1 = new ArrayList[String]
list1.add("apple")
list1.add("banana")
list1.add("orange")

val list2 = new ArrayList[String]
list2.add("apple")
list2.add("banana")
list2.add("orange")

list1 == list2 //returns true

list1 eq list2 //returns false


7) Implicit conversion

This is a double-edged sword. While outrageously useful for extending and adapting existing and new types, care must to be taken since overuse might lead to ambiguity. They also support writing DSLs in scala. Shown below is an example where we add a new method to java String class without even touching it!

class CamelCaser(val str: String) {
  def toCamelCase =str.split(" ").map(s=> s.head.toUpper + s.tail.toLowerCase).mkString(" ")
}

implicit def camelCase(s: String) = new CamelCaser(s)

"tHIS JAVA STRING iS being CONVERTED to camel case BY IMPLICT conversion".toCamelCase

Result: This Java String Is Being Converted To Camel Case By Implict Conversion

8) Pattern matching

Scala has pattern matching, a functional programming technique that hasn't been seen before in mainstream languages.Pattern matching is switch statements on steroids. While java has recently added support for strings in switch statements - scala allows to pattern match on any sort of of data! Is not that awesome? You can pattern match on even user defined types! Just put the `case` keyword before your type definition and start match making as you like!


def whatIsIt(thingy: Any ) = thingy match {
 case (x,y) => println(x + " and "+ y)
 case (x,y,z) => println(x + ", "+ y + " and " + z)
 case s: String if s.length < 10 => println("Its a short string : "+ s)
 case s: String => println("Long strings are sooo long: "+s)
 case l:List[String] => println("Got a list of String")
 case _ => println("I am not programmed to receive you!")
}

whatIsIt((10,"big"))//prints 10 and big
whatIsIt(new Object)//I am not programmed to receive you!

case class Name(first: String, last: String)

case class Address(name: Name,street: String, city: String, state: String, zip: Int)
val name=Name("Mr.","Cowboy")

val address = new Address(name,"Luna road", "Dallas ","Texus", 75201)
address match {
 case Address(Name(first, last), street, city, state, zip) => println(last + ", " + zip)
 case _ => println("not an address") // the default case
}

// prints Cowboy, 75201


9) Option & Tuples

We need to check for Null references and guard against them all the time.It is such a nuisance. Programmers abhore them. That is where Option type comes handy. Option can be in two states - it has either something(Some(x)) or nothing(None). We can easily pattern match on option types.

val connection = Some(getConnection(url,user,password))

connection match {
  case Some(con) => println("Connected")
  case None => println("Opening connection")
}

Another very useful type is the tuple type. Tuples can contain different types of elements - And this makes life whole lot more easier while writting code! Bye bye value classes and pojos! Life does not get better than this - atleast not for me!

val item =("Mango", "fruit", 2.5, 10)
item: (String, String, Double, Int) = (Mango,fruit,2.5,10)

val quantity = item._4 //10

val emp =(101, "MARTIN", "SALESMAN", 7698, "28-11-81", 1400,30)
emp: (Int, String, String, Int, String, Int, Int) = (101,MARTIN,SALESMAN,7698,28-11-81,1400,30)

emp._5 //returns 28-11-81



10) Rich collection APIs

Scala has a common, uniform, and extremely useful collection framework. They are easy to use, concise, safe and performant.A small vocabulary of 20-50 methods is enough to solve most collection problems in a couple of operations.You can achieve with a single word what used to take one or several loops.

Example:

Here's one line of code that demonstrates many of the advantages of Scala's collections.

val (minors, adults) = people partition (_.age < 18)

Concise and elegant!


11) For expression

In scala everything is an expression. Every expression eveluates to something. `If` is an expression and has to return something. The last expression of a method is the return value of the expression. If a method does not return anything, it still returns `()` which is the instance value of `Unit`(Unit is sort of java `void` type). For expressions are not exception. They accept generators and optional guards and yield something.Also, any type that defines `map` can be used as a single genrator in a for expression. A type with map and `flatMap` allows for being used as multiple generators. Types with map, flatMap & `withFilter` can accept `if` conditions(guards) in for expression. And finally, types with `foreach` method can be used as for loop. Shown below is a usage of for expression used to count the occurrence of a particular word in the .txt files underneath the current directory.

import java.io.File
import io.Source

def wordCount(word: String) = {
  def perLine(line: String) = line.split(" ").count(w => w.equalsIgnoreCase(word))
 
  def lines(f: File) = Source.fromFile(f).getLines.toList

  val files = new File(".").listFiles
  val counts = for {
    file <- files //Generator
    if file.isFile && file.getName.endsWith(".txt")  //Guard
    line <- lines(file)
  } yield perLine(line)
  counts.sum
}


12) Asynch concurrency with futures

Like java scala has futures but unlike java they are much more pleasant to deal with. We can combine multiple futures together. You wrap your codes inside futures and let them loose out in the wild. Scala will turn back and do the rest. No submitting to thread pools and checking to see if they have completed what you told them to do. Really!


val f = future {
 println("Going to do some time consuming stuff")
 Thread.sleep(1000)
 println("Done the stuff and returning `Some result`")
 "Some result"
}

f onComplete {
 onDone
} // Future returned : Some result
   

def onDone(r: Try[String]): Unit = r match {
 case Success(v) => println("Future returned : " + v)
 case Failure(e) => println("Something gone wrong :"+e)
}


13) Scala is functional

This is where scala has raised the bar for future programming language development. Before scala, functional and object oriented programmers did not see eye to eye. Scala broke with tradition and unified functional and object oriented programming, after all, they both have their own sweet spots.

Scala treats functions on par with other values like int, boolean objects. As a consequence, we can store a function in a variable, pass it as argument to some other function and return a function from a method call. This helps a lot in composing bigger things from smaller parts and vice versa. Functions accepting/returning functions are called higher order functions.

Let's define a function that adds the integers between a and b.

def addInts(a: Int, b: Int): Int =
 if(a > b) 0 else a + addInts(a+1, b)

addInts(10,20) // 165

Now let's define another function that adds the squares of integers between a & b.

def addSquares(a: Int, b: Int):Int =
 if(a > b) 0 else a * a + addSquares(a+1, b)

addSquares(10,20) //2585

Now, we define two functions with following signatures.

def id(a: Int) = a
def square(a: Int) = a * a

Now, let's define a generic sum function which take a function as a parameter.

def sum(f: Int => Int, a: Int, b: Int):Int =
 if(a > b) 0 else f(a) + sum(f, a+1, b)

We can now write:

def addInts(a: Int, b: Int) = sum(id,a,b)

def addSquares(a: Int, b: Int) = sum(square,a,b)

We just touched upon the functional side of scala. We can nest functions(refer to `For expressions`), partially or incrementaly apply arguments to get back partially applied or curried functions. We can define partial functions that are not defined for some input values. We can also create closures and toss them around. We can even pattern match on functions.

What follows is an example of function composition:

def str(a: Int) = String.valueOf(a)
def len(s: String) = s.length
def fun(f: Int => String, g: String => Int) = g compose f
val digits = fun(toStr, strLen)
val numOfDigits = digits(10000)
numOfDigits: Int = 5


14) By name parameter

In normal cases, arguments to functions are evaluated and then passed along. But in case of by name parameters, the arguments are evaluated late at the call site. They come very handy for logging frameworks etc where we don't want log messages to be evaluated in case logging is turned off.

var loggingOn = true

def log(stmt: => Unit) =
 if(loggingOn) stmt else ()

log { println("A log message") } // A log message

loggingOn = false

log { println("A log message") } // Prints nothing

In a by name parameter, the type of argument is preceded by `=>`.


15) Structural types

Structural types are types having similar members. For example, we might want to be able to close a resource without caring whether it is a database connection or a file handle. So, given a resource we want do some operation on it and finally close the resource calling the `close` method on it. We also want to return the operation result. Let's call our method `withResource`. The type of the resource can be anything as long as it has a `close` method. Similarly, type of the operation result can be anything. Hence, `withResource` will have to have two type parameters. Shown below is the signature:

 def withResource[T <: { def close}, R](resource: T)
   (operation: T => R) = {
   val result = operation(resource)
   resource.close
   result
 }


16) Actor based concurrency

Actors are objects which encaptulate state and behaviours, and communicates exclusively via message passing. The messages are placed into the recipients' mail boxes. Messages are passed on to the `receive` method of an actor. Multiple actors piggy back on a single thread. Actors provide you following:

  • Simple and high-level abstractions for concurrency and parallelism.
  • Asynchronous, non-blocking and highly performant event-driven programming model.
  • Very lightweight event-driven processes (Order of millions of actors per GB RAM).

Shown below is a simple actor printing out the received message.

import akka.actor._
val actorSystem = ActorSystem("actorsystem")

class MyActor extends Actor {
 def receive = {
  case x => println("Received : "+ x)
 }
}

val myActor = actorSystem.actorOf(Props[MyActor])
myActor ! "Hi" // Received : Hi

17) Delimited continuations

Delimited continuations are wild beasts. Its one of those things that bends and hurts the mind. You might have to spend months in a state of stupor trying to grasp what the heck they are!

It does help if I told you delimited continuations are segments of a program flow that has been reified into functions. The `reset` sets the limit for the continuation while the `shift` reifies the current continuation up to the innermost enclosing reset.

var cont: Int => Int = _

def f = {
 reset {
  shift { k: (Int=>Int) =>
    cont = k //Save it for calling later
    k(7) //calling the continuation
  } + 1
 } //Continuation boundary ends
 println("Not part of reified continuation")
}

f // Not part of reified continuation
cont(100) // 101

As mentioned before, delimited continuations take some efforts to get used to. This post is not about teaching stuffs - its about showing features of scala with minimum technical details.

18) Macro

Scala macros are functions that are called by the compiler during code compilations. Macros are written in scala and they work with expression trees rather than raw strings. They can generate and typecheck code and have access to compiler APIs.


19) Compiler plugin

Scala compiler has an architecture that allow for custom compiler plugins to be invoked during compilation. A plugin adds an extra phase in the compilation process - wherein it might do extra type checks, code generation etc. Delimited continuations are implemented via a compiler plugin in scala.


20) In-built xml support

Scala has inbuilt xml support. You can easily create, parse, and process XML documents in scala.

val date = new java.util.Date
val xml = <today> {date} </today>
xml: scala.xml.Elem = <today> Mon Jan 14 15:54:44 IST 2013 </today>

xml \\ "today"
scala.xml.NodeSeq = NodeSeq(<today> Mon Jan 14 15:54:44 IST 2013 </today>)


21) Combinator Parsing

This is one good thing parser and DSL writters would love. No need to roll out your own - which is difficult - if you are not an expert - or fall back on something like javacc or antlar etc. Scala combinator parsers greatly simplifies parsing your DSL or embedded language.


22) GUI Programming

Scala provides a GUI library based on java swing classes but hides many of the complexities of swing APIs. The scala GUI APIs are much easier to deal with.





Friday, December 28, 2012

Iteratee fundamentals for the beginer


Iteratees are an alternative to commonly prevalent, imperative blocking IO. While reading from a stream, if is data is not available - the reading thread blocks - which is not desirable. The alernative would be to register a callback which gets invoked as data become available. Iteratees are callback handlers and much more. If you have heard about them but not sure what they are - then this post is for you.

Simply put, an iteratee is a consumer of data in chunks and it's counterpart, which is called `Enumerator`- is a producer of data. The imperative way of reading data conforms to pulling data out in chunks from the source. But in an iteratee based system, the producer pushes data into an iteratee in successive chunks, which finally computes a value. Iteratee's state changes as it receives chunks of data. Or to be more precise(since state change is a side-effect and side-effects don't compose) - when an iteratee receives a chunk of data - the iteratee is replicated with it's new state being computed from the old iteratee's state and chunk of input that it received. Also, with iteratees its a two way conversation between the producer and the consumer(i.e. the iteratee). The producer might hand out a chunk of data to the iteratee or it might say, I'm Empty now - but hang on - I will feed you as soon as I get a chunk in the near future or it might say - I have run out of data(EOF) - you make a call what to do next.

On the other hand, the iteratee might say - I am ready for next chunck(Cont), or I have had enough(Done) or it might throw up(Error) - because it is monday. I will not talk about Error anymore because ours' is an ideal world and it does not add any insight to the understanding of iteratees.

So, when an iteratee gets a chunk of input, it replicates itself transitioning from one state to other. It might transition to a `Done` state which would contain a computed value and possibly some unconsumed input. The iteratee might transition to a `Cont` state which would hold a function waiting to be invoked with next chunk of input once it arrives. Or the iteratee might enter into a Error state which might hold a error message and possibly the chunk of input that caused it to error out.


I have been talking about iteratees in the context of IO streams. For understanding's sake lets consider Lists as our source of data. So the examples I would develop would use Lists instead of streams. Once we get the idea of how iteratees behave - it should not be difficult to relate Lists to streams.

So, based on the ponderings so far, two types emerge. One is the input and another is the state of the iteratee. We parameterize on the element type of the input because each chunk of data could represent a byte, word, event or anything. So the types are:

scala> trait Input[+E]
defined trait Input


scala> object Empty extends Input[Nothing]
defined module Empty

scala> //Producer has finished producing

scala> object EOF extends Input[Nothing]
defined module EOF

scala> //The producer has produced a chunk

scala> case class Elem[+E](e: E) extends Input[E]
defined class Elem

Next up, we define the iteratee itself anlong with the various states it can be in after it receives a chunk of input. We paramterize the iteratee with `E` and `A` where former and later being the type of input it consumes and value it computes respectively. We also add a run method to our iteratee to extract the computed value. If our iteratee is already in the Done state then - we return the value inside it. If on the other hand, the iteratee is still in the Cont state, we send a EOF signal to it to indicate that we are interested in the value it has computed thus far.

scala> :paste
// Entering paste mode (ctrl-D to finish)

trait Iteratee[E,+A] {
    def run: A = this match {//run will get us the  result computed thus far - sending a EOF to itself if needed
      case Done(a, _) => a
      case Cont(k) => k(EOF).run
    }
}

//Done holds computed result of type A and input it may not have consumed
  case class Done[E,+A](a: A, next: Input[E]) extends Iteratee[E,A]
  //Cont state holds a function, which given an input, would return a new iteratee instance(Done or Cont)
  case class Cont[E,+A](k: Input[E] => Iteratee[E,A]) extends Iteratee[E,A]


// Exiting paste mode, now interpreting.

defined trait Iteratee
defined class Done
defined class Cont

We have said before that it is the job of the producer(aka the enumerator) to feed the iteratee its produce in chunks. To keep things simple lets write an enumerate function instead of a full-blown enumerator. In the enumerate function below, the produce is held in a List.

 scala> :paste
// Entering paste mode (ctrl-D to finish)

 def enumerate[E,A](produce: List[E], itr:Iteratee[E,A]): Iteratee[E,A] = {
     produce match {
       //No produce - return the Iteratee as it is
       case Nil => itr
       case e :: elems => itr match {//produced an elem/chunk
         case i@Done(_,_) => i//if Done - return current Iteratee
         case Cont(k) => enumerate(elems, k(Elem(e)))//Not yet `Done` continue feeding chunks of produce
       }
     }
  } 


// Exiting paste mode, now interpreting.

enumerate: [E, A](produce: List[E], itr: Iteratee[E,A])Iteratee[E,A]


Iteratees can come in different categories - some would take finite chunks of input and then they would be in Done state. Iteratees that take the head of a List and returns it or drops few elements and then returns the rest of the List would fall in this category. On the other hand some would consume the entire List and then return a result - iteratees that sum up the List elements would fall in this category. Some other iteratees never enter the Done state even after receiving an `EOF` signal - these iteratees are termed as divergent iteratees. Below are shown few example iteratees.


An iteratee which returns the head from an enumerator's produce:

scala> :paste
// Entering paste mode (ctrl-D to finish)

def head[E]: Iteratee[E, Option[E]] = {
    def step[E](in: Input[E]): Iteratee[E, Option[E]] = in match {
  //Got an elem - return a Done iteratee right away
  case Elem(e) => Done(Some(e),Empty)
  //Cont iteratee waiting for an input
  case Empty => Cont(step)
  case EOF => Done(None, EOF)
}
  Cont(step)
  }

// Exiting paste mode, now interpreting.

head: [E]=> Iteratee[E,Option[E]]

scala> val v =  enumerate(List(1,2,3), head[Int])
v: Iteratee[Int,Option[Int]] = Done(Some(1),Empty$@ade4cd)

scala> val result = v.run
result: Option[Int] = Some(1)

Iteratee that computes the length of the produce of an enumerator:

scala> :paste
// Entering paste mode (ctrl-D to finish)

def length[E]: Iteratee[E,Int] = {
  def step[E](acc: Int): Input[E] => Iteratee[E,Int] = {
    case Elem(e) => Cont(step(acc+1))
    case Empty => Cont(step(acc))
    case EOF => Done(acc, EOF)
  }
  Cont(step(0))
}

// Exiting paste mode, now interpreting.

length: [E]=> Iteratee[E,Int]

scala> val lenItr = enumerate(List(1,2,3,4,5,6), length[Int])
lenItr: Iteratee[Int,Int] = Cont(<function1>)

scala> val len = lenItr.run
len: Int = 6

Iteratee that will return a List containing every alternate element starting with the first:

scala> :paste
// Entering paste mode (ctrl-D to finish)

def alternate[E]: Iteratee[E, List[E]] = {
  def step(flag: Boolean, xs: List[E]): Input[E] => Iteratee[E, List[E]] = {
  case Elem(e) if(flag) => Cont(step(false,xs ::: List(e)))
  case Elem(e) if(!flag) => Cont(step(true, xs))
  case Empty => Cont(step(flag,xs))
  case EOF => Done(xs, EOF)
}
  Cont(step(true,Nil))
}

// Exiting paste mode, now interpreting.

alternate: [E]=> Iteratee[E,List[E]]

scala> val altItr = enumerate(List(1,2,3,4,5,6,7), alternate[Int])
altItr: Iteratee[Int,List[Int]] = Cont(<function1>)

scala> val altList = altItr.run
altList: List[Int] = List(1, 3, 5, 7)


Conclusion: I have just shown the basics of iteratees. Frameworks like Play 2 - have taken the iteratees to a whole new level combining them with scala futures and executing them asynchronously. As web applications are becoming more and more real-time data centric, iteratees provide yet another tool in the arsenal of developer to scale up web application.  

Monday, December 24, 2012

Basics of Actors in akka/scala part 2


Hot swap of akka's actors behaviour

Akka actors' receive method accepts a partial function from Any to Unit(i.e. PartialFunction[Any, Unit]) and this function can be dynamically changed at runtime. Shown below is a simple example of that:

scala> class MyActor extends Actor {
     |   def receive = {
     |     case s: String => println("Current behaviour : "+ s)
     |     case pf: PartialFunction[Any, Unit] => context.become(pf)
     |   }
     | }


scala> val pf:PartialFunction[Any, Unit] = {
     |   case s => println("Behaving differently : "+ s)
     | }
pf: PartialFunction[Any,Unit] = <function1>



scala> val system =ActorSystem("system")
system: akka.actor.ActorSystem = akka://system

scala> val ma = system.actorOf(Props[MyActor])
ma: akka.actor.ActorRef = Actor[akka://system/user/$a]

scala> ma ! "Testing"
Current behaviour : Testing

scala> ma ! pf

scala> ma ! "Testing again"
Behaving differently : Testing again

Complementing `become`, the context also provides an `unbecome` which reverts the actor's behavior to the previous one in the hotswap stack.

Router Actor

A router is just an actor that routes incoming messages to other actors. Akka has a default set of routers for various use cases. Shown below is a router which routes messages to routees in a round robin fashion.

scala> class Routee extends Actor {
     |   def receive = {
     |     case x => println(self.path + " : received : "+ x)
     |   }
     | } 
defined class Routee

scala> val routee1 = system.actorOf(Props[Routee],name="routee1")
routee1: akka.actor.ActorRef = Actor[akka://system/user/routee1]

scala> val routee3 = system.actorOf(Props[Routee],name="route3")
routee3: akka.actor.ActorRef = Actor[akka://system/user/route3]

scala> val routee2 = system.actorOf(Props[Routee],name="route2")
routee2: akka.actor.ActorRef = Actor[akka://system/user/route2]

scala> import akka.routing.RoundRobinRouter
import akka.routing.RoundRobinRouter

scala> val routees = Vector[ActorRef](routee1, routee2, routee3)
routees: scala.collection.immutable.Vector[akka.actor.ActorRef] = Vector(Actor[akka://system/user/routee1], Actor[akka://system/user/route2], Actor[akka://system/user/route3])

scala> val router = system.actorOf(Props().withRouter(RoundRobinRouter(routees = routees)))
router: akka.actor.ActorRef = Actor[akka://system/user/$b]

scala> router ! "testing"

scala> akka://system/user/routee1 : received : testing


scala> router ! "testing"
akka://system/user/route2 : received : testing

scala> router ! "testing"
akka://system/user/route3 : received : testing


Remote actors:

As mentioned in the previous post, akka actors shipped with scala does not support remoting by default. We need to add addtional jars from the akka distribution. Following are the jars that need to be in the class path:

  • akka-remote_2.10.0-RC5-2.1.0-RC6.jar
  • netty-3.5.8.Final.jar
  • protobuf-java-2.4.1.jar

The first step is to create a conf file called `application.conf` with following content:

 akka {
   actor {
     provider = "akka.remote.RemoteActorRefProvider"
   }
   remote {
     transport = "akka.remote.netty.NettyRemoteTransport"
     netty {
       hostname = "127.0.0.1"
       port = 2552
     }
   }
 }

This file should be available in the classpath. We launch  two scala consoles, from two terminals, making a copy of the `application.conf` and changing port number to 2553 in one of them.


class RemoteActor extends Actor {
 def receive = {
   case msg => println("Received : "+ msg)
               sender ! "Got : "+ msg
 }
}

usr@ubuntu:~/akka/remoting$ scala -cp ./:$AKKA_LIB/akka-remote_2.10.0-RC5-2.1.0-RC6.jar:$AKKA_LIB/netty-3.5.8.Final.jar:$AKKA_LIB/protobuf-java-2.4.1.jar
Welcome to Scala version 2.10.0-RC5 (Java HotSpot(TM) Server VM, Java 1.7.0).
Type in expressions to have them evaluated.
Type :help for more information.

scala> import akka.actor._
import akka.actor._

scala> class RemoteActor extends Actor {
     |  def receive = {
     |    case msg => println("Received : "+ msg)
     |                sender ! "Got : "+ msg
     |  }
     | }
defined class RemoteActor

scala> val remoteSystem = ActorSystem("remotesystem")
remoteSystem: akka.actor.ActorSystem = akka://remotesystem


scala> val remoteActor = remoteSystem.actorOf(Props[RemoteActor],name="remoteactor")
remoteActor: akka.actor.ActorRef = Actor[akka://remotesystem/user/remoteactor]

scala> remoteActor ! "local msg"
Received : local msg

We send a message the actor locally - it receives the message and prints it but reply does not get printed because the sender(i.e. REPL) is not an actor - so the reply goes to the dead letter actor.

Now we go to the second scala console create an actor and look up the actor created before and send a message to it.

scala> class MyActor extends Actor {
     |    def receive = {
     |      case x: String => println("Received : "+ x)
     |      case (a: ActorRef, x: String) => a ! x
     |    }
     | }
defined class MyActor

scala> val system = ActorSystem("system")
system: akka.actor.ActorSystem = akka://system

val ma = system.actorOf(Props[MyActor],name="myactor")
ma: akka.actor.ActorRef = Actor[akka://system/user/myactor]


scala> val remoteActor = system.actorFor("akka://remotesystem@127.0.0.1:2553/user/remoteactor")
remoteActor: akka.actor.ActorRef = Actor[akka://remotesystem@127.0.0.1:2553/user/remoteactor]

scala> ma ! (remoteActor, "testing")

scala> [INFO] [12/24/2012 14:31:21.155] [system-akka.actor.default-dispatcher-5] [NettyRemoteTransport(akka://system@127.0.0.1:2552)] RemoteClientStarted@akka://remotesystem@127.0.0.1:2553
Received : Got : testing


On the remote actor side we see:

scala> [INFO] [12/24/2012 14:29:43.906] [remotesystem-15] [NettyRemoteTransport(akka://remotesystem@127.0.0.1:2553)] RemoteClientStarted@akka://system@127.0.0.1:2552
Received : testing

Conclusion: In these two parts posts we explored basics of scala/akka actors. Hopefully, anyone looking for where to begin about akka actors may have got some footing to explore the rest of akka.

Friday, December 21, 2012

Basics of actors in scala/Akka part 1

Starting with Scala 2.10.0 - scala actors will be replaced by akka actors.Since we are already using scala actors in production - the need arose to get familiarised with akka in general and specially actors . I had already, out of curiosity, tried to develop some knowledge about akka - but  was never quite able to do so - mainly because a) the documentation looks so voluminous - it scared me away each time. b) There are so many things - actors, typed actors, dataflow concurrency, software transaction memory, microkernel and what not. The lack of a step by step fast track guide - showing some basic stuff acted against me being able to give akka a shot in my previous attempts. So, if you are in the same situation, then this post is for you. I will show the basic stuff - that, I hope, will help you explore the rest. I am assuming prior familiarity with scala actors here.


I am using scala-2.10.0-RC5 - the lib already contains akka actors library.

Note: akka actors library shipped with scala does not support remote actors. How to do that I will in a subsequent post.

An scala actor needs to extend `akka.actor.Actor` trait overiding the `receive` message(unlike scala there is not `react` method). Importing `akka.actor._` package is enough for creating and testing actors locally. Also, actors are grouped into named `ActorSystem` which provide thread pool, dead letter box, message dispatcher etc etc - the infrastrucre. Also, there can be more than one actor system in the same VM. Following shows how to create an actor system, defining an instantiating an actor and sending a message to the actor:


scala> import akka.actor._
import akka.actor._

scala> val actorSystem = ActorSystem("actorsystem")
system: akka.actor.ActorSystem = akka://actorsystem

scala> class MyActor extends Actor {
     |   def receive = {
     |     case x => println("Received : "+ x)
     |   }
     | }
defined class MyActor

scala> val myActor = actorSystem.actorOf(Props[MyActor])
myActor: akka.actor.ActorRef = Actor[akka://actorsystem/user/$a]

scala> myActor ! "Hi"
Received : Hi

In the above `Props` is a factory for actor configuration.


Actors should be named. They are automatically started when created, have pre-start and pre-shutdown hooks. Above example shows how to create an actor from the system. If actors are to be created from within an actor, then we should use 'context.actorOf' instead. Following shows an actor with non-default constructor which creates actors in response to messages.

scala> val system = ActorSystem("system")
system: akka.actor.ActorSystem = akka://system

scala> :paste
// Entering paste mode (ctrl-D to finish)

case object Create
case object Kill
case class Msg(s: String)

class MyActor(arg: String) extends Actor {
  override def preStart = println("Some arbitrary arg : "+ arg)
  var myChild: Option[ActorRef] = None

  def receive = {
    case Msg(s) => (myChild getOrElse create) ! s; println("Passed on")
    case Create => myChild getOrElse create; println("create returned")
    case Kill => myChild match {
       case Some(a) => context.stop(a)
         myChild = None
         println("killed")
       case _ => //Do nothing
    }
  }
 
  def create: ActorRef = {
   val child = context.actorOf(Props(new Actor {
      def receive = {
        case x => println("Got a msg from my parent : "+ x)
      }
    }), name="childactor")
   
    myChild = Some(child)
    child
  }
 
  override def postStop = println("Some code to execute after stopping...")

}


// Exiting paste mode, now interpreting.

defined module Create
defined module Kill
defined class Msg
defined class MyActor


scala> val parentActor = system.actorOf(Props(new MyActor("Some constructor args")), name="parent")
parentActor: akka.actor.ActorRef = Actor[akka://system/user/parent]

scala> Some arbitrary arg : Some constructor args

scala> parentActor ! Create

scala> create returned

scala> parentActor ! Msg("Testing")

scala> Passed on
Got a msg from my parent : Testing


scala> parentActor ! Kill

scala> killed


scala> parentActor ! Create


scala> create returned


scala> parentActor ! Msg("Testing again")

scala> Passed on
Got a msg from my parent : Testing again


scala> parentActor ! PoisonPill

scala> Some code to execute after stopping...


scala> system.shutdown


parentActor: akka.actor.ActorRef = Actor[akka://system/user/parent]


Actors have parent-child relationship and parental supervision. An actor is a parent of each actor it creates. In the above line `user` is parent of actor named `parent`. `user` is a one of the system actors that gets created when an actor system is created. User created actors and their children are created underneath `user` system actor. Actors can looked up using `actorFor` on actor system or contenxt. The difference between actorOf and actorFor is that in the later case we get a reference to an ActorRef(If it exists otherwise reference to the dead letter system actor where all messages will passed on in case of no-existent actors) - but actor itself is not created.

scala> val system = ActorSystem("system")
system: akka.actor.ActorSystem = akka://system

scala> val p = system.actorFor("/user/parent")
p: akka.actor.ActorRef = Actor[akka://system/user/parent]

scala> parentActor ! Msg("Testing")

scala> Passed on
Got a msg from my parent : Testing


scala> p ! Msg("Testing again")

scala> Passed on
Got a msg from my parent : Testing again

scala> val nonExistentActor = system.actorFor("/user/xyz")
nonExistentActor: akka.actor.ActorRef = Actor[akka://system/user/xyz]

scala> nonExistentActor ! "This msg will be passed on to dead letter actor"


The preferred way of stopping an actor is by calling `context.stop(self)` in response to some shutdown message. Shown below are some other ways of stopping an actor. One things to notice is that in case of kill message the post shutdown hook is not getting called.

scala> val a1 = system.actorOf(Props(new MyActor("cons args")), name="a1") //actor names should be unique in an actor system
Some arbitrary arg : cons args
a1: akka.actor.ActorRef = Actor[akka://system/user/a1]

scala> val a2 = system.actorOf(Props(new MyActor("cons args")), name="a2")
Some arbitrary arg : cons args
a2: akka.actor.ActorRef = Actor[akka://system/user/a2]

scala> val a3 = system.actorOf(Props(new MyActor("cons args")), name="a3")
Some arbitrary arg : cons args
a3: akka.actor.ActorRef = Actor[akka://system/user/a3]

scala> a1 ! Msg("Msg to a1")

scala> Passed on
Got a msg from my parent : Msg to a1


scala> a2 ! Msg("Msg to a2")

scala> Passed on
Got a msg from my parent : Msg to a2


scala> a3 ! Msg("Msg to a3")

scala> Passed on
Got a msg from my parent : Msg to a3


scala> system.stop(a1)

scala> Some code to execute after stopping...


scala> a2 ! Kill

scala> killed


scala> a3 ! PoisonPill

scala> Some code to execute after stopping...

We talked about parental supervision above. When an actor fails, it's parent can decide what action should be taken. The parent actor can escalate it or may handle failure. It may kill all the children and restart them or kill only the failed actor and restart it. Other options are - let the failed actor continue(resume) handling messages or the parent may decide to stop the failed actor permanently.

To supervise a child, an actor needs to override the `supervisorStrategy` member of the actor.

Following shows an actor that kills failed child and restarts it in response.

import akka.actor.OneForOneStrategy
import scala.concurrent.duration._
import akka.actor.SupervisorStrategy._


scala> :paste
// Entering paste mode (ctrl-D to finish)

class MyActor extends Actor {
  override def supervisorStrategy = OneForOneStrategy(maxNrOfRetries = 2, withinTimeRange = 1 minute) {
        case e: Exception => Restart//Exception to restart directive
  }
  var child = context.actorOf(Props[Child])

  def receive = {
    case s: String => child ! s
    case a: ActorRef => child = sender
    case e: Exception => child ! e
  }
}

class Child extends Actor {
  def receive = {
    case s: String => println("Parent msg : "+ s)
    case e: Exception => throw new Exception("throwing up")
    case _ => println("Don't know what it is!")
  }

  override def postRestart(reason: Throwable):Unit = {
    println("I am being restarted :" + reason.getMessage)
    context.parent ! self
  }
}


// Exiting paste mode, now interpreting.

defined class MyActor
defined class Child

scala> val parent = system.actorOf(Props[MyActor])
parent: akka.actor.ActorRef = Actor[akka://system/user/$a]

scala> parent ! "Testing"

scala> Parent msg : Testing


scala> parent ! new Exception

scala> [ERROR] [12/21/2012 16:58:27.328] [system-akka.actor.default-dispatcher-2] [akka://system/user/$a/$a] throwing up
java.lang.Exception: throwing up
....................

I am being restarted :throwing up


scala> parent ! "Testing after restart"

scala> Parent msg : Testing after restart

Conclusion: I have just touched upon basics stuff. In the next post I will briefly write about hot swapping of actor behaviour, routing and remote actors.