IO class exercises

In the last post, we used the IO class (of the cats-effect library) to perform side effects, or simply effects, inside pure functions.

In this post, we will practice the IO class a bit more. We will write a few functions that perform common, useful operations on one or more IO objects.

Exercise 1:

Sequence two IO objects and return the result of the second one. In other words, we want a function that takes an IO of type A and an IO of type B and returns an IO of type B, but only after making sure the IO of A has completed. The method signature should look like this:

def sequenceTakeLast[A, B](ioa: IO[A], iob: IO[B]): IO[B] = ???

To write this function, remember that the IO class is a monad, so it has all the methods that a monad has, including flatMap. There are 2 sub-tasks:

  1. We want to return the IO of B
  2. We want to make sure the IO of A is completed first

  def sequenceTakeLast[A, B](ioa: IO[A], iob: IO[B]): IO[B] = {
    // map(b => b) is an identity mapping, so this is equivalent to ioa.flatMap(_ => iob)
    ioa.flatMap(_ => iob.map(b => b))
  }

As you might know from your Scala learning experience, we can write a second version of this function using for-comprehension:

  def sequenceTakeLast[A, B](ioa: IO[A], iob: IO[B]): IO[B] =
    for {
      _ <- ioa
      b <- iob
    } yield b

There is a third version of this function, using the *> operator of the IO class (an alias for productR), often read as "and then":

def sequenceTakeLast[A, B](ioa: IO[A], iob: IO[B]): IO[B] = ioa *> iob
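To check that the three versions behave the same way, here is a minimal sketch, assuming Cats Effect 3 on the classpath. The object name SequenceTakeLastDemo and the names viaFlatMap, viaFor, viaOperator, trace and traces are made up for this illustration. Each variant is run against two IOs that record their execution order in a buffer.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import scala.collection.mutable.ListBuffer

object SequenceTakeLastDemo {
  // The three versions from the text, renamed so they can coexist.
  def viaFlatMap[A, B](ioa: IO[A], iob: IO[B]): IO[B] = ioa.flatMap(_ => iob)
  def viaFor[A, B](ioa: IO[A], iob: IO[B]): IO[B] = for { _ <- ioa; b <- iob } yield b
  def viaOperator[A, B](ioa: IO[A], iob: IO[B]): IO[B] = ioa *> iob

  // Runs one variant and returns (order in which the effects ran, final result).
  private def trace(f: (IO[Int], IO[String]) => IO[String]): (List[String], String) = {
    val log = ListBuffer.empty[String]
    val first = IO { log += "first"; 1 }
    val second = IO { log += "second"; "two" }
    val result = f(first, second).unsafeRunSync()
    (log.toList, result)
  }

  // One trace per variant; all three should be (List("first", "second"), "two").
  def traces(): List[(List[String], String)] = List(
    trace((a, b) => viaFlatMap(a, b)),
    trace((a, b) => viaFor(a, b)),
    trace((a, b) => viaOperator(a, b))
  )
}
```

In every case the first effect runs, its value is discarded, and the value of the second IO is returned.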

Exercise 2:

The next exercise is a function to take two different IO objects, exactly like the previous exercise, but instead of returning the second one, it returns the first one. 

So we want to return the first one after making sure both IO objects have been completed in sequence.

The function signature looks like this:

def sequenceTakeFirst[A, B](ioa: IO[A], iob: IO[B]): IO[A] = ???

Sub-tasks are:

  1. We want to return the IO of A
  2. We want to make sure both IO objects are completed in sequence, the IO of A first and then the IO of B

As you may have guessed, the implementation is very similar to the first function's, except that we return A at the end.

def sequenceTakeFirst[A, B](ioa: IO[A], iob: IO[B]): IO[A] =
    ioa.flatMap(a => iob.map(_ => a))

And the second version, using a for-comprehension, would be:

  def sequenceTakeFirst[A, B](ioa: IO[A], iob: IO[B]): IO[A] =
    for {
      a <- ioa
      _ <- iob
    } yield a

As you might have guessed, there is a specific operator for this case as well. It is written as <* (an alias for productL) and is often read as "before":

def sequenceTakeFirst3[A, B](ioa: IO[A], iob: IO[B]): IO[A] = ioa <* iob
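The following sketch, again assuming Cats Effect 3, shows that <* still runs both effects in order even though only the first value is kept. The object name SequenceTakeFirstDemo and the helper trace are hypothetical names used only for this example.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import scala.collection.mutable.ListBuffer

object SequenceTakeFirstDemo {
  def sequenceTakeFirst[A, B](ioa: IO[A], iob: IO[B]): IO[A] = ioa <* iob

  // Returns (order in which the effects ran, final result).
  def trace(): (List[String], Int) = {
    val log = ListBuffer.empty[String]
    val first = IO { log += "first"; 1 }
    val second = IO { log += "second"; "two" }
    val result = sequenceTakeFirst(first, second).unsafeRunSync()
    (log.toList, result) // (List("first", "second"), 1)
  }
}
```

The second IO is not skipped: its effect happens after the first one, and only its value is thrown away.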

Exercise 3:

Write a function that takes an IO and repeats it forever. This is a realistic use case, for example for a program that keeps reading lines from the console or keeps printing the status of something on the screen.
The function signature is:

def forever[A](ioa: IO[A]): IO[A] = ???

The simplest way of implementing this function is by a naive recursion as follows:

def forever[A](ioa: IO[A]): IO[A] = ioa.flatMap(_ => forever(ioa))

It works as expected. Because it is recursive (and not tail-recursive), it looks as if it could cause a StackOverflowError.
We can also implement this function using the by-name version of the andThen operator, which is written as >>

def forever[A](ioa: IO[A]): IO[A] = ioa >> forever(ioa)

You can test these two functions using the following main method:

  def main(args: Array[String]): Unit = {
    import cats.effect.unsafe.implicits.global
    val io = IO(println("hello"))
    forever(io).unsafeRunSync()
  }

You will notice that neither of these 2 versions of the forever function causes a stack overflow. The second version is safe because >> takes its argument by name, so the recursive call is not evaluated until the first IO has run. The first version is safe for a similar reason: the flatMap method only stores the function it is given (in the FlatMap case class) without calling it, so the recursive call inside that function is evaluated lazily, one step at a time, while the IO runs.
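Since forever never returns, stack safety is easier to observe with a bounded variant. The sketch below assumes Cats Effect 3. The helpers repeatViaFlatMap, repeatViaByName and runBoth are hypothetical and exist only for this illustration. They use the same two recursion styles, but stop after n repetitions and return how many were done.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global

object StackSafetyDemo {
  // Same shape as the flatMap version of forever, but bounded by n.
  def repeatViaFlatMap[A](ioa: IO[A], n: Int, done: Int = 0): IO[Int] =
    if (done >= n) IO.pure(done)
    else ioa.flatMap(_ => repeatViaFlatMap(ioa, n, done + 1))

  // Same shape as the >> version of forever, but bounded by n.
  def repeatViaByName[A](ioa: IO[A], n: Int, done: Int = 0): IO[Int] =
    if (done >= n) IO.pure(done)
    else ioa >> repeatViaByName(ioa, n, done + 1)

  // Runs both variants; each recursive call is only made while the IO is running.
  def runBoth(n: Int): (Int, Int) =
    (
      repeatViaFlatMap(IO.unit, n).unsafeRunSync(),
      repeatViaByName(IO.unit, n).unsafeRunSync()
    )
}
```

Calling StackSafetyDemo.runBoth(1000000) completes without a StackOverflowError, even though a plain recursive method of that depth would normally exhaust the JVM stack.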

If we write the second version using the by-value *> operator instead, the recursive call is evaluated eagerly while the IO is still being constructed, so the program crashes with a StackOverflowError before any effect runs:

Bad version of the function, do not do this:

def forever_badVersion[A](ioa: IO[A]): IO[A] = ioa *> forever_badVersion(ioa)
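The difference between the two operators can be seen without crashing anything. In this small sketch (assuming Cats Effect 3, with the made-up names EagerVsByNameDemo, rhs and builtDuringConstruction), a flag records whether the right-hand side was evaluated while the IO was only being built, not run.

```scala
import cats.effect.IO

object EagerVsByNameDemo {
  // Returns (evaluated when building with *>, evaluated when building with >>).
  def builtDuringConstruction(): (Boolean, Boolean) = {
    var built = false
    // Side effect happens when the IO value is created, not when it is run.
    def rhs(): IO[Unit] = { built = true; IO.unit }

    built = false
    val eager = IO.unit *> rhs() // by-value: rhs() is called right here
    val afterEager = built

    built = false
    val byName = IO.unit >> rhs() // by-name: rhs() is deferred until the IO runs
    val afterByName = built

    (afterEager, afterByName) // (true, false)
  }
}
```

Nothing is run here, yet *> has already evaluated its argument. With a recursive argument, that eager evaluation never terminates, which is why forever_badVersion overflows the stack.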

Instead of any of these versions of the forever function, you can simply call the foreverM method of the IO class. It returns an IO[Nothing], which still fits the IO[A] return type because IO is covariant:

def forever[A](ioa: IO[A]): IO[A] = ioa.foreverM
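To try foreverM without a program that never ends, one option is to bound it with a timeout. This is a minimal sketch, assuming Cats Effect 3. The object ForeverMDemo, the helper ticksWithin and the 10 millisecond sleep are choices made for this example, not part of the exercise.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.duration._
import cats.effect.IO
import cats.effect.unsafe.implicits.global

object ForeverMDemo {
  def forever[A](ioa: IO[A]): IO[A] = ioa.foreverM

  // Repeats a small "tick" effect until the time limit is hit,
  // then returns how many ticks happened.
  def ticksWithin(limitMillis: Long): Int = {
    val ticks = new AtomicInteger(0)
    val tick: IO[Unit] = IO.sleep(10.millis) *> IO(ticks.incrementAndGet()).void
    // timeoutTo cancels the endless loop and falls back to IO.unit.
    forever(tick).timeoutTo(limitMillis.millis, IO.unit).unsafeRunSync()
    ticks.get()
  }
}
```

The exact number of ticks depends on timing, so the example only relies on the loop having been repeated and then cancelled by the timeout.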
