Skip to main content

Process text file concurrently

 Process the same input file concurrently Let's say we have a text file that contains 2 different types of lines.  For the sake of example, one group starts with "i:" (for Integer) and the other group starts with "s:" (for String), something like the following file: s:New york s:Apple i:387548 s:Amsterdam i:4556 i:39874 s:Orange i:56787 s:Banana i:4657567 s:Turkey i:45679456 s:Iran i:4356456 i:23423 i:456 s:Ukraine i:453645 i:5456 We want to process these input lines separately but concurrently. And the process can be anything, for this example, we just print them to the console. But what matters is that we open the file only for READ so 2 different processes (or routines) can read the file at the same time. The Cats Effect answer to open files (or in general any resources) in a safe and efficient way, is  Resource class. So we need a function to take the path to the file and give us a Resource of some type that allows us to read the file line by line. One o...

Functional way of finding the max element in an array

 Finding the maximum element of an array is a relatively easy problem. An imperative solution would be to choose the first element as the max and then go through the rest of the array and assign the current element to the max if the current element is larger than the current max. In the end, the max variable will contain the max value of the array. 



But we are here to learn about Scala and functional programming so mutating a variable is not what we want to do. Also, we do not want to go through the array using a loop because a loop counter is also variable, in functional programming we usually implement loops or in general iterations using recursion.

So let's see how we would implement a function with the following signature for finding the max value of a given array (or List):

def maxValue(elements: List[Int]): Int = ???

For implementing recursion we can define a nested function inside our maxValue function which takes 2 parameters, one int which is the current max value, and one list which is the rest of the list. So the signature of the nested function for the recursive call is like this:

def loop(max: Int, elements: List[Int]): Int = ???

This function should have a base case that returns and some other cases to call itself.

    def loop(max: Int, elements: List[Int]): Int = {
      if (elements.isEmpty) max
      else if (elements.head > max) loop(elements.head, elements.tail)
      else loop(max, elements.tail)
    }

And the main function we only need to call this loop function with the initial list's head and tail. Meaning that initially, we consider that the first element of the list is the maximum value. We also assume the initial list is not empty, otherwise our function crashes!

So the complete function is like this:

  def maxValue(elements: List[Int]): Int = {

    def loop(max: Int, elements: List[Int]): Int = {
      if (elements.isEmpty) max
      else if (elements.head > max) loop(elements.head, elements.tail)
      else loop(max, elements.tail)
    }

    loop(elements.head, elements.tail)
  }

There are 2 things that we can improve on this function:

  1. It can not take an empty list
  2. It only works with a list of Int

First improvement, empty list thingy

We can fix this issue with 2 different approaches
  1. Fix it in the method signature
  2. Fix it in the method body
Fixing the method signature has the advantage that we keep this concern completely separate from our implementation, which means we do not add anything to our implementation, but it comes to the price of having a new type, in this case, NonEmptyList, available. We actually have this class available as it comes with the cats library. So we can simply change our method signature like this and we are good to go.

def maxValue(elements: NonEmptyList[Int]): Int = ???

Do not change the signature of the loop function, it might receive an empty list and it is crucial for the base case.

If we can't change our function's signature, then we should handle this case in the function's body. Simply we should check if the receiving list is empty at first and if it is, we should throw an Exception (I don't like this as throwing Exceptions is not a really functional way of raising a flag) or we can return a specific value, such as Int.MinValue but that's not great either. We can change the return type of our function to return Either of Throwable or Int, as follows (return type is not part of the function signature, although it is a contract change):

def maxValue(elements: List[Int]): Either[Throwable, Int] = ???

And the implementation as:

  def maxValue(elements: List[Int]): Either[Throwable, Int] = {
    def loop(max: Int, elements: List[Int]): Int = {
      if (elements.isEmpty) max
      else if (elements.head > max) loop(elements.head, elements.tail)
      else loop(max, elements.tail)
    }

    elements match {
      case Nil => Left(IllegalArgumentException("the list is empty"))
      case head :: tail => Right(loop(head, tail))
    }
  }

Cool, we did the first improvement. Let's fix the second issue, which is the fact our function is too specific. It can only take a List of Int. Finding the maximum value of a given list should be the same for any comparable list. So let's fix this by means of generic programming

The trait that provides comparability in Scala is Ordering trait, so we can write our method as follows as long as there is an implicit Ordering value in scope. 

  def maxValue[A](elements: NonEmptyList[A])(implicit ordering: Ordering[A]): A = {
    def loop(max: A, elements: List[A]): A = {
      if (elements.isEmpty) max
      else loop(ordering.max(max, elements.head), elements.tail)
    }

    loop(elements.head, elements.tail)
  }

And we can test it using a list of Int because an implicit Ordering value is always in scope for Int:

  def main(args: Array[String]): Unit = {
    val l = NonEmptyList(51, List(22, 3, 9))

    println(maxValue(l))
  }



Comments

Popular posts from this blog

Functional algorithm to find the position of a sub-string in a string

Let's practice some functional problem-solving!  Consider the classic problem of finding the index of a sub-string in a larger string. So we want to write a function that if we pass it the following strings it returns 0 . "hello world", "hello" If we pass it the following strings it returns 6 "hello world", "world" If we pass it the following strings it returns None , because "apple" can not be found inside "hello world. "hello world", "apple" So our function signature should look like this: def findFirstSubString(str: String, subStr: String): Option[Int] = ??? If we want to implement it in an imperative way, and yes Scala allows imperative programming, we can implement the function as follows, but we don't want imperative programming, do we? def findFirstSubString_bad(str: String, subStr: String): Option[Int] = { var j = 0; var temp = "" for (i So let's find the functional algorith...

Find a pair in an array for the given sum

Algorithm: Find a pair in an array of numbers for a given sum Given an Array (or List) and a sum, we want to find a pair of numbers whose sum will be equal to the given sum.  For example for the [1,4,7] and 5 the outcome should be pair of 1 and 4. Or for [5, 7, 2, 8, 3] and 15 the outcome should be pair of 7 and 8.  Both pairs of 1 and 5, and 2 and 4 are correct for the input of [2, 4, 1, 5] and 6. The function signature looks like this: def findPairForSum(list: List[Int], sum: Int): Option[(Int, Int)] = ??? The return type is  Option[(Int, Int)] because there might be no pair at all. There are two approaches to solving this problem: The brute force approach which is the simplest and most naive one The first sort approach which is more efficient and a bit more complex Naive approach We can check the given sum against all possible pair combinations. In imperative programming, it would be the famous i and j for loops. But in functional programming, we can implement this us...

Functional algorithm to find how many times a sub-string occurs in a string

Another String search problem that we like to solve in a functional way. This problem is very similar to the previous one,  Functional algorithm to find the position of a sub-string in a string , but it is a bit easier in the sense that we can always return an Int , so we do not have to return an Option of Int , because the number of the occurrences of a String in another String is always a non-negative number. So the signature of the function that we want to write looks like this: def countSubstring(str: String, subString: String): Int = ??? ⚠️ Be careful about overthinking the problems and always remain within the scope, in this case, it is very easy to overthink this problem by mixing up the terms sub-string and word . We only care about sub-strings and our algorithm does not know about words, meaning that our function should return the number 3 ( and not 2 )  if we pass it the following parameter: "This book is the best book among my books", "book" As we saw...