Very Basics of Concurrency in Go

Introduction

I’ve dabbled in many languages in the past: PHP, Java, Python, C/C++, JavaScript and now Go (for some of them I only know the syntax). With that experience behind me, when I switched to Go I had no trouble understanding slices, structs and interfaces, and relatively little trouble understanding pointers (I’m still learning). But in none of those languages did I do any real concurrent programming. At my first company I did use Python threads, but those threads didn’t share any data between them.


To understand how goroutines and threads work, it helps to understand how process management works at the operating system level. If you have read about operating systems, and process management in particular, this concept will be easy for you. With that said, we must understand that concurrency is not parallelism: concurrency is about structuring a program as tasks that can make progress independently, while parallelism is about actually running tasks at the same instant. Refer to the following pictures, which I shamelessly copied from StackOverflow.

We’ll move gradually towards more advanced examples as we go.

Pattern 1: WaitGroup

Let us consider this example with a basic goroutine.

package main

import (
    "fmt"
)

func hello() {
    fmt.Println("Hello world goroutine")
}

func main() {
    go hello()
    fmt.Println("main function")
}

Along with the main goroutine, a new goroutine is started for hello. You will most likely not see the output of hello, because the program exits as soon as main returns, without waiting for other goroutines to finish.

main function

Concurrency is not about adding the go keyword in front of a function call; rather, it’s the art of orchestrating multiple lightweight threads, which in the Go world are called goroutines.

To make this work, we need to add a WaitGroup to wait for all the goroutines to end processing.

package main

import (
    "fmt"
    "sync"
)

var wg sync.WaitGroup

func hello() {
    fmt.Println("Hello world goroutine")
    wg.Done()
}

func main() {
    wg.Add(1)
    go hello()
    wg.Wait()
    fmt.Println("main function")
}

Here we create something called a WaitGroup. In most cases we deal with its Add, Done and Wait methods. Add and Wait are called from the main goroutine, while Done is called from inside each spawned goroutine when it has finished executing.

If you are having trouble understanding the concept of a WaitGroup, imagine a playground for children with a guard waiting at the main entry. The playground represents the program, the guard represents the main goroutine, and each kid who enters the playground represents a spawned goroutine. Without a WaitGroup, the guard closes the main entry in the evening and goes home, whether or not any kids are still inside.

But what happens if every kid does a WaitGroup.Add(1) before entering the playground and a WaitGroup.Done() when they are done playing? This way the guard keeps track of how many goroutines are still running inside the playground.

You can apply the same analogy to the program. We call wg.Add(1) to register one goroutine, in this case hello. Then we tell the program to Wait until that goroutine is Done.

Hello world goroutine
main function

Moreover, the argument to wg.Add does not have to be 1: it can be the number of goroutines you are about to start, so a single wg.Add(n) can register n workers at once.
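As a small sketch of that idea, the program below calls wg.Add once with the total number of goroutines instead of calling wg.Add(1) inside the loop. The helper name runWorkers and the values it stores are made up for illustration; they are not part of the examples above.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers is a hypothetical helper: it starts n goroutines and
// returns only after every one of them has called Done.
func runWorkers(n int) []int {
	var wg sync.WaitGroup
	results := make([]int, n) // each goroutine writes only to its own index

	wg.Add(n) // register all n goroutines up front
	for i := 0; i < n; i++ {
		go func(id int) {
			defer wg.Done() // runs when the goroutine returns
			results[id] = id * 10
		}(i)
	}

	wg.Wait() // blocks until the counter is back to zero
	return results
}

func main() {
	fmt.Println(runWorkers(5)) // [0 10 20 30 40]
}
```

Using defer wg.Done() at the top of the goroutine is a common habit, since it still runs if the function returns early.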

Pattern 2: Mutex

Mutex is short for mutual exclusion. To understand mutex, first let’s go through an example. This will make mutexes easier to understand.

package main

import (  
    "fmt"
    "sync"
)

var x = 0

func increment(wg *sync.WaitGroup) {
    x = x + 1 // unsynchronized read-modify-write on the shared variable x
    wg.Done()
}

func main() {
    var w sync.WaitGroup
    for i := 0; i < 1000; i++ {
        w.Add(1)
        go increment(&w)
    }
    w.Wait()
    fmt.Println("final value of x", x)
}

You might expect the value of x to be 1000, but that rarely happens no matter how many times you try. Although 1000 goroutines are launched, the final value is usually less than 1000. Can you guess why? Continue reading to find out.

final value of x 961

Suppose we are in the middle of the for loop and the value of x is 100. A goroutine is launched to increment x, and at the same time another goroutine is launched for the same reason. The statement x = x + 1 is not a single step: it reads x, adds 1, and writes the result back. Before the first goroutine can write 101, the second one also reads 100 and writes 101. Two goroutines are used up to increase the number by just 1. What is happening between these two goroutines is known as a race condition.
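To make that interleaving concrete, here is a deterministic replay of it in a single goroutine. This is only an illustration of the order of reads and writes, not a real race; the function name lostUpdate and the variables a and b (standing in for the two goroutines) are made up for this sketch.

```go
package main

import "fmt"

// lostUpdate replays, step by step, the interleaving of two
// goroutines that both execute x = x + 1 on the same starting value.
func lostUpdate(start int) int {
	x := start

	a := x // goroutine A reads x
	b := x // goroutine B reads x before A has written back

	x = a + 1 // A writes start+1
	x = b + 1 // B overwrites it with the same start+1

	return x
}

func main() {
	fmt.Println(lostUpdate(100)) // 101, although two increments ran
}
```

If you want Go to point out the real race in the earlier program, running it with go run -race should make the race detector report the conflicting accesses to x.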

So what can we do to overcome this situation? We are going to use Mutex from the sync package.

package main

import (  
    "fmt"
    "sync"
)

var x = 0

func increment(wg *sync.WaitGroup, m *sync.Mutex) {
    m.Lock()  // critical section starts
    x = x + 1
    m.Unlock()  // critical section ends
    wg.Done()
}
func main() {
    var w sync.WaitGroup
    var m sync.Mutex
    for i := 0; i < 1000; i++ {
        w.Add(1)
        go increment(&w, &m)
    }
    w.Wait()
    fmt.Println("final value of x", x)
}

m.Lock prevents any other goroutine from entering the code between m.Lock and m.Unlock until the goroutine holding the lock calls m.Unlock. This means only one goroutine at a time can execute that part of the code. If you run this program, you will always get 1000 as the answer.

final value of x 1000
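One variation you may come across is to keep the mutex next to the data it protects and release it with defer. The sketch below does that; the Counter type and the countTo helper are hypothetical names chosen for this example, not something from the program above.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is a hypothetical type that bundles a value with the
// mutex guarding it, so callers cannot forget to lock.
type Counter struct {
	mu sync.Mutex
	n  int
}

// Inc adds 1 to the counter inside the critical section.
func (c *Counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock() // released when Inc returns
	c.n++
}

// Value reads the counter under the same lock.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// countTo starts n goroutines that each increment the counter once.
func countTo(n int) int {
	var c Counter
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
	return c.Value()
}

func main() {
	fmt.Println("final value of x", countTo(1000)) // final value of x 1000
}
```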

There is another concept in concurrent programming called deadlock which we are not going to discuss in this post.

Pattern 3: Channels

So far we have been able to run goroutines concurrently to make use of resources, but there is still no communication between them.

We use channels in Go to achieve this communication. A few things to note about channels:

A channel is a FIFO queue.
A send on an unbuffered channel blocks until another goroutine receives the value.
A send on a buffered channel blocks only when the buffer is full.

There are two kinds of channels: buffered and unbuffered. With an unbuffered channel, someone needs to pull the data from the other end of the queue before anything new can be put in. Unbuffered channels are blocking (synchronous) by nature: once you put something in, a consumer has to take it out at the other end before you can write something else to the channel.
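Here is a rough sketch of the difference between the two kinds. The function names bufferedDemo and unbufferedDemo, the capacity of 2 and the values sent are all arbitrary choices for this illustration.

```go
package main

import "fmt"

// bufferedDemo sends two values into a channel with capacity 2
// while nobody is receiving, then reads them back in FIFO order.
func bufferedDemo() (queued, first, second int) {
	c := make(chan int, 2)
	c <- 1 // does not block: the buffer has room
	c <- 2 // does not block: the buffer is now full
	// A third send here would block, because nobody is receiving.

	queued = len(c) // number of values waiting in the buffer
	first = <-c
	second = <-c
	return queued, first, second
}

// unbufferedDemo has to send from another goroutine, because a send
// on an unbuffered channel waits for a receiver.
func unbufferedDemo() int {
	c := make(chan int)
	go func() { c <- 42 }() // blocks until the receive below happens
	return <-c
}

func main() {
	fmt.Println(bufferedDemo())   // 2 1 2
	fmt.Println(unbufferedDemo()) // 42
}
```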

Here is one such example:

package main

import "fmt"

func square(i int, c chan int) {
	sq:= i * i
	c <- sq
}

func main() {
	ints := []int{7, 2, 8, -9, 4, 0}

	squaresChan := make(chan int)
	
	for _, v :=  range ints {
		go square(v, squaresChan)
		squares := <-squaresChan
		fmt.Println(squares)
	}
}

Let’s walk through the code now. square is a function that takes an int i and a channel c, on which it sends back the square of the number. In main, we first create an int slice, ints. Next we create squaresChan, the communication channel shared by all the goroutines computing squares. Inside the for loop over ints, we start the goroutine with go square(v, squaresChan), on the next line we receive its response from the channel and store it in the variable squares, and finally we print it.

I am not very good with channels yet, as I have not used them in practice. But let’s see if I can put them to use in my personal projects.
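Because the loop above receives from the channel right after starting each goroutine, only one square is computed at a time. A possible variation, sketched here under my own assumptions, starts all the goroutines first and collects the results afterwards. The results can then arrive in any order, so this sketch adds them up instead of printing them one by one; the helper name sumSquares is made up for this example.

```go
package main

import "fmt"

// square sends the square of i on the send-only channel c.
func square(i int, c chan<- int) {
	c <- i * i
}

// sumSquares is a hypothetical helper that fans out one goroutine
// per input and then receives exactly one result per goroutine.
func sumSquares(ints []int) int {
	c := make(chan int)

	for _, v := range ints {
		go square(v, c) // start every goroutine before receiving
	}

	sum := 0
	for range ints { // one receive per goroutine started above
		sum += <-c
	}
	return sum
}

func main() {
	fmt.Println(sumSquares([]int{7, 2, 8, -9, 4, 0})) // 214
}
```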

Conclusion

As I have decided to stick with Go for the long run, I have to learn concurrency in Go inside out. I’ll try to write up whatever I learn in a meaningful way.

If you enjoyed this post, feel free to leave feedback and share it with like-minded people.