Monday, 18 May 2020

Worker pool pattern - deadlock

The wait-for-task pattern is the basis of the pooling pattern.

In the code below:

package main

import (
    "fmt"
    "runtime"
)

// pooling: You are a manager and you hire a team of employees. None of the new
// employees know what they are expected to do and wait for you to provide work.
// When work is provided to the group, any given employee can take it and you
// don't care who it is. The amount of time you wait for any given employee to
// take your work is unknown because you need a guarantee that the work you're
// sending is received by an employee.
func pooling() {
    jobCh := make(chan int)     // signalling data on channel with guarantee - unbuffered
    resultCh := make(chan int)  // signalling data on channel with guarantee - unbuffered

    workers := runtime.NumCPU() // 4 on the machine used here
    for worker := 0; worker < workers; worker++ {
        go func(emp int) {
            var p int
            for p = range jobCh {
                fmt.Printf("employee %d : recv'd signal : %d\n", emp, p) // do the work
            }
            fmt.Printf("employee %d : recv'd shutdown signal\n", emp) // worker is signaled with closed state channel
            resultCh <- p * 2
        }(worker)
    }

    const jobs = 6
    for jobNum := 1; jobNum <= jobs; jobNum++ {
        jobCh <- jobNum
        fmt.Println("manager : sent signal :", jobNum)
    }

    close(jobCh)
    fmt.Println("manager : sent shutdown signal")

    for a := 1; a <= jobs; a++ {  //cannot range on 'resultCh'
        fmt.Println("Result received: ", <-resultCh)
    }
    fmt.Println("-------------------------------------------------------------")
}
func main() {
    pooling()
}

the manager (pooling()) does not receive all six results from the 4 workers (employees), as shown below:

$ uname -a 
Linux user 4.15.0-99-generic #100-Ubuntu SMP Wed Apr 22 20:32:56 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ 
$ go version
go version go1.14.1 linux/amd64
$
$ go install github.com/myhub/cs61a
$ 
$ 
$ bin/cs61a 
manager : sent signal : 1
manager : sent signal : 2
manager : sent signal : 3
employee 3 : recv'd signal : 3
employee 3 : recv'd signal : 4
manager : sent signal : 4
manager : sent signal : 5
employee 3 : recv'd signal : 5
employee 3 : recv'd signal : 6
manager : sent signal : 6
manager : sent shutdown signal
employee 3 : recv'd shutdown signal
employee 2 : recv'd signal : 2
Result received:  12
employee 0 : recv'd signal : 1
employee 0 : recv'd shutdown signal
employee 2 : recv'd shutdown signal
Result received:  2
Result received:  4
employee 1 : recv'd shutdown signal
Result received:  0
fatal error: all goroutines are asleep - deadlock!


goroutine 1 [chan receive]:
main.pooling()
        /home/../src/github.com/myhub/cs61a/Main.go:40 +0x25f
main.main()
        /home/../src/github.com/myhub/cs61a/Main.go:45 +0x20
$
$
$ bin/cs61a 
manager : sent signal : 1
employee 0 : recv'd signal : 1
manager : sent signal : 2
manager : sent signal : 3
manager : sent signal : 4
employee 3 : recv'd signal : 2
manager : sent signal : 5
manager : sent signal : 6
employee 2 : recv'd signal : 4
employee 2 : recv'd shutdown signal
employee 0 : recv'd signal : 5
manager : sent shutdown signal
Result received:  8
employee 0 : recv'd shutdown signal
Result received:  10
employee 1 : recv'd signal : 3
employee 1 : recv'd shutdown signal
Result received:  6
employee 3 : recv'd signal : 6
employee 3 : recv'd shutdown signal
Result received:  12
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan receive]:
main.pooling()
        /home/user/../github.com/myhub/cs61a/Main.go:40 +0x25f
main.main()
        /home/user/../github.com/myhub/cs61a/Main.go:45 +0x20

Edit:

As per @Mark's comments, moving resultCh <- p * 2 into the loop produces the deadlock below, which makes sense because all goroutines are blocked: the manager is blocked sending on jobCh while every worker is blocked sending on resultCh. Would a buffered resultCh help resolve this problem? But a buffered channel does not signal data with a guarantee.

$ go install github.com/myhub/cs61a
$ bin/cs61a 
manager : sent signal : 1
manager : sent signal : 2
manager : sent signal : 3
manager : sent signal : 4
employee 1 : recv'd signal : 2
employee 2 : recv'd signal : 3
employee 0 : recv'd signal : 1
employee 3 : recv'd signal : 4
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan send]:
main.pooling()
        /home/user/../myhub/cs61a/Main.go:33 +0xfb
main.main()
        /home/user/../myhub/cs61a/Main.go:46 +0x20

goroutine 6 [chan send]:
main.pooling.func1(0xc00001e0c0, 0xc00001e120, 0x0)
        /home/user/../myhub/cs61a/Main.go:24 +0x136
created by main.pooling
        /home/user/../myhub/cs61a/Main.go:20 +0xb7

goroutine 7 [chan send]:
main.pooling.func1(0xc00001e0c0, 0xc00001e120, 0x1)
        /home/user/../myhub/cs61a/Main.go:24 +0x136
created by main.pooling
        /home/user/../myhub/cs61a/Main.go:20 +0xb7

goroutine 8 [chan send]:
main.pooling.func1(0xc00001e0c0, 0xc00001e120, 0x2)
        /home/user/../myhub/cs61a/Main.go:24 +0x136
created by main.pooling
        /home/user/../myhub/cs61a/Main.go:20 +0xb7

goroutine 9 [chan send]:
main.pooling.func1(0xc00001e0c0, 0xc00001e120, 0x3)
        /home/user/../myhub/cs61a/Main.go:24 +0x136
created by main.pooling
        /home/user/../myhub/cs61a/Main.go:20 +0xb7
$
$
$
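One possible way around the deadlock above, sketched under assumptions rather than offered as the pattern's canonical form: keep both channels unbuffered, send one result per job inside the loop, and hand out jobs from a separate goroutine so the manager is free to receive results at the same time. The function name poolResults and the use of sync.WaitGroup to close resultCh are illustrative choices that do not appear in the original code.

```go
package main

import (
	"fmt"
	"runtime"
	"sort"
	"sync"
)

// poolResults is a hypothetical variant of pooling(): each worker sends one
// result per job, and jobs are fed from a separate goroutine so the caller
// can receive results while jobs are still being handed out.
func poolResults(jobs int) []int {
	jobCh := make(chan int)    // unbuffered: a send completes only when a worker receives
	resultCh := make(chan int) // unbuffered: a send completes only when the caller receives

	var wg sync.WaitGroup
	workers := runtime.NumCPU()
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range jobCh {
				resultCh <- p * 2 // one result per job
			}
		}()
	}

	// Feed jobs concurrently, then signal shutdown by closing jobCh.
	go func() {
		for n := 1; n <= jobs; n++ {
			jobCh <- n
		}
		close(jobCh)
	}()

	// Close resultCh once every worker has returned, so the caller can range over it.
	go func() {
		wg.Wait()
		close(resultCh)
	}()

	var results []int
	for r := range resultCh {
		results = append(results, r)
	}
	sort.Ints(results) // completion order is not deterministic
	return results
}

func main() {
	fmt.Println(poolResults(6)) // [2 4 6 8 10 12]
}
```

A resultCh buffered with capacity equal to the number of jobs would also let the workers finish without a concurrent receiver, at the cost of the delivery guarantee mentioned above.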

  1. Why is pooling() not able to receive results from all workers?

  2. The manager receives only 4 of the 6 results. One of them is zero (Result received: 0), although the data sent on resultCh is always supposed to be non-zero. Why does resultCh receive a zero value? It looks as if resultCh is closed.

Note: The correct working of resultCh is not part of the worker pool pattern's responsibility. The worker pool pattern only ensures that work is submitted to an employee successfully using jobCh.
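A likely reading of the first program's output is that each worker sends exactly once, after its range loop ends, so with 4 workers only 4 results can ever arrive while the manager waits for 6. The value sent is the last job the worker saw, doubled; a worker that never received a job still holds the zero value of p. The sketch below isolates that case. The function name idleWorkerResult is hypothetical and is not part of the original code.

```go
package main

import "fmt"

// idleWorkerResult mimics a worker from the question that never gets a job:
// jobCh is closed before any value is delivered, so p keeps its zero value.
func idleWorkerResult() int {
	jobCh := make(chan int)
	resultCh := make(chan int)

	go func() {
		var p int
		for p = range jobCh {
			// never runs: no job is ever delivered to this worker
		}
		resultCh <- p * 2 // sends 0 * 2
	}()

	close(jobCh) // shutdown signal with no prior jobs
	return <-resultCh
}

func main() {
	fmt.Println(idleWorkerResult()) // 0
}
```

In this sketch resultCh is never closed; the zero comes from the worker's own send, which matches employee 1 in the first run, who logged a shutdown signal without ever logging a received job.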
