Channels of Go are great to deal with real concurrent programs. By real, I mean server or daemon processes which are supposed to work or listen permanently, without a down time. The posts on Go blog (1) (2) are very useful to learn the best practices and to understand the merits of channels.
On the other hand, when I use Go as a C substitute for my algorithms, I deal with bulk data. Often, I have to process an array of data as fast as possible. Although concurrency is not parallelism, I use the goroutines to better utilize 8 (pseudo-)cores of my CPU. A system similar to the map-reduce framework often comes handy: multiple heavy processors and multiple filters. This blog post about pipelines recommends the use of quit channels for graceful shutdowns but their shutdown is not as graceful as I want. As such, using that system does not give me the guaranty of completion.
Let me elaborate with an example. Here are three functions:
processiterates over its assigned slice, process each element and send the processed value to
outchannel. When it completes its slice, it signals
filterreads a value from
inchannel and adds it to the final list if it is eligible.
mainis a glue code that starts a goroutine for the
filter, shares the data between two processors and prints out the
finalvalues when the processors are finished.
Well, as experienced gophers can easily see it has a problem: it exits prematurely. We don’t wait for
filter to finish.
Monitoring the conclusion of
filter is not trivial. Since it’s in an infinite loop, we cannot just use
sync.WaitGroup. Let’s start with the question: when should it say it is finished?
- When all the
processors are complete and there is no data in the
channel. Task of signaling
filterabout the completion of
processors looks like a perfect fit for
quitchannels, as mentioned in the blog post. We can introduce a
quitchannel and let
filterdeal with its completion. However, I couldn’t find an elegant and race condition safe solution using quit channel.
I turned to collective knowledge and search for an elegant solution. I came accross this very nice answer on stackoverflow and I learned that receive operator returns two values: next value on the channel and the channel condition. Here is the table of possible values for
v, ok := <-myChannel:
len(myChannel) == 0
len(myChannel) > 0
v == value
ok == true
v == ZeroValue
ok == false
v == value
ok == false
Therefore, signalling of the completion of
processors can be done over one channel, elegantly.
Final code looks like this:
PS: While I was preparing the example, I realized that
Zero value used often for some types, expecially built-ins like
float. I’m not yet sure about how to handle them. Maybe I’ll return to it in a later post.