Learning GO: Concurrently execute expensive validations on several items and return the list of validation errors with #golang

By | May 24, 2018

Much of the real-world software in the financial industry does little more than validate inputs. By validating inputs I mean business validation: checking the data against standard formats or against complex business rule sets.

This step matters because it is the one that gets hammered with the largest volume of data, so it must be efficient and fast.

Below I give an example of how simple it is in golang to run this business validation step in an efficient, parallel way.

Assume our input data is received in the form of a slice of input items:

inputData := make([]Item, 1000)
Populate(inputData)

Populate(inputData) is a function that fills the slice with elements, for example by parsing a file or by reading them from another source such as a messaging server or an incoming HTTP POST.

Assume also that we have some expensive business validation that must be performed only when the Item is identified as needing it.

func ValidateItem(item *Item) error { ... }

In real life this can be the case when we have to process different types of input items (acknowledge messages, instruction messages etc.). We may need to execute the complex business validation just for “instruction messages” and not for “acknowledge messages”. There may be a simple function that decides that:

func needsComplexValidation(item *Item) bool { ... }

The following function is all we need to validate all input Items in parallel and return a single summary error. It uses the errors, strings and sync packages from the standard library:

func BusinessValidate(inputData []Item) error {
	businessErrors := make([]string, 0)
	mux := &sync.Mutex{}
	var wg sync.WaitGroup
	for _, item := range inputData {
		if needsComplexValidation(&item) {
			wg.Add(1)
			go func(closureItem Item) {
				defer wg.Done()
				err := ValidateItem(&closureItem) // validate this goroutine's own copy
				if err != nil {
					mux.Lock()
					businessErrors = append(businessErrors, err.Error())
					mux.Unlock()
				}
			}(item)
		}
	}
	wg.Wait() // wait here, after the loop, for every started goroutine
	if len(businessErrors) > 0 {
		return errors.New(strings.Join(businessErrors, "\n"))
	}
	return nil
}

That is all.
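
For readers who want to run it, here is a minimal self-contained sketch of the approach. The Item fields, the "instruction" and "acknowledge" kinds, and the rule inside ValidateItem are assumptions invented for this example, since the real bodies are not shown in this post. Goroutines finish in no fixed order, so main sorts the messages before printing them.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
	"sync"
)

// Item is a hypothetical input record; the fields are illustrative only.
type Item struct {
	ID     int
	Kind   string
	Amount float64
}

// Assumption: only "instruction" items need the expensive check.
func needsComplexValidation(item *Item) bool { return item.Kind == "instruction" }

// Assumption: an instruction must carry a positive amount.
func ValidateItem(item *Item) error {
	if item.Amount <= 0 {
		return fmt.Errorf("item %d: amount must be positive", item.ID)
	}
	return nil
}

// BusinessValidate validates the items concurrently and returns one summary error.
func BusinessValidate(inputData []Item) error {
	businessErrors := make([]string, 0)
	mux := &sync.Mutex{}
	var wg sync.WaitGroup
	for _, item := range inputData {
		if needsComplexValidation(&item) {
			wg.Add(1)
			go func(closureItem Item) {
				defer wg.Done()
				if err := ValidateItem(&closureItem); err != nil {
					mux.Lock()
					businessErrors = append(businessErrors, err.Error())
					mux.Unlock()
				}
			}(item)
		}
	}
	wg.Wait()
	if len(businessErrors) > 0 {
		return errors.New(strings.Join(businessErrors, "\n"))
	}
	return nil
}

func main() {
	inputData := []Item{
		{ID: 1, Kind: "instruction", Amount: 10},
		{ID: 2, Kind: "instruction", Amount: 0},
		{ID: 3, Kind: "acknowledge", Amount: 0},
		{ID: 4, Kind: "instruction", Amount: -5},
	}
	err := BusinessValidate(inputData)
	if err == nil {
		fmt.Println("all items valid")
		return
	}
	lines := strings.Split(err.Error(), "\n")
	sort.Strings(lines) // completion order varies between runs
	for _, line := range lines {
		fmt.Println(line)
	}
}
```

With this made-up data, items 2 and 4 fail the assumed rule, while item 3 is skipped because it is not an instruction.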

Let me explain it below:

Line 2:
Define a slice where we will store the business validation errors

businessErrors := make([]string, 0)

Slices can be created with the built-in make function; this is how you create dynamically-sized arrays.

Line 3:
Define a mutex mux. This mutex will synchronize access to businessErrors. We need it because slices in golang are not safe for concurrent use.

mux := &sync.Mutex{}

Note that we use the “sync” golang package for this.
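
As a small illustration of what the lock buys us, the sketch below has many goroutines append to one shared slice. The function name collect and the count of 1000 are made up for this example. Without the Lock/Unlock pair the appends would be a data race and some values could be lost; running with go run -race would report it.

```go
package main

import (
	"fmt"
	"sync"
)

// collect starts n goroutines that each append one value to a shared slice.
// The mutex ensures that only one goroutine executes append at a time.
func collect(n int) []int {
	var (
		mux     sync.Mutex
		wg      sync.WaitGroup
		results []int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			mux.Lock()
			results = append(results, v)
			mux.Unlock()
		}(i)
	}
	wg.Wait()
	return results
}

func main() {
	fmt.Println(len(collect(1000))) // prints 1000: no append was lost
}
```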

Line 4:
Define a wait group that will be used to synchronize the goroutines and make sure we continue past the main loop only after all the goroutines started inside it have finished processing. Thanks to the wait group we are able to return a single error containing all the business validation errors.

var wg sync.WaitGroup

Line 5:
Define our for loop that iterates on the set of input Items:

for _, item := range inputData {
}

Note the use of the “range” keyword. It is a convenient way in golang to iterate through all the elements of a slice or map, like a built-in iterator.
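
A short sketch of range on a slice, using made-up helper names (total, double). It shows one detail worth knowing: the value that range hands out is a copy of the element, so changing the slice itself has to go through the index.

```go
package main

import "fmt"

// total sums a slice with range; the index is ignored with the blank identifier.
func total(values []int) int {
	sum := 0
	for _, v := range values {
		sum += v
	}
	return sum
}

// double doubles every element in place. The range value v is a copy,
// so the write goes through the index i.
func double(values []int) {
	for i, v := range values {
		values[i] = v * 2
	}
}

func main() {
	amounts := []int{1, 2, 3}
	fmt.Println(total(amounts)) // 6
	double(amounts)
	fmt.Println(amounts) // [2 4 6]
}
```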

Line 7:
When an item needs the complex validation, we first increment the wait group counter by one, before starting the goroutine. Add takes a delta, which may be negative: if the counter becomes zero, all goroutines blocked on Wait are released, and if the counter goes negative, Add panics.

wg.Add(1)

Note that the counter is decremented by a deferred call to the wg.Done() method. Marking a call with the defer keyword ensures that it is executed just before the surrounding function exits, in our case at the end of the goroutine's execution.

defer wg.Done()

Note the wg.Wait() call that follows the loop. It blocks execution until the wait group counter drops back to zero, that is, until every goroutine has called Done.

wg.Wait()
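
The interplay of Add, the deferred Done and Wait can be tried in isolation with the sketch below. The function runAll and the atomic counter are made up for this example; the counter only serves to show that every goroutine had finished by the time Wait returned, including those that left through an early return.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runAll starts n goroutines and reports how many had finished when Wait returned.
func runAll(n int) int64 {
	var wg sync.WaitGroup
	var finished int64
	for i := 0; i < n; i++ {
		wg.Add(1) // register the goroutine before starting it
		go func(id int) {
			defer wg.Done() // runs when this function returns, on every path
			atomic.AddInt64(&finished, 1)
			if id%2 == 0 {
				return // early return: the deferred Done still runs
			}
		}(i)
	}
	wg.Wait() // blocks until the counter is back to zero
	return atomic.LoadInt64(&finished)
}

func main() {
	fmt.Println(runAll(50)) // prints 50
}
```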

Lines 8-16:
This is where the magic happens. We define a goroutine.

go func(closureItem Item) {
...
}(item)

To understand the magic of goroutines, I quote the following from golangbot:

Goroutines are functions or methods that run concurrently with other functions or methods. Goroutines can be thought of as light weight threads. The cost of creating a Goroutine is tiny when compared to a thread. Hence its common for Go applications to have thousands of Goroutines running concurrently.
Goroutines are multiplexed to fewer number of OS threads. There might be only one thread in a program with thousands of Goroutines. If any Goroutine in that thread blocks say waiting for user input, then another OS thread is created and the remaining Goroutines are moved to the new OS thread. All these are taken care by the runtime and we as programmers are abstracted from these intricate details and are given a clean API to work with concurrency.

Our goroutine will take an input parameter of type Item:
see Line 8:

closureItem Item

and will receive a copy of item.
see Line 16:

}(item)

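It is important that the goroutine receives its own copy of the item as an argument, rather than reading the loop variable item directly or taking its address. The goroutines may run at any time while the loop moves on through the slice, and each of them must work on the item it was started for. In Go versions before 1.22 the loop variable is a single variable reused by every iteration, so goroutines that captured it or received &item could all end up working on the same item, often the last one. Since Go 1.22 each iteration gets a fresh variable, but passing the copy explicitly stays correct on every version.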
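
The effect of passing the item by value can be checked with a small sketch. The Item type with a single ID field and the function idsSeen are hypothetical, made up for this example. Each goroutine receives its copy at the moment the go statement runs, so it sees the item it was started for regardless of what the loop does afterwards.

```go
package main

import (
	"fmt"
	"sync"
)

// Item is a hypothetical record used only for this illustration.
type Item struct{ ID int }

// idsSeen starts one goroutine per item, passing each item by value, and
// records at the matching position the ID that the goroutine observed.
func idsSeen(items []Item) []int {
	seen := make([]int, len(items))
	var wg sync.WaitGroup
	for i, item := range items {
		wg.Add(1)
		go func(pos int, closureItem Item) {
			defer wg.Done()
			// Each goroutine writes a different element, so no lock is needed here.
			seen[pos] = closureItem.ID
		}(i, item)
	}
	wg.Wait()
	return seen
}

func main() {
	fmt.Println(idsSeen([]Item{{ID: 10}, {ID: 20}, {ID: 30}})) // [10 20 30]
}
```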

Lines 12-14:
Here we guard the write to the slice of errors with the mutex Lock and Unlock calls:

mux.Lock()
businessErrors = append(businessErrors, err.Error())
mux.Unlock()

How to use the function
Using the function is straightforward. We just call it at whatever point we load or receive Item data:

inputData := make([]Item, 100)
...
receive(inputData)
...
err := BusinessValidate(inputData)
if err != nil {
	return err
}
...
// continue processing
...
