There are alternatives to back propagation. The simple evolutionary algorithm Continuous Gray Code Optimization works very well; the paper can be found online. Its mutation operator adds or subtracts a*exp(-p*rnd()) to a weight, where rnd() returns a uniform random number between 0 and 1, p is the so-called precision (a problem-dependent positive number), and a is chosen to match the weight interval: if the weights are constrained between -1 and 1, then a = 2. A sketch of the operator is given below.

It is easy to distribute training over many compute devices. Each device gets the full neural model and part of the training data (which can stay local and private). Each device is sent the same short sparse list of mutations and returns the cost for its part of the training data. The costs are summed; if the total is an improvement, an accept message is sent to each device, otherwise a reject message. Not much data moves around per second, so the devices could be anywhere on the internet. A sketch of that loop also follows below.

Of course, with evolution the faster the neural net evaluates, the better. Fast Transform fixed filter bank neural nets are a good choice; there are blog posts about them.
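A minimal sketch of the mutation operator, assuming weights constrained to [-1, 1]; the function names, the placeholder precision value, and the clipping back into range are my own illustrative choices, not taken from the paper.

import math
import random

def mutation(a=2.0, p=25.0):
    # One mutation value: plus or minus a*exp(-p*rnd()).
    # a = 2 matches weights constrained to [-1, 1]; p (the precision)
    # is problem dependent, 25 here is only a placeholder.
    step = a * math.exp(-p * random.random())
    return step if random.random() < 0.5 else -step

def mutate_weight(w, a=2.0, p=25.0):
    # Apply one mutation to a single weight. Clipping back into [-1, 1]
    # is my own assumption, not something stated in the source.
    return max(-1.0, min(1.0, w + mutation(a, p)))

Because exp(-p*rnd()) ranges from exp(-p) up to 1, the operator mixes many tiny adjustments with occasional large jumps, which is what lets it refine weights and escape local optima at the same time.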
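And a sketch of the distributed accept/reject loop, with the remote devices simulated as local objects; the Device class, the (index, delta) format for the sparse mutation list, the stand-in linear model, and the squared-error cost are assumptions made for illustration, not details from the source.

import math
import random

class Device:
    # One worker: holds the full weight vector and a private data shard.
    def __init__(self, weights, shard):
        self.weights = list(weights)
        self.shard = shard                    # list of (x, target) pairs

    def cost(self, mutations):
        # Cost of the mutated weights on this device's shard only.
        trial = list(self.weights)
        for i, delta in mutations:            # sparse list: (index, delta)
            trial[i] += delta
        return sum((predict(trial, x) - t) ** 2 for x, t in self.shard)

    def accept(self, mutations):
        for i, delta in mutations:
            self.weights[i] += delta

def predict(weights, x):
    # Stand-in model: a single linear neuron (illustrative only).
    return sum(w * xi for w, xi in zip(weights, x))

def sparse_mutations(n_weights, k=8, a=2.0, p=25.0):
    # k random (index, delta) pairs using the +/- a*exp(-p*rnd()) operator.
    picks = random.sample(range(n_weights), k)
    return [(i, random.choice((-1, 1)) * a * math.exp(-p * random.random()))
            for i in picks]

def train(devices, n_weights, steps=10000):
    best = sum(d.cost([]) for d in devices)   # cost with no mutation applied
    for _ in range(steps):
        muts = sparse_mutations(n_weights)    # same list goes to every device
        total = sum(d.cost(muts) for d in devices)
        if total < best:                      # improvement: broadcast accept
            best = total
            for d in devices:
                d.accept(muts)
        # else: reject, devices keep their current weights
    return best

Per step, each device receives only k index/delta pairs and sends back a single cost number, followed by a one-bit accept/reject broadcast, which is why so little data moves around.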
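My reading of "Fast Transform fixed filter bank" is a fixed fast Walsh-Hadamard transform standing in for the dense weight matrices, with only cheap per-element parameters left to train; the blog link isn't to hand, so take the layer below, including its switched-slope nonlinearity, as an assumed example rather than the author's exact construction.

import math

def wht(vec):
    # In-place fast Walsh-Hadamard transform, O(n log n), n a power of 2.
    # This is the fixed filter bank; it has no trainable parameters.
    n = len(vec)
    h = 1
    while h < n:
        for i in range(0, n, h * 2):
            for j in range(i, i + h):
                x, y = vec[j], vec[j + h]
                vec[j], vec[j + h] = x + y, x - y
        h *= 2
    scale = 1.0 / math.sqrt(n)                # keep the vector scale constant
    for i in range(n):
        vec[i] *= scale

def layer(vec, params):
    # One layer: fixed WHT followed by a learned per-element
    # switched-slope nonlinearity (one slope pair per element).
    wht(vec)
    return [p_pos * v if v >= 0.0 else p_neg * v
            for v, (p_pos, p_neg) in zip(vec, params)]

Each layer then costs O(n log n) operations and carries only 2n trainable parameters, which keeps evaluation fast enough for the many cost calls an evolutionary method needs.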