Issue
Here is my problem: I'm trying to create a cluster of Debian servers on which I can train my ANNs (language: Python; libraries: Theano, TensorFlow, Keras).
I would like a master server that has the libraries installed and to which I only have to send my code and dataset. This server would then distribute the computation across 3 slave servers. I've heard about Pacemaker and Corosync, but all the articles I've read discuss high availability, not shared computing. Do you have any ideas?
Solution
For this case, I searched around and decided to use Apache Spark with Elephas, which works with Keras. For the moment my installation works under Python 2.7 and Java 8; I had problems under Java 11 whose cause I could not identify. Another option would be Apache Spark with dist-keras, a library developed by CERN. After looking into it, though, that solution seems much more complex to implement. Being a bit of a beginner, I have therefore chosen Elephas.
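To make the "master distributes, workers compute" idea concrete, here is a minimal sketch of synchronous data-parallel training, the general approach that tools like Elephas build on top of Spark. It is not Elephas or Spark code: the names `shard_gradient` and `train` are hypothetical, the model is a toy linear fit instead of a neural network, and local threads stand in for the three worker servers. It assumes Python 3 and the standard library only.

```python
# Hypothetical illustration of data-parallel training for y = w*x + b.
# Threads on one machine stand in for the three worker servers.
from concurrent.futures import ThreadPoolExecutor


def shard_gradient(args):
    """Worker step: mean-squared-error gradient on one shard of the data."""
    w, b, shard = args
    n = len(shard)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in shard) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in shard) / n
    return grad_w, grad_b


def train(data, n_workers=3, epochs=2000, lr=0.05):
    """Master loop: split the data, collect gradients, average, update."""
    # Round-robin split so every worker gets a similar share of the data.
    shards = [data[i::n_workers] for i in range(n_workers)]
    w, b = 0.0, 0.0
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(epochs):
            # Send the current parameters and one shard to each worker.
            jobs = [(w, b, shard) for shard in shards]
            grads = list(pool.map(shard_gradient, jobs))
            # Average the workers' gradients, then update on the master.
            w -= lr * sum(g[0] for g in grads) / n_workers
            b -= lr * sum(g[1] for g in grads) / n_workers
    return w, b


if __name__ == "__main__":
    # Synthetic data generated from y = 3x + 1.
    data = [(i / 10.0, 3.0 * (i / 10.0) + 1.0) for i in range(30)]
    w, b = train(data)
    print(round(w, 2), round(b, 2))
```

On the synthetic data above, the loop should recover values close to w = 3 and b = 1. In a real cluster the shards would live on separate machines and a framework such as Spark would handle shipping the parameters and collecting the results.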
Answered By - Asmoht Answer Checked By - Mildred Charles (WPSolving Admin)