CARLsim  3.1.3
CARLsim: a GPU-accelerated SNN simulator
Chapter 10: ECJ

10.1 CARLsim-ECJ Parameter-Tuning Framework Overview

CARLsim now has a software interface to an evolutionary computation system written in Java (ECJ) (Luke et al., 2006) to provide an automated parameter tuning framework. We found that an automated tuning framework became increasingly useful as our SNN models became more complex. Evolutionary Algorithms (EAs) enable flexible parameter tuning by optimizing a generic fitness function. The first version of the automated parameter-tuning framework used an EA library called Evolving Objects (EO) as the EA engine (Carlson et al., 2014). ECJ was chosen to supersede EO because it is under active development (Linux), supports multi-threading, has excellent documentation, and implements a variety of EAs (Luke et al., 2006).

Fig. 1. General approach to parameter tuning. ECJ performs the EA and passes the current generation of parameters (red arrow) to CARLsim for evaluation using the parameter tuning interface (PTI) code. CARLsim assigns each parameter set to an SNN and evaluates all the individuals in parallel, passing the fitness values back to ECJ for selection of individuals in the next generation (black arrow).

Source: Beyeler et al., 2015

Fig. 1 shows the general approach of the automated parameter tuning framework. ECJ implements an EA with a parameter file that includes: EA parameters, the number of individuals per generation, and parameter ranges. Each step of the EA is executed by ECJ except for the evaluation of the fitness function, which is completed by CARLsim. CARLsim evaluates the fitness function in parallel by running multiple SNN individuals simultaneously on the GPU, where the bulk of the computations occur. The majority of the execution time is spent running CARLsim’s optimized C++/CUDA code, and the overhead created by ECJ's operations is negligible.

PTI consists of two components that define a standard interface for executing CARLsim models with ECJ. On the C++ side, PTI provides a small library that helps load a population of parameter vectors and provides a standard mechanism for returning information about their performance (such as a fitness value) after the simulation has completed. On the ECJ side, PTI provides a Jar file with a number of ECJ extensions. These extensions allow ECJ to invoke and communicate with an external simulator when an evolutionary algorithm is launched. Communication between the ECJ process and the simulator processes it launches happens automatically via standard UNIX input/output streams.

In a typical usage scenario, at the beginning of every generation of the evolutionary algorithm, the parameters to be tuned are passed from ECJ to an SNN model that has been implemented in CARLsim, and which uses the PTI interface to define its input-output behavior. The model evaluates individuals in parallel and returns the resulting fitness values to ECJ via standard streams. The tuning framework allows users to tune virtually any SNN parameter, while the fitness functions can be written to depend on the neuronal activity or synaptic weights of the SNN.
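The stream-based exchange described above can be illustrated with a minimal, self-contained C++ sketch. It assumes one comma-separated parameter vector per input line and one fitness value per output line, in the same order. The function names and the toy fitness are hypothetical stand-ins, not part of the PTI library; a real model would run an SNN where toyFitness is called.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Parse one comma-separated line into a parameter vector.
std::vector<double> parseParameterLine(const std::string& line) {
    std::vector<double> params;
    std::istringstream fields(line);
    std::string field;
    while (std::getline(fields, field, ',')) {
        params.push_back(std::stod(field));
    }
    return params;
}

// Hypothetical stand-in for an SNN evaluation: the fitness is the negated
// squared distance of every parameter from 0.25 (higher is better).
double toyFitness(const std::vector<double>& params) {
    double sum = 0.0;
    for (double p : params) {
        sum += (p - 0.25) * (p - 0.25);
    }
    return 0.0 - sum;
}

// Read one individual per line and write one fitness per line, in order.
// Returns the number of individuals that were evaluated.
std::size_t evaluateStream(std::istream& in, std::ostream& out) {
    std::size_t count = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;  // ignore blank lines
        out << toyFitness(parseParameterLine(line)) << '\n';
        ++count;
    }
    return count;
}
```

A stand-alone program would call evaluateStream(std::cin, std::cout) from main. In an actual PTI model this plumbing is handled by the PTI library, so the sketch only illustrates the data flow.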

Since
v3.0

10.2 Installation

The current version of the CARLsim parameter-tuning interface (PTI) uses Evolutionary Computation in Java (ECJ), version 22 or greater. For information on how to install ECJ, please see the ECJ homepage (http://cs.gmu.edu/eclab/projects/ecj).

After ECJ version 22 or greater has been installed, the user needs to set the ECJ_JAR and ECJ_PTI_DIR variables, either as environment variables in the .bashrc file or in the user.mk file located in the tools/ecj_pti subdirectory. ECJ_JAR points to the installation location of the ECJ jar file. ECJ_PTI_DIR points to the desired installation location of the CARLsim-ECJ PTI code. The code below shows the default values that can be changed in the user.mk file:

#------------------------------------------------------------------------------
# CARLsim/ECJ Parameter Tuning Interface Options
#------------------------------------------------------------------------------
# path of evolutionary computation in java installation for ECJ-PTI CARLsim
# support
ECJ_JAR ?= /opt/ecj/jar/ecj.22.jar
ECJ_PTI_DIR ?= /opt/CARL/carlsim_ecj_pti
Warning
The user should note that the user.mk file in the tools/ecj_pti directory is distinct from the user.mk file in the main carlsim directory and not confuse them. Here we are configuring the user.mk file in the tools/ecj_pti subdirectory.

As an alternative, users can set these environment variables in their .bashrc files if they are using a Unix-like OS. The following lines would be appended to .bashrc.

export ECJ_JAR=/opt/ecj/jar/ecj.22.jar
export ECJ_PTI_DIR=/opt/CARL/carlsim_ecj_pti
Note
You may have to open a new shell or reboot for these variables to take effect.

Once the environment variables have been set, navigate to tools/ecj_pti and run:

make && sudo make install

This will install the CARLsim-ECJ PTI library into the location pointed to by ECJ_PTI_DIR. Both the static C++ library and the Jar with the Java extensions for ECJ will be installed in this directory.

Since
v3.0

10.3 Example ECJ Parameter File

To operate an evolutionary algorithm with ECJ, users define a parameter file that details all the algorithm components, parameters, and data-logging mechanisms that will go into the experiment. Creating experiments in this way is a form of declarative programming or inversion of control. This paradigm offers a great deal of flexibility and makes automation of experiments easy in many ways, but it can have a steep learning curve for users who aren't accustomed to programming in this way.

A thorough introduction to using ECJ and its parameter language is beyond the scope of CARLsim's documentation. We encourage CARLsim users to make use of the ECJ tutorials and the fantastic manual that Sean Luke maintains over on the ECJ website.

In general, to use the CARLsim-ECJ PTI, users will create an SNN modeled after the tutorial program found in Tutorial 7: Parameter Tuning Interface (PTI). This program is responsible for reading a collection of parameters, instantiating a number of neural networks, executing them, and returning information about the behavior of each parameterized network (typically a scalar fitness value).

The user then must configure an ECJ parameter file defining the evolutionary search mechanism that will be used to search the parameter space for high-performing network configurations. Here we give a few relevant parts of an example parameter file that you may customize to your own purposes:

parent.0 = @ec.simple.SimpleEvolutionState simple.params
# Modifications to the Simple EA boiler plate
# =========================
eval = ecjapp.eval.SimpleGroupedEvaluator
generations = 50
pop.subpop.0.size = 10
# Set up our evolutionary algorithm
# =========================
pop.subpop.0.species = ec.vector.FloatVectorSpecies
pop.subpop.0.species.pipe = ec.vector.breed.VectorMutationPipeline
pop.subpop.0.species.pipe.likelihood = 1.0
pop.subpop.0.species.pipe.source.0 = ec.vector.breed.VectorCrossoverPipeline
pop.subpop.0.species.pipe.source.0.likelihood = 0.9
pop.subpop.0.species.pipe.source.0.source.0 = ec.select.TournamentSelection
#pop.subpop.0.species.pipe.source.0.source.0 = ec.es.ESSelection
pop.subpop.0.species.pipe.source.0.source.1 = same
select.tournament.size = 2
pop.subpop.0.species.ind = ec.vector.DoubleVectorIndividual
pop.subpop.0.species.fitness = ec.simple.SimpleFitness
pop.subpop.0.species.genome-size = 4
pop.subpop.0.species.min-gene = 0.0005
pop.subpop.0.species.max-gene = 0.5
pop.subpop.0.species.mutation-type = gauss
pop.subpop.0.species.mutation-stdev = 0.1
pop.subpop.0.species.mutation-bounded = true
pop.subpop.0.species.mutation-prob = 0.4
pop.subpop.0.species.crossover-likelihood = 0.4
#pop.subpop.0.species.crossover-prob= 0.9
pop.subpop.0.species.crossover-type = two
# breed options
breed = ec.es.MuPlusLambdaBreeder
breed.elite.0 = 1
breed.reevaluate-elites.0 = false
# evolution strategies options
es.mu.0 = 5
es.lambda.0 = 5
# Termination condition
quit-on-run-complete = true
# Set up external fitness evaluation
# =========================
eval.problem.objective.idealFitnessValue = 0.333
eval.problem = ecjapp.eval.problem.CommandProblem
eval.problem.objective = ecjapp.eval.problem.objective.StringToDoubleObjective
eval.problem.simulationCommand = $carlsim_tuneFiringRatesECJ

It's probably easiest to start with this ECJ parameter file and modify it to your project's needs. The particular variables the user needs to edit are:

eval.problem.simulationCommand: the path to the CARLsim binary that ECJ executes every generation to evaluate the fitness function. In ECJ's path notation, the $ prefix makes the path relative to the directory from which ECJ was launched.

generations: maximum number of generations to run.

pop.subpop.0.size: number of individuals in each generation.

pop.subpop.0.species.genome-size: total number of parameters to be tuned in each individual.

pop.subpop.0.species.min-gene: default lower bound for all parameters to be tuned.

pop.subpop.0.species.max-gene: default upper bound for all parameters to be tuned.

To specify the range of each parameter individually, append the gene index to the min-gene and max-gene parameter names, as shown in the code below:

pop.subpop.0.species.min-gene.0=0.0004
pop.subpop.0.species.max-gene.0=0.004
pop.subpop.0.species.min-gene.1=0.00005
pop.subpop.0.species.max-gene.1=0.0005
pop.subpop.0.species.min-gene.2=0.01
pop.subpop.0.species.max-gene.2=0.1
pop.subpop.0.species.min-gene.3=0.1
pop.subpop.0.species.max-gene.3=0.2

However, you still need to keep the default pop.subpop.0.species.min-gene and pop.subpop.0.species.max-gene entries in the parameter file.

For more information about the ECJ configuration file, please visit the ECJ homepage.

Users then need to implement their own CARLsim evaluation function. The overall structure is as follows. A specific Experiment class is implemented and inherited from the base Experiment class:

class TuneFiringRatesECJExperiment : public Experiment {

The only class functions are the default class constructor and the run function. The run function is where CARLsim code is written and executed. As the final step, the fitness values are written back to ECJ using standard Linux streams.

void run(const ParameterInstances &parameters, std::ostream &outputStream) const {
...
CARLsim* const network = new CARLsim("tuneFiringRatesECJ", GPU_MODE, SILENT);
...
network->setupNetwork();
...
network->runNetwork(runTime,0);
...
for(unsigned int i = 0; i < parameters.getNumInstances(); i++) {
...
outputStream << fitness[i] << endl;
...
}
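How each fitness[i] value is computed is left to the model author. As one illustration, the following minimal sketch assumes a fitness defined as the inverse of the total absolute error between measured and target firing rates. The function name, the rate vectors, and the small epsilon guard are all hypothetical choices made for this example and are not taken from the CARLsim source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical fitness: inverse of the summed absolute error (in Hz)
// between measured and target firing rates. Higher is better.
double firingRateFitness(const std::vector<double>& measuredHz,
                         const std::vector<double>& targetHz) {
    assert(measuredHz.size() == targetHz.size());
    double error = 0.0;
    for (std::size_t i = 0; i < measuredHz.size(); ++i) {
        error += std::fabs(measuredHz[i] - targetHz[i]);
    }
    // Assumed guard: avoid division by zero when the match is perfect.
    const double epsilon = 1e-9;
    return 1.0 / (error + epsilon);
}
```

With this definition a network whose rates are off by 1 Hz and 2 Hz in two groups would score roughly 1/3, and the score grows without practical bound as the error approaches zero.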
See also
Tutorial 7: Parameter Tuning Interface (PTI)
Since
v3.0

10.4 ECJ Extensions

Most of what you need to know to write an ECJ parameter file is covered in the ECJ manual. PTI does add some non-standard extensions to ECJ, however, which can be used by setting parameters appropriately. This section serves as the canonical reference for these extensions.

10.4.1 SimpleGroupedEvaluator

The core extension that PTI adds to ECJ is a mechanism that allows us to evaluate multiple individuals via a single call to an external simulator. This allows us to, for instance, execute multiple individuals in parallel on a GPU (something that is not possible with ECJ's built-in mechanisms).

To use PTI's grouped evaluation feature, set the eval parameter in the ECJ parameter file like so:

eval = ecjapp.eval.SimpleGroupedEvaluator

The SimpleGroupedEvaluator class is a version of ECJ's SimpleEvaluator that has been modified so that individuals are evaluated in batches. It accepts the following additional parameters:

  • eval.chunk-size (optional):
    The maximum number of individuals that should be evaluated in each call to the external simulator. You might set this value, for instance, to the number of neural networks your GPU is capable of executing simultaneously.
  • eval.measureEvalTimes (optional):
    If set to true, the number of milliseconds that elapse between generations will be printed. The output is in CSV format. Each row shows the job number, generation, and milliseconds, in that order.
  • eval.evalTimesFile (optional):
    If specified, the times recorded by the measureEvalTimes mechanism will be written to this file instead of to stdout.

SimpleGroupedEvaluator also responds to the same parameters as SimpleEvaluator, such as num-tests and, importantly, problem (see the ECJ manual).
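To make the effect of eval.chunk-size concrete, the following sketch computes how many simulator calls a population would need and how many individuals each call would receive. It assumes the population is cut into consecutive batches of at most chunk-size individuals; how SimpleGroupedEvaluator orders or assigns the batches internally is not specified here, and the function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Split `populationSize` individuals into consecutive batches of at most
// `chunkSize` individuals and return the size of each batch.
std::vector<std::size_t> chunkSizes(std::size_t populationSize,
                                    std::size_t chunkSize) {
    assert(chunkSize > 0);
    std::vector<std::size_t> sizes;
    for (std::size_t start = 0; start < populationSize; start += chunkSize) {
        // The last batch may be smaller than chunkSize.
        sizes.push_back(std::min(chunkSize, populationSize - start));
    }
    return sizes;
}
```

For example, a population of 10 with a chunk size of 4 would need three simulator calls under this assumption, with 4, 4, and 2 individuals respectively.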

Since
v3.1

10.4.2 CommandProblem

In ECJ, objective functions are defined by custom implementations of the Problem class. The CARLsim-ECJ jar file includes a special Problem that is meant to be used in conjunction with SimpleGroupedEvaluator to launch an external simulator. This is called CommandProblem, since it launches an external command:

eval.problem = ecjapp.eval.problem.CommandProblem

This object accepts a number of crucially important parameters:

  • eval.problem.simulationCommand:
    Path to the binary executable for the external simulation. This should be your compiled CARLsim model.
  • eval.problem.errorGenesFile:
    If the external simulation crashes at any point or produces invalid output, the input (individuals) that caused the error will be recorded to this file. This can be very important for debugging models that behave incorrectly during parameter tuning.
  • eval.problem.errorResultsFile:
    If the external simulation crashes at any point or produces invalid output, the output that caused the error will be recorded to this file. This can be very important for debugging models that behave incorrectly during parameter tuning.
  • eval.problem.objective:
    The ObjectiveFunction that is used to convert the output of the external simulation into a scalar fitness value (See 10.4.3 ObjectiveFunction).
  • eval.problem.simulationCommandArguments (optional):
    Additional command-line arguments you would like passed to every invocation of the command. These arguments are constant, and do not change.
  • eval.problem.dynamicArguments (optional):
    A DynamicArguments for passing information about the evolutionary algorithm's state to the external simulation as command-line arguments (See 10.4.4 DynamicArguments and Multi-GPU Evolution).
  • eval.problem.reevaluate (optional):
    If set to true, individuals will have their fitness re-evaluated each generation. You may want to reevaluate fitnesses if there is noise in your objective function, for instance. Defaults to false.
Since
v3.1

10.4.3 ObjectiveFunction

In some applications, it's useful to think of the external simulator as performing a genotype-to-phenotype mapping: the genotype of each individual (a vector of parameters) is mapped to a phenotype that describes the resulting behavior of the simulation. An ObjectiveFunction in PTI's extension of ECJ performs the final step of converting the phenotype value into a scalar fitness value.

Preferably, the phenotype returned by the simulator is just a single scalar value: the individual's fitness. In CARLsim models, this means that we expect any extraction of features or other data from the SNN execution, and its synthesis into a fitness value, to be performed in the C++ model implementation.

When this is the case, the StringToDoubleObjective object allows ECJ to recognize the output of your model as a stream of fitness values:

eval.problem.objective = ecjapp.eval.problem.objective.StringToDoubleObjective

This objective accepts one optional parameter:

  • eval.problem.objective.idealFitnessValue (optional):
    When set, an individual will be considered optimal if its fitness is equal to or greater than the specified value.

There may be some cases where you wish to perform some post-processing in Java to convert more complex information on a network's behavior into a fitness value. If this is the case, then you may create your own ECJ extension by defining a subclass of ObjectiveFunction. Creating custom classes and using them with ECJ is beyond the scope of this documentation, however.

Since
v3.1

10.4.4 DynamicArguments and Multi-GPU Evolution

In some applications, your CARLsim model may need to know some specific information about the evolutionary algorithm state, beyond the genomes of individuals. In PTI's ECJ extension, a DynamicArguments object plays the role of taking data from variables in ECJ and sending it to the simulator as command-line arguments.

One important use of DynamicArguments is to invoke different simultaneous CARLsim instances on different GPU cards. The ThreadDynamicArguments object makes this possible by passing a thread ID to each instance of the simulator that is launched in a generation.

Say, for example, that your evolutionary algorithm produces 100 children, but that you can only evaluate 50 individuals at a time in your simulation. If you have set eval.chunk-size = 50, and if you have configured ECJ to use 2 evaluation threads, then two threads will be launched, each of which will send its 50 individuals to your CARLsim model to be evaluated on a GPU. ThreadDynamicArguments will assign one of these simulations a thread ID of 0, and the other one a thread ID of 1.

eval.problem.dynamicArguments = ecjapp.eval.problem.ThreadDynamicArguments

The following parameters are available when using this mechanism:

  • eval.problem.dynamicArguments.option:
    The name of the command-line option to pass the thread ID to.
  • eval.problem.dynamicArguments.modulo (optional):
    If specified, the thread ID modulo this value will be passed. Use this if you are using more evaluation threads than the number of GPUs you have available.
  • eval.problem.dynamicArguments.dynamicArguments (optional):
    Another DynamicArguments object. Use this if you want to chain arguments together to pass more information to the simulator.

If, for instance, option is set to '-device', then the simulator launched by thread 1 will have the argument '-device 1' passed to it as a command-line option.
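On the simulator side, the model has to read that option from its command line. The following minimal sketch shows one way to do so, assuming the option name '-device' from the example above. Both helper functions are hypothetical and not part of the PTI library; deviceForThread merely mirrors what the optional modulo setting does to the thread ID before it is passed.

```cpp
#include <cstdlib>
#include <cstring>

// Return the integer that follows `option` on the command line,
// or `fallback` when the option is not present.
int parseIntOption(int argc, const char* const argv[],
                   const char* option, int fallback) {
    for (int i = 1; i + 1 < argc; ++i) {
        if (std::strcmp(argv[i], option) == 0) {
            return std::atoi(argv[i + 1]);
        }
    }
    return fallback;
}

// Mirror of the modulo setting: fold a thread ID onto the number of
// GPUs that are actually installed. Assumes numGpus > 0.
int deviceForThread(int threadId, int numGpus) {
    return threadId % numGpus;
}
```

The returned index can then be handed to whatever GPU-selection mechanism your model uses. With four evaluation threads and two GPUs, a modulo of 2 would map threads 0 and 2 to device 0 and threads 1 and 3 to device 1.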

PTI's ECJ extension also includes a DynamicArguments for telling the simulator what generation the evolutionary algorithm is on:

eval.problem.dynamicArguments = ecjapp.eval.problem.GenerationDynamicArguments

It accepts the following parameters:

  • eval.problem.dynamicArguments.option:
    The name of the command-line option to pass the generation number to.
  • eval.problem.dynamicArguments.dynamicArguments (optional):
    Another DynamicArguments object. Use this if you want to chain arguments together to pass more information to the simulator.
Since
v3.1

10.4.5 Recording Statistics

By default, ECJ produces an out.stat file that records the genome and fitness of the best individual in each generation, and of the best individual from the entire run. When it comes time to analyze the results of an EA, this file can be difficult to parse, and it may not contain all the information you need.

PTI's extension provides two additional ECJ Statistics objects that you may use to collect and store commonly needed EA data in CSV format: ecjapp.statistics.FitnessStatistics and ecjapp.statistics.DoubleVectorGenomeStatistics.

To add a Statistics class to an EA configuration, we add them as children of the default stat object. To do so, first specify the number of children you wish to add, and then define the parameters for each child. In the following example we configure ECJ to collect both FitnessStatistics and DoubleVectorGenomeStatistics.

stat.num-children = 2
stat.child.0 = ecjapp.statistics.FitnessStatistics
stat.child.0.file = $fitness.csv
stat.child.1 = ecjapp.statistics.DoubleVectorGenomeStatistics
stat.child.1.pVectorLength = pop.subpop.0.species.genome-size
stat.child.1.file = $genomes.csv

FitnessStatistics records the minimum, maximum, and average fitness, and the standard deviation of fitnesses in the entire population at each generation. It also records the time (in milliseconds) that each generation finished evaluating. Alternatively, FitnessStatistics can be used to record the fitness of every individual in the population (see the individuals option below).

The following parameters are available for configuring FitnessStatistics:

  • stat.child.0.file:
    Name of the output file to write CSV data to. The file will be created if it does not already exist. Prefixing the file name with a dollar sign ('$') indicates that it should be saved in the current directory. If you are running multiple runs of the EA (i.e. via the jobs parameter), then the filename will be prefixed with 'job.[jobnumber].', and one file will be created for each job.
  • stat.child.0.gzip (optional):
    If true, the data will be compressed with gzip.
  • stat.child.0.individuals (optional):
    If true, the fitness of every individual in the population will be recorded, instead of summary statistics. The resulting CSV will have 4 columns instead of the usual 7, with one row per individual per generation.

DoubleVectorGenomeStatistics records the genome and fitness of all individuals in the population. The following parameters are available:

  • stat.child.1.file:
    Name of the output file to write CSV data to. The file will be created if it does not already exist. Prefixing the file name with a dollar sign ('$') indicates that it should be saved in the current directory. If you are running multiple runs of the EA (i.e. via the jobs parameter), then the filename will be prefixed with 'job.[jobnumber].', and one file will be created for each job.
  • stat.child.1.pVectorLength:
    The name of a parameter whose value indicates the number of parameters in the genome. This tells DoubleVectorGenomeStatistics how many parameter columns need to be in its output. Typically you'll want to point this to the genome-size parameter of your species.
  • stat.child.1.compress (optional):
    If true, the data will be compressed with gzip.
  • stat.child.1.initOnly (optional):
    If true, only the genomes of the initial population will be recorded.
Since
v3.1

10.5 Tips and Tricks

10.5.1 Retrieving Genomes of the Best Individuals

After running an evolutionary algorithm a number of times to tune a spiking neural network, we often wish to retrieve the best individual found in each run for further analysis. This can be done by examining the output recorded by DoubleVectorGenomeStatistics and finding the individual with the highest fitness.

Alternatively, the following shell script can be copied and pasted to automatically extract the best individual. It assumes that we have run ECJ for 10 jobs, and that the output of DoubleVectorGenomeStatistics has been written to files named 'job.[num].genomes.csv', where [num] ranges from 0 to 9:

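for j in {0..9}; do
# Skip the header row, sort numerically by fitness (column 3, descending), and keep the genome columns of the top row.
tail -n+2 job.$j.genomes.csv | sort -t, -k3,3 -gr | head -n1 | cut -d, -f4- > bestInd$j.csv
done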
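The same selection can be sketched in C++ for users who prefer to do the analysis in code. The sketch relies on the layout the shell script implies: a header row, the fitness in the third comma-separated column, and the genome from the fourth column onward. The function names are hypothetical, and the meaning of the first two columns is not assumed.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Split a CSV line on commas (no support for quoted fields).
std::vector<std::string> splitCsv(const std::string& line) {
    std::vector<std::string> fields;
    std::istringstream stream(line);
    std::string field;
    while (std::getline(stream, field, ',')) {
        fields.push_back(field);
    }
    return fields;
}

// Return the genome (columns 4 and up, comma-joined) of the row with the
// highest fitness (column 3). Returns an empty string if no data rows exist.
std::string bestGenome(std::istream& csv) {
    std::string line;
    std::getline(csv, line);  // skip the header row
    std::string best;
    double bestFitness = 0.0;
    bool found = false;
    while (std::getline(csv, line)) {
        const std::vector<std::string> fields = splitCsv(line);
        if (fields.size() < 4) continue;  // not a complete data row
        const double fitness = std::stod(fields[2]);
        if (!found || fitness > bestFitness) {
            found = true;
            bestFitness = fitness;
            best.clear();
            for (std::size_t i = 3; i < fields.size(); ++i) {
                if (i > 3) best += ',';
                best += fields[i];
            }
        }
    }
    return best;
}
```

Because the fitness is parsed with std::stod, values written in scientific notation are compared numerically.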

10.5.2 Evaluating Individuals Outside of ECJ

Equally often, we want to run and analyze a simulation using high-fitness parameters that were found by ECJ.

Because PTI models use the standard input stream to receive simulation parameters, you can always execute your simulation with any set of parameters by piping the vector of parameters into the simulation. For instance, if I want to use the parameters '0.1, 0.2, 0.3, 0.4, 0.5' to run a simulation whose binary is named 'carlsim_myModel', then I can do so like this:

echo "0.1,0.2,0.3,0.4,0.5" | ./carlsim_myModel

Similarly, if I would like to run several networks in parallel with the same parameters, I can repeat the parameter vector several times:

(for i in {1..40}; do echo "0.1,0.2,0.3,0.4,0.5"; done) | ./carlsim_myModel

Say that I have stored the best genome found from each run in a file, as described in the previous section. Perhaps my fitness evaluation is rather noisy, and I want to get a good estimate of the true expected fitness value of each job's best individual. With the following script, I can take 40 fitness samples from each job's best individual in parallel:

for j in {0..9}; do
(for i in {1..40}; do cat bestInd$j.csv; done) | ./carlsim_myModel
done;

References

Beyeler, M., Carlson, K. D., Chou, T. S., Dutt, N., Krichmar, J. L., CARLsim 3: A user-friendly and highly optimized library for the creation of neurobiologically detailed spiking neural networks. (Submitted)

Carlson, K. D., Nageswaran, J. M., Dutt, N., Krichmar, J. L., An efficient automated parameter tuning framework for spiking neural networks, Front. Neurosci., vol. 8, no. 10, 2014.

Luke, S., Panait, L., Balan, G., Paus, S., Skolicki, Z., Bassett, J., Hubley, R., and Chircop, A., ECJ: A java-based evolutionary computation research system, http://cs.gmu.edu/eclab/projects/ecj, 2006.