Posted almost 13 years ago by jeffheaton
Many machine learning training algorithms have the concept of an iteration or an epoch; both terms mean the same thing. An iteration is simply one pass through the training algorithm. At the end of each iteration the accuracy of the machine learning algorithm is evaluated. This process continues until the accuracy is sufficient.
The training iteration is something of a black box. What happens during a training iteration depends largely on what type of machine learning algorithm you are training. However, there are typically several steps, and the machine learning algorithm is not actually taught anything until the final step of the iteration.
The problem with this approach is that it often does not work terribly well on a multi-core architecture, and it works even worse on a multi-computer grid. The reason is that the threads must bottleneck at the end of an iteration. To use OpenCL terminology, the end of an iteration is a fence: all threads must reach that point before we continue, and once we do continue, all of the threads must start up again. It would be as if we had to clear the entire global airspace once a day to complete the daily iteration. This is very wasteful in terms of getting all of the aircraft down and then back up into the air again.
Not every aircraft can land instantly. Some are probably fairly far from an airport; some are probably in the middle of a large ocean. So we have to wait for every aircraft (thread) to finish its current task (a flight). Then we update something and let all of the aircraft take off again. If you can pardon the pun, this approach "would not fly!"
Yet for a multi-threaded, iteration-based training algorithm this is exactly what you see. Look at my CPU usage history below. This is from a hyper-threaded quad-core i7. See the "icicles"? Can you guess where my program hit an iteration? Correct: at each icicle. In this image it is not that bad, because there is quite a bit of training data; the longer an iteration takes, the more efficient this approach is. However, notice that on a small training set the icicles are much larger!
Performance has frozen over!
A much better approach is to have a never-ending supply of small work units. This can be challenging, and it often means changing the underlying algorithm to some degree. As an example of this we will look at Encog's Genetic Programming algorithm.
Genetic Programming works by creating a population of "programs" or "equations", referred to as genomes. Each of these genomes is evaluated, and the top genomes mate and mutate to produce the next generation. In genetic programming, iterations are often called generations. The typical approach for a non-species-based genetic algorithm is as follows.
1. Create a random population of genomes.
2. Evaluate every genome in the population.
3. Add some percentage of the top genomes directly into the next generation (elitism).
4. Randomly choose to mutate or crossover.
5. For mutation, choose one parent from the population via tournament, then mutate it and add the child to the next generation.
6. For crossover, choose two parents from the population via tournament, then breed a new offspring that is added to the next generation.
7. Does the next population have the desired population level? If not, go to step 3.
8. Replace the current population with the new population.
9. Continue back to step 2 until the evaluation has a good enough score.
The iteration ends at step 8. Steps 4 through 7 can be done in parallel, but then the whole process must synchronize before steps 8 and 9 can run. An icicle!
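To make the icicle concrete, here is a minimal Java sketch (not Encog's actual code) of one such generational iteration. The Genome class and the tournament, mutate, and crossover helpers are hypothetical stand-ins; the point is the invokeAll call, which is the fence where every thread must wait before the next generation can begin.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of a generational (iteration-based) GA step; not Encog's API.
public class GenerationalSketch {

    // Placeholder genome type, for illustration only.
    static class Genome { double score; }

    List<Genome> evolveOneGeneration(List<Genome> population,
                                     ExecutorService pool,
                                     int populationSize) throws Exception {
        List<Callable<Genome>> tasks = new ArrayList<>();
        for (int i = 0; i < populationSize; i++) {
            tasks.add(() -> {
                // Steps 4 through 7: pick an operator, select parents, produce one child.
                if (ThreadLocalRandom.current().nextBoolean()) {
                    return mutate(tournament(population));
                }
                return crossover(tournament(population), tournament(population));
            });
        }

        // The fence: invokeAll blocks until EVERY task has finished.
        // All cores sit idle while the slowest task completes: the icicle.
        List<Future<Genome>> results = pool.invokeAll(tasks);

        List<Genome> nextGeneration = new ArrayList<>();
        for (Future<Genome> f : results) {
            nextGeneration.add(f.get());
        }
        return nextGeneration;   // Step 8: replace the current population.
    }

    // Stubs standing in for real selection and variation operators.
    private Genome tournament(List<Genome> pop) {
        return pop.get(ThreadLocalRandom.current().nextInt(pop.size()));
    }
    private Genome mutate(Genome parent) { return new Genome(); }
    private Genome crossover(Genome a, Genome b) { return new Genome(); }
}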
Humans like iterations! There are many children in the USA right now in fifth grade (assuming you are not reading this in the summer). They will all enter sixth grade at approximately the same time, typically sometime in August or September. That is not terribly natural.
A more natural example is generations. This might be a Western concept, but I identify myself as "Generation X". My parents are baby boomers. Children in school now are either Generation Y or Millennials. When did Generation X stop? Who was the very last baby born into Generation X? No one knows exactly. This is an attempt to place human-conceived labels onto a natural process that cannot so easily be labeled.
We need a similar process for Genetic Programming. Encog GP makes use of the following process, which runs constantly inside a pool of threads equal to the core count of the machine.
Randomly choose to mutate or crossover
For mutation choose one parent from the population via tournament, then mutate it, producing a child
For crossover choose two parents from the population via tournament, then breed a new child
For each child, choose an existing population member via anti-tournament to kill. The child replaces this "unlucky" genome.
Some thread synchronization is necessary for this to happen, particularly where genomes are entering and exiting the population. However, there are no icicles with this approach. The CPU is pegged solid!
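Below is a rough, self-contained Java sketch of what one of these worker threads might do. It assumes a hypothetical Genome with a single score field and a tiny thread-safe Population class; it is not Encog's actual implementation, only an illustration of why the only synchronization needed is around genomes entering and leaving the population.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical steady-state GP worker loop; not Encog's actual implementation.
public class SteadyStateSketch {

    // Minimal placeholder genome: just a fitness score, for illustration.
    static class Genome { volatile double score; }

    // A very small thread-safe population with tournament and anti-tournament selection.
    static class Population {
        private final List<Genome> members = new ArrayList<>();

        synchronized void add(Genome g) { members.add(g); }

        synchronized Genome tournament(int rounds) {
            ThreadLocalRandom rnd = ThreadLocalRandom.current();
            Genome best = members.get(rnd.nextInt(members.size()));
            for (int i = 1; i < rounds; i++) {
                Genome g = members.get(rnd.nextInt(members.size()));
                if (g.score > best.score) best = g;    // the fittest sampled genome wins
            }
            return best;
        }

        synchronized Genome antiTournament(int rounds) {
            ThreadLocalRandom rnd = ThreadLocalRandom.current();
            Genome worst = members.get(rnd.nextInt(members.size()));
            for (int i = 1; i < rounds; i++) {
                Genome g = members.get(rnd.nextInt(members.size()));
                if (g.score < worst.score) worst = g;  // the least fit sampled genome loses
            }
            return worst;
        }

        // The only place threads must truly synchronize: the child replaces the loser.
        synchronized void replace(Genome loser, Genome child) {
            int i = members.indexOf(loser);
            if (i >= 0) members.set(i, child);
        }
    }

    // One worker thread's never-ending loop: no generation boundary, no icicle.
    static Runnable worker(Population pop) {
        return () -> {
            ThreadLocalRandom rnd = ThreadLocalRandom.current();
            while (!Thread.currentThread().isInterrupted()) {
                Genome child = new Genome();
                Genome parent = pop.tournament(4);
                if (rnd.nextBoolean()) {
                    child.score = parent.score + rnd.nextGaussian();     // stand-in for mutation
                } else {
                    Genome other = pop.tournament(4);
                    child.score = (parent.score + other.score) / 2.0;    // stand-in for crossover
                }
                pop.replace(pop.antiTournament(4), child);               // the "unlucky" genome is killed
            }
        };
    }
}

You would start one of these workers per core, for example from a fixed thread pool created with Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors()), and simply let them run until some genome's score is good enough.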
You might be asking, "What happened to elitism?" Elitism is where the top genomes are passed directly into the next generation. This is actually not a natural process. The fact that genomes are killed by anti-tournament accomplishes much the same thing.
Posted almost 13 years ago by jeffheaton
In the last part we configured the new build server to contain Apache, Tomcat, and Jenkins. We also installed other software packages that we would need to build Encog. In this part I will show you how to obtain Encog (using Git) and build Encog (using Maven). We will do this outside of Jenkins. This will allow us to ensure that the build server has everything that it needs. It is also sometimes handy to be able to log into the build box, make changes, and compile Encog, perhaps to fix a build error from a computer that has nothing installed. As long as you can connect to the build server, you can make changes.
First, we will check out the Java core project for Encog. This can be found here.
https://github.com/encog/encog-java-core
There are two ways that you can use Git to "check out" the Encog project: read-only or read/write. If you do not believe you will be making any changes to the Encog core, then you can check out in read-only mode.
Check Out Read-Only
To check out in read-only mode you will use the following commands.
mkdir ~/projects
cd ~/projects
git clone git://github.com/encog/encog-java-core.git
This will check out Encog into a folder named encog-java-core in a projects directory.
The Jenkins build will check out Encog in read-only mode. Jenkins just builds Encog; it does not change Encog.
If you would like to change Encog, then you need to check out in write mode.
Check Out Write-Mode
Checking out in write mode allows you to make changes to Encog and actually push these changes back into the Encog project. If you are planning on becoming an Encog contributor you will need to check out in this mode. You will not be able to check out the main Encog repository in write mode; only a few people have this level of "committer" access to Encog. This is one of the beauties of Git: it is unnecessary for someone to have committer access to the main Encog repository.
If you want to contribute to Encog you should log into GitHub and sign up for an account. You can then "fork" the Encog repository. This will create a new Encog repository under your account, and you will have full write access to your fork. You can then synchronize changes between your repo and the main Encog repository. This allows someone on the Encog project (most likely Jeff Heaton, the author of this article) to approve the change and merge it into Encog. You will then be given full credit as the committer of this change.
Assuming you have created a GitHub account, you will need to generate a "key pair" if you want to "check in" code from a command-line Linux account. Even though I have many key pairs already in GitHub, I need to create a new one for the build box. Every machine from which you use your GitHub account must have a key pair. This is similar to the key pair that you used to access Amazon EC2, yet it is a different key pair. For me, the key pair I used with Amazon EC2 authorized connections between my MacBook Pro laptop and EC2. This key pair authorizes connections between the build server and GitHub.
To create this key pair, you will use the following command. This command should work from either UNIX or the Windows Git Bash program.
ssh-keygen -t rsa -C "[email protected]"
Once you begin creating your key you will see something like the following.
[ec2-user@domU-12-31-39-07-7C-5C projects]$ ssh-keygen -t rsa -C "[email protected]"
Generating public/private rsa key pair.
Enter file in which to save the key (/home/ec2-user/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/ec2-user/.ssh/id_rsa.
Your public key has been saved in /home/ec2-user/.ssh/id_rsa.pub.
The key fingerprint is:
85:9f:ec:41:ff:6f:ff:ff:ea:01:02:2a:5e:5c:44:3a [email protected]
The key's randomart image is:
+--[ RSA 2048]----+
| ..++,|
| . . oB |
| . o E.= |
| = .. --o|
| Z =. - 00|
| .,.o ..|
| .. |
| |
| |
+-----------------+
[ec2-user@domU-12-31-39-07-7C-5C projects]$
Of course the above is from a test key, and nothing I am actually using. :)
Next, display the public key so that you can copy it:
vi ~/.ssh/id_rsa.pub
Now you must configure this new key in GitHub. To do this, log into GitHub and perform the following.
Go to your Account Settings
Click "SSH Keys" in the left sidebar
Click "Add SSH key"
Paste your key into the "Key" field
Click "Add key"
Confirm the action by entering your GitHub password
You are now almost ready to check out Encog. Because you do not "own" the central Encog repository, you will need to fork it first. Once you fork it you can check it out using the following commands.
These commands use the URL for the central Encog repo. You will need to put in the URL for your own forked Encog repository.
mkdir ~/projects
cd ~/projects
git clone git@github.com:encog/encog-java-core.git
git clone git@github.com:encog/encog-java-examples.git
git clone git@github.com:encog/encog-java-workbench.git
git clone git@github.com:encog/encog-java-c.git
Registering with Sonatype
For more information on adding your credentials to Nexus, refer to this article.
You must write your Nexus id/password to your Maven settings.xml file. You can see mine here.
<servers>
  <server>
    <id>sonatype-nexus-snapshots</id>
    <username>[user id]</username>
    <password>[password]</password>
  </server>
  <server>
    <id>sonatype-nexus-staging</id>
    <username>[user id]</username>
    <password>[password]</password>
  </server>
</servers>
Generate GPG Keys
Most likely you will not need to complete this section for Encog. This is how you upload to Maven Central. You would only need to do this if you were running the Encog project. However, I am including this section in case you are doing this sort of setup for another open source project.
In order to deploy completed Encog JAR files to the Maven Central Repository, it is necessary to GPG sign the artifacts. This requires the creation of a GPG key. This section will show you how to do that. You can also find a complete description of this process here.
It is unlikely you will need to complete this section, but I am adding it in case you are using these articles to set up another open source project (other than Encog). First, you must verify GPG is installed. This is done with the following command.
gpg --version
Next, you need to actually generate the key. This is done with the following command.
gpg --gen-key
You should simply accept all of the defaults for GPG key generation. You may see the following message for a while.
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
If you do, open a second connection to your build server and perform some sort of operation that will take up CPU time. Perhaps a file search across the entire file system might be a good option here.
Building Encog Core
You can now build Encog core. You can build regardless of whether you completed the last section. Most likely you did not complete the last section, unless you are setting up a new open source project.
The following command uses Maven to compile Encog.
cd ~/projects/encog-java-core
mvn package
Now that Encog compiles, we will set up Jenkins to automate this process. This will be covered in the next part.
Posted almost 13 years ago by jeffheaton
HyperNEAT is a neural network architecture that allows training of neural networks with a very large number of input neurons. Additionally, a HyperNEAT network can change its resolution at any time, without the need to retrain. In this post we will look at the Boxes Visualization experiment. This was one of the experiments presented in the original journal paper that introduced HyperNEAT (A Hypercube-Based Indirect Encoding for Evolving Large-Scale Neural Networks, Artificial Life journal 15(2), Cambridge, MA: MIT Press, 2009).
The ability to change dimensions without retraining is the secret to how HyperNEAT is able to train for very high numbers of input neurons. HyperNEAT networks are typically trained at a fairly low resolution and then used at a very high resolution. The Boxes Visualization experiment demonstrates this well, and it is included with Encog's examples. This post will explain how the experiment works. The figure below shows the experiment running with a fairly poorly trained network at low resolution.
Let me first describe how to read what you see above. The program is running at 11x11 resolution, which is the training resolution for this experiment. You will notice two blue boxes on the grid: a large 3x3 box and a smaller 1x1 box. These boxes are always present; however, their positions change. These two boxes are what the neural network sees. The goal of the neural network is to recognize the larger blue box. The neural network places a small red box near where it believes the center of the 3x3 box is.
As you can see from above, the network got it wrong. However, the example above was not trained very well. The network above has a score of only 53. This experiment scores between 0 and 110, where 110 is perfect and zero performs no better than a random selection process. There are also purple and white squares. These indicate how certain the network was about a cell being the center: the white boxes are more certain, the purple less.
Let's look at the architecture of this network. There are 121 (11*11) input neurons, corresponding to the individual grid cells. The two blue boxes determine the values of the input neurons: if a grid cell is blue (part of either the big or the small box), then that input neuron is 1; if there is no blue in the grid cell, then the neuron is 0. This example presents the grid to the neural network in row-major form. However, unlike many neural networks, HyperNEAT is actually aware that this is a grid. HyperNEAT makes use of something called a substrate. The substrate defines the layout of the input neurons. Substrates also define how dimensions can be increased and decreased. Defining the substrate is a very important aspect of using HyperNEAT. The substrate will be discussed in much greater detail in a later post.
This network also has 121 output neurons, which likewise correspond to the grid you see above. The output neuron that gets the highest value assigned to it by the neural network is considered to be the neural network's guess at the center of the large blue box.
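As a hypothetical illustration (not taken from the actual Encog example source), encoding the grid in row-major order and reading back the winning output neuron might look like the following in Java.

// Hypothetical illustration of the Boxes experiment's input/output encoding;
// not taken from the actual Encog example source.
public class BoxesEncodingSketch {

    // Flatten an n x n boolean grid (true = blue cell) into a row-major input vector.
    static double[] encodeGrid(boolean[][] grid) {
        int n = grid.length;                  // 11 at training resolution
        double[] input = new double[n * n];   // 121 input neurons for an 11x11 grid
        for (int row = 0; row < n; row++) {
            for (int col = 0; col < n; col++) {
                input[row * n + col] = grid[row][col] ? 1.0 : 0.0;
            }
        }
        return input;
    }

    // The output neuron with the highest activation is the network's guess at the
    // center of the large box; convert its row-major index back to a grid cell.
    static int[] decodeGuess(double[] output, int n) {
        int best = 0;
        for (int i = 1; i < output.length; i++) {
            if (output[i] > output[best]) best = i;
        }
        return new int[] { best / n, best % n };   // { row, col }
    }
}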
Now let's take a look at how a properly trained network performs. The image below shows a properly trained HyperNEAT network that scored 110. As you can see, it finds the center of the blue box just fine at 11x11 resolution.
However, here is the beauty of it. We can also scale this up to 22x22, with no need to retrain the neural network. You can see below the exact same network trying to recognize a 22x22 grid. The boxes have been moved, but the network still places the red box very close to the center of the blue box. Going to 22x22 requires a network with 484 input neurons and 484 output neurons.
Now let's take a brief look at how HyperNEAT is able to do this. You can see the HyperNEAT network that was trained here.
Notice anything strange? Look at the input and output neurons. There are 6 input neurons and 2 output neurons. How can this be? The answer is that ALL HyperNEAT networks have two output neurons and a number of input neurons equal to twice the number of substrate dimensions. The Boxes Visualization experiment uses a three-dimensional substrate, so there are six input neurons.
The HyperNEAT network that you see above is a network used to create other networks. This is how HyperNEAT works: HyperNEAT networks create regular NEAT networks that are actually used to solve the problem. The HyperNEAT network above can create a network with 121 inputs and 121 outputs to solve the 11x11 grid. The beauty is that the same HyperNEAT network can also be used to create a regular NEAT network to solve the 22x22 grid.
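The following Java sketch, simplified to a flat two-dimensional substrate and a single-output CPPN for clarity, shows the idea: the evolved CPPN is queried once per candidate connection, with the substrate coordinates of the source and target cells as its inputs, and its output becomes that connection's weight. Because the loop bounds come from the requested resolution, the same CPPN can generate weights for an 11x11 grid or a 22x22 grid. The Cppn interface here is a hypothetical stand-in, not Encog's actual API.

// Hypothetical sketch of how a CPPN generates the weights of a substrate network.
public class SubstrateSketch {

    // Stand-in for the evolved CPPN: maps substrate coordinates to a connection weight.
    interface Cppn {
        double query(double x1, double y1, double x2, double y2);
    }

    // Generate an (n*n) x (n*n) weight matrix connecting an n x n input grid
    // to an n x n output grid. The same CPPN works for any n.
    static double[][] buildWeights(Cppn cppn, int n) {
        double[][] weights = new double[n * n][n * n];
        for (int r1 = 0; r1 < n; r1++) {
            for (int c1 = 0; c1 < n; c1++) {
                for (int r2 = 0; r2 < n; r2++) {
                    for (int c2 = 0; c2 < n; c2++) {
                        // Coordinates are normalized to [-1, 1], so changing the
                        // resolution only changes how densely the substrate is sampled.
                        weights[r1 * n + c1][r2 * n + c2] = cppn.query(
                                scale(c1, n), scale(r1, n),
                                scale(c2, n), scale(r2, n));
                    }
                }
            }
        }
        return weights;
    }

    private static double scale(int i, int n) {
        return n == 1 ? 0.0 : -1.0 + 2.0 * i / (n - 1.0);
    }
}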
This is just an introduction to the Boxes Visualization experiment. I will likely post other articles describing some of the other aspects of HyperNEAT.
Posted almost 13 years ago by jeffheaton
Lately Scientific Linux (SL) has become my Linux distribution of choice. Scientific Linux is a Linux distribution created by a paid staff at Fermilab and CERN. Scientific Linux is in active use by many universities and research labs around the world. However, is Scientific Linux a useful distribution for a typical Linux user? I believe the answer is yes.
Scientific Linux is based on Red Hat Enterprise Linux (RHEL). There is no free version of RHEL provided by Red Hat, and Red Hat uses strict trademark enforcement in this area. However, because the sources of RHEL are open, there is nothing to prevent you from compiling a Linux distribution that is essentially the same as RHEL.
That is exactly what CENTOS and Scientific Linux (SL) both do. In this respect the goals of SL and CENTOS are nearly identical. If you've used CENTOS before, you will find SL to be very similar. CENTOS stands for the Community Enterprise Operating System, and it is maintained by a community of supporters.
CENTOS is a very common operating system in use by many web hosting companies. I currently run the Heaton Research website on a CENTOS server. For a console-mode web/database server CENTOS is currently what I use. You can certainly use CENTOS as a desktop operating system. However, its focus is to be an enterprise server.
Scientific Linux was created to standardize the Linux that is installed at research labs around the world. A common misconception is that SL is somehow configured for grid computing or comes with scientific applications preinstalled. This is not the case. Scientific Linux is simply a standard enterprise operating system that got its start in the research community.
Scientific Linux is also typically updated before CENTOS, perhaps because there is a paid staff of developers working on SL. Also, SL is frequently used as a desktop operating system.
Previously I always used Ubuntu as my desktop Linux operating system. I am not a full-time Linux user; my primary computer is a MacBook Pro. However, I am an avid VMware user and will very often spin up guest operating systems for a variety of purposes. I really like using Linux in my guest OSes because Linux does not have the licensing agreements of the commercial Windows operating systems.
I am a legal user of Microsoft Windows. I typically purchase Microsoft's rather expensive non-upgrade, non-OEM retail version of their operating systems. Yet, after a few reinstalls, they always seem to shut my key down. They are always happy to restore my access. I've always thought it would be a great theme for a post-apocalyptic movie to have the last computer on Earth shut down due to a Microsoft Windows license activation issue. But I digress.
My biggest complaint about Ubuntu is that they have now added some rather intrusive ads. They have partnered with Amazon to install several applications into Ubuntu and run display ads in several of the applications. This is annoying for a "free" operating system. If I wanted a PC loaded down with crapware, I would just go to my local big-box store and get one of their machines.
My current Linux desktop development platform is Scientific Linux 6.3. I make use of the "Minimal Desktop" configuration, which, unlike Ubuntu's, is actually minimal. I also install the following additional software, using YUM and other means.
Oracle JDK
Eclipse IDE
Intellij IDEA (commercial version)
Git
Maven
LaTeX
Chrome
GCC/Make/other dev tools
With the above software installed I have a fairly complete development environment. I am able to develop Encog just fine. The only thing missing is .NET development. Typically I use a Windows 7 VMware image on my Mac to do this.
Posted almost 13 years ago by jeffheaton
The fact that Encog corrects for many of my dumb mistakes can make debugging interesting. Most machine learning algorithms contain multiple sub-algorithms that work together to ultimately converge. Consider NEAT, or any GA for that matter. You have mutation and crossover working, both going for the same goal (convergence), just by different means. Even if I destroy the crossover operation to the point of being counterproductive (as I just did by mistake), it still converges; mutation just steps up to the plate and works harder. Of course overall convergence time suffers, but it still fundamentally "works". There are other examples of this. LMA is another great example: mess up either the Newton or the gradient descent side and the other steps up. This is why, when I make changes, a real warning sign that I have to look out for is if all of a sudden more iterations are needed for something, even if it actually "works".
Posted almost 13 years ago by jeffheaton
Over the last few weeks NEAT and HyperNEAT have been the focus of my Encog efforts. Support for NEAT has existed in Encog for some time; however, it has never worked terribly well. Many bugs were fixed during Encog 3.2, and there have also been many performance improvements to NEAT.
HyperNEAT is the current area of work. HyperNEAT builds directly on top of NEAT. In a nutshell, HyperNEAT works by evolving neural networks to create neural networks. The final output from HyperNEAT is a neural network called a CPPN. The CPPN is used to create a regular NEAT neural network capable of solving the problem you are training for. This two-phased approach can allow HyperNEAT to train for very high-dimensional problems. These same high-dimensional problems would take an eternity to train in regular NEAT.
At this point I have both NEAT and HyperNEAT working relatively well in Encog for Java. There are more improvements to be made before I consider it done; however, it is getting close. The performance is not nearly where I would like it to be. Currently Encog NEAT is single-threaded; I will add multi-threading last. My goals at this point are as follows.
Finish last remaining HyperNEAT issues
Work on some core performance issues in NEAT/HyperNEAT
Port from Java to C#
Multi-Thread Java
Multi-Thread C#
I really want to have Encog 3.2 finalized before porting to C#. This limits the amount of "reporting" that I need to do as Encog 3.2 evolves. However, multi-threading is very different between C# and Java, and I do not want to shoehorn a Java threading model into C#. C# has the very cool Parallel class that I am trying to use throughout Encog for multithreading; Java works more with a thread pool.
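For context, the Java thread-pool side of that contrast might look roughly like the sketch below; this is only an illustration of the idiom (a fixed pool sized to the core count, with invokeAll doing explicitly what C#'s Parallel.For does implicitly), not Encog's actual threading code.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustration of the Java thread-pool idiom; not Encog's actual threading code.
public class ThreadPoolSketch {
    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);

        List<Callable<Double>> tasks = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            final int item = i;
            tasks.add(() -> expensiveScore(item));   // one task per work item
        }

        // Distribute the tasks over the pool and wait for all of them,
        // similar in spirit to C#'s Parallel.For over the same range.
        pool.invokeAll(tasks);
        pool.shutdown();
    }

    private static double expensiveScore(int item) {
        return Math.sin(item) * Math.cos(item);      // stand-in for real work
    }
}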