It turns out that setting up and using MrBayes on an Ubuntu system is much easier than I had thought. If all you want is the normal (serial) version of MrBayes, you can just install it from the repositories. However, if you want a serious speed-up in the time it takes to get a good result, you can also run it in parallel on a multicore system (that is, pretty much any computer made in the last 4 years). To get it set up and running on Linux, I used some information I found in a forum post. To recap from there:
- Install the parallel libraries you need from the repositories. The package names I used were: mpich2, libmpich2-dev, libmpich2-1.2, and libreadline6-dev.
- Download the source code file for MrBayes and unarchive it (on Ubuntu you can just right-click and select ‘Extract Here’).
- Find the ‘Makefile’ in the source code and change the line that says ‘MPI ?= no’ so that it says ‘MPI = yes’
- Open a terminal, navigate to the MrBayes folder (e.g. type in ‘cd /path/to/folder/mrbayes-3.1.2/’) and then build the package (type ‘make’ at the prompt, in lower case). It might also be a good idea to rename the file called ‘mb’ that is created to something like ‘mbpar’ so that you know it’s the parallel version. Also, I needed to make the file executable, so I typed ‘chmod +x mbpar’ to do that.
- Now, you’ll need to create a file in your home folder called ‘.mpd.conf’ with the line MPD_SECRETWORD=<secretword> in it. Change the <secretword> to something else though; it can be pretty much any word you like.
- Type ‘mpd &’ to launch the MPICH daemon, which needs to be running in order to handle communication between the different cores.
- After all this, I was able to type ‘mpirun -np 2 /path/to/mrbayes/mbpar’ to run the parallel version of MrBayes on both cores of my dual core system. If you have more cores, you can always change the -np argument (e.g. to run it using 4 cores, type ‘mpirun -np 4 /path/to/mrbayes/mbpar’).
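The file-editing steps above can be sketched as a small shell script. This is a minimal sketch rather than a tested installer: the function names ‘enable_mpi’ and ‘write_mpd_conf’ are made up for illustration, the paths and the secret word are placeholders, and the sed pattern assumes the Makefile line reads exactly ‘MPI ?= no’ as described above. It also sets owner-only permissions on the config file, since mpd refuses to start if ‘.mpd.conf’ is readable by other users.

```shell
#!/bin/sh
# Sketch only: function names, paths and the secret word are placeholders.

# Packages named in the post (run once, needs root):
#   sudo apt-get install mpich2 libmpich2-dev libmpich2-1.2 libreadline6-dev

# Switch the MrBayes Makefile to an MPI build.
# $1 = path to the Makefile; the original is kept as Makefile.bak
enable_mpi() {
    sed -i.bak 's/^MPI[[:space:]]*?=[[:space:]]*no/MPI = yes/' "$1"
}

# Write the mpd config file so that only the owner can read it.
# $1 = target file (normally "$HOME/.mpd.conf"), $2 = secret word
write_mpd_conf() {
    ( umask 077; printf 'MPD_SECRETWORD=%s\n' "$2" > "$1" )
    chmod 600 "$1"
}

# Example usage (uncomment and adjust the placeholder paths):
#   enable_mpi /path/to/folder/mrbayes-3.1.2/Makefile
#   write_mpd_conf "$HOME/.mpd.conf" "pick-your-own-word"
#   mpd &
#   mpirun -np 2 /path/to/mrbayes/mbpar
```

Keeping the two edits as functions makes it easy to try them on a copy of the Makefile first before touching the real one.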
With the few tests I’ve done so far, I’ve seen about an 80% speed-up just by using 2 cores instead of 1. It’s nice not to have to wait nearly so long to get my results. We’ll see what kind of time savings this could bring if I did it on an 8-core computer.
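To put a number on the speed-up for your own data, you can time a serial run and a parallel run (for example with the shell’s ‘time’ command) and compare the wall-clock seconds. The helper below is a hypothetical sketch; the name ‘speedup’ and the timings in the example are invented for illustration and are not measurements from these tests.

```shell
#!/bin/sh
# Hypothetical helper: compare two wall-clock times given in seconds.
# $1 = serial run time, $2 = parallel run time
speedup() {
    awk -v s="$1" -v p="$2" 'BEGIN {
        printf "%.2fx faster, %.0f%% less time\n", s / p, (1 - p / s) * 100
    }'
}

# Made-up example: 180 s on one core versus 100 s on two cores.
speedup 180 100   # prints "1.80x faster, 44% less time"
```

Reporting both figures avoids ambiguity, since an "80% speed-up" can be read either as running 1.8 times as fast or as taking 80% less time.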
Thanks a lot for the post.
Thank you for this post and sharing your experience. It certainly made things easy for me. I can now make decent runs on my laptop. The only things I notice losing are tab completion of file names and the ability to interrupt a run: Ctrl+C now causes the program to exit. Nothing major and certainly an acceptable sacrifice for the increase in speeeeeeeeeeeeeeeeeeeeeeeeeed.
What version of Ubuntu do you use? I just get error messages while ‘make’ is running on Ubuntu 11.04.
I’m using 10.10, because I didn’t find 11.04 to be stable enough for my purposes yet. What error messages?
Thanks for the spp.est package. Very useful. I use it in microbial ecology and I have a suggestion (if it’s feasible and if it makes sense!). There is a package called mothur (oriented towards microbial ecology) that includes a rarefaction module. I think it’s basically the same as spp.est but with an extra parameter, the frequency, for when you only want to output the data every n species. So let’s say we have a table representing a total of 500,000 species and we want only 10 points on our rarefaction curve: we would calculate values every 50,000 species. The resulting rarefaction curve would not be as detailed as if using every value, but would still give a good idea… and be faster.
Cheers!
I’ll check it out. It’s another one of those things that I never really thought about though, since palaeo data usually consist of a few hundred values, not hundreds of thousands.
I am trying to compile MrBayes in parallel mode in Ubuntu (your site has been a huge help) but I get an error when I give the make command. It chugs along for a while then comes back with a long list of errors about an undefined reference to log. Have you come across this before or have any suggestions?
Can you post the error that it throws and the last few lines before that? That might help to get an idea about what is going on. More than likely it’s a package somewhere that needs to be tracked down and installed.
After issuing the make command the first few lines look like this:
root@ubuntu:/usr/local/bin/mrbayes_3.2.1/src# make
mpicc -O3 -ffast-math -Wall -DNDEBUG -I/usr/local/include/libhmsbeagle-1 -DUSECONFIG_H -c -o bayes.o bayes.c
mpicc -O3 -ffast-math -Wall -DNDEBUG -I/usr/local/include/libhmsbeagle-1 -DUSECONFIG_H -c -o command.o command.c
command.c: In function ‘FreeCharacters’:
command.c:9236:10: warning: variable ‘memoryLetFree’ set but not used [-Wunused-but-set-variable]
command.c: In function ‘ParseCommand’:
command.c:13792:12: warning: ‘numMatches’ may be used uninitialized in this function [-Wuninitialized]
mpicc -O3 -ffast-math -Wall -DNDEBUG -I/usr/local/include/libhmsbeagle-1 -DUSECONFIG_H
Then there is a long list of errors, including:
mbbeagle.c:(.text+0x1f02): undefined reference to `beagleSetCategoryRates’
mbbeagle.c:(.text+0x204a): undefined reference to `beagleUpdateTransitionMatrices’
mbbeagle.o: In function `LaunchBEAGLELogLikeForDivision’:
mbbeagle.c:(.text+0x23b8): undefined reference to `beagleResetScaleFactors’
mbbeagle.c:(.text+0x2559): undefined reference to `beagleResetScaleFactors’
best.o: In function `LnPriorProbGeneTree’:
best.c:(.text+0x1a1e): undefined reference to `log’
…and…
best.c:(.text+0x3027): undefined reference to `log’
best.c:(.text+0x3072): undefined reference to `log’
best.c:(.text+0x358a): undefined reference to `log’
best.o:best.c:(.text+0x35b6): more undefined references to `log’ follow
best.o: In function `Move_SpeciesTree’:
best.c:(.text+0x3971): undefined reference to `exp’
best.c:(.text+0x398c): undefined reference to `log’
collect2: ld returned 1 exit status
make: *** [mb] Error 1
I’ve tried running with and without beagle and I get the undefined log and exp in both cases.
Thank you for taking a look.
Were you able to fix this problem? I am having exactly the same one 😦
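One possible explanation for the errors reported in the comments above, offered as an assumption rather than a confirmed diagnosis: ‘log’ and ‘exp’ live in the C math library, so "undefined reference" errors for them usually mean that ‘-lm’ is either missing from the final link command or listed before the object files (some linkers only resolve libraries that come after the objects needing them). The beagle errors would similarly suggest that the BEAGLE library is not being linked, but the pasted output is cut off before the link command, so neither guess can be checked from it. The hypothetical helper below (the name ‘libm_after_objects’ is invented) checks a link command for that ordering; if it fails for the last ‘mpicc’ line that make prints, moving the library flags to the end of the link line in the Makefile may be worth trying.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if -lm appears after the last object
# file in the link command passed as arguments.
libm_after_objects() {
    last_obj=0   # position of the last *.o argument (0 = none seen)
    last_lm=0    # position of the last -lm argument (0 = none seen)
    i=0
    for arg in "$@"; do
        i=$((i + 1))
        case "$arg" in
            *.o) last_obj=$i ;;
            -lm) last_lm=$i ;;
        esac
    done
    [ "$last_lm" -gt "$last_obj" ]
}

# Example: paste in the final link line that make prints.
if libm_after_objects mpicc -o mb bayes.o best.o -lm; then
    echo "-lm comes after the object files"
else
    echo "-lm is missing or comes before the object files"
fi
```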