I was getting an error checking packages in R on an Ubuntu 10.10 machine using the ‘R CMD check’ command saying:
LaTeX errors when creating PDF version. This typically indicates Rd problems.
I have no idea what R’s problem with pdflatex was (I’ve been using pdflatex for writing papers just fine), but once I installed the texlive-full package (a metapackage that pulls in all sorts of other TeX Live packages), the check ran without any errors. If in doubt, this package seems to install everything that might be missing (along with the kitchen sink).
Just a quick tip: if you want to use the geiger package in R on an Ubuntu system, there are a few things you need to do. First, make sure you’ve installed the gfortran package and the LAPACK development package (liblapack-dev on Ubuntu) so that the source packages can be built. You may also need to reinstall the r-base package after you’ve done this. I was getting an error message like this:
/usr/lib/liblapack.so.3gf: undefined symbol: ATL_chemv
It seemed to go away after I reinstalled r-base and restarted the R session, although I’m not sure I needed to do both. Either way, it works now.
I recently had a paper on my ‘fossil’ package for R published in Palaeontologia Electronica. The journal is open access, so anyone can read it for free.
Vavrek, Matthew J. 2011. fossil: palaeoecological and palaeogeographical analysis tools. Palaeontologia Electronica, 14:1T. http://palaeo-electronica.org/2011_1/238/index.html
I have released a new package on CRAN called ‘parfossil’, where I have been experimenting with a number of parallelized functions. While R is a readily accessible and fast-to-code language when it comes to stats, it can be really slow on large analyses, especially on non-desktop computers. For my research, most of the analysis time is spent in repetitive loops, like Monte Carlo or bootstrap analyses. By default, R runs everything in serial, with each permutation in a resampling analysis done one after another. This was the only way when most machines had a single core, but with the proliferation of multicore chips, even in laptops, it means we are using only a portion of our computer’s power when we run an analysis. With multicore chips we can assign different tasks to different cores, but this is often difficult to code for. Luckily, Revolution Analytics has made available a simple-to-use package called ‘foreach’, with an included function of the same name, that makes parallelization much easier. So far, with the functions I have recoded to run in parallel, I am seeing a speedup of 1.5 to 1.8 times on my dual-core laptop. I would imagine that a quad-core chip would give somewhere above a 3 times speedup; for some really large data sets, that could mean several hours or even days less waiting. And over the next few years most computer chips will have even more cores available. The future of R computing is in parallel.
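To give a rough idea of what a resampling loop looks like with foreach, here is a minimal sketch of a parallel bootstrap. It is not code from parfossil: the data are made up, the variable names are hypothetical, and the doParallel backend is my assumption (the post names only foreach, and any registered foreach backend should work the same way).

```r
# Sketch only: bootstrap the mean of a made-up sample in parallel.
library(foreach)
library(doParallel)  # assumed backend; not named in the post

# Ask for two workers, as on a dual-core laptop
registerDoParallel(cores = 2)

set.seed(1)          # seeds the data only; worker draws are not reproducible this way
x <- rnorm(1000)     # hypothetical data
n_boot <- 2000       # number of bootstrap replicates

# Each replicate is independent, so foreach can hand them out to the workers;
# .combine = c collects the results into one numeric vector
boot_means <- foreach(i = seq_len(n_boot), .combine = c) %dopar% {
  mean(sample(x, replace = TRUE))
}

# Percentile confidence interval for the mean
print(quantile(boot_means, c(0.025, 0.975)))

# Releases the implicit cluster if one was created (e.g. on Windows)
stopImplicitCluster()
```

Swapping %dopar% for %do% runs the same loop in serial, which makes it easy to compare timings with system.time() on your own machine.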
I just uploaded a new version of fossil to the CRAN website, with a number of changes. There are some fixes to the way the spp.est() function handled abundance data, and I’ve added a small species/locality dataset that I use in a number of new examples in the package. I’m also working on a new clustering method to include, but it isn’t finished yet. Hopefully it’ll be in the package before too long.