Fri, 06 Jul 2007
entity resolution is hard
here's a funny (and surprisingly relevant) example of why entity resolution is hard. in this article about local celebrities and their ranks on google, the author is attempting to point out that there are lots of folks with similar names. the irony, of course, is that one of the examples (well, two actually) is me.
more ...
Fri, 22 Jun 2007
genomics institute cluster upgrade
we finally got the order out for the new hardware for
the genomics institute high performance computing facility upgrade.
i can't wait for the new equipment to arrive.
more ...
Mon, 18 Jun 2007
cluster file system under strain
as previously posted, we're currently using gfs to power our 8TB file system.
this system, however, is under tremendous strain. traversing the file system, to do a full backup for example, takes over five days. the slowdown really
is in the directory traversal: our tape drives are only writing data at a
quarter of their capacity. enter gpfs.
more ...
Sat, 05 May 2007
introducing genebean
i'm not sure i like the name, but a group of cis undergrads here at penn
have written a web interface for the medline abstracts we annotate. check out
genebean.
more ...