Chemical Informatics Letters

Editor: Jonathan M Goodman

Volume 15

Optical Structure Recognition Application (OSRA)
OSRA is a free program which converts pictures of molecules (in most common formats, including PDF, GIF, JPEG, PNG, TIFF) into SMILES. It can be downloaded or used on-line.

The unfair use of copyright material is an important issue. Copyfraud, the false claim of copyright, is also a problem. In particular, factual data cannot be copyrighted (Harvard; BitLaw; Canada), although databases and assemblies of data may well be controlled by a copyright holder.

The Open Library
The Open Library is a project to build an open, free internet archive of every book ever published. It has some way still to go to achieve this, but it is open for business. Is this the future of the book?

Libertas Academica
Libertas Academica publishes open access journals, including Perspectives in Medicinal Chemistry and Analytical Chemistry Insights. The latter has published seven papers so far.

Scholarly Societies
An overview of scholarly societies

A database of crystal structures, gathered from web resources, containing about sixty thousand structures, and available from the Unilever Centre for Molecular Science Informatics.

Top 500 computers
The list of the world's top 500 computers has been updated. In first position is the IBM BlueGene at Lawrence Livermore National Laboratory, which is optimized to run molecular dynamics applications that look at materials aging. Since December 2005 (Chem. Inf. Lett. 2006, 13, #1, 7) the UK's top entry has moved up to number twenty-four (atomic weapons research), and the top UK academic institution, the University of Reading, has thirty-sixth place. Cambridge University is now in forty-fourth place.

Computational Organic Chemistry Blog
A computational organic chemistry blog from Professor Steven Bachrach, to accompany his book.

SOMA2: Open Source Modelling Environment
SOMA2 is a modelling environment for computational drug discovery and molecular modelling, which works through a web-browser.

Vatican library closed
This sudden announcement has only minor implications for chemistry, as the Vatican chemistry collection is small. Other routes will need to be found to access "Chemistry in Iraq and Persia in the tenth century AD" from the Memoirs of the Asiatic Society of Bengal. Fortunately, IBM is working on the problem.

WorldWideScience is a global science gateway, accelerating scientific discovery and progress through a multilateral partnership to enable federated searching of national and international scientific databases. It has partners worldwide, and is run from the US Department of Energy's Office of Scientific and Technical Information.

THESEUS is a research program, sponsored by the German government, which aims to develop better ways of using the knowledge available on the Internet. It focuses on semantic technologies, including the automatic generation of metadata for multimedia files, and looks forward to Web 3.0. It is not, therefore, a direct competitor to Google, and has more similarities to Quaero, but search-engines in their current forms will be made obsolete by this technology.

Ten-digit CAS numbers
CAS will start using ten-digit registry numbers from January 2008, to allow for the huge number of new molecules being added to the database. The new digit will be added to the left-hand end of the number, and the rightmost digit remains a check-digit.
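The role of the check-digit can be illustrated with a minimal Python sketch. The rule used here is not spelled out above: it is the usual published CAS convention, in which the digits before the check-digit are weighted 1, 2, 3, ... from the right-hand end and the sum is taken modulo ten. The function names and the pattern are illustrative choices, not part of any CAS software.

```python
import re

# Assumed layout: two to seven digits, then two digits, then one check digit,
# giving at most ten digits in total.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def cas_check_digit(body: str) -> int:
    """Return the check digit for the digits that precede it (hyphens removed)."""
    # Weight the digits 1, 2, 3, ... starting from the right-hand end.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(body), start=1))
    return total % 10


def is_valid_cas(number: str) -> bool:
    """Check the layout and the check digit of a hyphenated registry number."""
    match = CAS_PATTERN.match(number)
    if match is None:
        return False
    first, second, check = match.groups()
    return cas_check_digit(first + second) == int(check)


print(is_valid_cas("7732-18-5"))  # water: prints True
print(is_valid_cas("7732-18-4"))  # same digits, wrong check digit: prints False
```

Because the weights are counted from the right, a digit added at the left-hand end simply takes the next weight, so the same calculation covers the longer numbers without any change.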

Image space is small?
A research project at CMU uses a library of images to fill gaps in pictures with parts missing. The method is not infallible, but has startling successes. This demonstrates that image-space is surprisingly small, even though it may be expected to be larger than chemical space.

The Red Cross has been trademarked
The Red Cross symbol has been trademarked, and Johnson and Johnson has sued the American Red Cross.

Cost of Open Access Journals
Can academic libraries afford open-access journals? Open access journals are welcomed because they do not require subscription payments. However, the cost is generally transferred to a charge for publication instead of a subscription charge. The saving in the library budget must be balanced against extra costs in other budgets.

The Yale Science Library has withdrawn from BioMed Central's Open Access publishing scheme. There is a response on the BioMed Central blog. The publisher suggests institutions "set aside" funding for open-access page charges.

The Cambridge Chemistry Department publishes over five hundred papers a year, according to Web Of Science. Open-access page charges are covered by the Wellcome Trust, and the NIH already contributes substantial funds to publishing costs, but only a small number of these papers are funded from such sources. To move entirely to open access publishing, the department could "set aside" a small fund of about a million dollars a year. This could be partly funded by cancelling all journal subscriptions, including access to all journal archives and databases, but even this rather drastic approach would leave the small fund a deficit which would grow by hundreds of thousands of dollars every year. CEN estimates Cornell University would pay an extra one and a half million dollars a year if all its departments move to open access publishing.

Not all open-access journals require page charges. For example, the Beilstein Journal of Organic Chemistry does not. Most publishers, however, reasonably expect to gain an income from their journals, and this could come from page charges instead of subscription costs. This will mean that universities which publish a lot will provide a greater proportion of publishers' income and the proportion provided directly by companies will be reduced. The total amount of money universities pay to publishers may well increase with a move to open access. Scientists in less well-funded institutions and in developing countries may be able to read the scientific literature, but not contribute to it. Might this be the result of campaigning for more open access?

The current prices of journals are analysed on this website. Many Open Access Journals are available, including a large number of chemistry-related journals.

Super Natural Database
The Super Natural database contains almost forty-six thousand molecules. It is managed by the Structural Bioinformatics Group in Berlin.

Collaborative Drug Discovery
Collaborative Drug Discovery enables scientists to archive, mine, and collaborate to more effectively develop new drug candidates for commercial and humanitarian markets. Free trial access is available, but the website does not make it clear what access costs, nor who owns the intellectual property that might be generated by the process.

Science videos: ScienceHack and SciVee
SciVee and ScienceHack both have collections of videos of scientists explaining aspects of chemistry, physics, biology, etc.

British PhD Theses
How can you find out what is in a PhD thesis? Most British PhD theses are available, for a few Euros, through the British Library. Indices to thesis abstracts are commercially available. JISC has just published the results of its study of electronic thesis provision (PDF).

CogMap is an organization chart wiki, which helps people to understand organizations. It has been applied to the Beilstein database to show a way to organize molecules.

Own a molecule?
The DMCA is a US copyright law which criminalises the act of circumventing access control, even without copyright infringement. It can be used to gain possession of a number, but not of CAS registry numbers, for example, because they already have commercial significance. An InChI hash, however, might be a gift for someone.
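To make "InChI hash" concrete, here is a minimal Python sketch of one way to turn an InChI string into a fixed-length number. The choice of SHA-256 and the function name are assumptions made for illustration. This is not the official InChIKey encoding.

```python
import hashlib


def inchi_hash(inchi: str) -> str:
    """Return a hexadecimal SHA-256 digest of an InChI string (illustrative only)."""
    return hashlib.sha256(inchi.encode("utf-8")).hexdigest()


# InChI for ethanol; any InChI string could be used in its place.
ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
print(inchi_hash(ethanol))
```

The digest is just a large number, the same for everyone who hashes the same structure, which is why ownership of such a number would matter.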

PDFBox is an open source Java PDF library for working with PDF documents. This may make it easier to extract information from PDF files.

PhysChem Forum is a forum for scientists doing practical research and analytical work.

© 2007 J M Goodman, Cambridge; Chemical Informatics Letters ISSN 1752-0010